WO2020248829A1 - Audio and video processing method and display device - Google Patents

Audio and video processing method and display device

Info

Publication number
WO2020248829A1
WO2020248829A1 · PCT/CN2020/093101
Authority
WO
WIPO (PCT)
Prior art keywords
audio
focal length
length information
video
current image
Prior art date
Application number
PCT/CN2020/093101
Other languages
French (fr)
Chinese (zh)
Inventor
杨香斌 (Yang Xiangbin)
王峰 (Wang Feng)
Original Assignee
海信视像科技股份有限公司 (Hisense Visual Technology Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 海信视像科技股份有限公司 (Hisense Visual Technology Co., Ltd.)
Publication of WO2020248829A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/14 Systems for two-way working
    • H04N7/141 Systems for two-way working between two video terminals, e.g. videophone
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/4302 Content synchronisation processes, e.g. decoder synchronisation
    • H04N21/4307 Synchronising the rendering of multiple content streams or additional data on devices, e.g. synchronisation of audio on a mobile phone with the video output on the TV screen
    • H04N21/431 Generation of visual interfaces for content selection or interaction; Content or additional data rendering
    • H04N21/4312 Generation of visual interfaces involving specific graphical features, e.g. screen layout, special fonts or colors, blinking icons, highlights or animations
    • H04N21/443 OS processes, e.g. booting an STB, implementing a Java virtual machine in an STB or power management in an STB
    • H04N21/47 End-user applications
    • H04N21/478 Supplemental services, e.g. displaying phone caller identification, shopping application
    • H04N21/4788 Supplemental services communicating with other users, e.g. chatting
    • H04N21/485 End-user interface for client configuration
    • H04N21/4852 End-user interface for client configuration for modifying audio parameters, e.g. switching between mono and stereo
    • H04N23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60 Control of cameras or camera modules
    • H04N23/67 Focus control based on electronic image sensor signals

Definitions

  • This application relates to the technical field of display devices, and in particular to an audio and video processing method and a display device.
  • Smart TVs have gradually begun to be equipped with cameras for voice and video calls, realizing a "watch and chat" function on the TV.
  • Smart TVs are usually fixedly installed in relatively large spaces such as living rooms, and people often keep a certain distance from them during use; moreover, when people use smart TVs for voice and video calls, they are often moving around. If the person on a voice and video call keeps moving, the captured sound becomes unstable: the volume of the call fluctuates noticeably, and in severe cases the fluctuation even prevents the call from proceeding normally.
  • This application provides an audio adjustment method, a video call method, and a display device to ensure the stability of the sound in a video call.
  • The present application provides an audio and video processing method. The method includes: receiving a current image generated from a local image collected by a camera, and receiving audio generated from local sound collected by a microphone; obtaining focal length information corresponding to the current image; obtaining a microphone gain according to the focal length information and a preset correspondence, in which different focal length information corresponds to different microphone gains; adjusting the audio according to the obtained microphone gain value; and sending the adjusted audio to the peer device of the video call.
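The preset correspondence between focal length information and microphone gain can be illustrated with a short sketch. The table values, function names, and the int16 PCM sample format below are illustrative assumptions; the patent only specifies that different focal length information corresponds to different microphone gains (a longer focal length implying a more distant speaker and hence a higher gain):

```python
import numpy as np

# Hypothetical preset correspondence: each focal-length entry (in arbitrary
# zoom units reported by the camera) maps to a microphone gain multiplier.
PRESET_GAIN_TABLE = [
    (1.0, 1.0),  # (focal_length, gain)
    (2.0, 1.5),
    (4.0, 2.2),
    (8.0, 3.0),
]

def gain_for_focal_length(focal_length: float) -> float:
    """Pick the gain whose focal-length entry is closest to the reported value."""
    return min(PRESET_GAIN_TABLE, key=lambda e: abs(e[0] - focal_length))[1]

def adjust_audio(samples: np.ndarray, focal_length: float) -> np.ndarray:
    """Scale int16 PCM samples by the microphone gain, clipping to the int16 range."""
    gain = gain_for_focal_length(focal_length)
    adjusted = samples.astype(np.float64) * gain
    return np.clip(adjusted, -32768, 32767).astype(np.int16)
```

The adjusted samples would then be handed to the call's audio encoder; a real implementation would interpolate between table entries and smooth gain changes to avoid audible steps.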
  • This application provides an audio and video processing method. The method includes: receiving the current image generated from the local image collected by the camera, and receiving the audio generated from the local sound collected by the microphone; if the device is in a video call state, adjusting the audio according to the focal length information of the current image, and sending the adjusted audio and the current image to the peer device of the video call; if the device is in a recording state, generating a video file from the current image and the audio without adjusting the audio according to the focal length information of the current image.
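The branch between the video call state and the recording state can be sketched as follows. The state names and the callable parameters are hypothetical stand-ins for the real pipeline stages:

```python
from enum import Enum, auto

class CaptureState(Enum):
    VIDEO_CALL = auto()
    RECORDING = auto()

def process_frame(state, image, audio, focal_length, apply_gain, send, write_file):
    """Route one image/audio pair according to the capture state."""
    if state is CaptureState.VIDEO_CALL:
        # Volume stability matters on a live call: compensate for the
        # speaker's distance using the focal length of the current image.
        send(image, apply_gain(audio, focal_length))
    else:
        # A local recording keeps the natural loudness variation,
        # so the audio is written out unmodified.
        write_file(image, audio)
```

The point of the branch is that gain compensation is only applied where volume fluctuation harms the experience (a live call), not where faithful capture is wanted (a recording).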
  • The present application provides an audio and video processing method. The method includes: the auxiliary chip transmits the video image collected by the camera, after automatic zoom processing, to the main chip, and transmits the focal length information corresponding to the video image to the main chip; the main chip receives the video image and the focal length information, obtains a microphone gain according to the focal length information, and performs gain processing on the audio corresponding to the video image according to the microphone gain; the main chip then synchronizes the gain-processed audio with the video image, and transmits the synchronized audio and video to the display device at the opposite end.
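The synchronization step on the main chip can be sketched as nearest-timestamp pairing of gain-processed audio chunks with video frames. The 40 ms tolerance and the tuple layout are illustrative assumptions, not taken from the patent:

```python
def synchronize(video_frames, audio_chunks, max_skew_ms=40):
    """Pair each video frame with the audio chunk whose presentation
    timestamp is closest, within a tolerance. Items are
    (timestamp_ms, payload) tuples; frames with no audio chunk
    inside the tolerance are dropped."""
    pairs = []
    for v_ts, frame in video_frames:
        a_ts, chunk = min(audio_chunks, key=lambda c: abs(c[0] - v_ts))
        if abs(a_ts - v_ts) <= max_skew_ms:
            pairs.append((v_ts, frame, chunk))
    return pairs
```

A production pipeline would instead mux both streams with shared presentation timestamps, but the sketch shows why timestamps must travel with the gain-processed audio: the gain step adds latency that the video path does not share.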
  • The present application provides a display device. The display device includes: a camera; a microphone; and a controller configured to: receive the current image generated from the local image collected by the camera, and receive the audio generated from the local sound collected by the microphone; obtain focal length information corresponding to the current image; obtain a microphone gain according to the focal length information and a preset correspondence, in which different focal length information corresponds to different microphone gains; adjust the audio according to the obtained microphone gain value; and send the adjusted audio to the peer device of the video call.
  • The present application provides a display device. The display device includes: a camera; a microphone; and a controller configured to: receive the current image generated from the local image collected by the camera, and receive the audio generated from the local sound collected by the microphone; if the device is in a video call state, adjust the audio according to the focal length information of the current image, and send the adjusted audio and the current image to the peer device of the video call; if the device is in a non-video call state, generate an audio and video file from the current image and the audio without adjusting the audio according to the focal length information of the current image.
  • The present application provides a display device. The display device includes: a camera; a microphone; and a main chip and an auxiliary chip connected to each other. The auxiliary chip receives the local image collected by the camera, transmits the current image generated after automatic zoom processing to the main chip, and transmits the focal length information corresponding to the current image to the main chip. The main chip receives the current image and the focal length information, obtains the microphone gain according to the focal length information, and performs gain processing on the audio corresponding to the current image according to the microphone gain, so as to reduce the fluctuation of the audio volume sent from the local end to the opposite end. The main chip then synchronizes the gain-processed audio with the video image, and transmits the synchronized audio and video to the display device at the opposite end.
  • FIG. 1 exemplarily shows a schematic diagram of an operation scenario between a display device and a control device according to an embodiment.
  • FIG. 2 exemplarily shows a block diagram of the hardware configuration of the control device 100 according to an embodiment.
  • FIG. 3 exemplarily shows a block diagram of the hardware configuration of the display device 200 according to an embodiment.
  • FIG. 4 exemplarily shows a block diagram of the hardware architecture of the display device 200 according to FIG. 3.
  • FIG. 5 exemplarily shows a schematic diagram of the functional configuration of the display device 200 according to an embodiment.
  • FIG. 6a exemplarily shows a schematic diagram of the software configuration in the display device 200 according to an embodiment.
  • FIG. 6b exemplarily shows a configuration diagram of an application program in the display device 200 according to an embodiment.
  • FIG. 7 exemplarily shows a schematic diagram of the user interface in the display device 200 according to an embodiment.
  • FIG. 8 exemplarily shows a schematic flowchart of an audio adjustment method according to an embodiment.
  • FIG. 9 exemplarily shows a diagram of the calculation principle of the focal length information according to an embodiment.
  • FIG. 10 exemplarily shows a schematic flowchart of a video call method according to an embodiment.
  • Various external device interfaces are usually provided on a display device to facilitate connecting different peripheral devices or cables to realize corresponding functions.
  • When a high-definition camera is connected to an interface of the display device, if the hardware system of the display device does not have a hardware interface capable of receiving the data of a high-pixel camera, the data collected by the camera cannot be presented on the screen of the display device.
  • The hardware system of a traditional display device only supports one hard decoding resource, and usually only supports video decoding up to 4K resolution. Therefore, to realize video chat while watching Internet TV without reducing the definition of the network video picture, the hard decoding resource (usually the GPU in the hardware system) must be used to decode the network video, while a general-purpose processor (such as the CPU) processes the video chat picture by soft decoding.
  • Some embodiments of the present application disclose a dual hardware system architecture to implement multiple channels of video chat data (including at least one channel of local video).
  • The term "module" used in the various embodiments of the present application can refer to any known or later-developed hardware, software, firmware, artificial intelligence, fuzzy logic, or combination of hardware and/or software code capable of executing the function associated with the component.
  • The term "remote control" used in the various embodiments of this application refers to a component of an electronic device (such as the display device disclosed in this application) that can generally control the electronic device wirelessly within a short distance.
  • the component can generally be connected to an electronic device using infrared and/or radio frequency (RF) signals and/or Bluetooth, and can also include at least one of functional modules such as WiFi, wireless USB, Bluetooth, and motion sensors.
  • a handheld touch remote control uses a user interface in a touch screen to replace most of the physical built-in hard keys in general remote control devices.
  • gesture used in the embodiments of the present application refers to a user's behavior through a change of hand shape or hand movement to express expected ideas, actions, goals, and/or results.
  • the term "hardware system” used in the various embodiments of this application may include integrated circuit (IC), printed circuit board (Printed circuit board, PCB) and other mechanical, optical, electrical, and magnetic devices with computing, At least one of the physical components of control, storage, input and output functions.
  • FIG. 1 exemplarily shows a schematic diagram of an operation scenario between a display device and a control device according to an embodiment. As shown in FIG. 1, the user can operate the display device 200 through the control device 100.
  • The control device 100 may be a remote controller 100A, which can communicate with the display device 200 through at least one of infrared protocol communication, Bluetooth protocol communication, ZigBee protocol communication, or other short-distance communication methods, to control the display device 200 wirelessly or by other wired methods.
  • the user can control the display device 200 by inputting user instructions through keys on the remote control, voice input, control panel input, etc.
  • The user can control the functions of the display device 200 by inputting corresponding control commands through the volume up/down keys, channel control keys, up/down/left/right movement keys, voice input keys, menu keys, and power button on the remote control.
  • The control device 100 can also be a smart device, such as a mobile terminal 100B, a tablet computer, a computer, or a notebook computer, which can communicate with the display device 200 through at least one of a local area network (LAN), a wide area network (WAN), a wireless local area network (WLAN), or other networks, and control the display device 200 through an application program corresponding to the display device 200, for example, an application program running on the smart device.
  • the application can provide various controls for the user through an intuitive user interface (UI, User Interface) on the screen associated with the smart device.
  • both the mobile terminal 100B and the display device 200 can be installed with software applications, so that the connection and communication between the two can be realized through a network communication protocol, thereby realizing one-to-one control operation and data communication.
  • The mobile terminal 100B can establish a control command protocol with the display device 200, synchronize the remote control keyboard to the mobile terminal 100B, and control the display device 200 through the user interface of the mobile terminal 100B; alternatively, the mobile terminal 100B can transmit the audio and video content displayed on its screen to the display device 200 to realize a synchronous display function.
  • the display device 200 can also communicate with the server 300 through multiple communication methods.
  • the display device 200 may be allowed to communicate with the server 300 through at least one of a local area network, a wireless local area network, or other networks.
  • the server 300 may provide various contents and interactions to the display device 200.
  • For example, the display device 200 transmits and receives information, interacts with an Electronic Program Guide (EPG), receives software program updates, or accesses a remotely stored digital media library.
  • the server 300 may be a group or multiple groups, and may be one or more types of servers.
  • the server 300 provides other network service content such as video on demand and advertising services.
  • the display device 200 may be a liquid crystal display, an OLED (Organic Light Emitting Diode) display, a projection display device, or a smart TV.
  • the specific display device type, size, resolution, etc. are not limited, and those skilled in the art can understand that the display device 200 can make some changes in performance and configuration as required.
  • The display device 200 may additionally provide a smart network TV function with computer support, for example, Internet TV, Smart TV, Internet Protocol TV (IPTV), and so on.
  • the display device may be connected or provided with a camera, which is used to present the picture captured by the camera on the display interface of the display device or other display devices to realize interactive chats between users.
  • the picture captured by the camera may be displayed on the display device in full screen, half screen, or in any selectable area.
  • The camera is connected to the rear shell of the display through a connecting plate and is fixedly installed in the upper middle part of the rear shell. As an alternative installation, it can be fixedly installed at any position of the rear shell, as long as its image capture area is not blocked by the rear shell; for example, the image capture area may face the same direction as the display device.
  • the camera can be connected to the display back shell through a connecting plate or other conceivable connectors.
  • a lifting motor is installed on the connector.
  • The camera used in this application may have 16 million pixels to achieve ultra-high-definition display. In actual use, a camera with a resolution higher or lower than 16 million pixels can also be used.
  • the content displayed in different application scenarios of the display device can be merged in many different ways, so as to achieve functions that cannot be achieved by traditional display devices.
  • the user can video chat with at least one other user while watching a video program.
  • the presentation of the video program can be used as the background picture, and the video chat window is displayed on the background picture.
  • At least one video chat is conducted across terminals.
  • the user can video chat with at least one other user while entering the education application for learning.
  • Students can realize remote interaction with teachers while learning content in educational applications. Figuratively, this function can be called "learn and chat".
  • A video chat is conducted with players entering the game. When a player enters a game application to participate in a game, remote interaction with other players can be realized. Figuratively, this function can be called "watch and play".
  • the game scene is integrated with the video picture, and the portrait in the video picture is cut out and displayed on the game picture to improve user experience.
  • somatosensory games such as ball games, boxing games, running games, dancing games, etc.
  • Human body postures and movements are acquired through the camera by means of human body detection and tracking and the detection of key points of the human skeleton, and are then integrated with animations in the game to realize games in scenarios such as sports and dancing.
  • the user can interact with at least one other user in video and voice in the K song application.
  • multiple users can jointly complete the recording of a song.
  • The user can turn on the camera locally to obtain pictures and videos in a vivid way; this function can be called "look in the mirror".
  • Fig. 2 exemplarily shows a configuration block diagram of the control device 100 according to an exemplary embodiment.
  • the control device 100 includes a controller 110, a communicator 130, a user input/output interface 140, a memory 190, and a power supply 180.
  • The control device 100 is configured to control the display device 200; it can receive user input operation instructions and convert the operation instructions into instructions that the display device 200 can recognize and respond to, serving as an interaction intermediary between the user and the display device 200.
  • For example, when the user operates the channel up/down keys on the control device 100, the display device 200 responds to the channel up/down operations.
  • control device 100 may be a smart device.
  • control device 100 can install various applications for controlling the display device 200 according to user requirements.
  • the mobile terminal 100B or other smart electronic devices can perform similar functions to the control device 100 after installing an application for controlling the display device 200.
  • By installing applications, the user can use various function keys or virtual buttons of the graphical user interface provided on the mobile terminal 100B or other smart electronic devices to realize the functions of the physical keys of the control device 100.
  • the controller 110 includes at least one of a processor 112, a RAM 113 and a ROM 114, a communication interface, and a communication bus.
  • the controller 110 is used to control the operation and operation of the control device 100, as well as the communication and cooperation between internal components, and external and internal data processing functions.
  • the communicator 130 realizes communication of control signals and data signals with the display device 200 under the control of the controller 110. For example, the received user input signal is sent to the display device 200.
  • the communicator 130 may include at least one of communication modules such as a WIFI module 131, a Bluetooth module 132, and an NFC module 133.
  • In the user input/output interface 140, the input interface includes at least one of input interfaces such as a microphone 141, a touch panel 142, a sensor 143, and a button 144.
  • the user can implement the user instruction input function through voice, touch, gesture, pressing and other actions.
  • the input interface converts the received analog signal into a digital signal and the digital signal into a corresponding instruction signal, which is sent to the display device 200.
  • the output interface includes an interface for sending the received user instruction to the display device 200.
  • it may be an infrared interface or a radio frequency interface.
  • the user input instruction needs to be converted into an infrared control signal according to the infrared control protocol, and sent to the display device 200 via the infrared sending module.
  • In the case of a radio frequency signal interface, a user input instruction needs to be converted into a digital signal, which is then modulated according to the radio frequency control signal modulation protocol and sent to the display device 200 by the radio frequency transmitting terminal.
  • control device 100 includes at least one of a communicator 130 and an output interface.
  • In some embodiments, the control device 100 is configured with a communicator 130, such as WiFi, Bluetooth, and NFC modules, which can encode user input instructions through the WiFi protocol, Bluetooth protocol, or NFC protocol and send them to the display device 200.
  • the memory 190 is used to store various operating programs, data and applications for driving and controlling the control device 100 under the control of the controller 110.
  • the memory 190 can store various control signal instructions input by the user.
  • The power supply 180 is used to provide operating power support for each element of the control device 100 under the control of the controller 110. It may include a battery and a related control circuit.
  • FIG. 3 exemplarily shows a hardware configuration block diagram of a hardware system in the display device 200 according to an exemplary embodiment.
  • The structural relationship of the hardware system can be shown in FIG. 3.
  • one hardware system in the dual hardware system architecture is referred to as the first hardware system or A system, A chip, and the other hardware system is referred to as the second hardware system or N system, N chip.
  • the A chip includes the controller and various interfaces of the A chip
  • the N chip includes the controller and various interfaces of the N chip.
  • An independent operating system may be installed in the A chip and the N chip, so that there are two independent but interrelated subsystems in the display device 200.
  • the A chip and the N chip can realize connection, communication and power supply through multiple different types of interfaces.
  • The interface between the A chip and the N chip may include at least one of a general-purpose input/output (GPIO) interface, a USB interface, an HDMI interface, a UART interface, and the like.
  • One or more of these interfaces can be used between the A chip and the N chip for communication or power transmission.
  • the N chip can be powered by an external power source
  • the A chip can be powered by the N chip instead of the external power source.
  • the A chip may also include interfaces for connecting other devices or components, such as the MIPI interface for connecting to a camera (Camera) shown in FIG. 3, a Bluetooth interface, etc.
  • The N chip can also include a VBY interface for connecting to the display screen timing controller (TCON), an I2S interface for connecting a power amplifier (AMP) and a speaker, and at least one of an IR/Key interface, a USB interface, a WiFi interface, a Bluetooth interface, an HDMI interface, a Tuner interface, and the like.
  • FIG. 4 is only an exemplary description of the dual hardware system architecture of the present application, and does not represent a limitation to the present application. In practical applications, both hardware systems can contain more or less hardware or interfaces as required.
  • FIG. 4 exemplarily shows a hardware architecture block diagram of the display device 200 according to FIG. 3.
  • the hardware system of the display device 200 may include an A chip and an N chip, and modules connected to the A chip or the N chip through various interfaces.
  • the N chip may include at least one of a tuner and demodulator 220, a communicator 230, an external device interface 250, a controller 210, a memory 290, a user input interface, a video processor 260-1, an audio processor 260-2, a display 280, an audio output interface 270, and a power supply. In other embodiments, the N chip may also include more or fewer modules.
  • the tuner and demodulator 220 is used to perform modulation and demodulation processing such as amplifying, mixing, and resonating broadcast television signals received in a wired or wireless manner, thereby demodulating, from among the multiple wireless or cable broadcast television signals, the audio and video signals carried in the frequency of the TV channel selected by the user, as well as additional information (such as EPG data signals).
  • the signal path of the tuner and demodulator 220 can be varied, such as terrestrial broadcasting, cable broadcasting, satellite broadcasting, or Internet broadcasting; according to different modulation types, the modulation method may be digital or analog; and according to different types of received television signals, the tuner and demodulator 220 may demodulate analog signals and/or digital signals.
  • the tuner and demodulator 220 is also used to respond, according to the user's selection and under the control of the controller 210, to the TV channel frequency selected by the user and the TV signal carried at that frequency.
  • the tuner demodulator 220 may also be in an external device, such as an external set-top box.
  • the set-top box outputs TV audio and video signals through modulation and demodulation, and inputs them to the display device 200 through the external device interface 250.
  • the communicator 230 is a component for communicating with external devices or external servers according to various communication protocol types.
  • the communicator 230 may include a WIFI module 231, a Bluetooth communication protocol module 232, a wired Ethernet communication protocol module 233, and an infrared communication protocol module and other network communication protocol modules or near field communication protocol modules.
  • the display device 200 may establish a control signal and a data signal connection with an external control device or content providing device through the communicator 230.
  • the communicator may receive the control signal of the remote controller 100 according to the control of the controller.
  • the external device interface 250 is a component that provides data transmission between the N chip controller 210 and the A chip and other external devices.
  • the external device interface can be connected to external devices such as set-top boxes, game devices, and notebook computers in a wired/wireless manner, and can receive data from external devices such as video signals (such as moving images), audio signals (such as music), and additional information (such as EPG data).
  • the external device interface 250 may include any one or more of: a high-definition multimedia interface (HDMI) terminal 251, a composite video blanking synchronization (CVBS) terminal 252, an analog or digital component terminal 253, a universal serial bus (USB) terminal 254, and a red, green, and blue (RGB) terminal (not shown in the figure).
  • the controller 210 controls the work of the display device 200 and responds to user operations by running various software control programs (such as an operating system and/or various application programs) stored on the memory 290.
  • the controller 210 includes at least one of a random access memory RAM 213, a read-only memory ROM 214, a graphics processor 216, a CPU processor 212, a communication interface 218, and a communication bus.
  • RAM213 and ROM214, graphics processor 216, CPU processor 212, and communication interface 218 are connected by a bus.
  • the graphics processor 216 is used to generate various graphics objects, such as icons, operation menus, and graphics displayed in response to user input instructions. It includes an arithmetic unit, which performs operations by receiving the various interactive commands input by the user and displays the various objects according to their display attributes, and a renderer, which generates the various objects produced by the arithmetic unit and displays the rendering result on the display 280.
  • the CPU processor 212 is configured to execute operating system and application program instructions stored in the memory 290, and, according to the various interactive instructions received from outside, to execute various applications, data, and content, so as to finally display and play various audio and video content.
  • the CPU processor 212 may include multiple processors.
  • the multiple processors may include one main processor and multiple or one sub-processors.
  • the main processor is used to perform some operations of the display device 200 in the pre-power-on mode, and/or to display images in the normal mode.
  • the communication interface may include the first interface 218-1 to the nth interface 218-n. These interfaces may be network interfaces connected to external devices via a network.
  • the controller 210 may control the overall operation of the display device 200. For example, in response to receiving a user command for selecting a UI object to be displayed on the display 280, the controller 210 may perform an operation related to the object selected by the user command.
  • the object may be any one of the selectable objects, such as a hyperlink or an icon.
  • Operations related to the selected object include, for example, operations for displaying the page, document, or image linked to by a hyperlink, or operations for launching the application corresponding to the icon.
  • the user command for selecting the UI object may be a command input through various input devices (for example, a mouse, a keyboard, a touch pad, etc.) connected to the display device 200 or a voice command corresponding to the voice spoken by the user.
  • the memory 290 stores various software modules for driving and controlling the display device 200.
  • various software modules stored in the memory 290 include: at least one of a basic module, a detection module, a communication module, a display control module, a browser module, and various service modules.
  • the basic module is the underlying software module used for signal communication between various hardware in the display device 200 and sending processing and control signals to the upper module.
  • the detection module is a management module used to collect various information from various sensors or user input interfaces, and perform digital-to-analog conversion and analysis management.
  • the voice recognition module includes a voice analysis module and a voice command database module.
  • the display control module is a module for controlling the display 280 to display image content, and can be used to play information such as multimedia image content and UI interfaces.
  • the communication module is a module used for control and data communication with external devices.
  • the browser module is a module used to perform data communication with browser servers.
  • the service module is a module used to provide various services and various applications.
  • the memory 290 is also used to store and receive external data and user data, images of various items in various user interfaces, and visual effect diagrams of focus objects.
  • the user input interface is used to send a user's input signal to the controller 210, or to transmit a signal output from the controller to the user.
  • the control device (such as a mobile terminal or a remote control) may send input signals input by the user, such as a power switch signal, a channel selection signal, and a volume adjustment signal, to the user input interface, and then the user input interface forwards the input signal to the controller;
  • the control device may receive output signals such as audio, video, or data output from the user input interface processed by the controller, and display the received output signal or output the received output signal as audio or vibration.
  • the user may input a user command on a graphical user interface (GUI) displayed on the display 280, and the user input interface receives the user input command through the graphical user interface (GUI).
  • the user can input a user command by inputting a specific sound or gesture, and the user input interface recognizes the sound or gesture through the sensor to receive the user input command.
  • the video processor 260-1 is used to receive video signals, and perform video data processing such as decompression, decoding, scaling, noise reduction, frame rate conversion, resolution conversion, and image synthesis according to the standard codec protocol of the input signal.
  • The result is a video signal that can be displayed or played directly on the display 280.
  • the video processor 260-1 includes a demultiplexing module, a video decoding module, an image synthesis module, a frame rate conversion module, a display formatting module, and the like.
  • the demultiplexing module is used to demultiplex the input audio and video data stream. For example, if MPEG-2 is input, the demultiplexing module will demultiplex into video signals and audio signals.
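A toy sketch of the demultiplexing step described above, assuming a simplified packet format of (stream id, payload) pairs rather than the real MPEG-2 syntax; the stream ids used here are illustrative:

```python
# Toy demultiplexer: routes packets of a multiplexed stream to video or audio
# queues by a per-packet stream identifier, loosely analogous to splitting an
# MPEG-2 stream into video and audio elementary streams.
VIDEO_ID, AUDIO_ID = 0x01, 0x02

def demultiplex(packets):
    video, audio = [], []
    for stream_id, payload in packets:
        if stream_id == VIDEO_ID:
            video.append(payload)
        elif stream_id == AUDIO_ID:
            audio.append(payload)
        # packets with other ids (e.g. program tables) are ignored here
    return video, audio

v, a = demultiplex([(0x01, b"frame0"), (0x02, b"pcm0"), (0x01, b"frame1")])
assert v == [b"frame0", b"frame1"] and a == [b"pcm0"]
```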
  • the video decoding module is used to process the demultiplexed video signal, including decoding and scaling.
  • An image synthesis module, such as an image synthesizer, is used to superimpose and mix the GUI signal generated by the graphics generator, according to user input or on its own, with the zoomed video image to generate an image signal for display.
  • Frame rate conversion module: used to convert the frame rate of the input video, such as converting an input 24 Hz, 25 Hz, 30 Hz, or 60 Hz video to a frame rate of 60 Hz, 120 Hz, or 240 Hz, where the input frame rate is related to the source video stream and the output frame rate is related to the refresh rate of the display. The conversion usually adopts a method such as frame insertion.
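The frame rate conversion described above can be sketched, under the simplifying assumption that frames are merely repeated rather than motion-compensated as real frame-insertion hardware would do:

```python
from fractions import Fraction

# Naive frame rate conversion by frame repetition: each source frame is
# emitted enough times to fill the output timeline. Real devices use
# motion-compensated frame insertion; repetition is the simplest stand-in.
def convert_frame_rate(frames, fps_in, fps_out):
    ratio = Fraction(fps_out, fps_in)
    out, acc = [], Fraction(0)
    for frame in frames:
        acc += ratio  # output frames owed after consuming this input frame
        while len(out) < acc:
            out.append(frame)
    return out

out = convert_frame_rate(["f0", "f1", "f2", "f3"], 24, 60)
# 4 input frames at 24 Hz span the same time as 10 output frames at 60 Hz
assert len(out) == 10
```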
  • the display formatting module is used to change the signal output by the frame rate conversion module into a signal that conforms to the display format of a display, for example performing format conversion on the signal output by the frame rate conversion module to output an RGB data signal.
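As a hedged example of the kind of format conversion the display formatting module performs, the following converts one YCbCr pixel to RGB using the standard BT.601 coefficients (the function name and per-pixel framing are illustrative):

```python
# Display-formatting sketch: convert one 8-bit full-range YCbCr (BT.601)
# pixel to RGB, the kind of conversion done before driving the panel.
def ycbcr_to_rgb(y, cb, cr):
    r = y + 1.402 * (cr - 128)
    g = y - 0.344136 * (cb - 128) - 0.714136 * (cr - 128)
    b = y + 1.772 * (cb - 128)
    clamp = lambda v: max(0, min(255, int(round(v))))
    return clamp(r), clamp(g), clamp(b)

assert ycbcr_to_rgb(128, 128, 128) == (128, 128, 128)  # neutral gray
```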
  • the display 280 is used to receive the image signal input from the video processor 260-1, display video content and images, and a menu control interface.
  • the display 280 includes a display component for presenting a picture and a driving component for driving image display.
  • the displayed video content can be from the video in the broadcast signal received by the tuner and demodulator 220, or from the video content input by the communicator or the interface of an external device.
  • the display 280 simultaneously displays a user manipulation interface UI generated in the display device 200 and used to control the display device 200.
  • The display 280 also includes a driving component for driving the display.
  • When the display 280 is a projection display, it may also include a projection device and a projection screen.
  • the audio processor 260-2 is used to receive audio signals and, according to the standard codec protocol of the input signal, perform decompression and decoding, as well as audio data processing such as noise reduction, digital-to-analog conversion, and amplification, to obtain an audio signal that can be played in the speaker 272.
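The amplification step of the audio processor can be sketched as a simple digital gain stage over 16-bit PCM samples (a minimal illustration under assumed int16 sample format, not the actual implementation):

```python
# Minimal digital gain stage with clipping, standing in for the amplification
# step of the audio processor; samples are 16-bit signed PCM values.
def apply_gain(samples, gain):
    out = []
    for s in samples:
        v = int(s * gain)
        out.append(max(-32768, min(32767, v)))  # clamp to int16 range
    return out

assert apply_gain([1000, -1000], 2.0) == [2000, -2000]
assert apply_gain([30000], 2.0) == [32767]  # clipped at full scale
```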
  • the audio output interface 270 is used to receive the audio signal output by the audio processor 260-2 under the control of the controller 210.
  • the audio output interface may include a speaker 272, or may output to the external audio output terminal 274 of a sound-generating device of an external device, such as an external audio terminal or a headphone output terminal.
  • the video processor 260-1 may include one or more chips.
  • the audio processor 260-2 may also include one or more chips.
  • the video processor 260-1 and the audio processor 260-2 may be separate chips, or they may be integrated with the controller 210 in one or more chips.
  • the power supply is used to provide power supply support for the display device 200 with power input from an external power supply under the control of the controller 210.
  • the power supply may include a built-in power supply circuit installed inside the display device 200, or may be a power supply installed outside the display device 200, such as a power interface that provides an external power supply in the display device 200.
  • the A chip may include a controller 310, a communicator 330, a detector 340, and a memory 390. In some embodiments, it may also include a user input interface, a video processor, an audio processor, a display, and an audio output interface. In some embodiments, there may also be a power supply that independently powers the A chip.
  • the communicator 330 is a component for communicating with external devices or external servers according to various communication protocol types.
  • the communicator 330 may include a WIFI module 331, a Bluetooth communication protocol module 332, a wired Ethernet communication protocol module 333, and an infrared communication protocol module and other network communication protocol modules or near field communication protocol modules.
  • the communicator 330 of the A chip and the communicator 230 of the N chip also interact with each other.
  • the WiFi module 231 of the N chip is used to connect to an external network and generate network communication with an external server and the like.
  • the WiFi module 331 of the A chip is used to connect to the WiFi module 231 of the N chip, and does not directly connect to an external network or the like. Therefore, to the user, the display device in the above embodiment presents a single WiFi account externally.
  • the detector 340 is a component used by the chip of the display device A to collect signals from the external environment or interact with the outside.
  • the detector 340 may include a light receiver 342, a sensor used to collect the intensity of ambient light, so that display parameters can be changed adaptively based on the collected ambient light, etc.; it may also include an image collector 341, such as a camera, which can be used to collect external environment scenes, as well as user attributes or gestures for interacting with the user, so that display parameters can be changed adaptively and user gestures can be recognized to realize the function of interacting with the user.
  • the external device interface 350 is a component that provides data transmission between the controller 310 and the N chip or other external devices.
  • the external device interface can be connected to external devices such as set-top boxes, game devices, notebook computers, etc., in a wired/wireless manner.
  • the controller 310 controls the work of the display device 200 and responds to user operations by running various software control programs (such as installed third-party applications, etc.) stored on the memory 390 and interacting with the N chip.
  • the controller 310 includes at least one of a read-only memory ROM313, a random access memory RAM314, a graphics processor 316, a CPU processor 312, a communication interface 318, and a communication bus.
  • the ROM 313 and the RAM 314, the graphics processor 316, the CPU processor 312, and the communication interface 318 are connected by a bus.
  • the CPU processor 312 runs the system startup instruction in the ROM, and copies the temporary data of the operating system stored in the memory 390 to the RAM 314 to start the operating system. After the operating system is started, the CPU processor 312 copies the temporary data of the various application programs in the memory 390 to the RAM 314, and then starts to run and start the various application programs.
  • the CPU processor 312 is used to execute the operating system and application program instructions stored in the memory 390, to communicate with the N chip to transmit and exchange signals, data, instructions, etc., and, according to the various interactive instructions received from external inputs, to execute various applications, data, and content, so as to finally display and play various audio and video content.
  • the communication interface may include the first interface 318-1 to the nth interface 318-n. These interfaces may be network interfaces connected to external devices via a network, or network interfaces connected to the N chip via a network.
  • the controller 310 may control the overall operation of the display device 200. For example, in response to receiving a user command for selecting a UI object to be displayed on the display 280, the controller 310 may perform an operation related to the object selected by the user command.
  • the graphics processor 316 is used to generate various graphics objects, such as icons, operation menus, and graphics displayed in response to user input instructions. It includes an arithmetic unit, which performs operations by receiving the various interactive commands input by the user and displays the various objects according to their display attributes, and a renderer, which generates the various objects produced by the arithmetic unit and displays the rendering result on the display 280.
  • Both the graphics processor 316 of the A chip and the graphics processor 216 of the N chip can generate various graphics objects. The difference is that, if application 1 is installed on the A chip and application 2 is installed on the N chip, then when the user is in the interface of application 1 and inputs instructions in application 1, the graphics processor 316 of the A chip generates the graphics object; and when the user is in the interface of application 2 and inputs instructions in application 2, the graphics processor 216 of the N chip generates the graphics object.
  • Fig. 5 exemplarily shows a schematic diagram of a functional configuration of a display device according to an exemplary embodiment.
  • the memory 390 of the A chip and the memory 290 of the N chip are used to store operating systems, applications, content, and user data, respectively.
  • Under the control of the controller 310 of the A chip and the controller 210 of the N chip, the system operation of driving the display device 200 and the responses to the various operations of the user are performed.
  • the memory 390 of the A chip and the memory 290 of the N chip may include volatile and/or nonvolatile memory.
  • the memory 290 is used to store the operating program that drives the controller 210 in the display device 200, and to store various application programs built into the display device 200, various application programs downloaded by the user from external devices, the various graphical user interfaces related to the applications, the various objects related to the graphical user interfaces, user data information, and various internal data supporting the applications.
  • the memory 290 is used to store system software such as an operating system (OS) kernel, middleware, and applications, and to store input video data and audio data, and other user data.
  • the memory 290 is used to store driver programs and related data such as the video processor 260-1 and the audio processor 260-2, the display 280, the communication interface 230, the tuner and demodulator 220, and the input/output interface.
  • the memory 290 may store software and/or programs.
  • the software programs used to represent an operating system (OS) include, for example, kernels, middleware, application programming interfaces (APIs), and/or application programs.
  • the kernel may control or manage system resources, or functions implemented by other programs (such as the middleware, APIs, or application programs), and may provide interfaces to allow the middleware and APIs, or applications, to access the controller, so as to realize control or management of system resources.
  • the memory 290 includes at least one of a broadcast receiving module 2901, a channel control module 2902, a volume control module 2903, an image control module 2904, a display control module 2905, an audio control module 2906, an external command recognition module 2907, a communication control module 2908, an optical receiving module 2909, a power control module 2910, an operating system 2911, other application programs 2912, a browser module, and so on.
  • the controller 210 executes various software programs in the memory 290 to realize various functions such as: broadcast television signal reception and demodulation, TV channel selection control, volume selection control, image control, display control, audio control, external command recognition, communication control, optical signal reception, power control, a software control platform supporting the various functions, and browser functions.
  • the memory 390 stores various software modules for driving and controlling the display device 200.
  • various software modules stored in the memory 390 include: at least one of a basic module, a detection module, a communication module, a display control module, a browser module, and various service modules. Since the functions of the memory 390 and the memory 290 are relatively similar, please refer to the memory 290 for related parts, and will not be repeated here.
  • the memory 390 includes an image control module 3904, an audio control module 3906, an external command recognition module 3907, a communication control module 3908, an optical receiving module 3909, an operating system 3911, other application programs 3912, a browser module, and so on.
  • the controller 310 executes various software programs in the memory 390 to realize various functions such as: image control, display control, audio control, external command recognition, communication control, optical signal reception, power control, a software control platform supporting the various functions, and browser functions.
  • the external command recognition module 2907 of the N chip and the external command recognition module 3907 of the A chip can recognize different commands.
  • the external command recognition module 3907 of the A chip may include a graphic recognition module 3907-1.
  • the graphic recognition module 3907-1 stores a graphics database; when the camera receives an external graphics command, the command is matched against the instructions in the graphics database so as to control the display device.
  • Since the voice receiving device and the remote controller are connected to the N chip, the external command recognition module 2907 of the N chip may include a voice recognition module 2907-2.
  • the voice recognition module 2907-2 stores a voice database; external voice commands received by the voice receiving device, etc., are matched against the commands in the voice database so as to control the display device.
  • Since a control device 100 such as a remote controller is connected to the N chip, the key command recognition module interacts with the control device 100.
  • Fig. 6a exemplarily shows a configuration block diagram of the software system in the display device 200 according to an exemplary embodiment.
  • the operating system 2911 includes operating software for processing various basic system services and for implementing hardware-related tasks, acting as a medium for data processing between application programs and hardware components.
  • part of the operating system kernel may include a series of software to manage the hardware resources of the display device and provide services for other programs or software codes.
  • part of the operating system kernel may include one or more device drivers, and the device drivers may be a set of software codes in the operating system to help operate or control devices or hardware associated with the display device.
  • the drive may contain code to manipulate video, audio, and/or other multimedia components. Examples include displays, cameras, Flash, WiFi, and audio drivers.
  • the accessibility module 2911-1 is used to modify or access the application program, so as to realize the accessibility of the application program and the operability of its display content.
  • the communication module 2911-2 is used to connect to other peripherals via related communication interfaces and communication networks.
  • the user interface module 2911-3 is used to provide objects that display the user interface for access by various applications, and can realize user operability.
  • the control application 2911-4 is used to control process management, including runtime applications.
  • the event transmission system 2914 can be implemented in the operating system 2911 or in the application program 2912. In some embodiments, it is implemented in the operating system 2911 and, at the same time, in the application program 2912. It is used to monitor various user input events and, according to the recognition results of the various events or sub-events, carry out one or more sets of predefined operation procedures in response.
  • the event monitoring module 2914-1 is used to monitor input events or sub-events of the user input interface.
  • the event recognition module 2914-2 is used to apply the definitions of the various events to the various user input interfaces, recognize the various events or sub-events, and transmit them to the processing units that execute the corresponding one or more groups of processing programs.
  • the event or sub-event refers to the input detected by one or more sensors in the display device 200 and the input of an external control device (such as the control device 100).
  • The events include various sub-events of voice input, gesture input sub-events of gesture recognition, and sub-events of remote-control key command input from the control device. They take multiple forms, including but not limited to one or a combination of pressing the up/down/left/right keys, the confirm key, and other key presses.
  • They also include operations of non-physical keys, such as moving, pressing, and releasing.
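The register-and-dispatch flow of the event transmission system can be sketched as follows (the class name and event names are illustrative assumptions, not identifiers from the disclosure):

```python
# Sketch of the event transmission chain: listeners register handlers for
# named events/sub-events; the dispatcher recognizes an incoming event and
# runs the corresponding handler chain.
class EventDispatcher:
    def __init__(self):
        self._handlers = {}

    def register(self, event, handler):
        self._handlers.setdefault(event, []).append(handler)

    def dispatch(self, event, payload=None):
        results = []
        for handler in self._handlers.get(event, []):
            results.append(handler(payload))
        return results

bus = EventDispatcher()
bus.register("key.confirm", lambda p: "select")
assert bus.dispatch("key.confirm") == ["select"]
assert bus.dispatch("key.unknown") == []  # unrecognized events are ignored
```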
  • the interface layout management module 2913, which directly or indirectly receives the various user input events or sub-events monitored by the event transmission system 2914, is used to update the layout of the user interface, including but not limited to the position of each control or sub-control in the interface, and the size, position, and level of the container, as well as other various operations related to the layout of the interface.
  • the application layer of the display device includes various applications that can be executed on the display device 200.
  • the application layer 2912 of the N chip may include, but is not limited to, one or more applications, such as video-on-demand applications, application centers, and game applications.
  • the application layer 3912 of the A chip may include, but is not limited to, one or more applications, such as a live TV application, a media center application, and so on. It should be noted that which application programs are contained on the A chip and the N chip is determined according to the operating system and other designs; this application does not specifically limit or divide the application programs contained on the A chip and the N chip.
  • Live TV applications can provide live TV through different sources.
  • a live TV application may use input from cable TV, wireless broadcasting, satellite services, or other types of live TV services to provide TV signals.
  • the live TV application can display the video of the live TV signal on the display device 200.
  • Video-on-demand applications can provide videos from different storage sources. Unlike live TV applications, VOD provides video display from certain storage sources. For example, the video on demand can come from the server side of cloud storage, and from the local hard disk storage that contains the stored video programs.
  • Media center applications can provide various multimedia content playback applications.
  • the media center can provide services that are different from live TV or video on demand, and users can access various images or audio through the media center application.
  • Application center can provide storage of various applications.
  • the application program may be a game, an application program, or some other application program that is related to a computer system or other device but can be run on a display device.
  • the application center can obtain these applications from different sources, store them in the local storage, and then run on the display device 200.
  • FIG. 7 exemplarily shows a schematic diagram of a user interface in the display device 200 according to an exemplary embodiment.
  • the user interface includes multiple view display areas, for example, a first view display area 201 and a play screen 202, where the play screen includes layout of one or more different items.
  • the user interface also includes a selector indicating that the item is selected, and the position of the selector can be moved through user input to change the selection of different items.
  • multiple view display areas can present display screens of different levels.
  • the display area of the first view may present the content of the video chat item
  • the display area of the second view may present the content of the application layer item (eg, webpage video, VOD display, application screen, etc.).
  • the content of the display area of the second view includes content displayed on the video layer and part of the content displayed on the floating layer
  • the content of the display area of the first view includes content displayed on the floating layer.
  • the floating layers used in the first view display area and the second view display area are different floating layers.
  • the presentation of the different view display areas has different priorities; the display priority differs between view display areas with different priorities.
  • the priority of the system layer (such as the video layer) is higher than that of the application layer.
  • two different display windows can be drawn in the same layer to achieve the same level of display.
  • the selector can switch between the first view display area and the second view display area (i.e., switch between the two display windows).
  • When the size and position of the first view display area change, the size and position of the second view display area may change accordingly.
  • independent operating systems may be installed in the A chip and the N chip, so that there are two independent but related subsystems in the display device 200.
  • both the A chip and the N chip can be independently installed with Android and various APPs, so that each chip can realize a certain function, and the A chip and the N chip can realize a certain function in cooperation.
  • In a smart TV 200 that is not dual-chip (for example, a single-chip smart TV), there is one system chip, and its operating system controls the realization of all functions of the smart TV.
  • the camera is connected to the auxiliary chip, which can perform artificial intelligence operations on the image obtained by the camera;
  • the microphone is connected to the main chip, and the main chip performs gain processing on the sound collected by the microphone.
  • the auxiliary chip collects video images through the camera, and uses artificial intelligence application technologies such as face recognition and motion (lip shape) recognition.
  • the picture during the video call is no longer limited to a fixed focal length; instead, a zoomable video can stay focused on the target speaker through the combination of face recognition and lip-shape recognition, so that the face is recognized automatically regardless of which corner the person is in or how far away the person is from the camera.
  • partial focus means that the face can be kept unchanged in the display frame shown at the opposite end of the display device.
  • the display size of the face does not change as the distance between the person and the camera changes, but when that distance changes, the distance between the person and the (far-field) microphone will also change.
  • the embodiment of the present application also provides an audio adjustment method.
  • FIG. 8 is a schematic flowchart of an audio adjustment method provided by an embodiment of the application. As shown in Figure 8, the audio adjustment method provided by the embodiment of the present application includes:
  • the controller may include a main chip and an auxiliary chip.
  • the auxiliary chip obtains an image through a camera, and automatically focuses on the face when the image is collected.
  • the face auto-focusing realizes focusing by the phase method.
  • the phase method of focusing refers to judging whether it is in focus by the time sequence of the light beam reaching the photosensitive element, that is, the phase shift amount.
  • the camera will place a grid plate with light-transmitting and opaque lines alternately parallel to the photosensitive element, and place two light-receiving elements symmetrically along the optical axis at an appropriate position behind the grid plate.
  • the grid plate moves up and down along the vertical direction of the optical axis.
  • when the focus plane and the grid plate coincide (i.e., in focus)
  • the light passing through the grid plate will reach the two light receiving elements behind the plate at the same time;
  • otherwise, the two beams reach the light receiving elements one after the other, and there is a phase difference between their output signals.
  • the camera can quickly determine in which direction to shift the lens, instead of moving back and forth several times as in contrast-detection focusing. Refer to Figure 9 for the calculation principle, which will not be repeated here.
  • the image transmitted to the opposite end is a cropped face image; that is, due to autofocus, the face in the locally collected image is transmitted regardless of the distance between the person and the TV
  • so the video received by the opposite end does not reflect changes in the distance between the local person and the TV. But if the sound uses a fixed gain, the face received by the opposite end does not change with distance while the received sound volume does change with the distance between the local person and the TV.
  • when the camera of the auxiliary chip collects images, it automatically focuses in real time and outputs focal length information in real time.
  • in image processing, the sharpness and focus of an image are determined by the amount of its high-frequency components: if there are more high-frequency components, the image is sharp; otherwise, the image is blurred, and the focus needs to be adjusted until a sharp image is obtained.
  • FFT (Fast Fourier Transform)
  • DCT (Discrete Cosine Transform)
  • each frame of image outputs a value that characterizes whether the image is sharp, such as the Image distance.
  • the calculation methods of Image distance include high frequency component method, smoothing method, threshold integration method, gray difference method, Laplace image energy function, and so on.
  • an improved gray-scale difference method can be used as the image sharpness evaluation function: the sum of the squared brightness differences between every pixel of an image and its neighboring pixels is taken as the image's focus evaluation value, computed over adjacent same-field images. The focus evaluation function is as follows:
  • F = Σx Σy { [f(x, y) − f(x, y−1)]² + [f(x, y) − f(x−1, y)]² }
  • f(x, y) denotes the brightness value of the pixel in the x-th row and y-th column. This algorithm compares each pixel with two adjacent pixels (the pixels to the left of and above f(x, y)). When the image is in focus, F is largest, that is, the corresponding Image distance value is largest. The Image distance is calculated in real time while the lens focal length is adjusted adaptively, and when the relative maximum is reached, autofocus is complete and the corresponding focal length information is output.
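The gray-difference evaluation described above can be sketched in a few lines. The following is an illustrative implementation (the use of NumPy and the function name are assumptions for illustration, not part of the original disclosure):

```python
import numpy as np

def gray_difference_sharpness(img):
    """Focus evaluation F: sum of squared brightness differences between
    each pixel and its left and upper neighbors. A sharper (in-focus)
    image yields a larger value."""
    f = np.asarray(img, dtype=np.float64)
    dx = f[:, 1:] - f[:, :-1]   # difference with the left neighbor
    dy = f[1:, :] - f[:-1, :]   # difference with the upper neighbor
    return float((dx ** 2).sum() + (dy ** 2).sum())
```

Autofocus would then adjust the lens while tracking this value and stop at its relative maximum, as the text describes.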
  • the controller may not distinguish between the main chip and the auxiliary chip.
  • the controller starts the camera to collect the local image according to the input operation, and generates the current image according to the local image, and controls the microphone to collect the local sound to generate audio. Since the current image corresponds to the focal length information, the focal length information of the current image is acquired to adjust the sound.
  • the application that starts the camera and microphone for audio and video collection may be a video call application or a recording/selfie application. Therefore, after collecting local images and local sounds, the controller also needs to determine whether it is in a video call state. If it is, the application is a video call application, the audio needs to be adjusted according to the focal length information of the current image, and the controller sends the adjusted audio and the current image to the peer device of the video call. If it is not in the video call state, the application is a recording/selfie application, there is no need to adjust the audio according to the focal length information of the current image, and the controller directly generates an audio and video file based on the current image and the audio.
  • the video call state can be marked by the application manager part of the controller for the application that starts the camera and microphone for audio and video collection: it is marked as a video call state when a video call application is started, and marked as a non-video call state when other applications, such as recording/selfie applications, start the camera and microphone for audio and video capture.
  • the user selects the recording/selfie application to preview or record audio and video through the camera and microphone
  • the controller activates the camera to collect video images
  • the application can first display a preview interface, and display in it the current image obtained by performing data processing on the image collected by the camera, where the data processing can be at least one of image quality adjustment (such as brightness, contrast, chroma, color temperature, etc.), adding controls (decorative controls, layers, etc.), or other processing.
  • the preview interface can also be provided with a control for generating audio and video files.
  • in response to the user's selection of the control for generating audio and video files, the controller generates the current image according to the video image collected by the started camera, starts the microphone to collect sound to generate audio, and combines the current image and audio into an audio and video file.
  • the interface of the application may continue to display the current image.
  • buffer data is continuously and periodically generated from the recorded audio and video; upon receiving the input operation instruction to save the video (an exemplary recording-end operation, or the only operation instruction), a video file is generated from the buffered data. This can speed up the generation of video files as perceived by users.
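The buffered-recording scheme above can be sketched as follows. This is a hypothetical illustration (the class and method names are invented for this sketch), and the muxing of buffered chunks into an actual file container is out of scope:

```python
from collections import deque

class RecordingBuffer:
    """Continuously buffer synchronized (image, audio) chunks during
    recording so that a 'save video' instruction can build the file
    from data that is already buffered."""
    def __init__(self, max_chunks=1024):
        self._chunks = deque(maxlen=max_chunks)  # oldest chunks are dropped

    def push(self, image_frame, audio_frame):
        # Called periodically while recording.
        self._chunks.append((image_frame, audio_frame))

    def save(self):
        # On the save instruction, the buffered data yields the file content.
        return list(self._chunks)
```

Because the chunks are already buffered when the save instruction arrives, the file can be assembled immediately, which matches the speed-up the text describes.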
  • the change of the Image distance reflects the change of the focal length of the current lens, which in turn corresponds to the change of the distance between the current user and the camera of the display device, that is, the change of the distance since the last call.
  • if the Image distance becomes larger, the distance between the current user and the display device (camera) becomes larger; if the Image distance becomes smaller, that distance becomes smaller. Therefore, based on the Image distance and its changes, the focal length corresponding to the relative maximum of the Image distance is found, which is the focal length information corresponding to the current image.
  • the distance between the current user and the far-field microphone is determined according to the focal length information, thereby determining the change of the distance from the far-field microphone, and then obtaining the microphone gain, and performing gain processing on the collected audio data through the microphone gain.
  • the distance between the current user and the display device (camera) and the corresponding focal length information are measured, and empirical values are used to establish a preset correspondence between microphone gain and focal length information.
  • the microphone gain can be obtained according to the preset correspondence relationship between the microphone gain and the focal length information.
  • the preset correspondence between microphone gain and focal length information is established based on empirical values, as shown in Table 1, where Table 1 is given only as an example and is not a limitation of this application.
  • if the acquired focal length information is 0.2 mm, the corresponding microphone gain of -10 dB can be obtained from the table, and the acquired microphone gain is then used to perform gain processing on the correspondingly collected audio data.
  • if the acquired focal length information has no exact entry in the table, the corresponding microphone gain is calculated by interpolation. For example, if the acquired focal length information is 0.375 mm, the microphone gain X can be calculated by linear interpolation between the neighboring table entries.
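The table lookup plus interpolation described above can be sketched as follows. Table 1 is not reproduced in this text, so the focal-length/gain pairs below are purely illustrative values (only the 0.2 mm → -10 dB entry comes from the text itself):

```python
import bisect

# Hypothetical focal-length -> microphone-gain table; illustrative only.
FOCAL_MM = [0.20, 0.30, 0.45, 0.60]   # focal length information (mm)
GAIN_DB  = [-10.0, -6.0, -2.0, 0.0]   # microphone gain (dB)

def microphone_gain(focal_mm):
    """Look up the gain for a focal length; linearly interpolate
    between table entries when there is no exact match."""
    if focal_mm <= FOCAL_MM[0]:
        return GAIN_DB[0]
    if focal_mm >= FOCAL_MM[-1]:
        return GAIN_DB[-1]
    i = bisect.bisect_right(FOCAL_MM, focal_mm)
    x0, x1 = FOCAL_MM[i - 1], FOCAL_MM[i]
    g0, g1 = GAIN_DB[i - 1], GAIN_DB[i]
    return g0 + (g1 - g0) * (focal_mm - x0) / (x1 - x0)
```

With these illustrative values, 0.375 mm falls midway between the 0.30 mm and 0.45 mm entries, so the interpolated gain lies midway between their gains.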
  • the distance between the current user and the display device (camera) and the corresponding Image distance may also be collected as statistics to establish a preset function model of microphone gain versus focal length information.
  • after the focal length information corresponding to the current image in the video call is determined, the preset function model of microphone gain and focal length information is obtained; the microphone gain can then be obtained from the focal length information and the preset function model.
  • an adaptive method may also be adopted to acquire the microphone gain according to the focal length information.
  • the controller receives the local image collected by the camera, and crops the local image according to the position of the person in it to generate a current image of a preset size. Because the camera adjusts the focal length during collection to obtain a sharp image of the person, the current image corresponds to a piece of focal length information.
  • when the person moves, the distance to the display device changes, but since the image transmitted to the opposite end is cropped from the local image, the opposite end may not see the movement of the person relative to the display device. The change in that distance, however, changes the volume of the sound data collected by the microphone, so adjusting the gain of the audio data according to the focal length corresponding to the current image can offset the volume change caused by the change in the distance between the person and the display device.
  • similarly, when the person moves, the distance to the display device changes, but because the image transmitted to the opposite end is the area corresponding to the face/human body cropped from the local image, the opposite end may not see the movement from the image. Since the focal length of the camera changes with the face/human body, adjusting the audio data gain according to the focal length corresponding to the current image can offset the volume change caused by the change in the distance between the person and the display device, thus ensuring the consistency of the sound and image sent to the peer device.
  • the microphone gain obtained from the focal length information is used to adjust the audio received by the microphone; that is, the obtained microphone gain value is used to perform gain processing on that audio, which helps ensure the stability of the audio volume received by the microphone.
  • the microphone gain is obtained from the focal length information, but no gain processing is applied to the speaker, so that the audio of the opposite end can be output normally.
  • the audio adjustment method provided by the present application includes: obtaining the focal length information of the current video image in the video call, and obtaining the microphone gain according to the focal length information.
  • the focal length information is obtained according to the automatic zoom processing of the video image
  • the corresponding microphone gain is obtained according to the focal length information
  • the current audio data in the video call is gain-processed using the obtained microphone gain.
  • the focal length information is used when processing the video image in the video call to determine the microphone gain, so that gain processing of the audio data is based on the distance between the person and the microphone during the video call. This reduces the volume fluctuation, caused by changes in the distance between the person and the TV, of the voice sent locally to the opposite end, keeps that volume basically unchanged, and ensures the stability of the sound during the video call.
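Once a gain in dB has been obtained, applying it to the captured audio is straightforward. A minimal sketch for 16-bit PCM samples follows (NumPy and the function name are assumptions; a real pipeline would also smooth gain changes over time to avoid audible steps):

```python
import numpy as np

def apply_gain_db(samples, gain_db):
    """Scale 16-bit PCM samples by a dB gain, clipping to the valid range."""
    factor = 10.0 ** (gain_db / 20.0)   # dB -> linear amplitude factor
    out = np.asarray(samples, dtype=np.float64) * factor
    return np.clip(out, -32768, 32767).astype(np.int16)
```

For example, a -20 dB gain scales amplitudes by a factor of 0.1, attenuating the louder near-camera speech so the volume heard at the opposite end stays roughly constant.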
  • FIG. 10 is a schematic flowchart of a video call method provided by an embodiment of the application.
  • the video call method provided by the embodiment of the present application includes:
  • the auxiliary chip transmits the video image processed by the automatic zoom to the main chip, and transmits the focal length information corresponding to the video image to the main chip.
  • the auxiliary chip receives the initial video image generated by the local image collected by the camera, and performs automatic zoom processing on the initial video image to generate a zoomed image.
  • the auxiliary chip determines whether the focal length information corresponding to the current image is the same as the focal length information corresponding to the current image at the previous moment. When they differ, the focal length information corresponding to the video image is transmitted to the main chip; when they are the same, the focal length information is not transmitted to the main chip, or an identifier indicating that the focal length information remains unchanged is transmitted to the main chip. In this case, the focal length information is generated based on the focal length information of the auto zoom.
  • the auxiliary chip can directly transmit the initial image collected by the camera to the main chip.
  • the camera performs physical zoom processing, so it is sufficient to crop the initial image to generate the current image.
  • the focal length information is generated according to the physical focal length information of the camera.
  • the auxiliary chip may perform automatic zoom processing on the initial image obtained by the physical zoom of the camera, and then generate a zoomed image and send it to the main chip.
  • the focal length information is generated according to the physical focal length information of the camera and the focal length information of the automatic zoom.
  • the auxiliary chip receives the initial video image generated by the local image collected by the camera, and automatically zooms the initial video image according to the position of the human face or human body and cuts it into a preset size to generate a zoomed image.
  • alternatively, the cropping may be performed by the main chip.
  • the camera can be a zoom camera or a fixed focus camera.
  • the initial video image is generated by controlling the focal length of the camera according to the position of the face or human body. At this time, the initial video image can be directly used as the zoomed image, or auto zoom processing can be continued on it.
  • S202 The main chip receives the video image and the focal length information.
  • the main chip recognizes the face or human body in the zoomed image, and crops the image according to the relative position of the face or human body in the image to generate the current image to send to the opposite device.
  • the focal length information of the current image is the focal length information of the image after zooming.
  • a face or human body is recognized in the initial image, and the image is cropped according to the relative position of the face or human body in the image to generate the current image to send to the opposite device.
  • the main chip obtains the microphone gain according to the focal length information, and performs gain processing on the audio corresponding to the video image according to the microphone gain, so as to reduce the fluctuation of the audio volume sent locally to the peer.
  • S204 The main chip synchronizes the gain-processed audio with the video image, and transmits the synchronized audio and video to the opposite device.
  • the synchronized audio and video are periodically encapsulated into data packets and sent to a peer display device, so that the peer display device parses and plays audio and video.
  • the auxiliary chip collects a video image through a camera, and obtains an auto-zoomable video image through an auto-zoom process during the acquisition of the video image, and outputs focal length information corresponding to the auto-zoomed video image.
  • the auxiliary chip transmits the video image processed by the automatic zoom to the main chip, and at the same time transmits the focal length information corresponding to the video image to the main chip.
  • the main chip and the auxiliary chip communicate through at least one of a network, a serial port, USB, and HDMI. Therefore, the auxiliary chip can transmit the video image and its corresponding focal length information to the main chip through the network, serial port, USB or HDMI.
  • the auxiliary chip can dynamically select any communication mode of network, serial port, USB, and HDMI based on the stability of communication between the auxiliary chip and the main chip, which is not specifically limited here.
  • when the auxiliary chip transmits the video image and its corresponding focal length information to the main chip through the network, serial port, USB or HDMI, the main chip receives the video image and the corresponding focal length information.
  • the video image processed by the automatic zoom is an image cropped from the image collected by the camera according to the focal length. In some embodiments, it is a face image cropped according to the tracking result of the automatic zoom.
  • the main chip collects audio information through a microphone, where the audio information collected by the microphone is synchronized with the video image collected by the camera.
  • after the main chip receives the video image and the corresponding focal length information transmitted by the auxiliary chip, it determines the microphone gain according to the focal length information, and then performs gain processing on the collected audio information using the determined microphone gain to obtain gain-processed audio.
  • after performing gain processing on the collected audio information, the main chip synchronizes the gain-processed audio with the video image to obtain audio-visual synchronized video call data, and transmits this data to the display frame of the opposite end to complete the video call.
  • the video call method provided in the embodiments of this application realizes the cooperation of the main chip and the auxiliary chip, and relieves the computing pressure on a single chip that would otherwise have to support both the instant communication functions of the video call (video codec, transmission) and real-time artificial intelligence algorithms (face recognition, lip-shape recognition).
  • the focal length information is obtained from the automatic zoom processing of the video image, the corresponding microphone gain is obtained from the focal length information, and the current audio data in the video call is gain-processed using the obtained microphone gain.
  • the distinction between the main chip and the auxiliary chip may not be set, and all the corresponding operations are directly executed by the controller.
  • the main chip acquiring the microphone gain according to the focal length information includes:
  • the main chip obtains the preset correspondence between microphone gain and focus information
  • the main chip acquiring the microphone gain according to the focal length information includes:
  • the main chip acquires a preset function model of microphone gain and focal length information
  • the steps of obtaining the microphone gain through the preset corresponding relationship between the microphone gain and the focus information or the preset function model of the microphone gain and the focus information can refer to the audio adjustment method provided in the foregoing embodiment, which will not be repeated here.
  • the acquisition of the microphone gain in the video call method provided in the embodiments of the present application is not limited to the preset correspondence between microphone gain and focal length information or the preset function model of microphone gain and focal length information; an adaptive method may also be used to obtain the microphone gain from the focal length information.
  • the transmitting the focal length information corresponding to the video image to the main chip includes:
  • when the focal length information corresponding to the video image is different from the focal length information corresponding to the video image at the previous moment, the focal length information corresponding to the video image is transmitted to the main chip.
  • when the auxiliary chip transmits the auto-zoom-processed video image to the main chip, it compares the focal length information corresponding to the video image at the current moment with that at the previous moment to determine whether the focal length information has changed. Only when it has changed is the focal length information corresponding to the video image transmitted to the main chip, which reduces the computational load on the main chip.
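The change-detection scheme above (transmit focal length information to the main chip only when it differs from the previous moment) can be sketched as follows. The class name is invented for this sketch, and the transport callable is a placeholder for whichever channel (network, serial port, USB, HDMI) is used:

```python
class FocalLengthNotifier:
    """Send focal-length info to the main chip only when it changes,
    reducing the main chip's processing load."""
    def __init__(self, send):
        self._send = send          # callable that transmits to the main chip
        self._last = None          # focal length at the previous moment

    def update(self, focal_info):
        if focal_info != self._last:
            self._last = focal_info
            self._send(focal_info)
            return True            # transmitted
        return False               # unchanged, not transmitted
```

When the value is unchanged, the sketch simply skips transmission; the variant in the text that instead sends an "unchanged" identifier would replace the `return False` branch with a second, cheaper send.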
  • a display device is also provided.
  • the display device provided by the embodiment of the present application includes a display, and the display is configured to display a user interface;
  • the main chip is connected to the display, and the auxiliary chip is connected to the main chip through at least one of a network, a serial port, USB and HDMI, wherein the main chip is configured to perform the audio adjustment method provided in the foregoing embodiments; or,
  • the main chip and the auxiliary chip are configured to cooperatively execute the video call method provided in the foregoing embodiment.
  • the present application provides an audio and video processing method, the method includes:
  • the obtaining the microphone gain according to the focal length information includes: obtaining a preset correspondence between the microphone gain and the focal length information; searching the preset correspondence according to the focal length information to obtain the microphone gain .
  • the obtaining microphone gain according to the focal length information includes: obtaining a preset function model of microphone gain and focal length information; and obtaining the microphone gain according to the focal length information and the preset function model of microphone gain and focal length information.
  • the method further includes: sending the current image to the peer device of the video call.
  • before the obtaining of the focal length information corresponding to the current image, the method further includes: determining whether it is currently in a video call state; if so, performing the step of obtaining the focal length information corresponding to the current image to process the audio; if it is not in a video call state, not performing that step.
  • the adjusting the audio according to the focal length information of the current image, and sending the adjusted audio and the current image to the peer device of the video call, includes: obtaining the focal length information corresponding to the current image; obtaining a microphone gain according to the focal length information and a preset correspondence, wherein different microphone gains in the preset correspondence correspond to different focal length information; adjusting the audio according to the acquired microphone gain value; and sending the adjusted audio and the current image to the peer device of the video call.
  • generating a video file based on the current image and the audio includes: generating buffer data by superimposing the current image and the audio; and upon receiving an input operation instruction to save the video, generating a video file from the buffer data.
  • for the audio adjustment method and the video call method, please refer to the above-mentioned embodiments; other features of the display device provided in the embodiments of the present application can be found in the display device 200 or other non-dual-chip display devices provided in the above-mentioned embodiments, and will not be repeated here.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Human Computer Interaction (AREA)
  • Controls And Circuits For Display Device (AREA)

Abstract

The present application provides an audio adjustment method, a video call method, and a display device, which are suitable for social TVs. The method comprises: obtaining focal length information corresponding to a current image during a video call; obtaining microphone gain according to the focal length information; and adjusting an audio received by a microphone according to the obtained microphone gain value. According to the present application, the focal length information is used to process video images during a video call to determine microphone gain, so that gain processing is performed on the audio data on the basis of the distance between a person and a microphone during the video call, thereby ensuring the sound stability during the video call.

Description

Audio and video processing method and display device
This application claims priority to Chinese patent application No. 201910497121.4, filed with the China National Intellectual Property Administration on June 10, 2019 and entitled "Microphone gain adjustment method, video chat method and display device", and to Chinese patent application No. 201910736428.5, filed on August 9, 2019 and entitled "Audio gain adjustment method, video chat method and display device", the entire contents of which are incorporated herein by reference.
Technical field
This application relates to the technical field of display devices, and in particular to an audio and video processing method and a display device.
Background
With the development of smart TVs, cameras are gradually being installed on smart TVs for voice and video calls, realizing the "watch and chat" function of the TV. Smart TVs are usually fixedly installed in relatively large spaces such as living rooms, and people often keep a certain distance from them during use; moreover, when people use a smart TV for voice and video calls, they often move around. If the person on the call moves, the sound becomes unstable: the person at the opposite end hears the volume fluctuate, and sometimes this even affects the progress of the call.
However, the main application scenario of voice and video calls at present is on handheld mobile devices such as mobile phones. To ensure voice quality during calls, such devices mostly perform noise reduction or gain processing on the sound. But because voice and video calls on handheld mobile devices are mostly near-field calls, their noise reduction or gain processing mostly targets sound at a fixed distance. A voice and video call on a smart TV, by contrast, is a far-field call: the distance between the speaker and the TV's microphone is usually relatively large and may change as the speaker moves. Therefore, the sound processing technology of handheld mobile devices cannot meet the demand for sound stability in the voice and video call scenarios of smart TVs.
Summary of the Invention
本申请提供了一种音频调节方法、视频通话方法及显示设备,保证视频通话中声音的稳定性。This application provides an audio adjustment method, a video call method, and a display device to ensure the stability of the sound in the video call.
In a first aspect, the present application provides an audio and video processing method. The method includes: receiving a current image generated from a local image collected by a camera, and receiving audio generated from local sound collected by a microphone; obtaining focal length information corresponding to the current image; obtaining a microphone gain according to the focal length information and a preset correspondence, where different microphone gains in the preset correspondence correspond to different focal length information; adjusting the audio according to the obtained microphone gain value; and sending the adjusted audio to the peer device of the video call.
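The first-aspect steps above can be sketched in Python. The specific focal-length values and gain factors below are invented for illustration only; the patent specifies just that the preset correspondence maps different focal length information to different microphone gains (intuitively, a longer focal length implies a more distant speaker and therefore a larger compensating gain).

```python
# Preset correspondence (hypothetical values): longer focal length -> larger gain,
# so a speaker who has moved away from the TV is amplified before sending.
FOCAL_TO_GAIN = {
    1.0: 1.0,   # wide angle, subject near the TV
    2.0: 1.8,   # medium zoom
    4.0: 3.2,   # long zoom, subject far away
}

def lookup_gain(focal_length):
    """Pick the gain for the preset focal length closest to the reported one."""
    closest = min(FOCAL_TO_GAIN, key=lambda f: abs(f - focal_length))
    return FOCAL_TO_GAIN[closest]

def adjust_audio(samples, focal_length):
    """Scale raw microphone samples by the gain matched to the current focal length."""
    gain = lookup_gain(focal_length)
    return [s * gain for s in samples]
```

In use, each captured audio frame would be passed through `adjust_audio` with the focal length reported for the co-captured image, and the result sent to the call peer; `lookup_gain`'s nearest-match lookup is one simple way to realize "different gains for different focal length information".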
In a second aspect, the present application provides an audio and video processing method. The method includes: receiving a current image generated from a local image collected by a camera, and receiving audio generated from local sound collected by a microphone; if the device is in a video call state, adjusting the audio according to the focal length information of the current image, and sending the adjusted audio and the current image to the peer device of the video call; if the device is in a recording state, generating a recording file from the current image and the audio without adjusting the audio according to the focal length information of the current image.
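A minimal sketch of the second-aspect branching, assuming a simple state flag and a dict-based gain table; the state names and returned action tags are hypothetical stand-ins for the device's real send and record paths.

```python
# Illustrative only: VIDEO_CALL / RECORDING and the action tags below are
# invented placeholders, not names from the patent.
VIDEO_CALL, RECORDING = "video_call", "recording"

def process_frame(state, image, audio, focal_length, gain_table):
    """Route one captured (image, audio) pair according to the device state."""
    if state == VIDEO_CALL:
        # Far-field call: compensate volume using the current frame's focal length.
        gain = gain_table.get(focal_length, 1.0)
        return ("send_to_peer", image, [s * gain for s in audio])
    # Local recording: keep the microphone audio untouched, preserving the
    # natural near/far loudness of the recorded scene.
    return ("write_recording_file", image, audio)
```

The point of the branch is that focal-length compensation is only desirable when a remote listener must hear a steady volume; a local recording should keep the scene's natural dynamics.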
In a third aspect, the present application provides an audio and video processing method. The method includes: an auxiliary chip transmits a video image, collected by the camera and processed by automatic zoom, to a main chip, and transmits the focal length information corresponding to the video image to the main chip; the main chip receives the video image and the focal length information; the main chip obtains a microphone gain according to the focal length information, and performs gain processing on the audio corresponding to the video image according to the microphone gain, so as to reduce fluctuation in the volume of the audio sent locally to the peer; the main chip synchronizes the gain-processed audio with the video image, and transmits the synchronized audio and video to the display of the peer device.
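The main-chip side of the third-aspect pipeline can be sketched as follows. The inter-chip transport and real A/V muxing are out of scope; pairing each video frame with its co-captured audio frame via `zip` is a deliberate simplification standing in for timestamp-based synchronization, and the gain table is hypothetical.

```python
# Illustrative sketch: the auxiliary chip is modeled as an iterable of
# (image, focal_length) pairs; real hardware would deliver these over an
# inter-chip interface alongside the microphone's audio frames.
def main_chip_process(aux_frames, mic_audio_frames, gain_table):
    """Apply focal-length-matched gain to each audio frame and pair it with
    its video frame, approximating audio/video synchronization."""
    synced = []
    for (image, focal), audio in zip(aux_frames, mic_audio_frames):
        gain = gain_table.get(focal, 1.0)      # gain looked up from focal length
        boosted = [s * gain for s in audio]    # evens out volume swings at the peer
        synced.append({"video": image, "audio": boosted})
    return synced                              # would then be sent to the peer device
```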
In a fourth aspect, the present application provides a display device. The display device includes: a camera; a microphone; and a controller configured to: receive a current image generated from a local image collected by the camera, and receive audio generated from local sound collected by the microphone; obtain focal length information corresponding to the current image; obtain a microphone gain according to the focal length information and a preset correspondence, where different microphone gains in the preset correspondence correspond to different focal length information; adjust the audio according to the obtained microphone gain value; and send the adjusted audio to the peer device of the video call.
In a fifth aspect, the present application provides a display device. The display device includes: a camera; a microphone; and a controller configured to: receive a current image generated from a local image collected by the camera, and receive audio generated from local sound collected by the microphone; if the device is in a video call state, adjust the audio according to the focal length information of the current image, and send the adjusted audio and the current image to the peer device of the video call; if the device is in a non-video-call state, generate an audio and video file from the current image and the audio without adjusting the audio according to the focal length information of the current image.
In a sixth aspect, the present application provides a display device. The display device includes: a camera; a microphone; and a main chip and an auxiliary chip connected to each other. The auxiliary chip receives a local image collected by the camera, transmits the current image generated from the local image by automatic zoom processing to the main chip, and transmits the focal length information corresponding to the current image to the main chip. The main chip receives the current image and the focal length information, obtains a microphone gain according to the focal length information, and performs gain processing on the audio corresponding to the current image according to the microphone gain, so as to reduce fluctuation in the volume of the audio sent locally to the peer. The main chip synchronizes the gain-processed audio with the video image, and transmits the synchronized audio and video to the display device of the peer.
Brief Description of the Drawings
In order to explain the embodiments of the present application more clearly, the drawings needed in the embodiments are briefly introduced below. Obviously, a person of ordinary skill in the art can obtain other drawings from these drawings without creative effort.
图1中示例性示出了根据实施例中显示设备与控制装置之间操作场景的示意图;Fig. 1 exemplarily shows a schematic diagram of an operation scenario between a display device and a control device according to an embodiment;
图2中示例性示出了根据实施例中控制装置100的硬件配置框图;FIG. 2 exemplarily shows a block diagram of the hardware configuration of the control device 100 according to the embodiment;
图3中示例性示出了根据实施例中显示设备200的硬件配置框图;FIG. 3 exemplarily shows a block diagram of the hardware configuration of the display device 200 according to the embodiment;
图4中示例性示出了根据图3显示设备200的硬件架构框图;FIG. 4 exemplarily shows a block diagram of the hardware architecture of the display device 200 according to FIG. 3;
图5中示例性示出了根据实施例中显示设备200的功能配置示意图;FIG. 5 exemplarily shows a schematic diagram of the functional configuration of the display device 200 according to the embodiment;
图6a中示例性示出了根据实施例中显示设备200中软件配置示意图;Fig. 6a exemplarily shows a schematic diagram of software configuration in the display device 200 according to the embodiment;
图6b中示例性示出了根据实施例中显示设备200中应用程序的配置示意图;FIG. 6b exemplarily shows a configuration diagram of an application program in the display device 200 according to the embodiment;
图7中示例性示出了根据实施例中显示设备200中用户界面的示意图;FIG. 7 exemplarily shows a schematic diagram of the user interface in the display device 200 according to the embodiment;
图8中示例性示出了根据实施例中音频调节方法的流程示意图;FIG. 8 exemplarily shows a schematic flowchart of an audio adjustment method according to an embodiment;
图9中示例性示出了根据实施例中焦距信息的计算原理图;FIG. 9 exemplarily shows the calculation principle diagram of the focal length information according to the embodiment;
图10中示例性示出了根据实施例中视频通话方法的流程示意图。Fig. 10 exemplarily shows a schematic flowchart of a video call method according to an embodiment.
Detailed Description of the Embodiments
In order to make the purpose, embodiments, and advantages of the present application clearer, the exemplary embodiments of the present application are described clearly and completely below with reference to the accompanying drawings. Obviously, the described exemplary embodiments are only some of the embodiments of the present application, not all of them.
For ease of use, a display device is usually provided with various external device interfaces so that different peripheral devices or cables can be connected to realize corresponding functions. When a high-definition camera is connected to an interface of the display device, if the hardware system of the display device lacks a hardware interface for receiving the source stream of a high-pixel camera, the data received from the camera cannot be presented on the display screen of the display device.
Moreover, constrained by its hardware structure, the hardware system of a traditional display device supports only one channel of hard-decoding resources, and usually supports video decoding at up to 4K resolution at most. Therefore, to enable video chat while watching internet TV without reducing the clarity of the network video picture, the hard-decoding resources (usually the GPU in the hardware system) must be used to decode the network video; in that case, the video chat picture can only be processed by soft decoding on the general-purpose processor (e.g., the CPU) of the hardware system.
Using soft decoding to process the video chat picture greatly increases the data-processing burden on the CPU; when that burden becomes too heavy, the picture may freeze or stutter. Further, limited by the CPU's data-processing capability, multi-channel video calls usually cannot be realized when the video chat picture is soft-decoded by the CPU, so when a user wants to video chat with multiple other users simultaneously in the same chat scene, access may be blocked.
本申请一些实施方式公开了一种双硬件系统架构,以实现多路视频聊天数据(至少一路本地视频)。Some embodiments of the present application disclose a dual hardware system architecture to implement multiple channels of video chat data (at least one local video).
下面首先结合附图对本申请所涉及的概念进行说明。在此需要指出的是,以下对各个概念的说明,仅为了使本申请的内容更加容易理解,并不表示对本申请保护范围的限定。The following first describes the concepts involved in the present application with reference to the drawings. It should be pointed out here that the following description of each concept is only to make the content of this application easier to understand, and does not mean to limit the protection scope of this application.
The term "module" used in the embodiments of the present application may refer to any known or later-developed hardware, software, firmware, artificial intelligence, fuzzy logic, or combination of hardware and/or software code capable of performing the functions associated with that element.
本申请各实施例中使用的术语″遥控器″,是指电子设备(如本申请中公开的显示设备)的一个组件,该组件通常可在较短的距离范围内无线控制电子设备。该组件一般可以使用红外线和/或射频(RF)信号和/或蓝牙与电子设备连接,也可以包括WiFi、无线USB、蓝牙、动作传感器等功能模块中的至少一个。例如:手持式触摸遥控器,是以触摸屏中用户界面取代一般遥控装置中的大部分物理内置硬键。The term "remote control" used in the various embodiments of this application refers to a component of an electronic device (such as the display device disclosed in this application), which can generally control the electronic device wirelessly within a short distance. The component can generally be connected to an electronic device using infrared and/or radio frequency (RF) signals and/or Bluetooth, and can also include at least one of functional modules such as WiFi, wireless USB, Bluetooth, and motion sensors. For example, a handheld touch remote control uses a user interface in a touch screen to replace most of the physical built-in hard keys in general remote control devices.
本申请各实施例中使用的术语″手势″,是指用户通过一种手型的变化或手部运动等动作,用于表达预期想法、动作、目的/或结果的用户行为。The term "gesture" used in the embodiments of the present application refers to a user's behavior through a change of hand shape or hand movement to express expected ideas, actions, goals, and/or results.
The term "hardware system" used in the embodiments of the present application may include at least one physical component having computing, control, storage, input, and output functions, composed of mechanical, optical, electrical, or magnetic devices such as integrated circuits (IC) and printed circuit boards (PCB).
图1中示例性示出了根据实施例中显示设备与控制装置之间操作场景的示意图。如图1所示,用户可通过控制装置100来操作显示设备200。Fig. 1 exemplarily shows a schematic diagram of an operation scenario between a display device and a control device according to an embodiment. As shown in FIG. 1, the user can operate the display device 200 by controlling the device 100.
The control device 100 may be a remote controller 100A, which can communicate with the display device 200 through at least one of infrared protocol communication, Bluetooth protocol communication, ZigBee protocol communication, or other short-range communication methods, and is used to control the display device 200 wirelessly or by other wired methods. The user can control the display device 200 by inputting user instructions through keys on the remote controller, voice input, control panel input, and so on. For example, the user can input corresponding control instructions through the volume up/down keys, channel control keys, up/down/left/right movement keys, voice input keys, menu key, and power key on the remote controller to control the functions of the display device 200.
The control device 100 may also be a smart device, such as a mobile terminal 100B, a tablet computer, a computer, or a notebook computer, which can communicate with the display device 200 through at least one of a local area network (LAN), a wide area network (WAN), a wireless local area network (WLAN), or other networks, and control the display device 200 through an application corresponding to the display device 200, for example, an application running on the smart device. The application can provide the user with various controls through an intuitive user interface (UI) on the screen associated with the smart device.
For example, both the mobile terminal 100B and the display device 200 may be installed with software applications, so that connection and communication between the two can be realized through a network communication protocol, thereby achieving one-to-one control operation and data communication. For instance, the mobile terminal 100B and the display device 200 can establish a control instruction protocol, and the remote control keyboard can be synchronized to the mobile terminal 100B, so that the display device 200 is controlled through the user interface on the mobile terminal 100B; alternatively, the audio and video content displayed on the mobile terminal 100B can be transmitted to the display device 200 to realize a synchronous display function.
如图1所示,显示设备200还可与服务器300通过多种通信方式进行数据通信。在本申请各个实施例中,可允许显示设备200通过局域网、无线局域网或其他网络中的至少一个与服务器300进行通信连接。服务器300可以向显示设备200提供各种内容和互动。As shown in FIG. 1, the display device 200 can also communicate with the server 300 through multiple communication methods. In various embodiments of the present application, the display device 200 may be allowed to communicate with the server 300 through at least one of a local area network, a wireless local area network, or other networks. The server 300 may provide various contents and interactions to the display device 200.
示例的,显示设备200通过发送和接收信息,以及电子节目指南(EPG,Electronic Program Guide)互动,接收软件程序更新,或访问远程储存的数字媒体库。服务器300可以是一组,也可以是多组,可以是一类或多类服务器。通过服务器300提供视频点播和广告服务等其他网络服务内容。Illustratively, the display device 200 transmits and receives information, interacts with an Electronic Program Guide (EPG, Electronic Program Guide), receives software program updates, or accesses a remotely stored digital media library. The server 300 may be a group or multiple groups, and may be one or more types of servers. The server 300 provides other network service content such as video on demand and advertising services.
显示设备200,可以是液晶显示器、OLED(Organic Light Emitting Diode)显示器、投影显示设备、智能电视。具体显示设备类型,尺寸大小和分辨率等不作限定,本领技术人员可以理解的是,显示设备200可以根据需要做性能和配置上的一些改变。The display device 200 may be a liquid crystal display, an OLED (Organic Light Emitting Diode) display, a projection display device, or a smart TV. The specific display device type, size, resolution, etc. are not limited, and those skilled in the art can understand that the display device 200 can make some changes in performance and configuration as required.
显示设备200除了提供广播接收电视功能之外,还可以附加提供计算机支持功能的智能网络电视功能。示例的包括,网络电视、智能电视、互联网协议电视(IPTV)等。In addition to providing the broadcast receiving TV function, the display device 200 may additionally provide a smart network TV function that provides a computer support function. Examples include Internet TV, Smart TV, Internet Protocol TV (IPTV) and so on.
如图1所述,显示设备上可以连接或设置有摄像头,用于将摄像头拍摄到的画面呈现在本显示设备或其他显示设备的显示界面上,以实现用户之间的交互聊天。在一些实施方式中,摄像头拍摄到的画面可在显示设备上全屏显示、半屏显示、或者显示任意可选区域。As shown in Figure 1, the display device may be connected or provided with a camera, which is used to present the picture captured by the camera on the display interface of the display device or other display devices to realize interactive chats between users. In some embodiments, the picture captured by the camera may be displayed on the display device in full screen, half screen, or in any selectable area.
As an optional connection method, the camera is connected to the rear shell of the display through a connecting plate and is fixedly installed in the middle of the upper side of the rear shell. As an installable alternative, it can be fixedly installed at any position on the rear shell of the display, as long as its image collection area is not blocked by the rear shell; for example, the image collection area faces the same direction as the display.
As another optional connection method, the camera is connected to the rear shell of the display through a connecting plate or another conceivable connector so that it can be raised and lowered, with a lifting motor installed on the connector. When the user wants to use the camera, or an application needs to use the camera, it rises above the display; when the camera is not needed, it can be retracted behind the rear shell to protect it from damage.
作为一种实施例,本申请所采用的摄像头可以为1600万像素,以达到超高清显示目的。在实际使用中,也可采用比1600万像素更高或更低的摄像头。As an embodiment, the camera used in this application may have 16 million pixels to achieve the purpose of ultra-high-definition display. In actual use, a camera with higher or lower than 16 million pixels can also be used.
当显示设备上安装有摄像头以后,显示设备不同应用场景所显示的内容可得到多种不同方式的融合,从而达到传统显示设备无法实现的功能。When a camera is installed on the display device, the content displayed in different application scenarios of the display device can be merged in many different ways, so as to achieve functions that cannot be achieved by traditional display devices.
示例性的,用户可以在边观看视频节目的同时,与至少一位其他用户进行视频聊天。视频节目的呈现可作为背景画面,视频聊天的窗口显示在背景画面之上。形象的,可以称该功能为″边看边聊″。Exemplarily, the user can video chat with at least one other user while watching a video program. The presentation of the video program can be used as the background picture, and the video chat window is displayed on the background picture. Visually, you can call this function "watch and chat".
可选的,在″边看边聊″的场景中,在观看直播视频或网络视频的同时,跨终端的进行 至少一路的视频聊天。Optionally, in the scenario of "watching while chatting", while watching live video or network video, at least one video chat is conducted across terminals.
另一示例中,用户可以在边进入教育应用学习的同时,与至少一位其他用户进行视频聊天。例如,学生在学习教育应用程序中内容的同时,可实现与老师的远程互动。形象的,可以称该功能为″边学边聊″。In another example, the user can video chat with at least one other user while entering the education application for learning. For example, students can realize remote interaction with teachers while learning content in educational applications. Visually, you can call this function "learning and chatting".
另一示例中,用户在玩纸牌游戏时,与进入游戏的玩家进行视频聊天。例如,玩家在进入游戏应用参与游戏时,可实现与其他玩家的远程互动。形象的,可以称该功能为″边看边玩″。In another example, when a user is playing a card game, a video chat is conducted with players entering the game. For example, when a player enters a game application to participate in a game, it can realize remote interaction with other players. Visually, you can call this function "watch and play".
可选的,游戏场景与视频画面进行融合,将视频画面中人像进行抠图,显示在游戏画面中,提升用户体验。Optionally, the game scene is integrated with the video picture, and the portrait in the video picture is cut out and displayed on the game picture to improve user experience.
可选的,在体感类游戏中(如打球类、拳击类、跑步类、跳舞类等),通过摄像头获取人体姿势和动作,肢体检测和追踪、人体骨骼关键点数据的检测,再与游戏中动画进行融合,实现如体育、舞蹈等场景的游戏。Optionally, in somatosensory games (such as ball games, boxing games, running games, dancing games, etc.), human body postures and movements are acquired through the camera, body detection and tracking, and the detection of human bone key points data, and then the game Animations are integrated to realize games such as sports and dance scenes.
另一示例中,用户可以在K歌应用中,与至少一位其他用户进行视频和语音的交互。形象的,可以称该功能为″边看边唱″。优选的,当至少一位用户在聊天场景进入该应用时,可多个用户共同完成一首歌的录制。In another example, the user can interact with at least one other user in video and voice in the K song application. Visually, you can call this function "watch and sing". Preferably, when at least one user enters the application in the chat scene, multiple users can jointly complete the recording of a song.
另一个示例中,用户可在本地打开摄像头获取图片和视频,形象的,可以称该功能为″照镜子″。In another example, the user can turn on the camera locally to obtain pictures and videos, which is vivid, and this function can be called "look in the mirror".
在另一些示例中,还可以再增加更多功能或减少上述功能。本申请对该显示设备的功能不作具体限定。In other examples, more functions can be added or the aforementioned functions can be reduced. This application does not specifically limit the function of the display device.
Fig. 2 exemplarily shows a configuration block diagram of the control device 100 according to an exemplary embodiment. As shown in FIG. 2, the control device 100 includes a controller 110, a communicator 130, a user input/output interface 140, a memory 190, and a power supply 180.
The control device 100 is configured to control the display device 200, receive input operation instructions from the user, and convert the operation instructions into instructions that the display device 200 can recognize and respond to, acting as an intermediary for interaction between the user and the display device 200. For example, when the user operates the channel up/down keys on the control device 100, the display device 200 responds to the channel up/down operation.
在一些实施方式中,控制装置100可是一种智能设备。如:控制装置100可根据用户需求安装控制显示设备200的各种应用。In some embodiments, the control device 100 may be a smart device. For example, the control device 100 can install various applications for controlling the display device 200 according to user requirements.
在一些实施方式中,如图1所示,移动终端100B或其他智能电子设备,可在安装操控显示设备200的应用之后,起到控制装置100类似功能。如:用户可以通过安装应用,在移动终端100B或其他智能电子设备上可提供的图形用户界面的各种功能键或虚拟按钮,以实现控制装置100实体按键的功能。In some embodiments, as shown in FIG. 1, the mobile terminal 100B or other smart electronic devices can perform similar functions to the control device 100 after installing an application for controlling the display device 200. For example, the user can install various function keys or virtual buttons of the graphical user interface that can be provided on the mobile terminal 100B or other smart electronic devices by installing applications to realize the function of the physical keys of the control device 100.
控制器110包括处理器112、RAM113和ROM114、通信接口以及通信总线中的至少一个。控制器110用于控制控制装置100的运行和操作,以及内部各部件之间通信协作以及外部和内部的数据处理功能。The controller 110 includes at least one of a processor 112, a RAM 113 and a ROM 114, a communication interface, and a communication bus. The controller 110 is used to control the operation and operation of the control device 100, as well as the communication and cooperation between internal components, and external and internal data processing functions.
通信器130在控制器110的控制下,实现与显示设备200之间控制信号和数据信号的通信。如:将接收到的用户输入信号发送至显示设备200上。通信器130可包括WIFI模块131、蓝牙模块132、NFC模块133等通信模块中至少一种。The communicator 130 realizes communication of control signals and data signals with the display device 200 under the control of the controller 110. For example, the received user input signal is sent to the display device 200. The communicator 130 may include at least one of communication modules such as a WIFI module 131, a Bluetooth module 132, and an NFC module 133.
用户输入/输出接口140,其中,输入接口包括麦克风141、触摸板142、传感器143、按键144等输入接口中至少一者。如:用户可以通过语音、触摸、手势、按压等动作实现用户指令输入功能,输入接口通过将接收的模拟信号转换为数字信号,以及数字信号转换为相应指令信号,发送至显示设备200。The user input/output interface 140, wherein the input interface includes at least one of input interfaces such as a microphone 141, a touch panel 142, a sensor 143, and a button 144. For example, the user can implement the user instruction input function through voice, touch, gesture, pressing and other actions. The input interface converts the received analog signal into a digital signal and the digital signal into a corresponding instruction signal, which is sent to the display device 200.
输出接口包括将接收的用户指令发送至显示设备200的接口。在一些实施方式中,可以是红外接口,也可以是射频接口。如:红外信号接口时,需要将用户输入指令按照红外控制协议转化为红外控制信号,经红外发送模块进行发送至显示设备200。再如:射频信号接口时,需将用户输入指令转化为数字信号,然后按照射频控制信号调制协议进行调制后,由射频发送端子发送至显示设备200。The output interface includes an interface for sending the received user instruction to the display device 200. In some embodiments, it may be an infrared interface or a radio frequency interface. For example, in the case of an infrared signal interface, the user input instruction needs to be converted into an infrared control signal according to the infrared control protocol, and sent to the display device 200 via the infrared sending module. For another example, in the case of a radio frequency signal interface, a user input instruction needs to be converted into a digital signal, which is then modulated according to the radio frequency control signal modulation protocol, and then sent to the display device 200 by the radio frequency transmitting terminal.
在一些实施方式中,控制装置100包括通信器130和输出接口中至少一者。控制装置100中配置通信器130,如:WIFI、蓝牙、NFC等模块,可将用户输入指令通过WIFI协议、或蓝牙协议、或NFC协议编码,发送至显示设备200.In some embodiments, the control device 100 includes at least one of a communicator 130 and an output interface. The control device 100 is configured with a communicator 130, such as: WIFI, Bluetooth, NFC and other modules, which can encode user input instructions through the WIFI protocol, or Bluetooth protocol, or NFC protocol, and send to the display device 200.
存储器190,用于在控制器110的控制下存储驱动和控制控制装置100的各种运行程序、数据和应用。存储器190,可以存储用户输入的各类控制信号指令。The memory 190 is used to store various operating programs, data and applications for driving and controlling the control device 100 under the control of the controller 110. The memory 190 can store various control signal instructions input by the user.
The power supply 180 is used to provide operating power support for the elements of the control device 100 under the control of the controller 110. It may be a battery and related control circuits.
图3中示例性示出了根据示例性实施例中显示设备200中硬件系统的硬件配置框图。FIG. 3 exemplarily shows a hardware configuration block diagram of a hardware system in the display device 200 according to an exemplary embodiment.
在采用双硬件系统架构时,硬件系统的机构关系可以图3所示。为便于表述以下将双硬件系统架构中的一个硬件系统称为第一硬件系统或A系统、A芯片,并将另一个硬件系统称为第二硬件系统或N系统、N芯片。A芯片包含A芯片的控制器及各类接口,N芯片则包含N芯片的控制器及各类接口。A芯片及N芯片中可以各自安装有独立的操作系统,从而使显示设备200中存在两个在独立但又存在相互关联的子系统。When the dual hardware system architecture is adopted, the mechanism relationship of the hardware system can be shown in Figure 3. For ease of description, one hardware system in the dual hardware system architecture is referred to as the first hardware system or A system, A chip, and the other hardware system is referred to as the second hardware system or N system, N chip. The A chip includes the controller and various interfaces of the A chip, and the N chip includes the controller and various interfaces of the N chip. An independent operating system may be installed in the A chip and the N chip, so that there are two independent but interrelated subsystems in the display device 200.
如图3所示,A芯片与N芯片之间可以通过多个不同类型的接口实现连接、通信及供电。A芯片与N芯片之间接口的接口类型可以包括通用输入输出接口(General-purpose  input/output,GPIO)、USB接口、HDMI接口、UART接口等中的至少一个。A芯片与N芯片之间可以使用这些接口中的一个或多个进行通信或电力传输。例如图3所示,在双硬件系统架构下,可以由外接的电源(power)为N芯片供电,而A芯片则可以不由外接电源,而由N芯片供电。As shown in Figure 3, the A chip and the N chip can realize connection, communication and power supply through multiple different types of interfaces. The interface type of the interface between the A chip and the N chip may include at least one of general-purpose input/output (GPIO), USB interface, HDMI interface, UART interface, and the like. One or more of these interfaces can be used between the A chip and the N chip for communication or power transmission. For example, as shown in Figure 3, in the dual hardware system architecture, the N chip can be powered by an external power source, and the A chip can be powered by the N chip instead of the external power source.
除用于与N芯片进行连接的接口之外,A芯片还可以包含用于连接其他设备或组件的接口,例如图3中所示的用于连接摄像头(Camera)的MIPI接口,蓝牙接口等。In addition to the interface for connecting with the N chip, the A chip may also include interfaces for connecting other devices or components, such as the MIPI interface for connecting to a camera (Camera) shown in FIG. 3, a Bluetooth interface, etc.
Similarly, in addition to the interface for connecting with the A chip, the N chip may also include a VBY interface for connecting to the display screen TCON (Timer Control Register), an I2S interface for connecting a power amplifier (AMP) and a speaker, and at least one of an IR/Key interface, a USB interface, a WiFi interface, a Bluetooth interface, an HDMI interface, a Tuner interface, and so on.
The dual hardware system architecture of the present application is further described below in conjunction with Figure 4. It should be noted that Figure 4 is merely an exemplary illustration of the dual hardware system architecture of the present application and does not limit the present application. In practical applications, both hardware systems may contain more or fewer hardware components or interfaces as required.
Figure 4 exemplarily shows a block diagram of the hardware architecture of the display device 200 according to Figure 3. As shown in Figure 4, the hardware system of the display device 200 may include the A chip, the N chip, and modules connected to the A chip or the N chip through various interfaces.
The N chip may include at least one of a tuner-demodulator 220, a communicator 230, an external device interface 250, a controller 210, a memory 290, a user input interface, a video processor 260-1, an audio processor 260-2, a display 280, an audio output interface 272, and a power supply. In other embodiments, the N chip may also include more or fewer modules.
The tuner-demodulator 220 is configured to perform modulation and demodulation processing such as amplification, mixing, and resonance on broadcast television signals received in a wired or wireless manner, so as to demodulate, from multiple wireless or cable broadcast television signals, the audio and video signals carried on the frequency of the television channel selected by the user, as well as additional information (for example, an EPG data signal). Depending on the television signal broadcasting system, the signal path of the tuner-demodulator 220 may vary, such as terrestrial broadcasting, cable broadcasting, satellite broadcasting, or Internet broadcasting; depending on the modulation type, the signal modulation method may be digital or analog; and depending on the type of television signal received, the tuner-demodulator 220 may demodulate analog signals and/or digital signals.
The tuner-demodulator 220 is further configured to respond, according to the user's selection and under the control of the controller 210, to the television channel frequency selected by the user and the television signal carried on that frequency.
In some other exemplary embodiments, the tuner-demodulator 220 may also be located in an external device, such as an external set-top box. In this case, the set-top box outputs television audio and video signals after modulation and demodulation, which are input into the display device 200 through the external device interface 250.
The communicator 230 is a component for communicating with external devices or external servers according to various types of communication protocols. For example, the communicator 230 may include a WiFi module 231, a Bluetooth communication protocol module 232, a wired Ethernet communication protocol module 233, and other network communication protocol modules or near field communication protocol modules such as an infrared communication protocol module.
The display device 200 may establish control signal and data signal connections with an external control device or a content providing device through the communicator 230. For example, the communicator may receive a control signal from the remote controller 100 under the control of the controller.
The external device interface 250 is a component that provides data transmission between the controller 210 of the N chip, the A chip, and other external devices. The external device interface may be connected in a wired/wireless manner to external devices such as a set-top box, a game device, or a notebook computer, and may receive data from the external devices, such as video signals (for example, moving images), audio signals (for example, music), and additional information (for example, EPG data).
The external device interface 250 may include any one or more of a high-definition multimedia interface (HDMI) terminal 251, a composite video blanking synchronization (CVBS) terminal 252, an analog or digital component terminal 253, a universal serial bus (USB) terminal 254, and a red-green-blue (RGB) terminal (not shown in the figure). The present application does not limit the number and types of external device interfaces.
The controller 210 controls the operation of the display device 200 and responds to user operations by running various software control programs (such as an operating system and/or various application programs) stored in the memory 290.
As shown in Figure 4, the controller 210 includes at least one of a read-only memory ROM 213, a random access memory RAM 214, a graphics processor 216, a CPU processor 212, a communication interface 218, and a communication bus. The ROM 213, the RAM 214, the graphics processor 216, the CPU processor 212, and the communication interface 218 are connected via the bus.
The ROM 213 is configured to store various system startup instructions. For example, upon receiving a power-on signal, the power supply of the display device 200 starts, and the CPU processor 212 runs the system startup instructions in the ROM and copies the temporary data of the operating system stored in the memory 290 into the RAM 214 to begin running and starting the operating system. After the operating system has started, the CPU processor 212 copies the temporary data of the various application programs in the memory 290 into the RAM 214, and then begins running and starting the various application programs.
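The two-stage startup order described above (the ROM boot instructions first bring the operating system from the memory 290 into the RAM 214, and application programs are copied and started only after the operating system is up) can be sketched as follows. This is a minimal illustrative model, not the actual firmware; the names `boot`, `memory_290`, and `ram_214` are hypothetical.

```python
# Minimal sketch of the two-stage startup sequence described above.
# All names are hypothetical; only the order of operations is modeled.

def boot(memory: dict, ram: dict) -> list:
    """Copy the OS from persistent memory into RAM, start it, then start apps."""
    events = []
    # Stage 1: ROM startup instructions copy the OS data into RAM and start it.
    ram["os"] = memory["os"]
    events.append("os_started")
    # Stage 2: only after the OS is running are applications copied and started.
    for app in memory["apps"]:
        ram.setdefault("apps", []).append(app)
        events.append(f"{app}_started")
    return events

memory_290 = {"os": "os-image", "apps": ["launcher", "browser"]}
ram_214 = {}
print(boot(memory_290, ram_214))  # the OS event always precedes every app event
```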
The graphics processor 216 is configured to generate various graphics objects, such as icons, operation menus, and graphics displayed in response to user input instructions. It includes an arithmetic unit, which performs operations upon receiving the various interactive instructions input by the user and displays various objects according to display attributes, and a renderer, which renders the various objects obtained by the arithmetic unit and displays the rendered result on the display 280.
The CPU processor 212 is configured to execute the operating system and application program instructions stored in the memory 290, and to execute various application programs, data, and content according to the various interactive instructions received from external inputs, so as to finally display and play various audio and video content.
In some exemplary embodiments, the CPU processor 212 may include multiple processors. The multiple processors may include one main processor and one or more sub-processors. The main processor is configured to perform some operations of the display device 200 in a pre-power-on mode and/or operations of displaying images in a normal mode. The one or more sub-processors are configured to perform operations in states such as a standby mode.
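The division of labor between the main processor and the sub-processors described above can be sketched as a simple dispatch by device state. This is an assumption-laden illustration (the mode names and `select_processor` function are invented for this sketch), not the patent's implementation.

```python
# Hypothetical sketch: route an operation to the main processor in
# pre-power-on/normal modes, and to a sub-processor in standby mode.

def select_processor(mode: str) -> str:
    if mode in ("pre_power_on", "normal"):
        return "main_processor"
    if mode == "standby":
        return "sub_processor"
    raise ValueError(f"unknown mode: {mode}")

print(select_processor("normal"))   # main processor drives image display
print(select_processor("standby"))  # a sub-processor handles standby work
```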
The communication interface may include a first interface 218-1 through an n-th interface 218-n. These interfaces may be network interfaces connected to external devices via a network.
The controller 210 may control the overall operation of the display device 200. For example, in response to receiving a user command for selecting a UI object to be displayed on the display 280, the controller 210 may perform an operation related to the object selected by the user command.
The object may be any one of the selectable objects, such as a hyperlink or an icon. Operations related to the selected object include, for example, displaying a hyperlinked page, document, or image, or executing the program corresponding to the icon. The user command for selecting the UI object may be a command input through various input devices connected to the display device 200 (for example, a mouse, a keyboard, or a touch pad), or a voice command corresponding to speech uttered by the user.
The memory 290 stores various software modules for driving and controlling the display device 200, including at least one of a base module, a detection module, a communication module, a display control module, a browser module, and various service modules.
The base module is an underlying software module for signal communication between the various hardware components in the display device 200 and for sending processing and control signals to upper-layer modules. The detection module is a management module for collecting various information from various sensors or the user input interface, and for performing digital-to-analog conversion and analysis management.
For example, a voice recognition module includes a voice parsing module and a voice instruction database module. The display control module is a module for controlling the display 280 to display image content, and may be used to play information such as multimedia image content and UI interfaces. The communication module is a module for control and data communication with external devices. The browser module is a module for performing data communication between browsing servers. The service modules are modules for providing various services and various application programs.
At the same time, the memory 290 is also used to store received external data and user data, images of the various items in various user interfaces, visual effect diagrams of focus objects, and the like.
The user input interface is configured to send a user's input signal to the controller 210, or to transmit a signal output from the controller to the user. Exemplarily, a control device (for example, a mobile terminal or a remote controller) may send input signals input by the user, such as a power switch signal, a channel selection signal, or a volume adjustment signal, to the user input interface, which then forwards them to the controller; alternatively, the control device may receive output signals such as audio, video, or data that have been processed by the controller and output through the user input interface, and may display the received output signals or output them in audio or vibration form.
In some embodiments, the user may input a user command through a graphical user interface (GUI) displayed on the display 280, in which case the user input interface receives the user input command through the GUI. Alternatively, the user may input a user command through a specific sound or gesture, in which case the user input interface receives the user input command by recognizing the sound or gesture through a sensor.
The video processor 260-1 is configured to receive a video signal and, according to the standard codec protocol of the input signal, perform video data processing such as decompression, decoding, scaling, noise reduction, frame rate conversion, resolution conversion, and image synthesis, so as to obtain a video signal that can be directly displayed or played on the display 280.
Exemplarily, the video processor 260-1 includes a demultiplexing module, a video decoding module, an image synthesis module, a frame rate conversion module, a display formatting module, and the like.
The demultiplexing module is configured to demultiplex an input audio and video data stream. For example, if an MPEG-2 stream is input, the demultiplexing module demultiplexes it into a video signal, an audio signal, and the like.
The video decoding module is configured to process the demultiplexed video signal, including decoding, scaling, and the like.
The image synthesis module, such as an image synthesizer, is configured to superimpose and mix a GUI signal, generated by a graphics generator according to user input or by itself, with the scaled video image, so as to generate an image signal for display.
The frame rate conversion module is configured to convert the frame rate of the input video, for example converting the frame rate of an input 24 Hz, 25 Hz, 30 Hz, or 60 Hz video into a frame rate of 60 Hz, 120 Hz, or 240 Hz, where the input frame rate may be related to the source video stream and the output frame rate may be related to the refresh rate of the display. For input of common formats, this may be implemented by, for example, frame insertion.
The display formatting module is configured to change the signal output by the frame rate conversion module into a signal conforming to the display format of, for example, the display, such as performing format conversion on the signal output by the frame rate conversion module to output an RGB data signal.
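The frame-rate conversion step above is described as being implementable by frame insertion. A minimal sketch of that idea, assuming the simplest form of insertion (repeating source frames in a fixed cadence rather than motion-compensated interpolation, which real hardware may use), is:

```python
# Illustrative frame-rate conversion by frame insertion: an input at
# `src_fps` fills an output timeline at `dst_fps` by repeating each source
# frame for every output instant it covers. Real frame rate conversion
# modules may instead synthesize interpolated frames.

def convert_frame_rate(frames: list, src_fps: int, dst_fps: int) -> list:
    out = []
    for i in range(len(frames) * dst_fps // src_fps):
        # Map each output timestamp back to the source frame covering it.
        src_index = i * src_fps // dst_fps
        out.append(frames[src_index])
    return out

# 24 fps -> 60 fps: every 2 source frames become 5 output frames,
# a 3:2-style repetition cadence.
print(convert_frame_rate(["f0", "f1"], 24, 60))  # ['f0', 'f0', 'f0', 'f1', 'f1']
```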
The display 280 is configured to receive the image signal input from the video processor 260-1, and to display video content and images as well as a menu control interface. The display 280 includes a display component for presenting pictures and a driving component for driving image display. The displayed video content may come from the video in the broadcast signal received by the tuner-demodulator 220, or from video content input through the communicator or the external device interface. The display 280 also displays a user control interface (UI) generated in the display device 200 and used to control the display device 200.
Depending on the type of the display 280, it further includes a driving component for driving the display. Alternatively, if the display 280 is a projection display, it may also include a projection device and a projection screen.
The audio processor 260-2 is configured to receive an audio signal and, according to the standard codec protocol of the input signal, perform decompression and decoding, as well as audio data processing such as noise reduction, digital-to-analog conversion, and amplification, so as to obtain an audio signal that can be played by the speaker 272.
The audio output interface 270 is configured to receive, under the control of the controller 210, the audio signal output by the audio processor 260-2. The audio output interface may include a speaker 272, or an external audio output terminal 274 that outputs to a sound-generating device of an external apparatus, such as an external audio terminal or a headphone output terminal.
In some other exemplary embodiments, the video processor 260-1 may include one or more chips. The audio processor 260-2 may likewise include one or more chips.
Moreover, in some other exemplary embodiments, the video processor 260-1 and the audio processor 260-2 may be separate chips, or may be integrated together with the controller 210 in one or more chips.
The power supply is configured to provide, under the control of the controller 210, power supply support for the display device 200 with power input from an external power source. The power supply may include a built-in power supply circuit installed inside the display device 200, or may be a power supply installed outside the display device 200, such as a power interface in the display device 200 that provides an external power supply.
Similar to the N chip, as shown in Figure 4, the A chip may include a controller 310, a communicator 330, a detector 340, and a memory 390. In some embodiments, it may also include a user input interface, a video processor, an audio processor, a display, and an audio output interface. In some embodiments, there may also be a power supply that independently powers the A chip.
The communicator 330 is a component for communicating with external devices or external servers according to various types of communication protocols. For example, the communicator 330 may include a WiFi module 331, a Bluetooth communication protocol module 332, a wired Ethernet communication protocol module 333, and other network communication protocol modules or near field communication protocol modules such as an infrared communication protocol module.
The communicator 330 of the A chip and the communicator 230 of the N chip also interact with each other. For example, the WiFi module 231 of the N chip is used to connect to an external network and establish network communication with an external server or the like, while the WiFi module 331 of the A chip is used to connect to the WiFi module 231 of the N chip, without a direct connection to the external network or the like. Therefore, to the user, a display device as in the above embodiments presents only one WiFi account externally.
The detector 340 is the component of the A chip of the display device for collecting signals from the external environment or for interacting with the outside. The detector 340 may include a light receiver 342, a sensor for collecting ambient light intensity, so that display parameters can adapt to changes in ambient light, and the like; it may also include an image collector 341, such as a camera, which may be used to collect external environment scenes and to collect user attributes or interaction gestures, so that display parameters can be adaptively changed and user gestures can be recognized, realizing interaction with the user.
The external device interface 350 is a component that provides data transmission between the controller 310 and the N chip or other external devices. The external device interface may be connected in a wired/wireless manner to external devices such as a set-top box, a game device, or a notebook computer.
The controller 310 controls the operation of the display device 200 and responds to user operations by running various software control programs (such as installed third-party applications) stored in the memory 390 and by interacting with the N chip.
As shown in Figure 4, the controller 310 includes at least one of a read-only memory ROM 313, a random access memory RAM 314, a graphics processor 316, a CPU processor 312, a communication interface 318, and a communication bus. The ROM 313, the RAM 314, the graphics processor 316, the CPU processor 312, and the communication interface 318 are connected via the bus.
The ROM 313 is configured to store various system startup instructions. The CPU processor 312 runs the system startup instructions in the ROM and copies the temporary data of the operating system stored in the memory 390 into the RAM 314 to begin running and starting the operating system. After the operating system has started, the CPU processor 312 copies the temporary data of the various application programs in the memory 390 into the RAM 314, and then begins running and starting the various application programs.
The CPU processor 312 is configured to execute the operating system and application program instructions stored in the memory 390, to communicate with the N chip for the transmission and interaction of signals, data, instructions, and the like, and to execute various application programs, data, and content according to the various interactive instructions received from external inputs, so as to finally display and play various audio and video content.
The communication interface may include a first interface 318-1 through an n-th interface 318-n. These interfaces may be network interfaces connected to external devices via a network, or network interfaces connected to the N chip via a network.
The controller 310 may control the overall operation of the display device 200. For example, in response to receiving a user command for selecting a UI object to be displayed on the display 280, the controller 310 may perform an operation related to the object selected by the user command.
The graphics processor 316 is configured to generate various graphics objects, such as icons, operation menus, and graphics displayed in response to user input instructions. It includes an arithmetic unit, which performs operations upon receiving the various interactive instructions input by the user and displays various objects according to display attributes, and a renderer, which renders the various objects obtained by the arithmetic unit and displays the rendered result on the display 280.
Both the graphics processor 316 of the A chip and the graphics processor 216 of the N chip can generate various graphics objects. The difference is that, if Application 1 is installed on the A chip and Application 2 is installed on the N chip, then when the user is in the interface of Application 1 and inputs an instruction within Application 1, the graphics processor 316 of the A chip generates the graphics object; and when the user is in the interface of Application 2 and inputs an instruction within Application 2, the graphics processor 216 of the N chip generates the graphics object.
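The routing rule above (the graphics object is produced by the graphics processor of whichever chip hosts the foreground application) can be sketched as a small lookup. The table contents and function name are hypothetical, introduced only for illustration:

```python
# Hypothetical sketch of the routing described above: a user instruction in an
# application is rendered by the graphics processor of the chip hosting it.

APP_HOST = {"app1": "A", "app2": "N"}  # which chip each application is installed on
GPU_BY_CHIP = {"A": "graphics_processor_316", "N": "graphics_processor_216"}

def gpu_for(app: str) -> str:
    """Return the graphics processor responsible for the given application."""
    return GPU_BY_CHIP[APP_HOST[app]]

print(gpu_for("app1"))  # instruction in Application 1 -> A chip's processor
print(gpu_for("app2"))  # instruction in Application 2 -> N chip's processor
```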
Figure 5 exemplarily shows a schematic diagram of the functional configuration of the display device according to an exemplary embodiment.
In some embodiments, as shown in Figure 5, the memory 390 of the A chip and the memory 290 of the N chip are respectively used to store the operating system, application programs, content, user data, and the like, and, under the control of the controller 310 of the A chip and the controller 210 of the N chip, drive the system operation of the display device 200 and respond to the various operations of the user. The memory 390 of the A chip and the memory 290 of the N chip may include volatile and/or non-volatile memory.
For the N chip, the memory 290 is used to store the operating program that drives the controller 210 in the display device 200, the various application programs built into the display device 200, the various application programs downloaded by the user from external devices, the various graphical user interfaces related to the applications, the various objects related to the graphical user interfaces, user data information, and various internal data supporting the applications. The memory 290 is used to store system software such as an operating system (OS) kernel, middleware, and applications, and to store input video data and audio data, as well as other user data.
The memory 290 is used to store driver programs and related data for the video processor 260-1, the audio processor 260-2, the display 280, the communication interface 230, the tuner-demodulator 220, the input/output interfaces, and the like.
In some embodiments, the memory 290 may store software and/or programs. The software programs representing the operating system (OS) include, for example, a kernel, middleware, application programming interfaces (APIs), and/or application programs. Exemplarily, the kernel may control or manage system resources, or functions implemented by other programs (such as the middleware, the APIs, or the application programs), and the kernel may provide interfaces to allow the middleware, the APIs, or the applications to access the controller, so as to control or manage the system resources.
Exemplarily, the memory 290 includes at least one of a broadcast receiving module 2901, a channel control module 2902, a volume control module 2903, an image control module 2904, a display control module 2905, an audio control module 2906, an external command recognition module 2907, a communication control module 2908, a light receiving module 2909, a power control module 2910, an operating system 2911, other application programs 2912, a browser module, and the like. The controller 210, by running the various software programs in the memory 290, performs functions such as broadcast television signal reception and demodulation, television channel selection control, volume selection control, image control, display control, audio control, external command recognition, communication control, light signal receiving, power control, a software control platform supporting the various functions, and a browser function.
The memory 390 stores various software modules for driving and controlling the display device 200, including at least one of a base module, a detection module, a communication module, a display control module, a browser module, and various service modules. Since the functions of the memory 390 and the memory 290 are relatively similar, reference may be made to the memory 290 for the relevant parts, which will not be repeated here.
Exemplarily, the memory 390 includes an image control module 3904, an audio control module 3906, an external command recognition module 3907, a communication control module 3908, a light receiving module 3909, an operating system 3911, other application programs 3912, a browser module, and the like. The controller 310, by running the various software programs in the memory 390, performs functions such as image control, display control, audio control, external command recognition, communication control, light signal receiving, power control, a software control platform supporting the various functions, and a browser function.
The difference is that the external command recognition module 2907 of the N chip and the external command recognition module 3907 of the A chip can recognize different commands.
Exemplarily, since an image receiving device such as a camera is connected to the A chip, the external command recognition module 3907 of the A chip may include a graphics recognition module 3907-1. The graphics recognition module 3907-1 stores a graphics database; when the camera receives a graphic instruction from the outside, the instruction is matched against the instructions in the graphics database, so as to issue instruction control to the display device. Since the voice receiving device and the remote controller are connected to the N chip, the external command recognition module 2907 of the N chip may include a voice recognition module 2907-2. The voice recognition module 2907-2 stores a voice database; when the voice receiving device or the like receives an external voice instruction, the instruction is matched against the instructions in the voice database, so as to issue instruction control to the display device. Similarly, a control device 100 such as a remote controller is connected to the N chip, and a key command recognition module interacts with the control device 100 for instructions.
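The per-chip command recognition described above can be sketched as routing an input sample to the database owned by the chip its input device is wired to. Recognition is greatly simplified here to an exact lookup, and all database contents and names are hypothetical:

```python
# Minimal sketch of external-command recognition: camera input is matched
# against the A chip's graphics database, voice input against the N chip's
# voice database. Real recognition would involve far more than exact lookup.

GRAPHICS_DB_A = {"palm_open": "pause", "swipe_left": "previous_channel"}  # A chip
VOICE_DB_N = {"volume up": "increase_volume", "mute": "mute_audio"}       # N chip

def recognize(source: str, sample: str):
    """Route the sample to the database of the chip owning that input device."""
    if source == "camera":  # the camera is connected to the A chip
        return GRAPHICS_DB_A.get(sample)
    if source == "voice":   # the voice receiver is connected to the N chip
        return VOICE_DB_N.get(sample)
    return None

print(recognize("camera", "palm_open"))  # control instruction for the display
```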
Fig. 6a exemplarily shows a configuration block diagram of the software system in the display device 200 according to an exemplary embodiment.
For the N chip, as shown in Fig. 6a, the operating system 2911 includes operating software for handling various basic system services and for performing hardware-related tasks, and acts as a medium for data processing between application programs and hardware components.
In some embodiments, part of the operating system kernel may include a series of software for managing display device hardware resources and providing services to other programs or software code.
In some other embodiments, part of the operating system kernel may include one or more device drivers. A device driver may be a set of software code in the operating system that helps operate or control devices or hardware associated with the display device. A driver may contain code for operating video, audio, and/or other multimedia components. Examples include display, camera, Flash, WiFi, and audio drivers.
Among them, the accessibility module 2911-1 is used to modify or access application programs, so as to realize the accessibility of applications and the operability of their displayed content.
The communication module 2911-2 is used for connecting to other peripherals via related communication interfaces and communication networks.
The user interface module 2911-3 is used to provide objects for displaying the user interface, for access by various applications, enabling user operability.
The control application 2911-4 is used for controlling process management, including runtime applications and the like.
The event transmission system 2914 may be implemented in the operating system 2911 or in the application programs 2912. In some embodiments, it is implemented in the operating system 2911 and simultaneously in the application programs 2912, and is used for monitoring various user input events. Based on the recognition results of various events or sub-events, it executes handlers that carry out one or more sets of predefined operations.
Among them, the event monitoring module 2914-1 is used for monitoring input events or sub-events from the user input interface.
The event recognition module 2914-1 is used to define the various events input from the various user input interfaces, recognize the various events or sub-events, and transmit them for processing so as to execute the corresponding one or more sets of handlers.
Among them, an event or sub-event refers to an input detected by one or more sensors in the display device 200, as well as an input from an external control device (such as the control device 100). Examples include: various sub-events of voice input, gesture input sub-events of gesture recognition, and sub-events of remote-control key command input from the control device. For example, one or more sub-events on the remote controller take multiple forms, including but not limited to one or a combination of pressing the up/down/left/right keys, the confirmation key, and holding a key, as well as operations of non-physical keys, such as moving, holding, and releasing.
The interface layout management module 2913 directly or indirectly receives the user input events or sub-events monitored by the event transmission system 2914, and is used to update the layout of the user interface, including but not limited to the positions of the controls or sub-controls in the interface, and various execution operations related to the interface layout, such as the size, position, and level of containers.
Since the functions of the operating system 3911 of the A chip are relatively similar to those of the operating system 2911 of the N chip, reference may be made to the operating system 2911 for related details, which will not be repeated here.
As shown by the application controls in the interactive interface of Fig. 6b, the application layer of the display device contains various applications that can be executed on the display device 200.
The application layer 2912 of the N chip may include, but is not limited to, one or more applications, such as a video-on-demand application, an application center, and game applications. The application layer 3912 of the A chip may include, but is not limited to, one or more applications, such as a live TV application and a media center application. It should be noted that which applications are respectively contained on the A chip and the N chip is determined by the operating system and other design considerations; this application does not specifically limit or divide the applications contained on the A chip and the N chip.
The live TV application can provide live television from different signal sources. For example, the live TV application may provide television signals using input from cable television, over-the-air broadcast, satellite services, or other types of live TV services. In addition, the live TV application can display the video of the live TV signal on the display device 200.
The video-on-demand application can provide videos from different storage sources. Unlike the live TV application, video-on-demand provides video display from certain storage sources. For example, the video-on-demand content may come from the server side of cloud storage, or from local hard disk storage containing stored video programs.
The media center application can provide playback of various multimedia content. For example, the media center may be different from live TV or video-on-demand, and a user can access various image or audio services through the media center application.
The application center can provide storage for various applications. An application may be a game, an application program, or some other application that is related to a computer system or other devices but can run on the display device. The application center can obtain these applications from different sources, store them in local storage, and then run them on the display device 200.
Fig. 7 exemplarily shows a schematic diagram of a user interface in the display device 200 according to an exemplary embodiment. As shown in Fig. 7, the user interface includes multiple view display areas, for example, a first view display area 201 and a play screen 202, where the play screen includes the layout of one or more different items. The user interface also includes a selector indicating that an item is selected; the position of the selector can be moved by user input to change the selection to a different item.
It should be noted that multiple view display areas can present display screens of different levels. For example, the first view display area may present video chat item content, and the second view display area may present application layer item content (e.g., webpage video, VOD display, application screens, etc.).
In some embodiments, the content of the second view display area includes content displayed on the video layer and part of the content displayed on a floating layer, and the content of the first view display area includes content displayed on a floating layer. The floating layers used by the first view display area and the second view display area are different floating layers.
In some embodiments, the presentation of different view display areas has priority differences, and the display priority differs between view display areas of different priorities. For example, the priority of the system layer (e.g., the video layer) is higher than that of the application layer: when the user uses the selector and switches screens in the application layer, the screen display of the view display area of the system layer is not blocked; and when the size and position of the view display area of the application layer change according to the user's selection, the size and position of the view display area of the system layer are not affected.
In some embodiments, for example in a picture-in-picture mode, two different display windows can be drawn in the same layer to realize display screens of the same level. In this case, the selector can switch between the first view display area and the second view display area (i.e., switch between the two display windows). In this case, in some embodiments, when the size and position of the first view display area change, the size and position of the second view display area may change accordingly.
In some embodiments, for a dual-chip smart TV 200, since independent operating systems may be respectively installed in the A chip and the N chip, there are two independent but interrelated subsystems in the display device 200. For example, both the A chip and the N chip can be independently installed with Android and various APPs, so that each chip can realize certain functions, and the A chip and the N chip can cooperate to realize a certain function.
In some embodiments, for a smart TV 200 that is not dual-chip (for example, a single-chip smart TV), there is one system chip, and the operating system controls the realization of all functions of the smart TV.
In the display device provided by the embodiments of the present application, the camera is connected to the auxiliary chip, and the auxiliary chip can perform artificial intelligence operations on the images obtained by the camera; the microphone is connected to the main chip, and the main chip performs gain processing on the sound collected by the microphone. When a video call is made through the display device provided by the embodiments of the present application, the auxiliary chip collects video images through the camera and uses artificial intelligence application technologies such as face recognition and motion (lip shape) recognition, so that the picture during the video call is no longer limited to a fixed-focal-length picture. Instead, through zoomable video that focuses on the target speaker by combining face recognition with lip recognition, automatic face focusing can be achieved no matter which corner the person is in or how far the person is from the camera; that is, the face can be kept unchanged in the display frame at the opposite end of the display device. When a video call is made through the display device provided by the embodiments of the present application, the display size of the face can remain unchanged as the distance between the person and the camera changes. However, when the distance between the person and the camera changes, the distance between the person and the microphone on the display device (a far-field microphone) will also change. Therefore, in order to ensure that the sound has a certain stability while the display size of the face remains unchanged, the embodiments of the present application further provide an audio adjustment method.
Fig. 8 is a schematic flowchart of an audio adjustment method provided by an embodiment of the present application. As shown in Fig. 8, the audio adjustment method provided by the embodiment of the present application includes:
S101: Obtain focal length information corresponding to the current image in the video call.
In some embodiments, the controller may include a main chip and an auxiliary chip. During a video call, the auxiliary chip obtains images through the camera and automatically focuses on the face while collecting the images. In the embodiments of the present application, face auto-focusing is realized by the phase method. Phase-detection focusing judges whether the image is in focus by the time sequence in which light beams reach the photosensitive element, that is, by the phase shift. During autofocus, the camera places, at the position of the photosensitive element, a grid plate in which light-transmitting and opaque lines are alternately arranged in parallel, and places two light-receiving elements symmetrically about the optical axis at an appropriate position behind the grid plate.
When focusing, the grid plate moves up and down perpendicular to the optical axis. When the focal plane coincides with the position of the grid plate (i.e., in focus), the light passing through the grid plate reaches the two light-receiving elements behind the plate at the same time; when out of focus (front focus or back focus), the two beams reach the light-receiving elements one after the other, and there is a phase difference between their output signals. Because the peak positions of front focus and back focus are different, the camera can quickly judge in which direction to shift, instead of moving back and forth multiple times as in contrast detection to achieve focus. See Fig. 9 for the calculation principle, which will not be repeated here.
In some embodiments, the image transmitted to the opposite end is a cropped face image. That is, due to autofocus, no matter what the distance between the person and the TV is, the local end transmits the face in the collected image to the opposite end, and the opposite end cannot perceive the change in the distance between the local person and the TV from the received video. However, if the sound uses a fixed gain, although the face received by the opposite end does not convey the change in distance, the sound received by the opposite end will change with the distance between the local person and the TV.
When the camera of the auxiliary chip collects images, it automatically focuses in real time and outputs focal length information in real time. In image processing, the sharpness and degree of focus of an image are determined by the amount of its high-frequency components: more high-frequency components mean a sharp image, while fewer mean a blurred image, in which case the focal length needs to be adjusted to achieve sharpness. Methods for judging image sharpness include the Fourier transform (FFT) and the discrete cosine transform (DCT); accordingly, each frame of image outputs a value representing whether the image is sharp, such as the image distance. In the embodiments of the present application, the calculation methods of the image distance include the high-frequency component method, the smoothing method, the threshold integration method, the gray difference method, the Laplacian image energy function, and so on.
In order to quickly output the focal length information corresponding to the current image, an improved gray difference method can be used as the image sharpness evaluation function. That is, the sum of the squared differences between the brightness value of every pixel of an image and the brightness values of its neighboring pixels is taken as the focus evaluation function of the image, and the value of the evaluation function is calculated for adjacent images of the same scene. The focus evaluation function is as follows:

F(x, y) = Σ Σ { [f(x, y) − f(x, y−1)]² + [f(x, y) − f(x−1, y)]² }
Here f(x, y) denotes the brightness value of the pixel in the x-th row and y-th column. The algorithm compares each pixel with its two adjacent pixels (to the left of and above f(x, y)). When the image is in sharp focus, F(x, y) is maximal, that is, the corresponding image distance value is maximal. By adaptively adjusting the focal length step of the lens and calculating the image distance in real time, autofocus is completed when the relative maximum is reached, and the corresponding focal length information is output.
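The focus evaluation function above can be sketched in Python with NumPy as follows. This is a minimal illustration: the function names, the synthetic test images, and the `capture_frame` callback are assumptions for demonstration, not part of the patent.

```python
import numpy as np

def gray_difference_sharpness(img: np.ndarray) -> float:
    """Improved gray-difference focus evaluation F: the sum over all
    pixels of the squared brightness differences between f(x, y) and
    its left neighbor f(x, y-1) and its upper neighbor f(x-1, y)."""
    img = img.astype(np.float64)
    left = img[:, 1:] - img[:, :-1]   # f(x, y) - f(x, y-1)
    up = img[1:, :] - img[:-1, :]     # f(x, y) - f(x-1, y)
    return float((left ** 2).sum() + (up ** 2).sum())

def best_focus(capture_frame, focal_steps):
    """Coarse sketch of the autofocus sweep: keep the focal step whose
    captured frame maximizes F. capture_frame is a hypothetical callback
    that returns the image captured at a given focal step."""
    return max(focal_steps, key=lambda f: gray_difference_sharpness(capture_frame(f)))
```

A sharply focused frame (strong edges) yields a larger F than a defocused, near-uniform frame, which is what drives the sweep toward the in-focus position.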
In some embodiments, the controller may not distinguish between a main chip and an auxiliary chip. According to an input operation, the controller starts the camera to collect a local image, generates the current image according to the local image, and controls the microphone to collect local sound to generate audio. Since the current image corresponds to focal length information, the focal length information of the current image is obtained in order to adjust the sound.
In some embodiments, the application that starts the camera and the microphone for audio and video collection may be a video call application, or may be a recording/selfie application. Therefore, after collecting the local image and the local sound, the controller also needs to judge whether it is in a video call state. If so, this indicates that the application that started the audio and video collection is a video call application, and the audio needs to be adjusted according to the focal length information of the current image; after adjusting the audio according to the focal length information of the current image, the controller sends the adjusted audio and the current image to the opposite-end device of the video call. If it is not in the video call state, this indicates that the application that started the audio and video collection is a recording/selfie application, and there is no need to adjust the audio according to the focal length information of the current image; the control system directly generates an audio-video file from the current image and the audio.
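The branching described above (video call state leads to gain-adjusted audio sent to the peer; otherwise the raw pair goes into a local audio-video file) can be sketched as follows. All names, the action tuples, and the `{focal_mm: gain_dB}` table format are hypothetical illustrations:

```python
def route_av(image, audio, in_video_call, focal_mm, gain_table):
    """Route one captured image/audio pair.

    In the video call state, the audio is gain-adjusted from the focal
    length information (S102/S103) and sent to the peer; otherwise the
    raw pair is written to a local audio-video file."""
    if in_video_call:
        gain_db = gain_table.get(focal_mm, 0.0)       # S102: focal length -> gain
        scale = 10.0 ** (gain_db / 20.0)              # dB to linear amplitude
        adjusted = [s * scale for s in audio]         # S103: gain processing
        return ("send_to_peer", image, adjusted)
    return ("write_av_file", image, audio)            # recording/selfie path
```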
In some embodiments, the video call state may be obtained by having the application manager part of the controller mark the application that starts the audio and video collection: when that application is a video call application, it is marked as the video call state; when it is another application such as recording/selfie, it is marked as a non-video-call state.
In some embodiments, the user selects the recording/selfie application to preview or record audio and video. The controller starts the camera to collect video images and starts the microphone to collect sound. The application may first present a preview interface, in which the current image obtained by performing data processing on the images collected by the camera is displayed, where the data processing may be at least one of image quality adjustment (e.g., brightness, contrast, chroma, color temperature, etc.), adding controls (decorative controls, layers, etc.), or other processing. The preview interface may also be provided with a control for generating an audio-video file. In response to the user's selection of the control for generating an audio-video file, the controller generates the current image from the video images collected by the camera, starts the microphone to collect sound to generate audio, and synthesizes the current image and the audio into an audio-video file. In some embodiments, after the user selects the control for generating an audio-video file, the interface of the application may continue to display the current image.
In some embodiments, cache data is continuously and periodically generated while the audio-video file is being recorded, and upon receiving an input operation instruction to save the recording (for example, an operation ending the recording, or a similar operation instruction), a recording file is generated from the cache data. This can speed up the generation of the recording file as perceived by the user.
S102: Obtain the microphone gain according to the focal length information.
In some embodiments, the change in the image distance reflects the change in the focal length of the current lens, and by extension corresponds to the change in the distance between the current user and the camera of the display device, that is, the change relative to the distance during the last call. Generally, when the image distance becomes larger, the distance between the current user and the display device (camera) becomes larger; if the image distance becomes smaller, the distance between the current user and the display device (camera) becomes smaller. Therefore, according to the image distance and its changes, the focal length corresponding to the relative maximum of the image distance, that is, the focal length information corresponding to the current image, is found. The distance between the current user and the far-field microphone is determined from the focal length information, so that the change in the distance to the far-field microphone is determined, and the microphone gain is then obtained; gain processing is performed on the collected audio data using the microphone gain.
In some embodiments, to facilitate obtaining the microphone gain, statistics are collected on the distance between the current user and the display device (camera) and the corresponding focal length information, and empirical values are used to establish a preset correspondence between the microphone gain and the focal length information. Once the focal length information corresponding to the current image in the video call is determined, the microphone gain can be obtained according to the preset correspondence between the microphone gain and the focal length information.
The preset correspondence between the microphone gain and the focal length information is established according to empirical values, as shown in Table 1, where Table 1 is only given as an example and is not a limitation of this application.
Table 1:

Corresponding distance    Focal length information    Microphone gain
3 meters                  0.3 mm                      0 dB
4 meters                  0.4 mm                      10 dB
2 meters                  0.2 mm                      -10 dB
2.5 meters                0.25 mm                     -8 dB
3.5 meters                0.35 mm                     8 dB
Therefore, when the obtained focal length information is 0.2 mm, the corresponding microphone gain of -10 dB can be obtained according to the preset correspondence between the microphone gain and the focal length information shown in Table 1, and gain processing is then performed on the correspondingly collected audio data using the obtained microphone gain.
If the obtained focal length information is not in the table, the corresponding microphone gain is calculated by interpolation. For example, if the obtained focal length information is 0.375 mm, the microphone gain can be calculated by the following equation, where X is the corresponding microphone gain:
(X − 8) / (10 − 8) = (0.375 − 0.35) / (0.4 − 0.35)
X = 9 dB can be calculated; that is, when the focal length information is 0.375 mm, the corresponding microphone gain is 9 dB.
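The table lookup with linear interpolation described above can be sketched as follows, using the Table 1 values sorted by focal length. The function name and the use of NumPy's `interp` are illustrative assumptions, not part of the patent:

```python
import numpy as np

# Focal length (mm) -> microphone gain (dB), from Table 1, sorted by focal length
FOCAL_MM = [0.20, 0.25, 0.30, 0.35, 0.40]
GAIN_DB = [-10.0, -8.0, 0.0, 8.0, 10.0]

def microphone_gain_db(focal_mm: float) -> float:
    """Look up the microphone gain for a focal length, linearly
    interpolating between table entries (as in the 0.375 mm example)."""
    return float(np.interp(focal_mm, FOCAL_MM, GAIN_DB))
```

For a focal length of 0.375 mm this reproduces the 9 dB result of the worked equation; values outside the table range are clamped to the nearest table entry by `np.interp`.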
In some embodiments, to facilitate obtaining the microphone gain, statistics may also be collected on the distance between the current user and the display device (camera) and the corresponding image distance, so as to establish a preset function model between the microphone gain and the focal length information. Once the focal length information corresponding to the current image in the video call is determined, the preset function model of the microphone gain and the focal length information is obtained, and the microphone gain can then be obtained from the focal length information in combination with the preset function model.
In some embodiments, to facilitate obtaining the microphone gain, an adaptive method may be adopted to obtain the microphone gain according to the focal length information.
In some embodiments, since the application that starts the audio and video collection may be a video call application, in order to ensure the display effect of the video call, the controller receives the local image collected by the camera and crops the local image according to the position of the person in the local image to generate a current image of a preset size. Since the camera adjusts the focal length when collecting the image in order to obtain a clear image of the person, the current image corresponds to a piece of focal length information.
In some embodiments, movement of the person's position causes the distance to the display device to change. However, since the image transmitted to the opposite end is cropped from the local image, the opposite end may not see the movement of the person relative to the display device in the image. Because the change in the distance to the display device causes the volume of the sound data collected by the microphone to change, adjusting the gain of the audio data according to the different focal lengths corresponding to the current image can offset the volume change in the collected sound data caused by the change in the distance between the person and the display device.
In some embodiments, movement of the person's position causes the distance to the display device to change. However, since the image transmitted to the opposite end is the region corresponding to the face/human body cropped from the local image, the opposite end may not see the movement of the person relative to the display device in the image. Since the focal length of the camera changes with the face/human body, adjusting the gain of the audio data according to the different focal lengths corresponding to the current image can offset the volume change in the collected sound data caused by the change in the distance between the person and the display device, thereby ensuring the consistency of the sound and the image sent to the opposite-end device.
S103: Adjust the audio received by the microphone according to the obtained microphone gain value.
In some embodiments, the microphone gain obtained through the focal length information is used to adjust the audio received by the microphone; that is, the obtained microphone gain value is used to perform gain processing on the audio received by the microphone, which helps ensure the stability of the volume of the audio received by the microphone.
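Gain processing on the received audio can be sketched as follows. This is a minimal illustration assuming 16-bit PCM samples; the dB-to-linear conversion 10^(G/20) is the standard amplitude scaling, and the clipping to the int16 range is an implementation assumption:

```python
import numpy as np

def apply_gain_db(samples: np.ndarray, gain_db: float) -> np.ndarray:
    """Apply a dB gain to 16-bit PCM samples: a gain of G dB scales
    the amplitude by 10**(G/20), clipped to the int16 range."""
    scale = 10.0 ** (gain_db / 20.0)
    out = samples.astype(np.float64) * scale
    return np.clip(out, -32768, 32767).astype(np.int16)
```

For example, a +20 dB gain multiplies the amplitude by 10, while a -10 dB gain (Table 1, 0.2 mm) attenuates it to roughly one third.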
在一些实施方式中，通过焦距信息获取麦克风增益，但对于扬声器的增益不进行处理，以使得对端的音频可以正常输出。In some embodiments, the microphone gain is obtained from the focal length information, but the speaker gain is left untouched, so that the audio from the peer can be output normally.
因此本申请提供的音频调节方法，包括：获取视频通话中当前视频图像的焦距信息，根据所述焦距信息获取麦克风增益。在通过智能电视进行可变焦处理的视频通话的过程中，根据对视频图像的自动变焦处理获得焦距信息，根据焦距信息获得相应的麦克风增益，通过获得的麦克风增益对视频通话中当前音频数据进行增益处理。在本申请中，利用对视频通话中视频图像处理得到的焦距信息确定麦克风增益，从而实现在视频通话过程中基于人与麦克风距离对音频数据进行增益处理，以减小人与电视距离的变化对本地发送给对端的声音音量带来的波动，使得本地发送给对端的声音音量基本不变，保证视频通话过程中声音的稳定性。Therefore, the audio adjustment method provided by this application includes: obtaining the focal length information of the current video image in a video call, and obtaining the microphone gain according to the focal length information. During a video call with zoom processing on a smart TV, the focal length information is obtained from the automatic zoom processing of the video image, the corresponding microphone gain is obtained from that focal length information, and the current audio data of the video call is gain-processed with the obtained microphone gain. In this application, the focal length information from processing the video image in the video call is used to determine the microphone gain, so that during the video call the audio data is gain-processed based on the distance between the person and the microphone. This reduces the fluctuation in the volume of the sound sent to the peer caused by changes in the distance between the person and the TV, keeping that volume essentially unchanged and ensuring sound stability during the video call.
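The adjustment described above — obtain the focal length, map it to a microphone gain, and scale the audio — can be sketched as follows. This is a minimal illustration: the function names, the focal-length-to-gain table, and the specific gain values are assumptions for illustration, not values from this application.

```python
def mic_gain_for_focal_length(focal_length_mm, gain_table):
    # Look up the gain for the nearest tabulated focal length (hypothetical table).
    nearest = min(gain_table, key=lambda f: abs(f - focal_length_mm))
    return gain_table[nearest]

def adjust_audio(samples, gain):
    # Scale 16-bit PCM samples by the gain, clamping to the valid range.
    return [max(-32768, min(32767, int(s * gain))) for s in samples]

# Hypothetical mapping: a longer focal length (subject farther away) gets a larger gain.
GAIN_TABLE = {30: 1.0, 50: 1.5, 70: 2.0}

gain = mic_gain_for_focal_length(55, GAIN_TABLE)     # nearest entry is 50 mm
adjusted = adjust_audio([100, -200, 40000], gain)
```

The adjusted samples would then be synchronized with the current image and sent to the peer device.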
基于本申请实施例提供的音频调节方法，本申请实施例还提供了一种视频通话方法。附图10为本申请实施例提供的视频通话方法的流程示意图。Based on the audio adjustment method provided by the embodiments of this application, an embodiment of this application further provides a video call method. FIG. 10 is a schematic flowchart of the video call method provided by an embodiment of this application.
如附图10所示,本申请实施例提供的视频通话方法,包括:As shown in FIG. 10, the video call method provided by the embodiment of the present application includes:
S201:辅芯片将通过自动变焦处理后的视频图像传输至主芯片，并将所述视频图像对应的焦距信息传输至所述主芯片。S201: The auxiliary chip transmits the automatically zoom-processed video image to the main chip, and transmits the focal length information corresponding to the video image to the main chip.
在一些实施方式中，辅芯片接收摄像头采集本地图像生成的初始视频图像，并对初始视频图像进行自动变焦处理以生成变焦处理后的图像。在一些实施方式中，由于视频图像的获取和传输是连续的，如果初始视频图像经变焦处理后的图像焦距信息发生了改变，则辅芯片判断所述当前图像对应的焦距信息是否与上一时刻当前图像对应的焦距信息相同；在所述当前图像对应的焦距信息与上一时刻当前图像对应的焦距信息不同时，将所述视频图像对应的焦距信息传输至所述主芯片；在所述当前图像对应的焦距信息与上一时刻当前图像对应的焦距信息相同时，不传输焦距信息至所述主芯片，或传输表征焦距信息不变的标识给所述主芯片。此时焦距信息是根据自动变焦的焦距信息生成的。In some embodiments, the auxiliary chip receives the initial video image generated from the local image collected by the camera, and performs automatic zoom processing on it to generate a zoom-processed image. In some embodiments, since video images are acquired and transmitted continuously, the auxiliary chip determines whether the focal length information corresponding to the current image is the same as the focal length information corresponding to the current image at the previous moment. If they differ, the focal length information corresponding to the video image is transmitted to the main chip; if they are the same, no focal length information is transmitted to the main chip, or an identifier indicating that the focal length information is unchanged is transmitted to the main chip. In this case the focal length information is generated from the focal length information of the automatic zoom.
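The change-detection in this embodiment — only transmit the focal length to the main chip when it differs from the previous frame's — might be sketched as below. The class name, the "unchanged" marker value, and the message shape are hypothetical.

```python
class FocalLengthSender:
    # Send focal length to the main chip only when it changes (sketch).
    UNCHANGED = "UNCHANGED"  # hypothetical marker meaning "same as last frame"

    def __init__(self):
        self._last = None

    def message_for(self, focal_length):
        if focal_length == self._last:
            return self.UNCHANGED        # or simply send nothing at all
        self._last = focal_length
        return focal_length

sender = FocalLengthSender()
msgs = [sender.message_for(f) for f in [50, 50, 70, 70, 30]]
```

Either convention (a marker, or omitting the message entirely) spares the main chip from reprocessing an unchanged focal length.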
在一些实施方式中，辅芯片可以直接将摄像头采集的初始图像传递给所述主芯片。所述初始图像采集的过程中，所述摄像头进行了物理变焦的处理，因此对所述初始图像进行裁切即可生成当前图像。此时焦距信息是根据摄像头的物理焦距信息生成的。In some embodiments, the auxiliary chip may pass the initial image collected by the camera directly to the main chip. Since the camera performed physical zooming during the acquisition of the initial image, the current image can be generated simply by cropping the initial image. In this case the focal length information is generated from the physical focal length information of the camera.
在一些实施方式中,辅芯片可以将摄像头进行了物理变焦得到的初始图像进行自动变焦处理后生成变焦处理后的图像发送给主芯片。此时焦距信息是根据摄像头的物理焦距信息和自动变焦的焦距信息生成的。In some embodiments, the auxiliary chip may perform automatic zoom processing on the initial image obtained by the physical zoom of the camera, and then generate a zoomed image and send it to the main chip. At this time, the focal length information is generated according to the physical focal length information of the camera and the focal length information of the automatic zoom.
在一些实施方式中，辅芯片接收摄像头采集本地图像生成的初始视频图像，并将初始视频图像根据人脸或人体的位置进行自动变焦并裁切成预设尺寸后生成变焦处理后的图像，无需主芯片进行裁切处理。摄像头可以是变焦的摄像头也可以是定焦的摄像头，在摄像头是变焦的摄像头时，根据人脸或人体的位置通过控制摄像头的焦距生成初始视频图像，此时初始视频图像可以直接作为变焦处理后的图像，也可以继续进行自动变焦处理。In some embodiments, the auxiliary chip receives the initial video image generated from the local image collected by the camera, automatically zooms it according to the position of the face or body, and crops it to a preset size to generate the zoom-processed image, so that no cropping by the main chip is required. The camera may be a zoom camera or a fixed-focus camera. When the camera is a zoom camera, the initial video image is generated by controlling the focal length of the camera according to the position of the face or body; in this case the initial video image can be used directly as the zoom-processed image, or automatic zoom processing can be applied to it further.
S202:所述主芯片接收所述视频图像和所述焦距信息。S202: The main chip receives the video image and the focal length information.
在一些实施方式中,主芯片是识别变焦处理后的图像中的人脸或人体,并根据人脸或人体在图像中的相对位置进行图像的裁切生成当前图像以发给所述对端设备。In some embodiments, the main chip recognizes the face or human body in the zoomed image, and cuts the image according to the relative position of the face or human body in the image to generate the current image to send to the opposite device .
在一些实施方式中,由于裁切并不改变图像的焦距,因此当前图像的焦距信息就是变焦处理后的图像的焦距信息。In some embodiments, since cropping does not change the focal length of the image, the focal length information of the current image is the focal length information of the image after zooming.
在一些实施方式中，对初始图像进行人脸或人体的识别，并根据人脸或人体在图像中的相对位置进行图像的裁切生成当前图像以发给所述对端设备。In some embodiments, a face or a human body is recognized in the initial image, and the image is cropped according to the relative position of the face or body in the image to generate the current image to send to the peer device.
S203:所述主芯片根据所述焦距信息获取麦克风增益,并根据所述麦克风增益对所述视频图像对应的音频进行增益处理,以减小本地发送给对端的音频音量的波动。S203: The main chip obtains the microphone gain according to the focal length information, and performs gain processing on the audio corresponding to the video image according to the microphone gain, so as to reduce the fluctuation of the audio volume sent locally to the peer.
S204:所述主芯片将增益处理后的音频与所述视频图像同步,并将同步后的音频和视频传输至对端的设备。S204: The main chip synchronizes the gain-processed audio with the video image, and transmits the synchronized audio and video to the opposite device.
在一些实施方式中，同步后的音频和视频周期性地被封装为数据包并发送给对端显示设备，以使对端显示设备解析进行音频和视频的播放。In some embodiments, the synchronized audio and video are periodically encapsulated into data packets and sent to the peer display device, so that the peer display device can parse them and play the audio and video.
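Periodic packetization of the synchronized audio and video could look like the following sketch; the JSON packet layout, the field names, and the sequence-number scheme are invented for illustration and are not specified by this application.

```python
import itertools
import json

def make_av_packet(seq, timestamp_ms, video_frame, audio_chunk):
    # Wrap one synchronized audio/video pair into a packet (hypothetical format).
    return json.dumps({"seq": seq, "ts": timestamp_ms,
                       "video": video_frame, "audio": audio_chunk})

counter = itertools.count()                  # monotonically increasing sequence numbers
pkt = make_av_packet(next(counter), 40, "frame-0", "chunk-0")
```

The shared timestamp is what lets the peer display device keep the audio and the picture in sync when it parses the packet.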
在一些实施方式中，辅芯片通过摄像头采集视频图像，并在视频图像采集过程中经自动变焦处理获得可自动变焦的视频图像，并输出与自动变焦的视频图像对应的焦距信息。辅芯片将通过自动变焦处理后的视频图像传输至主芯片，同时将所述视频图像对应的焦距信息传输至主芯片。在本申请实施例中主芯片与辅芯片之间包括网络、串口、USB和HDMI通信方式中的至少一种。因此辅芯片可通过网络、串口、USB或HDMI将视频图像和视频图像对应的焦距信息传输至主芯片。其中，辅芯片可基于辅芯片与主芯片通信的稳定性动态选取网络、串口、USB和HDMI任意一种通信方式，在此不做具体限定。In some embodiments, the auxiliary chip collects video images through the camera, obtains automatically zoomed video images via automatic zoom processing during acquisition, and outputs the focal length information corresponding to the zoomed video images. The auxiliary chip transmits the zoom-processed video image to the main chip and, at the same time, transmits the corresponding focal length information to the main chip. In the embodiments of this application, the main chip and the auxiliary chip communicate through at least one of a network, a serial port, USB, and HDMI, so the auxiliary chip can transmit the video image and its focal length information to the main chip over the network, the serial port, USB, or HDMI. The auxiliary chip may dynamically select any of these communication modes based on the stability of its communication with the main chip, which is not specifically limited here.
当辅芯片通过网络、串口、USB或HDMI将视频图像和视频图像对应的焦距信息传输至主芯片,主芯片接收所述视频图像和视频图像对应的焦距信息。When the auxiliary chip transmits the video image and the focal length information corresponding to the video image to the main chip through the network, serial port, USB or HDMI, the main chip receives the video image and the focal length information corresponding to the video image.
在一些实施方式中,自动变焦处理后的视频图像是根据焦距在摄像头采集的图中裁切出的图像。在一些实施方式中自动变焦处理后的视频图像是根据自动变焦的追踪结果,裁切出的人脸图像。In some embodiments, the video image processed by the automatic zoom is an image cropped in the image collected by the camera according to the focal length. In some embodiments, the video image processed by the automatic zoom is a face image cropped according to the tracking result of the automatic zoom.
在一些实施方式中，主芯片通过麦克风采集音频信息，其中麦克风采集音频信息与摄像头采集视频图像同步。当主芯片接收到辅芯片传输的视频图像和视频图像对应的焦距信息时，根据所述视频图像对应的焦距信息确定麦克风增益，然后通过确定获得的麦克风增益对采集到的音频信息进行增益处理，获得增益处理后的音频。In some embodiments, the main chip collects audio information through the microphone, where the audio collection is synchronized with the camera's video collection. When the main chip receives the video image and its corresponding focal length information from the auxiliary chip, it determines the microphone gain according to that focal length information, and then performs gain processing on the collected audio information with the obtained microphone gain to obtain the gain-processed audio.
在对采集到的音频信息进行增益处理后，主芯片将增益处理后的音频与所述视频图像同步，获得音画同步的视频通话数据，将音画同步的视频通话数据传输至对端的显示框，进而完成视频通话。After gain-processing the collected audio information, the main chip synchronizes the gain-processed audio with the video image to obtain audio-video-synchronized call data, and transmits that data to the display frame of the peer, thereby completing the video call.
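The per-frame pairing of gain-processed audio with video frames described above might be sketched as below; the assumption that the three sequences are aligned frame by frame, and the concrete gain values, are illustrative only.

```python
def sync_and_adjust(video_frames, audio_chunks, gains):
    # Pair each video frame with its gain-processed audio chunk (sketch).
    # The three sequences are assumed to be aligned frame by frame.
    synced = []
    for frame, chunk, gain in zip(video_frames, audio_chunks, gains):
        synced.append((frame, [s * gain for s in chunk]))
    return synced

av = sync_and_adjust(["f0", "f1"], [[10, 20], [10, 20]], [1.0, 2.0])
```

Each `(frame, adjusted_audio)` pair would then be packetized and sent to the peer device as synchronized call data.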
在本申请实施例中提供的视频通话方法，实现了主芯片和辅芯片相互协作，解决了单一计算芯片既要支持视频通话的即时通信功能（视频编解码、传输），又要进行实时人工智能算法（人脸识别、唇形识别）的计算压力。同时在本申请实施例中提供的视频通话方法，在通过智能电视进行可变焦处理的视频通话的过程中，根据对视频图像的自动变焦处理获得焦距信息，根据焦距信息获得相应的麦克风增益，通过获得的麦克风增益对视频通话中当前音频数据进行增益处理，实现在视频通话过程中基于人与麦克风距离对音频数据进行增益处理，保证视频通话过程中声音的稳定性，即视频过程中可以做到实时变焦来聚焦人脸并且声音能够实时稳定平滑。The video call method provided in the embodiments of this application realizes cooperation between the main chip and the auxiliary chip, relieving a single computing chip of the pressure of both supporting the instant-communication functions of a video call (video encoding/decoding and transmission) and running real-time artificial-intelligence algorithms (face recognition, lip-shape recognition). At the same time, in this video call method, during a zoom-processed video call on a smart TV, the focal length information is obtained from the automatic zoom processing of the video image, the corresponding microphone gain is obtained from that focal length information, and the current audio data of the video call is gain-processed with the obtained gain. Gain processing of the audio data is thus based on the distance between the person and the microphone, ensuring sound stability during the call: the video can zoom in real time to focus on the face while the sound remains stable and smooth.
在一些实施方式中,可以不设置主芯片和辅芯片的区分,直接由控制器执行对应的所有操作。In some embodiments, the distinction between the main chip and the auxiliary chip may not be set, and all the corresponding operations are directly executed by the controller.
为便于麦克风增益的获取,在一些实施方式中,提供的视频通话方法中,所述主芯片根据所述焦距信息获取麦克风增益,包括:In order to facilitate the acquisition of microphone gain, in some embodiments, in the video call method provided, the main chip acquiring the microphone gain according to the focal length information includes:
所述主芯片获取麦克风增益与焦距信息的预设对应关系；The main chip obtains the preset correspondence between microphone gain and focal length information;
根据所述焦距信息查找所述焦距信息与麦克风增益预设对应关系,获取麦克风增益。Search for a preset correspondence between the focal length information and the microphone gain according to the focal length information, and obtain the microphone gain.
或者,为便于麦克风增益的获取,在本申请实施例提供的视频通话方法中,所述主芯片根据所述焦距信息获取麦克风增益,包括:Or, in order to facilitate the acquisition of microphone gain, in the video call method provided in the embodiment of the present application, the main chip acquiring the microphone gain according to the focal length information includes:
所述主芯片获取麦克风增益与焦距信息的预设函数模型;The main chip acquires a preset function model of microphone gain and focal length information;
根据所述焦距信息以及结合所述麦克风增益与焦距信息的预设函数模型,获取麦克风增益。Acquire microphone gain according to the focal length information and a preset function model combining the microphone gain and focal length information.
上述通过麦克风增益与焦距信息的预设对应关系或麦克风增益与焦距信息的预设函数模型获取麦克风增益的步骤可参见上述实施例提供的音频调节方法，在此不再赘述。在本申请实施例提供的视频通话方法中，麦克风增益的获取不局限于通过麦克风增益与焦距信息的预设对应关系或预设函数模型获取，还可以采用自适应方法获取。For the steps of obtaining the microphone gain through the preset correspondence between microphone gain and focal length information or through the preset function model of microphone gain and focal length information, refer to the audio adjustment method provided in the foregoing embodiments, which will not be repeated here. In the video call method provided in the embodiments of this application, the acquisition of the microphone gain is not limited to the preset correspondence or the preset function model; an adaptive method may also be used.
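The two alternatives just described — a preset correspondence table and a preset function model — can each be illustrated in a few lines. The table entries and the linear coefficients below are placeholder assumptions; this application does not prescribe concrete values or a specific functional form.

```python
def gain_from_table(focal_length, table):
    # Preset correspondence: discrete focal-length -> gain lookup.
    return table[focal_length]

def gain_from_model(focal_length, k=0.02, b=0.0):
    # Preset function model: an assumed linear form gain(f) = k*f + b.
    return k * focal_length + b

TABLE = {30: 1.0, 50: 1.4, 70: 1.8}   # hypothetical correspondence
g_table = gain_from_table(50, TABLE)
g_model = gain_from_model(50)
```

The table suits a small set of calibrated zoom levels, while the function model interpolates smoothly over any focal length.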
在一些实施方式中,在本申请实施例提供的视频通话方法中,所述将所述视频图像对应的焦距信息传输至所述主芯片,包括:In some implementation manners, in the video call method provided by the embodiments of the present application, the transmitting the focal length information corresponding to the video image to the main chip includes:
判断所述视频图像对应的焦距信息是否与上一时刻视频图像对应的焦距信息相同;Determining whether the focal length information corresponding to the video image is the same as the focal length information corresponding to the video image at the previous moment;
当所述视频图像对应的焦距信息与上一时刻视频图像对应的焦距信息不同时,将所述视频图像对应的焦距信息传输至主芯片。When the focal length information corresponding to the video image is different from the focal length information corresponding to the video image at the previous moment, the focal length information corresponding to the video image is transmitted to the main chip.
如此，在辅芯片将通过自动变焦处理后的视频图像传输至主芯片时，通过比较视频图像对应的焦距信息与上一时刻视频图像对应的焦距信息，判断当前时刻的视频图像对应的焦距信息相较于上一时刻是否发生变化；当发生变化时，再将视频图像对应的焦距信息传输至主芯片，可实现减少主芯片的计算消耗。In this way, when the auxiliary chip transmits the automatically zoom-processed video image to the main chip, it compares the focal length information of the current video image with that of the video image at the previous moment to determine whether the focal length information has changed. Only when it has changed is the focal length information corresponding to the video image transmitted to the main chip, which reduces the computational load on the main chip.
在一些实施方式中,还提供了一种显示设备。本申请实施例提供的显示设备,包括显示器,所述显示器被配置为显示用户界面;In some embodiments, a display device is also provided. The display device provided by the embodiment of the present application includes a display, and the display is configured to display a user interface;
与所述显示器通信连接的控制器,所述控制器被配置为执行呈现用户界面:A controller communicatively connected with the display, the controller being configured to perform the presentation of the user interface:
与所述显示器连接的主芯片、以及与所述主芯片通过网络、串口、USB和HDMI通信方式中的至少一种连接的辅芯片，其中，所述主芯片被配置为执行上述实施例提供的音频调节方法；或者，The main chip connected to the display, and the auxiliary chip connected to the main chip through at least one of network, serial port, USB, and HDMI communication modes, wherein the main chip is configured to execute the audio adjustment method provided in the foregoing embodiments; or,
所述主芯片和辅芯片被配置为协同执行上述实施例提供的所述的视频通话方法。The main chip and the auxiliary chip are configured to cooperatively execute the video call method provided in the foregoing embodiment.
在一些实施方式中,本申请提供了一种音视频处理方法,所述方法包括:In some implementation manners, the present application provides an audio and video processing method, the method includes:
接收根据摄像头采集本地图像生成的当前图像，并接收根据麦克风采集本地声音生成的音频；获取当前图像对应的焦距信息；根据所述焦距信息及预设对应关系，获取麦克风增益，其中所述预设对应关系中不同的麦克风增益对应不同的焦距信息；根据获取到的麦克风增益值调整所述音频；将调整后的音频发送给视频通话的对端设备。Receive the current image generated from the local image collected by the camera, and receive the audio generated from the local sound collected by the microphone; obtain the focal length information corresponding to the current image; obtain the microphone gain according to the focal length information and a preset correspondence, where different microphone gains in the preset correspondence correspond to different focal length information; adjust the audio according to the obtained microphone gain value; and send the adjusted audio to the peer device of the video call.
在本申请的一些实施方式中，所述根据所述焦距信息获取麦克风增益，包括：获取麦克风增益与焦距信息的预设对应关系；根据所述焦距信息查找所述预设对应关系，获取麦克风增益。In some embodiments of this application, obtaining the microphone gain according to the focal length information includes: obtaining a preset correspondence between microphone gain and focal length information; and searching the preset correspondence according to the focal length information to obtain the microphone gain.
在本申请的一些实施方式中，所述根据所述焦距信息获取麦克风增益，包括：获取麦克风增益与焦距信息的预设函数模型；根据所述焦距信息以及结合所述麦克风增益与焦距信息的预设函数模型，获取麦克风增益。In some embodiments of this application, obtaining the microphone gain according to the focal length information includes: obtaining a preset function model of microphone gain and focal length information; and obtaining the microphone gain according to the focal length information combined with the preset function model of microphone gain and focal length information.
在本申请的一些实施方式中，在所述接收根据摄像头采集本地图像生成的当前图像，并接收根据麦克风采集本地声音生成的音频之后，所述方法还包括：将所述当前图像发送给视频通话的对端设备。In some embodiments of this application, after receiving the current image generated from the local image collected by the camera and receiving the audio generated from the local sound collected by the microphone, the method further includes: sending the current image to the peer device of the video call.
在一些实施方式中，在所述获取当前图像对应的焦距信息之前，所述方法还包括：确定当前是否处于视频通话状态；若处于视频通话状态，则执行获取当前图像对应的焦距信息的步骤以处理所述音频；若未处于视频通话状态，则不执行获取当前图像对应的焦距信息的步骤来处理所述音频。In some embodiments, before obtaining the focal length information corresponding to the current image, the method further includes: determining whether a video call is currently in progress; if so, performing the step of obtaining the focal length information corresponding to the current image to process the audio; if not, not performing that step to process the audio.
在本申请的一些实施方式中，接收根据摄像头采集本地图像生成的当前图像，并接收根据麦克风采集本地声音生成的音频；若处于视频通话状态，则根据所述当前图像的焦距信息调整所述音频，并将调整后的音频和所述当前图像发送给视频通话的对端设备；若处于录像状态，则无需根据所述当前图像的焦距信息调整所述音频，根据所述当前图像和所述音频生成录像文件。In some embodiments of this application, the current image generated from the local image collected by the camera is received, and the audio generated from the local sound collected by the microphone is received; if in a video call state, the audio is adjusted according to the focal length information of the current image, and the adjusted audio and the current image are sent to the peer device of the video call; if in a recording state, there is no need to adjust the audio according to the focal length information of the current image, and a recording file is generated from the current image and the audio.
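The branch between the video-call state and the recording state described above can be sketched as below; the state strings and the focal-length-to-gain model are assumptions made for illustration.

```python
def handle_media(state, image, audio, focal_length):
    # In a video call, adjust the audio by a focal-length-derived gain and send;
    # in recording mode, keep the audio exactly as captured.
    if state == "video_call":
        gain = focal_length / 50.0          # assumed focal-length -> gain model
        return ("send", image, [s * gain for s in audio])
    return ("record", image, audio)

call_result = handle_media("video_call", "img", [100], 100.0)
record_result = handle_media("recording", "img", [100], 100.0)
```

Skipping the adjustment while recording matches the rationale in the text: a recording keeps the person and the device in the same acoustic scene, so no distance compensation is needed.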
在本申请一些实施方式中，所述根据所述当前图像的焦距信息调整所述音频，并将调整后的音频和所述当前图像发送给视频通话的对端设备包括：获取当前图像对应的焦距信息；根据所述焦距信息及预设对应关系，获取麦克风增益，其中所述预设对应关系中不同的麦克风增益对应不同的焦距信息；根据获取到的麦克风增益值调整所述音频；将调整后的音频和所述当前图像发送给视频通话的对端设备。In some embodiments of this application, adjusting the audio according to the focal length information of the current image and sending the adjusted audio and the current image to the peer device of the video call includes: obtaining the focal length information corresponding to the current image; obtaining the microphone gain according to the focal length information and a preset correspondence, where different microphone gains in the preset correspondence correspond to different focal length information; adjusting the audio according to the obtained microphone gain value; and sending the adjusted audio and the current image to the peer device of the video call.
在本申请的一些实施方式中，根据所述当前图像和所述音频生成录像文件包括：根据所述当前图像和所述音频叠加生成缓存数据；接收输入的保存录像的操作指令，根据所述缓存数据生成录像文件。In some embodiments of this application, generating a recording file from the current image and the audio includes: generating cache data by superimposing the current image and the audio; and receiving an input operation instruction to save the recording, and generating the recording file from the cache data.
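Caching superimposed frames and producing a recording file on a save instruction might look like the following sketch; the in-memory dictionary returned by `save()` is a stand-in for muxing the cache into a real container format, and all names are hypothetical.

```python
class Recorder:
    # Accumulate superimposed (image, audio) pairs in a cache, then produce a
    # recording "file" when a save instruction arrives (sketch).
    def __init__(self):
        self.cache = []

    def on_frame(self, image, audio):
        self.cache.append((image, audio))    # superimpose image + audio into the cache

    def save(self):
        # Stand-in for writing the cached data into a real recording file.
        return {"frames": len(self.cache), "data": list(self.cache)}

rec = Recorder()
rec.on_frame("img0", [1, 2])
rec.on_frame("img1", [3, 4])
clip = rec.save()
```

Deferring file generation until the save instruction arrives lets the device discard the cache cheaply if the user never saves.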
音频调节方法和视频通话方法参见上述实施例；有关本申请实施例提供的显示设备的其他特征，参见上述实施例提供的显示设备200或其他非双芯片的显示设备，在此不再赘述。For the audio adjustment method and the video call method, refer to the foregoing embodiments; for other features of the display device provided by the embodiments of this application, refer to the display device 200 provided in the foregoing embodiments or to other non-dual-chip display devices, which will not be repeated here.
基于本申请中示出的示例性实施例,本领域普通技术人员在没有作出创造性劳动前提下所获得的所有其他实施例,都属于本申请所附权利要求保护的范围。此外,虽然本申请中公开内容按照示范性一个或几个实例来介绍,但应理解,可以就这些公开内容的各个方面也可以单独构成一个完整实施方式。Based on the exemplary embodiments shown in this application, all other embodiments obtained by a person of ordinary skill in the art without creative work shall fall within the protection scope of the appended claims of this application. In addition, although the disclosure in this application is introduced in accordance with one or several exemplary examples, it should be understood that various aspects of these disclosures can also constitute a complete implementation separately.
应当理解,本申请中说明书和权利要求书及上述附图中的术语″第一″、″第二″、″第三″等是用于区别类似的对象或同类的实体,而不必用于描述特定的顺序或先后次序。应该理解这样使用的数据在适当情况下可以互换,除非另外说明,例如能够根据本申请实施例图示或描述中给出那些以外的顺序实施。It should be understood that the terms "first", "second", "third", etc. in the specification and claims of this application and the above-mentioned drawings are used to distinguish similar objects or entities of the same kind, and are not necessarily used to describe A specific order or sequence. It should be understood that the data used in this way can be interchanged under appropriate circumstances, unless otherwise specified, for example, it can be implemented according to an order other than those given in the illustration or description of the embodiments of the present application.
此外，术语″包括″和″具有″以及它们的任何变形，意图在于覆盖但不排他的包含，例如，包含了一系列组件的产品或设备不必限于清楚地列出的那些组件，而是可包括没有清楚地列出的或对于这些产品或设备固有的其它组件。In addition, the terms "including" and "having" and any variations thereof are intended to cover a non-exclusive inclusion; for example, a product or device comprising a series of components is not necessarily limited to those clearly listed, but may include other components that are not clearly listed or that are inherent to the product or device.

Claims (15)

  1. 一种音视频处理方法,其特征在于,所述方法包括:An audio and video processing method, characterized in that the method includes:
    接收根据摄像头采集本地图像生成的当前图像,并接收根据麦克风采集本地声音生成音频;Receive the current image generated by the local image collected by the camera, and receive the audio generated by the local sound collected by the microphone;
    获取当前图像对应的焦距信息;Obtain focal length information corresponding to the current image;
    根据所述焦距信息及预设对应关系,获取麦克风增益,其中所述预设对应关系中不同的麦克风增益对应不同的焦距信息;Acquiring microphone gains according to the focal length information and the preset correspondence, wherein different microphone gains in the preset correspondence correspond to different focal length information;
    根据获取到的麦克风增益值调整所述音频;Adjusting the audio according to the acquired microphone gain value;
    将调整后的音频发送给视频通话的对端设备。Send the adjusted audio to the peer device of the video call.
  2. 根据权利要求1所述的音视频处理方法,其特征在于,所述根据所述焦距信息获取麦克风增益,包括:The audio and video processing method according to claim 1, wherein the obtaining microphone gain according to the focal length information comprises:
    获取麦克风增益与焦距信息的预设对应关系；Obtain the preset correspondence between microphone gain and focal length information;
    根据所述焦距信息查找所述预设对应关系,获取麦克风增益。Finding the preset correspondence relationship according to the focal length information, and obtaining the microphone gain.
  3. 根据权利要求1所述的音视频处理方法,其特征在于,所述根据所述焦距信息获取麦克风增益,包括:The audio and video processing method according to claim 1, wherein the obtaining microphone gain according to the focal length information comprises:
    获取麦克风增益与焦距信息的预设函数模型;Obtain a preset function model of microphone gain and focal length information;
    根据所述焦距信息以及结合所述麦克风增益与焦距信息的预设函数模型,获取麦克风增益。Acquire microphone gain according to the focal length information and a preset function model combining the microphone gain and focal length information.
  4. 根据权利要求1所述的音视频处理方法,其特征在于,在所述接收根据摄像头采集本地图像生成的当前图像,并接收根据麦克风采集本地声音生成音频之后,所述方法还包括:将所述当前图像发送给视频通话的对端设备。The audio and video processing method according to claim 1, characterized in that, after receiving the current image generated according to the local image collected by the camera, and receiving the audio generated according to the local sound collected by the microphone, the method further comprises: The current image is sent to the peer device of the video call.
  5. 根据权利要求1所述的音视频处理方法，其特征在于，在所述获取当前图像对应的焦距信息之前，所述方法还包括：The audio and video processing method according to claim 1, characterized in that, before the obtaining of the focal length information corresponding to the current image, the method further comprises:
    确定当前是否处于视频通话状态；若处于视频通话状态，则执行获取当前图像对应的焦距信息的步骤以处理所述音频；Determine whether a video call is currently in progress; if so, perform the step of obtaining the focal length information corresponding to the current image to process the audio;
    若未处于视频通话状态，则不执行获取当前图像对应的焦距信息的步骤来处理所述音频。If not in a video call state, do not perform the step of obtaining the focal length information corresponding to the current image to process the audio.
  6. 一种音视频处理方法,包括:An audio and video processing method, including:
    接收根据摄像头采集本地图像生成的当前图像,并接收根据麦克风采集本地声音生成音频;Receive the current image generated by the local image collected by the camera, and receive the audio generated by the local sound collected by the microphone;
    若处于视频通话状态，则根据所述当前图像的焦距信息调整所述音频，并将调整后的音频和所述当前图像发送给视频通话的对端设备；若处于录像状态，则无需根据所述当前图像的焦距信息调整所述音频，根据所述当前图像和所述音频生成录像文件。If in a video call state, adjust the audio according to the focal length information of the current image, and send the adjusted audio and the current image to the peer device of the video call; if in a recording state, there is no need to adjust the audio according to the focal length information of the current image, and generate a recording file from the current image and the audio.
  7. 如权利要求6所述的音视频处理方法，其特征在于，所述根据所述当前图像的焦距信息调整所述音频，并将调整后的音频和所述当前图像发送给视频通话的对端设备包括：The audio and video processing method according to claim 6, wherein the adjusting of the audio according to the focal length information of the current image and the sending of the adjusted audio and the current image to the peer device of the video call comprise:
    获取当前图像对应的焦距信息;Obtain focal length information corresponding to the current image;
    根据所述焦距信息及预设对应关系,获取麦克风增益,其中所述预设对应关系中不同的麦克风增益对应不同的焦距信息;Acquiring microphone gains according to the focal length information and the preset correspondence, wherein different microphone gains in the preset correspondence correspond to different focal length information;
    根据获取到的麦克风增益值调整所述音频;Adjusting the audio according to the acquired microphone gain value;
    将调整后的音频和所述当前图像发送给视频通话的对端设备。Send the adjusted audio and the current image to the peer device of the video call.
  8. 如权利要求6所述的音视频处理方法,其特征在于,根据所述当前图像和所述音频生成录像文件包括:8. The audio and video processing method of claim 6, wherein generating a video file according to the current image and the audio comprises:
    根据所述当前图像和所述音频叠加生成缓存数据；Generate cache data by superimposing the current image and the audio;
    接收输入的保存录像的操作指令,根据所述缓存数据生成录像文件。Receive an input operation instruction for saving the video, and generate a video file according to the buffered data.
  9. 一种音视频处理方法,其特征在于,所述方法包括:An audio and video processing method, characterized in that the method includes:
    辅芯片将通过自动变焦处理后摄像头采集的视频图像传输至主芯片,并将所述视频图像对应的焦距信息传输至所述主芯片;The auxiliary chip transmits the video image collected by the camera after the automatic zoom processing to the main chip, and transmits the focal length information corresponding to the video image to the main chip;
    所述主芯片接收所述视频图像和所述焦距信息;The main chip receives the video image and the focal length information;
    所述主芯片根据所述焦距信息获取麦克风增益,并根据所述麦克风增益对所述视频图像对应的音频进行增益处理,以减小本地发送给对端的音频音量的波动;The main chip obtains the microphone gain according to the focal length information, and performs gain processing on the audio corresponding to the video image according to the microphone gain, so as to reduce the fluctuation of the audio volume sent locally to the peer;
    所述主芯片将增益处理后的音频与所述视频图像同步,并将同步后的音频和视频传输至对端的显示框。The main chip synchronizes the audio after gain processing with the video image, and transmits the synchronized audio and video to the display frame of the opposite end.
  10. 一种显示设备,包括:A display device including:
    摄像头;camera;
    麦克风;microphone;
    控制器,所述控制器用于:The controller is used for:
    接收根据摄像头采集本地图像生成的当前图像,并接收根据麦克风采集本地声音生成音频;Receive the current image generated by the local image collected by the camera, and receive the audio generated by the local sound collected by the microphone;
    获取当前图像对应的焦距信息;Obtain focal length information corresponding to the current image;
    根据所述焦距信息及预设对应关系,获取麦克风增益,其中所述预设对应关系中不同的麦克风增益对应不同的焦距信息;Acquiring microphone gains according to the focal length information and the preset correspondence, wherein different microphone gains in the preset correspondence correspond to different focal length information;
    根据获取到的麦克风增益值调整所述音频;Adjusting the audio according to the acquired microphone gain value;
    将调整后的音频发送给视频通话的对端设备。Send the adjusted audio to the peer device of the video call.
  11. 如权利要求10所述的显示设备,其特征在于,所述控制器还:11. The display device of claim 10, wherein the controller further:
    将所述当前图像发送给视频通话的对端设备。Send the current image to the peer device of the video call.
  12. 如权利要求10所述的显示设备，其特征在于，在所述获取当前图像对应的焦距信息之前，所述控制器还用于：The display device according to claim 10, wherein, before the obtaining of the focal length information corresponding to the current image, the controller is further configured to:
    确定当前是否处于视频通话状态,则执行获取当前图像对应的焦距信息的步骤以处理所述音频;Determine whether it is currently in a video call state, then execute the step of obtaining focal length information corresponding to the current image to process the audio;
    若未处于视频通话状态,则不执行获取当前图像对应的焦距信息的步骤来处理所述所述音频If it is not in a video call state, the step of obtaining focal length information corresponding to the current image is not performed to process the audio
  13. A display device, characterized by comprising:
    a camera;
    a microphone;
    a controller, the controller being configured to:
    receive a current image generated from a local image collected by the camera, and receive audio generated from local sound collected by the microphone;
    if in a video call state, adjust the audio according to the focal length information of the current image, and send the adjusted audio and the current image to the peer device of the video call;
    if in a non-video-call state, generate an audio-and-video file from the current image and the audio without adjusting the audio according to the focal length information of the current image.
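The two branches of claim 13 (adjust-and-send during a call, record unchanged otherwise) can be sketched as follows; every name here is hypothetical, and the gain function is passed in rather than fixed.

```python
def handle_frame(in_video_call, image, audio, focal_mm, gain_for):
    """Route one captured frame per the call state.

    During a video call the audio is scaled by the focal-length-derived
    gain and sent to the peer; otherwise the raw audio is muxed with the
    image into a local audio/video file.
    """
    if in_video_call:
        adjusted = [s * gain_for(focal_mm) for s in audio]
        return ("send_to_peer", image, adjusted)
    # Non-call path: no gain adjustment, just record what was captured.
    return ("write_av_file", image, audio)
```

The point of the branch is that local recordings keep the microphone's natural level, while only the transmitted stream is compensated for zoom-induced volume swings.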
  14. A display device, characterized by comprising:
    a camera;
    a microphone;
    a main chip and an auxiliary chip connected to each other;
    the auxiliary chip receives the local image collected by the camera, performs automatic zoom processing on the local image and transmits it to the main chip, and transmits the focal length information corresponding to the current image to the main chip;
    the main chip receives the automatically zoomed local image and the focal length information, and generates the current image from the automatically zoomed local image;
    the main chip obtains a microphone gain according to the focal length information, and performs gain processing on the audio corresponding to the current image according to the microphone gain, so as to reduce fluctuations in the volume of the audio sent locally to the peer;
    the main chip synchronizes the gain-processed audio with the current image, and transmits the synchronized audio and video to the display device of the peer.
  15. The display device of claim 14, wherein transmitting the focal length information corresponding to the current image to the main chip comprises:
    judging whether the focal length information corresponding to the current image is the same as the focal length information corresponding to the current image at the previous moment;
    when the focal length information corresponding to the current image is different from the focal length information corresponding to the current image at the previous moment, transmitting the focal length information corresponding to the current image to the main chip;
    when the focal length information corresponding to the current image is the same as the focal length information corresponding to the current image at the previous moment, not transmitting the focal length information to the main chip, or transmitting to the main chip an identifier indicating that the focal length information is unchanged.
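Claim 15's rule of transmitting focal length information only when it changes can be sketched as a small stateful sender on the auxiliary-chip side; the message shapes used here are assumptions for illustration, not part of the claims.

```python
# Lightweight marker sent when the focal length is unchanged, standing in
# for the claim's "identifier indicating that the focal length information
# remains unchanged".
UNCHANGED = ("focal_unchanged",)

class FocalSender:
    """Decide, per frame, what the auxiliary chip sends to the main chip."""

    def __init__(self):
        self._last = None  # focal length sent for the previous frame

    def message_for(self, focal_mm):
        """Return the full focal value on change, a marker otherwise."""
        if focal_mm == self._last:
            return UNCHANGED
        self._last = focal_mm
        return ("focal", focal_mm)
```

Suppressing redundant transmissions keeps the inter-chip link quiet while auto-zoom is stable, at the cost of the main chip having to cache the last received value.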
PCT/CN2020/093101 2019-06-10 2020-05-29 Audio and video processing method and display device WO2020248829A1 (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
CN201910497121.4 2019-06-10
CN201910497121 2019-06-10
CN201910736428.5 2019-08-09
CN201910736428.5A CN112073663B (en) 2019-06-10 2019-08-09 Audio gain adjusting method, video chat method and display device

Publications (1)

Publication Number Publication Date
WO2020248829A1 true WO2020248829A1 (en) 2020-12-17

Family

ID=73658481

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/093101 WO2020248829A1 (en) 2019-06-10 2020-05-29 Audio and video processing method and display device

Country Status (2)

Country Link
CN (1) CN112073663B (en)
WO (1) WO2020248829A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115134499B (en) * 2022-06-28 2024-02-02 世邦通信股份有限公司 Audio and video monitoring method and system
CN115484406B (en) * 2022-08-21 2024-05-21 复旦大学 Multi-chip array communication method suitable for array camera

Citations (5)

Publication number Priority date Publication date Assignee Title
CN1901663A (en) * 2006-07-25 2007-01-24 华为技术有限公司 Video frequency communication system with sound position information and its obtaining method
CN105578097A (en) * 2015-07-10 2016-05-11 宇龙计算机通信科技(深圳)有限公司 Video recording method and terminal
CN106157986A (en) * 2016-03-29 2016-11-23 联想(北京)有限公司 A kind of information processing method and device, electronic equipment
US20170353811A1 (en) * 2016-06-03 2017-12-07 Nureva, Inc. Method, apparatus and computer-readable media for virtual positioning of a remote participant in a sound space
US20190069080A1 (en) * 2017-08-28 2019-02-28 Bose Corporation User-controlled beam steering in microphone array

Family Cites Families (9)

Publication number Priority date Publication date Assignee Title
KR101199349B1 (en) * 2004-08-27 2012-11-09 엘지전자 주식회사 Mobile phone having image communication function
US20070172083A1 (en) * 2006-01-25 2007-07-26 Cheng-Te Tseng Method and apparatus for controlling a gain of a voice signal
US8319858B2 (en) * 2008-10-31 2012-11-27 Fortemedia, Inc. Electronic apparatus and method for receiving sounds with auxiliary information from camera system
TW201019719A (en) * 2008-11-14 2010-05-16 Asia Optical Co Inc Gain-calibrating appararus for optical image stablizer and method thereof
CN101534413B (en) * 2009-04-14 2012-07-04 华为终端有限公司 System, method and apparatus for remote representation
JP5531774B2 (en) * 2010-05-20 2014-06-25 リコーイメージング株式会社 Automatic focusing apparatus and camera equipped with the same
JP5921121B2 (en) * 2011-09-09 2016-05-24 キヤノン株式会社 Imaging apparatus, control method and program thereof, and recording medium
CN103369209B (en) * 2013-07-31 2016-08-17 上海通途半导体科技有限公司 Vedio noise reduction device and method
CN106328156B (en) * 2016-08-22 2020-02-18 华南理工大学 Audio and video information fusion microphone array voice enhancement system and method


Also Published As

Publication number Publication date
CN112073663A (en) 2020-12-11
CN112073663B (en) 2023-08-11

Similar Documents

Publication Publication Date Title
WO2020248668A1 (en) Display and image processing method
WO2021031629A1 (en) Display apparatus, and multi-function button application method for control device
WO2021031623A1 (en) Display apparatus, file sharing method, and server
WO2021189358A1 (en) Display device and volume adjustment method
WO2020248714A1 (en) Data transmission method and device
US11917329B2 (en) Display device and video communication data processing method
WO2020248681A1 (en) Display device and method for displaying bluetooth switch states
WO2021031620A1 (en) Display device and backlight brightness adjustment method
WO2021031598A1 (en) Self-adaptive adjustment method for video chat window position, and display device
CN112788422A (en) Display device
WO2021031589A1 (en) Display device and dynamic color gamut space adjustment method
WO2020248829A1 (en) Audio and video processing method and display device
CN112463267B (en) Method for presenting screen saver information on display device screen and display device
CN112783380A (en) Display apparatus and method
WO2020248699A1 (en) Sound processing method and display apparatus
CN110602540B (en) Volume control method of display equipment and display equipment
WO2020248790A1 (en) Voice control method and display device
CN112399235B (en) Camera shooting effect enhancement method and display device of intelligent television
CN112073773A (en) Screen interaction method and device and display equipment
WO2020248886A1 (en) Image processing method and display device
WO2020248788A1 (en) Voice control method and display device
CN113242383B (en) Display device and image calibration method for automatic focusing imaging of display device
CN113630633B (en) Display device and interaction control method
CN112399223B (en) Method for improving moire fringe phenomenon and display device
CN113645502B (en) Method for dynamically adjusting control and display device

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20822037

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20822037

Country of ref document: EP

Kind code of ref document: A1