CN118283202A

CN118283202A - Display equipment and audio processing method

Info

Publication number: CN118283202A
Application number: CN202410210725.7A
Authority: CN
Inventors: 徐磊; 李文龙; 于清晓; 王之奎
Original assignee: Hisense Visual Technology Co Ltd
Current assignee: Hisense Visual Technology Co Ltd
Filing date: 2024-02-26
Publication date: 2024-07-02

Abstract

Some embodiments of the present application provide a display device and an audio processing method. In response to a play instruction for audio signals, the display device obtains the audio signals to be played, which comprise a first audio signal and a second audio signal. A main decoder performs hardware decoding on the first audio signal and a secondary decoder performs hardware decoding on the second audio signal, and a first mixing parameter of the first audio signal and a second mixing parameter of the second audio signal are generated. Based on the first mixing parameter and the second mixing parameter, a mixer mixes the decoded first audio signal and the decoded second audio signal to generate a mixed audio signal, which is then played. Thus, when the display device mixes a plurality of audio signals, the mixing parameters of the audio signals are adjusted as required, so that the mixing effect can meet the requirements of different usage scenarios and the user experience of the display device is improved.

Description

Display equipment and audio processing method

Technical Field

Embodiments of the present application relate to the technical field of display devices, and in particular to a display device and an audio processing method.

Background

A display device is an intelligent device capable of presenting a user interface and supporting user interaction. Taking a smart television as an example, the smart television is based on Internet application technology, has an open operating system and chip as well as an open application platform, supports two-way human-machine interaction, and integrates multiple functions such as video, entertainment, and data to meet the diversified and personalized needs of users. The display device includes a display and an audio system. The display is a core component of the display device and is used to display image and video content. The audio system is the component that plays sound and includes speakers, audio decoders, audio amplifiers, and the like. The audio system may provide stereo or surround sound effects.

While the display device is playing the audio of a video, other sounds sometimes need to be superimposed on that audio to alert the user to important information. For example, if the television receives an earthquake early warning while the user is watching a television program, an earthquake prompt tone is superimposed on the program audio.

However, the display device can only superimpose the sounds in a simple manner. The resulting sound often fails to provide a good audio-visual experience and does not highlight the important information, so the user can easily miss it, which degrades the user experience.

Disclosure of Invention

The exemplary embodiment of the application provides a display device and an audio processing method, which can improve the sound mixing effect of the display device and improve the user experience.

In one aspect, some embodiments of the present application provide a display device, including: a display configured to display a user interface; an audio output interface configured to play an audio signal; and a controller including a main decoder, at least one secondary decoder, and a mixer, wherein the main decoder is configured to perform hardware decoding on a first audio signal and the secondary decoder is configured to perform hardware decoding on a second audio signal; the controller being configured to:

in response to a play instruction for audio signals, acquire the audio signals to be played, wherein the audio signals to be played comprise the first audio signal and the second audio signal;

perform hardware decoding on the first audio signal by the main decoder and perform hardware decoding on the second audio signal by the secondary decoder;

generate a first mixing parameter of the first audio signal and a second mixing parameter of the second audio signal;

mix, by the mixer, the decoded first audio signal and the decoded second audio signal based on the first mixing parameter and the second mixing parameter to generate a mixed audio signal; and

play the mixed audio signal through the audio output interface.

In another aspect, some embodiments of the present application further provide an audio processing method, where the audio processing method is applied to the display device of the first aspect, and the display device includes a display, a playing component, and a controller, and the controller includes a main decoder, at least one sub-decoder, and a mixer, and the method includes:

The mixed audio signal is played through the audio output interface.

As can be seen from the above technical solutions, some embodiments of the present application provide a display device and an audio processing method. In response to a play instruction for audio signals, the method obtains the audio signals to be played, which comprise a first audio signal and a second audio signal. The main decoder performs hardware decoding on the first audio signal and the secondary decoder performs hardware decoding on the second audio signal, and a first mixing parameter of the first audio signal and a second mixing parameter of the second audio signal are generated. Based on the first mixing parameter and the second mixing parameter, the mixer mixes the decoded first audio signal and the decoded second audio signal to generate a mixed audio signal, which is then played. In this way, when the display device mixes a plurality of audio signals, the mixing parameters of each audio signal are adjusted as required, so that the mixing effect can meet the requirements of different usage scenarios and the user experience of the display device is improved.

Drawings

In order to illustrate the embodiments of the present application or the related art more clearly, the drawings required for describing the embodiments or the related art are briefly introduced below. The drawings described below show only some embodiments of the present application, and a person of ordinary skill in the art may obtain other drawings from them.

FIG. 1 is a usage scenario of a display device according to an embodiment of the present application;

FIG. 2 is a block diagram of a hardware configuration of a control device according to an embodiment of the present application;

FIG. 3 is a hardware configuration diagram of a display device according to an embodiment of the present application;

FIG. 4 is a software configuration diagram of a display device according to an embodiment of the present application;

FIG. 5 is a flow chart of an audio processing method according to an embodiment of the present application;

FIG. 6 is a flowchart of a method for controlling a sub-decoder to perform decoding according to an embodiment of the present application;

FIG. 7 is a flowchart of a method for generating a first mixing parameter according to an embodiment of the present application;

FIG. 8 is a flowchart of a method for generating a second mixing parameter according to an embodiment of the present application;

FIG. 9 is an input interface for a dialog application for a display device provided in an embodiment of the application;

FIG. 10 is an output interface for a dialog application for a display device provided in an embodiment of the application;

FIG. 11 is a flowchart of a processing method when there is an input event based on a voice application by a user according to an embodiment of the present application;

FIG. 12 is a flow chart of a method of generating a mixed audio signal provided in an embodiment of the present application;

FIG. 13 is a flowchart of still another method for generating a mixed audio signal according to an embodiment of the present application.

Detailed Description

To make the objects and embodiments of the present application clearer, exemplary embodiments of the present application are described in detail below with reference to the accompanying drawings. The exemplary embodiments described are only some, but not all, of the embodiments of the present application.

It should be noted that the brief description of the terminology in the present application is for the purpose of facilitating understanding of the embodiments described below only and is not intended to limit the embodiments of the present application. Unless otherwise indicated, these terms should be construed in their ordinary and customary meaning.

The terms "first", "second", "third", and the like in the description, the claims, and the above drawings are used to distinguish between identical or similar objects or entities and do not necessarily describe a particular order or sequence, unless otherwise indicated. It is to be understood that terms so used are interchangeable under appropriate circumstances.

The terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a product or apparatus that comprises a list of elements is not necessarily limited to all elements explicitly listed, but may include other elements not expressly listed or inherent to such product or apparatus.

The term "module" refers to any known or later developed hardware, software, firmware, artificial intelligence, fuzzy logic, or combination of hardware or/and software code that is capable of performing the function associated with that element.

Fig. 1 is a schematic diagram of a usage scenario of a display device according to an embodiment. As shown in fig. 1, the display device 200 is also in data communication with a server 400, and a user can operate the display device 200 through the smart device 300 or the control apparatus 100.

In some embodiments, the control apparatus 100 may be a remote controller. Communication between the remote controller and the display device includes at least one of infrared protocol communication, Bluetooth protocol communication, and other short-range communication modes, and the display device 200 is controlled wirelessly or by wire. The user may control the display device 200 by inputting user instructions through at least one of keys on the remote controller, voice input, control panel input, and the like.

In some embodiments, the smart device 300 may include any one of a mobile terminal, tablet, computer, notebook, AR/VR device, etc.

In some embodiments, the smart device 300 may also be used to control the display device 200. For example, the display device 200 is controlled using an application running on a smart device.

In some embodiments, the smart device 300 and the display device may also be used for communication of data.

In some embodiments, the display device 200 may also be controlled in ways other than through the control apparatus 100 and the smart device 300. For example, the user's voice commands may be received directly by a voice-acquisition module configured inside the display device 200, or by a voice control apparatus configured outside the display device 200.

In some embodiments, the display device 200 is also in data communication with a server 400. The display device 200 may be permitted to make communication connections via a Local Area Network (LAN), a Wireless Local Area Network (WLAN), and other networks. The server 400 may provide various contents and interactions to the display device 200. The server 400 may be a cluster, or may be multiple clusters, and may include one or more types of servers.

In some embodiments, software steps performed by one step execution body may migrate on demand to be performed on another step execution body in data communication therewith. For example, software steps executed by the server may migrate to be executed on demand on a display device in data communication therewith, and vice versa.

Fig. 2 exemplarily shows a block diagram of a configuration of the control apparatus 100 in accordance with an exemplary embodiment. As shown in fig. 2, the control apparatus 100 includes a first controller 110, a communication interface 130, a user input/output interface 140, a memory, and a power supply. The control apparatus 100 may receive an operation instruction input by a user and convert it into an instruction that the display device 200 can recognize and respond to, thereby acting as an intermediary between the user and the display device 200.

Fig. 3 shows a hardware configuration block diagram of the display device 200 in accordance with an exemplary embodiment.

In some embodiments, display apparatus 200 includes at least one of a modem 210, a communicator 220, a detector 230, a device interface 240, a controller 250, a display 260, an audio output interface 270, memory, a power supply, a user interface.

In some embodiments, the controller 250 includes a central processor, a video processor, an audio processor, a graphic processor, a RAM, a ROM, and first to nth interfaces for input/output.

In some embodiments the controller 250 further comprises a primary decoder, at least one secondary decoder and a mixer.

Wherein the main decoder is configured to perform hardware decoding on the first audio signal. The secondary decoder is configured to perform hardware decoding on the second audio signal. The mixer is configured to perform a mixing operation on the plurality of audio signals.

The main decoder and the auxiliary decoder are both hardware decoders, and can be Digital Signal Processing (DSP) chips or special decoding chips. The basic operation of the primary and secondary decoders comprises the following steps:

First, the main decoder and the secondary decoder receive digital audio data and identify its format, for example MPEG Audio Layer III (Moving Picture Experts Group Audio Layer III, MP3), Advanced Audio Coding (AAC), WAV, or Free Lossless Audio Codec (FLAC). Each audio coding format has its own compression and coding scheme, so the main decoder and the secondary decoder select the decoding algorithm that corresponds to the identified format. After the audio format is identified, the main decoder and the secondary decoder begin to decompress the audio data. For lossy compression formats (e.g., MP3, AAC), the decoders reconstruct an approximation of the original audio data, because the information discarded during encoding cannot be restored. For lossless compression formats (e.g., FLAC), the decoders recover the original audio data exactly. WAV files typically carry uncompressed PCM data, which can be read directly without decompression.
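The format-identification step described above can be illustrated in software. The following is a minimal sketch, not the decoders' actual firmware: it guesses the format from well-known leading bytes (the `fLaC` marker, the `RIFF`/`WAVE` header, an ID3v2 tag, and the MPEG/ADTS frame sync patterns). The function names and the `DECODERS` table are hypothetical and introduced only for illustration.

```python
from typing import Optional


def detect_audio_format(header: bytes) -> Optional[str]:
    """Guess the audio format from the first bytes of a stream.

    Illustrative only; a hardware decoder would do this internally.
    """
    if header[:4] == b"fLaC":
        return "FLAC"
    if len(header) >= 12 and header[:4] == b"RIFF" and header[8:12] == b"WAVE":
        return "WAV"
    if header[:3] == b"ID3":
        return "MP3"  # ID3v2 tag, which commonly precedes MP3 frames
    if len(header) >= 2 and header[0] == 0xFF:
        if (header[1] & 0xF6) == 0xF0:
            return "AAC"  # ADTS sync word with layer bits 00
        if (header[1] & 0xE0) == 0xE0 and (header[1] & 0x06) != 0:
            return "MP3"  # MPEG audio frame sync with non-zero layer bits
    return None


# Hypothetical table: format name -> name of the decoding routine to use.
DECODERS = {
    "MP3": "decode_mp3",
    "AAC": "decode_aac",
    "FLAC": "decode_flac",
    "WAV": "read_pcm",
}


def select_decoder(header: bytes) -> str:
    """Pick the decoding routine that matches the detected format."""
    fmt = detect_audio_format(header)
    if fmt is None:
        raise ValueError("unrecognized audio format")
    return DECODERS[fmt]
```

Real streams may need deeper inspection than the first few bytes (for example, skipping a tag before the first frame), so this sketch should be read as a simplification.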

The main decoder and the sub-decoder convert the decompressed digital audio data into analog audio signals for audio output. This is typically done by a Digital-to-Analog Converter (DAC) that converts the Digital audio signal to an Analog audio signal.

The decoding mode of the primary decoder and the secondary decoder is generally implemented by circuits and algorithms inside the decoding chip. Different decoders may employ different decoding algorithms and techniques to meet various audio decoding requirements and provide high quality audio output.

In some embodiments, the display 260 includes a display screen component for presenting a picture and a driving component for driving image display. The display 260 receives image signals output by the controller and displays video content, image content, menu manipulation interfaces, user manipulation UI interfaces, and the like.

In some embodiments, the display 260 may be at least one of a liquid crystal display, an organic light-emitting diode (Organic Light-Emitting Diode, OLED) display, and a projection display, and may also be a projection device with a projection screen.

In some embodiments, communicator 220 is a component for communicating with external devices or servers according to various communication protocol types.

In some embodiments, the detector 230 is used to collect signals from the external environment or from interaction with the outside. For example, the detector 230 includes a light receiver, which is a sensor for capturing the intensity of ambient light; or the detector 230 includes an image collector, such as a camera, which may be used to collect external environmental scenes, user attributes, or user interaction gestures; or the detector 230 includes a sound collector, such as a microphone, for receiving external sounds.

In some embodiments, the device interface 240 may include, but is not limited to, the following: a high definition multimedia interface (High Definition Multimedia Interface, HDMI), an analog or digital high-definition component input interface (component), a composite video input interface (CVBS), a universal serial bus (Universal Serial Bus, USB) input interface, an RGB port, and the like. The device interface may also be a composite input/output interface formed by a plurality of the above interfaces.

In some embodiments, the controller 250 controls the operation of the display device and responds to user operations through various software control programs stored on the memory. The controller 250 controls the overall operation of the display apparatus 200. For example: in response to receiving a user command to select a UI object to be displayed on the display 260, the controller 250 may perform an operation related to the object selected by the user command.

In some embodiments, the object may be any selectable object, such as a hyperlink, an icon, or another operable control. The operation related to the selected object is, for example, displaying the page, document, or image that a hyperlink points to, or executing the program corresponding to an icon.

In some embodiments, the user interface 280 is an interface (e.g., physical keys on a display device body, or the like) that may be used to receive control inputs.

Fig. 4 is a schematic diagram of a software configuration of a display device according to some embodiments of the present application. In some embodiments, the system of the display device 200 may be divided into three layers, which are, from top to bottom, an application layer, a middleware layer, and a hardware layer.

The application layer mainly comprises the applications on the television and an application framework (Application Framework). The applications are mainly browser-based applications, such as HTML5 apps, and native applications (Native Apps).

The application framework (Application Framework) is a complete program model that provides all the basic functions required by standard application software, such as file access and data exchange, together with the interface elements for using these functions (toolbar, status bar, menu, dialog box).

The native application (NATIVE APPS) may support online or offline, message pushing, or local resource access.

The middleware layer includes middleware such as various television protocols, multimedia protocols, and system components. The middleware can use basic services (functions) provided by the system software to connect various parts of the application system or different applications on the network, so that the purposes of resource sharing and function sharing can be achieved.

The hardware layer mainly comprises a HAL interface, hardware, and drivers. The HAL interface is a unified interface to which all television chips are adapted, and the specific logic is implemented by each chip. The drivers mainly include: an audio driver, a display driver, a Bluetooth driver, a camera driver, a Wi-Fi driver, a USB driver, an HDMI driver, sensor drivers (e.g., for a fingerprint sensor, temperature sensor, pressure sensor, etc.), a power supply driver, and the like.

The hardware layer may communicate with and control the sound decoders (e.g., the main decoder and secondary decoder in embodiments of the present application) and the underlying audio interfaces. The hardware layer may also provide drivers for the sound decoders to perform functions such as initializing, configuring, and controlling them.

In some embodiments, the display device 200 further includes a business logic layer and an adaptation layer. The business logic layer can process audio data: it decodes the input audio data and parses and processes it according to the audio format and coding standard. It adjusts the sound quality and effect of the audio by means of equalizers, surround sound, reverberation, and the like. The business logic layer can also perform volume control, channel selection, audio source switching, and the like to meet the user's personalized needs and preferences for sound.

In some implementations, the display device 200 may interact with an audio data source (e.g., audio file, streaming media, etc.) and an audio output device. The display device 200 may provide an interactive interface with a source of audio data, such as a communicator 220, to read and process audio data. The display apparatus 200 may also interact with an audio output apparatus including speakers, headphones, audio interface, etc. through the device interface 240 or audio output interface, and the display apparatus 200 may also control the output channels of audio and make volume adjustments of the output audio.

While playing a media asset, if a prompt tone or voice broadcast information needs to be played, the display device 200 needs to mix the media asset audio with the prompt tone or voice broadcast information and play the mixed audio signal. In some embodiments, the display device 200 may perform audio mixing through a mixer, which first performs tuning and balancing processing on the input audio signals, then mixes the processed audio signals and combines them into a single audio output signal. During mixing, the mixer may set the mixing proportion of the different audio signals and apply effect processing such as reverberation, delay, and equalization. Finally, the mixer outputs the mixed audio signal. Through these steps, the mixer can mix a plurality of audio signals together and perform tuning, balancing, and effect processing as required, thereby realizing the audio mixing function.

However, when the display device 200 mixes audio with the mixer in this way, it can only superimpose the prompt tone or voice broadcast information on the audio of the television program, and the mixed result does not emphasize the prompt tone or voice broadcast information.
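The basic mixing operation described above, combining several audio signals into one output with a mixing proportion per signal, can be sketched as a weighted sum. This is a minimal illustration under stated assumptions: samples are floating-point values normalized to [-1.0, 1.0], all signals share one sample rate, and the function name `mix` and its `weights` argument are hypothetical rather than taken from the application.

```python
from typing import List, Sequence


def mix(signals: Sequence[Sequence[float]], weights: Sequence[float]) -> List[float]:
    """Weighted sum of equal-rate signals, clipped to [-1.0, 1.0].

    Shorter signals are treated as silent once they run out of samples.
    """
    if len(signals) != len(weights):
        raise ValueError("one weight per signal is required")
    length = max((len(s) for s in signals), default=0)
    mixed = []
    for i in range(length):
        total = sum(w * s[i] for s, w in zip(signals, weights) if i < len(s))
        # Hard clipping keeps the sum inside the valid sample range.
        mixed.append(max(-1.0, min(1.0, total)))
    return mixed
```

For example, `mix([program, prompt], [0.3, 1.0])` would attenuate the program audio while keeping the prompt tone at full level. A simple superposition corresponds to equal weights.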

In order to improve the audio mixing effect of the display device 200, so that the display device 200 can meet the requirements of various usage scenarios when performing audio mixing, and improve the experience of the user in using the display device 200, some embodiments of the present application provide an audio processing method, which is executed by the controller 250 in the embodiments of the present application. Fig. 5 is a flowchart of an audio processing method according to some embodiments of the present application.

In order to facilitate further understanding of the technical solutions in some embodiments of the present application, the following describes in detail each step of the audio processing method provided in the embodiments of the present application with reference to some specific embodiments and the accompanying drawings.

S100: In response to a play instruction for an audio signal, acquire the audio signal to be played.

The audio signal to be played comprises a first audio signal and a second audio signal. The first audio signal and the second audio signal may be speech, music, or sound effects. In some embodiments, the first audio signal may be media playback audio and the second audio signal may be warning audio, such as an earthquake warning tone.

In some embodiments, the play instruction for the audio signal may be generated automatically by the display device 200 or input by a user. Illustratively, the user may input a play instruction through a key of the control apparatus 100 or through a voice application of the display device 200. For example, when the user says "please play notification audio and music audio" and the display device 200 recognizes this voice input event, the display device 200 is considered to have received a play instruction for the audio signal input by the user.

The play instruction for the audio signal may specify or be associated with the audio signal to be played. For example, the user may directly specify the audio signal to be played when inputting voice; for play instructions automatically generated by the display device 200, different types of play instructions may be associated with different audio signals to be played. The association between audio signals to be played and play instructions may be stored in the display device 200 in advance, so that after the display device 200 automatically generates a play instruction, it can quickly determine the audio signal to be played from this association.
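The pre-stored association between automatically generated play instructions and the audio signals to be played can be pictured as a simple lookup table. The sketch below is illustrative only; the instruction type names, the audio identifiers, and the function `resolve_audio_signals` are all hypothetical.

```python
from typing import Dict, Tuple

# Hypothetical pre-stored association:
# instruction type -> (first audio signal, second audio signal).
PLAY_ASSOCIATIONS: Dict[str, Tuple[str, str]] = {
    "earthquake_warning": ("media_audio", "earthquake_warning_tone"),
    "voice_broadcast": ("media_audio", "voice_broadcast_audio"),
}


def resolve_audio_signals(instruction_type: str) -> Tuple[str, str]:
    """Look up the audio signals associated with a play instruction type."""
    try:
        return PLAY_ASSOCIATIONS[instruction_type]
    except KeyError:
        raise ValueError(f"no audio associated with {instruction_type!r}") from None
```

A table lookup of this kind is what allows the audio to be determined quickly once the instruction type is known.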

The audio signal playing instruction may further include first information related to a volume parameter of the first audio signal, second information related to a gain parameter of the first audio signal, third information related to a balance parameter of the first audio signal, fourth information related to a volume parameter of the second audio signal, fifth information related to a gain parameter of the second audio signal, and sixth information related to a balance parameter of the second audio signal.

In some embodiments, the first information related to the volume parameter of the first audio signal may include "set the volume of audio A to 50" or "turn the volume of audio A up". The second information related to the gain parameter of the first audio signal may include "set the gain of audio A to 60" or "set the gain of audio A to be greater than the gain of audio B". The third information related to the balance parameter of the first audio signal may include "set the equalizer frequency of audio A to 10kHz" or "set the equalizer frequency of audio A to be higher than the equalizer frequency of audio B". Here "audio A" refers to the first audio signal and "audio B" refers to the second audio signal.

The fourth information related to the volume parameter of the second audio signal may include "set the volume of audio B to 40" or "turn the volume of audio B down". The fifth information related to the gain parameter of the second audio signal may include "set the gain of audio B to 50" or "set the gain of audio B to be smaller than the gain of audio A". The sixth information related to the balance parameter of the second audio signal may include "set the equalizer frequency of audio B to 1kHz" or "set the equalizer frequency of audio B to be lower than the equalizer frequency of audio A".
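To make the six kinds of information concrete, the following sketch extracts absolute settings of the form "set the volume/gain/equalizer frequency of audio A/B to N" from an instruction text. It is a minimal illustration, not the recognition method of the application: the `MixingRequest` class, the regular expression, and `parse_instruction` are hypothetical, and relative phrases such as "turn the volume up" or "greater than the gain of audio B" are deliberately not handled.

```python
import re
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class MixingRequest:
    """Requested settings for one audio signal; None means 'not specified'."""
    volume: Optional[float] = None
    gain: Optional[float] = None
    eq_frequency_hz: Optional[float] = None


_PATTERN = re.compile(
    r"set the (volume|gain|equalizer frequency) of audio ([AB]) "
    r"to (\d+(?:\.\d+)?)\s*(kHz|Hz)?",
    re.IGNORECASE,
)


def parse_instruction(text: str) -> Dict[str, MixingRequest]:
    """Collect absolute settings for audio A and audio B from the text."""
    requests = {"A": MixingRequest(), "B": MixingRequest()}
    for name, audio, value, unit in _PATTERN.findall(text):
        number = float(value)
        target = requests[audio.upper()]
        name = name.lower()
        if name == "volume":
            target.volume = number
        elif name == "gain":
            target.gain = number
        else:
            # Normalize equalizer frequencies to hertz.
            target.eq_frequency_hz = number * 1000 if unit.lower() == "khz" else number
    return requests
```

Usage: `parse_instruction("set the volume of audio A to 50")["A"].volume` yields `50.0`, while fields that the instruction does not mention stay `None`.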

S200: Perform hardware decoding on the first audio signal by the main decoder, and perform hardware decoding on the second audio signal by the secondary decoder.

The main decoder and the secondary decoder in the embodiments of the application are both hardware audio decoders. There may be a plurality of secondary decoders. Two or more audio signals may be hardware-decoded by the main decoder and the secondary decoders respectively, and the decoded audio signals may then be mixed to generate a mixed audio signal, which is played.

In some embodiments, the display device 200 may only need to play the first audio signal, in which case only the main decoder needs to be enabled and the secondary decoder remains in a standby state. When the audio signal to be played includes a second audio signal, extracting the second audio signal from the audio signal to be played generates a run event. At this time, a start decoding instruction is generated, and the secondary decoder is started to perform hardware decoding on the second audio signal.

As shown in fig. 6, the controller 250 may perform hardware decoding on the first audio signal by the main decoder and on the second audio signal by the secondary decoder as follows. A first audio signal is extracted from the audio signal to be played and transmitted to the main decoder, so that the main decoder performs hardware decoding on the first audio signal after receiving it. A second audio signal is then extracted from the audio signal to be played. In response to the run event (Run Event) of extracting the second audio signal from the audio signal to be played, a start decoding instruction is generated. A run event is a specific event that occurs while a program is running, such as user input, a system error, or a change in the running state of the program. A run event typically triggers a corresponding handler or event-handling function so that the program can respond when the event occurs. After generating the start decoding instruction, the controller 250 transmits the second audio signal and the start decoding instruction to the secondary decoder, so that the secondary decoder performs hardware decoding on the second audio signal after receiving them.

In this way, the second audio signal and the start decoding instruction are sent to the secondary decoder only after the second audio signal has been extracted from the audio signal to be played. The secondary decoder is therefore started only when decoding is needed, which reduces the energy wasted by keeping it running continuously, while still being started in time to meet the decoding requirements of the display device 200.
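The on-demand start of the secondary decoder can be sketched as an event handler that runs only when a second audio signal has been extracted. This is a minimal software model under stated assumptions: the `Decoder` class merely records requests instead of driving hardware, and the names `Controller`, `play`, and `on_second_signal_extracted` are hypothetical.

```python
from typing import List, Optional


class Decoder:
    """Stand-in for a hardware decoder; records what it was asked to decode."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.running = False
        self.decoded: List[str] = []

    def decode(self, signal: str) -> None:
        self.running = True
        self.decoded.append(signal)


class Controller:
    def __init__(self) -> None:
        self.main = Decoder("main")
        self.secondary = Decoder("secondary")  # stays in standby until needed

    def play(self, first: str, second: Optional[str] = None) -> None:
        # The first audio signal always goes to the main decoder.
        self.main.decode(first)
        if second is not None:
            # Extracting a second signal raises the run event.
            self.on_second_signal_extracted(second)

    def on_second_signal_extracted(self, second: str) -> None:
        # Run-event handler: issue the start-decoding instruction and
        # forward the second signal to the secondary decoder.
        self.secondary.decode(second)
```

With this structure, playing only media audio never touches the secondary decoder, which reflects the energy-saving behavior described above.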

S300: a first mixing parameter of the first audio signal is generated.

In some embodiments, the first mixing parameters of the first audio signal comprise one or more combinations of the following parameters: volume parameters, gain parameters, balance parameters.

The volume parameter is a parameter for adjusting the volume of the audio channel where the first audio signal is located. The volume parameter may be a channel volume (Channel Volume), with each audio channel having an independent volume parameter. By adjusting the volume parameters of the individual channels, the volume of each channel's audio in the mix can be controlled.

The gain parameter is used to adjust the intensity of the audio signal. The gain parameters may include an input gain (Input Gain) and a channel gain (Channel Gain). The input gain on each audio channel controls the initial level of the input signal: increasing the input gain increases the intensity of the input signal, and decreasing the input gain decreases it. Using the input gain helps bring all audio sources to a balanced initial level. The channel gain is used to adjust the volume ratio of each audio channel in the mixing output.

The balance parameter is a parameter for adjusting the equalizer during mixing. An equalizer (Equalizer) is a tool for adjusting the frequency response of an audio signal; it can enhance or attenuate audio content in a particular frequency band to achieve a desired sound effect. The equalizer may have the following parameters. The center frequency (Center Frequency) determines the frequency band to be adjusted; an equalizer is typically divided into a plurality of frequency bands, each having a center frequency. The gain (Gain) determines how much the frequency band is enhanced or attenuated; increasing or decreasing the gain value changes the level of that band. The bandwidth (Bandwidth) determines the range of the frequency band: a narrower bandwidth means that only frequencies close to the center frequency are adjusted, whereas a wider bandwidth affects a wider frequency range. By adjusting the equalizer parameters appropriately, the balance and clarity of the audio can be improved to meet the requirements of the mixing process. Different types of music and sound material may require different equalizer settings.
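How the center frequency, gain, and bandwidth of one equalizer band act on a signal can be illustrated with a standard peaking filter. The application does not specify a filter design; the sketch below swaps in the widely used peaking-EQ biquad formulas from Robert Bristow-Johnson's Audio EQ Cookbook as an assumption, and it expresses bandwidth through the quality factor `q` (a larger `q` gives a narrower band). The function names and the 48 kHz sample rate are illustrative.

```python
import cmath
import math
from typing import Tuple

Coeffs = Tuple[float, float, float]


def peaking_eq_coefficients(center_hz: float, gain_db: float, q: float,
                            sample_rate_hz: float) -> Tuple[Coeffs, Coeffs]:
    """Return normalized (b, a) biquad coefficients for one peaking band."""
    amplitude = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * math.pi * center_hz / sample_rate_hz
    alpha = math.sin(w0) / (2.0 * q)  # larger q means narrower bandwidth
    a0 = 1.0 + alpha / amplitude
    b = ((1.0 + alpha * amplitude) / a0,
         -2.0 * math.cos(w0) / a0,
         (1.0 - alpha * amplitude) / a0)
    a = (1.0,
         -2.0 * math.cos(w0) / a0,
         (1.0 - alpha / amplitude) / a0)
    return b, a


def magnitude_at(b: Coeffs, a: Coeffs, freq_hz: float,
                 sample_rate_hz: float) -> float:
    """Evaluate the magnitude response of the biquad at one frequency."""
    z_inv = cmath.exp(-1j * 2.0 * math.pi * freq_hz / sample_rate_hz)
    numerator = b[0] + b[1] * z_inv + b[2] * z_inv ** 2
    denominator = a[0] + a[1] * z_inv + a[2] * z_inv ** 2
    return abs(numerator / denominator)


# One band: center frequency 1 kHz, +6 dB gain, q = 1, 48 kHz sampling.
b, a = peaking_eq_coefficients(1000.0, 6.0, 1.0, 48000.0)
peak_gain = magnitude_at(b, a, 1000.0, 48000.0)

# With 0 dB gain the band leaves the signal unchanged (b equals a).
flat_b, flat_a = peaking_eq_coefficients(1000.0, 0.0, 1.0, 48000.0)
```

At the center frequency the filter applies exactly the requested gain, and frequencies far from the center are left nearly untouched, which matches the roles of the three parameters described above.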

In some embodiments, the controller 250 may generate the first mixing parameters of the first audio signal as follows. First information related to the volume parameter of the first audio signal is extracted from the play instruction, and the volume parameter in the first mixing parameters is generated according to the first information. Second information related to the gain parameter of the first audio signal is extracted from the play instruction, and the gain parameter in the first mixing parameters is generated according to the second information. Third information related to the balance parameter of the first audio signal is extracted from the play instruction, and the balance parameter in the first mixing parameters is generated according to the third information.

Illustratively, the controller 250 extracts the first information of "set the volume of the audio a to 50" in the play instruction, wherein "audio a" is the first audio signal, and the controller 250 generates the volume parameter "50" in the first mixing parameters. The controller 250 extracts the second information of "set gain of audio a to 60" in the play instruction, and the controller 250 generates the gain parameter "60" in the first mixing parameter. The controller 250 extracts the third information of "set equalizer frequency of audio a to 10kHz" in the play instruction, and the controller 250 generates the balance parameter "10kHz" in the first mixing parameters.

In some embodiments, the play instruction may include all of the first information, the second information, and the third information, or only one or two of them. The controller 250 performs each extraction step only when the corresponding information is present. If the play instruction includes the first information, the controller 250 extracts the first information related to the volume parameter of the first audio signal and generates the volume parameter in the first mixing parameters according to the first information. If the play instruction includes the second information, the controller 250 extracts the second information related to the gain parameter of the first audio signal and generates the gain parameter in the first mixing parameters according to the second information. If the play instruction includes the third information, the controller 250 extracts the third information related to the balance parameter of the first audio signal and generates the balance parameter in the first mixing parameters according to the third information.

S400: a second mixing parameter of the second audio signal is generated.

Like the first mixing parameters, the second mixing parameters may include one or a combination of the volume parameter, the gain parameter, and the balance parameter.

In some embodiments, the controller 250 may generate the second mixing parameters of the second audio signal as follows. Fourth information related to the volume parameter of the second audio signal is extracted from the play instruction, and the volume parameter in the second mixing parameters is generated according to the fourth information. Fifth information related to the gain parameter of the second audio signal is extracted from the play instruction, and the gain parameter in the second mixing parameters is generated according to the fifth information. Sixth information related to the balance parameter of the second audio signal is extracted from the play instruction, and the balance parameter in the second mixing parameters is generated according to the sixth information.

Illustratively, the controller 250 extracts the fourth information "set the volume of the audio B to 40" in the play instruction, wherein "audio B" is the second audio signal, and the controller 250 generates the volume parameter "40" in the second mixing parameter. The controller 250 extracts fifth information of "set gain of audio B to 50" in the play instruction, and the controller 250 generates gain parameter "50" in the second mixing parameter. The controller 250 extracts sixth information of "set equalizer frequency of audio B to 1kHz" in the play instruction, and the controller 250 generates the balance parameter "1kHz" in the second mixing parameters.

In some embodiments, the play instruction may include all of the fourth information, the fifth information, and the sixth information, or only one or two of them. The controller 250 performs each extraction step only when the corresponding information is present. If the play instruction includes the fourth information, the controller 250 extracts the fourth information related to the volume parameter of the second audio signal and generates the volume parameter in the second mixing parameters according to the fourth information. If the play instruction includes the fifth information, the controller 250 extracts the fifth information related to the gain parameter of the second audio signal and generates the gain parameter in the second mixing parameters according to the fifth information. If the play instruction includes the sixth information, the controller 250 extracts the sixth information related to the balance parameter of the second audio signal and generates the balance parameter in the second mixing parameters according to the sixth information.
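The conditional extraction described for the first and second mixing parameters can be sketched as follows. This is only an illustration: the application does not define the format of the play instruction, so the nested mapping, the keys `volume`, `gain`, and `equalizer_hz`, and the names `MixingParams` and `generate_mixing_params` are all assumptions. The values for audio A follow the example given earlier; the entry for audio B deliberately carries only a volume to show the partial case.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class MixingParams:
    volume: Optional[float] = None
    gain: Optional[float] = None
    balance_hz: Optional[float] = None


def generate_mixing_params(
    play_instruction: Mapping[str, Mapping[str, float]], audio_id: str
) -> MixingParams:
    """Build mixing parameters from whichever fields the instruction carries."""
    info = play_instruction.get(audio_id, {})
    params = MixingParams()
    if "volume" in info:          # first / fourth information
        params.volume = info["volume"]
    if "gain" in info:            # second / fifth information
        params.gain = info["gain"]
    if "equalizer_hz" in info:    # third / sixth information
        params.balance_hz = info["equalizer_hz"]
    return params


# Hypothetical play instruction: audio A is fully specified, audio B is not.
instruction = {
    "audio_a": {"volume": 50, "gain": 60, "equalizer_hz": 10_000},
    "audio_b": {"volume": 40},
}
first_params = generate_mixing_params(instruction, "audio_a")
second_params = generate_mixing_params(instruction, "audio_b")
```

Fields that the play instruction does not mention stay unset (`None`), so a later stage can decide whether to fall back to a default value.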

In some embodiments, there may be more than one second audio signal in the audio to be played, where for each second audio signal, a corresponding second mixing parameter is generated, so as to ensure that the mixing effect of each second audio signal can be controlled during the mixing process.

S500: based on the first mixing parameter and the second mixing parameter, mixing is performed on the decoded first audio signal and the decoded second audio signal by a mixer to generate a mixed audio signal.

In some embodiments, when the display apparatus 200 performs mixing on the decoded first audio signal and the decoded second audio signal through the mixer, the following operations may be carried out. Channel parameters are set according to the first mixing parameter and the second mixing parameter: for each channel, the gain, volume, balance, and so on are set according to the corresponding mixing parameters. The gain controls the intensity of the input signal, the volume controls the level of the channel output, and the balance adjusts the balance between the left and right channels. The main volume is adjusted, which sets the main volume of the overall mixed output and controls the overall volume level. The signals are mixed, which combines the audio signals of the different channels. Effects are applied: special effects such as reverberation and delay may be applied during mixing. The output is monitored and adjusted: during mixing, the mixed output is continuously monitored and adjusted as needed, to ensure that the volume balance of each audio source and the overall mixing effect meet the requirements.
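The core of this mixing step (per-channel scaling, summation, and main volume) can be sketched in a few lines. The sketch assumes samples normalized to the range -1.0 to 1.0 and treats gain, volume, and main volume as linear multipliers; the application does not state the scale of values such as "50", so this mapping is an assumption, and effects and equalization are omitted. The name `mix` is illustrative.

```python
from typing import List, Sequence


def mix(channels: Sequence[Sequence[float]],
        gains: Sequence[float],
        volumes: Sequence[float],
        master_volume: float = 1.0) -> List[float]:
    """Scale each channel by gain and volume, sum, apply master, and clip."""
    length = max(len(samples) for samples in channels)
    mixed = [0.0] * length
    for samples, gain, volume in zip(channels, gains, volumes):
        for i, sample in enumerate(samples):
            mixed[i] += sample * gain * volume
    # Clip to [-1.0, 1.0] so the summed output stays within range.
    return [max(-1.0, min(1.0, s * master_volume)) for s in mixed]


# Media audio kept quiet (0.3) under a louder, shorter voice signal (0.5).
media = [0.5, -0.5, 0.5, -0.5]
voice = [0.8, 0.8]
mixed_signal = mix([media, voice], gains=[1.0, 1.0], volumes=[0.3, 0.5])
```

Shorter channels simply stop contributing once their samples run out, and clipping is the simplest possible guard against overflow; a production mixer would more likely use a limiter.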

S600: the mixed audio signal is played through the audio output interface.

The display device 200 may provide a variety of audio output interfaces for transmitting audio signals to external audio devices or headphones. The mixed audio signal can be played through these audio output interfaces. The audio output interface is an audio/video (A/V) interface, which is a physical connection interface for transmitting audio and video signals.

In some implementations, the display device 200 is configured with a voice application. A voice application is an application program that interacts with the user by means of voice technology. A user may interact with the display device 200 using the voice application. While the display device 200 performs voice interaction, the effect of the voice interaction can be improved by adjusting the first mixing parameter and the second mixing parameter.

When the display device 200 detects a user voice input event, a user interface as shown in fig. 9 may be displayed, and the audio play volume of the audio output interface may be reduced or the audio output interface may be controlled to pause audio play. After a voice instruction input by the user is received, a voice audio signal to be output is generated according to the voice instruction. The voice instruction of the user may include the voice content "volume up notification". In this case, an input box 901 of the dialog application is displayed on the user interface, and the input box 901 includes a microphone icon and the text corresponding to the received voice instruction, for example, "volume up notification".

The voice audio signal generated by the display apparatus 200 to be output may be "good, the volume of the notification has been turned up". The voice audio signal is decoded by the sub decoder, and the mixer performs mixing on the decoded voice audio signal and the first audio signal based on the first mixing parameter and the second mixing parameter to generate a voice broadcast signal. The first audio signal may be a media audio signal, and the volume in the first mixing parameter corresponding to the first audio signal may be lower than the volume in the second mixing parameter corresponding to the voice audio signal. When the voice broadcast signal is played, the effect is that the louder voice audio signal is superimposed on the quieter television background sound. For example, when the program played by the display device 200 is the audio signal of audio resource A, the audio signal of audio resource A is played at a low volume and the voice audio signal "good, the volume of the notification has been turned up" is played at a high volume, so that the voice audio signal stands out more clearly.

The user interface for outputting the voice broadcast signal is shown in fig. 10 and includes an output box 1001 of the dialog application. Text corresponding to the output voice of the dialog application is displayed in the output box 1001, so that the user can determine the output content of the dialog application more clearly from the displayed text. For example, the output box 1001 may include the text "good, the volume of the notification has been turned up".

When the display device 200 detects the presence of a user voice input event, the controller 250 of the display device may process the user voice input event in the manner shown in fig. 11.

As shown in fig. 11, when there is an input event of the user based on the voice application, the controller 250 may perform the following steps. In response to the input event of the user based on the voice application, the volume parameter in the first mixing parameters is modified to a target volume. A voice instruction input by the user is received. Speech recognition is performed on the voice instruction to generate text information. Keyword extraction is performed on the text information. A voice audio signal fed back by the voice application based on the keyword is acquired. Decoding is performed on the voice audio signal by the sub decoder. Mixing is performed, by the mixer, on the decoded voice audio signal and the first audio signal based on the first mixing parameter and the second mixing parameter to obtain a voice broadcast signal. The audio output interface is controlled to play the voice broadcast signal.

The target volume is used for reducing the audio play volume of the audio output interface, or for controlling the audio output interface to pause audio play. When the target volume is used for controlling the audio output interface to pause audio play, the television video may either continue playing or also be paused. If the television video continues playing, then when play of the audio signal corresponding to the television video is resumed, the played audio is synchronized with the television video.

For example, the correspondence between keywords and the fed-back voice audio signals may be preset, or the voice audio signal may be generated by a machine learning algorithm based on the keyword. If the keyword "shutdown" exists in the text information, the voice audio signal fed back based on the keyword may be "hello, shutting down soon". If the keyword "volume up" exists in the text information, the voice audio signal fed back based on the keyword may be "hello, the volume has been turned up".
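The preset correspondence between keywords and feedback can be sketched as a simple lookup. The table contents, the substring matching, and the names `FEEDBACK_BY_KEYWORD` and `feedback_for` are illustrative assumptions; the feedback is represented as text here, whereas the display device would obtain a voice audio signal, and a real keyword extractor would be more robust than a substring search.

```python
from typing import Dict, Optional

# Hypothetical preset table mapping keywords to feedback text.
FEEDBACK_BY_KEYWORD: Dict[str, str] = {
    "shutdown": "Hello, shutting down soon.",
    "volume up": "Hello, the volume has been turned up.",
}


def feedback_for(text: str) -> Optional[str]:
    """Return the preset feedback for the first keyword found in the text."""
    lowered = text.lower()
    for keyword, feedback in FEEDBACK_BY_KEYWORD.items():
        if keyword in lowered:
            return feedback
    return None  # no preset keyword matched


reply = feedback_for("Volume up notification")
no_reply = feedback_for("Play a movie")
```

When no keyword matches, the function returns `None`, which is where a generated response (for example from a machine learning model) could be substituted.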

By modifying the volume parameter in the first mixing parameter to be the target volume in response to the input event of the user based on the voice application, and performing the mixing on the decoded voice audio signal and the first audio signal by the mixer based on the first mixing parameter and the second mixing parameter to obtain the voice broadcast signal, the user can more clearly hear the feedback of the voice application by adjusting the first mixing parameter and the second mixing parameter when using the voice application on the display device 200, thereby improving the user experience in the voice interaction process.

In some embodiments, a default mixing rule corresponding to the type of the first audio signal and the type of the second audio signal may be preconfigured in the display apparatus 200, and when mixing, mixing is performed by the mixer in combination with the default mixing rule, the first mixing parameter, and the second mixing parameter.

For example, the type of the first audio signal is media play audio, the type of the second audio signal is early warning audio, and the corresponding default mixing rule may be that the volume of the first audio signal is lower than the volume of the second audio signal. At this time, if the generated first mixing parameter includes a balance parameter of the first audio signal and the generated second mixing parameter includes a balance parameter of the second audio signal, the mixer may perform mixing of the first audio signal and the second audio signal with reference to the above-described default mixing rule, the first mixing parameter, and the second mixing parameter.

As shown in fig. 12, in some embodiments, the controller 250 may also generate a mixed audio signal according to a default mixing rule set by the audio signal type, i.e., the controller 250 may acquire the type of the first audio signal and the type of the second audio signal. And acquiring default mixing rules corresponding to the first audio signal and the second audio signal according to the type of the first audio signal and the type of the second audio signal. And performing, by the mixer, mixing of the decoded first audio signal and the decoded second audio signal in combination with the default mixing rule, the first mixing parameter and the second mixing parameter to generate a mixed audio signal.

In some embodiments, before acquiring the default mixing rule corresponding to the first audio signal and the second audio signal according to the type of the first audio signal and the type of the second audio signal, the controller 250 may determine whether such a default mixing rule exists. If a default mixing rule corresponding to the type of the first audio signal and the type of the second audio signal exists, the controller 250 then performs the step of acquiring the default mixing rule corresponding to the first audio signal and the second audio signal.

Here, the default mixing rule corresponding to the first audio signal and the second audio signal is a default mixing rule preconfigured in the display apparatus 200 and associated with the type of the first audio signal and the type of the second audio signal.

By setting the default mixing rule, the mixing rule of most audio signal types can be preset, and when mixing, each parameter of the mixing is not required to be temporarily set, so that the setting time can be saved, the mixing speed can be accelerated, the waiting time of a user can be reduced, and the user experience can be improved.
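A preconfigured rule table keyed by the pair of audio types, together with the existence check, can be sketched as follows. The application says only that the default mixing rule is used "in combination with" the mixing parameters; the choice made here, that explicit parameters take precedence and defaults fill in what is missing, is an assumption, as are the names `DefaultRule`, `DEFAULT_RULES`, and `effective_volumes` and the type labels.

```python
from typing import Dict, NamedTuple, Optional, Tuple


class DefaultRule(NamedTuple):
    first_volume: int
    second_volume: int


# Hypothetical preconfigured table keyed by (first type, second type).
DEFAULT_RULES: Dict[Tuple[str, str], DefaultRule] = {
    ("media", "warning"): DefaultRule(first_volume=30, second_volume=50),
    ("speech", "music"): DefaultRule(first_volume=50, second_volume=30),
}


def effective_volumes(
    first_type: str,
    second_type: str,
    first_volume: Optional[int],
    second_volume: Optional[int],
) -> Tuple[Optional[int], Optional[int]]:
    """Fill unset volumes from the default rule, if one exists for the types."""
    rule = DEFAULT_RULES.get((first_type, second_type))
    if rule is None:
        # No default rule for this type pair: use the parameters as given.
        return first_volume, second_volume
    return (
        first_volume if first_volume is not None else rule.first_volume,
        second_volume if second_volume is not None else rule.second_volume,
    )


defaults_only = effective_volumes("media", "warning", None, None)
partial_override = effective_volumes("speech", "music", 60, None)
no_rule = effective_volumes("media", "unknown", None, 40)
```

Because the lookup is a single dictionary access, no per-mix configuration is needed for type pairs that already have a rule.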

In some embodiments, the type of the first audio signal is media play audio and the type of the second audio signal is warning audio. The default mixing rule is that the default volume parameter of the first audio signal is a first set value and the default volume parameter of the second audio signal is a second set value, where the second set value is greater than the first set value and the difference between the second set value and the first set value is greater than a first set threshold.

The media play audio may be the audio corresponding to a television video, and the warning audio may be a warning sound such as an earthquake early warning. The second set value corresponding to the default volume parameter of the second audio signal in the default mixing rule is greater than the first set value corresponding to the default volume parameter of the first audio signal, and the difference between the second set value and the first set value is greater than the first set threshold. In some embodiments, the first set threshold may be 10, the second set value may be 50, and the first set value may be 30.

In some embodiments, the type of first audio signal is speech audio and the type of second audio signal is music audio. In some embodiments, the type of the first audio signal is speech audio and the type of the second audio signal is background audio. In some embodiments, the type of first audio signal is speech audio and the type of second audio signal is music audio and background audio. At this time, the default mixing rule is that the default volume parameter of the first audio signal is a first set value, the default volume parameter of the second audio signal is a second set value, the first set value is greater than the second set value, and the difference between the first set value and the second set value is greater than the second set threshold.

For example, in an evening gala scene, the speech audio may be audio input by a host through a microphone, the music audio may be music used to set the atmosphere, and the background audio may be the applause of the live audience. In some embodiments, the audio to be mixed includes speech audio, music audio, and background audio. In this case, in order to highlight the audio input by the host through the microphone, the first set value corresponding to the default volume parameter of the first audio signal in the default mixing rule is greater than the second set value corresponding to the default volume parameter of the second audio signal, and the difference between the first set value and the second set value is greater than the second set threshold. In some embodiments, the second set threshold may be 10, the first set value may be 50, and the second set values corresponding to the music audio and the background audio may be the same, for example, both 30. Alternatively, in one embodiment, the second set values corresponding to the music audio and the background audio are different: the second set value corresponding to the music audio is 30, and the second set value corresponding to the background audio is 20.
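The threshold condition shared by both default mixing rules can be checked with a one-line comparison. The sketch below uses the example values from the text; the function name `rule_is_satisfied` is illustrative, and the threshold is assumed to be non-negative so that a sufficient difference also implies the correct ordering.

```python
def rule_is_satisfied(louder_volume: float, quieter_volume: float,
                      threshold: float) -> bool:
    """True if the emphasized signal exceeds the other by more than threshold."""
    return louder_volume - quieter_volume > threshold


# Warning audio (50) over media play audio (30), first set threshold 10.
warning_ok = rule_is_satisfied(50, 30, 10)

# Speech audio (50) over music audio (30) and background audio (20),
# second set threshold 10.
speech_ok = all(rule_is_satisfied(50, other, 10) for other in (30, 20))

# A gap of exactly 10 is not "greater than" the threshold.
boundary_ok = rule_is_satisfied(40, 30, 10)
```

Such a check could be run when the default mixing rules are configured, so that an invalid pair of set values is rejected before any mixing takes place.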

After acquiring the default mixing rule, the controller 250 may generate the mixed audio signal in the manner shown in fig. 13. That is, the controller 250 may set the gain of the first audio output channel corresponding to the decoded first audio signal to the first set value, set the gain of the second audio output channel corresponding to the decoded second audio signal to the second set value, and perform, by the mixer, mixing on the decoded first audio signal and the decoded second audio signal in combination with the first mixing parameter and the second mixing parameter to generate the mixed audio signal.

The volume of the first audio signal may be set by setting the first audio output channel gain, and the volume of the second audio signal may be set by setting the second audio output channel gain.

The display device 200 and the audio processing method provided by some embodiments of the present application may obtain an audio signal to be played in response to a play instruction of the audio signal, where the audio signal to be played comprises a first audio signal and a second audio signal. Hardware decoding is then performed on the first audio signal by the main decoder and on the second audio signal by the sub decoder, and a first mixing parameter of the first audio signal and a second mixing parameter of the second audio signal are generated. Then, based on the first mixing parameter and the second mixing parameter, mixing is performed on the decoded first audio signal and the decoded second audio signal by the mixer to generate a mixed audio signal, and the mixed audio signal is played. In this way, when the display device 200 mixes a plurality of audio signals, the mixing parameters of the audio signals can be adjusted as required, so that the mixing effect meets the requirements of different use scenes and the user experience when using the display device 200 is improved.

Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present application, and not for limiting the same. Although the application has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments may be modified or some or all of the technical features may be replaced with equivalents. Such modifications and substitutions do not depart from the spirit of the application.

The foregoing description, for purposes of explanation, has been presented in conjunction with specific embodiments. The illustrative discussions above are not intended to be exhaustive or to limit the embodiments to the precise forms disclosed above. Many modifications and variations are possible in light of the above teaching. The embodiments were chosen and described in order to best explain the principles and the practical application, to thereby enable others skilled in the art to best utilize the embodiments and various embodiments with various modifications as are suited to the particular use contemplated.

Claims

1. A display device, characterized by comprising:

a display configured to display a user interface;

an audio output interface configured to play audio signals;

a controller including a main decoder, at least one sub decoder and a mixer; the main decoder is configured to perform hardware decoding on a first audio signal; the secondary decoder is configured to perform hardware decoding on the second audio signal;

the controller is configured to:

responding to a playing instruction of an audio signal, and acquiring the audio signal to be played, wherein the audio signal to be played comprises a first audio signal and a second audio signal;

performing hardware decoding on the first audio signal by the primary decoder and performing hardware decoding on the second audio signal by the secondary decoder;

playing the mixed audio signal through the audio output interface.

2. The display device of claim 1, wherein the first mixing parameters comprise one or a combination of: volume parameters, gain parameters, balance parameters; the controller performs generating a first mixing parameter of the first audio signal, and is further configured to:

Extracting first information related to the volume parameter of the first audio signal in the playing instruction;

Generating a volume parameter in the first mixing parameters according to the first information;

Extracting second information related to gain parameters of the first audio signal in the playing instruction;

generating gain parameters in the first mixing parameters according to the second information;

extracting third information related to the balance parameter of the first audio signal in the playing instruction;

And generating balance parameters in the first mixing parameters according to the third information.

3. The display device of claim 1, wherein the second mixing parameters comprise one or a combination of: volume parameters, gain parameters, balance parameters; the controller executing the second mixing parameters that generate the second audio signal is further configured to:

extracting fourth information related to the volume parameter of the second audio signal in the playing instruction;

Generating a volume parameter in the second mixing parameters according to the fourth information;

Extracting fifth information related to gain parameters of the second audio signal in the play instruction;

generating gain parameters in the second mixing parameters according to the fifth information;

Extracting sixth information related to balance parameters of the second audio signal in the play instruction;

and generating balance parameters in the second mixing parameters according to the sixth information.

4. The display device of claim 1, wherein the controller is further configured to:

extracting a second audio signal from the audio signal to be played;

generating a decoding starting instruction in response to the operation event of the second audio signal extracted from the audio signal to be played;

And sending the second audio signal and the start decoding instruction to the auxiliary decoder, so that the auxiliary decoder executes hardware decoding on the second audio signal after receiving the start decoding instruction and the second audio signal.

5. The display device of claim 1, wherein the controller is further configured to:

Responding to an input event of a user based on voice application, and modifying a volume parameter in the first mixing parameters into a target volume, wherein the target volume is used for reducing the audio playing volume of the audio output interface or controlling the audio output interface to pause audio playing;

Receiving a voice instruction input by a user;

Performing speech recognition on the speech instruction to generate text information;

and performing keyword extraction on the text information.

6. The display device of claim 5, wherein the controller is further configured to:

acquiring a voice audio signal fed back by a voice application based on the keywords;

Performing decoding on the speech audio signal by the sub-decoder;

performing, by the mixer, mixing of the decoded speech audio signal and the first audio signal based on the first mixing parameter and the second mixing parameter to obtain a voice broadcast signal;

and controlling the audio output interface to play the voice broadcasting signal.

7. The display device of claim 1, wherein the controller performs mixing of the decoded first audio signal and the decoded second audio signal by the mixer based on the first mixing parameter and the second mixing parameter to generate a mixed audio signal, further configured to:

acquiring the type of the first audio signal and the type of the second audio signal;

According to the type of the first audio signal and the type of the second audio signal, obtaining a default audio mixing rule corresponding to the first audio signal and the second audio signal;

And executing audio mixing on the decoded first audio signal and the decoded second audio signal by the audio mixer by combining the default audio mixing rule, the first audio mixing parameter and the second audio mixing parameter so as to generate a mixed audio signal.

8. The display device of claim 7, wherein the first audio signal is of a media playback audio type, the second audio signal is of a warning audio type, the default mixing rule is that a default volume parameter of the first audio signal is a first set value, a default volume parameter of the second audio signal is a second set value, the second set value is greater than the first set value, and a difference between the second set value and the first set value is greater than a first set threshold;

the controller is further configured to:

Setting a first audio output channel gain corresponding to the decoded first audio signal to the first set value;

Setting a second audio output channel gain corresponding to the decoded second audio signal to the second set value;

and performing, by the mixer, mixing of the decoded first audio signal and the decoded second audio signal in combination with the first mixing parameter and the second mixing parameter to generate a mixed audio signal.

9. The display device according to claim 7, wherein the type of the first audio signal is voice audio, the type of the second audio signal is music audio and/or background audio, the default mixing rule is that a default volume parameter of the first audio signal is a first set value, a default volume parameter of the second audio signal is a second set value, the first set value is larger than the second set value, and a difference between the first set value and the second set value is larger than a second set threshold;

the controller is further configured to:

10. An audio processing method, wherein the audio processing method is applied to a display device, the display device comprising a display, a playing component, and a controller, the controller comprising a primary decoder, at least one secondary decoder, and a mixer, the method comprising:

playing the mixed audio signal through the audio output interface.