CN113096681B - Display device, multi-channel echo cancellation circuit and multi-channel echo cancellation method - Google Patents


Info

Publication number
CN113096681B
CN113096681B (Application CN202110378801.1A)
Authority
CN
China
Prior art keywords: audio, channel, processing, echo, sound
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110378801.1A
Other languages
Chinese (zh)
Other versions
CN113096681A
Inventor
杨香斌
Current Assignee
Hisense Visual Technology Co Ltd
Original Assignee
Hisense Visual Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Hisense Visual Technology Co Ltd filed Critical Hisense Visual Technology Co Ltd
Priority to CN202110378801.1A priority Critical patent/CN113096681B/en
Publication of CN113096681A publication Critical patent/CN113096681A/en
Application granted granted Critical
Publication of CN113096681B publication Critical patent/CN113096681B/en

Classifications

    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 - Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 - Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 - Noise filtering
    • G - PHYSICS
    • G03 - PHOTOGRAPHY; CINEMATOGRAPHY; ANALOGOUS TECHNIQUES USING WAVES OTHER THAN OPTICAL WAVES; ELECTROGRAPHY; HOLOGRAPHY
    • G03B - APPARATUS OR ARRANGEMENTS FOR TAKING PHOTOGRAPHS OR FOR PROJECTING OR VIEWING THEM; APPARATUS OR ARRANGEMENTS EMPLOYING ANALOGOUS TECHNIQUES USING WAVES OTHER THAN OPTICAL WAVES; ACCESSORIES THEREFOR
    • G03B31/00 - Associated working of cameras or projectors with sound-recording or sound-reproducing means
    • G - PHYSICS
    • G09 - EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09F - DISPLAYING; ADVERTISING; SIGNS; LABELS OR NAME-PLATES; SEALS
    • G09F9/00 - Indicating arrangements for variable information in which the information is built-up on a support by selection or combination of individual elements
    • G09F9/30 - Indicating arrangements in which the desired character or characters are formed by combining individual elements
    • G09F9/33 - Indicating arrangements in which the desired character or characters are formed by combining individual elements being semiconductor devices, e.g. diodes
    • G - PHYSICS
    • G09 - EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09F - DISPLAYING; ADVERTISING; SIGNS; LABELS OR NAME-PLATES; SEALS
    • G09F9/00 - Indicating arrangements for variable information in which the information is built-up on a support by selection or combination of individual elements
    • G09F9/30 - Indicating arrangements in which the desired character or characters are formed by combining individual elements
    • G09F9/35 - Indicating arrangements in which the desired character or characters are formed by combining individual elements being liquid crystals
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 - Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 - Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 - Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/439 - Processing of audio elementary streams
    • H04N21/4398 - Processing of audio elementary streams involving reformatting operations of audio signals
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00 - Details of television systems
    • H04N5/64 - Constructional details of receivers, e.g. cabinets or dust covers
    • H04N5/642 - Disposition of sound reproducers
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 - Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 - Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 - Noise filtering
    • G10L2021/02082 - Noise filtering the noise being echo, reverberation of the speech

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Crystallography & Structural Chemistry (AREA)
  • Chemical & Material Sciences (AREA)
  • Quality & Reliability (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Circuit For Audible Band Transducer (AREA)
  • Telephone Function (AREA)

Abstract

The display device of this embodiment comprises a loudspeaker that includes a main channel and a sub-channel. The main channel plays sound according to a first audio obtained after frequency division, and the sub-channel plays sound according to a second audio obtained after frequency division, where the first audio is obtained by performing first processing on original audio obtained from a play service processor, the second audio is obtained by performing second processing on the first audio, and the first processing and the second processing are different audio processing modes. The controller performs echo cancellation on the first echo and the second echo according to a loopback reference signal corresponding to the second audio. Because the second audio carries the original audio together with the effects of both the first processing and the second processing, the multi-channel echoes can be cancelled completely, which improves the echo cancellation effect, ultimately improves the voice interaction effect, and improves the user experience.

Description

Display device, multi-channel echo cancellation circuit and multi-channel echo cancellation method
Technical Field
The present application relates to the field of display device technologies, and in particular, to a display device, a multi-channel echo cancellation circuit, and a multi-channel echo cancellation method.
Background
With the rapid development of speech recognition technology, voice interaction scenarios are increasingly common, and human-machine voice conversations are no longer limited to the near field. Microphone array technology can increase the human-computer interaction distance and enable far-field speech recognition. In a far-field speech recognition scenario, while collecting the target human voice, the microphone also collects echoes caused by the smart device playing audio, so echo cancellation must be performed on the collected audio signals. Echo cancellation is the process of removing, from the audio signals collected by the microphone, the echo formed in the environment by the audio the smart device plays, so that only the target human voice remains.
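The cancellation step described above is conventionally implemented with an adaptive filter. The sketch below is a minimal, hypothetical illustration, not the patent's implementation: a normalized least-mean-squares (NLMS) filter estimates the echo path from the playback reference and subtracts the estimated echo from the microphone signal. The signals, delay, and gain are invented for the demo.

```python
# Minimal single-reference acoustic echo cancellation (AEC) sketch using
# an NLMS adaptive filter. All signals and the echo path are hypothetical.

def nlms_cancel(mic, ref, taps=4, mu=0.5, eps=1e-8):
    """Subtract an adaptively estimated echo of `ref` from `mic`."""
    w = [0.0] * taps                      # FIR estimate of the echo path
    out = []
    for n in range(len(mic)):
        # Most recent `taps` reference samples, zero-padded at the start.
        x = [ref[n - k] if n - k >= 0 else 0.0 for k in range(taps)]
        y = sum(wi * xi for wi, xi in zip(w, x))    # estimated echo
        e = mic[n] - y                              # echo-cancelled sample
        norm = sum(xi * xi for xi in x) + eps
        w = [wi + mu * e * xi / norm for wi, xi in zip(w, x)]
        out.append(e)
    return out

# Hypothetical demo: the echo is a one-sample-delayed, attenuated copy of
# the playback signal, and there is no near-end talker.
ref = [1.0, -0.5, 0.8, 0.3, -0.7, 0.9, -0.2, 0.4] * 50
mic = [0.0] + [0.6 * r for r in ref[:-1]]
out = nlms_cancel(mic, ref)
print(abs(out[-1]))   # the residual echo shrinks as the filter converges
```

With a real near-end talker, `mic` would also contain the target voice, and the converged filter would remove only the echo component.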
Currently, the mainstream echo cancellation scheme in the industry is a direct-connection scheme: a loopback interface of the main chip is used to capture a loopback reference signal, and echo cancellation is then performed on the audio signal collected by the microphone according to that reference signal. Due to hardware limitations, the direct-connection scheme usually supports capturing only one set of I2S (Inter-IC Sound, an audio bus between integrated circuits) reference signals and cannot capture additional reference paths.
However, most current smart devices are equipped with multiple channels to enhance sound effects. Taking a smart TV with a main channel and a bass channel as an example, the direct-connection echo cancellation scheme can capture only the main-channel reference signal and must give up cancelling the bass-channel echo, which degrades the echo cancellation effect, ultimately degrades the voice interaction effect, and leads to a poor user experience.
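This limitation can be made concrete with a toy numeric example. The gains and the one-pole low-pass standing in for the bass channel's processing are hypothetical, not taken from the patent: even a perfect estimate of the main-channel echo path leaves the bass-channel echo untouched when only the main-channel reference is captured.

```python
# Hypothetical illustration: two echo paths, only one captured reference.

def lowpass(x, a=0.5):
    """Toy one-pole low-pass, standing in for the bass channel's processing."""
    y, prev = [], 0.0
    for s in x:
        prev = a * s + (1 - a) * prev
        y.append(prev)
    return y

main_audio = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]   # played by the main channel
bass_audio = lowpass(main_audio)                  # played by the bass channel

# The microphone picks up both echoes (hypothetical path gains 0.4 and 0.3).
mic = [0.4 * m + 0.3 * b for m, b in zip(main_audio, bass_audio)]

# Even with perfect cancellation of the main-channel echo alone:
residual = [s - 0.4 * m for s, m in zip(mic, main_audio)]
print(residual)   # matches 0.3 * bass_audio (up to float rounding):
                  # the bass-channel echo survives entirely
```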
Disclosure of Invention
The present application provides a display device, a multi-channel echo cancellation circuit, and a multi-channel echo cancellation method, to solve the problem that, in a display device with multiple channels, the existing direct-connection echo cancellation scheme can capture only the main-channel reference signal and gives up cancelling the echoes of the other channels, which degrades the echo cancellation effect, ultimately degrades the voice interaction effect, and leads to a poor user experience.
In a first aspect, the present embodiment provides a display device, comprising:
a play service processor;
a loudspeaker comprising a main channel and a sub-channel, wherein the main channel plays sound according to a first audio obtained after frequency division, the sub-channel plays sound according to a second audio obtained after frequency division, the first audio is obtained by performing first processing on original audio obtained from the play service processor, the second audio is obtained by performing second processing on the first audio, and the first processing and the second processing are different audio processing modes;
a sound collector configured to collect target audio, wherein the target audio comprises a third audio input by a user, a first echo from the main channel, and a second echo from the sub-channel;
a controller configured to:
acquiring the second audio from a power amplifier of the sub-channel, and acquiring the target audio from the sound collector;
and performing echo cancellation on the first echo and the second echo according to a loopback reference signal corresponding to the second audio.
In a second aspect, the present embodiment provides a multi-channel echo cancellation circuit, including:
a play service processor;
a power amplifier of the main channel, configured to obtain original audio from the play service processor, perform first processing on the original audio to obtain a first audio, and drive the main-channel loudspeaker to play sound according to the first audio after frequency division;
a power amplifier of the sub-channel, configured to obtain the first audio from the power amplifier of the main channel, perform second processing on the first audio to obtain a second audio, and drive the sub-channel loudspeaker to play sound according to the second audio after frequency division, wherein the first processing and the second processing are different audio processing modes;
an echo cancellation processor, configured to obtain the second audio from the sub-channel power amplifier and to obtain a target audio from a sound collector, wherein the target audio includes a third audio input by a user, a first echo from the main-channel loudspeaker, and a second echo from the sub-channel loudspeaker; the echo cancellation processor performs echo cancellation on the first echo and the second echo according to a loopback reference signal corresponding to the second audio.
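The capture point in the chain above can be sketched as follows. The concrete processing functions (a gain stage and a bass-emphasis gain) are hypothetical stand-ins chosen only to show that the loopback reference, taken after the sub-channel power amplifier, carries the original audio together with the effects of both processing stages.

```python
# Signal-chain sketch of the multi-channel echo cancellation circuit.
# The two processing functions are hypothetical stand-ins.

def first_processing(audio):          # e.g. gain/EQ in the main-channel amp
    return [0.8 * s for s in audio]

def second_processing(audio):         # e.g. bass emphasis in the sub-channel amp
    return [1.2 * s for s in audio]

original = [0.1, 0.5, -0.3]                       # from the play service processor
first_audio = first_processing(original)          # drives the main-channel speaker
second_audio = second_processing(first_audio)     # drives the sub-channel speaker

# The loopback reference is captured AFTER both stages, so it reflects the
# original audio plus the first and second processing.
reference = second_audio
print(reference)
```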
In a third aspect, this embodiment provides a multi-channel echo cancellation method applied to a display device, wherein the display device includes a main channel and a sub-channel, the main channel plays sound according to a first audio obtained by frequency division, the sub-channel plays sound according to a second audio obtained by frequency division, the first audio is obtained by performing first processing on original audio obtained from a play service processor, the second audio is obtained by performing second processing on the first audio, and the first processing and the second processing are different audio processing modes; the method includes:
acquiring the second audio from a power amplifier of the sub-channel, and acquiring a collected target audio from a sound collector, wherein the target audio comprises a third audio input by a user, a first echo from the main channel, and a second echo from the sub-channel;
and performing echo cancellation on the first echo and the second echo according to a loopback reference signal corresponding to the second audio.
This embodiment provides a display device including a speaker, where the speaker includes a main channel and a sub-channel; the main channel plays sound according to a first audio obtained after frequency division, and the sub-channel plays sound according to a second audio obtained after frequency division, where the first audio is obtained by performing first processing on original audio obtained from the play service processor, the second audio is obtained by performing second processing on the first audio, and the first processing and the second processing are different audio processing modes. The controller obtains the second audio from the power amplifier of the sub-channel and obtains a target audio from the sound collector, where the target audio includes a third audio input by the user, a first echo from the main channel, and a second echo from the sub-channel. Finally, the controller performs echo cancellation on the first echo and the second echo according to a loopback reference signal corresponding to the second audio. Because the second audio carries the original audio together with the effects of the first processing and the second processing, the multi-channel echoes can be cancelled completely, which improves the echo cancellation effect, ultimately improves the voice interaction effect, and improves the user experience.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings needed in the embodiments are briefly described below. It is obvious that the drawings in the following description show only some embodiments of the present invention, and that those skilled in the art can derive other drawings from them without creative effort.
FIG. 1 illustrates a usage scenario of a display device according to some embodiments;
FIG. 2 illustrates a block diagram of a hardware configuration of the control apparatus 100 according to some embodiments;
FIG. 3 illustrates a block diagram of a hardware configuration of the display apparatus 200 according to some embodiments;
FIG. 4 illustrates a software configuration diagram in the display device 200 according to some embodiments;
FIG. 5 illustrates a schematic diagram of a voice interaction principle, according to some embodiments;
FIG. 6 illustrates a functional module architecture diagram for echo cancellation in a display device according to some embodiments;
FIG. 7 illustrates a diagram of an echo cancellation scenario according to some embodiments;
FIG. 8 illustrates a modeling diagram of a filter coefficient solving algorithm in an echo cancellation algorithm according to some embodiments;
FIG. 9 illustrates a functional module architecture diagram for echo cancellation in yet another display device in accordance with some embodiments;
FIG. 10 illustrates a signaling diagram of a multi-channel echo cancellation method according to some embodiments.
Detailed Description
To make the purpose and embodiments of the present application clearer, exemplary embodiments of the present application are described clearly and completely below with reference to the accompanying drawings. Obviously, the described exemplary embodiments are only a part of the embodiments of the present application, not all of them.
It should be noted that the brief descriptions of the terms in the present application are only for convenience of understanding of the embodiments described below, and are not intended to limit the embodiments of the present application. These terms should be understood in their ordinary and customary meaning unless otherwise indicated.
The terms "first," "second," "third," and the like in the description and claims of this application and in the above-described drawings are used for distinguishing between similar or analogous objects or entities and not necessarily for describing a particular sequential or chronological order, unless otherwise indicated. It is to be understood that the terms so used are interchangeable under appropriate circumstances.
The terms "comprises" and "comprising," as well as any variations thereof, are intended to cover a non-exclusive inclusion, such that a product or device that comprises a list of elements is not necessarily limited to all of the elements explicitly listed, but may include other elements not expressly listed or inherent to such product or device.
The term "module" refers to any known or later developed hardware, software, firmware, artificial intelligence, fuzzy logic, or combination of hardware or/and software code that is capable of performing the functionality associated with that element.
Fig. 1 is a schematic diagram of a usage scenario of a display device according to an embodiment. As shown in fig. 1, the display apparatus 200 is also in data communication with a server 400, and a user may operate the display apparatus 200 through the smart device 300 or the control device 100.
In some embodiments, the control apparatus 100 may be a remote controller. Communication between the remote controller and the display device includes at least one of infrared protocol communication, Bluetooth protocol communication, or other short-range communication methods, and the display apparatus 200 is controlled wirelessly or by wire. The user may control the display apparatus 200 by inputting user instructions through at least one of keys on the remote controller, voice input, control panel input, and the like.
In some embodiments, the smart device 300 may include any of a mobile terminal 300A, a tablet, a computer, a laptop, an AR/VR device, and the like.
In some embodiments, the smart device 300 may also be used to control the display device 200. For example, the display device 200 is controlled using an application program running on the smart device.
In some embodiments, the smart device 300 and the display device may also be used for communication of data.
In some embodiments, the display device 200 may also be controlled in a manner other than the control apparatus 100 and the smart device 300, for example, the voice instruction control of the user may be directly received by a module configured inside the display device 200 to obtain a voice instruction, or may be received by a voice control apparatus provided outside the display device 200.
In some embodiments, the display apparatus 200 is also in data communication with a server 400. The display apparatus 200 may be communicatively connected through a local area network (LAN), a wireless local area network (WLAN), or other networks. The server 400 may provide various content and interactions to the display apparatus 200. The server 400 may be one cluster or a plurality of clusters, and may include one or more types of servers.
In some embodiments, software steps executed by one step execution agent may migrate to another step execution agent in data communication therewith for execution as needed. Illustratively, software steps performed by the server may be migrated on demand to be performed on the display device in data communication therewith, and vice versa.
Fig. 2 exemplarily shows a block diagram of a configuration of the control apparatus 100 according to an exemplary embodiment. As shown in fig. 2, the control device 100 includes a controller 110, a communication interface 130, a user input/output interface 140, a memory, and a power supply. The control apparatus 100 may receive an input operation instruction from a user and convert the operation instruction into an instruction recognizable and responsive by the display device 200, serving as an interaction intermediary between the user and the display device 200.
In some embodiments, the communication interface 130 is used for external communication, and includes at least one of a WIFI chip, a bluetooth module, NFC, or an alternative module.
In some embodiments, the user input/output interface 140 includes at least one of a microphone, a touchpad, a sensor, a key, or an alternative module.
Fig. 3 shows a hardware configuration block diagram of the display apparatus 200 according to an exemplary embodiment.
In some embodiments, the display apparatus 200 includes at least one of a tuner demodulator 210, a communicator 220, a detector 230, an external device interface 240, a controller 250, a display 260, an audio output interface 270, a memory, a power supply, a user interface.
In some embodiments, the controller comprises a central processor, a video processor, an audio processor, a graphics processor, a RAM, a ROM, and first through nth interfaces for input/output.
In some embodiments, the display 260 includes a display screen component for presenting pictures and a driving component that drives image display; it receives image signals output from the controller and displays video content, image content, menu manipulation interfaces, user manipulation UI interfaces, and the like.
In some embodiments, the display 260 may be at least one of a liquid crystal display, an OLED display, and a projection display, and may also be a projection device and a projection screen.
In some embodiments, the tuner-demodulator 210 receives broadcast television signals by wired or wireless means and demodulates audio/video signals, as well as EPG data signals, from among a plurality of wireless or wired broadcast television signals.
In some embodiments, communicator 220 is a component for communicating with external devices or servers according to various communication protocol types. For example: the communicator may include at least one of a Wifi module, a bluetooth module, a wired ethernet module, and other network communication protocol chips or near field communication protocol chips, and an infrared receiver. The display apparatus 200 may establish transmission and reception of control signals and data signals with the control device 100 or the server 400 through the communicator 220.
In some embodiments, the detector 230 is used to collect signals of the external environment or interaction with the outside. For example, detector 230 includes a light receiver, a sensor for collecting ambient light intensity; alternatively, the detector 230 includes an image collector, such as a camera, which may be used to collect external environment scenes, attributes of the user, or user interaction gestures, or the detector 230 includes a sound collector, such as a microphone, which is used to receive external sounds.
In some embodiments, the external device interface 240 may include, but is not limited to, the following: high Definition Multimedia Interface (HDMI), analog or data high definition component input interface (component), composite video input interface (CVBS), USB input interface (USB), RGB port, and the like. The interface may be a composite input/output interface formed by the plurality of interfaces.
In some embodiments, the controller 250 and the tuner-demodulator 210 may be located in separate devices; that is, the tuner-demodulator 210 may also be located in a device external to the main device in which the controller 250 resides, such as an external set-top box.
In some embodiments, the controller 250 controls the operation of the display device and responds to user operations through various software control programs stored in memory. The controller 250 controls the overall operation of the display apparatus 200. For example: in response to receiving a user command for selecting a UI object to be displayed on the display 260, the controller 250 may perform an operation related to the object selected by the user command.
In some embodiments, the object may be any selectable object, such as a hyperlink, an icon, or another actionable control. The operations related to the selected object include displaying the page, document, image, or the like linked by a hyperlink, or running the program corresponding to an icon.
In some embodiments, the controller comprises at least one of a Central Processing Unit (CPU), a video processor, an audio processor, a Graphics Processing Unit (GPU), a Random Access Memory (RAM), a Read-Only Memory (ROM), first through nth interfaces for input/output, a communication bus, and the like.
The CPU processor executes operating system and application program instructions stored in the memory, and executes various applications, data, and content according to the various interactive instructions received from external input, so as to finally display and play various audio and video content. The CPU processor may include a plurality of processors, for example one main processor and one or more sub-processors.
In some embodiments, a graphics processor for generating various graphics objects, such as: at least one of an icon, an operation menu, and a user input instruction display figure. The graphic processor comprises an arithmetic unit, which performs operation by receiving various interactive instructions input by a user and displays various objects according to display attributes; the system also comprises a renderer for rendering various objects obtained based on the arithmetic unit, wherein the rendered objects are used for being displayed on a display.
In some embodiments, the video processor is configured to receive an external video signal and, according to the standard codec protocol of the input signal, perform at least one kind of video processing such as decompression, decoding, scaling, noise reduction, frame rate conversion, resolution conversion, or image synthesis, so as to obtain a signal that can be directly displayed or played on the display apparatus 200.
In some embodiments, the video processor includes at least one of a demultiplexing module, a video decoding module, an image synthesis module, a frame rate conversion module, a display formatting module, and the like. The demultiplexing module demultiplexes the input audio/video data stream. The video decoding module processes the demultiplexed video signal, including decoding, scaling, and the like. The image synthesis module superimposes and mixes the GUI signal, input by the user or generated by the graphics generator, with the scaled video image to generate an image signal for display. The frame rate conversion module converts the frame rate of the input video. The display formatting module converts the frame-rate-converted video output signal into a signal conforming to the display format, such as an output RGB data signal.
In some embodiments, the audio processor is configured to receive an external audio signal, decompress and decode the received audio signal according to a standard codec protocol of the input signal, and perform at least one of noise reduction, digital-to-analog conversion, and amplification processing to obtain a sound signal that can be played in the speaker.
In some embodiments, the user may input a user command on a Graphical User Interface (GUI) displayed on the display 260, and the user input interface receives the user input command through the Graphical User Interface (GUI). Alternatively, the user may input a user command by inputting a specific sound or gesture, and the user input interface receives the user input command by recognizing the sound or gesture through the sensor.
In some embodiments, a "user interface" is a media interface for interaction and information exchange between an application or operating system and a user that enables conversion between an internal form of information and a form acceptable to the user. A common presentation form of a User Interface is a Graphical User Interface (GUI), which refers to a User Interface related to computer operations and displayed in a graphical manner. It may be an interface element such as an icon, a window, a control, etc. displayed in the display screen of the electronic device, where the control may include at least one of an icon, a button, a menu, a tab, a text box, a dialog box, a status bar, a navigation bar, a Widget, etc. visual interface elements.
In some embodiments, user interface 280 is an interface that may be used to receive control inputs (e.g., physical buttons on the body of the display device, or the like).
In some embodiments, the system of the display device may include a kernel, a command parser (shell), a file system, and applications. The kernel, shell, and file system together form the basic operating system structure that allows users to manage files, run programs, and use the system. After power-on, the kernel starts, activates kernel space, abstracts hardware, initializes hardware parameters, and so on, and runs and maintains virtual memory, the scheduler, signals, and inter-process communication (IPC). After the kernel starts, the shell and user applications are loaded. An application is compiled into machine code after being launched, forming a process.
Referring to fig. 4, in some embodiments, the system is divided into four layers, which are, from top to bottom, an Application (Applications) layer (referred to as an "Application layer"), an Application Framework (Application Framework) layer (referred to as a "Framework layer"), an Android runtime (Android runtime) layer and a system library layer (referred to as a "system runtime library layer"), and a kernel layer.
In some embodiments, at least one application program runs in the application layer. These applications may be window programs, system setting programs, or clock programs bundled with the operating system, or applications developed by third-party developers. In particular implementations, the application packages in the application layer are not limited to the above examples.
The framework layer provides an application programming interface (API) and a programming framework for the applications of the application layer, and includes a number of predefined functions. The framework layer acts as a processing center that determines the actions to be taken by applications in the application layer. Through the API, an application can access system resources and obtain system services during execution.
As shown in fig. 4, in the embodiment of the present application, the application framework layer includes a manager (Managers), a Content Provider (Content Provider), and the like, where the manager includes at least one of the following modules: an Activity Manager (Activity Manager) is used for interacting with all activities running in the system; the Location Manager (Location Manager) is used for providing the system service or application with the access of the system Location service; a Package Manager (Package Manager) for retrieving various information related to an application Package currently installed on the device; a Notification Manager (Notification Manager) for controlling display and clearing of Notification messages; a Window Manager (Window Manager) is used to manage the icons, windows, toolbars, wallpapers, and desktop components on a user interface.
In some embodiments, the activity manager manages the lifecycle of the various applications as well as general navigation fallback functions, such as controlling the exit, opening, and back navigation of applications. The window manager manages all window programs, for example obtaining the display screen size, determining whether there is a status bar, locking the screen, taking screenshots, and controlling changes to the display window (for example, shrinking the window, or applying shake or distortion effects).
In some embodiments, the system runtime layer provides support for the layer above it, i.e., the framework layer. When the framework layer is in use, the Android operating system runs the C/C++ libraries included in the system runtime layer to implement the functions required by the framework layer.
In some embodiments, the kernel layer is the layer between hardware and software. As shown in fig. 4, the kernel layer includes at least one of the following drivers: audio driver, display driver, Bluetooth driver, camera driver, Wi-Fi driver, USB driver, HDMI driver, sensor drivers (such as fingerprint, temperature, and pressure sensors), and power driver.
For clarity of explanation of the embodiments of the present application, a speech recognition network architecture provided by the embodiments of the present application is described below with reference to fig. 5.
Referring to fig. 5, fig. 5 is a schematic diagram of a voice recognition network architecture according to an embodiment of the present application. In fig. 5, the smart device receives input information and outputs a processing result of that information. The voice recognition service device is an electronic device providing a voice recognition service, the semantic service device is an electronic device providing a semantic service, and the business service device is an electronic device providing a business service. An electronic device here may include a server, a computer, and the like. The voice recognition service, the semantic service (also referred to as a semantic engine), and the business service are web services deployable on such electronic devices: the voice recognition service recognizes audio as text, the semantic service performs semantic parsing on the text, and the business service provides specific services such as the weather query service of Moji Weather or the music query service of QQ Music. In one embodiment, in the architecture shown in fig. 5, multiple entity service devices may be deployed with different business services, and one or more function services may also be aggregated in one or more entity service devices.
In some embodiments, based on the architecture shown in fig. 5, the following describes an example of how the smart device processes input information, where the input is a query sentence entered by voice. The processing may include the following three stages:
[ Speech recognition ]
After receiving a query sentence input by voice, the smart device may upload the audio of the query sentence to the voice recognition service device, so that the voice recognition service device recognizes the audio as text through the voice recognition service and then returns the text to the smart device. In one embodiment, before uploading the audio of the query sentence to the voice recognition service device, the smart device may denoise the audio, where denoising may include removing echo and environmental noise.
[ semantic understanding ]
The smart device uploads the text of the query sentence recognized by the voice recognition service to the semantic service device, and the semantic service device performs semantic parsing on the text through the semantic service to obtain the business domain, intent, and so on of the text.
[ semantic response ]
The semantic service device issues a query instruction to the corresponding business service device according to the semantic parsing result of the query sentence's text, so as to obtain the query result given by the business service. The smart device can obtain this query result from the semantic service device and output it. In one embodiment, the semantic service device may also send the semantic parsing result of the query sentence to the smart device, so that the smart device outputs the feedback sentence contained in that result.
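The three stages above can be sketched as a minimal pipeline of service stubs. All function names, the canned transcript, and the canned answers below are illustrative assumptions, not the actual services of this architecture:

```python
def speech_recognition_service(audio):
    """Stub for stage 1: recognizes audio as text (canned transcript)."""
    return "what is the weather in qingdao"

def semantic_service(text):
    """Stub for stage 2: parses text into a business domain and intent."""
    if "weather" in text:
        return {"domain": "weather", "intent": "query", "city": text.split()[-1]}
    return {"domain": "unknown", "intent": "unknown"}

def business_service(parsed):
    """Stub for stage 3: answers the query for the resolved domain."""
    if parsed["domain"] == "weather":
        return f"Sunny in {parsed['city']}"
    return "Sorry, I cannot help with that."

def handle_voice_query(audio):
    text = speech_recognition_service(audio)   # stage 1: speech recognition
    parsed = semantic_service(text)            # stage 2: semantic understanding
    return business_service(parsed)            # stage 3: semantic response
```

As the text notes, any or all of these stages could equally run on the smart terminal itself rather than on separate service devices.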
It should be noted that the architecture shown in fig. 5 is only an example, and does not limit the scope of the present application. In the embodiment of the present application, other architectures may also be adopted to implement similar functions, for example: all or part of the three processes can be completed by the intelligent terminal, which is not described herein.
In some embodiments, the intelligent device shown in fig. 5 may be a display device, such as a smart television, the functions of the speech recognition service device may be implemented by cooperation of a sound collector and a controller provided on the display device, and the functions of the semantic service device and the business service device may be implemented by the controller of the display device or by a server of the display device.
In a far-field speech recognition scenario of human-machine voice dialogue, the microphone collects not only the target human voice but also the echoes caused by the audio that the smart device itself is playing, so echo cancellation must be performed on the collected audio signals. Echo cancellation is the process of removing, from the audio signals collected by the microphone, the echo that the device's played audio forms in the environment, so that only the target human voice remains.
Currently, echo cancellation schemes in the industry mainly adopt a direct-connection scheme: an extraction interface of the main chip is used directly to acquire the extraction reference signal, and echo cancellation is then performed on the audio signal collected by the microphone according to that reference signal. Due to hardware limitations, the directly connected echo cancellation scheme usually supports only one set of I2S reference-signal extraction and does not support extracting additional reference-signal paths.
However, most current smart devices are provided with multiple sound channels to enhance sound effects. Taking a smart television with a main channel and a bass channel as an example, the directly connected echo cancellation scheme can only extract the reference signal of the main channel and must abandon cancellation of the bass channel's signal, which degrades the echo cancellation effect, ultimately harms voice interaction, and results in a poor user experience.
Illustratively, the display device uses stereo speakers covering mid and high frequencies in the 250 Hz to 8 kHz band, while bass in the bands below 250 Hz is handled by a dedicated woofer. The display device therefore includes a main channel and a bass channel, driven respectively by the main-channel power amplifier and the bass-channel power amplifier. On one hand, after the main-channel power amplifier obtains the original audio from the playing service processor, it processes and frequency-divides the original audio to obtain the mid-high audio, and drives the main channel to produce sound according to that mid-high audio. On the other hand, after the bass-channel power amplifier obtains the original audio from the playing service processor, it processes and frequency-divides the original audio to obtain the bass audio, and drives the bass channel to produce sound according to that bass audio.
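The frequency division described above can be sketched with a simple first-order crossover. The ~250 Hz cutoff matches the description, but the one-pole filter and sample rate below are illustrative stand-ins for the device's actual frequency divider:

```python
import math

def one_pole_coeff(cutoff_hz, sample_rate_hz):
    """Smoothing coefficient for a first-order (one-pole) filter."""
    return math.exp(-2.0 * math.pi * cutoff_hz / sample_rate_hz)

def low_pass(samples, cutoff_hz, sample_rate_hz):
    """First-order low-pass: keeps the bass band for the woofer."""
    a = one_pole_coeff(cutoff_hz, sample_rate_hz)
    out, y = [], 0.0
    for x in samples:
        y = a * y + (1.0 - a) * x
        out.append(y)
    return out

def crossover(samples, cutoff_hz=250.0, sample_rate_hz=48000.0):
    """Split full-band audio at ~250 Hz into (bass, mid_high) driver feeds."""
    bass = low_pass(samples, cutoff_hz, sample_rate_hz)
    mid_high = [x - b for x, b in zip(samples, bass)]  # complementary branch
    return bass, mid_high
```

By construction the two branches sum back to the full-band input, which is why a reference signal tapped *before* such a divider retains the whole band.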
Since the interface of the AEC (Acoustic Echo Cancellation) processor is limited, the bass-channel power amplifier cannot output the bass audio signal to the AEC processor. In the prior art, the extraction reference signal used for echo cancellation therefore has to discard the bass audio signal: the final reference contains only the audio signal processed by the main-channel power amplifier, and correspondingly only the main-channel power amplifier's processing of the original audio. The echo formed by the bass channel in the environment cannot be cancelled.
In order to solve the above problem, fig. 6 shows a schematic diagram of a functional module architecture of a display device for echo cancellation in an embodiment of the present application.
The present embodiment is explained taking a display device having multiple channels (a main channel and a secondary channel) as an example. The echo cancellation functional modules comprise a main-channel power amplifier 10, a secondary-channel power amplifier 11, an AEC processor 12, and a playing service processor 13. The input of the main-channel power amplifier is connected to the output of the playing service processor, from which it obtains the original audio. After obtaining the original audio, the main-channel power amplifier processes it to obtain the first audio, and then drives the main-channel loudspeaker to produce sound according to the first audio.
In some embodiments, a sound-effect DSP processor is further connected between the main-channel power amplifier 10 and the playing service processor 13. After obtaining the original audio from the playing service processor 13, the sound-effect DSP processor may apply audio transformations to it, for example switching the sound mode. The output of the sound-effect DSP processor is connected to the input of the main-channel power amplifier.
In the embodiment of the present application, the output of the main-channel power amplifier 10 is directly connected to the secondary-channel power amplifier 11. The main-channel power amplifier 10 outputs the first audio to the secondary-channel power amplifier 11 before frequency-dividing the first audio. After acquiring the first audio, the secondary-channel power amplifier 11 processes it to obtain the second audio. The output of the secondary-channel power amplifier is connected to the AEC processor 12, to which it outputs the second audio before frequency-dividing the second audio.
The other input of the AEC processor 12 is connected to the sound collector. The sound collector collects the third audio input by the user, i.e., the voice command, as well as the first echo generated by the main channel playing the first audio and the second echo generated by the secondary channel playing the second audio. Finally, the AEC processor 12 takes the second audio as the extraction reference signal and, based on that reference signal, performs echo cancellation on the first echo and the second echo in the target audio.
Since the first audio is obtained by applying the first processing to the original audio, and the second audio is obtained by applying the second processing to the first audio, the extraction reference signal corresponding to the second audio contains not only the original full-band audio but also the relevant information of both the first processing and the second processing. According to this reference signal, echo cancellation can be performed on both the first echo and the second echo, thereby realizing echo cancellation for multi-channel echoes.
Illustratively, the main channel is a mid-high channel and the secondary channel is a bass channel. The mid-high-channel power amplifier obtains the original audio from the playing service processor 13, which may simply be the full-band audio copied from the sound-effect DSP processor. The mid-high-channel power amplifier then applies the first processing to the original audio to obtain the first audio, and outputs the first audio directly to the bass-channel power amplifier before frequency-dividing it. The bass-channel power amplifier applies the second processing to the first audio to obtain the second audio, and outputs the second audio to the AEC processor before frequency-dividing it.
Since neither the first audio nor the second audio has undergone frequency division, both are full-band audio. After the mid-high-channel power amplifier frequency-divides the first audio, it obtains the mid-high audio and drives the mid-high channel to produce sound accordingly. The bass-channel power amplifier frequency-divides the second audio to obtain the bass audio, and then drives the bass channel to produce sound accordingly. The target audio collected by the sound collector comprises the third audio input by the user, the first echo from the mid-high channel, and the second echo from the bass channel.
Finally, echo cancellation is performed on the first echo and the second echo according to the extraction reference signal corresponding to the second audio. Because this reference signal comprises full-band audio together with the relevant information of the first processing and the second processing, both the first echo from the mid-high channel and the second echo from the bass channel can be completely eliminated.
In the embodiment of the present application, the specific calculation process of echo cancellation is as follows:
as shown in the schematic diagram of the echo cancellation scenario in fig. 7, the sound collected by the sound collector can be represented by the following formula:
d(n)=S(n)+X(n)
wherein d(n) represents the audio signal collected by the sound collector; S(n) represents the third audio, i.e., the voice command input by the user; and X(n) represents the audio signal formed in the environment by the sound played through the display device's multiple channels, that is, the first echo and the second echo in the above embodiment. X(n) can be obtained by convolving the original audio x(n) sent from the host to the loudspeaker with the acoustic transfer function h(n) of the environment in which the display device is located, h(n) being the filter coefficients required by the AEC algorithm.
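The mixing model d(n) = S(n) + X(n), with X(n) obtained as the convolution x(n)*h(n), can be sketched directly. The impulse-response values below are arbitrary toy numbers, not a real room response:

```python
def convolve(x, h):
    """Discrete convolution x(n) * h(n): the echo that reaches the microphone."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y

def mic_signal(speech, loudspeaker, h):
    """d(n) = S(n) + X(n), where X(n) = x(n) * h(n)."""
    echo = convolve(loudspeaker, h)
    n = max(len(speech), len(echo))
    s = speech + [0.0] * (n - len(speech))   # zero-pad the near-end speech
    e = echo + [0.0] * (n - len(echo))
    return [si + ei for si, ei in zip(s, e)]
```

For example, a unit impulse played through a two-tap toy response h = [0.5, 0.25] contributes the echo [0.5, 0.25, 0, 0] on top of the user's speech.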
In some embodiments, the filter coefficients can be solved in the manner of a Wiener filter using different convergence methods. The related algorithm is modeled as shown in fig. 8, and the output error signal is expressed by the following equation:
e(n)=x(n)*h(n)-d(n)
wherein e(n) represents the error signal, x(n) represents the audio signal sent from the host to the loudspeaker, h(n) represents the acoustic transfer function of the environment in which the display device is located, x(n)*h(n) represents the estimated echo signal arriving at the microphone (sound collector), and d(n) represents the desired signal, i.e., the signal actually arriving at the microphone.
Based on the principle of the Wiener filter, the minimum mean square error (the expectation of the squared error) can be found from the error function, expressed by the following formula:
E[e²(n)] = E[(x(n)*h(n) - d(n))²]
Minimizing the above expression yields the filter coefficients, i.e., the acoustic transfer function h(n) of the environment. The derived formula for the filter coefficients is:
h_opt = R_XX^(-1) · r_Xd

wherein R_XX^(-1) is the inverse of the autocorrelation matrix of the input signal (loudspeaker signal), R_XX = E[X(n)X(n)^T], and r_Xd = E[X(n)d(n)] is the cross-correlation between the input signal and the desired signal.
In the actual echo cancellation process, the choice of frame length for the sampling points (e.g., 20 ms) strongly affects the convergence rate of the algorithm, and too many sampling points occupy substantial computing resources. Therefore, a smoothly transitioning iterative algorithm is generally adopted for the computation: a step-by-step comparison determines whether convergence has occurred according to the minimum of the mean square error, and the parameters at convergence are the required filter coefficients.
Fig. 9 is a schematic diagram illustrating an architecture of functional modules of a display device for echo cancellation in another embodiment of the present application.
The present embodiment explains the scheme taking a display device having multiple channels (a mid-high channel and a bass channel) as an example. The echo cancellation functional modules comprise a mid-high-channel power amplifier, a bass-channel power amplifier, an AEC processor, and a playing service processor. For the connection relationships and functions of the components, refer to the above embodiments; they are not described again here.
In this embodiment, the mid-high-channel power amplifier includes a PEQ (Parametric Equalizer) adjuster, a three-segment DRC (Dynamic Range Control) adjuster, and a frequency divider. The bass-channel power amplifier includes a bass enhancer, a three-segment DRC adjuster, and a frequency divider. It should be noted that the mid-high-channel and bass-channel power amplifiers may further include other audio-conditioning devices; the embodiments of the present application are not limited in this respect.
The mid-high-channel power amplifier obtains the original audio, i.e., the full-band audio, from the playing service processor, and the PEQ adjuster adjusts its sound effect. This process mainly ensures the fullness of the mid-band (500 Hz to 6 kHz) by adjusting the frequency response of the audio signal in that band. The PEQ sound-effect adjustment introduces a certain amount of nonlinear distortion into the original audio. The first audio obtained by processing the original audio is therefore full-band audio carrying nonlinear distortion information.
On one hand, the mid-high-channel power amplifier outputs the first audio to the bass-channel power amplifier through the three-segment DRC adjuster. The three-segment DRC adjuster applies three different treatments according to signal amplitude: it attenuates noise-floor signals, expands medium-amplitude signals, and limits the dynamic range of large-amplitude signals. The PEQ-processed full-band audio output by the mid-high-channel power amplifier is thus dynamically level-controlled at the three-segment DRC adjuster, preventing instantaneous high voltage from burning out the woofer.
On the other hand, the mid-high-channel power amplifier frequency-divides the first audio through the frequency divider to obtain the mid-high audio, and drives the mid-high loudspeaker to play sound accordingly. The sound collected by the sound collector from the mid-high loudspeaker is the first echo.
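The three-segment DRC behavior described above — attenuate noise-floor samples, expand medium-amplitude ones, limit large ones — can be sketched as a static per-sample gain curve. The thresholds (0.05 noise floor, 0.8 limit) and ratios below are illustrative assumptions, not the device's tuning:

```python
def drc_gain(level, noise_floor=0.05, limit=0.8):
    """Per-sample gain implementing a three-segment DRC curve (toy values)."""
    if level < noise_floor:        # segment 1: attenuate the noise floor
        return 0.5
    if level < limit:              # segment 2: expand medium amplitudes
        return 1.2
    return limit * 1.2 / level     # segment 3: cap large amplitudes at a fixed level

def drc_process(samples):
    """Apply the gain curve sample by sample, preserving sign."""
    return [x * drc_gain(abs(x)) for x in samples]
```

A real DRC would smooth the gain over time (attack/release); this static curve only shows the three amplitude segments.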
After the bass-channel power amplifier obtains the first audio from the mid-high-channel power amplifier, it continues processing the first audio through the bass enhancer. This process mainly applies gain and low-pass filtering to the bass audio below 250 Hz to obtain the second audio, which includes the gain information of the bass.
On one hand, the bass-channel power amplifier frequency-divides the second audio to obtain the bass audio, and drives the woofer to play sound accordingly. The sound collected by the sound collector from the woofer is the second echo.
On the other hand, the bass-channel power amplifier likewise outputs the second audio to the AEC processor through its three-segment DRC adjuster. At this point the second audio is still full-band audio, and therefore includes both the PEQ-processed nonlinear distortion information and the bass gain information.
Finally, the sound collector collects the target audio, including the voice command input by the user, the first echo, and the second echo; based on the extraction reference signal corresponding to the second audio, the first echo and the second echo can be completely eliminated from the target audio.
It should be noted that the playing service processor and the echo cancellation processor in the embodiment of the present application are both integrated in the main processor (integrating all processors in the main processor saves cost). The embodiment of the present application also makes full use of the I2S output interface of the power amplifier circuit, saving the main processor's I2S interfaces, which are relatively scarce.
Based on the foregoing embodiments, the present application further provides a multi-channel echo cancellation method, as shown in the signaling diagram of fig. 10, where the method includes the following steps:
Step one, the playing service processor outputs the original audio (full-band audio) to the main-channel power amplifier, and the main-channel power amplifier performs the first processing on the original audio to obtain the first audio.
Step two, the main-channel power amplifier, on one hand, drives the main-channel loudspeaker to play sound according to the frequency-divided first audio and, on the other hand, outputs the first audio before frequency division to the secondary-channel power amplifier.
Step three, the secondary-channel power amplifier performs the second processing on the first audio to obtain the second audio. The first processing and the second processing are different audio processing modes.
Step four, the secondary-channel power amplifier, on one hand, drives the secondary-channel loudspeaker to play sound according to the frequency-divided second audio and, on the other hand, outputs the second audio before frequency division to the echo cancellation processor.
Step five, the echo cancellation processor acquires the target audio from the sound collector, the target audio comprising the third audio input by the user (i.e., the voice command), the first echo from the main channel, and the second echo from the secondary channel. The echo cancellation processor performs echo cancellation on the first echo and the second echo based on the extraction reference signal corresponding to the second audio.
In some embodiments, the main channel is a mid-high channel and the secondary channel is a bass channel. The first processing is PEQ processing, and the second processing is bass enhancement and low-pass filtering. The extraction reference signal at least includes the nonlinear distortion information obtained by performing PEQ processing on the original audio and the bass enhancement information obtained by performing bass enhancement processing on the first audio.
The same or similar contents in the embodiments of the present application may be referred to each other, and the related embodiments are not described in detail.
Finally, it should be noted that: the above embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present application.
The foregoing description, for purposes of explanation, has been presented in conjunction with specific embodiments. However, the illustrative discussions above are not intended to be exhaustive or to limit the embodiments to the precise forms disclosed above. Many modifications and variations are possible in light of the above teaching. The embodiments were chosen and described in order to best explain the principles and the practical application, to thereby enable others skilled in the art to best utilize the embodiments and various embodiments with various modifications as are suited to the particular use contemplated.

Claims (7)

1. A display device, comprising:
a play service processor;
the loudspeaker comprises a main sound channel loudspeaker and an auxiliary sound channel loudspeaker, wherein the main sound channel is a middle-high sound channel, the auxiliary sound channel is a low sound channel, the main sound channel loudspeaker plays sound according to a first audio frequency after frequency division, the auxiliary sound channel loudspeaker plays sound according to a second audio frequency after frequency division, the first audio frequency is obtained after performing first processing on an original audio frequency obtained from the playing service processor, the second audio frequency is obtained after performing second processing on the first audio frequency, and the first processing and the second processing are different audio frequency processing modes;
A sound collector configured to collect target audio, wherein the target audio includes a third audio input by a user, a first echo from the primary channel loudspeaker, and a second echo from the secondary channel loudspeaker;
a controller configured to:
acquiring the second audio from a power amplifier of the secondary sound channel, and acquiring the target audio from the sound collector;
and performing echo cancellation on the first echo and the second echo according to the extraction reference signal corresponding to the second audio.
2. The display apparatus according to claim 1, wherein the first process is a PEQ process, the second process is a bass enhancement process, and the extraction reference signal includes at least nonlinear distortion information obtained by performing the PEQ process on the original audio and bass enhancement information obtained by performing the bass enhancement process on the first audio.
3. The display device according to claim 1, wherein the power amplifier of the main channel outputs the first audio to the power amplifier of the sub channel through three-segment DRC, and the power amplifier of the sub channel outputs the second audio to the controller through three-segment DRC.
4. A multi-channel echo cancellation circuit, comprising:
a play service processor;
the power amplifier of the main sound channel is configured to obtain original audio from the playing service processor, perform first processing on the original audio to obtain first audio, and drive a main sound channel loudspeaker to play sound according to the first audio after frequency division;
the power amplifier of the secondary channel is configured to obtain the first audio from the power amplifier of the main channel, perform second processing on the first audio to obtain a second audio, and drive the secondary channel loudspeaker to play sound according to the second audio after frequency division, wherein the first processing and the second processing are different audio processing modes;
an echo cancellation processor configured to obtain the second audio from a power amplifier of the secondary channel, and obtain a target audio from a sound collector, where the target audio includes a third audio input by a user, a first echo from the primary channel loudspeaker, and a second echo from the secondary channel loudspeaker;
and performing echo cancellation on the first echo and the second echo according to the extraction reference signal corresponding to the second audio.
5. A multi-channel echo cancellation method applied to a display device, characterized in that the display device includes a main channel loudspeaker and a secondary channel loudspeaker, the main channel is a mid-high frequency channel, the secondary channel is a low frequency channel, the main channel loudspeaker plays sound according to a first audio after frequency division, the secondary channel loudspeaker plays sound according to a second audio after frequency division, the first audio is obtained by performing first processing on original audio obtained from a playing service processor, the second audio is obtained by performing second processing on the first audio, and the first processing and the second processing are different audio processing modes, and the method includes:
acquiring the second audio from a power amplifier of the secondary channel, and acquiring a collected target audio from a sound collector, wherein the target audio includes a third audio input by a user, a first echo from the main channel loudspeaker, and a second echo from the secondary channel loudspeaker;
and performing echo cancellation on the first echo and the second echo according to the extraction reference signal corresponding to the second audio.
6. The multi-channel echo cancellation method according to claim 5, wherein the first processing is PEQ (parametric equalization) processing, the second processing is bass enhancement and low-pass filtering, and the extraction reference signal at least includes nonlinear distortion information obtained by performing PEQ processing on the original audio and bass enhancement information obtained by performing bass enhancement processing on the first audio.
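Claim 6 names PEQ processing for the main channel and low-pass filtering for the secondary channel, but gives no filter design. A common realization of one PEQ band and of a second-order low-pass is the RBJ "Audio EQ Cookbook" biquad, sketched below purely for illustration; the sample rate, center frequencies, gains, and Q values are hypothetical:

```python
import numpy as np

def peaking_biquad(fs, f0, gain_db, q):
    """RBJ-cookbook peaking-EQ biquad (one PEQ band); returns (b, a) with a[0]=1."""
    A = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * np.pi * f0 / fs
    alpha = np.sin(w0) / (2.0 * q)
    b = np.array([1 + alpha * A, -2 * np.cos(w0), 1 - alpha * A])
    a = np.array([1 + alpha / A, -2 * np.cos(w0), 1 - alpha / A])
    return b / a[0], a / a[0]

def lowpass_biquad(fs, fc, q=0.7071):
    """RBJ-cookbook second-order low-pass (e.g. for the secondary-channel bass path)."""
    w0 = 2.0 * np.pi * fc / fs
    alpha = np.sin(w0) / (2.0 * q)
    cw = np.cos(w0)
    b = np.array([(1 - cw) / 2, 1 - cw, (1 - cw) / 2])
    a = np.array([1 + alpha, -2 * cw, 1 - alpha])
    return b / a[0], a / a[0]

def gain_at(b, a, fs, f):
    """Magnitude response of a normalized biquad at frequency f (Hz)."""
    z = np.exp(2j * np.pi * f / fs)
    return abs((b[0] + b[1] / z + b[2] / z ** 2) / (1 + a[1] / z + a[2] / z ** 2))
```

A full PEQ would cascade several such peaking bands; the bass-enhancement nonlinearity mentioned in the claim is not modeled here.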
7. The multi-channel echo cancellation method according to claim 5, wherein the power amplifier of the main channel outputs the first audio to the power amplifier of the secondary channel through three DRC (dynamic range control) stages, and the power amplifier of the secondary channel outputs the second audio to the controller of the display device through three DRC stages.
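Claim 7 describes the three-DRC output chain only at block level. A minimal memoryless sketch, assuming a simple hard-knee static compressor per stage (the thresholds and ratios below are hypothetical, and a real DRC would add attack/release smoothing):

```python
import numpy as np

def drc_stage(x, threshold_db, ratio):
    """One hard-knee DRC stage: attenuate samples whose level exceeds the threshold."""
    level_db = 20 * np.log10(np.maximum(np.abs(x), 1e-12))
    over = np.maximum(level_db - threshold_db, 0.0)   # dB above threshold
    gain_db = -over * (1 - 1 / ratio)                 # gain reduction above threshold
    return x * 10 ** (gain_db / 20)

def drc_chain(x, stages):
    """Cascade of DRC stages, e.g. the three stages recited in claim 7."""
    for thr, ratio in stages:
        x = drc_stage(x, thr, ratio)
    return x
```

Each stage maps an input level above threshold to `thr + (level - thr) / ratio` dB, so cascading three identical stages compresses progressively harder.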
CN202110378801.1A 2021-04-08 2021-04-08 Display device, multi-channel echo cancellation circuit and multi-channel echo cancellation method Active CN113096681B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110378801.1A CN113096681B (en) 2021-04-08 2021-04-08 Display device, multi-channel echo cancellation circuit and multi-channel echo cancellation method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110378801.1A CN113096681B (en) 2021-04-08 2021-04-08 Display device, multi-channel echo cancellation circuit and multi-channel echo cancellation method

Publications (2)

Publication Number Publication Date
CN113096681A CN113096681A (en) 2021-07-09
CN113096681B true CN113096681B (en) 2022-06-28

Family

ID=76675152

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110378801.1A Active CN113096681B (en) 2021-04-08 2021-04-08 Display device, multi-channel echo cancellation circuit and multi-channel echo cancellation method

Country Status (1)

Country Link
CN (1) CN113096681B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113823310B (en) * 2021-11-24 2022-02-22 南昌龙旗信息技术有限公司 Voice interruption wake-up circuit applied to tablet computer

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
NO328256B1 (en) * 2004-12-29 2010-01-18 Tandberg Telecom As Audio System
JP6163468B2 (en) * 2014-08-25 2017-07-12 日本電信電話株式会社 Sound quality evaluation apparatus, sound quality evaluation method, and program
DE102015222105A1 (en) * 2015-11-10 2017-05-11 Volkswagen Aktiengesellschaft Audio signal processing in a vehicle
CN106910510A (en) * 2017-02-16 2017-06-30 智车优行科技(北京)有限公司 Vehicle-mounted power amplifying device, vehicle and its audio play handling method
US10482868B2 (en) * 2017-09-28 2019-11-19 Sonos, Inc. Multi-channel acoustic echo cancellation
DE102018127071B3 (en) * 2018-10-30 2020-01-09 Harman Becker Automotive Systems Gmbh Audio signal processing with acoustic echo cancellation

Also Published As

Publication number Publication date
CN113096681A (en) 2021-07-09

Similar Documents

Publication Publication Date Title
CN112992171B (en) Display device and control method for eliminating echo received by microphone
CN111757171A (en) Display device and audio playing method
CN112612443B (en) Audio playing method, display device and server
CN114302194A (en) Display device and playing method during switching of multiple devices
CN112153440B (en) Display equipment and display system
CN112995551A (en) Sound control method and display device
CN112752156A (en) Subtitle adjusting method and display device
CN112599126A (en) Awakening method of intelligent device, intelligent device and computing device
CN113096681B (en) Display device, multi-channel echo cancellation circuit and multi-channel echo cancellation method
WO2022078065A1 (en) Display device resource playing method and display device
CN113066491A (en) Display device and voice interaction method
CN113507633A (en) Sound data processing method and device
CN114302021A (en) Display device and sound picture synchronization method
CN113473241A (en) Display equipment and display control method of image-text style menu
CN113079401B (en) Display device and echo cancellation method
CN112637957A (en) Display device and communication method of display device and wireless sound box
CN113709535B (en) Display equipment and far-field voice recognition method based on sound channel use
CN113038048B (en) Far-field voice awakening method and display device
CN115359788A (en) Display device and far-field voice recognition method
CN115103144A (en) Display device and volume bar display method
CN114302070A (en) Display device and audio output method
CN114078480A (en) Display device and echo cancellation method
CN115185392A (en) Display device, image processing method and device
CN114302197A (en) Voice separation control method and display device
CN113079400A (en) Display device, server and voice interaction method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant