WO2023273360A1 - Browser-based real-time audio processing method and system, and storage device - Google Patents

Browser-based real-time audio processing method and system, and storage device

Info

Publication number
WO2023273360A1
WO2023273360A1, PCT/CN2022/076304, CN2022076304W
Authority
WO
WIPO (PCT)
Prior art keywords
real-time audio
audio
browser
audio processing
Prior art date
Application number
PCT/CN2022/076304
Other languages
French (fr)
Chinese (zh)
Inventor
潘晨
Original Assignee
稿定(厦门)科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 稿定(厦门)科技有限公司
Publication of WO2023273360A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/16Sound input; Sound output
    • G06F3/167Audio in a user interface, e.g. using voice commands for navigating, audio feedback
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/958Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/16Sound input; Sound output
    • G06F3/162Interface to dedicated audio devices, e.g. audio drivers, interface to CODECs
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44Arrangements for executing specific programs
    • G06F9/455Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45504Abstract machines for programme code execution, e.g. Java virtual machine [JVM], interpreters, emulators
    • G06F9/45508Runtime interpretation or emulation, e.g. emulator loops, bytecode interpretation

Definitions

  • the invention relates to the field of audio processing, and specifically refers to a browser-based real-time audio processing method, system, and storage device.
  • the present invention provides a browser-based real-time audio processing method, system, and storage device, which can effectively solve the above-mentioned problems in the prior art.
  • a browser-based real-time audio processing method comprising the following steps:
  • the container acquires the real-time audio through the real-time audio input interface, matches the input data type and output data type of the container, processes the real-time audio, and plays a sound.
  • in step S3, selecting the container of the Web-side audio processing module in the browser is specifically: in the browser, call the AudioWorklet interface to start a new independent thread as the container for the Web-side audio processing module.
  • step S4 is specifically:
  • the audio input source is a media stream audio input node
  • call the browser getUserMedia interface to obtain the real-time audio from the device media stream
  • call the browser createMediaStreamSource interface to obtain the real-time audio, and establish media stream audio input
  • the audio input source is a media element audio input node
  • connect the player to the browser, call the browser createMediaElementSource interface, and establish the media element audio input.
  • step S5 is specifically:
  • the container acquires the real-time audio through the real-time audio input interface
  • the web-side audio processing module processes the data in the input layer cache, and stores the processing result in the output layer cache;
  • both the input layer cache and the output layer cache are ring caches.
  • in step S5.3, before the container, having obtained the real-time audio, converts the sampling-point type of the channel data into the data type matched by the Web-side audio processing module and stores it in the input layer cache, the method further includes the step of interleaving and storing the channel data contained in the real-time audio.
  • step S5.5 after converting the sample point type of the processing result into a data type matched by the container, a step is further included: converting the interleaved processing result into a multi-channel processing result.
  • in step S5.4, when the amount of data buffered in the input layer cache reaches the frame length required by the Web-side audio processing module, the Web-side audio processing module processes the data in the input layer cache and stores the processing result in the output layer cache.
  • a browser-based real-time audio processing system comprising:
  • An acquisition unit configured to acquire a native audio processing module written in a non-JavaScript programming language
  • a compiling unit configured to compile the native audio processing module into a Web-side audio processing module
  • a container selection unit, configured to select the container of the Web-side audio processing module in the browser, the container being used to load and run the Web-side audio processing module;
  • a mapping unit configured to acquire real-time audio, establish a real-time audio input interface in the browser, and map the real-time audio to the real-time audio input interface;
  • a data type matching unit, used by the container to obtain the real-time audio through the real-time audio input interface, match the input and output data types of the container, process the real-time audio, and play the sound.
  • a computer-readable storage medium is further provided, storing a computer program that, when executed by a processor, implements the real-time audio processing method.
  • the present invention provides the following effects and/or advantages:
  • by compiling the module, loading it into a container, establishing a real-time audio input interface to feed in real-time audio, and matching the input and output data types of the container, the method provided by the present invention realizes real-time audio processing in the browser in a universal way.
  • the method provided by the present invention has simple and efficient processing steps and low latency. Since the human ear is very sensitive to discontinuities in sound, a delay or stutter longer than 16 ms can be perceived by the human ear; the low-latency processing and output of the audio signal leave the final output audio free of any sense of discontinuity.
  • the invention is universal: it not only handles the audio input sources available in the browser, but is also applicable to various existing c/c++ audio algorithm modules.
  • the container selected by the present invention establishes independent computing resources for the Web-side audio processing module rather than the main thread, which is burdened with heavy tasks such as interface rendering and event response, thereby ensuring minimal processing time for audio.
  • the present invention selects different interfaces for different audio sources to be connected to the container, and maps the real-time audio obtained from the audio input source that can be accepted by the container to a specific interface acceptable to the browser or container, thereby ensuring that the browser or container can obtain real-time audio.
  • the present invention matches the input and output data types of the container, so that the input audio data can be recognized and processed by the Web-side audio processing module, and the data produced by the Web-side audio processing module can in turn be recognized and played by the container.
  • Figure 1 is a schematic flow chart of Embodiment 1.
  • Fig. 2 is a schematic diagram of a link in which real-time audio is mapped to the real-time audio input interface.
  • Fig. 3 is a schematic diagram of input and output matching in the container.
  • FIG. 4 is a schematic diagram of interleaved storage of channel data.
  • Fig. 5 is a schematic diagram of the functional framework of the second embodiment.
  • a browser-based real-time audio processing method comprises the following steps:
  • the native audio processing module can be a low-level audio processing module written in c/c++ that uses a mature audio processing algorithm to realize different voice-changing effects on PCM data.
  • the native audio processing module may also be a program written in another programming language, and the audio processing module may also be one that realizes other audio processing effects.
  • the native audio processing module used in this embodiment is an existing technology, and its composition and function will not be described in detail here.
  • WebAssembly is a virtual-machine language whose MVP (Minimum Viable Product, i.e. the core feature set) is already widely supported across browsers, and its execution performance is close to native, a great improvement over traditional JavaScript processing modules running in the browser.
  • Module compilation can be done with the Emscripten toolchain.
  • Emscripten is an implementation of LLVM (Low Level Virtual Machine, a general-purpose compiler architecture) dedicated to converting c/c++ modules into WebAssembly modules; the Web-side audio processing module is obtained through this WebAssembly conversion.
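As a minimal sketch of how a module compiled this way might be brought into the page, the following loader fetches and instantiates a WebAssembly binary directly. The file name, build flags in the comment, and exported function name are illustrative assumptions, not from the patent; a real Emscripten build would normally use its generated JavaScript glue code instead.

```javascript
// Hypothetical sketch: loading an Emscripten-compiled audio module.
// Assumes a build along the lines of (flags illustrative only):
//   emcc voice_fx.c -O3 -s MODULARIZE=1 -s EXPORT_NAME=createVoiceFx \
//        -o voice_fx.js
async function loadWebAudioModule(wasmUrl) {
  // Fetch the binary and instantiate it; with no imports, the module
  // must be self-contained (real Emscripten output needs its glue).
  const response = await fetch(wasmUrl);
  const { instance } = await WebAssembly.instantiateStreaming(response, {});
  // The exports object would expose the compiled processing entry points.
  return instance.exports;
}
```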
  • the container is used to load and run the Web-side audio processing module; the container provides computing resources for the Web-side audio processing module, and a thread is the unit in which a computer schedules resources.
  • in a browser, the main thread can serve as a container for loading and running processing modules.
  • however, independent computing resources are required to minimize the processing time of audio, so the main thread, which is burdened with heavy tasks such as interface rendering and event response, is not selected.
  • S4: acquire real-time audio, establish a real-time audio input interface in the browser, and map the real-time audio to the real-time audio input interface. The container accepts only a few specific audio input sources, so audio input in any form must be mapped to these specific input sources. Mapping the real-time audio obtained from a container-acceptable audio input source onto a specific interface acceptable to the browser or container ensures that the browser or container can obtain the real-time audio.
  • the container acquires the real-time audio through the real-time audio input interface, matches the input data type and output data type of the container, processes the real-time audio, and plays a sound.
  • the compiled Web-side audio processing module internally retains the input/output data model of the c/c++ module, which does not match that of the container: a single audio data frame input to or output by the container is 128 sampling points long, each recorded as a 32-bit floating-point number, whereas a single audio data frame processed by the compiled Web-side audio processing module is generally 1024 sampling points long, each generally recorded as a 16-bit integer.
  • in step S3, selecting the container of the Web-side audio processing module in the browser is specifically: in the browser, call the AudioWorklet interface to start a new independent thread as the container for the Web-side audio processing module.
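The container-selection step can be sketched with the standard Web Audio AudioWorklet API. The module URL and processor name ("voice-fx-processor") are illustrative assumptions; only the `audioWorklet.addModule` and `AudioWorkletNode` interfaces are standard browser API.

```javascript
// Hypothetical sketch of step S3: AudioWorklet starts an independent
// audio-rendering thread that serves as the container for the compiled
// Web-side audio processing module.
async function createWorkletContainer(audioContext) {
  // addModule() loads the processor script into the worklet's own thread,
  // separate from the main thread's rendering and event handling.
  await audioContext.audioWorklet.addModule("voice-fx-processor.js");
  // The returned node bridges the Web Audio graph and the worklet thread.
  return new AudioWorkletNode(audioContext, "voice-fx-processor");
}
```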
  • step S4, referring to FIG. 2, is specifically:
  • the audio input source is one or more of a buffered audio input node, a media stream audio input node, and a media element audio input node. The buffered audio input node is generally used for inputting raw PCM audio data, such as sampled audio held in a buffer; the media stream audio input node is generally used for inputting a device media stream, such as audio obtained from a microphone; the media element audio input node is generally used for inputting player media files in the browser, such as audio currently being played by a player.
  • the audio input source is a media stream audio input node
  • call the browser getUserMedia interface to obtain the real-time audio from the device media stream
  • call the browser createMediaStreamSource interface to obtain the real-time audio, and establish media stream audio input
  • the audio input source is a media element audio input node
  • connect the player to the browser, call the browser createMediaElementSource interface, and establish the media element audio input.
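The three source mappings above can be sketched as follows. The helper function names are assumptions; the underlying interfaces (`createBufferSource`/`AudioBufferSourceNode`, `getUserMedia`, `createMediaStreamSource`, `createMediaElementSource`) are standard Web Audio and Media Capture APIs.

```javascript
// Hypothetical sketch of step S4: map each kind of audio input source
// onto an input node the container can accept.
function bufferedInput(ctx, pcmAudioBuffer) {
  const node = ctx.createBufferSource();   // buffered audio input node
  node.buffer = pcmAudioBuffer;            // write PCM into the buffer attribute
  return node;
}

async function mediaStreamInput(ctx) {
  // e.g. a microphone: obtain the device media stream first
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
  return ctx.createMediaStreamSource(stream); // media stream audio input node
}

function mediaElementInput(ctx, audioElement) {
  // e.g. an <audio> player already attached to the page
  return ctx.createMediaElementSource(audioElement); // media element audio input node
}
```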
  • step S5, referring to FIG. 3, is specifically:
  • the container obtains the real-time audio through the real-time audio input interface; in this embodiment, a single audio data frame input to or output by the container is 128 sampling points long, and each sampling point is recorded as a 32-bit floating-point number.
  • each sampling point is recorded using a 16-bit integer data type, which is a data type that can be recognized and processed by the compiled web-side audio processing module.
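The 32-bit-float to 16-bit-integer conversion described here can be sketched with two small helpers. The function names and the 32767 scale factor are illustrative assumptions; the patent only specifies the two sample formats.

```javascript
// Hypothetical sketch of the data-type matching: the container exchanges
// 32-bit floats in [-1, 1], while the compiled module expects 16-bit integers.
function floatToInt16(float32Samples) {
  const out = new Int16Array(float32Samples.length);
  for (let i = 0; i < float32Samples.length; i++) {
    // Clamp to [-1, 1], then scale to the signed 16-bit range.
    const s = Math.max(-1, Math.min(1, float32Samples[i]));
    out[i] = Math.round(s * 32767);
  }
  return out;
}

function int16ToFloat(int16Samples) {
  const out = new Float32Array(int16Samples.length);
  for (let i = 0; i < int16Samples.length; i++) {
    out[i] = int16Samples[i] / 32767; // back to [-1, 1]
  }
  return out;
}
```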
  • the Web-side audio processing module processes the data in the input layer cache and stores the processing result in the output layer cache; in this embodiment, the Web-side audio processing module uses a mature audio processing algorithm to realize different voice-changing effects.
  • both the input layer cache and the output layer cache are ring caches.
  • the ring cache maximizes the reuse of memory resources.
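A ring cache of the kind described, a fixed block of memory reused by wrapping the read and write positions, can be sketched like this. The class and method names are illustrative assumptions.

```javascript
// Hypothetical sketch of the input/output layer caches as ring buffers:
// memory is allocated once and reused, avoiding repeated allocation.
class RingBuffer {
  constructor(capacity) {
    this.data = new Float32Array(capacity);
    this.readPos = 0;
    this.writePos = 0;
    this.size = 0; // number of samples currently stored
  }
  write(samples) {
    for (const s of samples) {
      this.data[this.writePos] = s;
      this.writePos = (this.writePos + 1) % this.data.length;
      this.size = Math.min(this.size + 1, this.data.length);
    }
  }
  read(count) {
    const out = new Float32Array(Math.min(count, this.size));
    for (let i = 0; i < out.length; i++) {
      out[i] = this.data[this.readPos];
      this.readPos = (this.readPos + 1) % this.data.length;
      this.size--;
    }
    return out;
  }
}
```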
  • in step S5.3, referring to FIG. 4, before the container, having obtained the real-time audio, converts the sampling-point type of the channel data into the data type matched by the Web-side audio processing module and stores it in the input layer cache, a further step interleaves and stores the channel data contained in the real-time audio. Since audio generally contains two channels, the two-channel audio data must be organized into a single sequence before it can be read by the Web-side audio processing module; specifically, the left-channel and right-channel data are stored interleaved. Correspondingly, in step S5.5, after the sampling-point type of the processing result is converted into the data type matched by the container, a further step converts the interleaved processing result back into a multi-channel processing result.
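The interleaving of FIG. 4 and its inverse can be sketched as two helpers: the container delivers planar channel data (one array per channel), while the module reads a single interleaved sequence L0 R0 L1 R1 and so on. The function names are illustrative assumptions.

```javascript
// Hypothetical sketch of the interleave/deinterleave steps for stereo audio.
function interleave(left, right) {
  const out = new Float32Array(left.length + right.length);
  for (let i = 0; i < left.length; i++) {
    out[2 * i] = left[i];       // left-channel sample
    out[2 * i + 1] = right[i];  // right-channel sample
  }
  return out;
}

function deinterleave(interleaved) {
  const half = interleaved.length / 2;
  const left = new Float32Array(half);
  const right = new Float32Array(half);
  for (let i = 0; i < half; i++) {
    left[i] = interleaved[2 * i];
    right[i] = interleaved[2 * i + 1];
  }
  return [left, right];
}
```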
  • in step S5.4, when the amount of data buffered in the input layer cache reaches the frame length required by the Web-side audio processing module, the Web-side audio processing module processes the data in the input layer cache and stores the processing result in the output layer cache.
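The frame-length trigger of step S5.4 can be sketched as an accumulator: the container hands over 128-sample frames, which are buffered until the module's required frame length is reached, then processed in one batch. In a real worklet this logic would live in a class extending AudioWorkletProcessor registered via registerProcessor(); it is shown as a plain class with an injected `processFrame` callback (a stand-in for the compiled module call) so the buffering logic stands alone. All names are assumptions.

```javascript
// Hypothetical sketch of step S5.4: accumulate container-sized frames
// until the module's frame length is reached, then process in one batch.
class FrameAccumulator {
  constructor(frameLength, processFrame) {
    this.frameLength = frameLength;   // e.g. 1024 for the compiled module
    this.processFrame = processFrame; // stand-in for the Wasm module call
    this.pending = [];                // input layer cache (simplified)
    this.processed = [];              // output layer cache (simplified)
  }
  push(samples) { // called once per 128-sample container frame
    this.pending.push(...samples);
    while (this.pending.length >= this.frameLength) {
      const frame = this.pending.splice(0, this.frameLength);
      this.processed.push(...this.processFrame(frame));
    }
  }
}
```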
  • when a browser implements the method of this embodiment, a single data-type conversion of a 128-sampling-point frame takes about 10 microseconds, and processing 1024 sampling points of audio data in the Web-side audio processing module takes about 300 microseconds, for a total of about 320 microseconds.
  • referring to Figure 5, a browser-based real-time audio processing system comprises:
  • An acquisition unit configured to acquire a native audio processing module written in a non-JavaScript programming language
  • a compiling unit configured to compile the native audio processing module into a Web-side audio processing module
  • a container selection unit, configured to select the container of the Web-side audio processing module in the browser, the container being used to load and run the Web-side audio processing module;
  • a mapping unit configured to acquire real-time audio, establish a real-time audio input interface in the browser, and map the real-time audio to the real-time audio input interface;
  • a data type matching unit, used by the container to obtain the real-time audio through the real-time audio input interface, match the input and output data types of the container, process the real-time audio, and play the sound.
  • a computer-readable storage medium stores a computer program, and when the computer program is executed by a processor, the real-time audio processing method described in Embodiment 1 is realized.
  • Each functional unit in each embodiment of the present invention may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated units can be implemented in the form of hardware or in the form of software functional units.
  • if the integrated unit is realized in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium.
  • the computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, etc.) or a processor to execute all or part of the steps of the methods in the various embodiments of the present invention.
  • the aforementioned storage media include media that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, or an optical disc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Data Mining & Analysis (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The present invention relates to a browser-based real-time audio processing method, comprising the following steps: S1, acquiring a native audio processing module written in a non-JavaScript programming language; S2, compiling the native audio processing module into a Web-side audio processing module; S3, selecting a container for the Web-side audio processing module in a browser, the container being configured to load and run the Web-side audio processing module; S4, acquiring real-time audio, establishing a real-time audio input interface in the browser, and mapping the real-time audio to the real-time audio input interface; S5, the container obtaining the real-time audio by means of the real-time audio input interface, matching an input data type and an output data type of the container, processing the real-time audio, and playing the sound.

Description

A browser-based real-time audio processing method and system, and storage device

Technical Field

The invention relates to the field of audio processing, and in particular to a browser-based real-time audio processing method, system, and storage device.

Background Art

With the popularity of strongly real-time content carriers such as high-speed networks and live streaming, the demand for audio processing, such as applying voice-changing effects to make content more interactive and fun, is growing daily.

Browser-side audio processing currently on the market concentrates on visualizing audio waveforms, with the data coming from an entire pre-loaded audio file; it lacks a scheme for processing real-time audio data and therefore cannot meet the needs of real-time scenarios such as live streaming. Meanwhile, the platform's native audio processing algorithms are difficult to implement, support only simple audio processing, and suffer from problems such as high latency.

Designing a browser-based real-time audio processing method, system, and storage device that addresses the above problems in the prior art is the purpose of the research behind the present invention.
Summary of the Invention

In view of the above problems in the prior art, the present invention provides a browser-based real-time audio processing method, system, and storage device that can effectively solve them.

The technical scheme of the present invention is:

A browser-based real-time audio processing method, comprising the following steps:

S1: acquire a native audio processing module written in a non-JavaScript programming language;

S2: compile the native audio processing module into a Web-side audio processing module;

S3: select, in the browser, a container for the Web-side audio processing module, the container being used to load and run the Web-side audio processing module;

S4: acquire real-time audio, establish a real-time audio input interface in the browser, and map the real-time audio to the real-time audio input interface;

S5: the container acquires the real-time audio through the real-time audio input interface, matches the input and output data types of the container, processes the real-time audio, and plays the sound.
Further, in step S3, selecting the container of the Web-side audio processing module in the browser is specifically: in the browser, call the AudioWorklet interface to start a new independent thread as the container for the Web-side audio processing module.

Further, step S4 is specifically:

S4.1: obtain real-time audio through an audio input source, the audio input source being one or more of a buffered audio input node, a media stream audio input node, and a media element audio input node;

S4.2: if the audio input source is a buffered audio input node, call the browser AudioBufferSourceNode interface and write the real-time audio in turn into the buffer attribute of the object;

if the audio input source is a media stream audio input node, call the browser getUserMedia interface to obtain the real-time audio from the device media stream, then call the browser createMediaStreamSource interface to obtain the real-time audio and establish the media stream audio input;

if the audio input source is a media element audio input node, connect the player to the browser, call the browser createMediaElementSource interface, and establish the media element audio input.
Further, step S5 is specifically:

S5.1: the container acquires the real-time audio through the real-time audio input interface;

S5.2: establish an input layer cache and an output layer cache in the Web-side audio processing module;

S5.3: after the container obtains the real-time audio, convert the sampling-point type of the channel data into the data type matched by the Web-side audio processing module and store it in the input layer cache;

S5.4: the Web-side audio processing module processes the data in the input layer cache and stores the processing result in the output layer cache;

S5.5: read the output layer cache and convert the sampling-point type of the processing result into the data type matched by the container;

S5.6: play the sound.
Further, both the input layer cache and the output layer cache are ring caches.

Further, in step S5.3, before the container, having obtained the real-time audio, converts the sampling-point type of the channel data into the data type matched by the Web-side audio processing module and stores it in the input layer cache, the method further includes the step of interleaving and storing the channel data contained in the real-time audio.

Further, in step S5.5, after the sampling-point type of the processing result is converted into the data type matched by the container, a further step converts the interleaved processing result into a multi-channel processing result.

Further, in step S5.4, when the amount of data buffered in the input layer cache reaches the frame length required by the Web-side audio processing module, the Web-side audio processing module processes the data in the input layer cache and stores the processing result in the output layer cache.
A browser-based real-time audio processing system is further provided, comprising:

an acquisition unit, configured to acquire a native audio processing module written in a non-JavaScript programming language;

a compiling unit, configured to compile the native audio processing module into a Web-side audio processing module;

a container selection unit, configured to select the container of the Web-side audio processing module in the browser, the container being used to load and run the Web-side audio processing module;

a mapping unit, configured to acquire real-time audio, establish a real-time audio input interface in the browser, and map the real-time audio to the real-time audio input interface;

a data type matching unit, used by the container to obtain the real-time audio through the real-time audio input interface, match the input and output data types of the container, process the real-time audio, and play the sound.

A computer-readable storage medium is further provided, storing a computer program that, when executed by a processor, implements the real-time audio processing method.
Therefore, the present invention provides the following effects and/or advantages:

By compiling the module, loading it into a container, establishing a real-time audio input interface to feed in real-time audio, and matching the input and output data types of the container, the method provided by the present invention realizes real-time audio processing in the browser in a universal way.

The method provided by the present invention has simple and efficient processing steps and low latency. Since the human ear is very sensitive to discontinuities in sound, a delay or stutter longer than 16 ms can be perceived by the human ear; the low-latency processing and output of the audio signal leave the final output audio free of any sense of discontinuity.

The invention is universal: it not only handles the audio input sources available in the browser, but is also applicable to various existing c/c++ audio algorithm modules.

The container selected by the present invention establishes independent computing resources for the Web-side audio processing module rather than the main thread, which is burdened with heavy tasks such as interface rendering and event response, thereby ensuring minimal processing time for audio.

The present invention selects different interfaces through which different audio sources are connected to the container, and maps the real-time audio obtained from a container-acceptable audio input source onto a specific interface acceptable to the browser or container, thereby ensuring that the browser or container can obtain the real-time audio.

The present invention matches the input and output data types of the container, so that the input audio data can be recognized and processed by the Web-side audio processing module, and the data produced by the Web-side audio processing module can in turn be recognized and played by the container.

It should be understood that the foregoing overview and the following detailed description of the invention are exemplary and explanatory, and are intended to provide further explanation of the invention as claimed.
Description of Drawings

Figure 1 is a schematic flow chart of Embodiment 1.

Figure 2 is a schematic diagram of the link by which real-time audio is mapped to the real-time audio input interface.

Figure 3 is a schematic diagram of input/output matching inside the container.

Figure 4 is a schematic diagram of the interleaved storage of channel data.

Figure 5 is a schematic diagram of the functional framework of Embodiment 2.
Detailed Description
To facilitate understanding by those skilled in the art, the structure of the present invention is described in further detail below through the embodiments with reference to the accompanying drawings.
Referring to Fig. 1, a browser-based real-time audio processing method comprises the following steps:
S1: obtain a native audio processing module written in a non-JavaScript programming language. In this embodiment, the native audio processing module may be a low-level audio processing module written in C/C++ that applies mature audio processing algorithms to PCM data to produce different voice-changing effects. In other embodiments, the native audio processing module may be a program written in another programming language, and the module may implement other audio processing effects. The native audio processing module used in this embodiment belongs to the prior art, so its composition and function are not described in detail here.
S2: compile the native audio processing module into a web-side audio processing module. WebAssembly is a virtual-machine language whose MVP (Minimum Viable Product, i.e. the core feature set) is already widely supported across browsers, and its execution performance is close to native, a large improvement over traditional JavaScript processing modules running in the browser. The compilation can be carried out with the Emscripten toolchain; Emscripten is an implementation of LLVM (Low Level Virtual Machine, a general-purpose compiler architecture) dedicated to converting C/C++ modules into WebAssembly modules. The web-side audio processing module is thus obtained through WebAssembly conversion.
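As an illustration of this compilation step, a C/C++ module can be converted with an Emscripten invocation along the following lines. This is a sketch only: the source file name and the exported `process_pcm` function are hypothetical, and the exact set of flags depends on the Emscripten version in use.

```shell
# Hypothetical build: voice_changer.c is assumed to export a process_pcm()
# function that transforms PCM samples in place.
emcc voice_changer.c -O3 \
  -s MODULARIZE=1 \
  -s EXPORTED_FUNCTIONS='["_process_pcm","_malloc","_free"]' \
  -o voice_changer.js   # emits voice_changer.js + voice_changer.wasm
```

The emitted JavaScript glue file can then be loaded inside the container chosen in step S3.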
S3: in the browser, select a container for the web-side audio processing module; the container is used to load and run the web-side audio processing module. The container provides computing resources for the module, and a thread is the unit in which a computer schedules resources. In a browser, the main thread generally serves as the container for loading and running processing modules. In a real-time audio processing scenario, however, the human ear is very sensitive to audio discontinuities, so independent computing resources are needed to keep the processing latency to a minimum. The main thread, which also handles heavy tasks such as interface rendering and event response, is therefore not chosen; instead, a newly started thread is used as the container for the web-side audio processing module. This independent thread can be provided by the browser's AudioWorklet interface.
S4: obtain real-time audio, establish a real-time audio input interface in the browser, and map the real-time audio to the real-time audio input interface. The container accepts only a few specific audio input sources, so audio input of any possible form must be mapped onto these specific sources. Mapping the real-time audio obtained from an audio input source acceptable to the container onto a specific interface of the browser or container guarantees that the browser or container can obtain the real-time audio.
S5: the container obtains the real-time audio through the real-time audio input interface, the input data type and output data type of the container are matched, and the real-time audio is processed and the sound played. In this embodiment, the compiled web-side audio processing module internally retains the input/output data model of the C/C++ module, which does not match that of the container: a single audio frame input to or output from the container is 128 samples long, each sample recorded as a 32-bit floating-point number, whereas a single frame processed by the compiled module is typically 1024 samples long, each sample typically recorded as a 16-bit integer. The input and output data types of the container must therefore be matched, so that the input audio data can be recognized and processed by the web-side audio processing module, and the data produced by the module can in turn be recognized and played by the container.
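The sample-type half of this matching can be sketched as follows. The function names are illustrative, not taken from the patent; the container side exchanges 32-bit floats in the range [-1, 1], while the compiled module is assumed to expect 16-bit signed integers.

```javascript
// Convert one container frame (Float32, range [-1, 1]) into the 16-bit
// integer representation assumed for the compiled module.
function floatToInt16(float32Frame) {
  const out = new Int16Array(float32Frame.length);
  for (let i = 0; i < float32Frame.length; i++) {
    // Clamp to [-1, 1], then scale to the 16-bit signed range.
    const s = Math.max(-1, Math.min(1, float32Frame[i]));
    out[i] = Math.round(s * 32767);
  }
  return out;
}

// Inverse conversion: module output back into the container's float format.
function int16ToFloat(int16Frame) {
  const out = new Float32Array(int16Frame.length);
  for (let i = 0; i < int16Frame.length; i++) {
    out[i] = int16Frame[i] / 32767;
  }
  return out;
}
```

The frame-length half of the matching (128-sample container frames versus 1024-sample module frames) is handled by the input-layer and output-layer caches described in step S5.2 below.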
Further, in step S3, selecting the container of the web-side audio processing module in the browser specifically comprises: in the browser, calling the browser's AudioWorklet interface to start a new independent thread as the container of the web-side audio processing module.
Step S4 is specifically as follows (referring to Fig. 2):
S4.1: obtain real-time audio through an audio input source, the audio input source being one or more of a buffered audio input node, a media-stream audio input node, and a media-element audio input node. The buffered audio input node is generally used for inputting raw PCM audio data, for example sampled audio held in a buffer; the media-stream audio input node is generally used for inputting a device media stream, for example audio captured from a microphone; the media-element audio input node is generally used for inputting a media file played by a player in the browser, for example the audio a player is currently playing.
S4.2: if the audio input source is a buffered audio input node, call the browser's AudioBufferSourceNode interface and write the real-time audio in turn into the object's buffer attribute;
if the audio input source is a media-stream audio input node, call the browser's getUserMedia interface to obtain the real-time audio from the device media stream, then call the browser's createMediaStreamSource interface with that audio to establish the media-stream audio input;
if the audio input source is a media-element audio input node, connect the player to the browser and call the browser's createMediaElementSource interface to establish the media-element audio input.
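The three-way mapping of S4.2 can be sketched as a dispatcher over the source kind. The dispatcher itself and the `source` descriptor are illustrative; `createBufferSource`, `createMediaStreamSource`, and `createMediaElementSource` are real Web Audio API methods of an `AudioContext`. In the browser, `ctx` would be an `AudioContext` and a media stream would come from `navigator.mediaDevices.getUserMedia(...)`; any object exposing the same methods works for testing.

```javascript
// Map an arbitrary audio input source onto a node the container accepts.
function createInputNode(ctx, source) {
  switch (source.kind) {
    case 'buffer': {
      const node = ctx.createBufferSource();  // AudioBufferSourceNode
      node.buffer = source.audioBuffer;       // write PCM into the buffer attribute
      return node;
    }
    case 'stream':
      // source.mediaStream: obtained via getUserMedia in the browser.
      return ctx.createMediaStreamSource(source.mediaStream);
    case 'element':
      // source.element: an <audio>/<video> player connected to the browser.
      return ctx.createMediaElementSource(source.element);
    default:
      throw new Error('unsupported audio input source: ' + source.kind);
  }
}
```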
Step S5 is specifically as follows (referring to Fig. 3):
S5.1: the container obtains the real-time audio through the real-time audio input interface. In this embodiment, a single audio frame input to or output from the container is 128 samples long, each sample recorded as a 32-bit floating-point number.
S5.2: establish an input-layer cache and an output-layer cache in the web-side audio processing module.
S5.3: after the container obtains the real-time audio, convert the sample type of the channel data into the data type matched by the web-side audio processing module and store the result in the input-layer cache. The audio data is converted into frames of length 1024, each sample recorded as a 16-bit integer, which is the data type that the compiled web-side audio processing module can recognize and process.
S5.4: the web-side audio processing module processes the data in the input-layer cache and stores the processing result in the output-layer cache. In this embodiment, the web-side audio processing module applies mature audio processing algorithms to achieve different voice-changing effects.
S5.5: read the output-layer cache and convert the sample type of the processing result into the data type matched by the container. In this embodiment, the data in the output-layer cache is converted back into the data type input by the container mentioned in step S5.1.
S5.6: play the sound.
Further, both the input-layer cache and the output-layer cache are ring buffers. A ring buffer maximizes the reuse of memory resources.
Further, in step S5.3 (referring to Fig. 4), after the container obtains the real-time audio and before the sample type of the channel data is converted into the data type matched by the web-side audio processing module and stored in the input-layer cache, the method further comprises the step of interleaving the channel data contained in the real-time audio. Since audio generally contains two channels, the two-channel audio data must be arranged into a single chain of data before the web-side audio processing module can read it. Specifically, the left-channel and right-channel data are stored in interleaved order. Accordingly, in step S5.5, after the sample type of the processing result is converted into the data type matched by the container, the method further comprises the step of converting the interleaved processing result back into a multi-channel processing result.
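The interleaving step and its inverse can be sketched as below (function names illustrative): the container delivers planar channel data, one array per channel, while the module reads a single interleaved chain L0 R0 L1 R1 ...

```javascript
// Merge planar stereo data into one interleaved array for the module.
function interleave(left, right) {
  const out = new Float32Array(left.length * 2);
  for (let i = 0; i < left.length; i++) {
    out[2 * i] = left[i];
    out[2 * i + 1] = right[i];
  }
  return out;
}

// Split an interleaved processing result back into per-channel arrays.
function deinterleave(inter) {
  const n = inter.length / 2;
  const left = new Float32Array(n);
  const right = new Float32Array(n);
  for (let i = 0; i < n; i++) {
    left[i] = inter[2 * i];
    right[i] = inter[2 * i + 1];
  }
  return [left, right];
}
```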
Further, in step S5.4, when the amount of data stored in the input-layer cache reaches the frame length required by the web-side audio processing module, the web-side audio processing module processes the data in the input-layer cache and stores the processing result in the output-layer cache.
When the method of this embodiment is implemented in a browser, a single data-type matching pass over a frame of 128 samples takes about 10 microseconds, and processing a 1024-sample frame of audio data in the web-side audio processing module takes about 300 microseconds, for a total of roughly 320 microseconds.
Embodiment 2
Referring to Fig. 5, a browser-based real-time audio processing system comprises:
an acquisition unit for obtaining a native audio processing module written in a non-JavaScript programming language;
a compiling unit for compiling the native audio processing module into a web-side audio processing module;
a container selection unit for selecting, in a browser, a container for the web-side audio processing module, the container being used to load and run the web-side audio processing module;
a mapping unit for obtaining real-time audio, establishing a real-time audio input interface in the browser, and mapping the real-time audio to the real-time audio input interface;
a data type matching unit, used by the container to obtain the real-time audio through the real-time audio input interface, match the input data type and output data type of the container, process the real-time audio, and play the sound.
Embodiment 3
A computer-readable storage medium stores a computer program which, when executed by a processor, implements the real-time audio processing method described in Embodiment 1.
The functional units in the embodiments of the present invention may be integrated into one processing unit, may exist physically as separate units, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented as a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the essence of the technical solution of the present invention, or the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes a number of instructions that cause a computer device (which may be a personal computer, a server, a network device, etc.) or a processor to execute all or part of the steps of the methods of the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The above are only preferred embodiments of the present invention; all equivalent changes and modifications made within the scope of the patent claims of the present invention shall fall within the scope of the present invention.

Claims (10)

  1. A browser-based real-time audio processing method, characterized by comprising the following steps:
    S1: obtaining a native audio processing module written in a non-JavaScript programming language;
    S2: compiling the native audio processing module into a web-side audio processing module;
    S3: selecting, in a browser, a container for the web-side audio processing module, the container being used to load and run the web-side audio processing module;
    S4: obtaining real-time audio, establishing a real-time audio input interface in the browser, and mapping the real-time audio to the real-time audio input interface;
    S5: the container obtaining the real-time audio through the real-time audio input interface, matching the input data type and output data type of the container, processing the real-time audio, and playing the sound.
  2. The browser-based real-time audio processing method according to claim 1, characterized in that in step S3, selecting the container of the web-side audio processing module in the browser specifically comprises: in the browser, calling the browser's AudioWorklet interface to start a new independent thread as the container of the web-side audio processing module.
  3. The browser-based real-time audio processing method according to claim 1, characterized in that step S4 specifically comprises:
    S4.1: obtaining real-time audio through an audio input source, the audio input source being one or more of a buffered audio input node, a media-stream audio input node, and a media-element audio input node;
    S4.2: if the audio input source is a buffered audio input node, calling the browser's AudioBufferSourceNode interface and writing the real-time audio in turn into the object's buffer attribute;
    if the audio input source is a media-stream audio input node, calling the browser's getUserMedia interface to obtain the real-time audio from the device media stream, and calling the browser's createMediaStreamSource interface with the real-time audio to establish the media-stream audio input;
    if the audio input source is a media-element audio input node, connecting the player to the browser and calling the browser's createMediaElementSource interface to establish the media-element audio input.
  4. The browser-based real-time audio processing method according to claim 1, characterized in that step S5 specifically comprises:
    S5.1: the container obtaining the real-time audio through the real-time audio input interface;
    S5.2: establishing an input-layer cache and an output-layer cache in the web-side audio processing module;
    S5.3: after the container obtains the real-time audio, converting the sample type of the channel data into the data type matched by the web-side audio processing module and storing it in the input-layer cache;
    S5.4: the web-side audio processing module processing the data in the input-layer cache and storing the processing result in the output-layer cache;
    S5.5: reading the output-layer cache and converting the sample type of the processing result into the data type matched by the container;
    S5.6: playing the sound.
  5. The browser-based real-time audio processing method according to claim 4, characterized in that both the input-layer cache and the output-layer cache are ring buffers.
  6. The browser-based real-time audio processing method according to claim 4, characterized in that in step S5.3, after the container obtains the real-time audio and before the sample type of the channel data is converted into the data type matched by the web-side audio processing module and stored in the input-layer cache, the method further comprises the step of: interleaving the channel data contained in the real-time audio.
  7. The browser-based real-time audio processing method according to claim 6, characterized in that in step S5.5, after the sample type of the processing result is converted into the data type matched by the container, the method further comprises the step of: converting the interleaved processing result into a multi-channel processing result.
  8. The browser-based real-time audio processing method according to claim 4, characterized in that in step S5.4, when the amount of data stored in the input-layer cache reaches the frame length required by the web-side audio processing module, the web-side audio processing module processes the data in the input-layer cache and stores the processing result in the output-layer cache.
  9. A browser-based real-time audio processing system, characterized by comprising:
    an acquisition unit for obtaining a native audio processing module written in a non-JavaScript programming language;
    a compiling unit for compiling the native audio processing module into a web-side audio processing module;
    a container selection unit for selecting, in a browser, a container for the web-side audio processing module, the container being used to load and run the web-side audio processing module;
    a mapping unit for obtaining real-time audio, establishing a real-time audio input interface in the browser, and mapping the real-time audio to the real-time audio input interface;
    a data type matching unit, used by the container to obtain the real-time audio through the real-time audio input interface, match the input data type and output data type of the container, process the real-time audio, and play the sound.
  10. A computer-readable storage medium storing a computer program, characterized in that, when the computer program is executed by a processor, the real-time audio processing method according to claim 1 is implemented.
PCT/CN2022/076304 2021-06-29 2022-02-15 Browser-based real-time audio processing method and system, and storage device WO2023273360A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110725948.3 2021-06-29
CN202110725948.3A CN113434110A (en) 2021-06-29 2021-06-29 Real-time audio processing method, system and storage device based on browser

Publications (1)

Publication Number Publication Date
WO2023273360A1 true WO2023273360A1 (en) 2023-01-05

Family

ID=77757577

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/076304 WO2023273360A1 (en) 2021-06-29 2022-02-15 Browser-based real-time audio processing method and system, and storage device

Country Status (2)

Country Link
CN (1) CN113434110A (en)
WO (1) WO2023273360A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113434110A (en) * 2021-06-29 2021-09-24 稿定(厦门)科技有限公司 Real-time audio processing method, system and storage device based on browser
CN114860328B (en) * 2022-07-07 2022-12-09 广东睿江云计算股份有限公司 Method for automatically detecting media equipment access in real time by front-end web page

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109547844A (en) * 2018-12-19 2019-03-29 网宿科技股份有限公司 Audio/video pushing method and plug-flow client based on WebRTC agreement
CN110198479A (en) * 2019-05-24 2019-09-03 浪潮软件集团有限公司 A kind of browser audio/video decoding playback method based on webassembly
CN111641838A (en) * 2020-05-13 2020-09-08 深圳市商汤科技有限公司 Browser video playing method and device and computer storage medium
CN112291628A (en) * 2020-11-25 2021-01-29 杭州视洞科技有限公司 Multithreading video decoding playing method based on web browser
US20210182123A1 (en) * 2019-11-01 2021-06-17 Grass Valley Limited System and method for constructing filter graph-based media processing pipelines in a browser
CN113434110A (en) * 2021-06-29 2021-09-24 稿定(厦门)科技有限公司 Real-time audio processing method, system and storage device based on browser

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9584564B2 (en) * 2007-12-21 2017-02-28 Brighttalk Ltd. Systems and methods for integrating live audio communication in a live web event

Also Published As

Publication number Publication date
CN113434110A (en) 2021-09-24


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22831193

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE