TW202143747A - System and method for generating a 3d spatial sound field

Info

Publication number: TW202143747A
Application number: TW109115142A
Authority: TW
Inventors: 丁泊淳
Original assignee: 台灣聲研音響有限公司
Priority date: 2020-05-05
Filing date: 2020-05-05
Publication date: 2021-11-16
Also published as: TWI752487B

Abstract

A system for generating a 3D spatial sound field includes: a space/ambient sound wave input unit that receives sound waves and converts and separates them into digital signals; a binaural signal processing unit that compares the converted digital signals and separates them again using binaural signal processing techniques; a digital operation analysis unit that receives the converted digital signals and performs 3D calculation and analysis, real-time computation, distribution, processing, digital-to-analog conversion, and analog-to-digital conversion; and a three-dimensional signal calculation and comparison unit that optimizes the signals output by the digital operation analysis unit and then distributes the processed signal to a sound signal playback system to generate a reconstructed sound field.

Description

Three-dimensional sound field generation system and method

The present invention relates to the technical field of three-dimensional sound field generation and, more specifically, to a three-dimensional sound field generation system and method.

With the development and application of technology, market demand for audio products and their applications grows year by year. To meet constantly changing consumer needs, new application technologies must keep emerging and incorporate different technical conditions that offer new listening experiences. To achieve this, the coupling between electro-acoustic equipment and the space it operates in is an important topic. Design should focus not only on the system characteristics and capabilities of an audio product but, more importantly, on the interaction between the system and the acoustics of the actual environment, so that the product's performance comes closer to what a person's two ears actually hear at different positions and angles in that environment.

Three-dimensional sound field technologies (3D spatial sound field technologies) improve the sound field environment and the dynamic performance of sound, and in an actual listening situation they bring the listener's perception close to that of the real spatial sound field. In recent years the rapid growth of 360-degree video and VR/AR/MR has renewed interest in immersive sound fields for on-site recording, format analysis/quantization, processing, and reconstruction, with the aim of achieving natural sound on headphones or on various loudspeaker layouts. One key is to compute the direct sound and the first-, second-, and third-order reflections that the ear should hear from a sound source in a wide 3D space and, as the distance and direction between the listener's ears and the source change, to present the front-end results on the back-end listening system in real time.

A three-dimensional sound field is essentially a more scientific way of describing an immersive sound field. Under some definitions the two terms are interchangeable, but "three-dimensional sound field" is the more precise one. In three-dimensional sound/Ambisonics, a sound can be positioned anywhere in a sphere centered on the listener by encoding it on four channels: three channels carry the front/back, left/right, and up/down directional information, and the fourth is an "omnidirectional" channel. Ambisonic material must be decoded before it can be played on a particular playback system, which may be a DTS:X home theater system, a conventional 5.1-channel surround setup, ordinary headphones, or a multi-directional loudspeaker array. Immersive audio technology is also independent of the loudspeaker layout, which means that a mix needs only simple adjustments to be played back on different sound systems. The same mix can be played back on a cinema sound system, on a network of smart speakers, or even on ordinary headphones while retaining its spatial information.
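The four-channel encoding described above can be sketched as follows. This is a minimal illustration, assuming the traditional first-order B-format convention (W scaled by 1/sqrt(2), azimuth measured counter-clockwise from the front); the function name `encode_b_format` is hypothetical and is not taken from the patent.

```python
import math

def encode_b_format(sample, azimuth_deg, elevation_deg):
    """Encode one mono sample into first-order B-format (W, X, Y, Z).

    Assumed convention: W carries a 1/sqrt(2) gain, azimuth is measured
    counter-clockwise from straight ahead, elevation upward from horizontal.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    w = sample / math.sqrt(2.0)               # omnidirectional channel
    x = sample * math.cos(az) * math.cos(el)  # front/back
    y = sample * math.sin(az) * math.cos(el)  # left/right
    z = sample * math.sin(el)                 # up/down
    return w, x, y, z

# A unit-amplitude source directly to the listener's left, at ear height.
w, x, y, z = encode_b_format(1.0, 90.0, 0.0)
```

For this source nearly all directional energy lands in the left/right channel Y, while X and Z are close to zero.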

High-end audio products on the market are currently expensive, and there is still considerable room for high-quality products at reasonable prices. The present invention therefore proposes an innovative three-dimensional sound field generation system and method that can be applied to miniaturized audio products. Using a technique that automatically adjusts the optimal listening range, the proposed design forms a sound field with multiple loudspeakers regardless of the acoustic conditions of the environment and establishes complete sound localization. By automatically adjusting for the ambient sound field, it simulates the auditory impression of the original sound field so that listeners obtain a near-optimal listening experience in different environments.

The present invention provides a three-dimensional sound field generation system comprising: a space/ambient sound wave input unit for receiving sound waves in a space and its environment and converting and separating them into digital-format signals; a binaural signal processing unit for receiving the converted and separated digital signals and comparing and re-separating them according to binaural signal processing techniques; a digital operation analysis unit comprising a digital audio signal processor and a digital signal processor, wherein the digital audio signal processor receives the converted and separated digital signals and performs three-dimensional calculation and analysis on them, and the digital signal processor receives the signals input from the binaural signal processing unit and the digital audio signal processor and performs real-time computation, distribution, processing, digital-to-analog conversion, and analog-to-digital conversion; and a three-dimensional signal calculation and comparison unit for comparing the signals output by the digital operation analysis unit against a calculated optimized signal and adjusting them automatically, and then distributing the compared, processed signals to a sound signal playback system to generate a reconstructed sound field.

The above three-dimensional sound field generation system further comprises a live sound source recording calculation and comparison unit, which records the sound sources or signals of a live performance through a live three-dimensional recording device, compares the recorded signals with the above calculated optimized signal and adjusts them automatically, and then distributes the compared, processed signals to the above sound signal playback system to generate a reconstructed sound field.

The above binaural signal processing unit compares and re-separates the input digital signals according to binaural signal processing techniques such as the interaural level difference (ILD), the interaural time difference (ITD), and the head-related transfer function (HRTF), converting a complex spatial mix into left- and right-channel sound signals.
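To make the two interaural cues concrete, the sketch below estimates them from a pair of left/right sample lists: the time difference from the lag that maximizes the cross-correlation, and the level difference from the RMS ratio in decibels. The function names and the synthetic click signal are assumptions for illustration only; the patent does not specify how its binaural unit computes these quantities.

```python
import math

def rms(samples):
    """Root-mean-square value of a list of samples."""
    return math.sqrt(sum(v * v for v in samples) / len(samples))

def estimate_itd_ild(left, right, sample_rate, max_lag):
    """Return (itd_seconds, ild_db) for two equal-length sample lists.

    itd > 0 means the right channel lags the left (source nearer the left ear).
    ild > 0 means the left channel is louder than the right.
    """
    def xcorr(lag):
        # Sum of left[n] * right[n + lag] over the valid overlap.
        return sum(left[n] * right[n + lag]
                   for n in range(len(left))
                   if 0 <= n + lag < len(right))

    best_lag = max(range(-max_lag, max_lag + 1), key=xcorr)
    ild_db = 20.0 * math.log10(rms(left) / rms(right))
    return best_lag / sample_rate, ild_db

# Synthetic check: a click reaching the right ear 3 samples later at half amplitude.
left = [0.0] * 32
right = [0.0] * 32
left[10] = 1.0
right[13] = 0.5
itd, ild = estimate_itd_ild(left, right, sample_rate=48000, max_lag=8)
```

Here the estimated time difference corresponds to the 3-sample delay and the level difference to the factor-of-two amplitude ratio (about 6 dB).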

The above sound signal playback system includes a three-dimensional loudspeaker array.

The present invention provides a three-dimensional sound field generation method comprising: receiving sound waves in a space and its environment through a sound wave input unit and converting them into digital signals; performing binaural signal processing on the digital signals with a binaural signal processing unit; performing three-dimensional calculation with a digital operation analysis unit; and, with a three-dimensional signal calculation and comparison unit, comparing the digital signals against an optimized signal and then distributing them to a sound signal playback system to generate a reconstructed sound field.

100: System architecture for generating a three-dimensional sound field

10: Sound waves

101: Space/ambient sound wave input unit

101a: Array microphone

101b: WXYZ

101c: Decoder

102: Binaural signal processing unit

103: Digital operation analysis unit

103a: Digital audio signal processor (DAFX)

103b: Digital signal processor (DSP)

104: Three-dimensional signal calculation and comparison unit

105: Live sound source recording calculation and comparison unit

105a: Live three-dimensional recording device

105b: 3D processor

106: Distributor

107: Playback system (3D array speaker)

109: 3D headphones

111: E-sports system

113: 3D personal consumer products

115: 3D home theater

117: Live performance

Figure 1 is a schematic diagram of off-site playback, a feature of the three-dimensional sound field generation system of the present invention.

Figure 2 is a schematic diagram of the three-dimensional sound field generation system of the present invention using an AI data analysis function to assist sound field reproduction.

Figure 3 shows how the three-dimensional sound field generation system of the present invention increases the coverage of the sound field and the control of its directivity.

Figure 4 is a schematic diagram of the three-dimensional sound field generation system of the present invention using array arrangements of multiple drivers, groups of drivers, or loudspeakers to increase the coverage of the sound field and the control of its directivity.

Figure 5 is a functional block diagram of the system architecture of the present invention for generating a three-dimensional sound field.

This specification describes only the essential elements of the present invention and serves only to illustrate possible embodiments of it. The description should not limit the scope of rights of the technical substance claimed by the present invention.

It should also be understood that what is described here is only a possible embodiment of the present invention. In practicing or testing the invention, any method, process, function, or means similar or equivalent to the devices or systems described in this specification may be used. Unless otherwise defined, all technical and scientific terms used in this specification have the meanings commonly understood by those skilled in the art to which the invention belongs. The methods, processes, and related materials described here are examples only; in actual use of the invention, any methods and means similar or equivalent to the methods and materials described in this specification may be used.

Furthermore, where this specification refers to "a given number or more" or "a given number or less", the range includes the number itself. It should also be understood that this specification discloses certain methods and processes for performing the disclosed functions, that various structures related to the disclosed structures can perform the same functions, and that such structures generally achieve the same results.

Preferred embodiments of the present invention are described in detail below to fully illustrate its features, spirit, and advantages.

Three-dimensional sound field technologies (3D spatial sound technologies) are established and applied on the following principle. Using the numerical analysis provided by the finite element method (FEM), the space is divided, under specific boundary conditions, into a finite number of small elements (small volumes). The wave equation and the Helmholtz equation

$\nabla^2 f = -k^2 f$ (2)

serve as the basic calculations for obtaining the characteristics of the sound field and how it propagates in the space. A boundary element method (BEM) calculation is then added: the Helmholtz equation is converted into a boundary integral equation for further characteristic analysis, so that the pressure and velocity of the sound wave can be solved on the boundary. After that, the pressure, frequency-domain content, and phase of the sound field can be computed at any point assumed in the space or predicted and evaluated by the program.

Because each frequency range produces different acoustic phenomena, such as standing waves, depending on the geometry, volume, and shape of the space, what a listener hears changes with the listener's position, so the geometry of the space must be taken into account in the calculation. In addition, sound in a 3D space is altered by various behaviors (for example, the decay of standing waves and the reverberation times of the various waves produced by standing waves in a diffuse field). These changes affect the listening impression and give rise to the reverberation time, and this influence must also be included in the three-dimensional sound field algorithm.
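As a worked step connecting the wave equation to the Helmholtz equation (2), the following sketch assumes the standard linear-acoustics form of the wave equation and a time-harmonic sound pressure. The symbols $p$, $c$, $\omega$, $\nu$ and the room dimensions $L_x, L_y, L_z$ are introduced here for illustration; the source gives only equation (2) in terms of $f$ and $k$.

```latex
% Standard acoustic wave equation for pressure p(x, t) with sound speed c (assumed form)
\nabla^2 p = \frac{1}{c^2}\,\frac{\partial^2 p}{\partial t^2}

% Time-harmonic ansatz: substituting it removes the time dependence
p(\mathbf{x}, t) = f(\mathbf{x})\,e^{i\omega t}
\;\Longrightarrow\;
\nabla^2 f = -\frac{\omega^2}{c^2}\,f = -k^2 f,
\qquad k = \frac{\omega}{c}

% Standing-wave (mode) frequencies of a rigid-walled rectangular room,
% with non-negative integers n_x, n_y, n_z
\nu_{n_x n_y n_z} = \frac{c}{2}
\sqrt{\left(\frac{n_x}{L_x}\right)^2 + \left(\frac{n_y}{L_y}\right)^2 + \left(\frac{n_z}{L_z}\right)^2}
```

The last line is the textbook special case that shows why standing waves depend on the geometry and dimensions of the space; rooms of arbitrary shape need the numerical FEM/BEM treatment described in the text.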

After the sound of the venue has been captured and processed by software (taking the spatial geometry into account and analyzing the reverberation times of the various waves), which includes analyzing/quantizing and processing it and then converting and separating it into a digital format, wave field synthesis (WFS) is used to synthesize the sound waves at the listening position. For establishing a 3D sound field it is important that the synthesized waves have the same phase, energy, and sound pressure there. From the standpoint of the present technique, the factors considered include: increasing the coverage of the sound field and controlling its directivity; the positioning and calculation that relate scene-playback localization, the software, and the number of loudspeakers; and the binaural effect, which influences and controls the virtual and actual sound source positions that the listener perceives.
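The requirement that waves arrive at the listening position with the same phase and sound pressure can be illustrated with a simple delay-and-gain alignment. This is a deliberately reduced sketch and not full wave field synthesis: it assumes free-field point sources with 1/r spreading, a fixed speed of sound, and a hypothetical helper `align_speakers`.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at about 20 degrees C

def align_speakers(speakers, listener):
    """Return a (delay_seconds, gain) pair for each speaker position.

    Nearer speakers are delayed and attenuated so that every wavefront
    reaches the listener at the same time and with the same pressure
    (assuming free-field 1/r spreading from point sources).
    """
    distances = [math.dist(s, listener) for s in speakers]
    farthest = max(distances)
    return [((farthest - d) / SPEED_OF_SOUND, d / farthest) for d in distances]

# Three speakers (x, y, z in metres) and one listening position.
speakers = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (0.0, 3.0, 0.0)]
listener = (0.0, 0.0, 2.0)
settings = align_speakers(speakers, listener)
```

The farthest speaker receives zero delay and unit gain; every other speaker is held back and turned down relative to it.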

To illustrate the analysis of three-dimensional sound field technologies with a venue, consider a concert hall. Sound is emitted through a loudspeaker system, and the sound waves undergo diffusion, reflection, refraction, diffraction, and similar behaviors in the space, which change how the sound is perceived there. The technique is based on receiving the sound with a sound receiving system, such as microphones, sensors, or signal receivers, converting it into electrical signals with processing equipment, and then analyzing those electrical signals with software or computational formulas.

The signals analyzed by software or formulas undergo preliminary processing by this technique, namely automatic detection, which quickly separates and determines information about the sound in the space (for example, a concert hall), such as: loudness, sound pressure level and frequency, spatial impulse response, direction of reflected energy, critical distance, reverberation time, sound energy ratio, echo, early decay time (EDT), D50 (the ratio of the early sound energy arriving within 50 ms to the total sound energy), TS (the centre time, i.e. the centre of gravity of the squared impulse response), C80 (clarity), ITDG (initial time delay gap), and so on. The information obtained can be stored in a digital computation format, such as C code or any other software program that can process sound signals. The same information can also be collected at other venues (for example, concert halls, national theaters, and large outdoor concert venues). Once stored as digital signals, the spatial information collected from different performance venues can be built into a database that serves as stored data for artificial intelligence (AI).
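Several of the listed parameters follow directly from a measured impulse response. The sketch below computes D50, C80, and the centre time TS using their usual room-acoustics definitions; the function name, the sample rate, and the synthetic exponentially decaying response are assumptions for illustration, and the patent does not disclose its own detection code.

```python
import math

def room_parameters(impulse_response, sample_rate):
    """Compute D50, C80 (dB) and centre time TS (s) from an impulse response.

    The response is assumed to start at the arrival of the direct sound.
    """
    energy = [h * h for h in impulse_response]
    total = sum(energy)
    n50 = round(0.050 * sample_rate)  # samples in the first 50 ms
    n80 = round(0.080 * sample_rate)  # samples in the first 80 ms
    early80 = sum(energy[:n80])
    d50 = sum(energy[:n50]) / total                       # early-to-total ratio
    c80 = 10.0 * math.log10(early80 / (total - early80))  # early-to-late, dB
    ts = sum(n / sample_rate * e for n, e in enumerate(energy)) / total
    return d50, c80, ts

# Synthetic response: exponential decay with a reverberation time of 0.5 s.
fs = 8000
rt60 = 0.5
decay = 3.0 * math.log(10.0) / rt60  # amplitude decay rate giving -60 dB at rt60
h = [math.exp(-decay * n / fs) for n in range(2 * fs)]
d50, c80, ts = room_parameters(h, fs)
```

For this idealized decay D50 is roughly 0.75, C80 roughly 9.1 dB, and TS roughly 36 ms; real venue responses would be measured rather than synthesized.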

In application, the above automatic detection is performed by a digital signal processor (DSP) and applied to the developed products. With this automatic sound field detection and analysis function, the venue currently being detected can be analyzed in real time. Once the venue information has been obtained, the sound field can be adjusted automatically; we call this off-site (relocated) playback. Figure 1 shows an embodiment of off-site playback. The concept is that a sound signal playback device or product with the automatic detection technique developed in the present invention can reproduce the sound field of space A in space B, or the sound field of space B in space A, depending on which sound field the user prefers. For example, the sound field of a concert hall can be reproduced in a living room at home, or the sound field of a home theater can be reproduced inside a self-driving car. In a preferred embodiment, sound field reproduction can also use an AI data analysis function. Referring to Figure 2, the sound field conditions stored from another location (for example, a concert hall, an outdoor concert venue, or a national theater), such as that location's loudness, sound pressure level and frequency, spatial impulse response, direction of reflected energy, critical distance, reverberation time, sound energy ratio, echo, early decay time (EDT), D50 (the ratio of the early sound energy arriving within 50 ms to the total sound energy), TS (the centre time of the squared impulse response), C80 (clarity), ITDG (initial time delay gap), and other parameters, are reproduced in the space where the listener currently is (for example, a home theater or the interior of a self-driving car).
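One common way to impose a stored venue response on a signal is convolution with that venue's impulse response. The sketch below shows the idea under that assumption; the patent does not state that convolution is the mechanism it uses, and `venue_response` is a made-up six-sample response standing in for "space A".

```python
def convolve(signal, impulse_response):
    """Direct-form linear convolution of two sample lists."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out

# Hypothetical stored response of "space A": direct sound plus two reflections.
venue_response = [1.0, 0.0, 0.0, 0.5, 0.0, 0.25]
dry_signal = [1.0, -1.0]
wet_signal = convolve(dry_signal, venue_response)
```

The output contains the dry signal followed by copies at half and quarter amplitude, one per reflection. A practical system would use FFT-based convolution and would also have to compensate for the acoustics of the playback room itself.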

Regarding how to increase the coverage of the sound field and control its directivity, refer to Figure 3. Figure 3(a) shows that with a single sound source the sound waves do not interfere in the space. When two or more sound sources (such as loudspeakers) are present, as in Figures 3(b)-(c), the waves they produce interfere with one another according to the spacing between the sources and the frequency (Figure 3(b) shows six low-frequency sources; Figure 3(c) shows six high-frequency sources), producing sound beams (the petal-shaped lobes in Figures 3(b)-(c)) and directivity of the sound field. An array arrangement (combination) of multiple drivers or groups of drivers (covering high and low frequencies) or of loudspeakers, such as the array combinations shown in Figures 4(a)-(c), can therefore increase the coverage of the sound field and the control of its directivity (Figures 4(d)-(e)). Because the displacement of a loudspeaker's voice coil and the range of its distortion are low under high sound pressure conditions, the structure differs from that of most current smart speakers: in addition to arranging the bass section as an "array", the present invention relies on "structural shape conditions" to enhance bass and diffusion performance.
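The dependence of the interference pattern on spacing and frequency can be reproduced with the far-field response of a line of identical in-phase sources. The sketch below is a minimal model, assuming point sources on a straight line and a fixed speed of sound; the spacing, frequencies, and function name are illustrative values, not figures from the patent.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed

def array_response(num_sources, spacing_m, frequency_hz, angle_deg):
    """Normalised far-field magnitude of a line of identical in-phase sources.

    angle_deg is measured from the array's broadside (perpendicular) direction.
    """
    k = 2.0 * math.pi * frequency_hz / SPEED_OF_SOUND  # wavenumber
    phase_step = k * spacing_m * math.sin(math.radians(angle_deg))
    total = sum(cmath.exp(1j * n * phase_step) for n in range(num_sources))
    return abs(total) / num_sources

# Six sources 0.1 m apart, evaluated 30 degrees off axis and on axis.
low = array_response(6, 0.1, 100.0, 30.0)     # wavelength much longer than the array
high = array_response(6, 0.1, 3430.0, 30.0)   # wavelength equal to the spacing
on_axis = array_response(6, 0.1, 3430.0, 0.0)
```

At the low frequency the array is almost omnidirectional, while at the high frequency the same geometry produces a null 30 degrees off axis and a full-strength lobe on axis, which is the lobed behavior the figures describe.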

As for how the technique of automatically adjusting the optimal listening range is realized: regardless of the acoustic conditions of the environment, the design forms a sound field with multiple loudspeakers and establishes complete sound localization; by automatically adjusting for the ambient sound field, it simulates the auditory impression of the original sound field so that listeners obtain a near-optimal listening experience in different environments. Based on the above analysis of three-dimensional sound field technologies, the present invention proposes a system architecture 100 for generating a three-dimensional sound field, shown in Figure 5. The architecture includes a space/ambient sound wave input unit 101, a binaural signal processing unit 102, a digital operation analysis unit 103, a three-dimensional signal calculation and comparison unit 104, and a live sound source recording calculation and comparison unit 105.

The space/ambient sound wave input unit 101 includes an array microphone 101a for recording or receiving the sound waves 10 in the space and its environment and converting and separating them into digital form. The digital audio signal processor (DAFX) 103a in the digital operation analysis unit 103 performs 3D calculation and analysis on the digital signals, and the analyzed signals are fed to the binaural signal processing unit 102 and to the digital signal processor (DSP) 103b in the digital operation analysis unit 103.

The signal fed to the binaural signal processing unit 102 originates as follows. The recorded or received sound waves 10 are first processed by the WXYZ technique 101b, which separates and converts them into signals resembling the way human ears perceive space (B-format signals). The decoder 101c processes these into a two-channel (L+R) signal, and the signal X(n) is then transmitted through interfaces (ALL PASS1, 2) to another processing stage. That stage processes the recorded signal further according to theories and techniques such as the interaural level difference (ILD), the interaural time difference (ITD), and the head-related transfer function (HRTF). This processing addresses how a person recognizes and localizes sound in a space and environment by hearing, relying mainly on the ear's perception of sound pressure differences, energy intensity, frequency content (spectrum), and time. For a 3D soundscape, this is an important step in reproducing a sound field or building a 3D audio system: according to binaural signal processing techniques such as ILD, ITD, and HRTF, the digital sound signals ($X(n)$, $\cos(\omega_m n)$) are compared and re-separated, converting a complex spatial mix into left- and right-channel sound signals ($y_L(n)$, $y_R(n)$).

The DSP 103b performs real-time computation, distribution, processing, analog-to-digital conversion (ADC), and digital-to-analog conversion (DAC) on the signals it receives, including those input from the binaural signal processing unit 102 and from the DAFX 103a. The three-dimensional signal calculation and comparison unit 104 then uses the 3D processor 105b to compare the signal input from the DSP 103b against the calculated optimized signal and to adjust it automatically, and the distributor 106 distributes the compared, processed signal to the playback system (3D array speaker) 107 for playback or outputs it to other target devices (for example, 3D headphones 109, an e-sports system 111, 3D personal consumer products 113, a 3D home theater 115, or a live performance 117).

The live sound source recording calculation and comparison unit 105 records the sound sources or signals of a live performance or space through the live three-dimensional recording device 105a, uses the 3D processor 105b to compare the recorded signals with the calculated optimized signal and adjust them automatically, and then distributes the compared, processed signals to the playback system (3D array speaker) 107 for playback or outputs them to other target devices.
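The step from B-format (WXYZ) signals to a two-channel L+R signal can be sketched with a pair of virtual cardioid microphones, a common textbook decode. This is only an illustration: the patent does not describe the internals of decoder 101c, and the convention used here (W scaled by 1/sqrt(2), positive Y pointing left, microphones aimed at plus and minus 90 degrees) is an assumption.

```python
import math

def decode_b_format_to_stereo(w, x, y, mic_angle_deg=90.0):
    """Decode horizontal B-format (W, X, Y) to left/right with virtual cardioids.

    Assumes W carries the traditional 1/sqrt(2) gain and that positive
    azimuth (and positive Y) points to the listener's left.
    """
    def virtual_cardioid(angle_deg):
        a = math.radians(angle_deg)
        omni = math.sqrt(2.0) * w                    # undo the W gain
        figure8 = x * math.cos(a) + y * math.sin(a)  # dipole aimed at angle a
        return 0.5 * omni + 0.5 * figure8

    return virtual_cardioid(+mic_angle_deg), virtual_cardioid(-mic_angle_deg)

# A unit-amplitude source directly to the left: W = 1/sqrt(2), X = 0, Y = 1.
left, right = decode_b_format_to_stereo(1.0 / math.sqrt(2.0), 0.0, 1.0)
```

A source on the left appears at full level in the left channel and is cancelled in the right channel. A binaural decode using HRTFs, as the text goes on to describe, would replace these virtual microphones with head-related filters.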

In summary, the three-dimensional sound field generation system developed in the present invention has the following features:

a). Self-adjusting sound field: automatic detection and real-time digital calculation. When the calculation is complete, the system automatically provides the user with optimized adjustment data and instructions, for example: (1) optimization parameters such as the spacing between loudspeakers, the spacing along the x, y, and z axes, and the layout range of the loudspeakers in the space; (2) automatic adjustment of sound quality, including EQ, delay, and phase; (3) automatic adjustment of volume, based on the automatically detected distances between the loudspeakers and between the loudspeakers and the listener; (4) adjustment of sound field condition parameters: for several concert halls, theaters, and similar venues at home and abroad that are rated highly for their acoustics, the working group has digitized the sound field condition parameters and provides them to users as effect adjustments, so that the user's sound field approaches the impression of those concert halls and theaters.

b). Structure of the loudspeaker (speaker box) for personal use: (1) the treble, midrange and bass drivers are arranged in an array; (2) the loudspeakers can be used combined or separately; (3) the main unit may carry a bass driver for lower-frequency effects; (4) each separate loudspeaker contains power amplification components, making it an active speaker; (5) the main unit and the separate speakers are structured to improve flow guidance, so that the sound flows smoothly and the mid-bass is reinforced.

c). Loudspeaker (speaker box) functions: (1) signals between loudspeakers are distributed wirelessly or over Bluetooth; (2) the distributed signal can be resolved into L+R+SUB, L+R+C+SUB and immersive configurations; (3) after processing, the signal can be distributed to the individual loudspeakers, up to 100 or more; (4) with two or more loudspeakers, a further function is directivity control, which aims the sound in the desired direction according to the environment, the user's position and so on.
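The automatic delay and volume adjustment in item a) can be sketched as a distance-based alignment. This is a minimal illustration under stated assumptions, not the method disclosed in the patent: it assumes free-field propagation, a speed of sound of 343 m/s, and speaker-to-listener distances that are already known. The name `align_speakers` is hypothetical.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumption: dry air at about 20 degrees C

def align_speakers(distances_m):
    """Return one (delay_s, gain_db) pair per speaker.

    Nearer speakers are delayed so every arrival coincides with that of the
    farthest speaker. Gain follows the free-field inverse-distance law
    (about 6 dB per doubling of distance), with the farthest speaker as the
    0 dB reference, so nearer speakers are attenuated.
    """
    farthest = max(distances_m)
    result = []
    for d in distances_m:
        delay_s = (farthest - d) / SPEED_OF_SOUND_M_S
        gain_db = 20.0 * math.log10(d / farthest)
        result.append((delay_s, gain_db))
    return result
```

With speakers at 1.715 m and 3.43 m from the listener, the nearer one is delayed by about 5 ms and attenuated by about 6 dB. In a real room, reflections mean that measured levels would replace the inverse-distance estimate.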

Based on the above technical features of the present invention, a personal audio playback device can be developed. In addition to power, effect and miniature size, the design includes a sound field adjustment function. Consumers place the micro speakers wherever they prefer, and within a workable range (the listening range) the micro speakers deliver a near-optimal reference result. By taking sound field factors into account, the device can evaluate and calculate the sound field automatically, so the listener can place the speakers freely and still get high-quality sound. With conventional audio equipment, the listening impression varies with the environment. For example, a user may set up a sound system in one fixed environment and use measurement instruments to tune the listening impression and sound field to personal taste. If the equipment is then moved to another room or space, the impression will change, because the acoustic conditions in the space change and so does the sound that reaches the ears. The personal audio playback device (micro speaker system) is designed as a home-theater-quality miniature product that consumers can carry with them like earphones and that can reproduce the sound in a different place (off-site playback). Product development therefore focuses on studying spatial factors and micro-speaker sound field adjustment, using fast DSP calculation and real-time adjustment to achieve sound pressure compensation, sound quality (EQ) compensation, time processing for the relevant spacings and so on, to obtain an impression close to home theater quality.

Because different environments produce different listening impressions, the personal audio playback device (micro speaker system) also includes a sound field adjustment function that automatically optimizes the listening range, so that near-optimal listening is possible in different environments. In addition, using Wi-Fi, Bluetooth and expansion technology, the main unit can be separated from the speakers, and consumers can optionally process the signal separately to expand the system with multi-track speakers. When positioning the main unit and the speakers, consumers can also use the automatic sound field adjustment function to compensate automatically for sound changes caused by different placements. The system can be expanded from 1 to 100 speakers.

With the rapid development of DSP (digital signal processor) computing technology, related products have become more convenient to use, and spatial impression adjustment, reverberation, overall sound image, equalization and so on can all be further optimized.

The sound field adjustment function that "automatically optimizes the listening range" is designed specifically for this personal audio playback device (micro speaker system) and allows near-optimal listening in different environments. Because of various environmental factors, different placements can change the sound and the listening impression. For users who care about sound quality, the automatic adjustment function assists with optimization. It further adds a transfer function to lock the sound onto a direction and incorporates a recommended signal based on optimal indicators. Besides adjusting sound quality, the sound field adjustment function also provides directional beam control (Beam).
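The patent does not specify how the beam is steered. One common technique for directivity control with several loudspeakers is delay-and-sum steering, sketched below as an assumption rather than as the disclosed method. The sketch assumes a uniform line array, far-field listening and a speed of sound of 343 m/s. The name `steering_delays` is hypothetical.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumption: dry air at about 20 degrees C

def steering_delays(num_speakers, spacing_m, angle_deg):
    """Per-speaker delays (seconds) for delay-and-sum steering of a line array.

    angle_deg is measured from broadside (straight ahead of the array).
    Positive angles steer toward the higher-numbered end of the array.
    Delays are shifted so that none is negative.
    """
    step = spacing_m * math.sin(math.radians(angle_deg)) / SPEED_OF_SOUND_M_S
    raw = [n * step for n in range(num_speakers)]
    offset = min(raw)
    return [t - offset for t in raw]
```

For four speakers spaced 0.1 m apart and a 30 degree target, successive speakers are delayed by about 0.146 ms each. Steering to -30 degrees reverses the order of the delays, and 0 degrees gives no delay at all.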

The detailed description of the specific embodiments above is intended to describe the features and spirit of the present invention more clearly, not to limit the scope of the invention to the preferred embodiments disclosed. On the contrary, the intention is to cover various changes and equivalent arrangements within the scope of the claims of the present invention. The scope of the claims should therefore be given the broadest interpretation consistent with the above description, so that it covers all possible changes and equivalent arrangements.

10: Sound waves

101: Space/ambient sound wave input unit

101a: Array microphone

101b: WXYZ

101c: Decoder

102: Binaural signal processing unit

103: Digital operation analysis unit

105a: Live 3D recording device

105b: 3D processor

106: Distributor

107: Playback system (3D array speaker)

109: 3D headphones

111: E-sports system

113: 3D personal consumer products

115: 3D home theater

117: Live performance

Claims

A three-dimensional sound field generation system, comprising:

A space/ambient sound wave input unit for receiving sound waves in a space and environment and converting them into digital signals;

A binaural signal processing unit, coupled to the space/ambient sound wave input unit, for receiving the converted and separated digital signals and comparing and separating them again using binaural signal processing techniques;

A digital operation analysis unit comprising a digital audio signal processor and a digital signal processor and coupled to the space/ambient sound wave input unit and the binaural signal processing unit, wherein the digital audio signal processor receives the converted and separated digital signals, performs three-dimensional calculation and analysis on them, and feeds the result to the binaural signal processing unit and the digital signal processor, and wherein the digital signal processor receives the signals input by the binaural signal processing unit and the digital audio signal processor and performs real-time calculation, distribution, processing, digital-to-analog conversion and analog-to-digital conversion; and

A three-dimensional signal calculation and comparison unit, coupled to the digital operation analysis unit, for performing calculation optimization, signal comparison and automatic adjustment on the signal output by the digital operation analysis unit, and then distributing the processed signal to a sound signal playback system to generate a reconstructed sound field.

The three-dimensional sound field generation system of claim 1, further comprising a live audio source recording calculation and comparison unit for recording the audio source or signal of a live performance through a live three-dimensional recording device, comparing the recorded signal with the computed optimization signal and adjusting it automatically, and then distributing the processed signal to the sound signal playback system to generate a reconstructed sound field.

The three-dimensional sound field generation system of claim 1, wherein the binaural signal processing unit compares and separates the input digital signals again, based on binaural signal processing techniques such as the interaural level difference (ILD), the interaural time difference (ITD) and the head-related transfer function (HRTF), to convert the complex spatial mix into left- and right-channel sound signals.

The three-dimensional sound field generation system of claim 1, wherein the sound signal playback system includes a three-dimensional speaker array.

A three-dimensional sound field generation system, comprising:

A binaural signal processing unit, coupled to the space/ambient sound wave input unit, for performing binaural signal processing on the converted and separated digital signals;

A digital operation analysis unit, coupled to the space/ambient sound wave input unit and the binaural signal processing unit, for performing three-dimensional calculation and digital/analog conversion; and

A three-dimensional signal calculation and comparison unit, coupled to the digital operation analysis unit, for performing optimized signal comparison and then distributing the signal to a sound signal playback system to generate a reconstructed sound field.

The three-dimensional sound field generation system of claim 5, further comprising a live sound source recording calculation and comparison unit for recording, comparing and adjusting the sound source or signal of a live performance through a live three-dimensional recording device.

The three-dimensional sound field generation system of claim 5, wherein the binaural signal processing unit performs processing based on the interaural level difference (ILD), the interaural time difference (ITD) and the head-related transfer function (HRTF).

The three-dimensional sound field generation system of claim 5, wherein the sound signal playback system includes a three-dimensional speaker array.

A method for generating a three-dimensional sound field, comprising:

Receiving sound waves in a space and environment through a sound wave input unit and converting them into digital signals;

Performing binaural signal processing on the digital signals with a binaural signal processing unit;

Performing three-dimensional calculation with a digital operation analysis unit; and

Performing optimized signal comparison on the digital signals with a three-dimensional signal calculation and comparison unit, and then distributing them to a sound signal playback system to generate a reconstructed sound field.

The three-dimensional sound field generation method of claim 9, further comprising using a live sound source recording calculation and comparison unit to record, compare and adjust the sound source or signal of a live performance through a live three-dimensional recording device.

The three-dimensional sound field generation method of claim 9, wherein the binaural signal processing includes processing based on the interaural level difference (ILD), the interaural time difference (ITD) and the head-related transfer function (HRTF).