TWI807504B - Method, device and storage medium for audio processing of virtual meeting room - Google Patents
- Publication number
- TWI807504B
- Authority
- TW
- Taiwan
- Prior art keywords
- seat
- voiceprint information
- concentration
- participant
- audio processing
Landscapes
- Stereophonic System (AREA)
- Input Circuits Of Receivers And Coupling Of Receivers And Audio Equipment (AREA)
- Reverberation, Karaoke And Other Acoustics (AREA)
Description
The present application relates to the technical field of virtual meeting rooms, and in particular to an audio processing method, device, and storage medium for a virtual meeting room.
A virtual meeting room (VMR) is an efficient and convenient online meeting room. Using mobile terminal products such as mobile phones and computers, users can quickly set up virtual meetings with other users, free of the constraints of time and place, and experience immersive meeting communication. Current virtual meeting rooms enlarge the image of the speaker but make it difficult to distinguish the voices of different speakers. When multiple speakers talk at the same time in a virtual meeting room, it is difficult for a user to tell what each speaker is saying.
The present application provides an audio processing method, device, and storage medium for a virtual meeting room, so as to make each speaker's voice easier to recognize.
A first aspect of the present application provides an audio processing method for a virtual meeting room, including: setting the number of mesh vertices according to the seat distribution of the virtual meeting room; acquiring first voiceprint information of a speaker, where the first voiceprint information includes the frequency, amplitude, and phase difference of a voice signal; adjusting the frequency or amplitude of the first voiceprint information according to the number of mesh vertices to obtain second voiceprint information; and determining the speaker's seat in the virtual meeting room according to the second voiceprint information.
In one embodiment, setting the number of mesh vertices according to the seat distribution of the virtual meeting room includes: setting a different number of mesh vertices in the area covered by each seat, so as to establish a correspondence between seats and numbers of mesh vertices.
In another embodiment, adjusting the frequency or amplitude of the first voiceprint information according to the number of mesh vertices includes: when the number of mesh vertices in the area covered by a first seat is greater than the number in the area covered by a second seat, raising the frequency of the first voiceprint information from the first seat, or lowering the frequency of the first voiceprint information from the second seat, so that the frequency of the first voiceprint information from the first seat is higher than that from the second seat.
In another embodiment, adjusting the frequency or amplitude of the first voiceprint information according to the number of mesh vertices includes: when the number of mesh vertices in the area covered by the first seat is greater than the number in the area covered by the second seat, increasing the amplitude of the first voiceprint information from the first seat, or reducing the amplitude of the first voiceprint information from the second seat, so that the amplitude of the first voiceprint information from the first seat is greater than that from the second seat.
In another embodiment, after the speaker's seat in the virtual meeting room is determined according to the second voiceprint information, the audio processing method further includes: acquiring eye movement direction information of a participant; determining the participant's concentration according to the eye movement direction information, where the concentration takes the value 0 or 1; and determining, according to the concentration, whether the participant is interested in the meeting topic.
In another embodiment, determining the participant's concentration according to the eye movement direction information includes: marking the concentration as 1 when the participant's eye movement direction is toward the speaker; and marking the concentration as 0 when the participant's eye movement direction is away from the speaker.
In another embodiment, the audio processing method further includes: when there are multiple speakers, recording the participant's concentration value while each speaker is speaking; and determining the participant's concentration on the meeting topic according to those values.
In another embodiment, determining whether the participant is interested in the meeting topic according to the concentration includes: when the concentration value is greater than or equal to a preset interest threshold, determining that the participant is interested in the meeting topic; and when the concentration value is less than the interest threshold, determining that the participant is not interested in the meeting topic.
A second aspect of the present application provides an audio processing device including a server, a master device, and a slave device. The master device is used to initiate a virtual meeting, the server is used to build a virtual meeting room according to instructions from the master device, and the slave device is used to enter the virtual meeting room through a link from the master device. The server includes a first processor and a first memory, and the first processor runs a computer program or code stored in the first memory to implement the audio processing method of the embodiments of the present application.
A third aspect of the present application provides a storage medium for storing a computer program or code. When the computer program or code is executed by a processor, the audio processing method of the embodiments of the present application is implemented.
The embodiments of the present application associate the number of mesh vertices in the area covered by each seat in the virtual meeting room with the first voiceprint information, and adjust the frequency or amplitude of the first voiceprint information from different seats according to the number of mesh vertices to obtain more distinguishable second voiceprint information, thereby establishing a correspondence between each seat and the second voiceprint information. The speaker's seat in the virtual meeting room can then be determined according to the second voiceprint information. The embodiments can simulate the sound source characteristics of each speaker, so that each speaker's voice is recognizable and a user can clearly tell what each speaker is saying.
100: audio processing device
200: server
300: electronic device
310: master device
320: slave device
210: first processor
220: first memory
311: second processor
312: second memory
313: first audio module
314: first display screen
315: first front camera
321: third processor
322: third memory
323: second audio module
324: second display screen
325: second front camera
S101-S104, S201-S207, S301-S307, S401-S403: steps
FIG. 1 is a schematic structural diagram of an audio processing device according to an embodiment of the present application.
FIG. 2 is a flowchart of an audio processing method according to an embodiment of the present application.
FIG. 3a is a schematic structural diagram of a virtual meeting room according to an embodiment of the present application.
FIG. 3b is a schematic structural diagram of a virtual meeting room according to another embodiment of the present application.
FIG. 4 is a flowchart of an audio processing method according to another embodiment of the present application.
FIG. 5 is a schematic diagram of a virtual meeting room according to an embodiment of the present application.
FIG. 6 is a schematic diagram of a virtual meeting room according to another embodiment of the present application.
FIG. 7 is a flowchart of an audio processing method according to another embodiment of the present application.
FIG. 8 is a schematic diagram of first voiceprint information according to an embodiment of the present application.
FIG. 9 is a flowchart of an audio processing method according to another embodiment of the present application.
It should be noted that, in the embodiments of the present application, "at least one" means one or more, and "multiple" means two or more. "And/or" describes a relationship between associated objects and covers three cases. For example, "A and/or B" can mean A alone, both A and B, or B alone, where A and B may each be singular or plural. The terms "first", "second", "third", "fourth", and so on (if present) in the description, claims, and drawings of the present application are used to distinguish similar objects, not to describe a specific order or sequence.
It should also be noted that the methods disclosed in the embodiments or shown in the flowcharts include one or more steps for implementing the method. Without departing from the scope of the claims, the order of the steps may be interchanged, and some steps may be omitted.
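FIG. 1 is a schematic structural diagram of an audio processing device 100 according to an embodiment of the present application.
Referring to FIG. 1, the audio processing device 100 may include a server 200 and electronic devices 300. The electronic devices 300 include a master device 310 and a slave device 320. The master device 310 is the electronic device used by the meeting host, and the slave device 320 is an electronic device used by one of the other participants. The server 200 is communicatively connected to the master device 310 and the slave device 320. The host initiates a virtual meeting through the master device 310, the server 200 builds a virtual meeting room according to instructions from the master device 310, the master device 310 sends a meeting link to the slave device 320, and the other participants enter the virtual meeting room through their slave devices 320.
The communication connection may be wired or wireless. A wired connection uses a wired transmission medium such as optical fiber or twisted pair. A wireless connection uses a wireless transmission medium such as WiFi or a mobile communication network (for example, 2G/3G/4G/5G).
In some embodiments, the audio processing device 100 may further include a 360-degree fisheye camera (not shown), that is, a panoramic camera that can independently monitor a wide area without blind spots. The 360-degree fisheye camera is communicatively connected to the server 200. It may be installed above some of the workstations in an office, with its lens facing up or down, to capture those workstations. The server 200 maps the workstations captured by the 360-degree fisheye camera into the virtual meeting room model, so that the people at those workstations appear to be inside the virtual meeting room. When the captured picture is upside down, the server 200 inverts the picture displayed by the master device 310 and the slave device 320 to correct its orientation.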
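The server 200 may include a first processor 210 and a first memory 220. The first processor 210 may run a computer program or code stored in the first memory 220 to implement the audio processing method of some embodiments of the present application.
The first processor 210 may include one or more processing units. For example, the first processor 210 may include, but is not limited to, an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a controller, a video codec, a digital signal processor (DSP), a baseband processor, and a neural-network processing unit (NPU). Different processing units may be independent devices or may be integrated into one or more processors.
A memory may also be provided in the first processor 210 to store instructions and data. In some embodiments, this memory is a cache. It can hold instructions or data that the first processor 210 has just used or uses repeatedly. If the first processor 210 needs the instructions or data again, it can fetch them directly from this memory.
In some embodiments, the first processor 210 may include one or more interfaces. The interfaces may include, but are not limited to, an Inter-Integrated Circuit (I2C) interface, an Inter-Integrated Circuit Sound (I2S) interface, a Pulse Code Modulation (PCM) interface, a Universal Asynchronous Receiver/Transmitter (UART) interface, a Mobile Industry Processor Interface (MIPI), a General-Purpose Input/Output (GPIO) interface, a Subscriber Identity Module (SIM) interface, and a Universal Serial Bus (USB) interface.
It can be understood that the interface connections between modules illustrated in the embodiments of the present application are only schematic and do not limit the structure of the server 200. In other embodiments, the server 200 may use interface connection schemes different from those above, or a combination of several schemes.
The first memory 220 may include an external memory interface and an internal memory. The external memory interface can be used to connect an external memory card, such as a Micro SD card, to expand the storage capacity of the server 200. The external memory card communicates with the first processor 210 through the external memory interface to provide data storage. The internal memory can be used to store computer-executable program code, which includes instructions. The internal memory may include a program storage area and a data storage area. The program storage area can store the operating system and the applications required by at least one function (for example, sound playback or image playback). The data storage area can store data created while the server 200 is in use (for example, audio data or image data). In addition, the internal memory may include high-speed random access memory and may also include non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or Universal Flash Storage (UFS). By running instructions stored in the internal memory and/or in the memory provided in the first processor 210, the first processor 210 executes the various functional applications and data processing of the server 200, for example implementing the audio processing method of some embodiments of the present application.
In some embodiments, the server 200 may include multiple virtual machines (VMs). The server 200 provides high availability (HA) and auto scaling. High availability means providing redundant processing capacity: when a node is unavailable or cannot handle a user's request, the request is promptly transferred to another available node. Auto scaling means automatically adjusting computing capacity (that is, the number of instances) according to business demand and policy. When demand grows, auto scaling adds instances of a specified type to maintain computing capacity. When demand falls, auto scaling removes instances of a specified type to save cost.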
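The auto scaling behavior described above can be sketched as follows. This is only a minimal illustration: the function name `desired_instances`, the per-instance capacity, and the minimum and maximum bounds are assumptions introduced here, since the text does not specify a scaling policy.

```python
import math


def desired_instances(current_load, capacity_per_instance,
                      min_instances=1, max_instances=10):
    """Return how many instances a simple load-based policy would run.

    All parameters are hypothetical; the source only says that instances
    are added when demand grows and removed when demand falls.
    """
    # Instances needed to carry the load, rounded up.
    needed = math.ceil(current_load / capacity_per_instance)
    # Clamp to the configured bounds.
    return max(min_instances, min(max_instances, needed))


print(desired_instances(250, 100))
```

With a load of 250 units and 100 units per instance, the sketch rounds up to three instances; an idle system stays at the minimum rather than scaling to zero.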
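In some embodiments, the master device 310 may include a second processor 311, a second memory 312, a first audio module 313, and a first display screen 314. The second processor 311 is electrically connected to these other components and to the first processor 210 of the server 200. The first audio module 313 performs analog-to-digital conversion, encoding, and decoding of audio signals. The first display screen 314 displays the scene of the virtual meeting room and the avatars of some participants. The second processor 311 may run a computer program or code stored in the second memory 312 to implement the audio processing method of other embodiments of the present application.
In this embodiment, the master device 310 may include 3-degree-of-freedom (3DoF) virtual reality (VR) glasses or a head-mounted device (HMD) capable of tracking the direction of the user's eye movement.
It can be understood that, for specific implementations of the second processor 311 and the second memory 312, reference may be made to the first processor 210 and the first memory 220 described above; details are not repeated here.
In some embodiments, the first audio module 313 may be provided in the second processor 311, or some functional modules of the first audio module 313 may be provided in the second processor 311. The master device 310 can implement audio functions, such as voice playback and recording, through the first audio module 313.
In other embodiments, the master device 310 may further include a first front camera 315 electrically connected to the second processor 311. The first front camera 315 captures the user's face and the direction of eye movement, so that the server 200 can analyze the user's concentration during the virtual meeting and level of interest in the meeting topic.
In this embodiment, the master device 310 may include a smartphone, a tablet computer, a personal computer (PC), or a personal digital assistant (PDA).
In some embodiments, the slave device 320 may include a third processor 321, a third memory 322, a second audio module 323, a second display screen 324, and a second front camera 325. The third processor 321 is electrically connected to these other components, to the first processor 210 of the server 200, and to the second processor 311 of the master device 310.
It can be understood that, for the components and specific implementations of the slave device 320, reference may be made to the master device 310.
The structures illustrated in the embodiments of the present application do not specifically limit the server 200, the master device 310, or the slave device 320. In other embodiments, the server 200, the master device 310, or the slave device 320 may include more or fewer components than shown, combine some components, split some components, or arrange the components differently. The illustrated components may be implemented in hardware, software, or a combination of software and hardware.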
FIG. 2 is a flowchart of an audio processing method according to an embodiment of the present application.
Referring to FIG. 2, the audio processing method of this embodiment is applied to the master device 310 and may include the following steps:
S101: In response to a first operation by the host, the master device 310 sends a request to set up a meeting to the server 200.
The first operation may include triggering a control for setting up a meeting in three-dimensional graphics software (for example, Blender) on the master device 310.
In this embodiment, three-dimensional graphics software is installed on the master device 310, and the host can trigger the control for setting up a meeting in that software, causing the master device 310 to send a request to set up a meeting to the server 200. The three-dimensional graphics software can provide a full set of 3D creation tools, including modeling, UV mapping, texturing, rigging, skinning, animation, particle and other physics simulation, scripting, rendering, motion tracking, compositing, and post-production.
S102: In response to a second operation by the host, the master device 310 selects a virtual meeting room model from the model library of the server 200.
The second operation may include triggering a control for selecting a virtual meeting room model in the three-dimensional graphics software on the master device 310.
Referring to FIG. 3a and FIG. 3b, the model library stores virtual meeting rooms of various shapes, such as a cuboid virtual meeting room and a ring-shaped virtual meeting room. After receiving the request to set up a meeting from the master device 310, the server 200 allows the master device 310 to access its model library. The host can choose a virtual meeting room model in the three-dimensional graphics software and trigger the control for selecting a virtual meeting room model, causing the master device 310 to select that model from the model library of the server 200.
S103: In response to a third operation by the host, the master device 310 collects the host's first voiceprint information and transmits it to the server 200.
The third operation may include triggering a control for recording audio in the three-dimensional graphics software on the master device 310.
In this embodiment, after the master device 310 has determined the virtual meeting room model, the host triggers the control for recording audio in the three-dimensional graphics software on the master device 310. The master device 310 records the host's voice signal through the first audio module 313, extracts the host's first voiceprint information from the voice signal, and transmits the first voiceprint information to the server 200. The first voiceprint information may include the frequency, amplitude, and phase difference of the voice signal.
S104: In response to a fourth operation by the host, the master device 310 determines the participants' seats according to the virtual meeting room model and sends meeting links to the slave devices 320.
The fourth operation may include triggering a control for adding a meeting link in the three-dimensional graphics software on the master device 310.
In this embodiment, after the master device 310 has sent the host's first voiceprint information to the server 200, the host triggers the control for adding a meeting link in the three-dimensional graphics software on the master device 310. The master device 310 determines the participants' seats according to the virtual meeting room model, and each seat in the virtual meeting room corresponds to one unique meeting link. The master device 310 sends the meeting links to the slave devices 320.
For example, N seats may be set in the virtual meeting room, with one meeting link per seat, and the master device 310 may send the N meeting links to N slave devices 320 respectively. After entering the virtual meeting room through a meeting link, a participant can view the room and the other participants from the perspective of the corresponding seat and can speak in the meeting.
In some embodiments, multiple observer seats may also be set in the virtual meeting room, and each observer seat also corresponds to one unique meeting link. The master device 310 may send the meeting links of M observer seats to M slave devices 320 respectively. After entering the virtual meeting room through such a link, a participant can view the room and the other participants from the perspective of the observer seat, but participants in observer seats cannot speak. M and N are both positive integers.
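The one-link-per-seat scheme described above can be sketched as follows. This is an illustration under stated assumptions: the base URL, the token format, the seat naming, and the function name `create_seat_links` are all hypothetical and do not come from the text, which only requires that each of the N speaking seats and M observer seats has its own unique meeting link.

```python
import secrets

# Hypothetical base address; the source does not specify a link format.
BASE_URL = "https://vmr.example.invalid/join"


def create_seat_links(n_speaking, m_observer):
    """Map one unique meeting link to each speaking seat and observer seat."""
    links = {}
    for i in range(1, n_speaking + 1):
        # A random token makes each link unique and hard to guess.
        url = f"{BASE_URL}/{secrets.token_urlsafe(16)}"
        links[url] = {"seat": f"seat-{i}", "can_speak": True}
    for j in range(1, m_observer + 1):
        url = f"{BASE_URL}/{secrets.token_urlsafe(16)}"
        # Participants in observer seats can watch but cannot speak.
        links[url] = {"seat": f"observer-{j}", "can_speak": False}
    return links


links = create_seat_links(3, 2)
for url, seat in links.items():
    print(seat["seat"], seat["can_speak"], url)
```

Keying the table by link lets the receiving side resolve a link directly to a seat and its speaking permission.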
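In other embodiments, the master device 310 sends a single meeting link for the virtual meeting room to multiple slave devices 320. After entering the virtual meeting room through that link, a participant can select a seat through the slave device 320.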
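FIG. 4 is a flowchart of an audio processing method according to another embodiment of the present application.
Referring to FIG. 4, the audio processing method of this embodiment is applied to the slave device 320 and may include the following steps:
S201: The slave device 320 receives a meeting link from the master device 310.
In this embodiment, after setting up the virtual meeting room, the master device 310 sends the participants either the meeting link of the virtual meeting room or the meeting link of a seat in the virtual meeting room. A participant receives the meeting link from the master device 310 through the slave device 320.
S202: In response to a first operation by the participant, the slave device 320 enters the virtual meeting room through the meeting link.
The first operation may include clicking the meeting link on the slave device 320 to launch a browser application (for example, Chrome Browser).
Referring to FIG. 5, after the slave device 320 receives the meeting link from the master device 310, the participant clicks the meeting link on the slave device 320, which launches the browser application, and enters the virtual meeting room through the browser application.
S203: The slave device 320 determines whether the meeting link is a link for a reserved seat according to whether the participant has a seat in the virtual meeting room. If the meeting link is a link for a reserved seat, step S204 is performed. If it is not, step S205 is performed.
In this embodiment, after entering the virtual meeting room through the meeting link, the slave device 320 determines whether the meeting link is a link for a reserved seat according to whether the participant has a seat in the virtual meeting room. When the participant has a seat, the slave device 320 displays the scene of the virtual meeting room and the other participants from the perspective of the reserved seat. When the participant has no seat, the slave device 320 displays the scene of the whole virtual meeting room. The slave device 320 can therefore tell from the viewing perspective after entry whether the participant has a seat in the virtual meeting room, and thus whether the meeting link is a link for a reserved seat.
S204: In response to a second operation by the participant, the slave device 320 collects the participant's first voiceprint information and transmits it to the server 200.
The second operation may include triggering a control for recording audio in the browser application on the slave device 320.
In this embodiment, after the slave device 320 determines that the meeting link is a link for a reserved seat, the participant triggers the control for recording audio in the browser application on the slave device 320. The slave device 320 records the participant's voice signal through the second audio module 323, extracts the participant's first voiceprint information from the voice signal, and transmits the first voiceprint information to the server 200.
S205: In response to a third operation by the participant, the slave device 320 determines a seat.
The third operation may include triggering a control for selecting a seat in the browser application on the slave device 320.
In this embodiment, after the slave device 320 determines that the meeting link is not a link for a reserved seat, the participant triggers the control for selecting a seat in the browser application on the slave device 320. The slave device 320 selects the seat and reads the seat's selection information, which indicates whether the seat has already been selected.
S206: The slave device 320 determines whether the seat has already been selected by another participant. If it has, the method returns to step S205. If it has not, the method returns to step S204.
In this embodiment, after a participant selects a seat through the slave device 320, the slave device 320 can read the seat's selection information to determine whether the seat has already been selected by another participant. When the seat has not been selected by another participant, the slave device 320 can display the scene of the virtual meeting room and the other participants from the perspective of the selected seat. When the seat has already been selected by another participant, the slave device 320 prompts the participant to choose again.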
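The seat selection check in steps S205 and S206 can be sketched as a small registry. This is a minimal sketch under assumptions: the class `SeatRegistry` and its method names are hypothetical, and the text does not say where the selection information is stored.

```python
class SeatRegistry:
    """Tracks which participant, if any, has selected each seat (hypothetical helper)."""

    def __init__(self, seat_ids):
        # None means the seat has not been selected yet.
        self._holder = {seat: None for seat in seat_ids}

    def is_taken(self, seat):
        """Read the seat's selection information (step S206)."""
        return self._holder[seat] is not None

    def claim(self, seat, participant):
        """Try to select a seat; return False if the participant must choose again."""
        if self.is_taken(seat):
            return False  # back to S205: prompt for another seat
        self._holder[seat] = participant
        return True  # continue with S204: collect the voiceprint


registry = SeatRegistry(["seat-1", "seat-2"])
print(registry.claim("seat-1", "alice"))
print(registry.claim("seat-1", "bob"))
print(registry.claim("seat-2", "bob"))
```

In this run the second claim on `seat-1` is refused because the seat is already held, so that participant selects `seat-2` instead.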
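S207: The slave device 320 displays the seat map of the virtual meeting room.
In this embodiment, after the slave device 320 has transmitted the first voiceprint information to the server 200, it displays the seat map of the virtual meeting room. Referring to FIG. 6, the ring-shaped virtual meeting room has 6 seats arranged in a ring to form the seat map, which gives the impression of a real meeting room.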
FIG. 7 is a flowchart of an audio processing method according to another embodiment of the present application.
Referring to FIG. 7, the audio processing method of this embodiment is applied to the server 200 and may include the following steps:
S301: The server 200 receives the request to set up a meeting from the master device 310.
In this embodiment, the host can trigger the control for setting up a meeting in the three-dimensional graphics software, causing the master device 310 to send a request to set up a meeting to the server 200. The server 200 receives that request.
S302: The server 200 grants the master device 310 access to the model library according to the request to set up a meeting.
In this embodiment, after receiving the request to set up a meeting from the master device 310, the server 200 grants the master device 310 access to the model library, allowing it to access the model library and retrieve a virtual meeting room model from it.
S303: The server 200 builds the virtual meeting room according to the virtual meeting room model selected by the master device 310.
In this embodiment, the host can choose a virtual meeting room model in the three-dimensional graphics software and trigger the control for selecting a virtual meeting room model, causing the master device 310 to select that model from the model library of the server 200. The server 200 builds the virtual meeting room according to the selected model.
In some embodiments, the server 200 may build the virtual meeting room model according to preset virtual meeting room proportions and, using a UV mapping tool, enable the master device 310 or the slave device 320 to display a dynamic view of the model.
In other embodiments, the server 200 may extract texture features from pre-stored pictures of a virtual meeting room and, using a texturing tool, add the texture features to a preset base model, so that the master device 310 or the slave device 320 can display a static view of the model.
S304: The server 200 sets the number of mesh vertices according to the seat distribution of the virtual meeting room.
A mesh is the basic unit with which three-dimensional graphics software composes a scene, and the virtual meeting room is made up of multiple meshes joined together. One mesh includes 4 vertices. The more mesh vertices the area covered by a seat contains, the higher the density of mesh vertices in that area. The server 200 sets the number of mesh vertices according to the seat distribution, assigning a different number of mesh vertices to the area covered by each seat, so that the vertex density differs from one seat area to another and seats correspond one-to-one with vertex counts or densities.
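Step S304 can be illustrated with a small mapping from seats to distinct vertex counts. This is only a sketch: the function names and the `base` and `step` values are assumptions chosen for illustration, since the text requires only that each seat area receives a different number of mesh vertices.

```python
def assign_vertex_counts(seat_ids, base=8, step=4):
    """Give every seat area its own mesh vertex count (hypothetical scheme)."""
    # base and step are illustrative values, not taken from the source.
    return {seat: base + i * step for i, seat in enumerate(seat_ids)}


def seat_for_vertex_count(vertex_counts, count):
    """Invert the mapping; this works because all counts are distinct."""
    inverse = {n: seat for seat, n in vertex_counts.items()}
    return inverse.get(count)


counts = assign_vertex_counts(["seat1", "seat2", "seat3"])
print(counts)
print(seat_for_vertex_count(counts, 12))
```

Because no two seats share a count, the mapping can be read in either direction, which is what the one-to-one correspondence between seats and vertex counts requires.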
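S305: The server 200 receives first voiceprint information from the master device 310 or the slave device 320.
The first voiceprint information may include the frequency, amplitude, and phase difference of the voice signal.
In this embodiment, when the host speaks, the host triggers the control for recording audio in the three-dimensional graphics software on the master device 310. The master device 310 records the host's voice signal through the first audio module 313, extracts the host's first voiceprint information from the voice signal, and transmits it to the server 200. The server 200 can thus receive first voiceprint information from the master device 310.
When a participant speaks, the participant triggers the control for recording audio in the browser application on the slave device 320. The slave device 320 records the participant's voice signal through the second audio module 323, extracts the participant's first voiceprint information from the voice signal, and transmits it to the server 200. The server 200 can thus receive first voiceprint information from the slave device 320.
S306: The server 200 adjusts the frequency or amplitude of the first voiceprint information according to the number of mesh vertices to obtain second voiceprint information.
In this embodiment, each seat in the virtual meeting room has a corresponding number of mesh vertices, and the server 200 adjusts the frequency or amplitude of the first voiceprint information according to that number. For example, a seat with more mesh vertices, or a higher vertex density, corresponds to first voiceprint information with a higher frequency or larger amplitude. When the number of mesh vertices n1 in the area covered by the first seat and the number n2 in the area covered by the second seat satisfy n1 > n2, the server 200 adjusts the first voiceprint information from the first seat or from the second seat so that f1 > f2 or a1 > a2, where f1 and f2 are the frequencies of the first voiceprint information from the first and second seats, and a1 and a2 are the corresponding amplitudes.
For example, referring to FIG. 8, the server 200 sets the number of mesh vertices for each seat in advance. When the server 200 receives the first voiceprint information of 6 participants, it can process the 6 segments of first voiceprint information, adjusting the frequency or amplitude of each according to the vertex count or density of the corresponding seat, to obtain 6 segments of second voiceprint information that are easier to tell apart.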
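The ordering constraint of step S306 (n1 > n2 implies f1 > f2 or a1 > a2) can be sketched as follows. The text does not say by how much a value is raised or lowered, so the raise-only strategy and the `margin` parameter here are assumptions; the function name `enforce_vertex_order` is likewise hypothetical. The sketch assumes positive input values and distinct vertex counts.

```python
def enforce_vertex_order(values, vertex_counts, margin=0.05):
    """Raise values where needed so that more vertices always means a larger value.

    values: seat -> frequency (Hz) or amplitude of the first voiceprint.
    vertex_counts: seat -> number of mesh vertices in the seat's area.
    margin: hypothetical minimum relative gap between neighbouring seats.
    """
    adjusted = {}
    previous = None
    # Walk the seats from the fewest vertices to the most.
    for seat in sorted(values, key=lambda s: vertex_counts[s]):
        value = values[seat]
        if previous is not None:
            # Raise this seat's value if it does not clearly exceed the previous one.
            value = max(value, previous * (1 + margin))
        adjusted[seat] = value
        previous = value
    return adjusted


vertex_counts = {"seat1": 16, "seat2": 8, "seat3": 12}
first_freq = {"seat1": 180.0, "seat2": 200.0, "seat3": 190.0}
second_freq = enforce_vertex_order(first_freq, vertex_counts)
print(second_freq)
```

Here `seat2` has the fewest vertices and keeps its original frequency, while `seat3` and `seat1` are raised so that the frequencies follow the same order as the vertex counts. The same function applies unchanged to amplitudes.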
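S307: The server 200 determines the speaker's seat in the virtual meeting room according to the second voiceprint information.
In this embodiment, after acquiring the first voiceprint information, the server 200 cannot determine where it came from. The server 200 therefore associates the number of mesh vertices in the area covered by each seat with the first voiceprint information, so that an area with more mesh vertices corresponds to first voiceprint information with a higher frequency or larger amplitude. The server 200 adjusts the frequency or amplitude of the first voiceprint information from different seats according to the number of mesh vertices to obtain more distinguishable second voiceprint information. Because the frequency or amplitude of the second voiceprint information differs from seat to seat, the second voiceprint information corresponds one-to-one with the seats, and the server 200 can determine the speaker's seat in the virtual meeting room according to the second voiceprint information.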
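Because each seat ends up with its own frequency (or amplitude), step S307 amounts to a lookup. The sketch below uses a nearest-match rule, which is an assumption: the text states the one-to-one correspondence but not the matching procedure, and the function name `identify_seat` and the sample values are hypothetical.

```python
def identify_seat(second_frequencies, observed_frequency):
    """Return the seat whose second-voiceprint frequency is closest to the observed one."""
    return min(
        second_frequencies,
        key=lambda seat: abs(second_frequencies[seat] - observed_frequency),
    )


# Illustrative per-seat frequencies in Hz after the adjustment step.
second_frequencies = {"seat1": 220.0, "seat2": 200.0, "seat3": 210.0}
print(identify_seat(second_frequencies, 211.5))
```

A nearest-match rule tolerates small measurement deviations as long as the gap between neighbouring seats is larger than the expected error.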
FIG. 9 is a flowchart of an audio processing method according to another embodiment of the present application.
Referring to FIG. 9, the audio processing method of this embodiment is applied to the server 200 and may include the following steps:
S401: The server 200 controls the slave device 320 to collect eye movement direction information of the participant.
In this embodiment, while one participant is speaking, the server 200 recognizes that participant's voiceprint information and then controls the other slave devices 320 to collect the eye movement direction information of the other participants.
S402: The server 200 determines the participant's concentration according to the participant's eye movement direction information.
Concentration refers to how closely a participant attends to what the speaker is saying or to the meeting topic. Higher concentration indicates greater interest in the meeting topic. While a speaker is speaking, the server 200 receives the eye movement direction information of the other participants and can determine their concentration from it.
For example, while a speaker is speaking, if a participant's eye movement direction is toward the speaker, the participant is currently attentive and the participant's concentration can be marked as 1. If the participant's eye movement direction is away from the speaker, the participant is currently inattentive and the concentration can be marked as 0. Over 10 rounds of speaking in a whole meeting, if a participant's concentration is 1 in 6 rounds and 0 in 4 rounds, the participant's concentration on the meeting topic can be taken as 6/10 = 0.6.
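The marking and averaging in step S402 can be sketched as follows. The dot-product test for "toward the speaker" is an assumption, since the text does not define how direction is compared; the function names and the 2D vector representation are hypothetical.

```python
def attention_flag(gaze, to_speaker):
    """Return 1 if the gaze direction points toward the speaker, else 0.

    Both arguments are 2D direction vectors. "Toward" is taken to mean a
    positive dot product, which is an assumption made for this sketch.
    """
    dot = gaze[0] * to_speaker[0] + gaze[1] * to_speaker[1]
    return 1 if dot > 0 else 0


def concentration(flags):
    """Average the per-round 0/1 flags into a concentration score."""
    return sum(flags) / len(flags) if flags else 0.0


# Six attentive rounds out of ten, as in the example above.
flags = [1, 1, 0, 1, 0, 1, 1, 0, 1, 0]
print(concentration(flags))  # 0.6
```

Averaging the flags reproduces the 6/10 = 0.6 figure from the example; a participant with no recorded rounds is given a score of 0.0 rather than causing a division by zero.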
S403: The server 200 determines whether the participant is interested in the meeting topic according to the participant's concentration.
In this embodiment, the server 200 computes the participant's concentration on the meeting topic and compares it with a preset interest threshold. If the concentration is greater than or equal to the interest threshold, the participant is interested in the meeting topic. If the concentration is less than the interest threshold, the participant is not interested in the meeting topic.
For example, suppose the preset interest threshold is 0.6. Over a whole meeting, a participant whose concentration on the meeting topic is 0.5 is not interested in the topic, because 0.5 is less than the interest threshold. A participant whose concentration is 0.7 is interested in the topic, because 0.7 is greater than the interest threshold.
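The threshold comparison of step S403 is small enough to show directly. The constant and function names are hypothetical; the value 0.6 is the preset threshold used in the example above.

```python
INTEREST_THRESHOLD = 0.6  # preset value from the example above


def is_interested(concentration_score, threshold=INTEREST_THRESHOLD):
    """A participant is interested when concentration reaches the threshold."""
    return concentration_score >= threshold


for score in (0.5, 0.6, 0.7):
    print(score, is_interested(score))
```

A score exactly equal to the threshold counts as interested, matching the "greater than or equal to" wording of the method.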
The embodiments of the present application further provide a storage medium for storing a computer program or code. When the computer program or code is executed by a processor, the audio processing method of the embodiments of the present application is implemented.
Storage media include volatile and non-volatile, removable and non-removable media implemented in any method or technology for storing information (such as computer-readable instructions, data structures, program modules, or other data). Storage media include, but are not limited to, random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical disc storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and can be accessed by a computer.
The embodiments of the present application have been described in detail above with reference to the accompanying drawings, but the present application is not limited to these embodiments. Within the knowledge of a person of ordinary skill in the art, various changes can be made without departing from the purpose of the present application.
Claims (9)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
TW110144724A TWI807504B (en) | 2021-11-30 | 2021-11-30 | Method, device and storage medium for audio processing of virtual meeting room |
Publications (2)
Publication Number | Publication Date |
---|---|
TW202325006A TW202325006A (en) | 2023-06-16 |
TWI807504B true TWI807504B (en) | 2023-07-01 |
Family
ID=87803678
Country Status (1)
Country | Link |
---|---|
TW (1) | TWI807504B (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20070247524A1 (en) * | 2006-04-19 | 2007-10-25 | Tomoaki Yoshinaga | Attention Level Measuring Apparatus and An Attention Level Measuring System |
CN102007730A (en) * | 2007-10-24 | 2011-04-06 | 社会传播公司 | Automated real-time data stream switching in a shared virtual area communication environment |
CN110035250A (en) * | 2019-03-29 | 2019-07-19 | 维沃移动通信有限公司 | Audio-frequency processing method, processing equipment, terminal and computer readable storage medium |