TWI811148B - Method for achieving latency-reduced one-to-many communication based on surrounding video and associated computer program product set - Google Patents


Info

Publication number
TWI811148B
Authority
TW
Taiwan
Prior art keywords
audio
voice
communication device
communication
surround view
Prior art date
Application number
TW111142416A
Other languages
Chinese (zh)
Other versions
TW202420790A (en)
Inventor
許精一
戴東華
Original Assignee
許精一
戴東華
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 許精一, 戴東華
Priority to TW111142416A
Application granted
Publication of TWI811148B
Publication of TW202420790A

Landscapes

  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
  • Mobile Radio Communication Systems (AREA)

Abstract

A method for achieving latency-reduced one-to-many communication based on surround view video and audio, and an associated computer program product set, are disclosed. The method includes a video and audio data conversion and sending step, a video and audio playback step, a unique voice communication request step, a unique voice communication execution step, and a first voice playback step. The invention uses third-party equipment to change the form of two-way video and audio transmission: one of the communicating parties can use surround view video and audio, and communication delay is reduced. The invention is therefore no longer limited to traditional single-view video and audio transmission, which expands its application scenarios.

Description

Latency-reduced one-to-many communication method based on surround view video and audio, and computer program product set

The present invention relates to a one-to-many communication method and a computer program product, and in particular to a latency-reduced one-to-many communication method and computer program product set based on surround view video and audio.

For most of the 20th century, the main technology for real-time communication between people was the telephone. The parties at either end of the line could speak whenever they wished and hear each other's voice with almost no time lag. Unfortunately, neither party could see the other. Videotelephony had already seen practical use in that period, for example by combining an audio (telephone) system with two closed-circuit television systems connected by coaxial cable or radio. Such two-way systems were relatively expensive, and their audio-video synchronization was poor. Only in the late 20th century, with the arrival of powerful video codecs combined with high-speed broadband Internet and Integrated Services Digital Network (ISDN) service, did videotelephony become a practical technology in regular use. In the 21st century, the hardware and software architecture and the communication standards of networks matured further, and network-based instant messaging products (such as Skype) gradually changed how people communicate in real time. People could hold audio-video calls with almost no delay, and the communication device shifted from the desktop computer to the smartphone. Moreover, audio-video communication had long since moved from one-to-one to multipoint. With the maturing of streaming technology, multipoint audio-video communication also improved greatly in immediacy and, through compression, in image quality. These advances let people collaborate remotely during the present pandemic and keep the economy running.

Surround view imaging is a technology that has recently matured and become widespread, broadening people's field of view. Several lenses pointing in different directions record synchronously, and image stitching combines the images captured at the same instant into successive surround view frames. A viewer can rotate the surround view image to see different scenes from different angles. Traditionally, streaming has used single-view video (recorded with a single lens) rather than surround view video. However, as streaming protocols such as HTTP Live Streaming (HLS) have matured, surround view video can also be streamed and broadcast so that many people can watch at the same time. Surround view recording can also be paired with sound capture equipment to pick up background sound, which achieves one-way audio-video communication. If the receiver needs to communicate back to the sender, a separate set of audio-video communication equipment is required, and it generally uses single-view video. For multi-way communication, using single-view video everywhere consumes less network bandwidth and gives better quality, but it suits only cases where each party stays in a fixed position relative to the lens, such as online meetings. If one party uses surround view video and audio, that party has much more room to move and more application scenarios open up, such as online teaching and the recently popular online travel (a tour guide films and explains the scenery at an attraction and interacts with online tourists through live streaming over the network). At present, however, there is no suitable solution.

This paragraph extracts and compiles certain features of the present invention. Other features are disclosed in subsequent paragraphs. It is intended to cover various modifications and similar arrangements within the spirit and scope of the appended claims.

To meet the above need, the present invention discloses a latency-reduced one-to-many communication method based on surround view video and audio, comprising: a video and audio data conversion and sending step, in which a relay server receives from a surround view video recording device a plurality of first surround view video stream packets encoded by a live broadcast protocol, converts the first surround view video stream packets in sequence into a plurality of second surround view video stream packets according to the Web Real-Time Communication (WebRTC) application programming interface, and sends the second surround view video stream packets to a plurality of audio-visual communication devices communicatively connected to the relay server; a video and audio playback step, in which each of the audio-visual communication devices plays a continuous surround view video corresponding to the second surround view video stream packets; a unique voice communication request step, in which each of at least one audio-visual communication device converts a first voice into a number of first voice packets conforming to a voice protocol and sends the first voice packets to the relay server together with a communication request command; a unique voice communication execution step, in which the relay server sends the first voice packets to a voice communication device communicatively connected to it, wherein, if two or more audio-visual communication devices issue the communication request command, the first voice packets sent by the audio-visual communication device that issued the communication request first are sent to the voice communication device and the first voice packets sent by the other audio-visual communication devices are discarded; and a first voice playback step, in which the voice communication device plays the first voice corresponding to the first voice packets sent by the relay server.

The latency-reduced one-to-many communication method based on surround view video and audio may further comprise: a specific-target voice communication request step, in which the voice communication device converts a second voice into a number of second voice packets conforming to the voice protocol and sends the second voice packets to the relay server together with a designated communication request command, wherein the designated communication request command specifies that the second voice be relayed to a designated audio-visual communication device; a specific-target voice communication execution step, in which the relay server sends the second voice packets to the designated audio-visual communication device specified by the designated communication request command; and a second voice playback step, in which the designated audio-visual communication device plays the second voice corresponding to the second voice packets sent by the relay server.

According to the present invention, the surround view video recording device may further mix a fixed text, a scrolling marquee text, an image and/or a background sound into at least one period of the continuous surround view video.

Preferably, the live broadcast protocol is the Real-Time Messaging Protocol (RTMP) or the encrypted Real-Time Messaging Protocol (RTMP-S).

Preferably, the voice protocol is the H.323 specification, the Session Initiation Protocol (SIP) or the Media Gateway Control Protocol (MGCP).

Preferably, the audio-visual communication device is a smartphone, a tablet computer, a notebook computer, a desktop computer, smart glasses with audio playback, or a head-mounted stereoscopic display with audio playback.

Preferably, the voice communication device is a smartphone or a tablet computer.

The present invention also discloses a computer program product set for latency-reduced one-to-many communication based on surround view video and audio, comprising: a first computer program product which, when loaded and executed by a relay server, performs: a 1st program instruction: receiving from a surround view video recording device a plurality of first surround view video stream packets encoded by a live broadcast protocol; a 2nd program instruction: converting the first surround view video stream packets in sequence into a plurality of second surround view video stream packets according to the WebRTC application programming interface; a 3rd program instruction: sending the second surround view video stream packets to a plurality of audio-visual communication devices communicatively connected to the relay server; a 4th program instruction: receiving a number of first voice packets and a communication request command from at least one audio-visual communication device; and a 5th program instruction: sending the first voice packets to a voice communication device communicatively connected to the relay server, wherein, if two or more audio-visual communication devices issue the communication request command, the first voice packets sent by the audio-visual communication device that issued the communication request first are sent to the voice communication device and the first voice packets sent by the other audio-visual communication devices are discarded; and a second computer program product which, when loaded and executed by any of the audio-visual communication devices, performs: a 6th program instruction: playing on that audio-visual communication device a continuous surround view video corresponding to the second surround view video stream packets; a 7th program instruction: converting a first voice into a number of first voice packets conforming to a voice protocol; and an 8th program instruction: sending the first voice packets to the relay server together with the communication request command.

The computer program product set for latency-reduced one-to-many communication based on surround view video and audio further comprises: a third computer program product which, when loaded and executed by the voice communication device, performs: a 9th program instruction: playing the first voice corresponding to the first voice packets sent by the relay server.

The third computer program product may further perform: a 10th program instruction: converting a second voice into a number of second voice packets conforming to the voice protocol; and an 11th program instruction: sending the second voice packets to the relay server together with a designated communication request command, wherein the designated communication request command specifies that the second voice be relayed to a designated audio-visual communication device. The first computer program product may further perform: a 12th program instruction: sending the second voice packets to the designated audio-visual communication device specified by the designated communication request command. The second computer program product may further perform: a 13th program instruction: playing, on the designated audio-visual communication device, the second voice corresponding to the second voice packets sent by the relay server.

The present invention uses third-party equipment to change the form of two-way video and audio transmission: one party uses surround view video and audio, and communication delay is reduced. The invention is no longer limited to traditional single-view video and audio transmission, which expands its application scenarios.

1: Network

10: Relay server

20: Surround view video recording device

31: First audio-visual communication device

32: Second audio-visual communication device

40: Voice communication device

41: Display interface

A: Tour guide

B: Cloud tourist

C: Cloud tourist

FIG. 1 is a schematic diagram of the hardware architecture to which a latency-reduced one-to-many communication method based on surround view video and audio is applied, according to an embodiment of the present invention.

FIG. 2 is a flow chart of the latency-reduced one-to-many communication method based on surround view video and audio.

FIG. 3 is a timing diagram of the method as applied to the hardware architecture of FIG. 1.

FIG. 4 illustrates an application scenario of the method.

FIG. 5 illustrates the display interface of a voice communication device.

The present invention is described more specifically with reference to the following embodiments.

Please refer to FIG. 1 to FIG. 3. FIG. 1 is a schematic diagram of the hardware architecture to which a latency-reduced one-to-many communication method based on surround view video and audio (hereinafter, the method) is applied according to an embodiment of the present invention, FIG. 2 is a flow chart of the method, and FIG. 3 is a timing diagram of the method as applied to the hardware architecture of FIG. 1.

The first step of the method is the video and audio data conversion and sending step: a relay server 10 receives from a surround view video recording device 20 a plurality of first surround view video stream packets encoded by a live broadcast protocol, converts them in sequence into a plurality of second surround view video stream packets according to the Web Real-Time Communication (WebRTC) application programming interface (API), and sends the second surround view video stream packets to a plurality of audio-visual communication devices communicatively connected to the relay server 10 (S01). According to the present invention, the form of the relay server 10 is not limited, but it must be able to run the program claimed by the invention in order to convert the video format; it is not hardware that stores video files and serves them to users on demand. The surround view video recording device 20 is an electronic device that captures continuous 360-degree surround view images of its surroundings and records background sound in sync (together, the continuous surround view video). Unlike an ordinary single-direction recording device (such as an action camera), the surround view video recording device 20 has several cameras facing different directions. Each camera records part of the surround view image in sync, and the internal circuitry of the device stitches all the partial images from the same instant into one surround view image. Before output, the continuous surround view video is encoded by the live broadcast protocol into first surround view video stream packets so that it can be output in sequence and is less susceptible to interference. Because the continuous surround view video is recorded without interruption, the first surround view video stream packets are likewise generated and output continuously. The live broadcast protocol is the specification by which the first surround view video stream packets (wireless signals) are produced. According to the present invention, the live broadcast protocol may be, but is not limited to, the Real-Time Messaging Protocol (RTMP) or the encrypted Real-Time Messaging Protocol (RTMP-S); this embodiment uses RTMP as the example.

The first surround view video stream packets, which conform to RTMP, are transmitted from the surround view video recording device 20 to the relay server 10 over a network (for example a wireless communication network), and the relay server 10 converts them in sequence into a plurality of second surround view video stream packets according to the WebRTC API. RTMP encoding is used by most mainstream streaming platforms and devices and is well suited to long playback sessions. However, RTMP introduces a certain cumulative delay, because a server using RTMP buffers lost frames, which delays the picture. A viewer watching an RTMP live stream, surround view or not, therefore notices an obvious delay. To remove this delay, which is inherent to RTMP, the first surround view video stream packets are converted into second surround view video stream packets that do not follow the RTMP specification. WebRTC is a technology that lets web applications and sites capture and optionally stream audio and video, and exchange data directly between browsers without an intermediary service, which further reduces delay. Because of this conversion, viewers watching the continuous surround view video on the audio-visual communication devices experience it closer to real time. According to the present invention, the audio-visual communication device that plays the continuous surround view video may be a smartphone, a tablet computer, a notebook computer or a desktop computer. Because the continuous surround view video provides stereoscopic images, the audio-visual communication device may also be smart glasses with audio playback or a head-mounted stereoscopic display with audio playback. For ease of description, the embodiment uses two audio-visual communication devices, a first audio-visual communication device 31 and a second audio-visual communication device 32. In practice the number of audio-visual communication devices is not limited to two and may be larger. As shown in FIG. 3, at time T1 the surround view video recording device 20 starts recording the continuous surround view video and continuously sends first surround view video stream packets to the relay server 10. After converting them, the relay server 10 sends the corresponding second surround view video stream packets to the first audio-visual communication device 31 and the second audio-visual communication device 32 at time T2. The interval between T1 and T2 is far shorter than the intermediary delay (server processing time) incurred when only an ordinary live broadcast protocol is used. In addition, according to the present invention, the actions of step S01 continue throughout, which is indicated by "…". Because the relay server 10 does not need to keep copies of the first and second surround view video stream packets as an ordinary live streaming server does, the connection between the surround view video recording device 20 and the first audio-visual communication device 31 or the second audio-visual communication device 32 is effectively point-to-point.
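The receive, convert and fan-out behaviour of step S01 can be sketched roughly as follows. This is only an illustrative model under stated assumptions, not the patented implementation: `StreamPacket`, `convert_to_webrtc` and `RelayServer` are hypothetical names, and the conversion function is a placeholder that merely relabels the packet where a real relay would repacketise the media.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass(frozen=True)
class StreamPacket:
    seq: int        # order in which the recording device emitted the packet
    protocol: str   # "rtmp" on ingest, "webrtc" after conversion
    payload: bytes


def convert_to_webrtc(packet: StreamPacket) -> StreamPacket:
    """Placeholder for the RTMP-to-WebRTC conversion (assumption: payload unchanged)."""
    return StreamPacket(packet.seq, "webrtc", packet.payload)


@dataclass
class RelayServer:
    # One send callback per connected audio-visual communication device.
    viewers: List[Callable[[StreamPacket], None]] = field(default_factory=list)

    def on_first_packet(self, packet: StreamPacket) -> None:
        """Convert one first packet and forward it to every viewer at once."""
        converted = convert_to_webrtc(packet)
        for send in self.viewers:
            send(converted)
        # Nothing is stored here: the relay keeps no copy of either packet.


# Two viewers stand in for devices 31 and 32.
device_31: List[StreamPacket] = []
device_32: List[StreamPacket] = []
relay = RelayServer(viewers=[device_31.append, device_32.append])
for seq in range(3):
    relay.on_first_packet(StreamPacket(seq, "rtmp", b"frame-%d" % seq))
```

Forwarding each packet as soon as it is converted, without buffering or archiving, is the design choice the description credits for the short interval between T1 and T2.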

The second step of the method is the video and audio playback step: the audio-visual communication devices each play the continuous surround view video corresponding to the second surround view video stream packets (S02). The first audio-visual communication device 31 and the second audio-visual communication device 32 have application software or purpose-built hardware installed that decodes the WebRTC-format second surround view video stream packets, so that the continuous surround view video can be played on them. Steps S01 and S02 together accomplish latency-reduced "one-to-many communication" based on surround view video and audio. The person holding the surround view video recording device 20 (hereinafter, the first communicator) can speak into its microphone, and what he says becomes part of the background sound of the continuous surround view video. The first communicator may also stay silent and let the recorded continuous surround view images and background sound be the content he communicates to the people holding the first audio-visual communication device 31 and the second audio-visual communication device 32 (hereinafter, the second communicators). If necessary, the control module of the surround view video recording device 20 can be programmed so that the first communicator can operate the device to mix a fixed text, a scrolling marquee text, an image and/or a background sound into at least one period of the continuous surround view video. The continuous surround view video received by the first audio-visual communication device 31 and the second audio-visual communication device 32 then shows title cards and marquees and carries sound effects, which enlivens viewing and conveys the first communicator's message more effectively.
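One way to picture "mixing into at least one period" is a schedule of overlays, each active over a time window of the recording. The sketch below is an assumption for illustration only: the `Overlay` type, its field names and the half-open time window are hypothetical and are not specified by the source.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Overlay:
    kind: str       # "fixed_text", "marquee_text", "image" or "background_sound"
    content: str    # the text itself, or a path to the image or sound (hypothetical)
    start_s: float  # start of the period, in seconds from the start of recording
    end_s: float    # end of the period (exclusive)


def active_overlays(schedule: List[Overlay], t: float) -> List[Overlay]:
    """Return the overlays that should be mixed into the frame at time t."""
    return [o for o in schedule if o.start_s <= t < o.end_s]


schedule = [
    Overlay("fixed_text", "Welcome", 0.0, 10.0),
    Overlay("marquee_text", "Next stop in five minutes", 5.0, 20.0),
    Overlay("background_sound", "birds.wav", 15.0, 30.0),
]
kinds_at_7s = [o.kind for o in active_overlays(schedule, 7.0)]
```

At 7 seconds the first two overlays are active, so a frame rendered then would carry both the fixed text and the marquee.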

If a second communicator (there are two in this embodiment) wants to send feedback to the first communicator (according to the present invention, the feedback is voice), two problems arise. First, the surround view video recording device 20 has no speaker or other sound output device. Second, if two second communicators want to speak to the first communicator at the same time, the first communicator can take in only one of their voices at that moment. Steps S03 to S05 are the techniques that solve these problems.

The third step of the method is the unique voice communication request step: each of at least one audio-visual communication device converts a first voice into a number of first voice packets conforming to a voice protocol and sends the first voice packets to the relay server 10 together with a communication request command (S03). The first audio-visual communication device 31 and the second audio-visual communication device 32 can each convert what its second communicator wants to say (the first voice) into a number of first voice packets that can be sent out, and the specification governing that conversion is the voice protocol. According to the present invention, the voice protocol may be, but is not limited to, the H.323 specification, the Session Initiation Protocol (SIP) or the Media Gateway Control Protocol (MGCP), all of which are commonly used for Voice over Internet Protocol (VoIP). Establishing communication between an audio-visual communication device and the relay server 10 involves exchanging many messages or commands. The communication request command is the one that the audio-visual communication device issues on its own initiative, asking the relay server 10 to establish unique voice communication with the voice communication device 40.
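The packaging done in step S03 can be illustrated with a minimal sketch. It does not implement H.323, SIP or MGCP; it only shows the general idea of cutting a voice recording into ordered packets and bundling them with a request. The names `VoicePacket`, `CommunicationRequest` and `packetize`, and the default frame size of 320 bytes (20 ms of 8 kHz, 16-bit mono audio), are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class VoicePacket:
    device_id: str
    seq: int
    payload: bytes


@dataclass(frozen=True)
class CommunicationRequest:
    device_id: str                    # which audio-visual device is asking to speak
    packets: Tuple[VoicePacket, ...]  # the first voice, cut into ordered packets


def packetize(device_id: str, pcm: bytes, frame_bytes: int = 320) -> CommunicationRequest:
    """Split raw audio into fixed-size frames and attach them to a request."""
    if frame_bytes <= 0:
        raise ValueError("frame_bytes must be positive")
    packets = tuple(
        VoicePacket(device_id, seq, pcm[offset:offset + frame_bytes])
        for seq, offset in enumerate(range(0, len(pcm), frame_bytes))
    )
    return CommunicationRequest(device_id, packets)


request = packetize("31", bytes(700))
frame_sizes = [len(p.payload) for p in request.packets]
```

With 700 bytes of input the sketch produces two full frames and one short final frame; concatenating the payloads in sequence order restores the original audio.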

Next, the fourth step of the method is the unique voice communication execution step: the relay server 10 sends the first voice packets to a voice communication device 40 that is in information connection with it, wherein, if two or more audio-visual communication devices issue the communication request command, the first voice packets sent by the audio-visual communication device that issued the communication request first are sent to the voice communication device, and the first voice packets sent by the other audio-visual communication devices are discarded (S04). The voice communication device 40 is hardware held by the first communicator; it is separate from the surround view video and audio recording device 20 and is used to communicate with the second communicators. According to the present invention, the voice communication device 40 is preferably a smartphone or a tablet computer. For a better understanding of this step, refer again to FIG. 3. At time T3, the second communicator holding the first audio-visual communication device 31 wants to speak to the first communicator, so he operates the first audio-visual communication device 31 to convert what he says into first voice packet A, which is sent to the relay server 10 together with a communication request command. If the second communicator holding the second audio-visual communication device 32 did not want to speak to the first communicator, first voice packet A would simply be forwarded by the relay server 10 to the voice communication device 40. As it happens, however, the second communicator holding the second audio-visual communication device 32 also wants to speak to the first communicator. At time T4 he operates the second audio-visual communication device 32 to convert what he says into first voice packet B, which is likewise sent to the relay server 10 together with a communication request command. Because the second audio-visual communication device 32 issued its communication request command later, the relay server 10 sends first voice packet A to the voice communication device 40 at time T5, and first voice packet B is discarded at the relay server 10.
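The first-come rule of step S04 can be sketched as a small arbiter on the relay server. This is a minimal sketch under stated assumptions, not the patented implementation: the class name `RelayArbiter`, its methods, and in particular the `release` method are hypothetical, since the description does not say how or when the relay server stops favoring the first requester.

```python
from typing import List, Optional


class RelayArbiter:
    """Forwards first voice packets from the earliest requester and discards the rest."""

    def __init__(self) -> None:
        self.floor_holder: Optional[str] = None  # device whose request arrived first
        self.forwarded: List[bytes] = []         # packets passed on to the voice communication device
        self.discarded: List[bytes] = []         # packets dropped at the relay server

    def on_request(self, sender_id: str, packets: List[bytes]) -> bool:
        """Handle a communication request; return True if its packets were forwarded."""
        if self.floor_holder is None:
            self.floor_holder = sender_id        # the first request wins
        if sender_id == self.floor_holder:
            self.forwarded.extend(packets)
            return True
        self.discarded.extend(packets)           # later requesters are discarded
        return False

    def release(self) -> None:
        """Assumed reset so that a later talk burst can be arbitrated afresh."""
        self.floor_holder = None


# Timeline from FIG. 3: device 31 requests at T3, device 32 at T4.
arbiter = RelayArbiter()
arbiter.on_request("device-31", [b"A"])
arbiter.on_request("device-32", [b"B"])
```

After these two calls, `arbiter.forwarded` holds packet A and `arbiter.discarded` holds packet B, which matches the outcome described for time T5.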

Next, the fifth step of the method is the first voice playing step: the voice communication device 40 plays the first voice corresponding to the first voice packets sent by the relay server 10 (S05). At this point the first communicator can hear, through the voice communication device 40, the speech coming from the first audio-visual communication device 31.

If the first communicator wants to speak to one particular second communicator without the other second communicators hearing, the method provides further steps for that purpose. The following specific-target communication steps may be interleaved at any point after step S02.

The first specific-target communication step of the method is the specific-target voice communication request step: the voice communication device 40 converts a second voice into a plurality of second voice packets conforming to the voice protocol, and the second voice packets are sent to the relay server 10 together with a designated communication request command, wherein the designated communication request command specifies that the second voice is to be relayed to a designated audio-visual communication device (S06). The second voice is what the first communicator says, and the second voice packets are generated according to the voice protocol described above. The designated communication request command is a command signal issued by operating the voice communication device 40, for example through a mobile application installed on it, that tells the relay server 10 where the second voice is to be relayed. In this embodiment the designated audio-visual communication device is the first audio-visual communication device 31. As shown in FIG. 3, at time T6 the first communicator operates the voice communication device 40 to send the designated communication request command and the second voice packets to the relay server 10.

Next, the second specific-target communication step of the method is the specific-target voice communication execution step: the relay server 10 sends the second voice packets to the designated audio-visual communication device specified by the designated communication request command (S07). As shown in FIG. 3, at time T7 the relay server 10 sends the second voice packets to the first audio-visual communication device 31, delivering the first communicator's words to the second communicator holding the first audio-visual communication device 31.

Finally, the third specific-target communication step of the method is the second voice playing step: the designated audio-visual communication device plays the second voice corresponding to the second voice packets sent by the relay server 10 (S08). At this point the second communicator can hear, through the first audio-visual communication device 31, what the first communicator said.
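Steps S06 to S08 amount to routing the second voice packets to exactly one device. The sketch below illustrates that routing with in-memory inboxes; the class `DesignatedRelay`, its method names, and the device identifiers are hypothetical, and real delivery would go over the network connections the relay server already holds.

```python
from typing import Dict, Iterable, List


class DesignatedRelay:
    """Delivers second voice packets only to the device named in the request."""

    def __init__(self, device_ids: Iterable[str]) -> None:
        # One inbox per connected audio-visual communication device.
        self.inboxes: Dict[str, List[bytes]] = {d: [] for d in device_ids}

    def on_designated_request(self, target_id: str, packets: List[bytes]) -> None:
        """Handle a designated communication request command (S06 and S07)."""
        if target_id not in self.inboxes:
            raise KeyError(f"unknown audio-visual communication device: {target_id}")
        self.inboxes[target_id].extend(packets)  # no other device receives a copy


# Timeline from FIG. 3: at T6 the first communicator designates device 31.
relay = DesignatedRelay(["device-31", "device-32"])
relay.on_designated_request("device-31", [b"second-voice-1", b"second-voice-2"])
```

Here the inbox of `device-31` receives both packets while the inbox of `device-32` stays empty, which is the privacy property the specific-target steps are meant to provide.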

Refer to FIG. 4, which shows one application scenario of the method: a remote real-time tour guide. The first communicator is a tour guide A, who is in Bali introducing scenic spots while carrying the surround view video and audio recording device 20 and the voice communication device 40 (for example, a smartphone). The second communicator using the first audio-visual communication device 31 is cloud tourist B in Taipei, who watches tour guide A's live broadcast remotely; his first audio-visual communication device 31 is a tablet computer. The second communicator using the second audio-visual communication device 32 is cloud tourist C in Kaohsiung, who can watch tour guide A's live broadcast remotely at the same time; his second audio-visual communication device 32 is a notebook computer. The surround view video and audio recording device 20, the voice communication device 40, the first audio-visual communication device 31, and the second audio-visual communication device 32 are in signal connection with the relay server 10 through the network 1 and carry out the method. Refer to FIG. 5, which shows the display interface 41 of the voice communication device 40. If tour guide A wants to speak to cloud tourist B, he can operate the display interface 41, for example by pressing and holding the image labeled as cloud tourist B (the one framed at the upper left) while speaking into the voice communication device 40. His words are then sent to the relay server 10 together with the designated communication request command and finally delivered to the first audio-visual communication device 31; the second audio-visual communication device 32 does not receive them.

According to the present invention, the latency-reduced one-to-many communication method based on surround view video and audio described above is implemented across different devices, mainly by programs installed on the individual pieces of hardware. The present invention therefore also discloses a computer program product set comprising a first computer program product, a second computer program product, and a third computer program product.

The first computer program product can be loaded by the relay server 10 to execute the following program instructions: a 1st program instruction: receiving, from a surround view video and audio recording device, a plurality of first surround view video and audio stream packets encoded with a live streaming protocol; a 2nd program instruction: converting the first surround view video and audio stream packets, in order, into a plurality of second surround view video and audio stream packets according to the Web Real-Time Communication (WebRTC) application programming interface; a 3rd program instruction: sending the second surround view video and audio stream packets to a plurality of audio-visual communication devices in information connection with the relay server; a 4th program instruction: receiving a plurality of first voice packets and a communication request command sent from at least one audio-visual communication device; and a 5th program instruction: sending the first voice packets to a voice communication device in information connection with the relay server, wherein, if two or more audio-visual communication devices issue the communication request command, the first voice packets sent by the audio-visual communication device that issued the communication request first are sent to the voice communication device, and the first voice packets sent by the other audio-visual communication devices are discarded.
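The 1st to 3rd program instructions describe an ordered convert-and-fan-out loop. The sketch below shows only that control flow. The `convert` argument is a placeholder supplied by the caller; it does not perform any real RTMP-to-WebRTC repackaging, and the function name `relay_stream` and the device identifiers are assumptions made for illustration.

```python
from typing import Callable, Dict, Iterable, List


def relay_stream(
    first_packets: Iterable[bytes],
    convert: Callable[[bytes], bytes],
    device_ids: List[str],
) -> Dict[str, List[bytes]]:
    """Convert each first stream packet in arrival order and copy it to every device."""
    outboxes: Dict[str, List[bytes]] = {d: [] for d in device_ids}
    for packet in first_packets:          # 1st instruction: packets as received
        second = convert(packet)          # 2nd instruction: in-order conversion (placeholder)
        for device in device_ids:         # 3rd instruction: send to all connected devices
            outboxes[device].append(second)
    return outboxes


def tag_as_second(packet: bytes) -> bytes:
    """Stand-in converter that only marks a packet as converted."""
    return b"second:" + packet


outboxes = relay_stream([b"p1", b"p2"], tag_as_second, ["device-31", "device-32"])
```

Converting once per packet and then copying the result keeps the per-viewer cost low, which is one plausible way a relay could serve many viewers; the description itself does not state how the conversion is shared among devices.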

The second computer program product can be loaded by any of the audio-visual communication devices to execute the following program instructions: a 6th program instruction: playing, on the audio-visual communication device, a continuous surround view video and audio corresponding to the second surround view video and audio stream packets; a 7th program instruction: converting, by each of at least one audio-visual communication device, a first voice into a plurality of first voice packets conforming to a voice protocol; and an 8th program instruction: sending the first voice packets to the relay server together with the communication request command. The second computer program product may be a mobile application presented on the audio-visual communication device through a dedicated interface that accepts user operations.

The third computer program product can be loaded by the voice communication device 40 to execute the following program instruction: a 9th program instruction: playing the first voice corresponding to the first voice packets sent by the relay server 10. The third computer program product may also be a mobile application presented on the voice communication device 40 through a dedicated interface that accepts user operations.

To carry out voice transmission from the voice communication device 40 to an audio-visual communication device, the third computer program product may further execute: a 10th program instruction: converting a second voice into a plurality of second voice packets conforming to the voice protocol; and an 11th program instruction: sending the second voice packets to the relay server 10 together with a designated communication request command, wherein the designated communication request command specifies that the second voice is to be relayed to a designated audio-visual communication device. Correspondingly, the first computer program product further executes: a 12th program instruction: sending the second voice packets to the designated audio-visual communication device specified by the designated communication request command. Finally, the second computer program product further executes: a 13th program instruction: playing, by the designated audio-visual communication device, the second voice corresponding to the second voice packets sent by the relay server 10.

Likewise, the surround view video and audio recording device 20 may further mix a fixed text, a scrolling marquee text, an image, and/or a background sound into at least one time period of the continuous surround view video and audio. This function, however, requires access to the operating interface or source code developed by the manufacturer of the surround view video and audio recording device 20 so that the relevant functions can be added to the device. The live streaming protocol may be the Real-Time Messaging Protocol (RTMP) or the encrypted Real-Time Messaging Protocol (RTMP-S). The voice protocol may be the H.323 specification, the Session Initiation Protocol, or the Media Gateway Control Protocol. The audio-visual communication device may be a smartphone, a tablet computer, a notebook computer, a desktop computer, smart glasses with an audio playback function, or a head-mounted stereoscopic video player with an audio playback function. The voice communication device may be a smartphone or a tablet computer.

Although the present invention has been disclosed above by way of embodiments, they are not intended to limit the present invention. Anyone with ordinary knowledge in the relevant technical field may make minor changes and refinements without departing from the spirit and scope of the present invention, so the scope of protection of the present invention shall be as defined by the appended claims.

Claims (14)

1. A latency-reduced one-to-many communication method based on surround view video and audio, comprising: a video and audio data conversion and sending step: receiving, by a relay server, a plurality of first surround view video and audio stream packets encoded with a live streaming protocol from a surround view video and audio recording device, converting the first surround view video and audio stream packets, in order, into a plurality of second surround view video and audio stream packets according to the Web Real-Time Communication (WebRTC) application programming interface, and sending the second surround view video and audio stream packets to a plurality of audio-visual communication devices in information connection with the relay server; a video and audio playback step: playing, on each of the audio-visual communication devices, a continuous surround view video and audio corresponding to the second surround view video and audio stream packets; a unique voice communication request step: converting, by each of at least one audio-visual communication device, a first voice into a plurality of first voice packets conforming to a voice protocol, the first voice packets being sent to the relay server together with a communication request command; a unique voice communication execution step: sending, by the relay server, the first voice packets to a voice communication device in information connection with the relay server, wherein, if two or more audio-visual communication devices issue the communication request command, the first voice packets sent by the audio-visual communication device that issued the communication request first are sent to the voice communication device, and the first voice packets sent by the other audio-visual communication devices are discarded; a first voice playing step: playing, by the voice communication device, the first voice corresponding to the first voice packets sent by the relay server; a specific-target voice communication request step: converting, by the voice communication device, a second voice into a plurality of second voice packets conforming to the voice protocol, the second voice packets being sent to the relay server together with a designated communication request command, wherein the designated communication request command specifies that the second voice is to be relayed to a designated audio-visual communication device; a specific-target voice communication execution step: sending, by the relay server, the second voice packets to the designated audio-visual communication device specified by the designated communication request command; and a second voice playing step: playing, by the designated audio-visual communication device, the second voice corresponding to the second voice packets sent by the relay server.

2. The latency-reduced one-to-many communication method based on surround view video and audio of claim 1, wherein the surround view video and audio recording device further mixes a fixed text, a scrolling marquee text, an image, and/or a background sound into at least one time period of the continuous surround view video and audio.

3. The latency-reduced one-to-many communication method based on surround view video and audio of claim 1, wherein the live streaming protocol is the Real-Time Messaging Protocol (RTMP) or the encrypted Real-Time Messaging Protocol (RTMP-S).

4. The latency-reduced one-to-many communication method based on surround view video and audio of claim 1, wherein the voice protocol is the H.323 specification, the Session Initiation Protocol (SIP), or the Media Gateway Control Protocol (MGCP).

5. The latency-reduced one-to-many communication method based on surround view video and audio of claim 1, wherein the audio-visual communication device is a smartphone, a tablet computer, a notebook computer, a desktop computer, smart glasses with an audio playback function, or a head-mounted stereoscopic video player with an audio playback function.

6. The latency-reduced one-to-many communication method based on surround view video and audio of claim 1, wherein the voice communication device is a smartphone or a tablet computer.

7. A computer program product set for latency-reduced one-to-many communication based on surround view video and audio, comprising: a first computer program product which, when loaded by a relay server, executes: a 1st program instruction: receiving, from a surround view video and audio recording device, a plurality of first surround view video and audio stream packets encoded with a live streaming protocol; a 2nd program instruction: converting the first surround view video and audio stream packets, in order, into a plurality of second surround view video and audio stream packets according to the Web Real-Time Communication application programming interface; a 3rd program instruction: sending the second surround view video and audio stream packets to a plurality of audio-visual communication devices in information connection with the relay server; a 4th program instruction: receiving a plurality of first voice packets and a communication request command sent from at least one audio-visual communication device; and a 5th program instruction: sending the first voice packets to a voice communication device in information connection with the relay server, wherein, if two or more audio-visual communication devices issue the communication request command, the first voice packets sent by the audio-visual communication device that issued the communication request first are sent to the voice communication device, and the first voice packets sent by the other audio-visual communication devices are discarded; a second computer program product which, when loaded by any of the audio-visual communication devices, executes: a 6th program instruction: playing, on the audio-visual communication device, a continuous surround view video and audio corresponding to the second surround view video and audio stream packets; a 7th program instruction: converting, by each of at least one audio-visual communication device, a first voice into a plurality of first voice packets conforming to a voice protocol; and an 8th program instruction: sending the first voice packets to the relay server together with the communication request command; and a third computer program product which, when loaded by the voice communication device, executes: a 9th program instruction: playing the first voice corresponding to the first voice packets sent by the relay server; a 10th program instruction: converting a second voice into a plurality of second voice packets conforming to the voice protocol; and an 11th program instruction: sending the second voice packets to the relay server together with a designated communication request command, wherein the designated communication request command specifies that the second voice is to be relayed to a designated audio-visual communication device.

8. The computer program product set for latency-reduced one-to-many communication based on surround view video and audio of claim 7, wherein the first computer program product further executes: a 12th program instruction: sending the second voice packets to the designated audio-visual communication device specified by the designated communication request command.

9. The computer program product set for latency-reduced one-to-many communication based on surround view video and audio of claim 8, wherein the second computer program product further executes: a 13th program instruction: playing, by the designated audio-visual communication device, the second voice corresponding to the second voice packets sent by the relay server.

10. The computer program product set for latency-reduced one-to-many communication based on surround view video and audio of claim 7, wherein the surround view video and audio recording device further mixes a fixed text, a scrolling marquee text, an image, and/or a background sound into at least one time period of the continuous surround view video and audio.

11. The computer program product set for latency-reduced one-to-many communication based on surround view video and audio of claim 7, wherein the live streaming protocol is the Real-Time Messaging Protocol or the encrypted Real-Time Messaging Protocol.

12. The computer program product set for latency-reduced one-to-many communication based on surround view video and audio of claim 7, wherein the voice protocol is the H.323 specification, the Session Initiation Protocol, or the Media Gateway Control Protocol.

13. The computer program product set for latency-reduced one-to-many communication based on surround view video and audio of claim 7, wherein the audio-visual communication device is a smartphone, a tablet computer, a notebook computer, a desktop computer, smart glasses with an audio playback function, or a head-mounted stereoscopic video player with an audio playback function.

14. The computer program product set for latency-reduced one-to-many communication based on surround view video and audio of claim 7, wherein the voice communication device is a smartphone or a tablet computer.
TW111142416A 2022-11-07 2022-11-07 Method for achieving latency-reduced one-to-many communication based on surrounding video and associated computer program product set TWI811148B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
TW111142416A TWI811148B (en) 2022-11-07 2022-11-07 Method for achieving latency-reduced one-to-many communication based on surrounding video and associated computer program product set

Publications (2)

Publication Number Publication Date
TWI811148B true TWI811148B (en) 2023-08-01
TW202420790A TW202420790A (en) 2024-05-16

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6940826B1 (en) * 1999-12-30 2005-09-06 Nortel Networks Limited Apparatus and method for packet-based media communications
US20060007871A1 (en) * 2000-03-22 2006-01-12 Welin Andrew M Systems, processes and integrated circuits for improved packet scheduling of media over packet
US20110195739A1 (en) * 2010-02-10 2011-08-11 Harris Corporation Communication device with a speech-to-text conversion function
US20180247550A1 (en) * 2015-11-19 2018-08-30 Shenzhen Eaglesoul Technology Co., Ltd. Image synchronous display method and device

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Online document "NVIDIA Jetson Orin - GstWebRTC GStreamer plug-in for WebRTC", RidgeRun Embedded Linux Developer Connection, 26 Aug. 2022. [https://developer.ridgerun.com/wiki/index.php?title=NVIDIA_Jetson_Orin/RidgeRun_Products/GstWebRTC&oldid=42996]
Online document "Panoramic Stitching and WebRTC Streaming on NVIDIA Jetson", RidgeRun Embedded Linux Developer Connection, 31 Aug. 2022. [https://developer.ridgerun.com/wiki/index.php?title=Panoramic_Stitching_and_WebRTC_Streaming_on_NVIDIA_Jetson&oldid=43056] *
