TW202205135A - Intelligent audiovisual combination system and implementation method thereof - Google Patents

Intelligent audiovisual combination system and implementation method thereof

Info

Publication number
TW202205135A
Authority
TW
Taiwan
Prior art keywords
message
audio
processing server
customer service
message processing
Prior art date
Application number
TW109125866A
Other languages
Chinese (zh)
Inventor
莊連豪
Original Assignee
莊連豪
Priority date
Filing date
Publication date
Application filed by 莊連豪
Priority to TW109125866A
Publication of TW202205135A

Landscapes

  • Information Transfer Between Computers (AREA)
  • Telephonic Communication Services (AREA)

Abstract

An intelligent audiovisual combination system and an implementation method thereof are disclosed. The system mainly comprises a message processing server and at least one user communication device, and the message processing server connects to the communication device of each user over the Internet. After the message processing server receives a first electronic message sent by a user communication device together with a situation setting parameter set by the user, it produces, based on the situation setting parameter, a second electronic message with automatically polished text and associated audiovisual customer service information. The communication device of the other party can then play the second electronic message and the audiovisual customer service information. Furthermore, the audiovisual customer service information can play a customer service animation that matches the situation selected by the user, providing a vivid electronic message.

Description

Intelligent audio-visual fusion system and implementation method thereof

The present invention applies to instant messaging technology. A message processing server captures an electronic message entered by a user, re-edits the wording of the original electronic message according to the context selected by the user, and displays the result as a virtual customer service animation on the user communication device of the other party.

Instant messaging technology is now widespread. By installing instant messaging software on a smartphone, people can exchange text messages and voice messages and can also make real-time calls. In practice, however, the user must enter each message personally, and in some situations the user cannot give full attention to doing so. When driving, for example, the user has to concentrate on the road and may reply with only a few simple words. A user on the way home who is delayed by a traffic jam may, constrained by driving, be able to reply only "almost home". Such a short message does not fully convey the user's actual situation, and the recipient cannot understand what the sender really means. This is a problem that existing instant messaging software has yet to solve. In addition, some users feel that an overly flat speaking style in voice communication may bore the other party. These users want to change their voice to give the other party a fresh impression and enliven the conversation, which is a need that existing instant messaging software has yet to meet.

In view of the above problems and needs, the inventor, drawing on years of experience in the related industry, has studied and improved instant messaging systems and communication methods. The main purpose of the present invention is therefore to provide an intelligent audio-visual fusion system, and an implementation method thereof, that can proactively modify message content and integrate the message into audio-visual content for the reply.

To achieve the above purpose, the intelligent audio-visual fusion system and implementation method of the present invention mainly have a message processing server and at least one user communication device, and the message processing server connects to each user communication device over the Internet. After the message processing server receives a first electronic message that a user communication device intends to send, together with a context setting parameter set by the user, it generates, according to the context setting parameter, a second electronic message with automatically polished text and audio-visual customer service information corresponding to the context. The user communication device of the other party can play the second electronic message and the audio-visual customer service information. Furthermore, the audio-visual customer service information can play a customer service animation that matches the context selected by the user, providing a vivid electronic message. If the user cannot enter a complete sentence, the message processing server can re-edit the message so that the recipient receives a relatively complete sentence, which helps the recipient understand what the sender means. By selecting a dialogue context, the user can also change the tone of the voice and have the message displayed as an audio-visual animation, making the conversation more engaging.

To enable the examiners to clearly understand the purpose, technical features, and effects of the present invention, the following description is provided with reference to the drawings.

Please refer to FIG. 1, a first schematic diagram of the composition of the present invention. As shown, the intelligent audio-visual fusion system 10 of the present invention mainly has a message processing server 101, which is connected to at least one user communication device 102. The message processing server 101 connects to each user communication device 102 over the Internet and relays the electronic messages exchanged between the user communication devices 102. The message processing server 101 can also capture the electronic messages that a user communication device 102 receives and sends, compare and analyze them further, and then re-edit them and convert them into audio-visual form. The user communication device 102 is used to input, display, and play electronic information, and may be a smartphone, a computer, or another device on which applications can be installed.

Please refer to FIG. 2, a second schematic diagram of the composition of the present invention. As shown, the message processing server 101 mainly has a central processing module 1011, to which a message management module 1012, a message analysis module 1013, a context setting module 1014, and a storage module 1015 are connected, wherein:
(1) The central processing module 1011 runs the message processing server 101 and drives the modules above. It provides functions such as logical operations, temporary storage of operation results, and storage of the position of the executing instruction, and it may be a central processing unit (CPU) or a microcontroller unit (MCU).
(2) The message management module 1012 manages the electronic messages received and sent by each user communication device 102, and can send the audio-visual customer service information edited by the message processing server 101 to the user communication device 102 of either party (the sending end or the receiving end).
(3) The message analysis module 1013 captures the electronic messages of both parties and analyzes them to retrieve a communication relationship parameter associated with the first electronic message; the communication relationship parameter indicates the relationship between the two parties. The message analysis module 1013 can also perform a word segmentation procedure on the first electronic message, analyze its sentence structure to produce message structure information, and then edit and polish the message structure information according to the set context to produce a second electronic message. The message analysis module 1013 is based on natural language processing (NLP) technology, for example the CKIP Chinese word segmentation system or the Jieba word segmentation system, but is not limited to these; further NLP processing types can be added according to the natural language the user uses.
(4) The context setting module 1014 receives a context setting parameter set by the user on the user communication device 102 and sets the associated audio-visual customer service information according to that parameter.
(5) The storage module 1015 includes a message database 10151, a relationship database 10152, and a context database 10153. The message database 10151 stores the electronic messages received and sent by each user communication device. The relationship database 10152 stores at least one piece of communication account information, each of which includes a communication identification code and a communication relationship parameter. The communication identification code identifies the communication account information, and the communication relationship parameter records relationship data for other users associated with that account; it includes title word information and greeting word information. The context database 10153 stores at least one context comparison parameter and the audio-visual customer service information associated with it. The context comparison parameters include at least a family context comparison parameter, a love context comparison parameter, and a friend context comparison parameter, and more can be added as needed. The audio-visual customer service information includes at least one audio-visual customer service animation in a particular context style, and each piece of audio-visual customer service information corresponds to a different context comparison parameter. Each audio-visual customer service animation includes an audio-visual customer service image and an audio-visual customer service audio track.
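The relationship database 10152 and the context database 10153 described above can be pictured as two keyed lookups. The following is a minimal sketch under stated assumptions: the class names, field names, sample identifiers, and file names are all hypothetical and are not taken from the patent, which does not specify a storage format.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class RelationshipParameter:
    """Communication relationship parameter: title word and greeting word."""
    title: str
    greeting: str


@dataclass(frozen=True)
class CustomerServiceAnimation:
    """Audio-visual customer service animation: one image asset, one audio asset."""
    video: str
    audio: str


# Hypothetical relationship database: communication ID -> relationship parameter.
RELATIONSHIP_DB: Dict[str, RelationshipParameter] = {
    "acct-0001": RelationshipParameter(title="Mrs. Wu", greeting="please wait and do not worry"),
}

# Hypothetical context database: context comparison parameter -> animation.
CONTEXT_DB: Dict[str, CustomerServiceAnimation] = {
    "family": CustomerServiceAnimation(video="family.mp4", audio="family.wav"),
    "love": CustomerServiceAnimation(video="love.mp4", audio="love.wav"),
    "friend": CustomerServiceAnimation(video="friend.mp4", audio="friend.wav"),
}


def lookup_relationship(communication_id: str) -> Optional[RelationshipParameter]:
    """Find the relationship parameter for a communication ID, if one is stored."""
    return RELATIONSHIP_DB.get(communication_id)


def select_animation(context_setting: str) -> Optional[CustomerServiceAnimation]:
    """Match a context setting parameter against the stored context comparison parameters."""
    return CONTEXT_DB.get(context_setting)
```

Returning `None` for an unknown key is a design choice of this sketch; the patent does not say what happens when no match is found.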

Please refer to FIG. 3, a third schematic diagram of the composition of the present invention. As shown, the user communication device 102 of the present invention mainly has a microprocessor module 1021, to which a message input module 1022, an instant messaging module 1023, and a message playing module 1024 are connected, wherein:
(1) The microprocessor module 1021 drives the modules above. It provides functions such as logical operations, temporary storage of operation results, and storage of the position of the executing instruction, and it may be a central processing unit (CPU) or a microcontroller unit (MCU).
(2) The message input module 1022 lets the user enter an electronic message, which may be a text message or a voice message. The message input module 1022 may be a virtual keyboard, a microphone, or a combination of the two.
(3) The instant messaging module 1023 runs a communication application so that the communication device 102 and other communication devices can send messages to each other. The instant messaging module 1023 also includes an audio-visual customer service unit 10231, through which the user sets a context setting parameter and enters electronic messages. The context setting parameter is associated with the audio-visual customer service information, and each context setting parameter corresponds to the audio-visual customer service of a specified context.
(4) The message playing module 1024 plays the second electronic message and the corresponding audio-visual customer service information sent by the message processing server 101, so that the audio-visual customer service unit 10231 can play the audio-visual customer service information and the second electronic message at the same time through the message playing module 1024. The message playing module 1024 is a combination of a speaker and a display screen.

Please refer to FIG. 4, a flow chart of the implementation of the present invention, together with FIG. 1 to FIG. 3. While running, the message processing server 101 can detect and capture the electronic messages sent and received by the user communication device 102, and can further edit them and convert them into audio-visual form. The steps of the intelligent message reply method of the present invention are as follows:
(1) Step S1, confirming the relationship between the communicating parties: Please refer to FIG. 5, a first schematic diagram of the implementation. While running, the message processing server 101 stays connected to the user communication device 102 and another user communication device 102', and the message management module 1012 of the message processing server 101 continuously detects and receives the electronic messages of the user communication device 102. As shown, when the other user communication device 102', acting as the sending end, sends a sender message D1 to the user communication device 102, the message management module 1012 captures the sender message D1 and stores it in the message database 10151. After capturing the sender message D1, the message management module 1012 first confirms a communication identification code of the sender message D1 and searches a relationship database (not shown in this figure) for the communication relationship parameter associated with that code.
(2) Step S2, setting the communication context: Please refer to FIG. 6, a second schematic diagram of the implementation. After receiving the sender message D1 on the user communication device 102, the user can start the audio-visual customer service unit 10231 on the device and use it to set a context setting parameter, selecting a context that suits the other party. For example, when the other party is the user's wife, the user can select the family context. Once the context setting parameter is set, the user enters a first electronic message through the audio-visual customer service unit 10231, and the user communication device 102 sends the first electronic message and the context setting parameter together to the message processing server 101.
(3) Step S3, analyzing the first electronic message: Please refer to FIG. 7, a third schematic diagram of the implementation. After the message processing server 101 receives the first electronic message and the context setting parameter, its message analysis module 1013 performs a word segmentation procedure on the first electronic message to obtain at least one piece of word segmentation information. For example, if the first electronic message is 「快到家了,塞車在路上」 ("almost home, stuck in traffic on the road"), the segmentation result is 「快」 ("almost"), 「到家了」 ("home"), 「塞車」 ("traffic jam"), and 「在路上」 ("on the road"). After the word segmentation procedure is complete, the message processing server 101 generates at least one piece of message structure information.
(4) Step S4, generating the second electronic message: After the message analysis module 1013 has obtained the communication relationship parameter and the message structure information, it edits the message structure information based on the communication relationship parameter by adding the title word information and greeting word information of the communication relationship parameter. For example: "Mrs. Wu (title), hello (honorific). Mr. Wu (title) will be home soon and is stuck in traffic on the road; please wait and do not worry (greeting)." When editing is complete, the message analysis module 1013 generates a second electronic message.
(5) Step S5, setting the audio-visual customer service information: After the message analysis module 1013 generates the second electronic message, the context setting module 1014 queries the context database 10153 for the context comparison parameter that corresponds to the context setting parameter and retrieves the audio-visual customer service information of the matched context comparison parameter. For example, a work context comparison parameter can be associated with work audio-visual customer service information, which provides an audio-visual customer service animation in a serious style.
(6) Step S6, publishing the audio-visual customer service information: Please refer to FIG. 8, a fourth schematic diagram of the implementation. As shown, once the message processing server 101 has finished setting the second electronic message and the audio-visual customer service information, it sends the second electronic message D2 and the audio-visual customer service information D3 back to the other user communication device 102'. After receiving them, the message playing module (not shown in the figure) of the other user communication device 102' can play the second electronic message D2 and the audio-visual customer service information D3 at the same time.
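Steps S3 and S4 above can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: forward maximum matching over a toy vocabulary stands in for a real segmenter such as CKIP or Jieba, and the `RelationshipParameter` fields, the function names, and the way titles and greetings are attached are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Set

# Hypothetical toy vocabulary; a real system would rely on a full segmenter.
VOCAB: Set[str] = {"快", "到家了", "塞車", "在路上"}
CLAUSE_BREAKS = ",,。."


@dataclass(frozen=True)
class RelationshipParameter:
    recipient_title: str  # title word for the recipient (assumed field)
    sender_title: str     # title word for the sender (assumed field)
    honorific: str        # polite opening (assumed field)
    greeting: str         # closing greeting word (assumed field)


def segment_clause(clause: str, vocab: Set[str] = VOCAB) -> List[str]:
    """Forward maximum matching: take the longest vocabulary word at each position."""
    max_len = max(len(word) for word in vocab)
    tokens: List[str] = []
    i = 0
    while i < len(clause):
        for size in range(min(max_len, len(clause) - i), 0, -1):
            piece = clause[i:i + size]
            # Fall back to a single character when no vocabulary word matches.
            if piece in vocab or size == 1:
                tokens.append(piece)
                i += size
                break
    return tokens


def analyze(first_message: str) -> List[List[str]]:
    """Step S3 sketch: split into clauses, then segment each clause."""
    for mark in CLAUSE_BREAKS:
        first_message = first_message.replace(mark, "\n")
    return [segment_clause(c.strip()) for c in first_message.split("\n") if c.strip()]


def compose(structure: List[List[str]], rel: RelationshipParameter) -> str:
    """Step S4 sketch: wrap the segmented clauses with title, honorific, and greeting."""
    clauses = ["".join(tokens) for tokens in structure]
    if clauses:
        clauses[0] = rel.sender_title + clauses[0]
    return ",".join([rel.recipient_title + rel.honorific, *clauses, rel.greeting])


rel = RelationshipParameter("吳太太", "吳先生", "您好", "請放心等候")
structure = analyze("快到家了,塞車在路上")
second_message = compose(structure, rel)
```

With the sample input, `structure` holds two clauses of two tokens each, and `second_message` follows the same title, honorific, body, greeting pattern as the example in step S4. The sketch only wraps the original clauses; it does not rephrase them as the example in the text does.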

Please refer to FIG. 9, which shows another embodiment (1) of the present invention. As shown, the user communication device 102 further includes a speech recognition module 1025, which is connected to the microprocessor module 1021 and can convert a voice message into a text message. In use, when the first electronic message entered by the user is a voice message, the speech recognition module 1025 converts it into a text message, which can be achieved through speech-to-text (STT) recognition and semantic analysis technology. Accordingly, after the speech recognition module 1025 captures the first electronic message (voice), it converts the first electronic message into a text message, and the user communication device 102 sends the first electronic message and the context setting parameter together to the message processing server 101 (not shown in this figure).

Please refer to FIG. 10, which shows another embodiment (2) of the present invention. As shown, the message processing server 101 further includes a voice setting module 1016, which is connected to the central processing module 1011 and stores at least one piece of voice setting information. The voice setting module 1016 can select matching voice setting information for the audio-visual customer service information of each context. In step S5, after the context setting module 1014 sets the audio-visual customer service information, the voice setting module 1016 retrieves the corresponding voice setting information. In step S6, the message processing server 101 sends the second electronic message, the audio-visual customer service information, and the voice setting information together to the other user communication device 102', whose message playing module then plays them along the playback timeline of the voice setting information and the audio-visual customer service information. The voice setting information can be prepared per context by pre-recording voice files in different tones to suit various usage contexts.
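The pairing of voice setting information with the selected audio-visual customer service information, and the bundle sent in step S6, might look like the sketch below. The `Payload` shape, the `VOICE_SETTINGS` table, and all file names are assumptions made for illustration; the patent does not define a transport format.

```python
from typing import Dict, Optional, TypedDict


class Payload(TypedDict):
    """Hypothetical bundle sent to the other user communication device in step S6."""
    second_message: str
    animation: str
    voice_setting: Optional[str]


# Hypothetical table: animation asset -> pre-recorded voice file in a matching tone.
VOICE_SETTINGS: Dict[str, str] = {
    "family.mp4": "warm_tone.wav",
    "work.mp4": "formal_tone.wav",
}


def build_payload(second_message: str, animation: str,
                  voice_settings: Dict[str, str] = VOICE_SETTINGS) -> Payload:
    """Attach the voice setting that matches the selected animation, if any."""
    return {
        "second_message": second_message,
        "animation": animation,
        "voice_setting": voice_settings.get(animation),
    }
```

In this sketch an animation without a matching voice file simply yields `None`, leaving the receiving device to fall back to its default playback.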

Please refer to FIG. 11, which shows another embodiment (3) of the present invention. As shown, the instant messaging module 1023 of the user communication device 102 further includes a voice synchronization unit 10232, through which the user can set a simulated voice request. After the user has set the context setting parameter through the audio-visual customer service unit 10231 and finished entering the first electronic message, the user communication device 102 sends the simulated voice request and the first electronic message to the message processing server 101. In steps S3 to S4, the message analysis module 1013 can then skip the word segmentation procedure and the addition of the communication relationship parameter, keeping the first electronic message intact on the basis of the simulated voice request. After the other user communication device 102' receives the first reply message and the audio-visual customer service information, it can play the other party's voice message rendered in a different tone, together with the selected audio-visual customer service animation.
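The bypass described in this embodiment amounts to a single branch in the server's handling of the first electronic message. A minimal sketch, assuming a hypothetical `rewrite` callable that stands in for steps S3 and S4:

```python
from typing import Callable


def handle_first_message(first_message: str,
                         simulated_voice_request: bool,
                         rewrite: Callable[[str], str]) -> str:
    """Keep the message intact when a simulated voice request is set; otherwise rewrite it."""
    if simulated_voice_request:
        # Skip word segmentation and relationship parameters; the text is left unchanged.
        return first_message
    return rewrite(first_message)
```

The function name and the boolean flag are illustrative only; the patent describes the request as something the user sets through the voice synchronization unit 10232 without specifying its encoding.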

As described above, the intelligent audio-visual fusion system and implementation method of the present invention mainly have a message processing server and at least one user communication device. The message processing server receives a first electronic message that a user communication device intends to send and, based on a set context setting parameter, generates a second electronic message with automatically polished text and audio-visual customer service information corresponding to the context. It then sends the second electronic message and the audio-visual customer service information together to the user communication device of the other party for playback. Once implemented, the present invention therefore achieves the purpose of providing an intelligent audio-visual fusion system, and an implementation method thereof, that can proactively modify message content and integrate the message into audio-visual content for the reply.

The above, however, describes only preferred embodiments of the present invention and is not intended to limit the scope of its implementation. Any equivalent changes and modifications made by those skilled in the art without departing from the spirit and scope of the present invention shall be covered by the patent scope of the present invention.

In summary, the present invention meets the patent requirements of industrial applicability, novelty, and inventive step, and the applicant hereby files an application for an invention patent with the Office in accordance with the provisions of the Patent Act.

10: Intelligent audio-visual fusion system
101: Message processing server
1011: Central processing module
1012: Message management module
1013: Message analysis module
1014: Context setting module
1015: Storage module
10151: Message database
10152: Relationship database
10153: Context database
1016: Voice setting module
102: User communication device
1021: Microprocessor module
1022: Message input module
1023: Instant messaging module
10231: Audio-visual customer service unit
10232: Voice synchronization unit
1024: Message playing module
1025: Speech recognition module
S1: Step of confirming the relationship between the communicating parties
S2: Step of setting the communication context
S3: Step of analyzing the first electronic message
S4: Step of generating the second electronic message
S5: Step of setting the audio-visual customer service information
S6: Step of publishing the audio-visual customer service information

FIG. 1 is a schematic diagram (1) of the composition of the present invention.
FIG. 2 is a schematic diagram (2) of the composition of the present invention.
FIG. 3 is a schematic diagram (3) of the composition of the present invention.
FIG. 4 is a flow chart of the implementation of the present invention.
FIG. 5 is a schematic diagram (1) of the implementation of the present invention.
FIG. 6 is a schematic diagram (2) of the implementation of the present invention.
FIG. 7 is a schematic diagram (3) of the implementation of the present invention.
FIG. 8 is a schematic diagram (4) of the implementation of the present invention.
FIG. 9 is a schematic diagram (1) of another embodiment of the present invention.
FIG. 10 is a schematic diagram (2) of another embodiment of the present invention.
FIG. 11 is a schematic diagram (3) of another embodiment of the present invention.

S1: Step of confirming the relationship between the communicating parties

S2: Step of setting the communication situation

S3: Step of analyzing the first electronic message

S4: Step of generating the second electronic message

S5: Step of setting the audiovisual customer service information

S6: Step of publishing the audiovisual customer service information

Claims (10)

1. An implementation method of an intelligent audiovisual combination system, comprising:
a step of confirming the relationship between the communicating parties: after a message processing server detects and captures a sender message received by a user communication device, the message processing server first extracts a communication identification code from the sender message and confirms a communication relationship parameter of the communication identification code;
a step of setting the communication situation: after a situation setting parameter and a first electronic message have been set on the user communication device, the user communication device transmits the situation setting parameter together with the first electronic message to the message processing server;
a step of analyzing the first electronic message: the message processing server performs word segmentation on the first electronic message to obtain at least one keyword, and compares each keyword against a keyword database, so that the message processing server generates word segmentation information based on the comparison result;
a step of generating a second electronic message: after the message processing server obtains the communication relationship parameter and the word segmentation information, the message processing server edits the word segmentation information based on the communication relationship parameter to generate a second electronic message;
a step of setting audiovisual customer service information: the message processing server selects audiovisual customer service information based on the situation setting parameter; and
a step of publishing the audiovisual customer service information: the message processing server transmits the second electronic message and the audiovisual customer service information to another user communication device for playback.

2. The implementation method of the intelligent audiovisual combination system of claim 1, wherein the message processing server generates message structure information after performing word segmentation on the first electronic message, so that the message processing server can compile the communication relationship parameter and the message structure information into the second electronic message.

3. The implementation method of the intelligent audiovisual combination system of claim 1, wherein the message processing server receives the situation setting parameter and compares it against a situation database to find the situation comparison parameter corresponding to the situation setting parameter, and the message processing server sets the audiovisual customer service information according to the comparison result.

4. The implementation method of the intelligent audiovisual combination system of claim 1, wherein, in the step of setting audiovisual customer service information, after the message processing server completes the setting of the audiovisual customer service information, the message processing server retrieves corresponding voice setting information according to the audiovisual customer service information; and, in the step of publishing the audiovisual customer service information, the message processing server transmits the second electronic message, the audiovisual customer service information, and the voice setting information together to the other user communication device for playback.

5. An intelligent audiovisual combination system, comprising:
a user communication device operable to run an instant messaging module, the instant messaging module including an audiovisual customer service unit for inputting a first electronic message and setting a situation setting parameter; and
a message processing server in information connection with the user communication device, wherein the message processing server captures the first electronic message, analyzes a communication relationship parameter of the first electronic message, re-edits the first electronic message into a second electronic message based on the communication relationship parameter, and sets audiovisual customer service information based on the situation setting parameter;
wherein the message processing server transmits the processed second electronic message and the audiovisual customer service information to another user communication device that communicates with the user communication device, and the other user communication device then plays the second electronic message and the audiovisual customer service information.

6. The intelligent audiovisual combination system of claim 5, wherein the user communication device includes a speech recognition module, and the speech recognition module is in information connection with the microprocessor module.

7. The intelligent audiovisual combination system of claim 5, wherein the message processing server generates message structure information after performing word segmentation on the first electronic message, so that the message processing server can compile the communication relationship parameter and the message structure information into the second electronic message.

8. The intelligent audiovisual combination system of claim 5, wherein the message processing server includes a storage module, the storage module has a situation database, the situation database stores at least one situation comparison parameter, and each situation comparison parameter corresponds to the audiovisual customer service information of the related situation.

9. The intelligent audiovisual combination system of claim 8, wherein the message processing server includes a situation setting module, the situation setting module receives the situation setting parameter and compares it against the situation database to find the situation comparison parameter corresponding to the situation setting parameter, and the situation setting module sets the audiovisual customer service information according to the comparison result.

10. The intelligent audiovisual combination system of claim 5, wherein the message processing server extracts a communication identification code and confirms the communication relationship parameter of the communication identification code.
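The six steps recited in the method claim can be illustrated with a short sketch. This is not the applicant's implementation: every table, name, and template below (`RELATIONSHIP_DB`, `KEYWORD_DB`, `SITUATION_DB`, `TONE_BY_RELATIONSHIP`, `process`, the video file names) is a hypothetical stand-in, whitespace splitting stands in for a real word segmenter, and wrapping the message in a relationship-specific greeting is only one assumed way to "edit" the word segmentation information.

```python
from dataclasses import dataclass

# Hypothetical stand-ins for the relationship, keyword, and situation databases.
RELATIONSHIP_DB = {"user-001": "customer", "user-002": "friend"}
KEYWORD_DB = {"order", "refund", "delivery", "thanks"}
SITUATION_DB = {"formal": "formal_agent.mp4", "cheerful": "cheerful_agent.mp4"}
DEFAULT_AV_INFO = "default_agent.mp4"

# Assumed editing rule: a (prefix, suffix) pair chosen by relationship.
TONE_BY_RELATIONSHIP = {
    "customer": ("Dear customer, ", " Thank you for your patience."),
    "friend": ("Hey, ", ""),
}


@dataclass
class Outgoing:
    second_message: str   # text rewritten for the recipient
    av_info: str          # selected audiovisual customer service information
    keywords: list        # keywords matched during analysis


def confirm_relationship(communication_id):
    """S1: look up the relationship parameter for a communication identifier."""
    return RELATIONSHIP_DB.get(communication_id, "unknown")


def analyze_first_message(first_message):
    """S3: segment the message and match each word against the keyword database."""
    tokens = first_message.split()  # placeholder for a real word segmenter
    keywords = [t.strip(".,!?").lower() for t in tokens]
    keywords = [k for k in keywords if k in KEYWORD_DB]
    return {"tokens": tokens, "keywords": keywords}


def generate_second_message(relationship, segmentation_info):
    """S4: rebuild the text from the tokens, adjusted to the relationship."""
    prefix, suffix = TONE_BY_RELATIONSHIP.get(relationship, ("", ""))
    return prefix + " ".join(segmentation_info["tokens"]) + suffix


def select_av_info(situation):
    """S5: pick the audiovisual customer service information for the situation."""
    return SITUATION_DB.get(situation, DEFAULT_AV_INFO)


def process(communication_id, situation, first_message):
    """Run S1 to S6; the arguments model what the device sends in S2."""
    relationship = confirm_relationship(communication_id)          # S1
    info = analyze_first_message(first_message)                    # S3
    second = generate_second_message(relationship, info)           # S4
    av_info = select_av_info(situation)                            # S5
    return Outgoing(second, av_info, info["keywords"])             # S6 payload


if __name__ == "__main__":
    print(process("user-001", "formal", "Where is my order?"))
```

In this sketch the returned `Outgoing` object represents what the server would transmit to the other user communication device for playback; a voice setting lookup as in claim 4 could be added as one more dictionary keyed by the selected audiovisual information.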

Priority Applications (1)

Application Number Priority Date Filing Date Title
TW109125866A TW202205135A (en) 2020-07-30 2020-07-30 Intelligent audiovisual combination system and implementation method thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
TW109125866A TW202205135A (en) 2020-07-30 2020-07-30 Intelligent audiovisual combination system and implementation method thereof

Publications (1)

Publication Number Publication Date
TW202205135A (en) 2022-02-01

Family

ID=81323597

Family Applications (1)

Application Number Priority Date Filing Date Title
TW109125866A TW202205135A (en) 2020-07-30 2020-07-30 Intelligent audiovisual combination system and implementation method thereof

Country Status (1)

Country Link
TW (1) TW202205135A (en)
