TWI502582B

TWI502582B - Customer service interactive voice system

Info

Publication number: TWI502582B
Application number: TW102112130A
Authority: TW
Inventors: Bing Kun Hsu; Chen Yang Zheng; Yong Zhang Lin; Chih Cheng Li
Original assignee: Chung Han Interlingua Knowledge Co Ltd
Priority date: 2013-04-03
Filing date: 2013-04-03
Publication date: 2015-10-01
Also published as: TW201440035A

Description

Voice customer service system for a service point

The present invention relates to a customer service system, and more particularly to a voice customer service system for a service point.

Conventional telephone customer service systems, such as hospital appointment registration and restaurant ordering, work in one of two ways. Either the user calls a customer service center and talks with a customer service agent, or the system plays pre-recorded voice prompts and the user interacts through the telephone keypad, where the digit keys "0" to "9" and the two symbol keys "*" and "#" each correspond to an option.

For example, when a user calls a hospital's telephone customer service system, the system first plays a pre-recorded message announcing the options assigned to the digit keys 0 to 9 and the symbol keys * and #, for example 1: registration, 2: inquiry, 9: operator assistance. The user then selects the desired service, for instance by pressing 1 to choose registration, and the system enters the registration flow. The system next plays the voice prompt of the registration flow, announcing 1: surgery, 2: internal medicine, and the user presses the key that matches his or her need. Data such as the desired appointment date and the user's national ID number can also be entered in this way.

However, a customer service agent cannot handle two or more matters at once, so coping with a large volume of simultaneous requests requires many agents, and personnel costs rise sharply. Interaction through keypad presses and pre-recorded prompts, on the other hand, is neither user-friendly nor convenient. With a mobile phone, or any telephone whose handset and keypad share one body, the user must listen to the recorded prompt and then move the phone from the ear to in front of the eyes before pressing a key to make a selection or enter data. Listening and entering cannot be done at the same time, and prompts are sometimes missed as a result. Moreover, the user learns what each key does only after the recorded prompt has finished playing, and may waste considerable time listening to many unneeded functions before hearing the one required. In other words, the user is kept in a passive position, waiting for the system to offer information or choices. Such an unfriendly and inconvenient customer service system needs improvement.

Accordingly, an object of the present invention is to provide a user-friendly and convenient voice customer service system based on voice interaction.

To solve the problems of the prior art, the present invention provides a voice customer service system for a service point. The voice customer service system receives an audio signal uttered by a user and confirms information, and includes an audio capture mechanism, a first conversion mechanism, a composition and integration mechanism, and a second conversion mechanism. The audio capture mechanism extracts first voice audio data from the audio signal. The first conversion mechanism is connected to the audio capture mechanism and converts the first voice audio data into first text data. The composition and integration mechanism composes and integrates the first text data from the first conversion mechanism into second text data. The second conversion mechanism is connected to the composition and integration mechanism, converts the second text data into second voice audio data, and outputs the second voice audio data. The voice customer service system is installed at a service point, and the service point receives the audio signal uttered by the user in order to confirm information.

In an embodiment of the invention, the first conversion mechanism and the second conversion mechanism are a cloud processing mechanism.

In an embodiment of the invention, the audio capture mechanism captures the audio signal relative to a decibel reference value to extract the first voice audio data.

In an embodiment of the invention, the composition and integration mechanism performs the composition and integration according to the number and/or content of the preset elements contained in the first text data.

In an embodiment of the invention, the system further includes a deconstruction mechanism connected to the first conversion mechanism. The deconstruction mechanism deconstructs the first text data into deconstructed text components, which are sent to the composition and integration mechanism.

In an embodiment of the invention, when the first text data contains fewer than a preset number of preset elements, the composition and integration mechanism composes and integrates the first text data into second text data whose content requests the missing preset elements, and the audio capture mechanism captures the next audio signal.

In an embodiment of the invention, when the first text data contains a preset element, the composition and integration mechanism composes and integrates the first text data into second text data whose content reconfirms the preset element.

In an embodiment of the invention, when the first text data contains the preset number of preset elements, the composition and integration mechanism composes and integrates the first text data into second text data whose content informs the user that the voice customer service has been completed.

In an embodiment of the invention, the audio signal is an audio signal from a telephone.

In an embodiment of the invention, the service point is a food ordering device.

With the technical means adopted by the invention, no agent needs to be dedicated to answering customer service calls, which saves manpower. In addition, the voice customer service system converses naturally with the user by voice and identifies the user's needs through speech recognition, which makes it user-friendly and easy to operate. It is therefore particularly suited to settings with a high volume of customer service requests and clearly defined user goals, such as the catering and medical industries. Service quality can thus be improved while costs are reduced.

The specific embodiments adopted by the present invention are further described through the following embodiments and the accompanying drawings.

100, 100a‧‧‧voice customer service system

1‧‧‧audio capture mechanism

2‧‧‧first conversion mechanism

3‧‧‧deconstruction mechanism

4‧‧‧composition and integration mechanism

5‧‧‧second conversion mechanism

6‧‧‧audio output mechanism

P, P'‧‧‧service point

S‧‧‧remote server

T‧‧‧telephone

T1‧‧‧receiving end

T2‧‧‧sounding end

U‧‧‧user

FIG. 1 is a system schematic of the voice customer service system for a service point according to a first embodiment of the present invention; FIG. 2 is a block diagram of the voice customer service system for a service point according to the first embodiment; FIG. 3 is a system schematic of the voice customer service system for a service point according to a second embodiment of the present invention; and FIG. 4 is a schematic diagram of the operation of the voice customer service system for a service point according to the second embodiment.

Referring to FIG. 1 and FIG. 2, the voice customer service system 100 for a service point according to the first embodiment of the present invention is installed at a service point P to receive an audio signal uttered by a user U and to confirm information. A service point is set up wherever telephone customer service is needed. In this embodiment the service point P is a medical voice service device installed in a hospital; in other embodiments the service point may be a computer, a multifunction telephone office machine, or the like. The voice customer service system 100 is connected to a telephone T on the side of the user U, such as a mobile phone or a home phone, and at the service point P obtains the audio signal uttered by the user U, which is picked up by a receiving end T1 of the telephone T.

The voice customer service system 100 includes an audio capture mechanism 1, a first conversion mechanism 2, a deconstruction mechanism 3, a composition and integration mechanism 4, a second conversion mechanism 5, and an audio output mechanism 6. The audio capture mechanism 1 extracts first voice audio data from the audio signal. The first conversion mechanism 2 is connected to the audio capture mechanism 1 and converts the first voice audio data into first text data. The deconstruction mechanism 3 is connected to the first conversion mechanism 2 and deconstructs the first text data into deconstructed text components, which are sent to the composition and integration mechanism 4. The composition and integration mechanism 4 is connected to the deconstruction mechanism 3 and integrates the deconstructed text components into second text data. The second conversion mechanism 5 is connected to the composition and integration mechanism 4, converts the second text data into second voice audio data, and outputs the second voice audio data to the audio output mechanism 6. The audio output mechanism 6 is connected to the second conversion mechanism 5 and outputs the received second voice audio data to a sounding end T2 of the telephone T.
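The chain of six mechanisms described above can be pictured as a simple pipeline. The following is only a minimal sketch under stated assumptions: the class name `VoiceServicePipeline`, the method `handle_turn`, and the stand-in lambdas are all hypothetical and are not taken from the patent, which does not specify an implementation. Each mechanism is modeled as a plain callable so that one dialogue turn can be traced end to end.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class VoiceServicePipeline:
    """Hypothetical wiring of mechanisms 1 to 6 for a single dialogue turn."""
    capture: Callable[[bytes], bytes]        # 1: audio signal -> first voice audio data
    speech_to_text: Callable[[bytes], str]   # 2: first voice audio data -> first text data
    deconstruct: Callable[[str], List[str]]  # 3: first text data -> deconstructed text components
    integrate: Callable[[List[str]], str]    # 4: components -> second text data
    text_to_speech: Callable[[str], bytes]   # 5: second text data -> second voice audio data
    output: Callable[[bytes], None]          # 6: send second voice audio data to the phone

    def handle_turn(self, audio_signal: bytes) -> str:
        first_audio = self.capture(audio_signal)
        first_text = self.speech_to_text(first_audio)
        components = self.deconstruct(first_text)
        second_text = self.integrate(components)
        second_audio = self.text_to_speech(second_text)
        self.output(second_audio)
        return second_text


# Stand-in stages (assumptions for illustration only; no real STT or TTS is used).
sent = []
pipeline = VoiceServicePipeline(
    capture=lambda signal: signal.strip(),
    speech_to_text=lambda audio: audio.decode("utf-8"),
    deconstruct=lambda text: text.lower().split(),
    integrate=lambda parts: ("Which department would you like?"
                             if "register" in parts else "How may I help you?"),
    text_to_speech=lambda text: text.encode("utf-8"),
    output=sent.append,
)
reply = pipeline.handle_turn(b"  I want to register  ")
```

Keeping each stage as an independent callable mirrors the description's point that the conversion mechanisms may run on a remote server while the other stages stay at the service point.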

In detail, when the user U calls the hospital with the telephone T, the voice customer service system 100 of the service point P installed in the hospital (that is, the medical voice service device) is connected to the telephone T through a telecommunication line and captures the audio signal spoken by the user U. For example, when the user U requests a service by saying "I want to register", the receiving end T1 of the telephone T picks up the audio signal and passes it to the audio capture mechanism 1 by a telecommunication method such as radio transmission or a telecommunication line. The audio capture mechanism 1 can filter out background sound according to a decibel reference value of the environment of the user U and extract the first voice audio data from the audio signal. Filtering the ambient noise in this way improves the accuracy of the captured first voice audio data. For example, when the ambient noise is at 50 decibels, the volume and frequency of the user's speech vary more than those of the noise, so the audio capture mechanism 1 uses the difference between the two to filter out the 50-decibel noise signal. The audio capture mechanism 1 may also convert the audio signal into first voice audio data in digital form to facilitate subsequent analysis. The invention is not limited to this: the conversion to a digital signal may instead take place at the first conversion mechanism 2, or the first voice audio data may be extracted without conversion to a digital signal and sent directly to the first conversion mechanism 2 for analysis.
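As a rough illustration of capturing audio relative to a decibel reference value, the sketch below applies a plain level gate: frames whose RMS level does not exceed the reference by a margin are discarded. This is a simplification chosen for illustration, not the patent's method, which relies on differences in both volume and frequency variation. The function names, the `margin_db` parameter, and the use of levels relative to full scale (rather than the 50-decibel sound pressure figure in the text) are all assumptions.

```python
import math
from typing import List, Sequence


def frame_level_db(frame: Sequence[float]) -> float:
    """RMS level of one frame in dB relative to full scale (amplitude 1.0)."""
    if not frame:
        return float("-inf")
    rms = math.sqrt(sum(s * s for s in frame) / len(frame))
    return 20.0 * math.log10(rms) if rms > 0 else float("-inf")


def capture_voice(frames: List[Sequence[float]],
                  reference_db: float,
                  margin_db: float = 6.0) -> List[Sequence[float]]:
    """Keep only frames louder than the background reference plus a margin."""
    return [f for f in frames if frame_level_db(f) > reference_db + margin_db]


# Hypothetical frames: steady background near -40 dB and speech near -6 dB.
quiet = [0.01, -0.01, 0.01, -0.01]
loud = [0.5, -0.5, 0.5, -0.5]
voiced = capture_voice([quiet, loud, quiet], reference_db=-40.0)
```

In practice the reference value would be estimated from the caller's environment, for instance from frames observed before the user starts speaking.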

In this embodiment the first conversion mechanism 2 is located on a remote server S. After receiving the first voice audio data, it uses speech-to-text (STT) to recognize the first voice audio data and convert it into the first text data, which is then sent to the deconstruction mechanism 3.

The deconstruction mechanism 3 is a semantic analysis mechanism. It applies grammatical structure analysis, word meaning analysis, and similar techniques to the first text data to deconstruct the preset elements it contains, yielding deconstructed text components such as "I", "want", and "register", which are sent to the composition and integration mechanism 4.
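The patent does not say how the grammatical and word meaning analysis is carried out. As a stand-in, the sketch below uses greedy longest-match lookup against a small lexicon to split an utterance into components. The lexicon contents, the category labels, and the function name `deconstruct` are hypothetical, and English text is used in place of the original Chinese example.

```python
from typing import Dict, List, Tuple

# Hypothetical lexicon mapping known phrases to a category label.
LEXICON: Dict[str, str] = {
    "i": "subject",
    "want to": "intent",
    "register": "service",
    "dermatology": "department",
}


def deconstruct(text: str, lexicon: Dict[str, str] = LEXICON) -> List[Tuple[str, str]]:
    """Split text into (phrase, category) components by greedy longest match."""
    words = text.lower().replace(",", " ").replace(".", " ").split()
    max_len = max(len(phrase.split()) for phrase in lexicon)
    components: List[Tuple[str, str]] = []
    i = 0
    while i < len(words):
        # Try the longest candidate phrase first, then shorter ones.
        for n in range(min(max_len, len(words) - i), 0, -1):
            phrase = " ".join(words[i:i + n])
            if phrase in lexicon:
                components.append((phrase, lexicon[phrase]))
                i += n
                break
        else:
            i += 1  # word is not in the lexicon, skip it
    return components


components = deconstruct("I want to register.")
```

A production system would more likely rely on a trained parser or tagger; the lexicon approach is used here only because it is small enough to verify by hand.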

The composition and integration mechanism 4 is a dialogue processing mechanism that composes and integrates the deconstructed text components according to a dialogue processing model. Specifically, it compares each received deconstructed text component against a database to determine which preset elements have been received and which have not, and then composes a corresponding response. For example, after composing and integrating the components "I", "want", and "register", it produces second text data as the response, such as "Which department would you like to register with?" or "Which date would you like to register for?", and sends it to the second conversion mechanism 5.
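The comparison against a database can be sketched as a lookup that sorts preset elements into received and missing, then picks a reply. This is one possible reading of the paragraph above, not the patent's dialogue processing model, which is not disclosed. `ELEMENT_DB`, `QUESTIONS`, and `integrate` are hypothetical names, and the database is reduced to two in-memory dictionaries.

```python
from typing import Dict, List, Tuple

# Hypothetical database: known phrase -> preset element it fills.
ELEMENT_DB: Dict[str, str] = {
    "register": "service",
    "dermatology": "department",
    "june 6": "date",
}

# Hypothetical question to ask for each preset element that is still missing.
QUESTIONS: Dict[str, str] = {
    "service": "What service do you need?",
    "department": "Which department would you like to register with?",
    "date": "Which date would you like to register for?",
}


def integrate(components: List[str]) -> Tuple[Dict[str, str], List[str], str]:
    """Return (received elements, missing elements, second text data)."""
    received = {ELEMENT_DB[c]: c for c in components if c in ELEMENT_DB}
    missing = [name for name in QUESTIONS if name not in received]
    reply = QUESTIONS[missing[0]] if missing else "Your registration is complete."
    return received, missing, reply


received, missing, reply = integrate(["i", "want", "register"])
```

Here the reply asks only for the first missing element; asking for several at once would be an equally valid design.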

In this embodiment the second conversion mechanism 5 and the first conversion mechanism 2 are located on the same remote server S, and together they form a cloud processing mechanism. They can therefore reside in a data processing center outside the hospital and communicate with the mechanisms of the service point P over a network. The invention is not limited to this: the first conversion mechanism 2 and the second conversion mechanism 5 may also be a speech recognition device or a speech database integrated with the other mechanisms of the service point P. After receiving the second text data, the second conversion mechanism 5 uses text-to-speech (TTS) to convert the second text data into the second voice audio data and outputs it to the audio output mechanism 6.

The audio output mechanism 6 receives the second voice audio data, converts it into an analog signal suitable for transmission over the telecommunication line, and sends it to the sounding end T2 of the telephone T as the response to the user U, which completes one dialogue turn. The user U may then respond further to the content of the second voice audio data. For example, if the second voice audio data is "Which department would you like to register with?" and the user U answers "I want to register with dermatology", the voice customer service system 100 gives a corresponding reply and repeats the same operation for the next dialogue turn.

In addition, when the preset elements collected by the composition and integration mechanism 4 reach a preset number, the voice customer service system completes the whole dialogue. For example, the user U may state at the very start of the call, "I want to register for Dr. Huang Zhenxian's dermatology clinic on the afternoon of June 6. My name is Wang Daming and my ID number is H123456789." The composition and integration mechanism 4 then receives deconstructed text components such as "register", "afternoon of June 6", "Dr. ○○○", "dermatology clinic", "Wang Daming", and "H123456789". When the collected preset elements reach the preset number and content, the composition and integration mechanism 4 outputs second text data stating that the whole voice registration process is complete. Otherwise, the system outputs second voice audio data corresponding to the preset elements not yet received, asking the user U for the answer, and the audio capture mechanism 1 then extracts next voice audio data from the next audio signal carrying the user U's reply. The invention is not limited to this: there may be more kinds of preset elements, and they may be subdivided by broad group. For example, the preset elements that the deconstructed text components must satisfy under the registration group differ from those under the inquiry group, and grouping in this way shortens the interaction time between the user U and the voice customer service system. By the above technical means, the voice customer service system converses naturally with the user by voice and identifies the user's needs through speech recognition, which makes it user-friendly and easy to operate. Because the user does not have to wait passively for the system's questions and can state the request up front, service is considerably faster, fewer customer service agents are needed, and operating costs fall.
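Grouping preset elements by service can be sketched as a table of required elements per group together with a completeness check. The registration list below follows the example in the text (date, doctor, department, name, ID number); the contents of the inquiry group are not given in the source and are an assumption, as are all identifier names.

```python
from typing import Dict, List, Set

# Required preset elements per group. The "query" entry is hypothetical.
REQUIRED_BY_GROUP: Dict[str, List[str]] = {
    "registration": ["date", "doctor", "department", "name", "id_number"],
    "query": ["name", "id_number"],
}


def missing_elements(group: str, collected: Set[str]) -> List[str]:
    """Preset elements of the group that have not been collected yet."""
    return [e for e in REQUIRED_BY_GROUP[group] if e not in collected]


def is_complete(group: str, collected: Set[str]) -> bool:
    """True once every preset element required by the group is collected."""
    return not missing_elements(group, collected)


# A caller who states everything in the first utterance finishes in one turn.
one_shot = {"date", "doctor", "department", "name", "id_number"}
done = is_complete("registration", one_shot)
still_needed = missing_elements("registration", {"name"})
```

Because each group carries a shorter, more specific list, the system asks fewer follow-up questions, which is the time saving the text attributes to grouping.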

Referring to FIG. 3 and FIG. 4, the voice customer service system 100a for a service point according to the second embodiment of the present invention serves a food and beverage venue, such as a restaurant, a fast food restaurant, a delivery restaurant, or a beverage shop. Its service point P' is a food ordering device, for example a food and beverage point of sale (POS) device, and both the first conversion mechanism and the second conversion mechanism are installed together with the POS device. When the user U calls the restaurant, the voice customer service system 100a of the service point P' begins interacting with the user U. The user U can say directly what he or she wants to eat, when the order will be picked up, and so on, and the voice customer service system 100a asks about whatever the user U has not mentioned, engaging the user U in a question-and-answer exchange. Once the required number and content of preset elements have been collected, the order is complete. The POS device then displays or prints the order contents, the amount due, and other information for the restaurant staff to prepare. The whole ordering process is thereby accelerated.

In detail, as shown in FIG. 4, when the user U calls the restaurant, the voice customer service system 100a of the service point P' outputs voice audio data saying "Hello, this is XX beverage shop. How may I help you?", and the user U utters the audio signal "I want a cup of pearl milk tea". When the composition and integration mechanism 4 receives the deconstructed text components of preset elements such as "a cup" and "pearl milk tea", it composes and integrates them into second text data. Because the preset elements in the first text data have not yet reached the preset number, the content of the second text data requests the missing preset elements, for example "sugar level" and "ice level". In this embodiment the second text data is "How much sugar and ice would you like?". After receiving the second voice audio data corresponding to the second text data, the user U utters the next audio signal, "half sugar, less ice", and the audio capture mechanism 1 captures this next audio signal. This continues until the preset elements collected by the composition and integration mechanism 4 reach a preset number. For example, if the number of preset elements is five, including beverage type, sugar level, temperature, and basic customer information, then once the composition and integration mechanism 4 has received the five preset elements, the content of the second text data it outputs informs the user U that this voice customer service session is complete, for example "Your order has been placed. Thank you for calling." In addition, whenever the first text data contains a preset element, whether or not the required number has been reached, the content of the second text data may reconfirm the preset elements, for example "All right, let me confirm: one cup of pearl milk tea, half sugar, less ice?". The voice customer service system of the present invention overcomes the shortcomings of conventional food and beverage POS devices, allowing the catering industry to raise service quality while saving the manpower otherwise spent on telephone customer service.
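The multi-turn exchange of FIG. 4 can be sketched as a session object that accumulates preset elements across turns, asks for what is missing, and reconfirms once everything is collected. This is an illustrative sketch only: the class `OrderSession`, the four-element `REQUIRED` list (a reduced, hypothetical subset rather than the five elements mentioned in the text), and the reply wording are assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical reduced set of preset elements for a beverage order.
REQUIRED: List[str] = ["quantity", "beverage", "sugar", "ice"]


@dataclass
class OrderSession:
    """Accumulates preset elements over several dialogue turns."""
    collected: Dict[str, str] = field(default_factory=dict)

    def update(self, elements: Dict[str, str]) -> str:
        """Merge newly recognized elements and return the second text data."""
        self.collected.update(elements)
        missing = [e for e in REQUIRED if e not in self.collected]
        if missing:
            return "May I ask about the " + " and ".join(missing) + "?"
        return self.confirmation()

    def confirmation(self) -> str:
        c = self.collected
        return (f"Let me confirm: {c['quantity']} {c['beverage']}, "
                f"{c['sugar']}, {c['ice']}. Is that right?")


session = OrderSession()
first_reply = session.update({"quantity": "one cup of", "beverage": "pearl milk tea"})
second_reply = session.update({"sugar": "half sugar", "ice": "less ice"})
```

After the caller confirms, a real deployment would hand the collected order to the POS device for display or printing, as the second embodiment describes.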

The above is merely a description of preferred embodiments of the present invention. Those skilled in the art may make various other modifications based on the above description, and such changes remain within the spirit of the invention and the scope of the patent claims defined below.


Claims

A voice customer service system for a service point, the voice customer service system receiving an audio signal uttered by a user to confirm information, the voice customer service system comprising: an audio capture mechanism for extracting first voice audio data from the audio signal; a first conversion mechanism connected to the audio capture mechanism for converting the first voice audio data into first text data; a deconstruction mechanism connected to the first conversion mechanism for deconstructing the first text data into deconstructed text components; a composition and integration mechanism that determines the received preset elements according to the deconstructed text components, determines the unreceived preset elements according to the received preset elements, and composes and integrates the determined received preset elements with the determined unreceived preset elements into second text data; and a second conversion mechanism connected to the composition and integration mechanism for converting the second text data into second voice audio data and outputting the second voice audio data, wherein the voice customer service system is installed at a service point, and the service point receives the audio signal uttered by the user to confirm the information.

The voice customer service system of the service point according to claim 1, wherein the first conversion mechanism and the second conversion mechanism are a cloud processing mechanism.

The voice customer service system of the service point according to claim 1, wherein the audio capture mechanism captures the audio signal relative to a decibel reference value to extract the first voice audio data.

The voice customer service system of the service point according to claim 1, wherein the composition and integration mechanism performs the composition and integration according to the number and/or content of the preset elements contained in the first text data.

The voice customer service system of the service point according to claim 4, wherein, when the first text data contains fewer than a preset number of the preset elements, the composition and integration mechanism composes and integrates the first text data into second text data whose content requests the missing preset elements, and the audio capture mechanism captures a next audio signal.

The voice customer service system of the service point according to claim 4, wherein, when the first text data contains the preset element, the composition and integration mechanism composes and integrates the first text data into second text data whose content reconfirms the preset element.

The voice customer service system of the service point according to claim 4, wherein, when the first text data contains the preset number of the preset elements, the composition and integration mechanism composes and integrates the first text data into second text data whose content informs the user that the voice customer service has been completed.

The voice customer service system of the service point according to claim 1, wherein the audio signal is an audio signal from a telephone.

The voice customer service system of the service point according to claim 1, wherein the service point is a food ordering device.