TW201946001A

TW201946001A - Authorized store voice payment system including a communication unit, a display unit, a processing unit, an image recognition module, and a voice recognition module

Info

Publication number: TW201946001A
Application number: TW107114801A
Authority: TW
Inventors: 李文元
Original assignee: 臺灣土地銀行股份有限公司
Priority date: 2018-05-01
Filing date: 2018-05-01
Publication date: 2019-12-01
Also published as: TWI672652B

Abstract

Provided is an authorized store voice payment system, which is applicable to a user end. The system includes a communication unit, a display unit, a processing unit, an image recognition module, and a voice recognition module. The display unit provides an interactive display interface. The processing unit is coupled to the communication unit and the display unit. The image recognition module is configured to perform image recognition on image data. The voice recognition module is configured to perform voice recognition on voice signals. The processing unit performs image recognition on the image data related to stores and products through the image recognition module, and determines at least one product to be purchased through the display interface. The processing unit performs voice recognition on the voice signal through the voice recognition module, and determines the purchase of at least one product and performs user identity confirmation and online payment through the display interface.

Description

Authorized store voice payment system

The present invention relates to an information inquiry system, and more particularly to an authorized store voice payment system for financial institutions.

With the rapid growth of e-commerce and the spread of electronic payment services, remote online transactions have become an important transaction channel, and shopping on websites has become popular. In response, local banks have launched services that let their customers pay online with the bank's chip debit card or credit card when shopping on shopping websites.

To confirm identity and carry out financial transactions, shopping and payment systems require the user to type text into a computer repeatedly, for example entering a password during identity confirmation and entering a debit or credit card account number and password during payment. The user must also bind a debit card or credit card in advance. For online debit card transactions, the user must install a card reader and related software on the computer, and when shopping online must have the credit card or debit card at hand to enter the card number or account number before paying. These operations are unfriendly to users and are complicated, inconvenient, and error-prone, especially for the elderly. Users also have to check the card number or account number they entered several times, which reduces the efficiency of shopping and payment.

One object of the present invention is to provide an authorized store voice payment system that makes effective use of computing and communication resources to help users complete online shopping and payment while reducing the complexity of those operations, making online shopping and payment more efficient and convenient and giving users a better experience.

To achieve at least the above object, the present invention provides an authorized store voice payment system applicable to a user end. The system includes a communication unit, a display unit, a processing unit, an image recognition module, and a voice recognition module. The display unit provides an interactive display interface. The processing unit is coupled to the communication unit and the display unit. The image recognition module performs image recognition on image data. The voice recognition module performs voice recognition on voice signals. The image recognition module and the voice recognition module are executed by the processing unit to carry out online shopping and payment. The processing unit performs image recognition, through the image recognition module, on image data about stores and products, and determines through the display interface at least one product to be purchased. The processing unit performs voice recognition on voice signals through the voice recognition module and, through the display interface, confirms the purchase of the at least one product, confirms the user's identity, and makes the online payment.

In one embodiment, the processing unit obtains the image data about stores and products. The processing unit performs image recognition on the image data through the image recognition module to obtain picture feature data about the store and the product, and transmits the picture feature data to a picture information server through the communication unit. The processing unit receives, through the communication unit, and outputs, through the display interface, product-related data that the picture information server generates in response to the picture feature data. The product-related data includes store name data, product name data, and price data.

In one embodiment, the processing unit receives a first voice signal and performs voice recognition on it through the voice recognition module to determine whether the first voice signal indicates that a financial instrument account is to be used as the payment method to purchase at least one product listed in the product-related data. If so, the processing unit outputs a speak-password request message. The processing unit, through the communication unit, receives a second voice signal and transmits an identity confirmation signal based on the second voice signal to a voiceprint recognition server. The processing unit receives, through the communication unit, a user identification code that the voiceprint recognition server outputs in response to the identity confirmation signal.

In one embodiment, the processing unit transmits shopping-related data to an accounting server through the communication unit. The shopping-related data includes the store name data, product name data, and price data corresponding to the at least one product, together with the user identification code and the data of the financial instrument account. The processing unit receives, through the communication unit, and outputs, through the display interface, payment-related data that the accounting server outputs in response to the shopping-related data. The payment-related data includes data representing the transfer-out account number, the authorized store's transfer-in account number, and the transfer amount. The processing unit receives a third voice signal and performs voice recognition on it through the voice recognition module to determine whether the third voice signal indicates confirmation of payment to purchase the at least one product. If so, the processing unit transmits a payment confirmation message to the accounting server through the communication unit to make the payment.

In one embodiment, the processing unit finds the address of the terminal device making the payment. The processing unit outputs a shipping address confirmation message requesting confirmation of whether the geographic address associated with the terminal device's address should be used as the shipping address of the at least one product. The processing unit receives a fourth voice signal and performs voice recognition on it to determine whether the fourth voice signal indicates confirmation of the shipping address. If so, the processing unit transmits a shipping address message to the accounting server or to the store transaction unit corresponding to the at least one product.
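The sequence of voice signals described in the embodiments above can be pictured as a small dialog state machine. The following is only a minimal sketch: the step names, the `advance` function, and the strict ordering (in particular, placing the shipping address confirmation after the payment confirmation) are assumptions made for illustration and are not specified by the description.

```python
from enum import Enum


class Step(Enum):
    """Hypothetical dialog steps, one per voice signal in the description."""
    AWAIT_PURCHASE = 1         # first voice signal: purchase with a financial instrument account
    AWAIT_PASSPHRASE = 2       # second voice signal: spoken password for identity confirmation
    AWAIT_PAY_CONFIRM = 3      # third voice signal: confirm the payment
    AWAIT_ADDRESS_CONFIRM = 4  # fourth voice signal: confirm the shipping address
    DONE = 5


ORDER = list(Step)  # Enum members iterate in definition order


def advance(step, accepted):
    """Move to the next step when the recognized voice signal is accepted.

    If the signal is not accepted, or the dialog is already finished,
    the current step is returned unchanged so the user can try again.
    """
    if step is Step.DONE or not accepted:
        return step
    return ORDER[ORDER.index(step) + 1]
```

In this sketch a rejected utterance simply leaves the dialog where it is; a real system would also need timeouts and a way to cancel the purchase.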

To achieve at least the above object, the present invention provides an authorized store voice payment system applicable to a server end. The system includes at least one management server, each of which includes a network unit and a data processing unit. The network unit performs network communication and communicates with at least one terminal device. The data processing unit is electrically coupled to the network unit and processes signals sent from the at least one terminal device. The at least one management server is configured to: perform image recognition on image data about stores and products to determine at least one product to be purchased; perform voice recognition on a first voice signal to confirm the purchase of the at least one product; perform voice recognition on a second voice signal to confirm the user's identity; and perform voice recognition on a third voice signal to make the online payment.

In one embodiment, the at least one management server is configured to: obtain the image data about stores and products; perform image recognition on the image data to obtain picture feature data about the store and the product, and transmit the picture feature data to a picture information server; receive and output product-related data that the picture information server generates in response to the picture feature data, the product-related data including store name data, product name data, and price data; and receive the first voice signal and perform voice recognition on it to determine whether the first voice signal indicates that a financial instrument account is to be used as the payment method to purchase at least one product listed in the product-related data. If so, the at least one management server outputs a speak-password request message.

In one embodiment, the at least one management server is configured to: receive the second voice signal and, based on it, transmit an identity confirmation signal to a voiceprint recognition server; receive a user identification code that the voiceprint recognition server outputs in response to the identity confirmation signal; transmit shopping-related data to an accounting server, the shopping-related data including the store name data, product name data, and price data corresponding to the at least one product, together with the user identification code and the data of the financial instrument account; receive and output payment-related data that the accounting server outputs in response to the shopping-related data, the payment-related data including data representing the transfer-out account number, the authorized store's transfer-in account number, and the transfer amount; and receive the third voice signal and perform voice recognition on it to determine whether the third voice signal indicates confirmation of payment to purchase the at least one product. If so, the at least one management server transmits a payment confirmation message to the accounting server to make the payment.

In one embodiment, the at least one management server is configured to: find the address of the terminal device making the payment; output a shipping address confirmation message requesting confirmation of whether the geographic address associated with the terminal device's address should be used as the shipping address of the at least one product; and receive a fourth voice signal and perform voice recognition on it to determine whether the fourth voice signal indicates confirmation of the shipping address. If so, the at least one management server transmits a shipping address message to the accounting server or to the store transaction unit corresponding to the at least one product.

In one embodiment, the at least one management server is configured to: if the fourth voice signal is determined to indicate that the shipping address is incorrect, output a shipping address request message asking the user to provide a shipping address; and receive a fifth voice signal and perform voice recognition on it to determine whether the fifth voice signal represents a shipping address provided by the user. If so, the at least one management server transmits a shipping address message to the accounting server or to the store transaction unit corresponding to the at least one product.

Embodiments of the authorized store voice payment system can thereby make effective use of computing and communication resources to help users complete online shopping and payment while reducing the complexity of those operations, making online shopping and payment more efficient and convenient and giving users a better experience.

To provide a full understanding of the objects, features, and effects of the present invention, the invention is described in detail below by way of specific embodiments with reference to the accompanying drawings.

Please refer to FIG. 1 and FIG. 2. FIG. 1 is a block diagram of an application scenario of an embodiment of the authorized store voice payment system applied to a user end, and FIG. 2 is a block diagram of an embodiment of the authorized store voice payment system.

As shown in FIG. 1 and FIG. 2, the authorized store voice payment system 10 is applicable to a user end, for example implemented in a terminal device 100. As shown in FIG. 2, the authorized store voice payment system 10 implemented in the terminal device 100 includes a communication unit 110, a display unit 120, a processing unit 130, an image recognition module 141, and a voice recognition module 142. The terminal device 100 may be any computing device, such as a smart device (for example a mobile phone or tablet) or a notebook or desktop computer.

The communication unit 110 provides a signal connection to the network 90 for communicating with external servers, such as the picture information server 20, the voiceprint recognition server 30, and the accounting server 40. The display unit 120, such as a liquid crystal display or a touch screen, provides an interactive display interface. The processing unit 130, such as a microprocessor, a single-chip processor, or a microcontroller, is coupled to the communication unit 110 and the display unit 120. In this embodiment, the terminal device 100 further includes an image unit 150 and a sound unit 160. The image unit 150 includes, for example, a camera, an image processor, or a combination thereof. The sound unit 160 includes, for example, a microphone, a sound processor, or a combination thereof.

The image recognition module 141 performs image recognition on image data. The voice recognition module 142 receives speech and performs voice recognition on it. The image recognition module 141 and the voice recognition module 142 can be implemented in hardware, in software, or in a combination of the two. For example, the image recognition module 141 and the voice recognition module 142 may be program modules stored in a memory unit 140 that is electrically coupled to the processing unit 130, and executed by the processing unit 130; alternatively, they may be implemented respectively by an image processor (such as the image unit 150) and a sound processor (such as the sound unit 160) electrically coupled to the processing unit 130. The image recognition module 141 can use various image analysis techniques to segment image data such as a photo, or compare it against a knowledge base, in order to extract the target objects in the photo. It can also perform object recognition, for example recognizing that a target object is a product, a product brand, a store name or address, or something else. The voice recognition module 142 can use speech-to-text (STT) technology to convert speech content into the corresponding text. Such a module is built by assembling a large corpus through sound feature comparison and corpus collection; after receiving speech, the system compares it against the corpus and converts the speech content into the likely text. The text converted from speech can then be compared against analysis or decision conditions for further processing, such as confirming a purchase or confirming a payment. The implementation of the present invention is not, however, limited to these examples.

The processing unit 130 uses the image recognition module 141, the voice recognition module 142, the display unit 120, and the communication unit 110 to carry out online shopping and payment. The processing unit 130 performs image recognition on image data about stores and products through the image recognition module 141 and, through the display interface, lets the user of the terminal device 100 determine at least one product to be purchased. The processing unit 130 performs voice recognition on the user's speech through the voice recognition module 142 and, through the display interface, confirms the purchase of the at least one product, confirms the user's identity, and makes the online payment.

For example, the image data about stores and products may include one or more digital images containing a product and the name of a brand or store. The image data IM in FIG. 1, for instance, contains an image of a product (such as shoes) and of a brand or store name (such as ABC). The image data IM can be produced in different ways. The user can use the terminal device 100 to photograph a physical product together with the product's brand or the store's name and use the photo as the image data IM. The user can also use the terminal device 100 to obtain a digital image containing a product and a brand or store name from an external source (such as a social media site, a web search, a website, or an email). The implementation of the present invention is not, however, limited to these examples.

In one embodiment, the processing unit 130 obtains the image data about stores and products. The processing unit 130 performs image recognition on the image data through the image recognition module 141 to obtain picture feature data about the store and the product, and transmits the picture feature data to the picture information server 20 through the communication unit 110. For example, the image recognition module 141 uses image segmentation to locate target objects, such as a product, a product brand, or a store name or address, and extracts from the image data the image of the product (such as shoes) and the image of the brand or store name (such as ABC) as the picture feature data. Because the image data may contain background, people, or other content unrelated to the product or brand, the picture feature data produced by this image recognition step is much smaller than the original image data. Since the terminal device 100 performs this image recognition as a pre-processing step, this embodiment reduces the processing load on the picture information server 20 and avoids the waste of network and computing resources that would result from transmitting the image data (such as one or more photos) directly, which improves the overall efficiency of online shopping and payment.
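The data reduction described above can be illustrated with a small sketch. It assumes the segmentation step has already produced labeled bounding boxes; the box format `(top, left, bottom, right)`, the labels, and the function names are hypothetical, and the image is modeled as a plain list of pixel rows rather than any particular image library's type.

```python
def crop(image, box):
    """Return the sub-image inside box = (top, left, bottom, right), end-exclusive."""
    top, left, bottom, right = box
    return [row[left:right] for row in image[top:bottom]]


def extract_features(image, boxes):
    """Keep only the labeled regions (e.g. product, brand name) of the image.

    boxes maps a label to a bounding box assumed to come from an
    earlier segmentation step that is not shown here.
    """
    return {label: crop(image, box) for label, box in boxes.items()}


def pixel_count(image):
    """Number of pixels in a list-of-rows image."""
    return sum(len(row) for row in image)
```

For a 6 x 8 image with a 3 x 3 product region and a 2 x 3 brand region, only 15 of the 48 pixels would be sent to the server in this sketch.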

The picture information server 20 has a product database covering products, brands, and stores. The product database contains image data of products, brands, and stores, image data of the products each store sells, the correspondences or associations among these image data, and the prices and product information of the products each store sells. The picture information server 20 compares the picture feature data with the image data in the product database or searches the database with it, for example using the store name or brand name recognized from the picture to find the store, and identifying the type of item in the picture to find the product. If the product database contains a product identical or similar to the one in the picture feature data, the picture information server 20 generates the corresponding product-related data. The product-related data includes store name data, product name data, and price data. Even if the product database contains no identical or similar product of the brand in the picture feature data (such as a hiking shoe of brand ABC), the picture information server 20 can still generate product-related data, for example for other products of the same brand (such as hiking clothing), or for identical or similar products of other brands sold by the same store. As another example, if the picture feature data contains images of a brand and a product, and the product database shows that several stores carry that brand and product, the picture information server 20 can generate product-related data for each of those stores. As a further example, if the picture feature data contains images of a first store, a brand, and a product, and the product database shows that only a second store carries that brand and product, the picture information server 20 can generate product-related data for the second store. The implementation of the present invention is not, however, limited to these examples of the picture information server 20.
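The lookup and fallback behavior described for the picture information server 20 can be sketched as follows. This is an illustration under stated assumptions, not the server's actual method: the `Product` record, its field names, and the exact-match rule on brand and product kind are hypothetical, and recognition of the store, brand, and kind from the picture is assumed to have happened already.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Product:
    """Hypothetical product database record."""
    store: str
    brand: str
    kind: str
    name: str
    price: int


def lookup(catalog, store, brand, kind):
    """Return product-related data for the recognized store, brand, and kind.

    An exact brand/kind match is returned from every store that carries it,
    which covers both the several-stores case and the case where only a
    store other than the pictured one sells the product.
    Otherwise fall back to other products of the same brand, or to the
    same kind of product from other brands at the same store.
    """
    exact = [p for p in catalog if p.brand == brand and p.kind == kind]
    if exact:
        return exact
    return [
        p for p in catalog
        if p.brand == brand or (p.store == store and p.kind == kind)
    ]
```

Each returned record carries the store name, product name, and price, matching the three fields the description lists for product-related data.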

The processing unit 130 receives, through the communication unit 110, and outputs, through the display interface, the product-related data that the picture information server 20 generates in response to the picture feature data, so that the user can select and confirm products through the terminal device 100. For example, the display interface can show the product-related data alongside images of the products and let the user select and confirm the products to be purchased.

In one embodiment, after the user selects and confirms the products to be purchased, the processing unit 130 receives a first voice signal and performs voice recognition on it through the voice recognition module 142 to determine whether the first voice signal indicates that a financial instrument account is to be used as the payment method to purchase at least one product listed in the product-related data. A financial instrument account is, for example, the account of a financial instrument such as a cash card, credit card, or debit card. It is assumed here that the payment method for the relevant financial instrument account has already been set up on the terminal device 100; for example, the user has activated a mobile transaction service with the financial institution that issued the financial instrument, and the relevant payment software or settings have been installed on the terminal device 100. The authorized store voice payment system or the financial institution can require the user to speak a sentence in a particular format that confirms the product and the payment method, for example "Confirm purchase, please pay from my bank debit card". If only one product awaits confirmation, the user can be asked to say "purchase" in the sentence to confirm the purchase of that product; if several products await confirmation, the user can be allowed to say "purchase all" or "purchase the first and third items". "Debit card" in the sentence can be replaced by another financial instrument, such as a cash card or credit card, and "bank" can be replaced by the name of a financial institution, such as a particular bank. The implementation of the present invention is not, however, limited to these examples.
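Once the first voice signal has been converted to text, deciding whether it is a purchase command could look like the sketch below. It works on English transcripts for readability; the keyword lists, the `parse_purchase` name, and the return format are assumptions, and a production system would need a far more robust language-understanding step.

```python
import re

# Hypothetical vocabularies for this sketch only.
ORDINALS = {"first": 1, "second": 2, "third": 3, "fourth": 4, "fifth": 5}
INSTRUMENTS = ("debit card", "credit card", "cash card")


def parse_purchase(text, item_count):
    """Interpret a transcribed utterance as a purchase command.

    Returns (selected item numbers, financial instrument), or None when the
    text does not name both a purchase and a supported payment instrument,
    or when it is unclear which of several items is meant.
    """
    t = text.lower()
    if item_count < 1 or not re.search(r"\b(buy|purchase)\b", t):
        return None
    instrument = next((name for name in INSTRUMENTS if name in t), None)
    if instrument is None:
        return None
    # Explicit ordinals such as "the first and third items".
    picked = sorted(
        n for word, n in ORDINALS.items()
        if n <= item_count and re.search(rf"\b{word}\b", t)
    )
    if picked:
        return picked, instrument
    # A single pending item, or an explicit "all", selects everything.
    if item_count == 1 or re.search(r"\ball\b", t):
        return list(range(1, item_count + 1)), instrument
    return None
```

Returning `None` for an ambiguous utterance lets the caller prompt the user to repeat the sentence instead of guessing which items to buy.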

If the processing unit 130 determines that the first voice signal indicates that a financial instrument account is to be used as the payment method to purchase the at least one product, the processing unit 130 outputs a speak-password request message. For example, the terminal device 100 shows the text "Please say your passphrase" or a representative graphic on the display interface, or prompts the user by sound, to ask the user to speak the password. The processing unit 130 receives a second voice signal and transmits an identity confirmation signal based on the second voice signal to the voiceprint recognition server 30 to confirm the user's identity.

The authorized store voice payment system or the financial institution may require the user to speak a specific phrase in a given format (also called a passphrase) for user identification. For example, the user is asked to speak a passphrase of 6 to 12 characters, such as "土地銀行WZ語音購物89" ("Land Bank WZ Voice Shopping 89"). The specific phrase may be defined as a word, a sentence, a call sound, or a combination thereof. The voiceprint recognition server 30 has a voiceprint feature database, which stores users' voiceprint features and the correspondence between those voiceprint features and the users' identification codes. A voiceprint feature is composed of a plurality of sound characteristic values of the user's voice, such as wavelength, frequency, intensity, and rhythm. After receiving the identity confirmation signal, the voiceprint recognition server 30 processes the second voice signal carried in the identity confirmation signal to derive the voiceprint feature of the second voice signal, and then compares that voiceprint feature with the voiceprint features in the voiceprint feature database. Based on the voiceprint feature of the second voice signal, the voiceprint recognition server 30 finds the corresponding user identification code in the voiceprint feature database. In one embodiment, the voiceprint recognition server 30 may transmit the user identification code to the terminal device 100. In another embodiment, the terminal device 100 may be configured to output the identity confirmation signal based on the second voice signal, to further include a pairing password in the identity confirmation signal, and to transmit it to the voiceprint recognition server 30. After finding the user identification code as described above, the voiceprint recognition server 30 may further determine whether the password stored in the voiceprint recognition server 30 in association with that user identification code is consistent with (for example, identical to or matching) the pairing password transmitted by the terminal device 100. If the passwords are consistent, the voiceprint recognition server 30 transmits the user identification code to the terminal device 100. If they are not, the voiceprint recognition server 30 notifies the terminal device 100 of the result, and the terminal device 100 may allow the user to speak the specific phrase again or may stop the shopping operation. This embodiment strengthens the security of the user identity confirmation mechanism. However, implementation of the invention is not limited to the above examples. In some embodiments, the terminal device 100 may be configured to analyze the second voice signal to obtain sound characteristic values such as wavelength, frequency, intensity, rhythm, or a combination thereof, to output an identity confirmation signal containing those sound characteristic values, optionally together with a pairing password, and to transmit it to the voiceprint recognition server 30 for user identity confirmation. Because the terminal device 100 performs this preliminary sound analysis on the second voice signal, these embodiments reduce the processing load on the voiceprint recognition server 30 and avoid the waste of network resources and computing data that transmitting the second voice signal directly would cause, thereby improving the overall efficiency of online shopping and payment.
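The server-side check described above (find the closest enrolled voiceprint, then optionally compare a pairing password) can be sketched roughly as follows. This is only an illustration under stated assumptions: the patent does not say how voiceprint features are represented or compared, so the fixed-length feature vector, the cosine-similarity measure, the 0.95 threshold, and the names `Enrollment` and `identify_user` are all hypothetical.

```python
import hmac
import math
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Enrollment:
    """One row of a hypothetical voiceprint feature database."""
    user_id: str
    features: Sequence[float]   # assumed fixed-length voiceprint feature vector
    pairing_password: str       # illustration only; a real system would store a salted hash


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def identify_user(
    db: Sequence[Enrollment],
    features: Sequence[float],
    pairing_password: Optional[str] = None,
    threshold: float = 0.95,    # assumed acceptance threshold
) -> Optional[str]:
    """Return the user identification code of the best match, or None on failure."""
    best = max(db, key=lambda e: cosine_similarity(e.features, features), default=None)
    if best is None or cosine_similarity(best.features, features) < threshold:
        return None
    # Embodiment with a pairing password: the stored and transmitted values must agree.
    if pairing_password is not None and not hmac.compare_digest(
        best.pairing_password.encode(), pairing_password.encode()
    ):
        return None
    return best.user_id
```

Returning `None` corresponds to the case in which the server notifies the terminal device of a failed check, after which the user may speak the phrase again or stop shopping.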

The processing unit 130 receives the user identification code that the voiceprint recognition server 30 outputs in response to the identity confirmation signal. In one embodiment, the processing unit 130 is configured to transmit shopping-related data to the accounting server 40. The shopping-related data includes the store name data, product name data, and price data corresponding to the at least one product, together with the user identification code and the data of the financial instrument account. The accounting server 40 may be implemented with a shopping payment database that holds the correspondence between shopping-related data and payment-related data; based on the shopping-related data, the accounting server 40 finds the corresponding payment-related data in the shopping payment database and outputs it. The processing unit 130 is configured to receive the payment-related data and output it through the display interface. The payment-related data includes data representing the transfer-out account number, the authorized store's transfer-in account number, and the transfer amount.
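One way to picture the mapping from shopping-related data to payment-related data is the sketch below. The field names, the two lookup dictionaries standing in for the shopping payment database, and the function `find_payment_data` are assumptions made for illustration; the patent only lists which items each data set contains.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class ShoppingData:
    """Shopping-related data sent to the accounting server (field names assumed)."""
    store_name: str
    product_name: str
    price: Decimal
    user_id: str
    account_label: str  # hypothetical handle for the financial instrument account


@dataclass(frozen=True)
class PaymentData:
    """Payment-related data returned for display (field names assumed)."""
    transfer_out_account: str
    store_transfer_in_account: str
    amount: Decimal


def find_payment_data(
    shopping: ShoppingData,
    store_accounts: Dict[str, str],
    user_accounts: Dict[Tuple[str, str], str],
) -> Optional[PaymentData]:
    """Look up both account numbers; return None if either is unknown or the price is invalid."""
    store_account = store_accounts.get(shopping.store_name)
    user_account = user_accounts.get((shopping.user_id, shopping.account_label))
    if store_account is None or user_account is None or shopping.price <= 0:
        return None
    return PaymentData(user_account, store_account, shopping.price)
```

`Decimal` is used for the amount so that currency values are not subject to binary floating-point rounding.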

To determine whether the user wishes to make the payment to purchase the product, the authorized store voice payment system or the financial institution may require the user to speak a specific phrase in a given format for user identification, for example "Correct", "OK", "Yes, please transfer", or another specific phrase, where the specific phrase may be defined as a word, a sentence, a call sound, or a combination thereof. To this end, the processing unit 130 is configured to receive a third voice signal and to perform voice recognition on it through the voice recognition module 142 to determine whether the third voice signal indicates confirmation of the payment for the at least one product. If the processing unit 130 determines that the third voice signal indicates confirmation of the payment, for example because the third voice signal is recognized as matching the specific phrase "Correct", it transmits a payment confirmation message to the accounting server 40 to carry out the payment. For example, after receiving the payment confirmation message, the accounting server 40 may perform the payment-related operations, such as carrying out an online transfer or payment transaction through the accounting system of the financial institution or through a third-party payment system. The accounting server 40 may then transmit the payment result, such as success or failure, and the payment information to the terminal device 100.
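The phrase check on the third voice signal can be illustrated as below, assuming that speech recognition has already produced a text transcript. The normalization rules and the name `is_payment_confirmed` are hypothetical; the accepted phrases are the examples given in the text (in the original Chinese where applicable).

```python
import unicodedata
from typing import Iterable

# Example confirmation phrases taken from the description.
CONFIRM_PHRASES = ("正確", "OK", "對的，請轉帳")


def normalize(text: str) -> str:
    """Fold width and case, then drop punctuation and whitespace."""
    folded = unicodedata.normalize("NFKC", text).casefold()
    return "".join(ch for ch in folded if ch.isalnum())


def is_payment_confirmed(transcript: str, phrases: Iterable[str] = CONFIRM_PHRASES) -> bool:
    """True only if the whole transcript matches one accepted phrase after normalization."""
    return normalize(transcript) in {normalize(p) for p in phrases}
```

Requiring a match on the whole transcript, rather than a substring, is a deliberate choice in this sketch so that an utterance such as "not OK" is not taken as confirmation.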

As described above, the authorized store voice payment system 10 implemented on the terminal device 100 provides a user interface based on image and voice interaction for selecting and confirming the products to be purchased, as well as for payment method confirmation, identity confirmation, and payment confirmation. In this way, the terminal device 100 makes effective use of computing and communication resources to help the user complete online shopping and payment.

Please refer to FIG. 3, which is a schematic flowchart of an embodiment of an operating method of the authorized store voice payment system. As shown in FIG. 3, the operating method includes the following steps S10 to S40. In step S10, the authorized store voice payment system performs image recognition on image data about a store and products to determine at least one product to be purchased. In step S20, the system performs voice recognition on a first voice signal to confirm the purchase of the at least one product. In step S30, the system performs voice recognition on a second voice signal to confirm the user's identity. In step S40, the system performs voice recognition on a third voice signal to carry out online payment.
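The ordering of steps S10 to S40 can be sketched as a simple sequential driver. This is a minimal illustration, not the patented implementation: each step is represented by a hypothetical callable that returns `True` on success, and the name `run_flow` is an assumption.

```python
from enum import Enum
from typing import Callable, Dict, Optional


class Step(Enum):
    S10 = "image recognition: determine the product to purchase"
    S20 = "first voice signal: confirm the purchase"
    S30 = "second voice signal: confirm user identity"
    S40 = "third voice signal: carry out online payment"


def run_flow(handlers: Dict[Step, Callable[[], bool]]) -> Optional[Step]:
    """Run S10..S40 in order; return the first step that fails, or None if all succeed."""
    for step in Step:  # Enum members iterate in definition order
        if not handlers[step]():
            return step
    return None
```

A caller would supply one handler per step, for example a handler for `Step.S30` that sends the identity confirmation signal and reports whether a user identification code came back.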

The operating method shown in FIG. 3 can be used to implement the authorized store voice payment system, for example on a terminal device at the user end or on the server side. In addition, various extensions and applications of any embodiment of the authorized store voice payment system described above can be made on the basis of this operating method.

In some embodiments, the operating method shown in FIG. 3 may be applied to a terminal device at the user end, for example by running on the terminal device an application (app) built on the operating method of FIG. 3. The terminal device thereby acts as an intermediary: it provides the user interface, uses the image recognition and voice recognition functions, and then communicates with the picture information server 20, the voiceprint recognition server 30, and the accounting server 40, so that the terminal device at the user end makes effective use of computing and communication resources to help the user complete online shopping and payment.

The implementation of steps S10 to S40 of FIG. 3 is further illustrated below using the embodiments of FIG. 4A and FIG. 4B.

Please refer to FIG. 4A and FIG. 4B, which are schematic flowcharts of an embodiment based on the operating method of FIG. 3. As shown in FIG. 4A, in one embodiment, step S10 may include the following steps S110 to S130. In step S110, the authorized store voice payment system obtains image data about the store and products. In step S120, the system performs image recognition on the image data to obtain picture feature data about the store and products, and transmits the picture feature data to the picture information server 20. In step S130, the system receives and outputs the product-related data that the picture information server 20 generates in response to the picture feature data; the product-related data includes store name data, product name data, and price data. The picture feature data, the picture information server 20, and the generation of the product-related data can be implemented using the embodiments described earlier, so the details are not repeated here. However, implementation of the invention is not limited to this example.

As shown in FIG. 4A, in one embodiment, step S20 may include the following steps S140 and S150. In step S140, the authorized store voice payment system receives the first voice signal and performs voice recognition on it to determine whether the first voice signal indicates purchasing, with the financial instrument account as the payment method, the at least one product referred to in the product-related data. In step S150, if the first voice signal is determined to indicate purchasing the at least one product using the financial instrument account as the payment method, the system outputs the spoken-password request message. In addition, as shown in step S145, if the determination result of step S150 is negative, the system may perform other processing, such as reporting an error and stopping, or allowing the user to try the voice input again. Confirming the payment method and selecting or confirming products by voice can be implemented using the embodiments described earlier, so the details are not repeated here. However, implementation of the invention is not limited to this example.

As shown in FIG. 4A, in one embodiment, step S30 may include the following steps S160 and S170. In step S160, the authorized store voice payment system receives the second voice signal and transmits an identity confirmation signal based on the second voice signal to the voiceprint recognition server 30 for user identity confirmation. The content of the identity confirmation signal, the voiceprint recognition server 30, and the user identity confirmation can be implemented using the embodiments described earlier, so the details are not repeated here. If the voiceprint recognition server 30 confirms the user's identity, it transmits the corresponding user identification code to the authorized store voice payment system. In step S170, the system receives the user identification code that the voiceprint recognition server 30 outputs in response to the identity confirmation signal. If the voiceprint recognition server 30 cannot confirm the user's identity, the system may allow the user to speak the specific phrase again or may stop the shopping operation. However, implementation of the invention is not limited to this example.
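For the embodiments in which the terminal device analyzes the second voice signal before sending the identity confirmation signal, the following sketch shows one crude way to compute two sound characteristic values locally. It is an illustration only: RMS amplitude as "intensity", a zero-crossing count as a "frequency" estimate, the dictionary layout of the signal, and both function names are assumptions, and real voiceprint features would be considerably richer.

```python
import math
from typing import Dict, Optional, Sequence


def sound_characteristics(samples: Sequence[float], sample_rate: int) -> Dict[str, float]:
    """Estimate intensity (RMS) and dominant frequency (zero crossings) of a mono signal."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    # Each full cycle of a roughly periodic signal crosses zero twice.
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0))
    duration = (len(samples) - 1) / sample_rate
    return {"intensity_rms": rms, "frequency_hz": crossings / (2 * duration)}


def build_identity_confirmation(
    samples: Sequence[float],
    sample_rate: int,
    pairing_password: Optional[str] = None,
) -> Dict[str, object]:
    """Assemble a hypothetical identity confirmation signal from locally computed values."""
    signal: Dict[str, object] = {"features": sound_characteristics(samples, sample_rate)}
    if pairing_password is not None:
        signal["pairing_password"] = pairing_password  # optional, per the pairing-password embodiment
    return signal
```

Sending a handful of numbers instead of the raw audio is what lets these embodiments reduce server load and network traffic.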

As shown in FIG. 4B, in one embodiment, step S40 may include the following steps. In step S180, the authorized store voice payment system transmits shopping-related data to the accounting server 40; the shopping-related data includes the store name data, product name data, and price data corresponding to the at least one product, together with the user identification code and the data of the financial instrument account. In step S190, the system receives and outputs the payment-related data that the accounting server 40 outputs in response to the shopping-related data; the payment-related data includes data representing the transfer-out account number, the authorized store's transfer-in account number, and the transfer amount. In step S200, the system receives the third voice signal and performs voice recognition on it to determine whether the third voice signal indicates confirmation of the payment for the at least one product. In step S210, if the system determines that the third voice signal indicates confirmation of the payment, it transmits a payment confirmation message to the accounting server 40 to carry out the payment. In addition, as shown in step S205, if the determination result of step S200 is negative, the system may perform other processing, such as stopping, or allowing the user to try the voice input again. Confirming payment by voice can be implemented using the embodiments described earlier, so the details are not repeated here. However, implementation of the invention is not limited to this example.

In addition, the authorized store voice payment system described above can be implemented on the basis of any embodiment of the operating method of FIG. 3, with possible extensions and applications. For example, in some embodiments, the operating method of FIG. 3 is extended so that the authorized store voice payment system obtains a shipping address and asks the user to confirm it by voice. Please refer to FIG. 5, which is a schematic flowchart of another embodiment based on the operating method of FIG. 3. As shown in FIG. 5, in one embodiment, the operating method further includes the following steps. In step S310, the authorized store voice payment system finds the location of the terminal device used by the paying user. For example, where the terminal device is a computing device such as a mobile phone, tablet, or other smart device, a location-based service (LBS) obtains the location information of the mobile terminal user through the mobile operator's radio communication network or through an external positioning method (such as mobile network positioning, Wi-Fi positioning, Bluetooth positioning, or GPS positioning), and the geographic address of the terminal device is then determined with the support of a geographic information system (GIS) platform. The system may further correct this geographic address using the user's registration information or shopping records in the authorized store voice payment system, so that the resulting geographic address is a shipping address that postal or courier services can deliver to. In step S320, the system outputs a shipping address confirmation message asking the user to confirm whether the geographic address derived from the terminal device's location should be used as the shipping address of the at least one product. For example, the shipping address confirmation message may contain a map and text, or multiple shipping address options, and is output on the user interface of the computing device as voice, images, text, or a combination thereof. The system may ask the user to select and confirm the shipping address by voice. For instance, if the user interface presents three candidate addresses, the user can say "Please ship to the first address"; if only one address is offered, the user can say "Please ship to this address" or "Please ship here". In step S330, the system receives a fourth voice signal and performs voice recognition on it to determine whether the fourth voice signal indicates that the user confirms the shipping address. In step S340, if the system determines that the fourth voice signal indicates that the user confirms the shipping address, it transmits shipping address information to the accounting server 40 or to a store transaction unit corresponding to the at least one product. For example, the store transaction unit may be a server or program module of the store providing the product, and may further notify the store to handle follow-up matters such as shipment and accounts.
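The voice selection among candidate shipping addresses can be sketched as follows, assuming the fourth voice signal has already been transcribed to English text. The ordinal table, the accepted wording, and the name `choose_address` are hypothetical and cover only the example utterances quoted above.

```python
import re
from typing import Optional, Sequence

# Hypothetical mapping from spoken ordinals to option indexes.
ORDINALS = {"first": 0, "second": 1, "third": 2}


def choose_address(transcript: str, options: Sequence[str]) -> Optional[str]:
    """Return the address the user selected, or None if the utterance is not a valid choice."""
    text = transcript.lower()
    for word, index in ORDINALS.items():
        if re.search(rf"\b{word}\b", text):
            return options[index] if index < len(options) else None
    # "this address" / "here" is unambiguous only when exactly one option is offered.
    if len(options) == 1 and re.search(r"\b(this address|here)\b", text):
        return options[0]
    return None
```

A `None` result would correspond to the branch in which the system asks the user to input a shipping address or to try again.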

As shown in FIG. 5, in one embodiment, the operating method further includes the following steps. In step S331, if the authorized store voice payment system determines that the fourth voice signal indicates that the shipping address is incorrect, the system outputs a shipping address request message asking the user to input a shipping address. For example, the shipping address request message may be output on the user interface of the terminal device as voice, images, text, or a combination thereof. In step S333, the system receives a fifth voice signal and performs voice recognition on it to determine whether the fifth voice signal represents a shipping address input by the user. In step S335, if the system determines that the fifth voice signal represents a shipping address input by the user, it transmits the shipping address information to the accounting server 40 or to the store transaction unit corresponding to the at least one product. If the determination is negative, the system may allow the user to input the shipping address again or may stop the shopping operation. However, implementation of the invention is not limited to this example.

In other embodiments, the operating method of FIG. 3 may be applied to implement a server-side authorized store voice payment system. Please refer to FIG. 6, which is a schematic diagram of an embodiment of a server-side authorized store voice payment system 15. As shown in FIG. 6, the server-side authorized store voice payment system 15 includes at least one management server, which can be used to implement any embodiment, or a combination of embodiments, of the operating method of FIG. 3. The terminal device 100A at the user end communicates with the server-side authorized store voice payment system 15 wirelessly or over a wired connection. The server-side authorized store voice payment system 15 acts as an intermediary: it communicates and interacts with the application or web browser running on the terminal device 100A, and processes and responds to requests for online shopping and payment, so that the corresponding user interface in the application or web browser at the user end lets the user operate by image and voice. The server-side authorized store voice payment system 15 implements the image recognition and voice recognition functions based on the embodiments of the operating method of FIG. 3, and then communicates with the picture information server 20, the voiceprint recognition server 30, and the accounting server 40, so that the authorized store voice payment system 15 makes effective use of computing and communication resources to help the user complete online shopping and payment.

In some embodiments in which the operating method based on FIG. 3 (as in FIG. 3 to FIG. 5) is applied to the server-side authorized store voice payment system 15, the terminal device 100A at the user end communicates with the server-side authorized store voice payment system 15 wirelessly or over a wired connection. The authorized store voice payment system 15 obtains the image data about the store and products from the terminal device 100A and receives the voice signals (such as the first to third voice signals, or the fourth and fifth voice signals) from it; the terminal device 100A may use sound processing or audio compression techniques to convert the user's speech into a digitized signal that serves as the voice signal. The authorized store voice payment system 15 outputs the product-related data received from the picture information server 20 to the terminal device 100A, outputs the spoken-password request message to the terminal device 100A, and outputs the payment-related data received from the accounting server 40 to the terminal device 100A.

Please refer to FIG. 7, which is a schematic block diagram of an embodiment of a management server of the authorized store voice payment system. As shown in FIG. 7, an embodiment of the management server 200 includes a network unit 210 and a data processing unit 220. The network unit 210 is configured to perform network communication and to communicate with the terminal device 100A. The data processing unit 220 is electrically coupled to the network unit 210. The network unit 210 is a wireless or wired communication module, for example one that establishes a communication link with at least one of a broadband network, an optical fiber network, a wireless local area network, and a mobile communication network in order to communicate with the network. The data processing unit 220 is a microprocessor, a microcontroller, or any circuit with control or computing functions. The management server 200 may further include a storage unit 230, such as a memory device, a database, cloud storage space, or a combination thereof, and the storage unit 230 may be used to store data that the management server handles while processing online shopping or payment. However, implementation of the invention is not limited to the above examples.

In some embodiments, a computer-readable storage medium is provided that records program code causing a computing device (such as the terminal device 100 or the management server 200 described above) to execute the operating method of the authorized store voice payment system based on FIG. 3, where the method may be any one, or any combination, of the embodiments described above, including the operating methods of FIG. 3, 4A, 4B, or 5. For example, the program code may be one or more programs or program modules, such as modules implementing steps S10 to S40 of FIG. 3, and may be executed in any suitable order. When the computing device (such as the terminal device 100 or the management server 200 described above) executes the program code, the computing device carries out an embodiment of the operating method based on FIG. 3. Examples of such storage media include, but are not limited to, optical information storage media, magnetic information storage media, and memory such as a memory card, firmware, ROM, or RAM.

As the above embodiments show, the authorized store voice payment system lets the terminal device at the user end provide a user interface based on image and voice interaction for selecting and confirming the products to be purchased, as well as for payment method confirmation, identity confirmation, and payment confirmation. In this way, the authorized store voice payment system makes effective use of computing and communication resources to help the user complete online shopping and payment. Conventional shopping and payment systems may require the user to type text repeatedly, for example entering a password for identity confirmation and entering a debit card or credit card account number and password during payment. By contrast, the authorized store voice payment system lets the user confirm the products to be purchased, the payment method, the user's identity, and the payment by voice, which greatly reduces the complexity of these operations. The system thus makes online shopping and payment more efficient and convenient and gives the user a better experience. The authorized store voice payment system according to the invention can therefore improve the convenience of mobile payment. For example, it can be implemented so that the user does not need to bind a debit card or credit card in advance and does not need to install a card reader, and it can also spare the user the inconvenience of needing a credit card or debit card at hand to enter a card number or account number before paying when shopping online.

The invention has been disclosed above by way of preferred embodiments. Those skilled in the art should understand, however, that the embodiments serve only to illustrate the invention and should not be read as limiting its scope. It should be noted that all variations and substitutions equivalent to the embodiments are intended to fall within the scope of the invention. The scope of protection of the invention is therefore defined by the claims.

10‧‧‧User-side authorized store voice payment system

15‧‧‧Server-side authorized store voice payment system

20‧‧‧Picture information server

30‧‧‧Voiceprint recognition server

40‧‧‧Accounting server

90‧‧‧Network

100‧‧‧Terminal device

100A‧‧‧Terminal device

110‧‧‧Communication unit

120‧‧‧Display unit

130‧‧‧Processing unit

140‧‧‧Storage unit

141‧‧‧Image recognition module

142‧‧‧Voice recognition module

150‧‧‧Image unit

160‧‧‧Sound unit

200‧‧‧Management server

210‧‧‧Network unit

220‧‧‧Data processing unit

230‧‧‧Storage unit

IM‧‧‧Image data

S10~S40‧‧‧Steps

S110~S210‧‧‧Steps

S310~S335‧‧‧Steps

[Fig. 1] is a block diagram of an application scenario of an embodiment of the special store voice payment system applied to a user terminal.

[Fig. 2] is a block diagram of an embodiment of the user-side special store voice payment system.

[Fig. 3] is a schematic flowchart of an embodiment of a method for operating the special store voice payment system.

[Fig. 4A] is a schematic flowchart of an embodiment based on the operating method of Fig. 3.

[Fig. 4B] is a schematic flowchart of an embodiment based on the operating method of Fig. 3.

[Fig. 5] is a schematic flowchart of another embodiment based on the operating method of Fig. 3.

[Fig. 6] is a schematic diagram of an embodiment of the special store voice payment system applied to a server side.

[Fig. 7] is a schematic block diagram of an embodiment of a management server of the special store voice payment system.

Claims

A special store voice payment system, applicable to a user terminal, the special store voice payment system comprising: a communication unit; a display unit for providing an interactive display interface; a processing unit coupled to the communication unit and the display unit; an image recognition module for performing image recognition on image data; and a voice recognition module for performing voice recognition on voice signals; wherein the processing unit uses the image recognition module, the voice recognition module, the display unit, and the communication unit to carry out online shopping and payment; the processing unit performs image recognition on image data relating to a store and products through the image recognition module and determines, through the display interface, at least one product to be purchased; and the processing unit performs voice recognition on a voice signal through the voice recognition module and, through the display interface, determines the purchase of the at least one product, confirms the user's identity, and performs online payment.

The special store voice payment system according to claim 1, wherein: the processing unit is configured to obtain the image data relating to the store and the products; the processing unit is configured to perform image recognition on the image data through the image recognition module to obtain picture characteristic data relating to the store and the products, and to transmit the picture characteristic data to a picture information server through the communication unit; and the processing unit is configured to receive through the communication unit, and output through the display interface, product-related data that the picture information server outputs in response to the picture characteristic data, the product-related data including store name data, product name data, and price data.

The special store voice payment system according to claim 2, wherein: the processing unit is configured to receive a first voice signal and accordingly perform voice recognition through the voice recognition module to determine whether the first voice signal represents purchasing, with a payment method that uses a financial instrument account, at least one product indicated in the product-related data; if the processing unit determines that the first voice signal represents purchasing the at least one product with the payment method that uses the financial instrument account, the processing unit outputs a spoken password request message, receives a second voice signal, and transmits an identity confirmation signal based on the second voice signal to a voiceprint recognition server through the communication unit; and the processing unit is configured to receive, through the communication unit, a user identification code that the voiceprint recognition server outputs in response to the identity confirmation signal.

The special store voice payment system according to claim 3, wherein: the processing unit is configured to transmit shopping-related data to an accounting server through the communication unit, the shopping-related data including store name data, product name data, and price data corresponding to the at least one product, the user identification code, and data of the financial instrument account; the processing unit is configured to receive through the communication unit, and output through the display interface, payment-related data that the accounting server outputs in response to the shopping-related data, the payment-related data including data representing a transfer-out account number, a special store transfer account number, and a transfer amount; the processing unit is configured to receive a third voice signal and accordingly perform voice recognition through the voice recognition module to determine whether the third voice signal represents confirmation of making a payment to purchase the at least one product; and if the processing unit determines that the third voice signal represents confirmation of making the payment to purchase the at least one product, the processing unit transmits a payment confirmation message to the accounting server through the communication unit to make the payment.

The special store voice payment system according to claim 4, wherein: the processing unit is configured to determine the address of the terminal device that makes the payment; the processing unit is configured to output a shipping address confirmation message requesting confirmation of whether the geographic address corresponding to the address of the terminal device is to be used as the shipping address of the at least one product; the processing unit is configured to receive a fourth voice signal and accordingly perform voice recognition to determine whether the fourth voice signal represents confirmation of the shipping address; and if the processing unit determines that the fourth voice signal represents confirmation of the shipping address, the processing unit is configured to transmit a shipping address message to the accounting server or to a store transaction unit corresponding to the at least one product.

A special store voice payment system, applicable to a server side, the special store voice payment system comprising: at least one management server, each management server comprising: a network unit for network communication, which communicates with at least one terminal device; and a data processing unit electrically coupled to the network unit for processing signals transmitted by the at least one terminal device; wherein the at least one management server is configured to: perform image recognition on image data relating to a product to determine at least one product to be purchased; perform voice recognition on a first voice signal to determine the purchase of the at least one product; perform voice recognition on a second voice signal to confirm the user's identity; and perform voice recognition on a third voice signal to carry out online payment.

The special store voice payment system according to claim 6, wherein the at least one management server is configured to: obtain the image data; perform image recognition on the image data to obtain picture characteristic data relating to the store and the product, and transmit the picture characteristic data to a picture information server; receive and output product-related data that the picture information server outputs in response to the picture characteristic data, the product-related data including store name data, product name data, and price data; receive the first voice signal and accordingly perform voice recognition to determine whether the first voice signal represents purchasing, with a payment method that uses a financial instrument account, at least one product indicated in the product-related data; and if it is determined that the first voice signal represents purchasing the at least one product with the payment method that uses the financial instrument account, output a password request message.

The special store voice payment system according to claim 7, wherein the at least one management server is configured to: receive the second voice signal and accordingly transmit an identity confirmation signal to a voiceprint recognition server; receive a user identification code that the voiceprint recognition server outputs in response to the identity confirmation signal; transmit shopping-related data to an accounting server, the shopping-related data including store name data, product name data, and price data corresponding to the at least one product, the user identification code, and data of the financial instrument account; receive and output payment-related data that the accounting server outputs in response to the shopping-related data, the payment-related data including data representing a transfer-out account number, a special store transfer account number, and a transfer amount; receive the third voice signal and accordingly perform voice recognition to determine whether the third voice signal represents confirmation of making a payment to purchase the at least one product; and if it is determined that the third voice signal represents confirmation of making the payment to purchase the at least one product, transmit a payment confirmation message to the accounting server to make the payment.

The special store voice payment system according to claim 8, wherein the at least one management server is configured to: determine the address of the terminal device that makes the payment; output a shipping address confirmation message requesting confirmation of whether the geographic address corresponding to the address of the terminal device is to be used as the shipping address of the at least one product; receive a fourth voice signal and accordingly perform voice recognition to determine whether the fourth voice signal represents confirmation of the shipping address; and if it is determined that the fourth voice signal represents confirmation of the shipping address, transmit a shipping address message to the accounting server or to a store transaction unit corresponding to the at least one product.

The special store voice payment system according to claim 9, wherein the at least one management server is configured to: if it is determined that the fourth voice signal represents that the shipping address is incorrect, output a shipping address request message requesting input of the shipping address; receive a fifth voice signal and accordingly perform voice recognition to determine whether the fifth voice signal represents input of the shipping address; and if it is determined that the fifth voice signal represents input of the shipping address, transmit a shipping address message to the accounting server or to a store transaction unit corresponding to the at least one product.
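
For readers who want a concrete picture of the shipping-address handling recited in the last two claims, the following Python sketch shows one possible reading. It is not part of the claims and does not limit them: `ShippingAddressMessage`, `resolve_shipping_address`, and the `source` field are hypothetical names, and the fourth and fifth voice signals are assumed to have already been converted by voice recognition into a yes/no result and a text address respectively.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ShippingAddressMessage:
    """Hypothetical payload for the accounting server or store transaction unit."""
    address: str
    source: str  # "terminal" if derived from the terminal device, "spoken" if dictated


def resolve_shipping_address(
    terminal_geo_address: str,
    fourth_reply_confirms: bool,
    fifth_reply_address: Optional[str] = None,
) -> Optional[ShippingAddressMessage]:
    """Pick the shipping address from the recognized fourth and fifth replies.

    Returns None while no usable address is available, i.e. the proposed
    address was rejected and no replacement has been dictated yet.
    """
    if fourth_reply_confirms:
        # The user accepted the geographic address of the terminal device.
        return ShippingAddressMessage(terminal_geo_address, "terminal")
    # The proposed address was rejected: fall back to a dictated address.
    spoken = (fifth_reply_address or "").strip()
    if spoken:
        return ShippingAddressMessage(spoken, "spoken")
    return None  # a shipping address request message would be output here
```

Returning `None` instead of raising keeps the sketch neutral about how a real system would prompt the user again; that behaviour is left open here.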