WO2020052370A1 - Self-service usage method and apparatus, and electronic device - Google Patents

Self-service usage method and apparatus, and electronic device

Info

Publication number
WO2020052370A1
WO2020052370A1 · PCT/CN2019/098986 · CN2019098986W
Authority
WO
WIPO (PCT)
Prior art keywords
service
service name
semantic content
self
user
Prior art date
Application number
PCT/CN2019/098986
Other languages
English (en)
French (fr)
Inventor
李善发 (Li Shanfa)
Original Assignee
阿里巴巴集团控股有限公司 (Alibaba Group Holding Limited)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 阿里巴巴集团控股有限公司 (Alibaba Group Holding Limited)
Publication of WO2020052370A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/16Sound input; Sound output
    • G06F3/162Interface to dedicated audio devices, e.g. audio drivers, interface to CODECs
    • GPHYSICS
    • G07CHECKING-DEVICES
    • G07FCOIN-FREED OR LIKE APPARATUS
    • G07F11/00Coin-freed apparatus for dispensing, or the like, discrete articles
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/26Speech to text systems

Definitions

  • the embodiments of the present specification relate to the field of Internet technologies, and in particular to a self-service usage method and apparatus, and an electronic device.
  • a self-service usage method, apparatus, and electronic device provided by the embodiments of this specification, as well as an event-tracking ("buried point") analysis method, apparatus, and electronic device:
  • a self-service usage method includes:
  • acquiring voice information input by a user;
  • recognizing the semantic content represented by the voice information;
  • determining, based on the semantic content, the name of a service the user needs to purchase;
  • invoking a payment interface to perform payment settlement for the service corresponding to the service name.
  • a self-service usage apparatus includes:
  • an obtaining unit for obtaining voice information input by a user;
  • a recognition unit that recognizes the semantic content represented by the voice information;
  • a determining unit that determines, based on the semantic content, the name of a service the user needs to purchase;
  • a payment unit that invokes a payment interface to perform payment settlement for the service corresponding to the service name.
  • an electronic device including:
  • a processor;
  • a memory for storing processor-executable instructions;
  • wherein the processor is configured to perform any one of the above self-service usage methods.
  • This specification provides a self-service usage scheme: self-service is used through voice, and a single voice interaction completes the use of the service, improving efficiency. The scheme lets users use self-service by voice even when no touch screen is available.
  • FIG. 1 is a schematic diagram of a self-service vending machine provided by an embodiment of the present specification
  • FIG. 2 is a schematic diagram of a self-service vending machine provided by an embodiment of the present specification
  • FIG. 3 is a flowchart of a method for using self-service provided by an embodiment of the present specification
  • FIG. 4 is a hardware structural diagram of a self-service using device provided by an embodiment of the present specification.
  • FIG. 5 is a schematic block diagram of a self-service using device provided by an embodiment of the present specification.
  • although the terms first, second, third, etc. may be used in this specification to describe various information, the information should not be limited by these terms. These terms are only used to distinguish information of the same type from one another.
  • first information may also be referred to as second information and, similarly, second information may be referred to as first information.
  • the word "if" as used herein may be interpreted as "at the time of", "when", or "in response to determining".
  • the self-service terminal needs to set a selection button corresponding to each service item.
  • the self-service vending machine 11 includes a window 12 for displaying items and an item outlet 13.
  • an item 121 for sale is placed in the window 12.
  • a physical button 122 may be provided near the item 121. The user can press the button corresponding to the item to be purchased to initiate a request to purchase the item to the kiosk 11.
  • once the purchase succeeds, the self-service vending machine 11 can dispense the item from the item outlet 13 for the user to take.
  • in the conventional self-service vending machine shown in FIG. 1, the window 12 has limited area, so not many items can be displayed, and the number of items for users to choose from is limited.
  • the self-service vending machine 21 includes a touch screen 22 for displaying items and an item outlet 23.
  • items for sale can be displayed dynamically; after the user taps a target item, a second confirmation can be requested from the user; and the item the user needs to purchase is controlled through a virtual button.
  • the touch screen is still of limited size: with a large number of items, the user has to page through multiple screens to find the desired item, which hurts shopping efficiency and the user experience.
  • this specification provides a self-service usage method, introduced below with reference to the example shown in FIG. 3.
  • the method can be applied to a self-service terminal (hereinafter, "terminal").
  • the method may include the following steps:
  • Step 310 Acquire voice information input by the user.
  • the terminal may be provided with a voice collection device, such as a microphone.
  • the user's voice information can be collected by the voice acquisition device.
  • the display screen can display the welcome interface by default.
  • the user can wake up the terminal by touching the welcome interface to perform voice input.
  • the step 310 may specifically include:
  • the preset wake-up word may be set artificially.
  • when a user shops by voice, they can start by saying a preset wake word.
  • for example, the preset wake word is "I want to shop";
  • the user first needs to say "I want to shop" to wake the terminal.
  • the terminal can provide feedback, such as playing "OK, please say what you would like to buy", or display the corresponding text on the touch screen.
  • the terminal can call the voice collection device to collect the voice information spoken by the user next.
  • the step 310 may specifically include:
  • the user's voice information is collected.
  • the terminal may be provided with an image capture device, such as a camera, which continuously monitors whether a user appears in front of the terminal. Specifically, when the camera captures a face image, a user is ready to use the terminal; the terminal is woken up, and the voice collection device can then be started to collect the user's speech.
  • Step 320 Identify the semantic content represented by the voice information.
  • after collecting the voice information input by the user, the terminal can further identify the semantic content the voice information is meant to express.
  • the identification process may be performed based on a recognition library built into the terminal, or may be performed by a server through a network.
  • the server side may refer to the server, server cluster, or cloud platform built on a server cluster that corresponds to the self-service;
  • for example, the shopping server, server cluster, or cluster-based cloud platform corresponding to a self-service vending machine.
  • the step 320 may specifically include:
  • A1 convert the voice information into text information based on the voice recognition technology
  • A2 based on semantic recognition technology, extracting a vocabulary of a predetermined part of speech from the text information;
  • A3 Determine the extracted vocabulary as semantic content.
  • the voice recognition technology may refer to a technology that uses voice recognition algorithms to convert voice information into text information.
  • the server can continuously train the speech recognition model with machine learning, improving recognition accuracy, for example recognizing dialects more reliably. Specifically, the server can learn the local speech of each region and thus more accurately recognize the text that speech from a given region actually corresponds to. For example, the server can collect a large number of speech samples from a region in advance (speech whose text has already been recognized, i.e. sample pairs of speech and text) and build a speech recognition model for the region with a machine learning algorithm; continuous learning keeps improving the model.
  • once ready, the region's speech recognition model can be put into production and used to recognize the speech uploaded by terminals in that region.
  • for schemes where the terminal recognizes speech itself, the server can push the learned local speech recognition models to the terminals of different regions, so that terminals can recognize speech more accurately.
  • semantic recognition technology turns the text recognized from speech into instructions a machine can understand.
  • this embodiment can process the original text recognized from the speech (that is, the text directly corresponding to the speech) and supplement the content missing from it, forming more complete and accurate text.
  • the server can continuously learn the semantic recognition model based on machine learning technology, thereby improving the accuracy of semantic recognition.
  • the server can collect a large number of semantic samples of a certain area in advance (that is, text information of recognized semantic content, and samples corresponding to the text information and language content), and build a semantic recognition model of the area based on machine learning algorithms. Through continuous learning, the semantic recognition model can be continuously improved.
  • when the learned semantic recognition model meets expectations (for example, its recognition accuracy meets business requirements),
  • the region's semantic recognition model can be put into production and used to recognize the semantic content of text from that region.
  • for schemes where the terminal performs semantic recognition itself, the server can push the learned local semantic recognition models to terminals in different regions, so that terminals can identify semantic content more accurately.
  • a semantic recognition algorithm can be used to extract vocabulary with a predetermined part of speech from the text information.
  • the semantic recognition algorithm may involve parsing or lexical analysis, rules such as regular expressions, or algorithms such as the CYK parsing algorithm and the Earley parsing algorithm.
  • parsing can tag each word of the text to be processed with its part of speech according to a preset dictionary, and extract the words matching the preset part of speech.
  • content words: nouns, verbs, adjectives, numerals, measure words, and pronouns; function words: adverbs, prepositions, conjunctions, particles, onomatopoeia, and interjections.
  • the preset part of speech may include, for example, nouns; that is, the noun words in the text are extracted and then further processed.
  • the step A1 may specifically include:
  • the terminal calls the voice recognition SDK, and the server converts the voice information into text information;
  • the step A2 may specifically include:
  • the terminal invokes the semantic recognition SDK, and the server extracts a vocabulary of a predetermined part of speech from the text information;
  • the step A3 may specifically include:
  • offloading recognition reduces the terminal's computing requirements and the recognition time, keeping responses as close to delay-free as possible and improving the user experience.
  • Step 330 Based on the semantic content, determine a service name that the user needs to purchase.
  • the service name that the user actually wants to purchase can be determined according to the semantic content.
  • a service name list can be preset in the terminal system, and the service name list includes all service names that can be sold inside the terminal.
  • the step 330 may specifically include:
  • the matched service name is determined as the service name that the user needs to purchase.
  • the hit service name may be determined as the service name that the user needs to purchase.
  • step 330 can also be completed by the server via the network.
  • the server can perform data interaction with the terminal based on the network.
  • the server can maintain a service name list of all service names available for sale inside the terminal; after recognizing the semantic content, the server can match the service name corresponding to that content.
  • after the terminal transmits the acquired user voice information to the server, the server can recognize the semantic content the voice information is meant to express and match the corresponding service name, then return the service name to the terminal so that the terminal determines the service name the user needs to purchase; the terminal can then call the payment interface to settle payment for the corresponding service.
  • the above-mentioned matching may adopt a fuzzy matching method.
  • the fuzzy matching method can be used to match the service that the user actually wants to purchase as much as possible, avoiding the situation that the user cannot really identify the service that the user really needs to purchase due to insufficient language expression of the user.
  • the matching result may include the following situations:
  • the terminal may directly execute the subsequent step 340; it may also prompt the user and wait for the user to confirm before executing step 340.
  • the confirmation method may include multiple types:
  • the terminal may display the matched service information on the touch screen, such as the service name, service image, and unit price, together with a button for user confirmation. Once the button is triggered, the user has confirmed the purchase, and step 340 can be executed.
  • the terminal may receive voice confirmation information of the user; for example, the user may say "confirm purchase", and after the terminal recognizes that the purchase is confirmed, execute step 340.
  • This embodiment can be applied to a terminal without a touch screen, and of course, it can also be applied to a terminal with a touch screen.
  • the collected voice information can be transcribed by speech recognition, and the resulting text checked for a confirmation word such as "confirm", "OK", or "yes"; only when such a word is present is the user judged to have confirmed by voice, after which payment settlement can proceed.
  • the semantic content is "yoghurt", which may match the yogurt of brand A and the yogurt of brand B.
  • the terminal may prompt the user, for example, by displaying a prompt message on the touch screen, waiting for the user to confirm.
  • the user can be prompted by voice.
  • the terminal can display the multiple matched services on the touch screen, each with a corresponding confirmation button. Once a confirmation button is triggered, the user has re-confirmed the service to purchase, and step 340 can be executed. Notably, the confirmation buttons can be multi-select, so users can purchase multiple services at once.
  • the terminal can receive more detailed voice information input by the user again.
  • the user's first voice input, "I want to buy yogurt", matches both "A yogurt" and "B yogurt"; the terminal can then collect the user's more detailed follow-up, "I want to buy A yogurt", which now matches only "A yogurt", after which the foregoing "one service name matched" embodiment can be repeated.
  • the terminal may prompt the user, for example, by displaying a prompt message on the touch screen, waiting for the user to speak again, and repeating the foregoing steps 310-330.
  • Step 340 Call the payment interface to perform payment settlement of the service corresponding to the service name.
  • the payment settlement method may include multiple mobile payment methods such as face payment, scanning payment, and sonic payment, and may also support traditional payment methods such as cash and bank cards.
  • face payment an image acquisition device (such as a camera) can be activated to collect a user's face image.
  • a payment code may be displayed for scanning by a user, or a scanning device may be activated to actively scan a payment code provided by a user terminal.
  • the method further includes:
  • the step 340 may specifically include:
  • a payment interface is called to perform payment settlement of the service corresponding to the service name.
  • each service name in the service name list may also correspond to an inventory quantity.
  • if the inventory quantity is 0, the service has sold out; the foregoing "no service name matched" embodiment can be executed, or services with similar semantic content can be recommended for the user to choose from.
  • This specification provides a self-service usage scheme: self-service is used through voice, and a single voice interaction completes the use of the service, improving efficiency. The scheme lets users use self-service by voice even when no touch screen is available.
  • the scheme provided in this specification can be applied to the self-service vending machine scenario, where users order goods by voice. Since voice ordering requires no additional operations, users only need to say the products they want to purchase to complete the order, which increases the efficiency of self-service vending machines.
  • the scheme can also be applied to a restaurant ordering scenario, where users order food by voice at a self-service ordering machine. Since voice ordering requires no additional operations, the user only needs to say what they want to eat to complete the order, which greatly improves the ordering efficiency of the machine.
  • this specification also provides an embodiment of a self-service use device.
  • the device embodiments may be implemented by software, or may be implemented by hardware or a combination of software and hardware.
  • taking software implementation as an example, the apparatus in a logical sense is formed by the processor of the device it resides in reading the corresponding computer program instructions from non-volatile storage into memory.
  • in terms of hardware, FIG. 4 is a hardware structure diagram of the device in which the self-service usage apparatus of this specification resides; besides the processor, network interface, memory, and non-volatile storage shown in FIG. 4,
  • the device may also include other hardware according to the actual function of the self-service usage, which is not described again here.
  • FIG. 5 is a block diagram of a self-service using device provided by an embodiment of the specification.
  • the device corresponds to the embodiment shown in FIG. 3.
  • the device includes:
  • the obtaining unit 410 obtains voice information input by a user
  • a recognition unit 420 identifying the semantic content represented by the voice information
  • a determining unit 430 based on the semantic content, determining a service name that the user needs to purchase;
  • the payment unit 440 invokes a payment interface to perform payment settlement of the service corresponding to the service name.
  • the obtaining unit 410 specifically includes:
  • the identification unit 420 specifically includes:
  • a speech recognition subunit which converts the speech information into text information based on the speech recognition technology
  • a semantic recognition subunit based on semantic recognition technology, extracting vocabulary of a predetermined part of speech from the text information
  • a determination subunit determines the extracted vocabulary as semantic content.
  • the speech recognition subunit specifically includes:
  • the semantic recognition subunit specifically includes:
  • the determining subunit specifically includes:
  • the preset part of speech includes a noun.
  • the determining unit 430 specifically includes:
  • the matched service name is determined as the service name that the user needs to purchase.
  • the device further includes:
  • a judging unit judging whether an inventory quantity of the service name is greater than 0;
  • the payment unit 440 specifically includes:
  • a payment interface is called to perform payment settlement of the service corresponding to the service name.
  • the self-service includes a self-service vending machine service
  • the service name includes a product name.
  • the system, device, module, or unit described in the foregoing embodiments may be specifically implemented by a computer chip or entity, or a product with a certain function.
  • a typical implementation device is a computer, and the specific form of the computer may be a personal computer, laptop computer, cellular phone, camera phone, smartphone, personal digital assistant, media player, navigation device, email device, game console, tablet computer, wearable device, or a combination of any of these devices.
  • the relevant part may refer to the description of the method embodiment.
  • the device embodiments described above are only illustrative: the units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units, i.e. they may be located in one place or distributed across multiple network elements. Some or all of the modules can be selected according to actual needs to achieve the purpose of the solution in this specification. Those of ordinary skill in the art can understand and implement this without creative effort.
  • FIG. 5 describes the internal functional modules and structure of the self-service device, and its substantial execution subject may be an electronic device, including:
  • Memory for storing processor-executable instructions
  • the processor is configured to:
  • the payment interface is called to perform payment settlement of the service corresponding to the service name.
  • the acquiring voice information input by a user specifically includes:
  • the identifying the semantic content represented by the voice information specifically includes:
  • the extracted words are determined as semantic content.
  • the converting the voice information into text information based on the voice recognition technology specifically includes:
  • the extracting a vocabulary of a predetermined part of speech from the text information based on the semantic recognition technology specifically includes:
  • the determining the extracted vocabulary as semantic content specifically includes:
  • the preset part of speech includes a noun.
  • determining the service name that the user needs to purchase based on the semantic content specifically includes:
  • the matched service name is determined as the service name that the user needs to purchase.
  • optionally, the method further includes:
  • the calling the payment interface to perform payment settlement of the service corresponding to the service name specifically includes:
  • a payment interface is called to perform payment settlement of the service corresponding to the service name.
  • the self-service includes a self-service vending machine service
  • the service name includes a product name.
  • the processor may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), etc.
  • a general-purpose processor may be a microprocessor or any conventional processor; the foregoing memory may be a read-only memory (ROM), random access memory (RAM), flash memory, hard disk, or solid-state drive.
  • the steps of the method disclosed in the embodiments of the present invention may be implemented directly by a hardware processor, or by a combination of hardware and software modules in the processor.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Acoustics & Sound (AREA)
  • User Interface Of Digital Computer (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

A self-service usage method and apparatus, and an electronic device. The method comprises: acquiring voice information input by a user (310); recognizing the semantic content represented by the voice information (320); determining, based on the semantic content, the name of a service the user needs to purchase (330); and invoking a payment interface to perform payment settlement for the service corresponding to the service name (340).

Description

Self-service usage method and apparatus, and electronic device

Technical field

The embodiments of this specification relate to the field of Internet technologies, and in particular to a self-service usage method and apparatus, and an electronic device.
Background

With the growing popularity of mobile payment, self-service scenarios that use mobile payment are becoming more and more common, for example self-service vending machines and self-service food-ordering systems. Mobile payment lets people pay quickly at a self-service terminal to obtain the corresponding service, which has greatly increased the uptake of self-service.

In the related art, because a user has to select the desired service item on the self-service terminal, the terminal must provide a selection button for each service item. However, the screen of a self-service terminal is of limited size; when there are many service items, the user has to page through multiple screens to find the desired item, which hurts the efficiency of self-service and makes for a poor user experience.

A more efficient way of using self-service is therefore needed.
Summary of the invention

The embodiments of this specification provide a self-service usage method and apparatus and an electronic device, as well as an event-tracking ("buried point") analysis method and apparatus and electronic device:

According to a first aspect of the embodiments of this specification, a self-service usage method is provided, the method comprising:

acquiring voice information input by a user;

recognizing the semantic content represented by the voice information;

determining, based on the semantic content, the name of a service that the user needs to purchase;

invoking a payment interface to perform payment settlement for the service corresponding to the service name.

According to a second aspect of the embodiments of this specification, a self-service usage apparatus is provided, the apparatus comprising:

an obtaining unit that obtains voice information input by a user;

a recognition unit that recognizes the semantic content represented by the voice information;

a determining unit that determines, based on the semantic content, the name of a service the user needs to purchase;

a payment unit that invokes a payment interface to perform payment settlement for the service corresponding to the service name.

According to a fifth aspect of the embodiments of this specification, an electronic device is provided, comprising:

a processor;

a memory for storing processor-executable instructions;

wherein the processor is configured to perform any one of the above self-service usage methods.

This specification provides a self-service usage scheme: self-service is used through voice, and a single voice interaction completes the use of the service, improving efficiency. With this scheme, users can use self-service by voice even when no touch screen is available.
Brief description of the drawings

FIG. 1 is a schematic diagram of a self-service vending machine provided by an embodiment of this specification;

FIG. 2 is a schematic diagram of a self-service vending machine provided by an embodiment of this specification;

FIG. 3 is a flowchart of a self-service usage method provided by an embodiment of this specification;

FIG. 4 is a hardware structure diagram of a self-service usage apparatus provided by an embodiment of this specification;

FIG. 5 is a schematic block diagram of the modules of a self-service usage apparatus provided by an embodiment of this specification.
Detailed description

Exemplary embodiments will be described in detail here, examples of which are shown in the accompanying drawings. Where the following description refers to the drawings, the same numeral in different drawings denotes the same or a similar element unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with this specification; rather, they are merely examples of apparatuses and methods consistent with some aspects of this specification as detailed in the appended claims.

The terminology used in this specification is for the purpose of describing particular embodiments only and is not intended to limit this specification. The singular forms "a", "the", and "said" used in this specification and the appended claims are also intended to include the plural forms, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items.

It should be understood that although the terms first, second, third, etc. may be used in this specification to describe various information, the information should not be limited by these terms. These terms are only used to distinguish information of the same type from one another. For example, without departing from the scope of this specification, first information may also be referred to as second information and, similarly, second information may be referred to as first information. Depending on the context, the word "if" as used herein may be interpreted as "at the time of", "when", or "in response to determining".
As mentioned above, because a user has to select the desired service item on the self-service terminal, a selection button must be provided for each service item.

Taking the self-service vending machine of FIG. 1 as an example, the vending machine 11 includes a window 12 for displaying items and an item outlet 13. Items 121 for sale are placed in the window 12, and a physical button 122 is typically provided next to each item 121. The user presses the button corresponding to the item to be purchased to send the vending machine 11 a purchase request for that item. Once the purchase succeeds, the vending machine 11 dispenses the item from the outlet 13 for the user to take. In the conventional vending machine of FIG. 1, the window 12 has limited area, so not many items can be displayed and the number of items the user can choose from is limited.

In the touch-screen vending machine of FIG. 2, the vending machine 21 includes a touch screen 22 for displaying items and an item outlet 23. The touch screen 22 can display the items for sale dynamically; after the user taps a target item, a second confirmation can be requested, and the item the user wants to purchase is controlled through a virtual button. However, the touch screen is still of limited size: with many items, the user has to page through multiple screens to find the desired one, which hurts shopping efficiency and the user experience.

To solve the above problems, this specification provides a self-service usage method, introduced below with reference to the example shown in FIG. 3. The method can be applied to a self-service terminal (hereinafter, "terminal") and may include the following steps:
Step 310: acquire voice information input by the user.

The terminal may be provided with a voice collection device such as a microphone, through which the user's voice information can be collected.

Generally, if the terminal has a display screen, the screen can show a welcome interface by default, and the user can wake the terminal by touching the welcome interface and then speak.

In one embodiment, step 310 may specifically include:

collecting the subsequent voice information when a preset wake word is detected.

The preset wake word can be set manually. When shopping by voice, the user starts by saying the preset wake word.

For example, if the preset wake word is "I want to shop", the user first says "I want to shop" to wake the terminal. The terminal can respond, for example by playing "OK, please say what you would like to buy" or by showing the corresponding text on the touch screen. The terminal can then call the voice collection device to capture what the user says next.

In another embodiment, step 310 may specifically include:

collecting the user's voice information when a face image is detected.

In this embodiment, the terminal may be provided with an image capture device such as a camera, which continuously monitors whether a user appears in front of the terminal. Specifically, when the camera captures a face image, a user is ready to use the terminal; the terminal wakes up, and the voice collection device can then be started to capture the user's speech.
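As a rough illustration, the two wake-up triggers above (wake word or face detection) might be unified as follows. The function and event names here are invented for this sketch and are not from the patent:

```python
def should_wake(event_type: str, payload: str, wake_word: str = "I want to shop") -> bool:
    """Decide whether to wake the kiosk and start voice capture.

    Two triggers are described above: a preset wake word heard in the
    transcribed audio, or a face detected in the camera feed.
    """
    if event_type == "speech":
        # Wake-word embodiment: wake when the preset phrase is heard.
        return wake_word.lower() in payload.lower()
    if event_type == "camera":
        # Face-detection embodiment: wake as soon as a face appears.
        return payload == "face_detected"
    return False
```

In a real terminal the `payload` would come from a speech recognizer or a face detector; this sketch only shows the decision logic.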
Step 320: recognize the semantic content represented by the voice information.

After collecting the voice information input by the user, the terminal can further recognize the semantic content the voice information is meant to express.

In one embodiment, recognition can be performed against a recognition library built into the terminal, or carried out by a server over the network.

Here the server side may be the server, server cluster, or cloud platform built on a server cluster that corresponds to the self-service; for example, the shopping server, server cluster, or cluster-based cloud platform corresponding to a self-service vending machine.

In one implementation, step 320 may specifically include:

A1: converting the voice information into text information based on speech recognition technology;

A2: extracting, based on semantic recognition technology, words of a preset part of speech from the text information;

A3: determining the extracted words as the semantic content.

Speech recognition technology here refers to using a speech recognition algorithm to convert voice information into text.

In practice, different regions have different accents and dialects, which can make speech recognition inaccurate.

For this reason, in one embodiment the server can continuously train a speech recognition model using machine learning, improving recognition accuracy, for example recognizing dialects more reliably. Specifically, the server can learn the local speech of each region and thus more accurately recognize the text that speech from a given region actually corresponds to. For example, the server can collect a large number of speech samples from a region in advance (speech whose text has already been recognized, i.e. sample pairs of speech and text) and build a speech recognition model for that region with a machine learning algorithm; continuous learning keeps improving the model. When the learned model meets expectations (for example, its accuracy meets business requirements), the region's speech recognition model can be put into production and used to recognize the speech uploaded by terminals in that region. In some implementations where the terminal performs speech recognition itself, the server can push the learned local speech recognition models down to the terminals of each region, so that terminals can recognize speech more accurately.

Semantic recognition technology turns the text recognized from speech into instructions a machine can understand.

In practice, users in different regions express themselves differently, so the speech used for an item with the same meaning can differ from region to region, and the recognized text differs accordingly. For example, wontons are called "hundun" in some regions and "yuntun" in others. This embodiment can therefore process the original text recognized from the speech (that is, the text directly corresponding to the speech) and supplement what is missing, forming more complete and accurate text.

Similarly, in one embodiment the server can continuously train a semantic recognition model using machine learning, improving semantic recognition accuracy. Specifically, the server can collect a large number of semantic samples from a region in advance (text whose semantic content has already been recognized, i.e. sample pairs of text and semantic content) and build a semantic recognition model for the region with a machine learning algorithm, refining it through continuous learning. When the learned model meets expectations (for example, its accuracy meets business requirements), the region's semantic recognition model can be put into production and used to recognize the semantic content of text from that region. In some implementations where the terminal performs semantic recognition itself, the server can push the learned local semantic recognition models down to the terminals of each region, so that terminals can recognize semantic content more accurately.

In this embodiment, based on semantic recognition technology, a semantic recognition algorithm can be used to extract words of a preset part of speech from the text.

Specifically, the semantic recognition algorithm may involve syntactic analysis (parsing) or lexical analysis, rules such as regular expressions, or algorithms such as the CYK parsing algorithm and the Earley parsing algorithm.

Taking parsing as an example: the text to be processed is tagged with each word's part of speech according to a preset dictionary, and the words matching the preset part of speech are extracted.

Generally, modern-language vocabulary can be divided into two major categories comprising twelve subcategories:

Content words: nouns, verbs, adjectives, numerals, measure words, and pronouns.

Function words: adverbs, prepositions, conjunctions, particles, onomatopoeia, and interjections.

In this embodiment, the preset part of speech may include, for example, nouns: the noun words in the text are extracted and then further processed.

For example, suppose the user's speech is converted into the text "I want to buy a sandwich". Part-of-speech analysis tags "I" as a pronoun, "want" and "buy" as verbs, and "sandwich" as a noun. Since this embodiment extracts nouns, "sandwich" is extracted from the text.
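The dictionary-based noun extraction in the example above can be sketched as follows. The tiny part-of-speech dictionary is invented for illustration and stands in for a real parser or tagger, which the text leaves unspecified:

```python
# Toy part-of-speech dictionary standing in for a real preset dictionary.
POS_DICT = {
    "i": "pronoun",
    "want": "verb",
    "to": "particle",
    "buy": "verb",
    "a": "article",
    "sandwich": "noun",
}

def extract_nouns(text: str) -> list:
    """Mirror steps A2/A3: tag each word via the dictionary and keep
    only the noun-tagged words as the semantic content."""
    words = text.lower().replace(",", " ").split()
    return [w for w in words if POS_DICT.get(w) == "noun"]
```

Running `extract_nouns("I want to buy a sandwich")` keeps only `"sandwich"`, matching the worked example in the text.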
The following describes embodiments in which recognition is performed by the server over the network.

In one embodiment, step A1 may specifically include:

the terminal invoking a speech recognition SDK so that the server converts the voice information into text;

the terminal receiving the text returned by the server.

In one embodiment, step A2 may specifically include:

the terminal invoking a semantic recognition SDK so that the server extracts words of the preset part of speech from the text;

correspondingly, step A3 may specifically include:

determining the words returned by the server as the semantic content.

The server performs speech and semantic recognition in the same way as in the previous embodiment, which is not repeated here.

Because a terminal's storage and compute resources are limited, handing speech and semantic recognition over to the server reduces the terminal's computing requirements and the recognition time, keeping responses as close to delay-free as possible and improving the user experience.
Step 330: determining, based on the semantic content, the service name the user needs to purchase.
After the semantic content expressed by the user's voice information is recognized, the service name the user actually wants to purchase can be determined according to that semantic content.
In general, a service name list may be preset inside the terminal system; the list contains all service names that this terminal can sell.
In an embodiment, step 330 may specifically include:
matching, from the service name list according to the semantic content, the service name corresponding to the semantic content;
determining the matched service name as the service name the user needs to purchase.
In this embodiment, if the semantic content hits the service name list, the hit service name can be determined as the service name the user needs to purchase.
In an embodiment, step 330 can likewise be completed by the server over a network. Similarly, the server can exchange data with the terminal over the network. The server can maintain the service name list of all service names that the terminal can sell; after recognizing the semantic content, the server can match the service name corresponding to that semantic content.
That is, in an embodiment, after the terminal transmits the collected user voice information to the server, the server can recognize the semantic content the voice information expresses and match the corresponding service name, and then return the service name to the terminal so that the terminal determines the service name the user needs to purchase; the terminal can then invoke the payment interface to perform payment settlement for the service corresponding to that service name.
In an embodiment, the above matching may adopt fuzzy matching.
For example, if the recognized semantic content is "火腿" ("ham"), fuzzy matching can match both "金华火腿" ("Jinhua ham") and "火腿肠" ("ham sausage"). If the semantic content is "可乐" ("cola"), both "可口可乐" (Coca-Cola) and "百事可乐" (Pepsi) can be matched.
Fuzzy matching makes it possible to match the service the user actually wants to purchase as far as possible, avoiding cases in which the service the user truly needs cannot be identified because the user's phrasing is imprecise.
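One minimal way to realize the fuzzy matching above is substring containment against the service name list. This is only a sketch; the specification does not fix a particular fuzzy-matching rule, and the list contents and function name are illustrative.

```python
SERVICE_LIST = ["金华火腿", "火腿肠", "可口可乐", "百事可乐", "三明治"]

def fuzzy_match(semantic_content, services=SERVICE_LIST):
    """Return every listed service whose name contains the recognized
    semantic content (substring containment as the fuzzy rule)."""
    return [s for s in services if semantic_content in s]

fuzzy_match("火腿")  # matches both "金华火腿" and "火腿肠"
fuzzy_match("可乐")  # matches both cola brands
```

An empty result corresponds to the "no service name is matched" case handled below, while multiple results feed the multi-candidate confirmation flow.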
It should be noted that one or more service names may be recognized as above. In this application, "multiple" may mean two or more. In general, the matching result may fall into the following cases:
Case 1: one service name is matched.
When one service name is matched, the terminal may directly execute the subsequent step 340, or may prompt the user and execute step 340 only after the user confirms.
The confirmation may take multiple forms:
In one implementation:
the terminal may display the matched service information on the touch screen, such as the service name, service image, and unit price, together with a button for the user's confirmation. Once the button is triggered, the user has confirmed the purchase of the service, and step 340 can be executed.
In another implementation:
the terminal may receive the user's voice confirmation; for example, the user may say "确认购买" ("confirm purchase"), and upon recognizing the confirmation the terminal executes step 340. This embodiment suits terminals without a touch screen, and can of course also be applied to terminals with one. It should be noted that while the terminal waits for the user's confirmation, the collected voice information can be converted to text through speech recognition, and the text can then be checked for words expressing confirmation, such as "确认" ("confirm"), "好的" ("OK"), or "是的" ("yes"); only when such a confirmation word is present is the user judged to have confirmed by voice, and only then can the subsequent payment settlement be executed.
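The voice-confirmation check just described reduces to scanning the recognized text for a confirmation vocabulary. A minimal sketch, assuming the three example words from the text; a deployed system would use a richer vocabulary and the semantic model discussed earlier.

```python
CONFIRM_WORDS = ("确认", "好的", "是的")  # example confirmation vocabulary

def is_confirmation(recognized_text):
    """True if the recognized text contains a confirmation word,
    i.e. payment settlement (step 340) may proceed."""
    return any(word in recognized_text for word in CONFIRM_WORDS)

is_confirmation("确认购买")  # True: proceed to payment
is_confirmation("不要了")    # False: keep waiting for confirmation
```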
Case 2: multiple service names are matched.
In practice, similar services of different brands exist, so multiple services may be matched. For example, if the semantic content is "酸奶" ("yogurt"), both brand A's yogurt and brand B's yogurt may be matched.
When multiple service names are matched, the terminal may prompt the user, for example by displaying prompt information on the touch screen, and wait for the user's confirmation. Alternatively, the user may be prompted by voice.
Similarly, the confirmation here may take multiple forms.
In one implementation:
the terminal may display the multiple matched service entries on the touch screen, each with its own confirmation button. Once a confirmation button is triggered, the user has re-confirmed the service to purchase, and step 340 can be executed. Notably, the confirmation buttons can support multi-selection; that is, the user can purchase multiple services at once.
In another implementation:
the terminal may receive more detailed voice information input by the user again.
For example, the user's first voice input is "我要购买酸奶" ("I want to buy yogurt"), which matches both "A酸奶" and "B酸奶"; the terminal can then collect the user's more detailed voice input "我要购买A酸奶" ("I want to buy brand A yogurt"), which now matches only "A酸奶", after which the foregoing "one service name is matched" embodiment can be repeated.
Case 3: no service name is matched.
When no service name is matched, the terminal may prompt the user, for example by displaying prompt information on the touch screen, and wait for the user to speak again, repeating the foregoing steps 310 to 330.
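The three cases can be summarized as a dispatch on the number of matched service names. The action labels returned here are invented for the sketch; the specification only describes the behaviors, not any concrete interface.

```python
def dispatch(matches):
    """Decide the next action from the number of matched service names,
    mirroring the three cases described above."""
    if len(matches) == 0:
        return ("reprompt", None)              # ask the user to speak again
    if len(matches) == 1:
        return ("confirm_single", matches[0])  # one candidate: confirm, then pay
    return ("choose_among", matches)           # several candidates: let user pick

dispatch([])                     # no match: repeat steps 310-330
dispatch(["A酸奶"])              # single match: confirmation then step 340
dispatch(["A酸奶", "B酸奶"])     # multiple matches: multi-candidate selection
```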
Step 340: invoking the payment interface to perform payment settlement for the service corresponding to the service name.
In this embodiment, the payment settlement method may include mobile payment methods such as face payment, scan payment, and sound-wave payment, and may also support traditional methods such as cash and bank cards. After the payment interface is invoked, the flow can jump to the corresponding payment process. For example, for face payment, the image collection device (e.g., a camera) can be started to collect the user's face image; for scan payment, a payment code can be displayed for the user to scan, or a scanning device can be started to actively scan the payment code presented on the user's device.
In an embodiment, the method further includes:
determining whether the inventory quantity of the service name is greater than 0.
Step 340 may then specifically include:
invoking the payment interface to perform payment settlement for the service corresponding to the service name when the inventory quantity of the service name is greater than 0.
It should be noted that each service name in the service name list may also be associated with an inventory quantity. An inventory quantity of 0 means the service is sold out; the foregoing "no service name is matched" embodiment can then be executed, or services with similar semantic content can be recommended for the user to choose from.
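The inventory gate in front of the payment interface can be sketched as below. The stock table, the function names, and the injected `call_payment` callback are illustrative assumptions; the actual payment interface is whichever one the terminal integrates.

```python
INVENTORY = {"三明治": 3, "可口可乐": 0}  # example stock table

def settle(service_name, call_payment, inventory=INVENTORY):
    """Invoke the payment interface only when stock is above zero;
    a sold-out item falls back to the 'no match' / recommendation handling."""
    if inventory.get(service_name, 0) > 0:
        call_payment(service_name)  # jump into the chosen payment flow
        return "paid"
    return "sold_out"

settle("三明治", call_payment=lambda name: None)    # stock 3: payment proceeds
settle("可口可乐", call_payment=lambda name: None)  # stock 0: sold out
```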
This specification provides a self-service usage solution in which self-service is used through voice: a single voice interaction completes the use of the self-service, improving its efficiency. With this solution, users can use self-service by voice even when no touch screen is available.
The solution provided in this specification can be applied in self-service vending machine scenarios, where users order goods by voice. Because voice ordering requires no extra operation, the user only needs to say the goods to be purchased, which greatly improves the selling efficiency of the vending machine.
Similarly, the solution can also be applied in restaurant ordering scenarios. A user orders by voice at a self-service ordering machine; since no extra operation is needed and the user only needs to say the dishes to eat, the ordering efficiency of the machine is greatly improved.
Corresponding to the foregoing embodiments of the self-service usage method, this specification also provides embodiments of a self-service usage apparatus. The apparatus embodiments may be implemented by software, by hardware, or by a combination of both. Taking software implementation as an example, the apparatus in a logical sense is formed by the processor of the device in which it is located reading the corresponding computer service program instructions from non-volatile memory into memory for execution. At the hardware level, Fig. 4 shows a hardware structure diagram of the device in which the self-service usage apparatus of this specification is located; besides the processor, network interface, memory, and non-volatile memory shown in Fig. 4, the device in which the apparatus is located may also include other hardware according to the actual functions of the self-service, which is not described further here.
Referring to Fig. 5, a module diagram of a self-service usage apparatus provided by an embodiment of this specification; the apparatus corresponds to the embodiment shown in Fig. 3 and includes:
an acquisition unit 410, which acquires voice information input by a user;
a recognition unit 420, which recognizes the semantic content expressed by the voice information;
a determination unit 430, which determines, based on the semantic content, the service name the user needs to purchase;
a payment unit 440, which invokes a payment interface to perform payment settlement for the service corresponding to the service name.
Optionally, the acquisition unit 410 specifically:
collects subsequent voice information when a preset wake word is detected.
Optionally, the recognition unit 420 specifically includes:
a speech recognition subunit, which converts the voice information into text information based on speech recognition technology;
a semantic recognition subunit, which extracts words of a preset part of speech from the text information based on semantic recognition technology;
a determination subunit, which determines the extracted words as the semantic content.
Optionally, the speech recognition subunit specifically:
invokes a speech recognition SDK, such that a server converts the voice information into text information;
receives the text information returned by the server.
Optionally, the semantic recognition subunit specifically:
invokes a semantic recognition SDK, such that the server extracts words of the preset part of speech from the text information;
and the determination subunit specifically:
determines the words returned by the server as the semantic content.
Optionally, the preset part of speech includes nouns.
Optionally, the determination unit 430 specifically:
matches, from a service name list according to the semantic content, the service name corresponding to the semantic content;
determines the matched service name as the service name the user needs to purchase.
Optionally, the apparatus further includes:
a judgment unit, which determines whether the inventory quantity of the service name is greater than 0;
and the payment unit 440 specifically:
invokes the payment interface to perform payment settlement for the service corresponding to the service name when the inventory quantity of the service name is greater than 0.
Optionally, the self-service includes self-service vending machine service;
and the service name includes a goods name.
The systems, apparatuses, modules, or units described in the above embodiments may be implemented by computer chips or entities, or by products having certain functions. A typical implementation device is a computer, whose specific form may be a personal computer, laptop computer, cellular phone, camera phone, smart phone, personal digital assistant, media player, navigation device, e-mail transceiver, game console, tablet computer, wearable device, or any combination of several of these devices.
For the implementation of the functions and roles of each unit in the above apparatus, see the implementation of the corresponding steps in the above method, which is not repeated here.
Since the apparatus embodiments basically correspond to the method embodiments, relevant parts may refer to the description of the method embodiments. The apparatus embodiments described above are merely illustrative: the units described as separate components may or may not be physically separate, and components displayed as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this specification. A person of ordinary skill in the art can understand and implement this without creative effort.
Fig. 5 above describes the internal functional modules and structure of the self-service usage apparatus; its actual executing entity may be an electronic device, including:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to:
acquire voice information input by a user;
recognize the semantic content expressed by the voice information;
determine, based on the semantic content, the service name the user needs to purchase;
invoke a payment interface to perform payment settlement for the service corresponding to the service name.
Optionally, acquiring the voice information input by the user specifically includes:
collecting subsequent voice information when a preset wake word is detected.
Optionally, recognizing the semantic content expressed by the voice information specifically includes:
converting the voice information into text information based on speech recognition technology;
extracting words of a preset part of speech from the text information based on semantic recognition technology;
determining the extracted words as the semantic content.
Optionally, converting the voice information into text information based on speech recognition technology specifically includes:
invoking a speech recognition SDK, such that a server converts the voice information into text information;
receiving the text information returned by the server.
Optionally, extracting words of a preset part of speech from the text information based on semantic recognition technology specifically includes:
invoking a semantic recognition SDK, such that the server extracts words of the preset part of speech from the text information;
and determining the extracted words as the semantic content specifically includes:
determining the words returned by the server as the semantic content.
Optionally, the preset part of speech includes nouns.
Optionally, determining, based on the semantic content, the service name the user needs to purchase specifically includes:
matching, from a service name list according to the semantic content, the service name corresponding to the semantic content;
determining the matched service name as the service name the user needs to purchase.
Optionally, the method further includes:
determining whether the inventory quantity of the service name is greater than 0;
and invoking the payment interface to perform payment settlement for the service corresponding to the service name specifically includes:
invoking the payment interface to perform payment settlement for the service corresponding to the service name when the inventory quantity of the service name is greater than 0.
Optionally, the self-service includes self-service vending machine service;
and the service name includes a goods name.
In the above electronic device embodiments, it should be understood that the processor may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), or the like. A general-purpose processor may be a microprocessor or any conventional processor, and the aforementioned memory may be read-only memory (ROM), random access memory (RAM), flash memory, a hard disk, or a solid-state drive. The steps of the methods disclosed in the embodiments of the present invention may be embodied directly as being executed by a hardware processor, or as being executed by a combination of hardware and software modules within the processor.
The embodiments in this specification are described in a progressive manner; for identical or similar parts of the embodiments, refer to one another, and each embodiment focuses on its differences from the others. In particular, the electronic device embodiment is basically similar to the method embodiment, so its description is relatively brief; for relevant parts, refer to the description of the method embodiment.
Other embodiments of this specification will readily occur to those skilled in the art after considering the specification and practicing the invention disclosed here. This specification is intended to cover any variations, uses, or adaptations that follow its general principles and include common knowledge or customary technical means in the art not disclosed herein. The specification and embodiments are to be regarded as exemplary only; the true scope and spirit of this specification are indicated by the following claims.
It should be understood that this specification is not limited to the precise structures described above and shown in the drawings, and that various modifications and changes may be made without departing from its scope. The scope of this specification is limited only by the appended claims.

Claims (11)

  1. A method for using a self-service, the method comprising:
    acquiring voice information input by a user;
    recognizing semantic content expressed by the voice information;
    determining, based on the semantic content, a service name that the user needs to purchase;
    invoking a payment interface to perform payment settlement for a service corresponding to the service name.
  2. The method according to claim 1, wherein acquiring the voice information input by the user specifically comprises:
    collecting subsequent voice information when a preset wake word is detected.
  3. The method according to claim 1, wherein recognizing the semantic content expressed by the voice information specifically comprises:
    converting the voice information into text information based on speech recognition technology;
    extracting words of a preset part of speech from the text information based on semantic recognition technology;
    determining the extracted words as the semantic content.
  4. The method according to claim 3, wherein converting the voice information into text information based on speech recognition technology specifically comprises:
    invoking a speech recognition SDK, such that a server converts the voice information into text information;
    receiving the text information returned by the server.
  5. The method according to claim 3, wherein extracting words of a preset part of speech from the text information based on semantic recognition technology specifically comprises:
    invoking a semantic recognition SDK, such that the server extracts words of the preset part of speech from the text information;
    and determining the extracted words as the semantic content specifically comprises:
    determining the words returned by the server as the semantic content.
  6. The method according to claim 3, wherein the preset part of speech comprises nouns.
  7. The method according to claim 1, wherein determining, based on the semantic content, the service name that the user needs to purchase specifically comprises:
    matching, from a service name list according to the semantic content, a service name corresponding to the semantic content;
    determining the matched service name as the service name that the user needs to purchase.
  8. The method according to claim 1, further comprising:
    determining whether an inventory quantity of the service name is greater than 0;
    wherein invoking the payment interface to perform payment settlement for the service corresponding to the service name specifically comprises:
    invoking the payment interface to perform payment settlement for the service corresponding to the service name when the inventory quantity of the service name is greater than 0.
  9. The method according to claim 1, wherein the self-service comprises a self-service vending machine service;
    and the service name comprises a goods name.
  10. An apparatus for using a self-service, the apparatus comprising:
    an acquisition unit that acquires voice information input by a user;
    a recognition unit that recognizes semantic content expressed by the voice information;
    a determination unit that determines, based on the semantic content, a service name that the user needs to purchase;
    a payment unit that invokes a payment interface to perform payment settlement for a service corresponding to the service name.
  11. An electronic device, comprising:
    a processor;
    a memory for storing processor-executable instructions;
    wherein the processor is configured to perform the method according to any one of claims 1 to 9.
PCT/CN2019/098986 2018-09-10 2019-08-02 Self-service usage method and apparatus, and electronic device WO2020052370A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201811049376.6A 2018-09-10 2018-09-10 Self-service usage method and apparatus, and electronic device
CN201811049376.6 2018-09-10

Publications (1)

Publication Number Publication Date
WO2020052370A1 (zh) 2020-03-19

Family

ID=65305069

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/098986 WO2020052370A1 (zh) 2018-09-10 2019-08-02 Self-service usage method and apparatus, and electronic device

Country Status (3)

Country Link
CN (1) CN109343817A (zh)
TW (1) TW202011324A (zh)
WO (1) WO2020052370A1 (zh)


Also Published As

Publication number Publication date
CN109343817A (zh) 2019-02-15
TW202011324A (zh) 2020-03-16


Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 19860181; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: PCT application non-entry in European phase (Ref document number: 19860181; Country of ref document: EP; Kind code of ref document: A1)