WO2021056127A1

WO2021056127A1 - Systems and methods for analyzing sentiment

Info

Publication number: WO2021056127A1
Application number: PCT/CN2019/107163
Authority: WO
Inventors: Linhao HUANG
Original assignee: Beijing Didi Infinity Technology And Development Co., Ltd.
Priority date: 2019-09-23
Filing date: 2019-09-23
Publication date: 2021-04-01

Abstract

Systems and methods for analyzing sentiment may be provided. The method may include obtaining, via a user terminal, current interaction information between a user and a service system. The method may include determining a sentiment estimation using a predictive model that processes the current interaction information. The method may further include determining one or more key word strings indicating the sentiment estimation based on the current interaction information and a dictionary regarding a service scenario.

Description

SYSTEMS AND METHODS FOR ANALYZING SENTIMENT

TECHNICAL FIELD

The present disclosure generally relates to sentiment analysis, and more particularly, relates to systems and methods for analyzing sentiment in a specific service scenario.

BACKGROUND

Emotion is an intuitive response to a behavior or thing. The emotion may be positive, negative, or neutral. In some cases, the emotion may affect people’s behavior. For example, if a person has a negative emotion, his/her analytical ability and self-control may weaken, which may result in impulsive or reckless behavior. In an online to offline service, for example, a transportation service, the impulsive or reckless behavior of a driver may put safety at risk and/or bring a poor service experience to a service requester. Therefore, it is desirable to develop systems and methods for the online to offline service to analyze a service provider’s sentiment, so as to improve service experience and reduce safety risk.

SUMMARY

According to an aspect of the present disclosure, a system for analyzing sentiment on an online to offline service may be provided. The system may include at least one storage device including a set of instructions, and at least one processor in communication with the at least one storage device. When executing the set of instructions, the at least one processor may perform a method including one or more of the following operations. The at least one processor may obtain, via a user terminal, current interaction information between a user and a service system. The at least one processor may determine a sentiment estimation using a predictive model that processes the current interaction information. The predictive model may be generated, based on labeled historical interaction information and a dictionary regarding a service scenario provided by the service system, by training an initial model. The at least one processor may determine one or more key word strings indicating the sentiment estimation based on the current interaction information and the dictionary regarding the service scenario.

In some embodiments, the sentiment estimation may include at least one of: a label indicating a positive mood, a label indicating a negative mood, or a label indicating a neutral mood.

In some embodiments, the at least one processor may send a reminder signal for easing the negative mood to the user terminal if the sentiment estimation is the negative mood.

In some embodiments, the at least one processor may determine a corresponding service strategy for the user in response to the determined sentiment estimation.
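Merely for illustration, the mapping from a sentiment estimation to follow-up actions described above may be sketched as follows. The names `Sentiment`, `choose_actions`, and the action strings are hypothetical and are not part of the disclosure; the sketch assumes that a reminder signal is sent only for a negative mood and that a service strategy is selected for every label.

```python
from enum import Enum
from typing import List


class Sentiment(Enum):
    """The three labels a sentiment estimation may take."""
    POSITIVE = "positive"
    NEGATIVE = "negative"
    NEUTRAL = "neutral"


def choose_actions(label: Sentiment) -> List[str]:
    """Map a sentiment label to hypothetical follow-up actions."""
    actions = []
    if label is Sentiment.NEGATIVE:
        # A reminder signal for easing the negative mood goes to the user terminal.
        actions.append("send_reminder_signal")
    # Assumption: some service strategy is selected for every label.
    actions.append("apply_service_strategy:" + label.value)
    return actions
```

For example, `choose_actions(Sentiment.NEGATIVE)` yields the reminder action followed by the strategy action, while the other labels yield only the strategy action.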

In some embodiments, the at least one processor may construct the dictionary regarding the service scenario. The at least one processor may obtain a plurality of sets of interaction information regarding the service scenario. Each set of interaction information may correspond to an individual user, and the interaction information may be converted to a text including one or more words. The at least one processor may determine one or more candidate word strings for each set of interaction information. The at least one processor may determine common candidate word strings based on the one or more candidate word strings from the plurality of sets of interaction information. The at least one processor may construct the dictionary by measuring the common candidate word strings based on a measurement indicator.
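Merely by way of example, the dictionary construction described above may be sketched as follows. The summary does not fix the measurement indicator, so this sketch assumes a simple one: the share of users whose candidate word strings contain a given string. The function name `build_dictionary` and the parameters `min_users` and `min_share` are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Set


def build_dictionary(candidates_per_user: List[Set[str]],
                     min_users: int = 2,
                     min_share: float = 0.5) -> Dict[str, float]:
    """Keep candidate word strings shared across users and score them.

    candidates_per_user holds one set of candidate word strings per user.
    """
    # Count in how many users' sets each candidate word string occurs.
    user_count = Counter()
    for candidates in candidates_per_user:
        user_count.update(candidates)

    total_users = len(candidates_per_user)
    dictionary = {}
    for word_string, count in user_count.items():
        if count < min_users:
            continue  # not a common candidate word string
        # Assumed measurement indicator: share of users using the string.
        share = count / total_users
        if share >= min_share:
            dictionary[word_string] = share
    return dictionary
```

With three users whose candidate sets are `{"a", "b", "ab"}`, `{"a", "b", "d"}`, and `{"b", "c"}`, only `"a"` and `"b"` are common to at least two users; raising `min_share` to 0.9 keeps only `"b"`.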

In some embodiments, the at least one processor may perform a word segmentation operation to determine the one or more candidate word strings for each set of interaction information.

In some embodiments, the at least one processor may set a segmentation step for the text corresponding to the interaction information, and determine the one or more candidate word strings by segmenting the text in the segmentation step.
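Merely for illustration, segmenting a text in a segmentation step may be sketched as follows. The sketch assumes that the segmentation step is the length, in characters, of each candidate word string, and that candidates are taken from every start position; both readings are assumptions, and the function name `segment_with_step` is hypothetical.

```python
from typing import List


def segment_with_step(text: str, step: int) -> List[str]:
    """Return every contiguous substring of `text` that is `step` characters long."""
    if step < 1:
        raise ValueError("step must be a positive integer")
    # Slide a window of width `step` over the text, one character at a time.
    return [text[i:i + step] for i in range(len(text) - step + 1)]
```

For a four-character text and a step of 2, the sketch produces three overlapping two-character candidates; a text shorter than the step produces no candidates.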

In some embodiments, the at least one processor may obtain a set of training data including the labeled historical interaction information. The at least one processor may generate, based on the dictionary regarding the service scenario, a plurality of word strings for each text in the set of training data. The at least one processor may generate word vectors corresponding to the plurality of word strings, and train the initial model based on the word vectors. During the training, the at least one processor may update parameters of the initial model by minimizing a loss function of the initial model, and determine the predictive model if the value of the loss function is less than or equal to a threshold.

In some embodiments, the at least one processor may use a stochastic gradient descent algorithm to update the parameters.

In some embodiments, the predictive model may include a convolutional neural network (CNN) model.

In some embodiments, the user may be a driver and the service system may be a transportation service system.

According to an aspect of the present disclosure, a method is provided. The method may include one or more of the following operations. At least one processor may obtain, via a user terminal, current interaction information between a user and a service system. The at least one processor may determine a sentiment estimation using a predictive model that processes the current interaction information. The predictive model may be generated, based on labeled historical interaction information and a dictionary regarding a service scenario provided by the service system, by training an initial model. The at least one processor may determine one or more key word strings indicating the sentiment estimation based on the current interaction information and the dictionary regarding the service scenario.

According to another aspect of the present disclosure, a non-transitory computer readable medium is provided. The non-transitory computer readable medium may comprise executable instructions that cause at least one processor to effectuate a method. The method may include one or more of the following operations. The at least one processor may obtain, via a user terminal, current interaction information between a user and a service system. The at least one processor may determine a sentiment estimation using a predictive model that processes the current interaction information. The predictive model may be generated, based on labeled historical interaction information and a dictionary regarding a service scenario provided by the service system, by training an initial model. The at least one processor may determine one or more key word strings indicating the sentiment estimation based on the current interaction information and the dictionary regarding the service scenario.

BRIEF DESCRIPTION OF THE DRAWINGS

The present disclosure is further described in terms of exemplary embodiments. These exemplary embodiments are described in detail with reference to the drawings. The drawings are not to scale. These embodiments are non-limiting exemplary embodiments, in which like reference numerals represent similar structures throughout the several views of the drawings, and wherein:

FIG. 1A is a schematic diagram illustrating an exemplary sentiment analysis system according to some embodiments of the present disclosure;

FIG. 1B is a schematic diagram illustrating an exemplary sentiment analysis system implemented on an on-demand transportation system according to some embodiments of the present disclosure;

FIG. 2 is a schematic diagram illustrating exemplary components of a computing device according to some embodiments of the present disclosure;

FIG. 3 is a schematic diagram illustrating exemplary hardware and/or software components of an exemplary mobile device according to some embodiments of the present disclosure;

FIG. 4 is a block diagram illustrating an exemplary processing device according to some embodiments of the present disclosure;

FIG. 5 is a flowchart illustrating an exemplary process for analyzing sentiment according to some embodiments of the present disclosure;

FIG. 6 is a flowchart illustrating an exemplary process for constructing a specialized dictionary regarding a service scenario according to some embodiments of the present disclosure;

FIG. 7 is a flowchart illustrating an exemplary process for determining a predictive model according to some embodiments of the present disclosure; and

FIG. 8 is a schematic diagram illustrating an exemplary structure of a CNN model according to some embodiments of the present disclosure.

DETAILED DESCRIPTION

In order to illustrate the technical solutions related to the embodiments of the present disclosure, brief introduction of the drawings referred to in the description of the embodiments is provided below. Obviously, drawings described below are only some examples or embodiments of the present disclosure. Those having ordinary skills in the art, without further creative efforts, may apply the present disclosure to other similar scenarios according to these drawings. Unless stated otherwise or obvious from the context, the same reference numeral in the drawings refers to the same structure and operation.

As used in the disclosure and the appended claims, the singular forms “a,” “an,” and “the” include plural referents unless the content clearly dictates otherwise. It will be further understood that the terms “comprises,” “comprising,” “includes,” and/or “including,” when used in the disclosure, specify the presence of stated steps and elements, but do not preclude the presence or addition of one or more other steps and elements.

Some modules of the system may be referred to in various ways according to some embodiments of the present disclosure. However, any number of different modules may be used and operated in a client terminal and/or a server. These modules are intended to be illustrative, not intended to limit the scope of the present disclosure. Different modules may be used in different aspects of the system and method.

According to some embodiments of the present disclosure, flowcharts are used to illustrate the operations performed by the system. It is to be expressly understood that the operations above or below may or may not be implemented in order. Instead, the operations may be performed in inverted order, or simultaneously. Besides, one or more other operations may be added to the flowcharts, or one or more operations may be omitted from the flowcharts.

Technical solutions of the embodiments of the present disclosure are described with reference to the drawings as described below. It is obvious that the described embodiments are not exhaustive and are not limiting. Other embodiments obtained, based on the embodiments set forth in the present disclosure, by those with ordinary skill in the art without any creative works are within the scope of the present disclosure.

For various online to offline services, for example, a food delivery service or an on-demand transportation service, it is essential that a service provider provides good services to a service requester safely. In most cases, the service provider’s sentiment may have an impact on the quality of his or her services and may even create a safety risk. It is thus important for the service system/platform to assess the service provider’s sentiment. In some embodiments, the service system may provide an interaction interface for communication between the service provider and the service system. The service provider may dialogue with the service system via the interaction interface, for example, by inputting audio or texting messages. The interaction information may be recorded in the form of text(s) in various languages (e.g., English or Chinese), which is analyzed by the service system to assess the service provider’s sentiment. However, it is hard for the service system to accurately identify the true meaning of the interaction information for various reasons (e.g., different languages, different terminologies in certain service scenarios), especially if the interaction information is recorded in Chinese. Unlike English text, in which sentences are sequences of words delimited by white spaces, in Chinese text, sentences are represented as strings of Chinese words/characters or hanzi without similar natural delimiters. Moreover, in certain service scenarios, for example, in the on-demand transportation service, Chinese sentences used in interaction have unique features. For example, a Chinese driver tends to use short sentences in the dialogue, and a short sentence may include uncommon combinations of words with meanings specific to the on-demand transportation service (e.g., a terminology having a specific meaning in the on-demand transportation service scenario). In fact, most hanzi may mean different things when they occur in different positions in a sentence. It is therefore very hard to accurately segment the word strings included in the Chinese text due to the lack of unambiguous word boundary indicators, which creates a huge challenge for analyzing the service provider’s sentiment.

To resolve the issue or similar issue above, various embodiments of the present disclosure may be provided. For example, a sentiment analysis system may be provided. The sentiment analysis system may be implemented on various online to offline service systems. The system may construct a specialized dictionary regarding a specific service based on the interaction information between a plurality of service providers and the service system. The system may utilize the specialized dictionary to segment current interaction information in the specific service (e.g., the on-demand transportation service) , which is useful for understanding the interaction information accurately. In some embodiments, the system may generate a predictive model for estimating a sentiment class of the service provider. For example, the predictive model may be generated by training an initial model (e.g., a CNN model) based on historical interaction information and the specialized dictionary. In some embodiments, the system may perform a corresponding service strategy based on the sentiment assessment, which may improve service experience and reduce safety risk.
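Merely for illustration, segmenting a text without natural delimiters by using a specialized dictionary may be sketched as follows. The paragraph above does not name a segmentation algorithm; forward maximum matching is used here only as a common stand-in technique. The function name `forward_max_match`, the parameter `max_len`, and the sample dictionary entries are hypothetical.

```python
from typing import List, Set


def forward_max_match(text: str, dictionary: Set[str], max_len: int = 4) -> List[str]:
    """Greedily take the longest dictionary entry starting at each position."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest window first, then shrink it one character at a time.
        for n in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + n]
            # A single character is always accepted so the scan can advance.
            if n == 1 or piece in dictionary:
                tokens.append(piece)
                i += n
                break
    return tokens
```

With a dictionary containing the hypothetical entries “乘客”, “取消”, and “接单”, the text “乘客取消接单” is split into those three word strings, and a character not covered by the dictionary is emitted on its own.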

FIG. 1A is a schematic diagram illustrating an exemplary sentiment analysis system according to some embodiments of the present disclosure. For example, sentiment analysis system 100 (hereinafter referred to as SA system 100) may be implemented on an online to offline service platform (e.g., an on-demand transportation service system) for processing a service request (e.g., a car-hailing service request) from a service requester. In some embodiments, the service may be a transportation service, such as a taxi hailing service, a chauffeur service, a delivery vehicle service, a carpool service, a bus service, a driver hiring service, or a shuttle service. In some embodiments, the service may be any online service, such as booking a meal, shopping, or the like, or any combination thereof. The SA system 100 may be a platform including a server 110, a network 120, a user terminal 130, and a storage device 140.

In some embodiments, the server 110 may be a single server or a server group. The server group may be centralized, or distributed (e.g., server 110 may be a distributed system). In some embodiments, the server 110 may be local or remote. For example, the server 110 may access information and/or data stored in the user terminal 130, and/or the storage device 140 via the network 120. As another example, the server 110 may be directly connected to the user terminal 130, and/or the storage device 140 to access stored information and/or data. In some embodiments, the server 110 may be implemented on a cloud platform. Merely by way of example, the cloud platform may include a private cloud, a public cloud, a hybrid cloud, a community cloud, a distributed cloud, an inter-cloud, a multi-cloud, or the like, or any combination thereof. In some embodiments, the server 110 may be implemented on a computing device 200 having one or more components illustrated in FIG. 2 in the present disclosure.

In some embodiments, the server 110 may include a processing device 112. The processing device 112 may process information and/or data relating to user interaction to perform one or more functions described in the present disclosure. For example, in the on-demand transportation service, a driver may dialogue with the on-demand transportation service system via the user terminal 130. The SA system 100 may be implemented on the on-demand transportation service system. The SA system 100 may process interaction information including dialogue contents. Specifically, the processing device 112 of the SA system 100 may obtain current interaction information between the driver and the on-demand transportation service system, and determine a sentiment estimation using a predictive model that processes the current interaction information. The processing device 112 may also determine one or more key word strings indicating the sentiment estimation. The processing device 112 may determine the predictive model by training an initial model. Specifically, the processing device 112 may train the initial model based on labeled historical interaction information and a specialized dictionary regarding a service scenario (e.g., a dictionary regarding the on-demand transportation service). As a further example, the processing device 112 may construct the specialized dictionary. In some embodiments, the processing device 112 may include one or more processing devices (e.g., single-core processing device(s) or multi-core processor(s)). Merely by way of example, the processing device 112 may include one or more hardware processors, such as a central processing unit (CPU), an application-specific integrated circuit (ASIC), an application-specific instruction-set processor (ASIP), a graphics processing unit (GPU), a physics processing unit (PPU), a digital signal processor (DSP), a field-programmable gate array (FPGA), a programmable logic device (PLD), a controller, a microcontroller unit, a reduced instruction-set computer (RISC), a microprocessor, or the like, or any combination thereof.

The network 120 may facilitate the exchange of information and/or data. In some embodiments, one or more components in the SA system 100 (e.g., the server 110, the user terminal 130, and/or the storage device 140) may send information and/or data to other component(s) in the SA system 100 via the network 120. For example, the server 110 may obtain/acquire interaction information from the user terminal 130 via the network 120. In some embodiments, the network 120 may be any type of wired or wireless network, or a combination thereof. Merely by way of example, the network 120 may include a cable network, a wireline network, an optical fiber network, a telecommunications network, an intranet, the Internet, a local area network (LAN), a wide area network (WAN), a wireless local area network (WLAN), a metropolitan area network (MAN), a public switched telephone network (PSTN), a Bluetooth network, a ZigBee network, a near field communication (NFC) network, or the like, or any combination thereof. In some embodiments, the network 120 may include one or more network access points. For example, the network 120 may include wired or wireless network access points such as base stations and/or internet exchange points 120-1, 120-2, …, through which one or more components of the SA system 100 may be connected to the network 120 to exchange data and/or information.

In some embodiments, the user terminal 130 may include a mobile device 130-1, a tablet computer 130-2, a laptop computer 130-3, a built-in device in a motor vehicle 130-4, or the like, or any combination thereof. In some embodiments, the mobile device 130-1 may include a smart home device, a wearable device, a mobile device, a virtual reality device, an augmented reality device, or the like, or any combination thereof. In some embodiments, the smart home device may include a smart lighting device, a control device of an intelligent electrical apparatus, a smart monitoring device, a smart television, a smart video camera, an interphone, or the like, or any combination thereof. In some embodiments, the wearable device may include a bracelet, footgear, glasses, a helmet, a watch, clothing, a backpack, a smart accessory, or the like, or any combination thereof. In some embodiments, the mobile device may include a mobile phone, a personal digital assistant (PDA), a gaming device, a navigation device, a point of sale (POS) device, a laptop, a desktop, or the like, or any combination thereof. In some embodiments, the virtual reality device and/or the augmented reality device may include a virtual reality helmet, virtual reality glasses, a virtual reality patch, an augmented reality helmet, augmented reality glasses, an augmented reality patch, or the like, or any combination thereof. For example, the virtual reality device and/or the augmented reality device may include a Google Glass™, a RiftCon™, a Fragments™, a Gear VR™, etc. In some embodiments, a built-in device in the motor vehicle 130-4 may include an onboard computer, an onboard television, etc.

The storage device 140 may store data and/or instructions. In some embodiments, the storage device 140 may store data obtained from the user terminal 130. In some embodiments, the storage device 140 may store data and/or instructions that the server 110 may execute or use to perform exemplary methods described in the present disclosure. In some embodiments, the storage device 140 may include a mass storage, a removable storage, a volatile read-and-write memory, a read-only memory (ROM), or the like, or any combination thereof. Exemplary mass storage may include a magnetic disk, an optical disk, a solid-state drive, etc. Exemplary removable storage may include a flash drive, a floppy disk, an optical disk, a memory card, a zip disk, a magnetic tape, etc. Exemplary volatile read-and-write memory may include a random-access memory (RAM). Exemplary RAM may include a dynamic RAM (DRAM), a double data rate synchronous dynamic RAM (DDR SDRAM), a static RAM (SRAM), a thyristor RAM (T-RAM), a zero-capacitor RAM (Z-RAM), etc. Exemplary ROM may include a mask ROM (MROM), a programmable ROM (PROM), an erasable programmable ROM (EPROM), an electrically-erasable programmable ROM (EEPROM), a compact disk ROM (CD-ROM), a digital versatile disk ROM, etc. In some embodiments, the storage device 140 may be implemented on a cloud platform. Merely by way of example, the cloud platform may include a private cloud, a public cloud, a hybrid cloud, a community cloud, a distributed cloud, an inter-cloud, a multi-cloud, or the like, or any combination thereof.

In some embodiments, the storage device 140 may be connected to the network 120 to communicate with one or more components in the SA system 100 (e.g., the server 110, the user terminal 130, etc.). One or more components in the SA system 100 may access the data or instructions stored in the storage device 140 via the network 120. In some embodiments, the storage device 140 may be directly connected to or communicate with one or more components in the SA system 100 (e.g., the server 110, the user terminal 130, etc.). In some embodiments, the storage device 140 may be part of the server 110.

In some embodiments, one or more components in the SA system 100 (e.g., the server 110, the user terminal 130, etc.) may have permission to access the storage device 140. In some embodiments, one or more components in the SA system 100 may read and/or modify information relating to the user and/or the public when one or more conditions are met. For example, the server 110 may read and/or modify a driver’s information (e.g., a service score) after estimating the driver’s sentiment class.

In some embodiments, information exchange among one or more components in the SA system 100 may be achieved by way of requesting a dialogue service between the user (e.g., the service provider) and the service system. For example, for the on-demand transportation service, a driver who is registered on the on-demand transportation service platform may request a dialogue with the on-demand transportation service platform via an application connected to the on-demand transportation service platform. In some embodiments, the service system may be any online to offline service system/platform. The object of the online to offline service may be any product. In some embodiments, the product may be a tangible product or an immaterial product. The tangible product may include food, medicine, commodity, chemical product, electrical appliance, clothing, car, housing, luxury, or the like, or any combination thereof. The immaterial product may include a servicing product, a financial product, a knowledge product, an internet product, or the like, or any combination thereof. The internet product may include an individual host product, a web product, a mobile internet product, a commercial host product, an embedded product, or the like, or any combination thereof. The mobile internet product may be used in a software of a mobile terminal, a program, a system, or the like, or any combination thereof. The mobile terminal may include a tablet computer, a laptop computer, a mobile phone, a personal digital assistant (PDA), a smart watch, a point of sale (POS) device, an onboard computer, an onboard television, a wearable device, or the like, or any combination thereof. For example, the product may be any software and/or application used in the computer or mobile phone. The software and/or application may relate to socializing, shopping, transporting, entertainment, learning, investment, or the like, or any combination thereof. In some embodiments, the software and/or application relating to transporting may include a traveling software and/or application, a vehicle scheduling software and/or application, a mapping software and/or application, etc. In the vehicle scheduling software and/or application, the vehicle may include a horse, a carriage, a rickshaw (e.g., a wheelbarrow, a bike, a tricycle, etc.), a car (e.g., a taxi, a bus, a private car, etc.), a train, a subway, a vessel, an aircraft (e.g., an airplane, a helicopter, a space shuttle, a rocket, a hot-air balloon, etc.), or the like, or any combination thereof.

One of ordinary skill in the art would understand that when an element of the SA system 100 performs, the element may perform through electrical signals and/or electromagnetic signals. For example, when the user terminal 130 processes a task, such as making a determination or sending out dialogue contents (e.g., an audio dialogue or a text dialogue), the user terminal 130 may operate logic circuits in its processor to process such task. When the user terminal 130 sends out the dialogue contents to the server 110, a processor of the user terminal 130 may generate electrical signals encoding the dialogue contents. The processor of the user terminal 130 may then send the electrical signals to an output port. If the user terminal 130 communicates with the server 110 via a wired network, the output port may be physically connected to a cable, which may further transmit the electrical signals to an input port of the server 110. If the user terminal 130 communicates with the server 110 via a wireless network, the output port of the user terminal 130 may be one or more antennas, which may convert the electrical signals to electromagnetic signals. Within an electronic device, such as the user terminal 130, and/or the server 110, when a processor thereof processes an instruction, sends out an instruction, and/or performs an action, the instruction and/or action is conducted via electrical signals. For example, when the processor retrieves or saves data from a storage medium (e.g., the storage device 140), it may send out electrical signals to a read/write device of the storage medium, which may read or write structured data in the storage medium. The structured data may be transmitted to the processor in the form of electrical signals via a bus of the electronic device. Here, an electrical signal may refer to one electrical signal, a series of electrical signals, and/or a plurality of discrete electrical signals.

FIG. 1B is a schematic diagram illustrating sentiment analysis system 100 implemented on an on-demand transportation system according to some embodiments of the present disclosure. As shown in FIG. 1B, service system 150 may be the on-demand transportation system (hereinafter referred to as the on-demand transportation system 150). The on-demand transportation system 150 may be a platform configured to provide the transportation service (e.g., a car hailing service). The on-demand transportation system 150 may dispatch a service vehicle 160 to provide the transportation service for a service requester in response to a service request. Each service vehicle 160 may correspond to a service provider (e.g., a driver). The service provider may, via the network 120, communicate with the on-demand transportation system 150 through the user terminal (e.g., the user terminal 130). An application of the on-demand transportation system 150 may be installed in the user terminal. In some embodiments, the application may embed an intelligent customer service robot (hereinafter referred to as an AI robot). The service provider may interact with the AI robot by inputting audio or texting messages. In some embodiments, the application may provide various interaction languages for service requesters or service providers from different countries, for example, the English language or the Chinese language. In China, almost all of the service providers use the Chinese language to dialogue with the AI robot. The dialogue contents between the service provider and the AI robot may be stored in a storage device (e.g., the storage device 140). The dialogue contents may include service requirements, complaints, even offensive remarks, and so on. The SA system 100 may analyze the dialogue contents to estimate the service provider’s sentiment class. In some embodiments, the SA system 100 may be directly implemented on the on-demand transportation system 150. In some embodiments, the SA system 100 may indirectly connect to the on-demand transportation system 150 via the network 120. Merely for illustration, the on-demand transportation system 150 may be the default service system disclosed in the present disclosure. However, other online to offline service systems are not excluded from the service system.

FIG. 2 is a schematic diagram illustrating exemplary components of a computing device (e.g., computing device 200) according to some embodiments of the present disclosure. The server 110, the user terminal 130, and/or the storage device 140 may be implemented on the computing device 200. The particular system may use a functional block diagram to explain the hardware platform containing one or more user interfaces. The computer may be a computer with general or specific functions. Both types of the computers may be configured to implement any particular system according to some embodiments of the present disclosure. Computing device 200 may be configured to implement any components that perform one or more functions disclosed in the present disclosure. For example, the computing device 200 may implement any component of the SA system 100 as described herein. In FIGs. 1A and 2, only one such computer device is shown purely for convenience purposes. One of ordinary skill in the art would understand at the time of filing of this application that the computer functions relating to the service as described herein may be implemented in a distributed fashion on a number of similar platforms, to distribute the processing load.

The computing device 200, for example, may include COM ports 250 connected to and from a network connected thereto to facilitate data communications. The computing device 200 may also include a processor (e.g., the processor 220) , in the form of one or more processors (e.g., logic circuits) , for executing program instructions. For example, the processor 220 may include interface circuits and processing circuits therein. The interface circuits may be configured to receive electronic signals from a bus 210, wherein the electronic signals encode structured data and/or instructions for the processing circuits to process. The processing circuits may conduct logic calculations, and then determine a conclusion, a result, and/or an instruction encoded as electronic signals. Then the interface circuits may send out the electronic signals from the processing circuits via the bus 210.

The exemplary computing device may include the internal communication bus 210, program storage and data storage of different forms including, for example, a disk 270, and a read-only memory (ROM) 230, or a random access memory (RAM) 240, for various data files to be processed and/or transmitted by the computing device. The exemplary computing device may also include program instructions stored in the ROM 230, RAM 240, and/or another type of non-transitory storage medium to be executed by the processor 220. The methods and/or processes of the present disclosure may be implemented as the program instructions. The computing device 200 also includes an I/O component 260, supporting input/output between the computer and other components. The computing device 200 may also receive programming and data via network communications.

Merely for illustration, only one processor is illustrated in FIG. 2. Multiple processors are also contemplated; thus operations and/or method steps performed by one processor as described in the present disclosure may also be jointly or separately performed by the multiple processors. For example, if in the present disclosure the processor of the computing device 200 executes both operation A and operation B, it should be understood that operation A and operation B may also be performed by two different processors jointly or separately in the computing device 200 (e.g., the first processor executes operation A and the second processor executes operation B, or the first and second processors jointly execute operations A and B) .

FIG. 3 is a schematic diagram illustrating exemplary hardware and/or software components of an exemplary mobile device (e.g., mobile device 300) according to some embodiments of the present disclosure. The user terminal 130 may be implemented on the mobile device 300. As illustrated in FIG. 3, the mobile device 300 may include a communication platform 310, a display 320, a graphics processing unit (GPU) 330, a central processing unit (CPU) 340, an I/O 350, a memory 360, and a storage 390. The CPU 340 may include interface circuits and processing circuits similar to the processor 220. In some embodiments, any other suitable component, including but not limited to a system bus or a controller (not shown), may also be included in the mobile device 300. In some embodiments, a mobile operating system 370 (e.g., iOS™, Android™, Windows Phone™) and one or more applications 380 may be loaded into the memory 360 from the storage 390 in order to be executed by the CPU 340. The applications 380 may include a browser or any other suitable mobile apps for transmitting the user interaction data/information to the server 110. For example, the application 380 may be a car-hailing application for providing a transportation service. The car-hailing application may connect to the transportation service system and/or the SA system 100 implemented on the transportation service system. User interaction with the information stream may be achieved via the I/O 350 and provided to the processing device 112 and/or other components of the SA system 100 via the network 120.

In order to implement the various modules, units and their functions described above, a computer hardware platform may be used as the hardware platform of one or more elements (e.g., a component of the server 110 described in FIG. 1A). Since these hardware elements, operating systems, and programming languages are common, it may be assumed that persons skilled in the art are familiar with these techniques and are able to provide the information required for the sentiment analysis according to the techniques described in the present disclosure. A computer with a user interface may be used as a personal computer (PC), or another type of workstation or terminal device. After being properly programmed, a computer with a user interface may be used as a server. It may be considered that those skilled in the art are also familiar with such structures, programs, or general operations of this type of computer device. Thus, extra explanations are not described for the figures.

FIG. 4 is a block diagram illustrating an exemplary processing device according to some embodiments of the present disclosure. As shown in FIG. 4, the processing device 112 may include an acquisition module 402, a dictionary construction module 404, a model training module 406, a sentiment analysis module 408, a reminder module 410, and a strategy determination module 412.

The modules may be hardware circuits of at least part of the processing device 112. The modules may also be implemented as an application or set of instructions read and executed by the processing device 112. Further, the modules may be any combination of the hardware circuits and the application/instructions. For example, the modules may be the part of the processing device 112 when the processing device 112 is executing the application/set of instructions.

The acquisition module 402 may obtain current interaction information between a user and a service system. In some embodiments, the user may be a service provider registered in the service system. For example, the service system may be the on-demand transportation system 150. The user may be a driver providing a transportation service (e.g., a car-hailing service). It should be understood that the service system may provide various online to offline services, and the transportation service is not intended to be limiting. Taking the transportation service as an example, the driver may interact with the on-demand transportation system 150 via the user terminal 130. In some embodiments, the driver may feed back his/her requirements to the on-demand transportation system 150 through an application installed in the user terminal 130. For example, the driver may converse with an AI robot embedded in the application. The user terminal 130 may send interaction information to the on-demand transportation system 150 and/or the SA system 100 in real time or near real time. The acquisition module 402 may obtain the real-time or near real-time interaction information including dialogue contents.

In some embodiments, the interaction information between a plurality of users and the service system may be stored in a storage device (e.g., the storage device 140). The interaction information may include historical interaction information and current interaction information. In some embodiments, the interaction information may include dialogue contents between a plurality of drivers and the on-demand transportation system 150. In some embodiments, the dialogue contents regarding a specific service scenario may be used to construct a specialized dictionary regarding the specific service scenario. The service system may utilize the specialized dictionary to analyze features of the service providers through text classification or sentiment analysis.

The dictionary construction module 404 may construct the specialized dictionary regarding the service scenario (e.g., a dictionary regarding the transportation service). Specifically, the dictionary construction module 404 may obtain a plurality of sets of interaction information regarding the service scenario from the storage device 140. In some embodiments, each of the plurality of sets of interaction information may correspond to an individual user. In some embodiments, the interaction information may be converted to a text including one or more words or characters. The dictionary construction module 404 may determine one or more candidate word strings for each set of interaction information. For example, the dictionary construction module 404 may perform a word segmentation operation to determine the one or more candidate word strings for each set of interaction information. The dictionary construction module 404 may determine common candidate word strings based on the one or more candidate word strings from the plurality of sets of interaction information. The dictionary construction module 404 may construct the specialized dictionary by measuring the common candidate word strings based on a measurement indicator. For example, the measurement indicator may include a term frequency, a collocation, a degree of freedom, or the like, or any combination thereof. More descriptions regarding the construction of the specialized dictionary may be found elsewhere in the present disclosure (e.g., FIG. 6 and the descriptions thereof).

The model training module 406 may generate a predictive model for determining a sentiment estimation. The model training module 406 may generate the predictive model by training an initial model. For example, the model training module 406 may obtain a set of training data including the labeled historical interaction information from the storage device 140. The model training module 406 may generate, based on the dictionary regarding the service scenario (e.g., the specialized dictionary regarding the transportation service), a plurality of word strings for each text in the set of training data. The model training module 406 may generate word vectors corresponding to the plurality of word strings. The model training module 406 may train the initial model based on the word vectors.

Specifically, the model training module 406 may configure the initial model. The initial model may be a machine learning model. Exemplary models may include, but are not limited to, a deep neural network (DNN) model, a long short-term memory (LSTM) model, a convolutional neural network (CNN) model, etc. The word vectors may be taken as inputs of the initial model. During the training, the model training module 406 may iteratively update parameters by minimizing a loss function of the initial model. The model training module 406 may optimize a training loss of the loss function to generate a predictive model. For example, when the training loss of the loss function is less than or equal to a threshold, the model training module 406 may terminate the training, and determine the current model as the optimal predictive model. As a further example, when the training loss of the loss function has converged, for example, when the training loss remains constant, the model training module 406 may terminate the training, and determine the current model as the optimal predictive model. As still another example, in some embodiments, when the number of training rounds (or count of iterations) is equal to a maximum value (e.g., 50, 100, 150, etc.), the model training module 406 may also terminate the training, and determine the current model as the optimal predictive model. More descriptions regarding the determination of the predictive model may be found elsewhere in the present disclosure (e.g., FIGs. 7-8, and the descriptions thereof).
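Merely for illustration, the three termination conditions described above may be sketched as follows. The function name `stop_reason`, the default threshold, the convergence tolerance, and the simulated loss values are assumptions made for this sketch and are not specified in the present disclosure; a real implementation would obtain the loss values from the training of the initial model.

```python
def stop_reason(loss_history, loss_threshold=0.05, tolerance=1e-6, max_rounds=100):
    """Return why training should stop, or None to keep training.

    loss_history holds one training loss per completed round.
    All default values are hypothetical.
    """
    rounds = len(loss_history)
    if rounds == 0:
        return None
    current = loss_history[-1]
    # Condition 1: the training loss is less than or equal to a threshold.
    if current <= loss_threshold:
        return "threshold"
    # Condition 2: the loss has converged (it stays constant between rounds).
    if rounds >= 2 and abs(loss_history[-2] - current) <= tolerance:
        return "converged"
    # Condition 3: the number of training rounds reached the maximum value.
    if rounds >= max_rounds:
        return "max_rounds"
    return None


# Simulated losses standing in for a real optimizer (hypothetical values).
simulated_losses = [0.90, 0.41, 0.17, 0.08, 0.04]
history = []
reason = None
for loss in simulated_losses:
    history.append(loss)
    reason = stop_reason(history)
    if reason is not None:
        break
print(len(history), reason)  # prints: 5 threshold
```

In this sketch the conditions are checked in a fixed order, so a round that satisfies several conditions reports only the first one; the present disclosure does not prescribe any such ordering.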

The sentiment analysis module 408 may determine a sentiment estimation using a predictive model that processes the current interaction information. For example, the sentiment analysis module 408 may obtain the predictive model from the storage device 140. The sentiment analysis module 408 may input the current interaction information to the predictive model. The predictive model may output a corresponding sentiment estimation. In some embodiments, the sentiment estimation may include a label indicating a positive mood, a label indicating a negative mood, or a label indicating a neutral mood.

In some embodiments, the sentiment analysis module 408 may determine one or more key word strings indicating the sentiment estimation. For example, the sentiment analysis module 408 may determine the one or more key word strings based on the current interaction information and the specialized dictionary. The key word string may refer to a word or a word string having an emotion tendency (e.g., an emotion word). In some embodiments, the dictionary may include a plurality of emotion words from the interaction information between the service provider and the service system. The sentiment analysis module 408 may segment the text corresponding to the current interaction information based on the specialized dictionary. If a segmented word string is included in the specialized dictionary, and the segmented word string is the same as or similar to one of the emotion words included in the specialized dictionary, the sentiment analysis module 408 may designate the segmented word string as a key word string. The key word strings may assist in analyzing the reasons for the current estimated sentiment class.

In some embodiments, if the sentiment estimation is the negative mood, the reminder module 410 may send a reminder signal for easing the negative mood to the user terminal 130. In some embodiments, the reminder signal may include information for encouraging the service provider (e.g., the driver) to keep a positive mood, such as a song, a joke, etc. In some embodiments, the reminder signal may include an alert signal. For example, the reminder signal may remind the driver to drive safely and not to get distracted due to the negative mood.

In some embodiments, the strategy determination module 412 may determine a corresponding service strategy for the service provider in response to the determined sentiment estimation. For example, for the on-demand transportation service, the strategy determination module 412 of the processing device 112 may determine an order allocation strategy based on the sentiment estimation. In some embodiments, if the sentiment estimation is the positive mood, it is possible that the driver is enthusiastic about serving others. The strategy determination module 412 may tend to allocate more service orders to the driver. In some embodiments, if the sentiment estimation is the negative mood, it is possible that the driver has some complaints or dissatisfaction. In this case, the strategy determination module 412 may tend to reduce order allocation for the driver, so as to reduce the security risk or the poor service experience for the service requester caused by the negative mood of the driver. In some embodiments, if the sentiment estimation is the neutral mood, the strategy determination module 412 may perform normal order allocation for the driver.

It should be noted that the above description of the processing device 112 is merely provided for the purposes of illustration, and not intended to limit the scope of the present disclosure. For persons having ordinary skill in the art, multiple variations and modifications may be made under the teachings of the present disclosure. For example, the processing device 112 may further include a storage module to facilitate data storage. As another example, one or more modules may be integrated into a single module. However, those variations and modifications do not depart from the scope of the present disclosure.

FIG. 5 is a flowchart illustrating an exemplary process for analyzing sentiment according to some embodiments of the present disclosure. In some embodiments, process 500 may be implemented on the SA system 100. For example, the process 500 may be stored in the storage device 140 and/or the storage (e.g., the ROM 230, the RAM 240, or the storage 390) in the form of instructions, and invoked and/or executed by the server 110 (e.g., the processing device 112 of the server 110, or the processor 220 of the computing device 200). The operations of the illustrated process presented below are intended to be illustrative. In some embodiments, the process 500 may be accomplished with one or more additional operations not described and/or without one or more of the operations herein discussed. Additionally, the order of the operations of the process as illustrated in FIG. 5 and described below is not intended to be limiting.

In 502, the processing device (e.g., the acquisition module 402 of the processing device 112) may obtain, via a user terminal, current interaction information between a user and a service system.

In some embodiments, the service system may provide various kinds of online to offline services, such as an on-demand transportation service or a food delivery service. The user may be an individual who registers with the service system and is responsible for providing the service, that is, a service provider. Taking the on-demand transportation system 150 as an example, the service provider may be a registered driver who is responsible for providing the transportation service (e.g., a car-hailing service). The driver may receive service information from the on-demand transportation system 150 through an application installed in the user terminal 130. The driver may also feed back his/her requirements to the on-demand transportation system 150 through the application. For example, the driver may converse with an AI robot embedded in the application. The user terminal 130 may send interaction information to the on-demand transportation system 150 and/or the SA system 100 in real time or near real time. The processing device 112 may obtain the real-time or near real-time interaction information including dialogue contents.

In some embodiments, the interaction information may include, but is not limited to, audio information, text information, etc. In some embodiments, the audio information may be converted to corresponding text information for further processing. In some embodiments, the interaction information may be in various languages (e.g., English, Chinese, or Japanese). For example, in China, the dialogue content may be recorded in Chinese words. In a specific service scenario, for example, in the on-demand transportation service, the dialogue between the Chinese driver and the on-demand transportation system 150 may have unique features. For example, most Chinese sentences included in the dialogue contents are short, that is, they include few Chinese characters/words. Specialized phrases or combinations of Chinese words (i.e., idioms) may be included in the sentences. In contrast with English text, it is challenging for a computing device to understand the true meaning of a Chinese text due to the lack of unambiguous word boundary indicators. As used herein, the terms “Chinese word”, “Chinese character” and “hanzi” may be used interchangeably.

In 504, the processing device (e.g., a sentiment analysis module 408 of the processing device 112) may quantitatively determine a sentiment estimation using a predictive model that processes the current interaction information.

In some embodiments, the sentiment estimation may include a label indicating a positive mood (hereinafter referred to as a first label), a label indicating a negative mood (hereinafter referred to as a second label), or a label indicating a neutral mood (hereinafter referred to as a third label). In some embodiments, the processing device 112 may obtain a trained predictive model from a storage device (e.g., the storage device 140). The processing device 112 may input the current interaction information to the predictive model. The predictive model may output a corresponding sentiment estimation. In some embodiments, the output values of the predictive model may include a prediction score (e.g., a probability) indicating the positive mood, the negative mood or the neutral mood. The processing device 112 may estimate a corresponding sentiment label based on the prediction score. For example, assume that a first probability indicating the positive mood is 0.72, a second probability indicating the negative mood is 0.13, and a third probability indicating the neutral mood is 0.15. In this case, the processing device 112 may determine that the current sentiment estimation is the positive mood, and output the first label.
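Merely for illustration, the selection of a sentiment label from the prediction scores may be sketched as follows. The sketch assumes that the label with the highest probability is selected, which is consistent with the numerical example above but is not stated as a rule in the present disclosure; the names `LABELS` and `estimate_label` are hypothetical.

```python
# Hypothetical mapping from each mood to the label names used in the text.
LABELS = {
    "positive": "first label",
    "negative": "second label",
    "neutral": "third label",
}


def estimate_label(scores: dict[str, float]) -> tuple[str, str]:
    """Pick the mood with the highest prediction score (assumed rule)."""
    mood = max(scores, key=scores.get)
    return mood, LABELS[mood]


scores = {"positive": 0.72, "negative": 0.13, "neutral": 0.15}
print(estimate_label(scores))  # prints: ('positive', 'first label')
```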

In some embodiments, the predictive model may be generated by training an initial model using a machine learning algorithm. For example, the model training module 406 may obtain a set of training data (also referred to herein as a training set). The training set may include labeled historical interaction information between a plurality of drivers and the on-demand transportation system 150, for example, labeled historical dialogue contents between the plurality of drivers and the AI robot embedded in the application. In some embodiments, the training data is text data converted from the historical dialogue contents. The text data may include one or more phrases, one or more sentences, or even one or more paragraphs. The model training module 406 may generate a plurality of word strings for each text in the training set based on a specialized dictionary regarding the service scenario (e.g., a dictionary regarding the on-demand transportation service). The specialized dictionary may include a plurality of common word strings in the on-demand transportation service, and a plurality of word strings having an emotion tendency (hereinafter referred to as emotion words). In some embodiments, the word string may include a single word, or a combination of multiple words. The model training module 406 may generate a plurality of word vectors corresponding to the plurality of word strings. The plurality of word vectors may be taken as input of the initial model for training the initial model. During the training, the model training module 406 may iteratively update parameters of the initial model by minimizing a loss function of the initial model. If the loss function has converged or a training loss value of the loss function is less than or equal to a threshold, the model training module 406 may determine the predictive model, and terminate the training process.
The trained predictive model may be used to predict a real-time sentiment estimation by processing real-time interaction information. More descriptions of the determination of the predictive model may be found elsewhere in the present disclosure (e.g., FIGs. 7-8, and the descriptions thereof).

For the generation of the predictive model, the determination of a dictionary regarding a specific service scenario (hereinafter referred to as a specialized dictionary) is necessary. The word or word string (e.g., a Chinese word string) may be segmented according to the dictionary. However, in different service scenarios, the words or word strings may carry specific meanings. It is important to accurately segment the word or the word string in the context of the specific service. For example, in the on-demand transportation service, the Chinese word string “异地车” means cars in different cities, and is designated as a whole word. With a common Chinese dictionary, “异地车” may be segmented into two Chinese words, that is, “异地” (meaning different cities) and “车” (meaning a vehicle). During the training of the model, the word vector corresponding to “异地车” may be different from the word vectors corresponding to “异地” and “车”. Different word strings may correspond to different word vectors. Inaccurate word vectors may cause errors in the training of the predictive model. The specialized dictionary is therefore necessary. More descriptions of constructing the specialized dictionary may be found elsewhere in the present disclosure.
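Merely for illustration, the effect of the dictionary on the segmentation of “异地车” may be sketched as follows. The sketch assumes a simple forward maximum matching segmenter with a maximum length of three characters; the function name `forward_max_match` and the two toy dictionaries are hypothetical and far smaller than any real dictionary.

```python
def forward_max_match(text: str, dictionary: set[str], max_len: int = 3) -> list[str]:
    """Segment text left to right, always taking the longest dictionary match.

    A single character is emitted as a fallback when nothing matches.
    """
    tokens = []
    i = 0
    while i < len(text):
        for length in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + length]
            if length == 1 or piece in dictionary:
                tokens.append(piece)
                i += length
                break
    return tokens


# Toy dictionaries (hypothetical contents).
common_dictionary = {"异地", "车"}
specialized_dictionary = common_dictionary | {"异地车"}

print(forward_max_match("异地车", common_dictionary))       # prints: ['异地', '车']
print(forward_max_match("异地车", specialized_dictionary))  # prints: ['异地车']
```

Because each distinct word string is mapped to its own word vector, the two segmentations above would feed different inputs to the model for the same text.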

In 506, the processing device (e.g., the sentiment analysis module 408 of the processing device 112) may determine one or more key word strings indicating the sentiment estimation.

In some embodiments, the processing device 112 may determine the one or more key word strings based on the current interaction information and the specialized dictionary. The key word string may refer to a word or word string having an emotion tendency. Emotion words may be included in the dictionary. The processing device 112 may segment the text corresponding to the current interaction information based on the specialized dictionary. For example, for each sentence included in the text, the sentiment analysis module 408 may set a maximum segmentation length. The sentiment analysis module 408 may segment the sentence in a segmentation sequence (e.g., left to right). If a segmented word string is included in the specialized dictionary, and the segmented word string is the same as or similar to one of the emotion words included in the specialized dictionary, the sentiment analysis module 408 may designate the segmented word string as a key word string. The key word strings may assist in analyzing the reasons for the current estimated sentiment class. In some embodiments, the key word strings corresponding to the service provider may be stored in a storage device (e.g., the storage device 140). The SA system 100 may assess the emotion management of the service provider based on the key word strings.

In some embodiments, the processing device 112 may determine the one or more key word strings by using a dictionary-based word segmentation algorithm. Exemplary dictionary-based word segmentation algorithms may include Maximum Matching, Reverse Maximum Matching, Minimum Matching, Reverse Minimum Matching, Bidirectional Maximum Matching, Bidirectional Minimum Matching, Bidirectional Maximum Minimum Matching, Full Segmentation, Minimal Word Count, MaxNgram Score, or the like, or any combination thereof.
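Merely for illustration, the designation of key word strings may be sketched as follows. Reverse Maximum Matching is picked from the algorithms listed above purely as an example. The function names, the toy dictionary, and the choice of “太少” as an emotion word are assumptions made for this sketch. Only exact matches against the emotion words are handled; the "similar to" case is left out because the present disclosure does not define the similarity measure here.

```python
def reverse_max_match(sentence: str, dictionary: set[str], max_len: int) -> list[str]:
    """Segment right to left, always taking the longest dictionary match.

    A single character is emitted as a fallback when nothing matches.
    """
    tokens = []
    end = len(sentence)
    while end > 0:
        for length in range(min(max_len, end), 0, -1):
            piece = sentence[end - length:end]
            if length == 1 or piece in dictionary:
                tokens.append(piece)
                end -= length
                break
    tokens.reverse()  # restore reading order
    return tokens


def key_word_strings(sentence, dictionary, emotion_words, max_len=3):
    """Keep segmented word strings that are dictionary entries and emotion words."""
    segmented = reverse_max_match(sentence, dictionary, max_len)
    return [w for w in segmented if w in dictionary and w in emotion_words]


# Toy data (hypothetical): a tiny specialized dictionary and its emotion words.
dictionary = {"派单", "太少", "了"}
emotion_words = {"太少"}

print(reverse_max_match("派单太少了", dictionary, 3))                # prints: ['派单', '太少', '了']
print(key_word_strings("派单太少了", dictionary, emotion_words))     # prints: ['太少']
```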

In some embodiments, the SA system 100 may perform one or more operations according to the sentiment estimation. For example, if the sentiment estimation is the negative mood, the reminder module 410 of the processing device 112 may send a reminder signal for easing the negative mood to the user terminal 130. In some embodiments, the reminder signal may include information for encouraging the service provider (e.g., the driver) to keep a positive mood, such as a song, a joke, etc. In some embodiments, the reminder signal may include an alert signal. For example, the reminder signal may remind the driver to drive safely and not to get distracted due to the negative mood.

In some embodiments, the processing device 112 may determine a corresponding service strategy for the service provider in response to the determined sentiment estimation. For example, for the on-demand transportation service, the strategy determination module 412 of the processing device 112 may determine an order allocation strategy based on the sentiment estimation. In some embodiments, if the sentiment estimation is the positive mood, it is possible that the driver is enthusiastic about serving others. The strategy determination module 412 may tend to allocate more service orders to the driver. In some embodiments, if the sentiment estimation is the negative mood, it is possible that the driver has some complaints or dissatisfaction. In this case, the strategy determination module 412 may tend to reduce order allocation for the driver, so as to reduce the security risk or the poor service experience for the service requester caused by the negative mood of the driver. In some embodiments, if the sentiment estimation is the neutral mood, the strategy determination module 412 may perform normal order allocation for the driver.
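Merely for illustration, the operations performed in response to the sentiment estimation may be sketched as a simple mapping. The names `Mood` and `plan_actions` and the string values "increase", "reduce" and "normal" are hypothetical; the present disclosure does not specify by how much the order allocation is adjusted.

```python
from enum import Enum


class Mood(Enum):
    POSITIVE = "positive"
    NEGATIVE = "negative"
    NEUTRAL = "neutral"


def plan_actions(mood: Mood) -> dict:
    """Map a sentiment estimation to a reminder decision and an allocation strategy."""
    if mood is Mood.NEGATIVE:
        # Ease the negative mood and reduce order allocation.
        return {"send_reminder": True, "order_allocation": "reduce"}
    if mood is Mood.POSITIVE:
        # Allocate more service orders to the driver.
        return {"send_reminder": False, "order_allocation": "increase"}
    # Neutral mood: normal order allocation.
    return {"send_reminder": False, "order_allocation": "normal"}


print(plan_actions(Mood.NEGATIVE))
```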

It should be noted that the above description of the process 500 is merely provided for the purposes of illustration, and not intended to limit the scope of the present disclosure. For persons having ordinary skill in the art, multiple variations and modifications may be made under the teachings of the present disclosure. For example, operation 504 and operation 506 may be integrated into a single operation. However, those variations and modifications do not depart from the scope of the present disclosure.

FIG. 6 is a flowchart illustrating an exemplary process for constructing a specialized dictionary regarding a service scenario according to some embodiments of the present disclosure. In some embodiments, process 600 may be implemented in the SA system 100. For example, the process 600 may be stored in the storage device 140 and/or the storage (e.g., the ROM 230, the RAM 240, or the storage 390) in the form of instructions, and invoked and/or executed by the server 110 (e.g., the processing device 112 of the server 110, or the processor 220 of the computing device 200). The operations of the illustrated process presented below are intended to be illustrative. In some embodiments, the process 600 may be accomplished with one or more additional operations not described and/or without one or more of the operations herein discussed. Additionally, the order of the operations of the process as illustrated in FIG. 6 and described below is not intended to be limiting.

In 602, the processing device (e.g., the dictionary construction module 404 of the processing device 112) may obtain a plurality of sets of interaction information regarding the service scenario.

In some embodiments, each of the plurality of sets of interaction information may correspond to an individual user. In some embodiments, the interaction information may be converted to a text including one or more words or characters. For example, if the interaction information is audio information, the audio information may be converted to the text information. The text information may be stored in a storage device (e.g., the storage device 140) . As another example, if the interaction information is the text information, the text information may be directly stored in the storage device (e.g., the storage device 140) . In some embodiments, the plurality of sets of interaction information may include real-time interaction information and historical interaction information between the user and the service system.

Merely for illustration, the service scenario may relate to an on-demand transportation service (e.g., a car-hailing service). The user may be a service provider, that is, a driver. In the transportation service scenario, the driver may feed back his/her requirements or problems to the service system. For example, the driver may converse with an intelligent customer service robot embedded in an application (e.g., a car-hailing application) installed in his/her user terminal. The dialogue content may be stored in the storage device. In some embodiments, each driver’s dialogue content may be individually stored in one set of interaction information. The dialogue content may be sent to the on-demand transportation system 150 or the SA system 100. The dictionary construction module 404 may obtain the plurality of sets of interaction information from the storage device 140 in order to construct a specialized dictionary regarding the transportation service.

In 604, the processing device (e.g., the dictionary construction module 404 of the processing device 112) may determine one or more candidate word strings for each set of interaction information.

In some embodiments, the processing device 112 may perform a word segmentation operation to determine the one or more candidate word strings for each set of interaction information. For example, the dictionary construction module 404 may set a segmentation step for the text corresponding to the interaction information. The dictionary construction module 404 may further determine the one or more candidate word strings by segmenting the text according to the segmentation step.

Let the plurality of sets of interaction information be D = (D₁, D₂, …, D_n), where D_i denotes a text corresponding to the interaction information of an individual user, i = 1, 2, …, n. The plurality of sets of interaction information may include the interaction information between a plurality of users and the service system. In some embodiments, for each text (e.g., D₁), the dictionary construction module 404 may determine one or more semantic units. The semantic unit may include a sentence. The dictionary construction module 404 may segment each of the one or more semantic units in the segmentation step. Let the segmentation step be N. The segmentation step may refer to a number (or count) of characters or words used in each word segmentation operation, such as 1 character, 2 characters, 3 characters, etc. The segmentation step may also be referred to as the maximum segmentation length. The dictionary construction module 404 may preset the segmentation step. The dictionary construction module 404 may determine a plurality of sets of candidate word strings by segmenting each semantic unit. Let the plurality of sets of candidate word strings be S = (S₁, S₂, …, S_n), where S_i denotes the set of candidate word strings for one text (i.e., D_i). S_i corresponds to D_i and includes one or more candidate word strings.

In some embodiments, given that the segmentation sequence is from left to right and the segmentation step is 3 (N = 3), the dictionary construction module 404 may segment the semantic unit in the segmentation step according to the segmentation sequence, and generate multiple word strings. The segmented word strings may be designated as the candidate word strings. The number of words (or characters) included in each segmented word string may be less than or equal to the segmentation step. Merely for illustration, assuming that D₁ = {派单太少了}, which means that too few orders are dispatched, the set of candidate word strings may be generated as S₁ = {派, 单, 太, 少, 了, 派单, 单太, 太少, 少了, 派单太, 单太少, 太少了}. Similar to the word segmentation of D₁, the dictionary construction module 404 may determine the set of candidate word strings for each text.
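The segmentation described above may be sketched as follows. This is a minimal illustration rather than the disclosed implementation; the function name `candidate_word_strings` and the choice to drop repeated strings are assumptions made for the example.

```python
from typing import List, Set


def candidate_word_strings(semantic_unit: str, step: int = 3) -> List[str]:
    """Collect every substring of 1..step characters, scanning left to right.

    Assumption: a string that appears more than once in the same
    semantic unit is kept only once.
    """
    candidates: List[str] = []
    seen: Set[str] = set()
    for length in range(1, step + 1):
        for start in range(len(semantic_unit) - length + 1):
            piece = semantic_unit[start:start + length]
            if piece not in seen:
                seen.add(piece)
                candidates.append(piece)
    return candidates


# D1 from the illustration above, segmented with N = 3.
s1 = candidate_word_strings("派单太少了", step=3)
print(s1)
```

With N = 3, the five-character sentence yields 5 + 4 + 3 = 12 candidate word strings, which matches the set S₁ listed above.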

In 606, the processing device (e.g., the dictionary construction module 404 of the processing device 112) may determine common candidate word strings based on the one or more candidate word strings from the plurality of sets of interaction information.

In some embodiments, the plurality of sets of candidate word strings, S, may be determined by using the word segmentation operation. The dictionary construction module 404 may compare the plurality of sets of candidate word strings, for example, S₁, S₂, …, S_n, with each other. The dictionary construction module 404 may further designate candidate word strings that are the same across different sets as the common candidate word strings. For example, if a candidate Chinese word string “派单少”, which means that few orders are dispatched, occurs in at least two sets of candidate word strings, the candidate word string “派单少” may be designated as a common candidate word string. In some embodiments, the common candidate word strings may form a candidate dictionary.
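One way to picture the comparison in operation 606 is to count, for each candidate word string, the number of sets in which it occurs. The sketch below assumes that "common" means "present in at least `min_sets` sets", with `min_sets` defaulting to two as in the example above; the function name and the sample sets are hypothetical.

```python
from collections import Counter
from typing import Iterable, Set


def common_candidates(candidate_sets: Iterable[Set[str]], min_sets: int = 2) -> Set[str]:
    """Return the word strings that occur in at least `min_sets` candidate sets."""
    set_counts: Counter = Counter()
    for candidate_set in candidate_sets:
        # Each set contributes at most one count per word string.
        set_counts.update(set(candidate_set))
    return {word for word, count in set_counts.items() if count >= min_sets}


# Hypothetical candidate sets S1, S2, S3 for three drivers.
sets = [
    {"派单", "少", "派单少"},
    {"派单少", "太少"},
    {"乘客", "少"},
]
candidate_dictionary = common_candidates(sets)
print(candidate_dictionary)
```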

In 608, the processing device (e.g., the dictionary construction module 404 of the processing device 112) may construct the specialized dictionary by measuring the common candidate word strings based on a measurement indicator.

In some embodiments, at least one measurement indicator may describe attributes of the common candidate word strings in the candidate dictionary. Exemplary measurement indicators may include a term frequency, a collocation, a degree of freedom, or the like, or any combination thereof. The processing device 112 may determine the specialized dictionary based on the at least one measurement indicator.

In some embodiments, the processing device 112 may determine the term frequency according to Equation (1) as follows:

where p_i denotes the term frequency of a word string, W_i denotes the number (count) of occurrences of the word string in the candidate dictionary, M denotes the number (count) of the candidate word strings in the candidate dictionary, and i = 1, 2, …, M.

In some embodiments, the processing device 112 may determine the collocation of the word string according to Equation (2) as follows:

where Co denotes the collocation of a word string, p_i denotes the term frequency of the word string, and P_i,j denotes the term frequency of a sub word string occurring in the candidate dictionary. A sub word string is at least a part of the word string. For example, for the Chinese word string “派单少”, “派” may be designated as a first sub word string, and “单少” may be designated as a second sub word string.

In some embodiments, the processing device 112 may determine the degree of freedom of the word string according to Equations (3) and (4) as follows:

$$H(U) = -\sum_{i} p_i \log p_i, \quad (3)$$

where H(U) denotes an information entropy of a word string, and p_i denotes a term frequency of a word string.

$$fr = \min\{H(U)_1, H(U)_2, \ldots, H(U)_n\}, \quad (4)$$

where fr denotes the degree of freedom of a word string, and {H(U)₁, H(U)₂, …, H(U)_n} denotes the set of information entropies of the left and right adjacent words.
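As a hedged illustration of the degree of freedom, the sketch below computes the information entropy of the characters that appear immediately to the left and to the right of a word string, and takes the smaller of the two values as in Equation (4). The use of the standard Shannon entropy with a natural logarithm, the handling of occurrences at a text boundary, and the sample texts are all assumptions of this example.

```python
import math
from collections import Counter
from typing import Iterable


def entropy(counts: Counter) -> float:
    """Shannon entropy (natural log) of a frequency table."""
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def degree_of_freedom(word: str, texts: Iterable[str]) -> float:
    """Minimum of the left-neighbor and right-neighbor entropies of `word`."""
    left: Counter = Counter()
    right: Counter = Counter()
    for text in texts:
        start = text.find(word)
        while start != -1:
            end = start + len(word)
            if start > 0:
                left[text[start - 1]] += 1
            if end < len(text):
                right[text[end]] += 1
            start = text.find(word, start + 1)
    # Assumption: a word string with no observed neighbor on one side
    # is treated as having zero freedom.
    if not left or not right:
        return 0.0
    return min(entropy(left), entropy(right))


# Hypothetical driver messages containing the word string "派单少".
texts = ["今天派单少啊", "最近派单少了", "为啥派单少啊"]
fr = degree_of_freedom("派单少", texts)
print(round(fr, 4))
```

A word string that is preceded and followed by many different characters has a high degree of freedom, which suggests that it behaves as an independent term rather than a fragment of a longer one.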

In some embodiments, the processing device 112 may preset a first threshold regarding the term frequency, a second threshold regarding the collocation, and a third threshold regarding the degree of freedom. In some embodiments, when the term frequency of a word string is greater than or equal to the first threshold, the word string may be added to the specialized dictionary. In some embodiments, when the collocation of a word string is greater than or equal to the second threshold, the word string may be added to the specialized dictionary. In some embodiments, when the degree of freedom of a word string is greater than or equal to the third threshold, the word string may be added to the specialized dictionary. In some embodiments, each word string in the specialized dictionary needs to satisfy at least one of the three measurement indicators.
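The threshold rule above may be sketched as a simple filter. The indicator values and thresholds below are made up for illustration, and the function name `build_specialized_dictionary` is hypothetical; the sketch only shows the "at least one indicator satisfied" embodiment.

```python
from typing import Dict, Set, Tuple

# (term frequency, collocation, degree of freedom) for each word string.
Indicators = Dict[str, Tuple[float, float, float]]


def build_specialized_dictionary(
    indicators: Indicators,
    tf_threshold: float,
    co_threshold: float,
    fr_threshold: float,
) -> Set[str]:
    """Keep a word string if at least one indicator reaches its threshold."""
    selected: Set[str] = set()
    for word, (tf, co, fr) in indicators.items():
        if tf >= tf_threshold or co >= co_threshold or fr >= fr_threshold:
            selected.add(word)
    return selected


# Hypothetical indicator values for three common candidate word strings.
indicators = {
    "派单少": (0.020, 3.5, 0.9),
    "单太": (0.001, 0.4, 0.1),
    "太少": (0.015, 0.8, 0.2),
}
specialized = build_specialized_dictionary(
    indicators, tf_threshold=0.01, co_threshold=2.0, fr_threshold=0.5
)
print(specialized)
```

A stricter embodiment that requires all three indicators could replace `or` with `and`; the source does not state which combination is preferred.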

In some embodiments, the processing device 112 may label one or more word strings (e.g., emotion words) having an emotion indication in the specialized dictionary. For example, the labeled emotion words may be classified into a sub-dictionary of the specialized dictionary. The emotion words may contribute to the sentiment estimation.

In some embodiments, the processing device 112 may update the specialized dictionary based on updated interaction information between the service provider and the service system. For example, the dictionary construction module 404 may add new word strings to the specialized dictionary by performing operations 602-608. In some embodiments, the processing device 112 may construct specialized dictionaries corresponding to various service scenarios. The constructed specialized dictionary may be used for the sentiment estimation of the service providers in the corresponding service scenario.

It should be noted that the above description of the process 600 is merely provided for the purposes of illustration, and not intended to limit the scope of the present disclosure. For persons having ordinary skills in the art, multiple variations and modifications may be made under the teachings of the present disclosure. For example, operation 606 and operation 608 may be integrated into a single operation. However, those variations and modifications do not depart from the scope of the present disclosure.

FIG. 7 is a flowchart illustrating an exemplary process for determining a predictive model according to some embodiments of the present disclosure. In some embodiments, process 700 may be implemented in the SA system 100. For example, the process 700 may be stored in the storage device 140 and/or the storage (e.g., the ROM 230, the RAM 240, or the storage 390) in the form of instructions, and invoked and/or executed by the server 110 (e.g., the processing device 112 of the server 110, or the processor 220 of the computing device 200). The operations of the illustrated process presented below are intended to be illustrative. In some embodiments, the process 700 may be accomplished with one or more additional operations not described and/or without one or more of the operations herein discussed. Additionally, the order in which the operations of the process are illustrated in FIG. 7 and described below is not intended to be limiting.

In 702, the processing device (e.g., the model training module 406 of the processing device 112) may obtain a set of training data including the labeled historical interaction information.

In some embodiments, the historical interaction information may include historical dialogue contents between a plurality of users and the service system. The user may be the driver, and the service system may be the on-demand transportation system. In some embodiments, the historical dialogue contents may be in the form of text data. If the original dialogue contents are audio data, the audio data may need to be converted to text data. In some embodiments, the historical dialogue contents may include the historical dialogues that occurred in a predetermined period, for example, a week, a month, a year, etc. The historical dialogue contents may be stored in a storage device (e.g., the storage device 140). In some embodiments, each of the historical dialogue contents may be assigned a label indicating a sentiment class (e.g., a positive mood, a negative mood, or a neutral mood). The labeled historical dialogue contents may be designated as the training data.

In 704, the processing device (e.g., the model training module 406 of the processing device 112) may generate, based on the dictionary regarding the service scenario, a plurality of word strings for each text in the set of training data (hereinafter referred to as the training set).

In some embodiments, the dictionary may be the specialized dictionary regarding the transportation service. As described in connection with FIG. 6, the dictionary construction module 404 may construct the specialized dictionary. In some embodiments, the processing device 112 may perform a word segmentation operation based on the specialized dictionary, and generate the plurality of word strings for each text in the training set. The key word string refers to a word string indicating an emotional tendency. For example, the model training module 406 may divide each text into a plurality of semantic units. The semantic unit may be a sentence. For each sentence, the model training module 406 may perform the word segmentation based on the specialized dictionary. Let the sentence be S1. The model training module 406 may set a maximum segmentation length. Let the maximum segmentation length be MaxLen. The model training module 406 may segment S1 into candidate word strings of at most MaxLen characters according to a segmentation sequence (e.g., left to right). If a segmented candidate word string is included in the specialized dictionary, the model training module 406 may output the segmented candidate word string. Otherwise, the model training module 406 may not output the segmented candidate word string.

In some embodiments, the processing device 112 may generate the one or more word strings based on a dictionary-based word segmentation algorithm. Exemplary dictionary-based word segmentation algorithms may include Maximum Matching, Reverse Maximum Matching, Minimum Matching, Reverse Minimum Matching, Bidirectional Maximum Matching, Bidirectional Minimum Matching, Bidirectional Maximum Minimum Matching, Full Segmentation, Minimal Word Count, MaxNgram Score, or the like, or any combination thereof.
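Merely as a sketch of one of the listed algorithms, the following example applies forward Maximum Matching with a maximum segmentation length MaxLen. The function name and the small dictionary are assumptions for illustration; characters that do not start any dictionary entry are skipped, in line with the rule that a segmented candidate word string not included in the specialized dictionary is not output.

```python
from typing import List, Set


def forward_maximum_matching(sentence: str, dictionary: Set[str], max_len: int = 3) -> List[str]:
    """Greedy left-to-right segmentation that prefers the longest dictionary match."""
    word_strings: List[str] = []
    position = 0
    while position < len(sentence):
        matched = False
        longest = min(max_len, len(sentence) - position)
        # Try the longest candidate first, then shrink it one character at a time.
        for length in range(longest, 0, -1):
            piece = sentence[position:position + length]
            if piece in dictionary:
                word_strings.append(piece)
                position += length
                matched = True
                break
        if not matched:
            # No dictionary entry starts here; drop the character and move on.
            position += 1
    return word_strings


# Hypothetical excerpt of a specialized dictionary for the transportation service.
specialized_dictionary = {"真讨厌", "派单", "太少"}
segments = forward_maximum_matching("真讨厌派单太少", specialized_dictionary, max_len=3)
print(segments)
```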

In 706, the processing device (e.g., the model training module 406 of the processing device 112) may generate word vectors corresponding to the plurality of word strings.

In some embodiments, each of the plurality of word strings needs to be converted to a word vector. The processing device 112 may determine the word vector by using one or more algorithms. The one or more algorithms may include, but are not limited to, a TF-IDF (term frequency–inverse document frequency) algorithm, a BOW (Bag-of-words) algorithm, a One-Hot algorithm, a word2vec algorithm, a GloVe algorithm, etc. For example, the model training module 406 may determine the word vector corresponding to the word string based on the word2vec algorithm.
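Among the listed algorithms, One-Hot encoding is the simplest to show in a few lines. The sketch below is only an illustration of that option, not of the word2vec embodiment mentioned above; the function name and the ordering of the vocabulary are assumptions.

```python
from typing import Dict, Iterable, List


def one_hot_vectors(word_strings: Iterable[str]) -> Dict[str, List[int]]:
    """Map each distinct word string to a vector with a single 1."""
    # Assumption: the vocabulary is ordered by sorting the word strings.
    vocabulary = sorted(set(word_strings))
    vectors: Dict[str, List[int]] = {}
    for index, word in enumerate(vocabulary):
        vector = [0] * len(vocabulary)
        vector[index] = 1
        vectors[word] = vector
    return vectors


vectors = one_hot_vectors(["真讨厌", "派单", "太少"])
for word, vector in vectors.items():
    print(word, vector)
```

One-Hot vectors grow with the vocabulary and carry no notion of similarity between word strings, which is one reason a dense embedding such as word2vec may be preferred in practice.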

In 708, the processing device (e.g., the model training module 406 of the processing device 112) may train an initial model based on the word vectors.

In some embodiments, the processing device 112 may configure the initial model. The initial model may be a machine learning model. Exemplary models may include, but are not limited to, a deep neural network (DNN) model, a long short-term memory (LSTM) model, a convolutional neural network (CNN) model, etc. The word vectors may be taken as inputs of the initial model. During the training, the model training module 406 may iteratively update parameters by minimizing a loss function of the initial model. The loss function may measure how far an output solution is from an optimal solution. In some embodiments, the loss function may include a square loss function, a logistic loss function, or the like, or any combination thereof. In some embodiments, the loss function may also include a regularization term, for example, an L1 norm or an L2 norm. For example, the loss function may be a combination of a square loss function and the regularization term. As another example, the loss function may be a combination of a logistic loss function and the regularization term. The model training module 406 may optimize a training loss of the loss function to generate a predictive model. In each training round (or each iteration), the model training module 406 may update the parameters of the model by using a stochastic gradient descent (SGD) algorithm.
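To make the combination of a logistic loss, an L2 regularization term, and SGD updates concrete, the following sketch trains a single-layer binary classifier on toy data. It is a deliberately reduced stand-in for the DNN, LSTM, or CNN models named above; the two-class setup, the feature vectors, the learning rate, and all function names are assumptions.

```python
import math
import random
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[float], int]


def predict(weights: Sequence[float], bias: float, x: Sequence[float]) -> float:
    """Sigmoid of the weighted sum of the inputs plus the bias."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def regularized_loss(weights, bias, samples: List[Sample], l2: float) -> float:
    """Mean logistic loss plus an L2 penalty on the weights."""
    data_loss = 0.0
    for x, y in samples:
        p = min(max(predict(weights, bias, x), 1e-12), 1.0 - 1e-12)
        data_loss -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return data_loss / len(samples) + l2 * sum(w * w for w in weights)


def sgd_train(samples: List[Sample], dim: int, lr=0.1, l2=0.01, rounds=50, seed=0):
    """Update the parameters one sample at a time; record the loss per round."""
    rng = random.Random(seed)
    weights, bias, losses = [0.0] * dim, 0.0, []
    for _ in range(rounds):
        order = list(samples)
        rng.shuffle(order)
        for x, y in order:
            error = predict(weights, bias, x) - y
            # Gradient of the logistic loss plus gradient of l2 * ||w||^2.
            weights = [w - lr * (error * xi + 2.0 * l2 * w) for w, xi in zip(weights, x)]
            bias -= lr * error
        losses.append(regularized_loss(weights, bias, samples, l2))
    return weights, bias, losses


# Toy 2-dimensional features; label 1 = negative mood, 0 = otherwise (hypothetical).
samples = [([1.0, 0.0], 1), ([0.9, 0.1], 1), ([0.0, 1.0], 0), ([0.1, 0.8], 0)]
weights, bias, losses = sgd_train(samples, dim=2)
print(losses[0], losses[-1])
```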

In 710, the processing device (e.g., the model training module 406 of the processing device 112) may determine a predictive model.

In some embodiments, when the training loss of the loss function is less than or equal to a threshold, the model training module 406 may terminate the training, and determine the current model as the optimal predictive model. In other words, the parameters of the current model may be designated as the parameters of the optimal predictive model. In some embodiments, when the training loss of the loss function converges, for example, when the training loss remains constant, the model training module 406 may terminate the training, and determine the current model as the optimal predictive model. In some embodiments, when the number of training rounds (or the count of iterations) is equal to a maximum value (e.g., 50, 100, 150, etc.), the model training module 406 may also terminate the training, and determine the current model as the optimal predictive model. It should be noted that an accuracy of the predictive model may be equal to or greater than an accuracy threshold (e.g., 80%, 85%, 90%, etc.). The accuracy of the predictive model may be measured by verifying a test set. The test set is similar to the training set. The test set may include labeled historical interaction information. In the verification of the test set, if the accuracy of the predictive model does not satisfy the accuracy threshold, the model training module 406 may continue to train the model by adjusting parameters of the predictive model until the accuracy is equal to or greater than the accuracy threshold.
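The three termination conditions above may be combined in a small helper such as the one sketched below. The function name, the tolerance used to decide that the loss "remains constant", and the number of rounds inspected (`patience`) are assumptions of this example.

```python
from typing import List, Optional


def stop_reason(
    losses: List[float],
    loss_threshold: float,
    max_rounds: int,
    tolerance: float = 1e-6,
    patience: int = 3,
) -> Optional[str]:
    """Return why training should stop, or None if it should continue."""
    if not losses:
        return None
    # Condition 1: the training loss is at or below the threshold.
    if losses[-1] <= loss_threshold:
        return "loss_threshold"
    # Condition 2: the maximum number of training rounds is reached.
    if len(losses) >= max_rounds:
        return "max_rounds"
    # Condition 3: the loss has stayed (nearly) constant for `patience` rounds.
    if len(losses) > patience and all(
        abs(losses[-k] - losses[-k - 1]) < tolerance for k in range(1, patience + 1)
    ):
        return "converged"
    return None


print(stop_reason([0.9, 0.5, 0.04], loss_threshold=0.05, max_rounds=100))
print(stop_reason([0.5] * 5, loss_threshold=0.05, max_rounds=100))
print(stop_reason([0.9, 0.8], loss_threshold=0.05, max_rounds=2))
```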

Merely for illustration, the initial model may be the CNN model. FIG. 8 illustrates an exemplary structure of the CNN model according to some embodiments of the present disclosure. As shown in FIG. 8, CNN 806 may include a plurality of layers. For illustration purposes, layer 806a, layer 806b, …, layer 806e, and layer 806f are shown in FIG. 8. The plurality of layers may include one or more convolutional layers, one or more pooling layers, and one or more connection layers. For example, layers 806a and 806b may be the convolutional layers, layer 806e may be the pooling layer, and layer 806f may be a full connection layer. In some embodiments, the numbers of the convolutional layers, the pooling layers, and the connection layers are not limited, and depend on the structure of the initial CNN model. The processing device 112 may design the structure of the initial CNN model, for example, set the number of the CNN layers.

In each layer, there may be a plurality of neural units (not shown in FIG. 8). The number of neural units is not limited. The plurality of neural units may be configured to process word vectors (e.g., word vectors 804). For example, the neural unit may output a value according to Equation (5) as follows:

$$f_{output} = f\left(\sum_{i} w_i x_i + b\right), \quad (5)$$

where f_output denotes an output value of a neural unit, f(·) denotes an activation function, w_i denotes a weight corresponding to an element of an input vector, x_i denotes an element of the input vector, and b denotes a bias term corresponding to the input vector. The weights and the bias terms may be parameters of the CNN model. The weights and the bias terms may be updated based on the SGD algorithm. It should be noted that the plurality of neural units in different layers may be the same or different. In some embodiments, the activation function may include a sigmoid function, a tanh function, a ReLU function, an ELU function, a PReLU function, or the like, or any combination thereof. In the CNN model, outputs of a current layer may be taken as the inputs of a next layer.
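Equation (5) can be checked numerically with a few lines of code. The sketch below evaluates one neural unit with either a ReLU or a tanh activation; the weights, inputs, and bias are arbitrary illustrative values, and the function names are assumptions.

```python
import math
from typing import Callable, Sequence


def relu(z: float) -> float:
    """Rectified linear unit: pass positive values, clamp negatives to zero."""
    return max(0.0, z)


def neural_unit_output(
    weights: Sequence[float],
    inputs: Sequence[float],
    bias: float,
    activation: Callable[[float], float] = relu,
) -> float:
    """Apply the activation to the weighted sum of the inputs plus the bias."""
    if len(weights) != len(inputs):
        raise ValueError("weights and inputs must have the same length")
    weighted_sum = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(weighted_sum)


weights = [0.5, -1.0, 2.0]
inputs = [1.0, 2.0, 0.5]
bias = 0.1

# Weighted sum: 0.5*1.0 - 1.0*2.0 + 2.0*0.5 + 0.1 = -0.4
relu_output = neural_unit_output(weights, inputs, bias, activation=relu)
tanh_output = neural_unit_output(weights, inputs, bias, activation=math.tanh)
print(relu_output, tanh_output)
```

Because the weighted sum is negative here, the ReLU unit outputs zero while the tanh unit outputs a negative value, which illustrates how the choice of activation function changes what a layer passes on.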

Referring to FIG. 8, a sentence 802 to be trained on, “真讨厌派单太少”, means “I hate the lack of orders”. The sentence 802 is from a text included in the training set. The sentence 802 may be segmented into multiple word strings based on the dictionary regarding the transportation service. For example, the segmentation result is {真讨厌, 派单, 太少}, which includes three Chinese word strings. Each word string may be converted to a corresponding word vector 804. The word vector 804 of the word string may be a matrix having an M×N dimension.

The word vector 804 may be taken as an input of the convolutional layer 806a. In some embodiments, the processing device 112 may perform a convolutional operation on the inputs based on one or more filters. The filter may also be referred to as a convolutional kernel. Each neural unit in the convolutional layer 806a may output a value according to Equation (5). The outputs of the convolutional layer 806a may be taken as the inputs of the convolutional layer 806b. For the convolutional layer 806b and other convolutional layers, similar convolutional operations may be performed, and are not described again. Neural units in the pooling layer 806e (also referred to as pooling units) may perform a pooling operation on the inputs, such as max pooling, average pooling, or L2-norm pooling. Neural units in the connection layer 806f may have full connections to all activations in the previous layer (e.g., the layer 806e). After the connection layer 806f, output(s) 808 may be generated by performing a softmax function. In some embodiments, the output(s) 808 may be a prediction score indicating the sentiment class. The processing device 112 may output a label indicating the sentiment class based on the prediction score.
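The data flow from word vectors to a sentiment label may be sketched with a reduced network: one convolutional layer, max pooling over the sentence, one full connection layer, and a softmax. This is a simplification of the structure in FIG. 8, which shows more layers; the dimensions, the randomly initialized (untrained) parameters, and all function names are assumptions.

```python
import math
import random
from typing import List

Matrix = List[List[float]]


def convolve(word_vectors: Matrix, kernel: Matrix, bias: float) -> List[float]:
    """Slide one kernel over consecutive word vectors and apply ReLU."""
    width = len(kernel)
    outputs = []
    for start in range(len(word_vectors) - width + 1):
        z = bias
        for row, kernel_row in zip(word_vectors[start:start + width], kernel):
            z += sum(w * x for w, x in zip(kernel_row, row))
        outputs.append(max(0.0, z))
    return outputs


def softmax(scores: List[float]) -> List[float]:
    """Convert raw scores into probabilities that sum to one."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cnn_forward(word_vectors, kernels, kernel_biases, dense_weights, dense_biases):
    """Convolution, max pooling per kernel, full connection, then softmax."""
    pooled = [max(convolve(word_vectors, k, b)) for k, b in zip(kernels, kernel_biases)]
    scores = [
        sum(w * p for w, p in zip(row, pooled)) + b
        for row, b in zip(dense_weights, dense_biases)
    ]
    return softmax(scores)


rng = random.Random(0)
rand = lambda: rng.uniform(-1.0, 1.0)

# Three word strings (e.g., 真讨厌, 派单, 太少), each a 4-dimensional vector.
word_vectors = [[rand() for _ in range(4)] for _ in range(3)]
kernels = [[[rand() for _ in range(4)] for _ in range(2)] for _ in range(2)]
kernel_biases = [0.0, 0.0]
dense_weights = [[rand() for _ in range(2)] for _ in range(3)]
dense_biases = [0.0, 0.0, 0.0]

labels = ["positive", "negative", "neutral"]
scores = cnn_forward(word_vectors, kernels, kernel_biases, dense_weights, dense_biases)
predicted_label = labels[scores.index(max(scores))]
print(scores, predicted_label)
```

Since the parameters here are random rather than trained, the predicted label is not meaningful; the example only shows how the prediction scores and the label are derived from the layer outputs.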

It should be noted that the above description of the CNN model is merely provided for the purposes of illustration, and not intended to limit the scope of the present disclosure. For persons having ordinary skills in the art, multiple variations and modifications may be made under the teachings of the present disclosure. For example, various machine learning models (e.g., the DNN model) may be applied to the predictive model for estimating the sentiment class based on the interaction information. However, those variations and modifications do not depart from the scope of the present disclosure.

Having thus described the basic concepts, it may be rather apparent to those skilled in the art after reading this detailed disclosure that the foregoing detailed disclosure is intended to be presented by way of example only and is not limiting. Various alterations, improvements, and modifications may occur to those skilled in the art, though not expressly stated herein. These alterations, improvements, and modifications are intended to be suggested by this disclosure, and are within the spirit and scope of the exemplary embodiments of this disclosure.

Moreover, certain terminology has been used to describe embodiments of the present disclosure. For example, the terms “one embodiment,” “an embodiment,” and “some embodiments” mean that a particular feature, structure or characteristic described in connection with the embodiment is included in at least one embodiment of the present disclosure. Therefore, it is emphasized and should be appreciated that two or more references to “an embodiment” or “one embodiment” or “an alternative embodiment” in various portions of this specification are not necessarily all referring to the same embodiment. Furthermore, the particular features, structures or characteristics may be combined as suitable in one or more embodiments of the present disclosure.

Further, it will be appreciated by one skilled in the art that aspects of the present disclosure may be illustrated and described herein in any of a number of patentable classes or contexts including any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof. Accordingly, aspects of the present disclosure may be implemented entirely in hardware, entirely in software (including firmware, resident software, micro-code, etc.), or in an implementation combining software and hardware, all of which may generally be referred to herein as a “module,” “unit,” “component,” “device,” or “system.” Furthermore, aspects of the present disclosure may take the form of a computer program product embodied in one or more computer-readable media having computer readable program code embodied thereon. The one or more computer-readable media may include ROM, RAM, magnetic disk, optical disk, or the like, or any combination thereof.

A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated signal may take any of a variety of forms, including electro-magnetic, optical, or the like, or any suitable combination thereof. A computer readable signal medium may be any computer readable medium that is not a computer readable storage medium and that may communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable signal medium may be transmitted using any appropriate medium, including wireless, wireline, optical fiber cable, RF, or the like, or any suitable combination of the foregoing.

Computer program code for carrying out operations for aspects of the present disclosure may be written in any combination of one or more programming languages, including an object-oriented programming language such as Java, Scala, Smalltalk, Eiffel, JADE, Emerald, C++, C#, VB.NET, Python or the like, conventional procedural programming languages, such as the "C" programming language, Visual Basic, Fortran 2003, Perl, COBOL 2002, PHP, ABAP, dynamic programming languages such as Python, Ruby, and Groovy, or other programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider) or in a cloud computing environment or offered as a service such as a Software as a Service (SaaS).

Furthermore, the recited order of processing elements or sequences, or the use of numbers, letters, or other designations, therefore, is not intended to limit the claimed processes and methods to any order except as may be specified in the claims. Although the above disclosure discusses through various examples what is currently considered to be a variety of useful embodiments of the disclosure, it is to be understood that such detail is solely for that purpose and that the appended claims are not limited to the disclosed embodiments, but, on the contrary, are intended to cover modifications and equivalent arrangements that are within the spirit and scope of the disclosed embodiments. For example, although the implementation of various components described above may be embodied in a hardware device, it may also be implemented as a software-only solution, e.g., an installation on an existing server or mobile device.

Similarly, it should be appreciated that in the foregoing description of embodiments of the present disclosure, various features are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various embodiments. This method of disclosure, however, is not to be interpreted as reflecting an intention that the claimed subject matter requires more features than are expressly recited in each claim. Rather, claimed subject matter may lie in less than all features of a single foregoing disclosed embodiment.

Claims

A system for analyzing sentiment on an online to offline service, comprising:

at least one storage device including a set of instructions; and

at least one processor in communication with the at least one storage device, when executing the set of instructions, the at least one processor is configured to cause the system to:

obtain, via a user terminal, current interaction information between a user and a service system;

determine a sentiment estimation using a predictive model that processes the current interaction information, wherein the predictive model is generated, based on labeled historical interaction information and a dictionary regarding a service scenario provided by the service system, by training an initial model; and

determine one or more key word strings indicating the sentiment estimation based on the current interaction information and the dictionary regarding the service scenario.
The system of claim 1, wherein the sentiment estimation includes at least one of: a label indicating a positive mood, a label indicating a negative mood, or a label indicating a neutral mood.
The system of claim 2, wherein the at least one processor is further configured to cause the system to:

if the sentiment estimation is the negative mood, send a reminder signal for easing the negative mood to the user terminal.
The system of any one of claims 1-3, wherein the at least one processor is further configured to cause the system to:

determine a corresponding service strategy for the user in response to the determined sentiment estimation.
The system of any one of claims 1-4, wherein the at least one processor is further configured to cause the system to construct the dictionary regarding the service scenario;

wherein for constructing the dictionary, the at least one processor is further configured to cause the system to:

obtain a plurality of sets of interaction information regarding the service scenario, each set of interaction information corresponding to an individual user, and the interaction information being converted to a text including one or more words;

determine one or more candidate word strings for each set of interaction information;

determine common candidate word strings based on the one or more candidate word strings from the plurality of sets of interaction information; and

construct the dictionary by measuring the common candidate word strings based on a measurement indicator.
The system of claim 5, wherein to determine the one or more candidate word strings for each set of interaction information, the at least one processor is further configured to cause the system to:

perform a word segmentation operation to determine the one or more candidate word strings for each set of interaction information.
The system of claim 6, wherein to perform the word segmentation operation, the at least one processor is further configured to cause the system to:

set a segmentation step for the text corresponding to the interaction information; and

determine the one or more candidate word strings by segmenting the text in the segmentation step.
The system of any one of claims 1-7, wherein for generating the predictive model, the at least one processor is further configured to cause the system to:

obtain a set of training data including the labeled historical interaction information;

generate, based on the dictionary regarding the service scenario, a plurality of word strings for each text in the set of training data;

generate word vectors corresponding to the plurality of word strings;

train the initial model based on the word vectors, the training including:

update parameters of the initial model by minimizing a loss function of the initial model; and

determine the predictive model if the value of the loss function is less than or equal to a threshold.
The system of claim 8, wherein a stochastic gradient descent algorithm is used to update the parameters.
The system of any one of claims 1-9, wherein the predictive model includes a convolutional neural network (CNN) model.
The system of any one of claims 1-9, wherein the user is a driver and the service system is a transportation service system.
A method for analyzing sentiment on an online to offline service, the method implemented on a computing device having at least one processor and at least one computer-readable storage medium, the method comprising:

obtaining, via a user terminal, current interaction information between a user and a service system;

determining a sentiment estimation using a predictive model that processes the current interaction information, wherein the predictive model is generated, based on labeled historical interaction information and a dictionary regarding a service scenario provided by the service system, by training an initial model; and

determining one or more key word strings indicating the sentiment estimation based on the current interaction information and the dictionary regarding the service scenario.
The method of claim 12, wherein the sentiment estimation includes at least one of: a label indicating a positive mood, a label indicating a negative mood, or a label indicating a neutral mood.
The method of claim 13, the method further comprising:

if the sentiment estimation is the negative mood, sending a reminder signal for easing the negative mood to the user terminal.
The method of any one of claims 12-14, the method further comprising:

determining a corresponding service strategy for the user in response to the determined sentiment estimation.
The method of any one of claims 12-15, the method further comprising:

constructing the dictionary regarding the service scenario;

wherein for constructing the dictionary, the method further includes:

obtaining a plurality of sets of interaction information regarding the service scenario, each set of interaction information corresponding to an individual user, and the interaction information being converted to a text including one or more words;

determining one or more candidate word strings for each set of interaction information;

determining common candidate word strings based on the one or more candidate word strings from the plurality of sets of interaction information; and

constructing the dictionary by measuring the common candidate word strings based on a measurement indicator.
The method of claim 16, wherein the determining the one or more candidate word strings for each set of interaction information includes:

performing a word segmentation operation to determine the one or more candidate word strings for each set of interaction information.
The method of claim 17, wherein the performing the word segmentation operation includes:

setting a segmentation step for the text corresponding to the interaction information; and

determining the one or more candidate word strings by segmenting the text according to the segmentation step.
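Merely by way of example, the word segmentation operation may be sketched as a sliding window over the text. The sketch assumes, without support in the claims, that the segmentation step is the number of whitespace-separated tokens in each candidate word string; for languages written without spaces, the same idea could be applied to characters. The function name `segment` is hypothetical.

```python
def segment(text: str, step: int) -> list[str]:
    """Slide a window of `step` tokens over the text, one token at a time."""
    if step < 1:
        raise ValueError("segmentation step must be at least 1")
    tokens = text.split()
    return [
        " ".join(tokens[i:i + step])
        for i in range(len(tokens) - step + 1)
    ]


two_word_candidates = segment("the passenger was rude", step=2)
```

A text shorter than the segmentation step yields no candidate word strings in this sketch.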
The method of any one of claims 12-18, wherein for generating the predictive model, the method further includes:

obtaining a set of training data including the labeled historical interaction information;

generating, based on the dictionary regarding the service scenario, a plurality of word strings for each text in the set of training data;

generating word vectors corresponding to the plurality of word strings;

training the initial model based on the word vectors, the training including:

updating parameters of the initial model by minimizing a loss function of the initial model; and

determining the predictive model if the value of the loss function is less than or equal to a threshold.
The method of claim 19, wherein a stochastic gradient descent algorithm is used to update the parameters.
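Merely by way of example, the training procedure may be sketched as a loop that updates parameters by stochastic gradient descent and stops once the loss is less than or equal to a threshold. To keep the sketch short, a logistic regression model stands in for the initial model, and count vectors over dictionary entries stand in for the word vectors; neither choice is taken from the claims. The label 1 is assumed to denote a negative mood, and all names, the learning rate, and the threshold value are hypothetical.

```python
import math
import random


def vectorize(text: str, vocab: list[str]) -> list[float]:
    """Count vector over dictionary entries (a stand-in for word vectors)."""
    tokens = text.lower().split()
    return [float(tokens.count(word)) for word in vocab]


def predict(weights: list[float], bias: float, x: list[float]) -> float:
    """Probability of label 1 under the logistic stand-in model."""
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))


def mean_loss(weights, bias, data) -> float:
    """Mean binary cross-entropy over the training data."""
    total = 0.0
    for x, y in data:
        p = min(max(predict(weights, bias, x), 1e-12), 1 - 1e-12)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(data)


def train(data, dim, lr=0.5, threshold=0.1, max_epochs=1000, seed=0):
    """Update parameters sample by sample until loss <= threshold."""
    rng = random.Random(seed)
    data = list(data)
    weights, bias = [0.0] * dim, 0.0
    for _ in range(max_epochs):
        rng.shuffle(data)
        for x, y in data:
            error = predict(weights, bias, x) - y  # gradient of loss w.r.t. z
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
        if mean_loss(weights, bias, data) <= threshold:
            break
    return weights, bias


vocab = ["rude", "thanks"]  # hypothetical dictionary entries
labeled = [("passenger was rude", 1), ("thanks for the ride", 0),
           ("very rude rider", 1), ("thanks a lot", 0)]
training_data = [(vectorize(text, vocab), label) for text, label in labeled]
weights, bias = train(training_data, dim=len(vocab))
```

The `max_epochs` cap is a safeguard added for the sketch so that the loop ends even if the threshold is never reached.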
The method of any one of claims 12-20, wherein the predictive model includes a convolutional neural network (CNN) model.
The method of any one of claims 12-20, wherein the user is a driver and the service system is a transportation service system.
A non-transitory computer readable medium, comprising at least one set of instructions for analyzing sentiment on an online to offline service, wherein when executed by at least one processor of a computing device, the at least one set of instructions causes the computing device to perform a method, the method comprising:

obtaining, via a user terminal, current interaction information between a user and a service system;

determining a sentiment estimation using a predictive model that processes the current interaction information, wherein the predictive model is generated, based on labeled historical interaction information and a dictionary regarding a service scenario provided by the service system, by training an initial model; and

determining one or more key word strings indicating the sentiment estimation based on the current interaction information and the dictionary regarding the service scenario.