WO2021217769A1 - Reply method based on emotion recognition, apparatus, computer device and storage medium - Google Patents

Reply method based on emotion recognition, apparatus, computer device and storage medium

Info

Publication number
WO2021217769A1
Authority
WO
WIPO (PCT)
Prior art keywords
information
text
text information
recognized
corpus
Prior art date
Application number
PCT/CN2020/092977
Other languages
English (en)
French (fr)
Inventor
叶怡周
胡宏伟
马骏
王少军
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2021217769A1

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 - Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 - Querying
    • G06F16/332 - Query formulation
    • G06F16/3329 - Natural language query formulation or dialogue systems
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 - Handling natural language data
    • G06F40/30 - Semantic analysis
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/049 - Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06Q - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 - Commerce
    • G06Q30/02 - Marketing; Price estimation or determination; Fundraising
    • G06Q30/0281 - Customer communication at a business location, e.g. providing product or service information, consulting

Definitions

  • This application relates to the field of artificial intelligence technology, belongs to application scenarios related to intelligent voice customer service interaction in smart cities, and in particular relates to a reply method, apparatus, computer device, and storage medium based on emotion recognition.
  • Customers may encounter various problems in the process of handling business. In that case, the customer can contact customer service staff by phone to obtain a corresponding solution, or send the question information that needs answering to customer service staff through the Internet to obtain a corresponding solution; the above methods are all based on manual customer service. With the rise and development of artificial intelligence, more and more companies use intelligent voice customer service instead of manual customer service to provide services to customers, and using intelligent voice customer service can significantly reduce the labor cost of enterprises.
  • However, the inventor realizes that existing intelligent voice customer service can only obtain corresponding reply information based on the question information raised by the customer, and cannot obtain other useful information from that question information, resulting in insufficient flexibility of current intelligent voice customer service; for example, it cannot flexibly adjust the reply information according to the emotion in the question information. Therefore, intelligent voice customer service in prior art methods has the problem of insufficient flexibility when replying to a customer's question information.
  • The embodiments of the application provide a response method, device, computer equipment, and storage medium based on emotion recognition, aiming to solve the problem of insufficient flexibility of intelligent voice customer service when replying to question information in existing technical methods.
  • In a first aspect, an embodiment of the present application provides a response method based on emotion recognition, which includes: if information to be identified from a user terminal is received, determining the information type of the information to be identified, the information type including text information and voice information; if the to-be-recognized information is voice information, recognizing the voice information according to a pre-stored voice recognition model to obtain target text information corresponding to the to-be-recognized information; if the information to be recognized is text information, using the information to be recognized as target text information; obtaining a text feature vector corresponding to the target text information according to a pre-stored word processing rule; inputting the text feature vector into a pre-stored emotion recognition model to obtain the emotion level corresponding to the text feature vector; and according to the emotion level and the target text information, obtaining matching corpus information from a pre-stored corpus information database as reply corpus information and feeding it back to the user terminal.
  • In a second aspect, an embodiment of the present application provides a response device based on emotion recognition, which includes:
  • the to-be-identified information judging unit is configured to determine the information type of the to-be-identified information if the information to be identified from the user terminal is received, and the information type includes text information and voice information;
  • the to-be-recognized information recognition unit is configured to, if the to-be-recognized information is voice information, recognize the voice information according to a pre-stored voice recognition model to obtain target text information corresponding to the to-be-recognized information;
  • the target text information acquiring unit is configured to, if the to-be-recognized information is text information, use the to-be-recognized information as target text information;
  • a text feature vector obtaining unit configured to obtain a text feature vector corresponding to the target text information according to a pre-stored word processing rule;
  • An emotion level obtaining unit configured to input the text feature vector into a pre-stored emotion recognition model to obtain the emotion level corresponding to the text feature vector;
  • the response corpus information acquisition unit is configured to acquire matching corpus information from a pre-stored corpus information database as the response corpus information according to the emotion level and the target text information and feed it back to the user terminal.
  • In a third aspect, an embodiment of the present application provides a computer device, which includes a memory, a processor, and a computer program stored on the memory and runnable on the processor, and the processor implements the following when executing the computer program: if information to be identified from a user terminal is received, determining the information type of the information to be identified, the information type including text information and voice information; if the to-be-recognized information is voice information, recognizing the voice information according to a pre-stored voice recognition model to obtain target text information corresponding to the to-be-recognized information; if the information to be recognized is text information, using the information to be recognized as target text information; obtaining a text feature vector corresponding to the target text information according to a pre-stored word processing rule; inputting the text feature vector into a pre-stored emotion recognition model to obtain the emotion level corresponding to the text feature vector; and according to the emotion level and the target text information, obtaining matching corpus information from a pre-stored corpus information database as reply corpus information and feeding it back to the user terminal.
  • In a fourth aspect, an embodiment of the present application also provides a computer-readable storage medium, wherein the computer-readable storage medium stores a computer program, and when the computer program is executed by a processor, the processor executes the response method based on emotion recognition described in the first aspect above.
  • The embodiments of the application can obtain the emotion level matching the information to be identified, flexibly adjust the reply corpus information according to the emotion level in the information to be identified, and thereby improve the flexibility of replying to question information.
  • FIG. 1 is a schematic flowchart of a response method based on emotion recognition provided by an embodiment of this application;
  • FIG. 2 is a schematic diagram of an application scenario of the response method based on emotion recognition provided by an embodiment of this application;
  • FIG. 3 is a schematic diagram of a sub-flow of the response method based on emotion recognition provided by an embodiment of this application;
  • FIG. 4 is a schematic diagram of another sub-flow of the response method based on emotion recognition provided by an embodiment of this application;
  • FIG. 5 is a schematic diagram of another sub-flow of the response method based on emotion recognition provided by an embodiment of this application;
  • FIG. 6 is a schematic diagram of another sub-flow of the response method based on emotion recognition provided by an embodiment of this application;
  • FIG. 7 is a schematic diagram of another sub-flow of the response method based on emotion recognition provided by an embodiment of this application;
  • FIG. 8 is a schematic diagram of another flow of the response method based on emotion recognition provided by an embodiment of this application;
  • FIG. 9 is a schematic block diagram of a reply device based on emotion recognition provided by an embodiment of this application;
  • FIG. 10 is a schematic block diagram of a computer device provided by an embodiment of this application.
  • The embodiments of this application can be applied to predictive analysis and robotics in the field of artificial intelligence, such as predicting the follow-up behavior of users through emotion recognition, or applying the emotion-recognition-based response method to robots to improve their practicability; the specific application is determined by the actual scenario and is not restricted here.
  • FIG. 1 is a schematic flowchart of a response method based on emotion recognition provided by an embodiment of the present application
  • FIG. 2 is a schematic diagram of an application scenario of the response method based on emotion recognition provided by an embodiment of this application.
  • the response method based on emotion recognition is applied to the management server 10.
  • the method is executed by application software installed in the management server 10.
  • the management server 10 establishes a network connection with the user terminal 20 to communicate with the user terminal 20.
  • The user of the user terminal 20 can send the information to be identified to the management server 10 through the user terminal 20.
  • the information to be identified can be the question information sent by the user that needs to be answered, and the information to be identified can be used as the basis for characterizing the user’s true intentions.
  • the management server 10 executes the response method based on emotion recognition to obtain the response corpus information corresponding to the information to be identified and feed it back to the corresponding user terminal 20 to complete the response.
  • the management server 10 is an enterprise terminal used to perform a response method based on emotion recognition
  • the user terminal 20 is a terminal device used to send information to be identified and receive response corpus information.
  • The user terminal 20 may be a desktop computer, notebook computer, tablet, or mobile phone, etc.
  • FIG. 2 only shows that one user terminal 20 and the management server 10 perform data information transmission. In practical applications, the management server 10 can also perform data information transmission with multiple user terminals 20 at the same time.
  • the method includes steps S110 to S160.
  • S110 If receiving information to be identified from the user terminal, determine the information type of the information to be identified, and the information type includes text information and voice information.
  • the information type of the information to be identified is determined, where the information type includes text information and voice information.
  • the information to be identified includes corresponding format identification information.
  • the format identification information is information used to identify the format of the information to be identified.
  • Based on the format identification information of the information to be identified, it can be determined whether the information to be identified is text information.
  • the information to be recognized can be sent by the user of the user terminal to the management server through the user terminal.
  • the information to be recognized can be text, voice or short video.
  • The corresponding target text information needs to be obtained from the information to be recognized, and the real intention of the user is obtained based on the target text information.
  • If the format identification information corresponds to a text format, the corresponding information to be identified is text information; if the format identification information is wav, mp3, or wma, the corresponding information to be identified is audio information; if the format identification information is avi, flv, or rmvb, the corresponding information to be identified is video information.
  • For example, if the user enters text in the terminal page, the user terminal sends the text as the information to be recognized to the management server; if the user clicks the voice entry button on the terminal page, speaks their question, and clicks the confirm button, the user terminal sends the recorded voice as the information to be recognized to the management server; if the user clicks the video entry button on the terminal page, speaks their question to the video capture device of the user terminal, and clicks the confirm button, the user terminal sends the recorded short video as the information to be identified to the management server. A minimal sketch of this type check follows.
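  • As an illustration of the format-based type check above, the following Python sketch maps a format identifier to an information type; the wav/mp3/wma and avi/flv/rmvb lists come from this embodiment, while the function name and the fallback to text are assumptions for illustration.

      # Hypothetical sketch of the format-identification check.
      AUDIO_FORMATS = {"wav", "mp3", "wma"}
      VIDEO_FORMATS = {"avi", "flv", "rmvb"}

      def classify_info_type(format_id: str) -> str:
          """Return 'audio', 'video', or 'text' for a format identifier like 'mp3'."""
          fmt = format_id.lower().lstrip(".")
          if fmt in AUDIO_FORMATS:
              return "audio"          # wav, mp3, wma -> audio information
          if fmt in VIDEO_FORMATS:
              return "video"          # avi, flv, rmvb -> video information
          return "text"               # everything else is treated as text here

      print(classify_info_type("wav"))  # -> audio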
  • S120 If the to-be-recognized information is voice information, recognize the voice information according to a pre-stored voice recognition model to obtain target text information corresponding to the to-be-recognized information.
  • If the information to be identified is not text information, the information to be identified may be audio information or video information, and both audio information and video information contain voice information.
  • a voice recognition model is a model for recognizing and converting voice information contained in audio information or video information, where the voice recognition model includes noise judgment rules and text information acquisition models.
  • the noise judgment rule is the rule for judging whether the voice information contains noise
  • the text information acquisition model is the model that obtains the corresponding text information from the voice information.
  • The noise judgment rule can be used to determine whether the voice information contains noise, so as to ensure that more accurate target text information is obtained from noise-free voice information.
  • step S120 includes sub-steps S121, S122, and S123.
  • S121 Determine whether the voice information in the to-be-recognized information contains noise according to the noise determination rule.
  • According to the noise determination rule, it is determined whether the voice information in the to-be-identified information contains noise. Specifically, since the frequency of human voices when speaking lies in a fixed frequency range (85 Hz to 1100 Hz), the average intensity of the voiceprint signal within this fixed frequency range can be obtained from the voice information as the target sound signal intensity, and the average intensity of the other voiceprint signals outside this fixed frequency range can be obtained as the background noise signal intensity; the ratio between the background noise signal intensity and the target sound signal intensity is then compared with the preset threshold in the noise determination rule. If the ratio is greater than the threshold, it is determined that the voice information in the to-be-recognized information contains noise; if the ratio is not greater than the threshold, it is determined that the voice information in the to-be-recognized information does not contain noise.
  • For example, if the target sound signal intensity is 65 decibels (dB), the background noise signal intensity is 50 dB, and the preset threshold is 0.8, then the ratio between the background noise signal intensity and the target sound signal intensity is 50/65 ≈ 0.77, which is not greater than the preset threshold, and it is determined that the voice information in the to-be-identified information does not contain noise.
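  • To make the noise judgment concrete, here is a minimal numpy sketch of the ratio test described above; the 85 Hz to 1100 Hz band and the 0.8 threshold come from this embodiment, while using average FFT magnitudes as the "signal intensity" is an assumption for illustration.

      import numpy as np

      def contains_noise(samples: np.ndarray, sample_rate: int,
                         threshold: float = 0.8) -> bool:
          """Sketch of the noise determination rule: background vs. target intensity."""
          spectrum = np.abs(np.fft.rfft(samples))              # magnitude spectrum
          freqs = np.fft.rfftfreq(len(samples), d=1.0 / sample_rate)
          voice_band = (freqs >= 85) & (freqs <= 1100)         # human voice range
          target = spectrum[voice_band].mean()                 # target sound intensity
          background = spectrum[~voice_band].mean()            # background noise intensity
          return (background / (target + 1e-9)) > threshold    # noisy if ratio > threshold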
  • S122 If the voice information in the to-be-recognized information does not contain noise, recognize the voice information in the to-be-recognized information according to the text information acquisition model to obtain target text information corresponding to the to-be-recognized information.
  • the voice information can be recognized according to the text information acquisition model to obtain the corresponding target text information.
  • Specifically, the text information acquisition model includes an acoustic model, a speech feature dictionary, and a semantic analysis model.
  • step S122 includes sub-steps S1221, S1222, and S1223.
  • S1221 Segment the to-be-recognized information according to the acoustic model in the text information acquisition model to obtain multiple phonemes included in the to-be-recognized information.
  • the information to be identified is segmented according to the acoustic model in the text information acquisition model to obtain multiple phonemes contained in the information to be identified.
  • the voice information contained in the audio information or the video information is composed of phonemes pronounced by multiple characters, and the phoneme of a character includes the frequency and timbre of the character's pronunciation.
  • The acoustic model contains the phonemes of all characters. By matching the speech information against all the phonemes in the acoustic model, the phonemes of individual characters in the speech information can be segmented out, and the segmentation finally yields the multiple phonemes contained in the information to be recognized.
  • S1222 Match the phonemes according to the voice feature dictionary in the text information acquisition model to convert the phonemes into pinyin information.
  • the phoneme is matched according to the phonetic feature dictionary in the text information acquisition model to convert the phoneme into pinyin information.
  • the phonetic feature dictionary contains the phoneme information corresponding to all characters' pinyin.
  • S1223 Perform semantic analysis on the pinyin information according to the semantic analysis model in the text information acquisition model to obtain target text information corresponding to the information to be recognized.
  • The semantic analysis model contains the mapping relationship between pinyin information and text information. Through the mapping relationship contained in the semantic analysis model, the obtained pinyin information can be semantically analyzed to convert the pinyin information into the corresponding target text information.
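  • The three sub-steps S1221 to S1223 can be pictured as a lookup pipeline. The sketch below is purely illustrative: the tiny phoneme-to-pinyin and pinyin-to-text tables are invented stand-ins for the acoustic model, the voice feature dictionary, and the semantic analysis model.

      # Toy stand-ins for the voice feature dictionary and semantic analysis model.
      PHONEME_TO_PINYIN = {("n", "i3"): "ni3", ("h", "ao3"): "hao3"}
      PINYIN_TO_TEXT = {("ni3", "hao3"): "你好"}

      def recognize(phonemes):
          """Phonemes (from S1221 segmentation) -> pinyin (S1222) -> text (S1223)."""
          pinyin = tuple(PHONEME_TO_PINYIN[p] for p in phonemes)
          return PINYIN_TO_TEXT[pinyin]

      print(recognize([("n", "i3"), ("h", "ao3")]))  # -> 你好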
  • S123 If the voice information in the information to be recognized contains noise, it will affect the accuracy of the obtained target text information. In this case, re-input prompt information needs to be fed back to the user terminal to prompt the user of the user terminal to move to a less noisy environment and re-enter the information to be identified.
  • S130 If the to-be-recognized information is text information, use the to-be-recognized information as the target text information. In this case there is no need to process the information to be recognized, and it can be directly used as the target text information for subsequent processing.
  • S140 Acquire a text feature vector corresponding to the target text information according to a pre-stored word processing rule.
  • The word processing rule is rule information for converting the acquired target text information into the corresponding text feature vector; that is, the target text information can be converted into the corresponding feature vector through the word processing rule. Specifically, the word processing rule includes a character screening rule, character length information, and a character vector table. The character screening rule is rule information for screening out meaningless characters in the target text information; the character length information is unified quantity information for the number of characters contained in the target text information after the screening process; the character vector table is a data table that records the vector information of each character.
  • For example, meaningless characters can be modal particles ('a', 'ya') and structural particles ('de', 'di') in the target text information.
  • step S140 includes sub-steps S141, S142, and S143.
  • S141 Screen the target text information according to the character screening rule to obtain screened text information.
  • The character screening rule is the rule information used to filter the target text information. Specifically, the character screening rule can filter out characters with little meaning in the target text information, so that the characters contained in the resulting screened text information are all practically meaningful characters.
  • S142 Perform standardization processing on the screened text information according to the character length information to obtain corresponding text information to be converted.
  • The character length information can be denoted as N. If the number of characters contained in the screened text information is greater than N, the first N characters of the screened text information are intercepted as the text information to be converted; if the number of characters contained in the screened text information is less than N, empty characters (a reserved padding character) are used to pad the screened text information to obtain text information to be converted containing N characters; if the number of characters contained in the screened text information is equal to N, the screened text information is directly used as the text information to be converted.
  • S143 Acquire a character feature vector corresponding to the character information to be converted according to the character vector table in the character processing rule.
  • the character vector table contains a 1 ⁇ M-dimensional vector corresponding to each character, and the 1 ⁇ M-dimensional vector can be used to quantify the characteristics of the character.
  • Specifically, the 1×M-dimensional vector corresponding to each character in the text information to be converted can be obtained from the character vector table; combining the 1×M-dimensional vectors of the N characters contained in the text information to be converted yields an N×M-dimensional vector as the text feature vector, that is, the text information to be converted is converted into the corresponding text feature vector.
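  • A compact sketch of sub-steps S141 to S143 follows, assuming a toy stop-character set and a randomly initialized character vector table; the small values of N and M and the "∅" padding placeholder are chosen only for readability.

      import numpy as np

      STOP_CHARS = set("的地啊呀")                 # toy character screening rule
      N, M = 8, 4                                  # character length, vector dimension
      rng = np.random.default_rng(0)
      CHAR_VECTORS = {}                            # character vector table, filled lazily

      def text_to_feature(text: str) -> np.ndarray:
          kept = [c for c in text if c not in STOP_CHARS]          # S141: screening
          kept = (kept + ["∅"] * N)[:N]                            # S142: pad/truncate to N
          rows = [CHAR_VECTORS.setdefault(c, rng.normal(size=M))   # S143: 1xM per character
                  for c in kept]
          return np.stack(rows)                                    # N x M feature vector

      print(text_to_feature("产品的理赔金额").shape)  # -> (8, 4)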
  • S150 Input the text feature vector into a pre-stored emotion recognition model to obtain the emotion level corresponding to the text feature vector.
  • the emotion recognition model is the model used to obtain the emotion level corresponding to the text feature vector, that is, the model used to recognize the user's emotion level from the information to be recognized.
  • Specifically, the emotion recognition model includes a long short-term memory network (Long Short-Term Memory, LSTM), a weight layer, and a neural network.
  • step S150 includes sub-steps S151 and S152.
  • S151 Input the text feature vector into the long short-term memory network to obtain the corresponding memory network output information. The memory network processes the text feature vector cell by cell. In each cell, the cell memory information is updated as C(t) = C(t−1) ⊙ f(t) + i(t) ⊙ a(t), where C is the cell memory information accumulated in each calculation process, C(t) is the cell memory information output by the current cell, C(t−1) is the cell memory information output by the previous cell, f(t), i(t), and a(t) are the gate values calculated inside the cell, and ⊙ is the element-wise vector operator: the calculation C(t−1) ⊙ f(t) multiplies each one-dimensional value in the vector C(t−1) by the corresponding value of f(t), and the resulting vector has the same dimension as the vector C(t−1). Finally, the output information of the current cell is calculated as y(t) = σ(V·h(t) + c), where h(t) is the hidden state of the current cell and V and c are the parameter values of the formula in this cell. After a round of calculation, each cell produces one piece of output information; the output information of the N cells is combined to obtain the memory network output information of the text feature vector, which is a 1×N-dimensional vector.
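  • The cell update can be written directly in numpy. The state update and output formulas below follow the embodiment; the parametrization of the gates f(t), i(t), a(t), o(t) with weight matrices and biases is the standard LSTM form and is an assumption, since the embodiment does not spell out the gate formulas.

      import numpy as np

      def sigmoid(x):
          return 1.0 / (1.0 + np.exp(-x))

      def lstm_cell(x_t, h_prev, c_prev, P):
          """One LSTM cell step; P holds per-gate weights W*, U* and biases b*."""
          f = sigmoid(P["Wf"] @ x_t + P["Uf"] @ h_prev + P["bf"])  # forget gate f(t)
          i = sigmoid(P["Wi"] @ x_t + P["Ui"] @ h_prev + P["bi"])  # input gate i(t)
          a = np.tanh(P["Wa"] @ x_t + P["Ua"] @ h_prev + P["ba"])  # candidate a(t)
          c = c_prev * f + i * a             # C(t) = C(t-1) ⊙ f(t) + i(t) ⊙ a(t)
          o = sigmoid(P["Wo"] @ x_t + P["Uo"] @ h_prev + P["bo"])  # output gate o(t)
          h = o * np.tanh(c)                 # hidden state h(t)
          y = sigmoid(P["V"] @ h + P["c"])   # y(t) = σ(V·h(t) + c)
          return y, h, c

      # Tiny demo with random parameters (input dim 4, hidden dim 3, output dim 1).
      rng = np.random.default_rng(0)
      P = {k: rng.normal(size=s) for k, s in {
          "Wf": (3, 4), "Uf": (3, 3), "bf": 3, "Wi": (3, 4), "Ui": (3, 3), "bi": 3,
          "Wa": (3, 4), "Ua": (3, 3), "ba": 3, "Wo": (3, 4), "Uo": (3, 3), "bo": 3,
          "V": (1, 3), "c": 1}.items()}
      y, h, c = lstm_cell(rng.normal(size=4), np.zeros(3), np.zeros(3), P)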
  • S152 Calculate the memory network output information according to the weight layer and the neural network to obtain the corresponding emotion level. The number of weight values contained in the weight layer is the same as the number of characters, that is, the number of weight values is N. The calculated memory network output information is multiplied by the weight layer: the n-th dimension value of the memory network output information is multiplied by the n-th weight value in the weight layer (0 < n ≤ N), and the memory network output information with additional weight values is obtained.
  • The memory network output information with additional weight values is input into the neural network, where the neural network contains N input nodes, and each input node corresponds to one dimension value of the vector in the weighted memory network output information. Between the input nodes and the output nodes there is a fully connected layer; a first formula group is set between the input nodes and the fully connected layer, and a second formula group is set between the fully connected layer and the output nodes. The first formula group contains formulas from all input nodes to all characteristic units, and the formulas in the first formula group all use input node values as input values and characteristic unit values as output values; the second formula group contains formulas from all characteristic units to all output nodes, and the formulas in the second formula group all use characteristic unit values as input values and output node values as output values. Each formula included in the neural network has a corresponding parameter value.
  • Each output node corresponds to an emotion category, and the output node value is the probability value that the information to be identified belongs to that emotion category; the emotion category with the highest probability value is taken as the emotion level output by the neural network, that is, the emotion level corresponding to the text feature vector is obtained. For example, if the neural network contains three emotion categories (positive emotions, neutral emotions, and negative emotions), and the probability value output by the neural network for positive emotions is 65%, the probability value for neutral emotions is 24%, and the probability value for negative emotions is 33%, then the emotion level corresponding to the text feature vector is obtained as a positive emotion. A sketch of this classification step follows.
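  • Under these definitions, sub-step S152 amounts to weighting the 1×N memory network output element-wise and passing it through one fully connected layer to the output nodes. The sketch below is an assumption-laden illustration: the layer sizes, random parameters, and softmax normalization of the output node values are not specified by the embodiment.

      import numpy as np

      N, HIDDEN, CLASSES = 8, 16, 3           # characters, characteristic units, categories
      rng = np.random.default_rng(1)
      weight_layer = rng.random(N)            # one weight value per character
      W1, b1 = rng.normal(size=(HIDDEN, N)), np.zeros(HIDDEN)         # first formula group
      W2, b2 = rng.normal(size=(CLASSES, HIDDEN)), np.zeros(CLASSES)  # second formula group

      def emotion_level(memory_output: np.ndarray) -> int:
          weighted = memory_output * weight_layer        # n-th value times n-th weight
          hidden = np.tanh(W1 @ weighted + b1)           # characteristic unit values
          logits = W2 @ hidden + b2                      # one value per emotion category
          probs = np.exp(logits) / np.exp(logits).sum()  # probability per category
          return int(np.argmax(probs))                   # index of the emotion level

      print(emotion_level(rng.random(N)))  # e.g. 0=positive, 1=neutral, 2=negative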
  • S160 According to the emotion level and the target text information, obtain matching corpus information from a pre-stored corpus information database as the reply corpus information and feed it back to the user terminal.
  • the corpus information database contains multiple pieces of corpus information that answer all possible questions.
  • the corpus information can be text information, audio information, video information, or a combination of text information and audio information or a combination of text information and video information.
  • the corpus information database can contain multiple pieces of corpus information for the same question, and multiple pieces of corpus information for the same question can be applied to various emotion levels.
  • the response corpus information corresponding to the target text information and emotional level is obtained from the corpus information database, and the questions raised by the user can be answered according to the user's emotional level.
  • The corpus information can directly answer the questions raised by the user of the user terminal, for example, explaining in detail a business term mentioned by the user; it can also feed back corresponding guidance information based on the user's question, guiding the user of the user terminal through the relevant business handling operations.
  • the aforementioned corpus information database may be a collection of corpus information acquired based on big data related technologies, that is to say, the aforementioned corpus information database may be expressed in the form of a database or a database cluster.
  • the above corpus information can also be stored in the blockchain based on distributed storage.
  • When matching corpus information needs to be obtained from the corpus information database as the reply corpus information, the process of matching and obtaining the corpus can be completed by calling a blockchain smart contract.
  • step S160 includes sub-steps S161 and S162.
  • S161 Obtain the target corpus information corresponding to the target text information from the corpus information database. Each piece of corpus information in the corpus information database corresponds to one or more corpus keywords, and multiple pieces of corpus information for the same question contain the same corpus keywords. The target text information can be matched against the corpus keywords contained in the corpus information to obtain the number of characters in the target text information that match the corpus keywords, and the ratio between this number of characters and the number of characters in the corresponding corpus keywords is calculated to obtain the degree of matching between the target text information and the corpus information. After the degree of matching between the target text information and each piece of corpus information is obtained, the one or more pieces of corpus information with the highest matching degree are used as the target corpus information. For example, if the corpus keywords contained in multiple pieces of corpus information for a certain question are "claim, amount, compensation, payment", and a certain piece of target text information is "how much is the claim amount of product A, and how to pay", then the number of matching characters and the corresponding ratio can be calculated to obtain the degree of matching between this target text information and those pieces of corpus information.
  • S162 Select the corpus information corresponding to the emotion level from the target corpus information as the reply corpus information and feed it back to the user terminal.
  • the acquired target corpus information includes multiple pieces of corpus information, and each corpus information corresponds to a different emotion level, and the corpus information corresponding to the emotion level can be obtained from the target corpus information as the reply corpus information.
  • The target corpus information contains at least one piece of corpus information corresponding to the emotion level. If there is only one piece of corpus information corresponding to the emotion level in the target corpus information, that piece of corpus information is determined as the reply corpus information; if the corpus information corresponding to the emotion level in the target corpus information contains multiple pieces, one piece of corpus information is randomly selected from them as the reply corpus information, as sketched below.
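  • Putting sub-steps S161 and S162 together, the sketch below scores each corpus entry by the character-overlap ratio between the target text and its keywords, then picks a reply for the given emotion level; the data structures, the example entry, and the helper names are assumptions for illustration.

      import random

      CORPUS = [  # hypothetical entry: shared keywords, replies grouped by emotion level
          {"keywords": ["claim", "amount", "compensation", "payment"],
           "replies": {"positive": ["..."], "neutral": ["..."], "negative": ["..."]}},
      ]

      def match_degree(target_text: str, keywords) -> float:
          matched = sum(len(k) for k in keywords if k in target_text)  # matching characters
          return matched / sum(len(k) for k in keywords)

      def reply_for(target_text: str, emotion: str) -> str:
          best = max(CORPUS,
                     key=lambda e: match_degree(target_text, e["keywords"]))  # S161
          return random.choice(best["replies"][emotion])  # S162: random pick if several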
  • In an embodiment, steps S170 and S180 are further included. In step S170, it is judged whether the emotion level is a negative emotion; in step S180, if the emotion level is a negative emotion, the information to be identified is sent to the manual customer service terminal.
  • the manual customer service terminal is the user terminal used by the manual customer service in the enterprise.
  • The manual customer service terminal can be used to obtain the information to be identified, reply information can be manually input into the manual customer service terminal, and the manual customer service terminal can then send the reply information to the user terminal; if the emotion level is not a negative emotion, the process can return to the step of determining the information type of the information to be identified upon receiving information to be identified from the user terminal.
  • Step S170 and step S180 can also be performed after step S150. If step S170 and step S180 are performed after step S150, step S170 is performed first to judge whether the emotion level is a negative emotion; if the emotion level is a negative emotion, step S180 is performed; if the emotion level is not a negative emotion, step S160 is performed. That is, when the emotion level is not a negative emotion, the step of obtaining matching corpus information from the pre-stored corpus information database according to the emotion level and the target text information as the reply corpus information and feeding it back to the user terminal is performed. The dispatch logic is sketched below.
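  • The branch in steps S170 and S180 amounts to a simple dispatch, sketched here with hypothetical helper names; send_to_human_agent and reply_from_corpus stand for the behaviors described above and are not named in the embodiment.

      def send_to_human_agent(info):                # hypothetical: forward to manual service
          print("forwarded to human agent:", info)

      def reply_from_corpus(emotion, target_text):  # hypothetical: S160 corpus reply
          print("auto reply for", emotion, "about", target_text)

      def handle(info, emotion, target_text):
          if emotion == "negative":                 # S170: negative emotion detected
              send_to_human_agent(info)             # S180: hand off to manual customer service
          else:
              reply_from_corpus(emotion, target_text)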
  • In the embodiments of this application, the target text information corresponding to the information to be recognized from the user terminal is obtained, the text feature vector corresponding to the target text information is obtained according to the word processing rules, the emotion level corresponding to the text feature vector is obtained through the emotion recognition model, and the corpus information in the corpus information database matching the emotion level and the target text information is obtained as the reply corpus information and fed back to the user terminal to complete the reply.
  • the aforementioned word processing rules, corpus information and other information can also be stored in a blockchain node.
  • the blockchain referred to in this application is a new application mode of computer technology such as distributed data storage, point-to-point transmission, consensus mechanism, and encryption algorithm.
  • The blockchain is essentially a decentralized database, a series of data blocks generated in association using cryptographic methods; each data block contains a batch of network transaction information, which is used to verify the validity of its information (anti-counterfeiting) and to generate the next block.
  • the blockchain can include the underlying platform of the blockchain, the platform product service layer, and the application service layer.
  • the embodiment of the present application also provides a response device based on emotion recognition, and the response device based on emotion recognition is used to execute any embodiment of the foregoing response method based on emotion recognition.
  • FIG. 9 is a schematic block diagram of a reply device based on emotion recognition provided in an embodiment of the present application.
  • the response device based on emotion recognition may be configured in the management server 10.
  • The response device 100 based on emotion recognition includes a to-be-identified information judging unit 110, a to-be-recognized information recognition unit 120, a target text information acquisition unit 130, a text feature vector acquisition unit 140, an emotion level acquisition unit 150, and a reply corpus information acquisition unit 160.
  • the to-be-identified information judging unit 110 is configured to determine the information type of the to-be-recognized information if the information to be identified from the user terminal is received, and the information type includes text information and voice information.
  • the to-be-recognized information recognition unit 120 is configured to, if the to-be-recognized information is voice information, recognize the voice information according to a pre-stored voice recognition model to obtain target text information corresponding to the to-be-recognized information.
  • the to-be-identified information recognition unit 120 includes sub-units: a noise judgment unit, a voice information recognition unit, and a prompt information feedback unit.
  • The noise judgment unit is used to judge whether the voice information in the information to be recognized contains noise according to the noise judgment rule; the voice information recognition unit is used to, if the voice information in the information to be recognized does not contain noise, recognize the voice information in the information to be recognized according to the text information acquisition model to obtain the target text information corresponding to the information to be recognized; the prompt information feedback unit is used to, if the voice information in the information to be recognized contains noise, feed back re-input prompt information to prompt the user of the user terminal to re-enter the to-be-identified information in a low-noise environment.
  • the voice information recognition unit includes sub-units: a phoneme acquisition unit, a pinyin information acquisition unit, and a semantic analysis unit.
  • The phoneme acquisition unit is used to segment the information to be identified according to the acoustic model in the text information acquisition model to obtain multiple phonemes contained in the information to be identified; the pinyin information acquisition unit is used to match the phonemes according to the phonetic feature dictionary in the text information acquisition model to convert the phonemes into pinyin information; the semantic analysis unit is used to perform semantic analysis on the pinyin information according to the semantic analysis model in the text information acquisition model to obtain the target text information corresponding to the information to be identified.
  • the target text information acquiring unit 130 is configured to, if the to-be-recognized information is text information, use the to-be-recognized information as target text information.
  • the text feature vector obtaining unit 140 is configured to obtain a text feature vector corresponding to the target text information according to a pre-stored word processing rule.
  • the text feature vector obtaining unit 140 includes subunits: a screening text information obtaining unit, a standardization processing unit, and a text information conversion unit.
  • The screening text information obtaining unit is used to screen the target text information according to the character screening rules to obtain screened text information; the standardization processing unit is used to perform standardization processing on the screened text information according to the character length information to obtain the corresponding text information to be converted; the text information conversion unit is used to obtain the text feature vector corresponding to the text information to be converted according to the character vector table in the word processing rule.
  • the emotion level obtaining unit 150 is configured to input the text feature vector into a pre-stored emotion recognition model to obtain the emotion level corresponding to the text feature vector.
  • In other embodiments, the emotion level obtaining unit 150 includes sub-units: a memory network output information acquisition unit and a calculation unit.
  • The memory network output information acquisition unit is used to input the text feature vector into the long short-term memory network to obtain the corresponding memory network output information; the calculation unit is used to calculate the memory network output information according to the weight layer and the neural network to obtain the corresponding emotion level.
  • the reply corpus information obtaining unit 160 is configured to obtain matching corpus information from a pre-stored corpus information database as reply corpus information according to the emotion level and the target text information and feed it back to the user terminal.
  • the reply corpus information acquisition unit 160 includes sub-units: a target corpus information acquisition unit and a reply corpus information selection unit.
  • The target corpus information acquisition unit is used to obtain the target corpus information corresponding to the target text information from the corpus information database; the reply corpus information selection unit is used to select the corpus information corresponding to the emotion level from the target corpus information as the reply corpus information and feed it back to the user terminal.
  • the response device 100 based on emotion recognition further includes sub-units: an emotion level judging unit and a sending unit for information to be identified.
  • the emotion level judging unit is used to judge whether the emotion level is a negative emotion; the to-be-identified information sending unit is used to send the to-be-identified information to the human customer service terminal if the emotion level is a negative emotion.
  • Applying the above response method based on emotion recognition, the target text information corresponding to the information to be recognized from the user terminal is obtained, the text feature vector corresponding to the target text information is obtained according to the word processing rules, the emotion level corresponding to the text feature vector is obtained through the emotion recognition model, and the corpus information in the corpus information database matching the emotion level and the target text information is obtained as the reply corpus information and fed back to the user terminal to complete the reply. In this way, the emotion level matching the information to be identified can be obtained, the reply corpus information can be obtained according to the emotion level and the target text information corresponding to the information to be identified, and the reply corpus information can be flexibly adjusted according to the emotion level in the information to be identified, improving the flexibility of replying to question information.
  • the above-mentioned response device based on emotion recognition can be implemented in the form of a computer program, and the computer program can be run on a computer device as shown in FIG. 10.
  • FIG. 10 is a schematic block diagram of a computer device according to an embodiment of the present application.
  • the computer device 500 may be implemented as the above-mentioned management server 10 for executing the response method based on emotion recognition.
  • the computer device 500 includes a processor 502, a memory, and a network interface 505 connected through a system bus 501, where the memory may include a non-volatile storage medium 503 and an internal memory 504.
  • the non-volatile storage medium 503 can store an operating system 5031 and a computer program 5032.
  • When the computer program 5032 is executed, the processor 502 can execute the response method based on emotion recognition.
  • the processor 502 is used to provide calculation and control capabilities, and support the operation of the entire computer device 500.
  • the internal memory 504 provides an environment for the operation of the computer program 5032 in the non-volatile storage medium 503.
  • the processor 502 can execute the response method based on emotion recognition.
  • the network interface 505 is used for network communication, such as providing data information transmission.
  • the management server 10 can establish a network connection with the user terminal 20 via the Internet based on the network interface 505 to realize communication with the user terminal 20.
  • FIG. 10 is only a block diagram of a part of the structure related to the solution of the present application, and does not constitute a limitation on the computer device 500 to which the solution of the present application is applied.
  • the specific computer device 500 may include more or fewer components than shown in the figure, or combine certain components, or have a different component arrangement.
  • The processor 502 is configured to run the computer program 5032 stored in the memory to realize the following functions: if receiving information to be identified from a user terminal, determine the information type of the information to be identified, the information type including text information and voice information; if the to-be-recognized information is voice information, recognize the voice information according to a pre-stored voice recognition model to obtain target text information corresponding to the to-be-recognized information; if the to-be-recognized information is text information, use the to-be-recognized information as the target text information; obtain the text feature vector corresponding to the target text information according to the pre-stored word processing rules; input the text feature vector into the pre-stored emotion recognition model to obtain the emotion level corresponding to the text feature vector; and according to the emotion level and the target text information, obtain matching corpus information from a pre-stored corpus information database as reply corpus information and feed it back to the user terminal.
  • When the processor 502 executes the step of, if the information to be recognized is voice information, recognizing the voice information according to a pre-stored voice recognition model to obtain target text information corresponding to the information to be recognized, it performs the following operations: determining whether the voice information in the to-be-recognized information contains noise according to the noise determination rule; if the voice information in the to-be-recognized information does not contain noise, recognizing the voice information in the to-be-recognized information according to the text information acquisition model to obtain the target text information corresponding to the to-be-recognized information; if the voice information in the to-be-recognized information contains noise, feeding back re-input prompt information to prompt the user of the user terminal to re-enter the to-be-identified information in a low-noise environment.
  • When the processor 502 executes the step of, if the voice information in the to-be-recognized information does not contain noise, recognizing the voice information in the to-be-recognized information according to the text information acquisition model to obtain the target text information corresponding to the to-be-recognized information, it performs the following operations: segmenting the to-be-recognized information according to the acoustic model in the text information acquisition model to obtain multiple phonemes contained in the to-be-recognized information; matching the phonemes according to the phonetic feature dictionary in the text information acquisition model to convert the phonemes into pinyin information; performing semantic analysis on the pinyin information according to the semantic analysis model in the text information acquisition model to obtain the target text information corresponding to the information to be identified.
  • When the processor 502 executes the step of obtaining a text feature vector corresponding to the target text information according to a pre-stored word processing rule, it performs the following operations: screening the target text information according to the character screening rule to obtain screened text information; standardizing the screened text information according to the character length information to obtain corresponding text information to be converted; obtaining the text feature vector corresponding to the text information to be converted according to the character vector table in the word processing rule.
  • When the processor 502 executes the step of inputting the text feature vector into a pre-stored emotion recognition model to obtain the emotion level corresponding to the text feature vector, it performs the following operations: inputting the text feature vector into the long short-term memory network to obtain the corresponding memory network output information; calculating the memory network output information according to the weight layer and the neural network to obtain the corresponding emotion level.
  • When the processor 502 executes the step of obtaining matching corpus information from a pre-stored corpus information base according to the emotion level and the target text information as the reply corpus information and feeding it back to the user terminal, it performs the following operations: obtaining the target corpus information corresponding to the target text information from the corpus information database; selecting the corpus information corresponding to the emotion level from the target corpus information as the reply corpus information and feeding it back to the user terminal.
  • the processor 502 further performs the following operations: judging whether the emotion level is a negative emotion; if the emotion level is a negative emotion, sending the information to be identified to a human customer service terminal.
  • the embodiment of the computer device shown in FIG. 10 does not constitute a limitation on the specific configuration of the computer device.
  • The computer device may include more or fewer components than those shown in the figure, or combine certain components, or have a different component arrangement.
  • the computer device may only include a memory and a processor. In such an embodiment, the structures and functions of the memory and the processor are consistent with the embodiment shown in FIG. 10, and will not be repeated here.
  • the processor 502 may be a central processing unit (Central Processing Unit, CPU), and the processor 502 may also be other general-purpose processors, digital signal processors (Digital Signal Processors, DSPs), Application Specific Integrated Circuit (ASIC), Field-Programmable Gate Array (FPGA) or other programmable logic devices, discrete gates or transistor logic devices, discrete hardware components, etc.
  • the general-purpose processor may be a microprocessor or the processor may also be any conventional processor.
  • In another embodiment of the present application, a computer-readable storage medium is provided, wherein the computer-readable storage medium may be non-volatile or volatile.
  • The computer-readable storage medium stores a computer program, wherein the computer program implements the following steps when executed by the processor: if information to be identified from the user terminal is received, determining the information type of the information to be identified, the information type including text information and voice information; if the to-be-recognized information is voice information, recognizing the voice information according to a pre-stored voice recognition model to obtain target text information corresponding to the to-be-recognized information; if the to-be-recognized information is text information, using the to-be-recognized information as the target text information; obtaining the text feature vector corresponding to the target text information according to the pre-stored word processing rules; inputting the text feature vector into the pre-stored emotion recognition model to obtain the emotion level corresponding to the text feature vector; and obtaining matching corpus information from a pre-stored corpus information database according to the emotion level and the target text information as the reply corpus information and feeding it back to the user terminal.
  • The step of, if the information to be recognized is voice information, recognizing the voice information according to a pre-stored voice recognition model to obtain target text information corresponding to the information to be recognized includes: judging whether the voice information in the information to be identified contains noise according to the noise judgment rule; if the voice information in the information to be identified does not contain noise, recognizing the voice information in the information to be identified according to the text information acquisition model to obtain the target text information corresponding to the information to be recognized; if the voice information in the information to be recognized contains noise, feeding back re-input prompt information to prompt the user of the user terminal to re-enter the information to be identified in a low-noise environment.
  • The step of, if the voice information in the to-be-recognized information does not contain noise, recognizing the voice information in the to-be-recognized information according to the text information acquisition model to obtain the target text information corresponding to the to-be-recognized information includes: segmenting the to-be-recognized information according to the acoustic model in the text information acquisition model to obtain a plurality of phonemes contained in the to-be-recognized information; matching the phonemes according to the voice feature dictionary in the text information acquisition model to convert the phonemes into pinyin information; performing semantic analysis on the pinyin information according to the semantic analysis model in the text information acquisition model to obtain the target text information corresponding to the information to be recognized.
  • The step of obtaining a text feature vector corresponding to the target text information according to a pre-stored word processing rule includes: screening the target text information according to the character screening rule to obtain screened text information; standardizing the screened text information according to the character length information to obtain the corresponding text information to be converted; obtaining the text feature vector corresponding to the text information to be converted according to the character vector table in the word processing rule.
  • The step of inputting the text feature vector into a pre-stored emotion recognition model to obtain the emotion level corresponding to the text feature vector includes: inputting the text feature vector into the long short-term memory network to obtain the corresponding memory network output information; calculating the memory network output information according to the weight layer and the neural network to obtain the corresponding emotion level.
  • the step of obtaining matching corpus information from a pre-stored corpus information base according to the emotion level and the target text information as the reply corpus information and feeding back to the user terminal includes: The target corpus information corresponding to the target text information is acquired from the information database; the corpus information corresponding to the emotion level is selected from the target corpus information as the reply corpus information and fed back to the user terminal.
  • the method further includes: judging whether the emotion level is a negative emotion; if the emotion level is a negative emotion, sending the to-be-identified information to a human customer service terminal.
  • the disclosed equipment, device, and method may be implemented in other ways.
  • the device embodiments described above are only illustrative.
  • The division of the units is only a logical function division; in actual implementation, there may be other division methods, or units with the same function may be combined into one unit; for example, multiple units or components can be combined or integrated into another system, or some features can be omitted or not implemented.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may also be electrical, mechanical or other forms of connection.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments of the present application.
  • the functional units in the various embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated unit can be implemented in the form of hardware or software functional unit.
  • the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium.
  • the technical solution of this application is essentially or the part that contributes to the existing technology, or all or part of the technical solution can be embodied in the form of a software product, and the computer software product can be stored in a computer.
  • the read storage medium includes several instructions to enable a computer device (which may be a personal computer, a server, or a network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of the present application.
  • the computer-readable storage medium is a tangible, non-transitory storage medium, and the computer-readable storage medium may be an internal storage unit of the aforementioned device, such as a physical storage medium such as a hard disk or a memory of the device.
  • the storage medium may also be an external storage device of the device, such as a plug-in hard disk equipped on the device, a smart memory card (Smart Media Card, SMC), a Secure Digital (SD) card, and a flash memory card. (Flash Card) and other physical storage media.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Business, Economics & Management (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Strategic Management (AREA)
  • Molecular Biology (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Accounting & Taxation (AREA)
  • Development Economics (AREA)
  • Computing Systems (AREA)
  • Finance (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Game Theory and Decision Science (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Human Computer Interaction (AREA)
  • Databases & Information Systems (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Machine Translation (AREA)

Abstract

A reply method and apparatus based on emotion recognition, a computer device, and a storage medium. The method includes: obtaining target text information corresponding to to-be-recognized information from a user terminal; obtaining a text feature vector corresponding to the target text information according to a word processing rule; obtaining an emotion level corresponding to the text feature vector according to an emotion recognition model; and obtaining, from a corpus information base, corpus information matching the emotion level and the target text information as reply corpus information and feeding it back to the user terminal to complete the reply. Based on emotion recognition technology, to-be-recognized information from a user terminal can be received, an emotion level matching the to-be-recognized information can be obtained, and reply corpus information can be obtained according to the emotion level and the target text information corresponding to the to-be-recognized information, so that the reply corpus information is flexibly adjusted according to the emotion level in the to-be-recognized information, improving the flexibility of replying to question information.

Description

Reply method and apparatus based on emotion recognition, computer device, and storage medium
This application claims priority to the Chinese patent application filed with the Chinese Patent Office on April 27, 2020, with application number 2020103459202 and invention title "Reply method and apparatus based on emotion recognition, computer device, and storage medium", the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the field of artificial intelligence technology, belongs to application scenarios related to intelligent voice customer service interaction in smart cities, and in particular relates to a reply method and apparatus based on emotion recognition, a computer device, and a storage medium.
Background
Customers may encounter various problems in the course of handling business. A customer can then contact customer service staff by telephone to obtain a corresponding solution, or send the question information that needs answering to customer service staff via the Internet; both approaches are implemented on the basis of human customer service. With the rise and development of artificial intelligence, more and more enterprises use intelligent voice customer service instead of human customer service to serve customers, which can significantly reduce an enterprise's labor costs. However, the inventors realized that existing intelligent voice customer service can only obtain corresponding reply information based on the question information raised by the customer and cannot extract other useful information from that question information, so current intelligent voice customer service lacks flexibility; for example, it cannot adjust the reply information according to the emotion carried by the question information. Therefore, the intelligent voice customer service in existing technical methods suffers from insufficient flexibility when replying to customers' question information.
Summary
The embodiments of this application provide a reply method and apparatus based on emotion recognition, a computer device, and a storage medium, aiming to solve the problem of insufficient flexibility of intelligent voice customer service in existing technical methods when replying to question information.
In a first aspect, an embodiment of this application provides a reply method based on emotion recognition, which includes:
if to-be-recognized information from a user terminal is received, judging the information type of the to-be-recognized information, the information type including text information and voice information;
if the to-be-recognized information is voice information, recognizing the voice information according to a pre-stored voice recognition model to obtain target text information corresponding to the to-be-recognized information;
if the to-be-recognized information is text information, taking the to-be-recognized information as the target text information;
obtaining a text feature vector corresponding to the target text information according to a pre-stored word processing rule;
inputting the text feature vector into a pre-stored emotion recognition model to obtain an emotion level corresponding to the text feature vector;
obtaining matching corpus information from a pre-stored corpus information base according to the emotion level and the target text information as reply corpus information, and feeding it back to the user terminal.
In a second aspect, an embodiment of this application provides a reply apparatus based on emotion recognition, which includes:
a to-be-recognized information judging unit, configured to, if to-be-recognized information from a user terminal is received, judge the information type of the to-be-recognized information, the information type including text information and voice information;
a to-be-recognized information recognizing unit, configured to, if the to-be-recognized information is voice information, recognize the voice information according to a pre-stored voice recognition model to obtain target text information corresponding to the to-be-recognized information;
a target text information obtaining unit, configured to, if the to-be-recognized information is text information, take the to-be-recognized information as the target text information;
a text feature vector obtaining unit, configured to obtain a text feature vector corresponding to the target text information according to a pre-stored word processing rule;
an emotion level obtaining unit, configured to input the text feature vector into a pre-stored emotion recognition model to obtain an emotion level corresponding to the text feature vector;
a reply corpus information obtaining unit, configured to obtain matching corpus information from a pre-stored corpus information base according to the emotion level and the target text information as reply corpus information and feed it back to the user terminal.
In a third aspect, an embodiment of this application further provides a computer device, which includes a memory, a processor, and a computer program stored on the memory and runnable on the processor, wherein the processor, when executing the computer program, implements:
if to-be-recognized information from a user terminal is received, judging the information type of the to-be-recognized information, the information type including text information and voice information;
if the to-be-recognized information is voice information, recognizing the voice information according to a pre-stored voice recognition model to obtain target text information corresponding to the to-be-recognized information;
if the to-be-recognized information is text information, taking the to-be-recognized information as the target text information;
obtaining a text feature vector corresponding to the target text information according to a pre-stored word processing rule;
inputting the text feature vector into a pre-stored emotion recognition model to obtain an emotion level corresponding to the text feature vector;
obtaining matching corpus information from a pre-stored corpus information base according to the emotion level and the target text information as reply corpus information, and feeding it back to the user terminal.
In a fourth aspect, an embodiment of this application further provides a computer-readable storage medium, wherein the computer-readable storage medium stores a computer program which, when executed by a processor, causes the processor to execute the reply method based on emotion recognition described in the first aspect above.
The embodiments of this application can obtain an emotion level matching the to-be-recognized information, so that the reply corpus information can be flexibly adjusted according to the emotion level in the to-be-recognized information, improving the flexibility of replying to question information.
Brief Description of the Drawings
To describe the technical solutions of the embodiments of this application more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below are some embodiments of this application; for a person of ordinary skill in the art, other drawings can be obtained from these drawings without creative effort.
FIG. 1 is a schematic flowchart of the reply method based on emotion recognition provided by an embodiment of this application;
FIG. 2 is a schematic diagram of an application scenario of the reply method based on emotion recognition provided by an embodiment of this application;
FIG. 3 is a schematic sub-flowchart of the reply method based on emotion recognition provided by an embodiment of this application;
FIG. 4 is another schematic sub-flowchart of the reply method based on emotion recognition provided by an embodiment of this application;
FIG. 5 is another schematic sub-flowchart of the reply method based on emotion recognition provided by an embodiment of this application;
FIG. 6 is another schematic sub-flowchart of the reply method based on emotion recognition provided by an embodiment of this application;
FIG. 7 is another schematic sub-flowchart of the reply method based on emotion recognition provided by an embodiment of this application;
FIG. 8 is another schematic flowchart of the reply method based on emotion recognition provided by an embodiment of this application;
FIG. 9 is a schematic block diagram of the reply apparatus based on emotion recognition provided by an embodiment of this application;
FIG. 10 is a schematic block diagram of the computer device provided by an embodiment of this application.
Detailed Description
The technical solutions in the embodiments of this application will be described clearly and completely below with reference to the drawings in the embodiments of this application. Obviously, the described embodiments are some rather than all of the embodiments of this application. Based on the embodiments of this application, all other embodiments obtained by a person of ordinary skill in the art without creative effort fall within the protection scope of this application.
It should be understood that when used in this specification and the appended claims, the terms "include" and "comprise" indicate the presence of the described features, wholes, steps, operations, elements and/or components, but do not exclude the presence or addition of one or more other features, wholes, steps, operations, elements, components and/or collections thereof.
It should also be understood that the terms used in this specification are for the purpose of describing particular embodiments only and are not intended to limit this application. As used in this specification and the appended claims, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should be further understood that the term "and/or" used in this specification and the appended claims refers to and encompasses any and all possible combinations of one or more of the associated listed items.
The embodiments of this application are applicable to aspects such as predictive analysis and robotics in the field of artificial intelligence, for example predicting a user's subsequent behavior through emotion recognition, or applying the emotion-recognition-based reply method to robotics to improve the practicality of robots; the specific application may be determined based on the actual scenario and is not limited here.
Please refer to FIG. 1 and FIG. 2. FIG. 1 is a schematic flowchart of the reply method based on emotion recognition provided by an embodiment of this application, and FIG. 2 is a schematic diagram of its application scenario. The reply method based on emotion recognition is applied in a management server 10 and is executed through application software installed in the management server 10. The management server 10 establishes a network connection with a user terminal 20 to communicate with the user terminal 20. A user of the user terminal 20 can send to-be-recognized information to the management server 10 through the user terminal 20; the to-be-recognized information may be question information, sent by the user, that needs to be answered, and serves as the basis for characterizing the user's true intention, that is, the user's true intention can be obtained based on the to-be-recognized information. The management server 10 executes the reply method based on emotion recognition to obtain reply corpus information corresponding to the to-be-recognized information and feed it back to the corresponding user terminal 20 to complete the reply. The management server 10 is the enterprise terminal used to execute the reply method based on emotion recognition, and the user terminal 20 is the terminal device used to send the to-be-recognized information and receive the reply corpus information; the user terminal 20 may be a desktop computer, a laptop, a tablet, a mobile phone, or the like. FIG. 2 only illustrates one user terminal 20 transmitting data information with the management server 10; in practical applications, the management server 10 may also transmit data information with multiple user terminals 20 simultaneously.
As shown in FIG. 1, the method includes steps S110 to S160.
S110: if to-be-recognized information from a user terminal is received, judge the information type of the to-be-recognized information, the information type including text information and voice information.
Specifically, the to-be-recognized information contains corresponding format identification information, that is, information used to identify the format of the to-be-recognized information; whether the to-be-recognized information is text information can be judged through its format identification information. The to-be-recognized information may be sent by the user of the user terminal to the management server and may be text, voice, or a short video; the corresponding target text information needs to be obtained from the to-be-recognized information, and the user's true intention is obtained based on the target text information.
For example, if the format identification information is txt or string, the corresponding to-be-recognized information is text information; if the format identification information is wav, mp3 or wma, the corresponding to-be-recognized information is audio information; if the format identification information is avi, flv or rmvb, the corresponding to-be-recognized information is video information.
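As a minimal sketch of this judgment, the mapping from format identifier to information type can be expressed as a table lookup. The identifier sets below are only the examples given in the text; a real deployment would extend them:

```python
# Illustrative sketch, not the patent's implementation: classify the
# to-be-recognized information by its format identification information.
TEXT_FORMATS = {"txt", "string"}
AUDIO_FORMATS = {"wav", "mp3", "wma"}
VIDEO_FORMATS = {"avi", "flv", "rmvb"}

def classify_info_type(format_id: str) -> str:
    """Map a format identifier to the information type used in steps S110-S130."""
    fmt = format_id.lower().lstrip(".")
    if fmt in TEXT_FORMATS:
        return "text"
    if fmt in AUDIO_FORMATS:
        return "audio"
    if fmt in VIDEO_FORMATS:
        return "video"
    return "unknown"

assert classify_info_type("wav") == "audio"
```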
For example, if the user of the user terminal enters text in the question box of the terminal page and clicks the confirm button, the user terminal sends that text to the management server as the to-be-recognized information; if the user clicks the voice input button on the terminal page, speaks their question and clicks the confirm button, the user terminal sends the recorded voice to the management server as the to-be-recognized information; if the user clicks the video input button on the terminal page, speaks their question facing the video capture device of the user terminal and clicks the confirm button, the user terminal sends the recorded short video to the management server as the to-be-recognized information.
S120: if the to-be-recognized information is voice information, recognize the voice information according to a pre-stored voice recognition model to obtain target text information corresponding to the to-be-recognized information.
If the to-be-recognized information is not text information, it may be audio information or video information, both of which contain voice information. The voice recognition model is the model for recognizing and converting the voice information contained in audio or video information, and it includes a noise judgment rule and a text information acquisition model. The noise judgment rule is the rule for judging whether the voice information contains noise; the text information acquisition model is the model for obtaining the corresponding text information from the voice information. If the voice information contains noise, the accuracy of the obtained target text information will be affected; therefore, before the corresponding target text information is obtained from the voice information, the noise judgment rule can be used to judge whether the voice information contains noise, so as to ensure that more accurate target text information is obtained from noise-free voice information.
In an embodiment, as shown in FIG. 3, step S120 includes sub-steps S121, S122 and S123.
S121: judge, according to the noise judgment rule, whether the voice information in the to-be-recognized information contains noise.
Specifically, since the frequency of sounds produced by human speech lies within a fixed frequency interval (85 Hz to 1100 Hz), based on the frequencies of the voiceprint signals in the voice information, the average signal strength of the voiceprint signals lying within the above fixed frequency interval can be taken from the voice information as the target sound signal strength, and the average strength of the other voiceprint signals lying outside that interval as the background noise signal strength. It is then judged whether the ratio of the background noise signal strength to the target sound signal strength is greater than the threshold preset in the noise judgment rule: if the ratio is greater than the threshold, the voice information in the to-be-recognized information is judged to contain noise; if the ratio is not greater than the threshold, it is judged not to contain noise.
For example, if the target sound signal strength obtained from the voice information of the to-be-recognized information is 65 decibels (dB), the background noise signal strength is 50 dB, and the preset threshold is 0.8, the ratio of the background noise signal strength to the target sound signal strength is not greater than the preset threshold, and the voice information in the to-be-recognized information is judged not to contain noise.
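A small sketch of this judgment rule follows, assuming per-frequency-bin signal strengths in dB are already available; the 85 to 1100 Hz band and the 0.8 threshold are the values from the text:

```python
import numpy as np

# Hedged sketch of the noise judgment rule in step S121.
def contains_noise(freqs_hz: np.ndarray, strengths_db: np.ndarray,
                   band=(85.0, 1100.0), threshold: float = 0.8) -> bool:
    in_band = (freqs_hz >= band[0]) & (freqs_hz <= band[1])
    target = strengths_db[in_band].mean()       # target sound signal strength
    background = strengths_db[~in_band].mean()  # background noise signal strength
    return (background / target) > threshold

# The worked example: 50 dB background vs 65 dB target gives a ratio of
# about 0.77, which is not greater than 0.8, so no noise is detected.
```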
S122: if the voice information in the to-be-recognized information does not contain noise, recognize the voice information in the to-be-recognized information according to the text information acquisition model to obtain target text information corresponding to the to-be-recognized information.
If the voice information in the to-be-recognized information does not contain noise, the voice information can be recognized according to the text information acquisition model to obtain the corresponding target text information. Specifically, the text information acquisition model includes an acoustic model, a voice feature dictionary and a semantic parsing model.
In an embodiment, as shown in FIG. 4, step S122 includes sub-steps S1221, S1222 and S1223.
S1221: segment the to-be-recognized information according to the acoustic model in the text information acquisition model to obtain the multiple phonemes contained in the to-be-recognized information.
Specifically, the voice information contained in audio or video information is composed of the phonemes of the pronunciations of multiple characters; the phoneme of a character includes the frequency and timbre of that character's pronunciation. The acoustic model contains the phonemes of all character pronunciations; by matching the voice information against all phonemes in the acoustic model, the phoneme of each individual character in the voice information can be segmented out, finally yielding the multiple phonemes contained in the to-be-recognized information.
S1222: match the phonemes according to the voice feature dictionary in the text information acquisition model to convert the phonemes into pinyin information.
The voice feature dictionary contains the phoneme information corresponding to the pinyin of all characters; by matching the obtained phonemes against the phoneme information corresponding to character pinyin, the phoneme of each individual character can be converted into the matching character pinyin in the voice feature dictionary, so that all phonemes contained in the voice information are converted into pinyin information.
S1223: perform semantic parsing on the pinyin information according to the semantic parsing model in the text information acquisition model to obtain target text information corresponding to the to-be-recognized information.
The semantic parsing model contains the mapping relationships between pinyin information and text information; through the mapping relationships contained in the semantic parsing model, the obtained pinyin information can be semantically parsed so as to convert the pinyin information into the corresponding target text information.
For example, the pinyin "tóng, yì" corresponds in the semantic parsing model to the text "同意" (agree).
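The two dictionary-driven stages can be pictured with toy lookup tables. The phoneme keys and table entries below are invented placeholders; a real system uses an acoustic model and a trained semantic parser rather than literal tables:

```python
# Toy sketch of sub-steps S1222-S1223. All keys and entries are hypothetical.
VOICE_FEATURE_DICT = {("t", "ong2"): "tóng", ("y", "i4"): "yì"}  # phoneme -> pinyin
SEMANTIC_MAP = {("tóng", "yì"): "同意"}                           # pinyin -> text

def phonemes_to_text(phonemes):
    """Convert segmented phonemes to pinyin, then parse pinyin into text."""
    pinyin = tuple(VOICE_FEATURE_DICT[p] for p in phonemes)
    return SEMANTIC_MAP[pinyin]

print(phonemes_to_text([("t", "ong2"), ("y", "i4")]))  # -> 同意 ("agree")
```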
S123: if the voice information in the to-be-recognized information contains noise, feed back re-input prompt information to prompt the user of the user terminal to input the to-be-recognized information again in a low-noise environment.
If the voice information in the to-be-recognized information contains noise, the accuracy of the obtained target text information will be affected. In this case, re-input prompt information needs to be fed back to the user terminal to prompt its user to move to a low-noise environment and re-input the to-be-recognized information.
S130: if the to-be-recognized information is text information, take the to-be-recognized information as the target text information.
If the to-be-recognized information is text information, it does not need to be processed and can be used directly as the target text information for subsequent processing.
S140: obtain a text feature vector corresponding to the target text information according to a pre-stored word processing rule.
The word processing rule is the rule information for converting the obtained target text into a corresponding text feature vector; through the word processing rule, the target text information can be converted into the corresponding feature vector. The word processing rule includes a character screening rule, character length information and a character vector table. The character screening rule is the rule information for filtering out meaningless characters in the target text information; the character length information is the quantity information for unifying the number of characters contained in the screened target text information; the character vector table is the data table recording the vector information of each character. For example, the meaningless characters may be modal particles (such as 啊, 哎) and structural particles (such as 得, 地) in the target text information.
In an embodiment, as shown in FIG. 5, step S140 includes sub-steps S141, S142 and S143.
S141: screen the target text information according to the character screening rule to obtain screened text information.
The character screening rule is the rule information used to screen the target text information. Specifically, the character screening rule can filter out characters of little meaning in the target text information, so that the characters contained in the resulting screened text information all have actual meaning.
S142: standardize the screened text information according to the character length information to obtain corresponding to-be-converted text information.
The numbers of characters contained in different pieces of screened text information are not equal; to facilitate subsequent processing, the screened text information needs to be processed according to the character length information to obtain to-be-converted text information whose character count equals the character length information. Specifically, the character length information can be denoted N. If the number of characters contained in the screened text information exceeds the character length information N, the first N characters of the screened text information are intercepted as the to-be-converted text information; if the number of characters is less than N, empty characters (represented by □) are used to pad the characters of the screened text information to obtain to-be-converted text information containing N characters; if the number of characters equals N, the screened text information is used directly as the to-be-converted text information.
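A short sketch of the screening and standardization steps, assuming a small stop-character set (the modal and structural particles named above) and N = 8; both values are illustrative:

```python
# Hedged sketch of sub-steps S141-S142. STOP_CHARS and N are assumed values.
STOP_CHARS = set("啊哎得地")  # illustrative character screening rule
N = 8                         # character length information (assumed)
PAD = "□"                     # the empty character used for padding

def standardize(target_text: str) -> str:
    """Screen out meaningless characters, then truncate or pad to length N."""
    screened = "".join(ch for ch in target_text if ch not in STOP_CHARS)
    if len(screened) >= N:
        return screened[:N]                        # keep the first N characters
    return screened + PAD * (N - len(screened))    # pad with empty characters
```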
S143: obtain the text feature vector corresponding to the to-be-converted text information according to the character vector table in the word processing rule.
Specifically, the character vector table contains one 1×M-dimensional vector for each character, which can be used to quantify the features of the character. From the character vector table, the 1×M-dimensional vector corresponding to each character in the to-be-converted text information can be obtained; combining the 1×M-dimensional vectors corresponding to the N characters contained in the to-be-converted text information yields an N×M vector as the text feature vector, that is, the to-be-converted text information is converted into the corresponding text feature vector.
For example, if M = 8, part of the information contained in the character vector table is as shown in Table 1 (the characters in the first column follow the "如何办理" example below):

  Character            1×M-dimensional vector
  如                   {a1, a2, a3, a4, a5, a6, a7, a8}
  何                   {b1, b2, b3, b4, b5, b6, b7, b8}
  办                   {c1, c2, c3, c4, c5, c6, c7, c8}
  理                   {d1, d2, d3, d4, d5, d6, d7, d8}
  □ (empty character)  {0, 0, 0, 0, 0, 0, 0, 0}

Table 1
A piece of text such as "如何办理" ("how to handle this") then correspondingly yields a 4×8-dimensional feature vector for that text information, obtained by stacking the four character vectors row by row:

  {a1, a2, a3, a4, a5, a6, a7, a8}
  {b1, b2, b3, b4, b5, b6, b7, b8}
  {c1, c2, c3, c4, c5, c6, c7, c8}
  {d1, d2, d3, d4, d5, d6, d7, d8}
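Step S143 then reduces to a table lookup plus row stacking. In this sketch the vector values are random placeholders standing in for the entries of Table 1:

```python
import numpy as np

# Hedged sketch of sub-step S143; the table entries are placeholders.
M = 8
rng = np.random.default_rng(0)
CHAR_VECTOR_TABLE = {
    "如": rng.random(M), "何": rng.random(M),
    "办": rng.random(M), "理": rng.random(M),
    "□": np.zeros(M),  # the empty character maps to an all-zero vector
}

def text_to_feature(text: str) -> np.ndarray:
    """Stack each character's 1xM vector into an NxM text feature vector."""
    return np.stack([CHAR_VECTOR_TABLE[ch] for ch in text])

print(text_to_feature("如何办理").shape)  # (4, 8)
```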
S150: input the text feature vector into the pre-stored emotion recognition model to obtain the emotion level corresponding to the text feature vector.
The emotion recognition model is the model used to obtain the emotion level corresponding to the text feature vector, that is, the model for recognizing the user's emotion level from the to-be-recognized information; the emotion recognition model includes a long short-term memory network (LSTM), a weight layer and a neural network.
In an embodiment, as shown in FIG. 6, step S150 includes sub-steps S151 and S152.
S151: input the text feature vector into the long short-term memory network to obtain the corresponding memory network output information.
Specifically, obtaining the memory network output information for the text feature vector takes five steps. ① Compute the forget gate information: f(t) = σ(Wf·h(t−1) + Uf·X(t) + bf), where f(t) is the forget gate parameter value, 0 ≤ f(t) ≤ 1; σ is the activation function, which can be written concretely as σ(x) = (1 + e^(−x))^(−1), so f(t) is obtained by feeding the result of Wf·h(t−1) + Uf·X(t) + bf into the activation function σ; Wf, Uf and bf are parameter values of the formulas in this cell; h(t−1) is the output gate information of the previous cell; X(t) is the vector corresponding to the current character in the text feature vector, that is, the 1×M-dimensional vector input to the current cell; if the current cell is the first cell of the long short-term memory network, h(t−1) is zero. ② Compute the input gate information: i(t) = σ(Wi·h(t−1) + Ui·X(t) + bi) and a(t) = tanh(Wa·h(t−1) + Ua·X(t) + ba), where i(t) is the input gate parameter value, 0 ≤ i(t) ≤ 1; Wi, Ui, bi, Wa, Ua and ba are parameter values of the formulas in this cell, and a(t) is the computed input gate vector value, a 1×M-dimensional vector. ③ Update the cell memory information: C(t) = C(t−1)⊙f(t) + i(t)⊙a(t), where C is the cell memory information accumulated over each computation, C(t) is the cell memory information output by the current cell and C(t−1) is that output by the previous cell; ⊙ is a vector operator: C(t−1)⊙f(t) multiplies each dimension value of the vector C(t−1) by f(t), and the resulting vector has the same dimensions as C(t−1). ④ Compute the output gate information: o(t) = σ(Wo·h(t−1) + Uo·X(t) + bo) and h(t) = o(t)⊙tanh(C(t)), where o(t) is the output gate parameter value, 0 ≤ o(t) ≤ 1; Wo, Uo and bo are parameter values of the formulas in this cell, and h(t) is the output gate information of this cell, a 1×M-dimensional vector. ⑤ Compute the output information of the current cell: y(t) = σ(V·h(t) + c), where V and c are parameter values of the formulas in this cell. After one round of computation each cell yields one output; combining the outputs of the N cells gives the memory network output information of the text feature vector, which is a 1×N-dimensional vector.
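As a concrete reading of the five-step cell computation above, here is a minimal NumPy sketch. The weight shapes (M×M recurrent matrices and a 1×M output projection V) and the random initial values are assumptions; the text only names the parameters, and a deployed model would use trained values:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Hedged transcription of the per-cell equations in sub-step S151.
M = 8
rng = np.random.default_rng(0)
Wf, Uf, Wi, Ui, Wa, Ua, Wo, Uo = (rng.standard_normal((M, M)) for _ in range(8))
bf, bi, ba, bo = (rng.standard_normal(M) for _ in range(4))
V, c = rng.standard_normal(M), 0.0

def lstm_cell(x_t, h_prev, c_prev):
    f = sigmoid(Wf @ h_prev + Uf @ x_t + bf)   # 1) forget gate information
    i = sigmoid(Wi @ h_prev + Ui @ x_t + bi)   # 2) input gate parameter value
    a = np.tanh(Wa @ h_prev + Ua @ x_t + ba)   #    input gate vector value
    c_t = c_prev * f + i * a                   # 3) updated cell memory C(t)
    o = sigmoid(Wo @ h_prev + Uo @ x_t + bo)   # 4) output gate parameter value
    h_t = o * np.tanh(c_t)                     #    output gate information h(t)
    y_t = sigmoid(V @ h_t + c)                 # 5) per-cell output y(t)
    return h_t, c_t, y_t

def run_lstm(X):
    """X is the NxM text feature vector; returns the 1xN memory network output."""
    h, c_mem, ys = np.zeros(M), np.zeros(M), []
    for x_t in X:
        h, c_mem, y = lstm_cell(x_t, h, c_mem)
        ys.append(y)
    return np.array(ys)

memory_output = run_lstm(rng.random((4, M)))  # toy 4-character input
```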
S152: compute the memory network output information according to the weight layer and the neural network to obtain the corresponding emotion level.
The number of weight values contained in the weight layer matches the character count information, that is, the number of weight values is N. The computed memory network output information is multiplied by the weight layer, that is, the n-th dimension value of the memory network output information is multiplied by the n-th weight value of the weight layer (1 ≤ n ≤ N), yielding the weighted memory network output information. The weighted memory network output information is input into the neural network, where the neural network contains N input nodes, each corresponding to one dimension value of the vector of weighted memory network output information; a fully connected layer lies between the input nodes and the output nodes; a first formula group is set between the input nodes and the fully connected layer, and a second formula group between the fully connected layer and the output nodes. The first formula group contains the formulas from all input nodes to all feature units, each taking an input node value as its input value and a feature unit value as its output value; the second formula group contains the formulas from all feature units to all output nodes, each taking a feature unit value as its input value and an output node value as its output value; every formula contained in the neural network has its corresponding parameter values. Each output node corresponds to one emotion category, and the output node value is the probability that the to-be-recognized information belongs to that emotion category; the emotion category with the highest probability value for the to-be-recognized information is taken as the emotion level output by the neural network, that is, the emotion level corresponding to the text feature vector.
For example, if the neural network contains three emotion categories, positive, neutral and negative, and the probability values output by the neural network are 65% for positive, 24% for neutral and 33% for negative, the emotion level obtained for the text feature vector is positive.
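A hedged sketch of step S152 follows: the 1×N memory network output is weighted element-wise, then passed through one fully connected layer with three output nodes for the positive, neutral and negative categories of the example. The layer size, the tanh activation and all parameter values are illustrative assumptions; per-class sigmoid outputs are used, consistent with the example's probabilities not summing to 100%:

```python
import numpy as np

# Hedged sketch of sub-step S152; all parameters are untrained placeholders.
N, H = 8, 16
rng = np.random.default_rng(1)
weight_layer = rng.random(N)                        # one weight per position
W1, b1 = rng.standard_normal((H, N)), np.zeros(H)   # first formula group
W2, b2 = rng.standard_normal((3, H)), np.zeros(3)   # second formula group
EMOTIONS = ["positive", "neutral", "negative"]

def classify_emotion(memory_output: np.ndarray) -> str:
    weighted = memory_output * weight_layer          # n-th value x n-th weight
    hidden = np.tanh(W1 @ weighted + b1)             # feature units
    scores = 1 / (1 + np.exp(-(W2 @ hidden + b2)))   # per-category probability
    return EMOTIONS[int(np.argmax(scores))]          # highest-probability class

print(classify_emotion(rng.random(N)))
```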
S160: obtain matching corpus information from the pre-stored corpus information base according to the emotion level and the target text information as reply corpus information, and feed it back to the user terminal.
The corpus information base contains multiple pieces of corpus information for replying to all questions that may be raised; the corpus information may be text information, audio information or video information, and may also be a combination of text information with audio information or of text information with video information. The corpus information base may contain multiple pieces of corpus information for the same question, and the multiple pieces of corpus information for the same question can be adapted to various emotion levels. By obtaining from the corpus information base the reply corpus information corresponding to the target text information and the emotion level, the question raised by the user can be answered in a manner targeted to the user's emotion level. Specifically, the corpus information may answer the question raised by the user of the user terminal, for example explaining in detail a business term the user has raised; it may also feed back corresponding guidance information according to the user's question, so as to guide the user of the user terminal through the relevant operations of business handling.
The above corpus information base may be a collection of corpus information obtained based on big-data-related technology; that is, the corpus information base may take the form of a database or a database cluster.
The above corpus information may also be stored in a blockchain in a distributed storage manner; when matching corpus information needs to be obtained from the corpus information base as reply corpus information, a blockchain smart contract can be invoked to complete the corpus matching and acquisition process.
In an embodiment, as shown in FIG. 7, step S160 includes sub-steps S161 and S162.
S161: obtain target corpus information corresponding to the target text information from the corpus information base.
Specifically, each piece of corpus information in the corpus information base corresponds to one or more corpus keywords, and the multiple pieces of corpus information for the same question contain the same corpus keywords. The target text information can be matched against the corpus keywords contained in the corpus information to obtain the number of characters in the target text information that match the corpus keywords; the ratio of this character count to the character count of the corresponding corpus keywords is computed to obtain the matching degree between the target text information and that corpus information. After the matching degree between the target text information and each piece of corpus information is obtained, the one or more pieces of corpus information with the highest matching degree are taken as the target corpus information.
For example, if the corpus keywords contained in the multiple pieces of corpus information for a certain question are "理赔 (claim settlement), 额度 (quota), 赔偿款 (compensation payment), 支付 (payment)", and a certain piece of target text information is "产品A的理赔额度是多少，如何支付" ("What is the claim quota of product A, and how is it paid"), the corpus keywords matching the target text information are "理赔, 额度, 支付", giving a matching degree between the target text information and the above corpus information of P = 6/9 = 66.7%.
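The matching degree computation can be read directly off this example. The sketch below reproduces it, assuming that "matching" means the keyword appears verbatim in the target text:

```python
# Sketch of sub-step S161's matching degree: the character count of the
# corpus keywords found in the target text, divided by the keywords'
# total character count.
def match_degree(target_text: str, keywords: list[str]) -> float:
    total = sum(len(k) for k in keywords)
    matched = sum(len(k) for k in keywords if k in target_text)
    return matched / total

p = match_degree("产品A的理赔额度是多少，如何支付", ["理赔", "额度", "赔偿款", "支付"])
print(f"{p:.1%}")  # 66.7%, matching the worked example above
```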
S162: select, from the target corpus information, the corpus information corresponding to the emotion level as the reply corpus information and feed it back to the user terminal.
The obtained target corpus information contains multiple pieces of corpus information, each corresponding to a different emotion level, so the corpus information corresponding to the detected emotion level can be obtained from the target corpus information as the reply corpus information. Specifically, the target corpus information contains at least one piece of corpus information corresponding to the emotion level: if there is only one such piece, that piece of corpus information is determined as the reply corpus information; if there are multiple such pieces, one of them is selected at random as the reply corpus information.
In an embodiment, as shown in FIG. 8, steps S170 and S180 are further included after step S160.
S170: judge whether the emotion level is a negative emotion.
S180: if the emotion level is a negative emotion, send the to-be-recognized information to a human customer service terminal.
Whether the emotion level is a negative emotion is judged; if it is, the to-be-recognized information needs to be sent to a human customer service terminal so that it is handled manually. The human customer service terminal is the user terminal used by the enterprise's human customer service staff; the staff can use the human customer service terminal to obtain the to-be-recognized information and manually input reply information into it, and the human customer service terminal can send that reply information to the user terminal. If the emotion level is not a negative emotion, the process can return to the step of, upon receiving to-be-recognized information from a user terminal, judging the information type of the to-be-recognized information, so as to receive and process to-be-recognized information from a user terminal again.
In addition, steps S170 and S180 may also be executed after step S150. If steps S170 and S180 are executed after S150, step S170 is executed first to judge whether the emotion level is a negative emotion; if the emotion level is a negative emotion, step S180 is executed; if the emotion level is not a negative emotion, step S160 is executed, that is, when the emotion level is not a negative emotion, the step of obtaining matching corpus information from the pre-stored corpus information base according to the emotion level and the target text information as reply corpus information and feeding it back to the user terminal is executed.
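Putting steps S160 to S180 together, a minimal routing sketch might look as follows; `send_to_human_agent` and the corpus layout keyed by emotion level are illustrative assumptions, not the patent's API:

```python
import random

# Hedged end-to-end sketch of steps S160-S180.
def reply(target_corpus: dict[str, list[str]], emotion: str, info: str):
    if emotion == "negative":
        send_to_human_agent(info)   # escalate negative emotions (step S180)
        return None
    # random pick among the pieces matching this emotion level (sub-step S162)
    return random.choice(target_corpus[emotion])

def send_to_human_agent(info: str):
    """Assumed escalation hook to the human customer service terminal."""
    print("forwarded to human customer service:", info)
```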
In the reply method based on emotion recognition provided by the embodiments of this application, the target text information corresponding to the to-be-recognized information from the user terminal is obtained; the text feature vector corresponding to the target text information is obtained according to the word processing rule; the emotion level corresponding to the text feature vector is obtained according to the emotion recognition model; and the corpus information in the corpus information base matching the emotion level and the target text information is obtained as the reply corpus information and fed back to the user terminal to complete the reply. Through the above method, the emotion level matching the to-be-recognized information can be obtained, and the reply corpus information can be obtained according to the emotion level and the target text information corresponding to the to-be-recognized information, so that the reply corpus information is flexibly adjusted according to the emotion level in the to-be-recognized information, improving the flexibility of replying to question information.
It should be emphasized that, to further guarantee the privacy and security of the above word processing rule, corpus information base information and the like, such information may also be stored in a node of a blockchain.
The blockchain referred to in this application is a new application mode of computer technologies such as distributed data storage, peer-to-peer transmission, consensus mechanisms and encryption algorithms. A blockchain is essentially a decentralized database, a chain of data blocks generated in association using cryptographic methods; each data block contains the information of a batch of network transactions, used to verify the validity (anti-counterfeiting) of its information and to generate the next block. A blockchain may include a blockchain underlying platform, a platform product service layer, an application service layer, and so on.
An embodiment of this application further provides a reply apparatus based on emotion recognition, which is used to execute any embodiment of the aforementioned reply method based on emotion recognition. Specifically, please refer to FIG. 9, which is a schematic block diagram of the reply apparatus based on emotion recognition provided by an embodiment of this application. The reply apparatus based on emotion recognition may be configured in the management server 10.
As shown in FIG. 9, the reply apparatus 100 based on emotion recognition includes a to-be-recognized information judging unit 110, a to-be-recognized information recognizing unit 120, a target text information obtaining unit 130, a text feature vector obtaining unit 140, an emotion level obtaining unit 150 and a reply corpus information obtaining unit 160.
The to-be-recognized information judging unit 110 is configured to, if to-be-recognized information from a user terminal is received, judge the information type of the to-be-recognized information, the information type including text information and voice information.
The to-be-recognized information recognizing unit 120 is configured to, if the to-be-recognized information is voice information, recognize the voice information according to a pre-stored voice recognition model to obtain target text information corresponding to the to-be-recognized information.
In other embodiments of this application, the to-be-recognized information recognizing unit 120 includes sub-units: a noise judging unit, a voice information recognizing unit and a prompt information feedback unit.
The noise judging unit is configured to judge, according to the noise judgment rule, whether the voice information in the to-be-recognized information contains noise; the voice information recognizing unit is configured to, if the voice information in the to-be-recognized information does not contain noise, recognize the voice information in the to-be-recognized information according to the text information acquisition model to obtain target text information corresponding to the to-be-recognized information; the prompt information feedback unit is configured to, if the voice information in the to-be-recognized information contains noise, feed back re-input prompt information to prompt the user of the user terminal to input the to-be-recognized information again in a low-noise environment.
In other embodiments of this application, the voice information recognizing unit includes sub-units: a phoneme obtaining unit, a pinyin information obtaining unit and a semantic parsing unit.
The phoneme obtaining unit is configured to segment the to-be-recognized information according to the acoustic model in the text information acquisition model to obtain the multiple phonemes contained in the to-be-recognized information; the pinyin information obtaining unit is configured to match the phonemes according to the voice feature dictionary in the text information acquisition model to convert the phonemes into pinyin information; the semantic parsing unit is configured to perform semantic parsing on the pinyin information according to the semantic parsing model in the text information acquisition model to obtain target text information corresponding to the to-be-recognized information.
The target text information obtaining unit 130 is configured to, if the to-be-recognized information is text information, take the to-be-recognized information as the target text information.
The text feature vector obtaining unit 140 is configured to obtain a text feature vector corresponding to the target text information according to a pre-stored word processing rule.
In other embodiments of this application, the text feature vector obtaining unit 140 includes sub-units: a screened text information obtaining unit, a standardization processing unit and a text information converting unit.
The screened text information obtaining unit is configured to screen the target text information according to the character screening rule to obtain screened text information; the standardization processing unit is configured to standardize the screened text information according to the character length information to obtain corresponding to-be-converted text information; the text information converting unit is configured to obtain the text feature vector corresponding to the to-be-converted text information according to the character vector table in the word processing rule.
The emotion level obtaining unit 150 is configured to input the text feature vector into a pre-stored emotion recognition model to obtain the emotion level corresponding to the text feature vector.
In other embodiments of this application, the emotion level obtaining unit 150 includes sub-units: a memory network output information obtaining unit and a computing unit.
The memory network output information obtaining unit is configured to input the text feature vector into the long short-term memory network to obtain the corresponding memory network output information; the computing unit is configured to compute the memory network output information according to the weight layer and the neural network to obtain the corresponding emotion level.
The reply corpus information obtaining unit 160 is configured to obtain matching corpus information from a pre-stored corpus information base according to the emotion level and the target text information as reply corpus information and feed it back to the user terminal.
In other embodiments of this application, the reply corpus information obtaining unit 160 includes sub-units: a target corpus information obtaining unit and a reply corpus information selecting unit.
The target corpus information obtaining unit is configured to obtain target corpus information corresponding to the target text information from the corpus information base; the reply corpus information selecting unit is configured to select, from the target corpus information, the corpus information corresponding to the emotion level as the reply corpus information and feed it back to the user terminal.
In other embodiments of this application, the reply apparatus 100 based on emotion recognition further includes sub-units: an emotion level judging unit and a to-be-recognized information sending unit.
The emotion level judging unit is configured to judge whether the emotion level is a negative emotion; the to-be-recognized information sending unit is configured to, if the emotion level is a negative emotion, send the to-be-recognized information to a human customer service terminal.
The reply apparatus based on emotion recognition provided by the embodiments of this application applies the above reply method based on emotion recognition: the target text information corresponding to the to-be-recognized information from the user terminal is obtained, the text feature vector corresponding to the target text information is obtained according to the word processing rule, the emotion level corresponding to the text feature vector is obtained according to the emotion recognition model, and the corpus information in the corpus information base matching the emotion level and the target text information is obtained as the reply corpus information and fed back to the user terminal to complete the reply. Through the above method, the emotion level matching the to-be-recognized information can be obtained, and the reply corpus information can be obtained according to the emotion level and the target text information corresponding to the to-be-recognized information, so that the reply corpus information is flexibly adjusted according to the emotion level in the to-be-recognized information, improving the flexibility of replying to question information.
The above reply apparatus based on emotion recognition may be implemented in the form of a computer program, which can run on a computer device as shown in FIG. 10.
Please refer to FIG. 10, which is a schematic block diagram of the computer device provided by an embodiment of this application. The computer device 500 may be implemented as the above management server 10 used to execute the reply method based on emotion recognition.
Referring to FIG. 10, the computer device 500 includes a processor 502, a memory and a network interface 505 connected through a system bus 501, where the memory may include a non-volatile storage medium 503 and an internal memory 504.
The non-volatile storage medium 503 can store an operating system 5031 and a computer program 5032. When executed, the computer program 5032 can cause the processor 502 to execute the reply method based on emotion recognition.
The processor 502 is used to provide computing and control capabilities and supports the operation of the entire computer device 500.
The internal memory 504 provides an environment for the running of the computer program 5032 in the non-volatile storage medium 503; when executed by the processor 502, the computer program 5032 can cause the processor 502 to execute the reply method based on emotion recognition.
The network interface 505 is used for network communication, such as providing the transmission of data information. The management server 10 can establish a network connection with the user terminal 20 over the Internet based on the network interface 505 so as to communicate with the user terminal 20. Those skilled in the art can understand that the structure shown in FIG. 10 is only a block diagram of part of the structure related to the solution of this application and does not constitute a limitation on the computer device 500 to which the solution of this application is applied; the specific computer device 500 may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components.
The processor 502 is used to run the computer program 5032 stored in the memory to implement the following functions: if to-be-recognized information from a user terminal is received, judging the information type of the to-be-recognized information, the information type including text information and voice information; if the to-be-recognized information is voice information, recognizing the voice information according to a pre-stored voice recognition model to obtain target text information corresponding to the to-be-recognized information; if the to-be-recognized information is text information, taking the to-be-recognized information as the target text information; obtaining a text feature vector corresponding to the target text information according to a pre-stored word processing rule; inputting the text feature vector into a pre-stored emotion recognition model to obtain an emotion level corresponding to the text feature vector; obtaining matching corpus information from a pre-stored corpus information base according to the emotion level and the target text information as reply corpus information and feeding it back to the user terminal.
In an embodiment, when executing the step of, if the to-be-recognized information is voice information, recognizing the voice information according to the pre-stored voice recognition model to obtain target text information corresponding to the to-be-recognized information, the processor 502 performs the following operations: judging, according to the noise judgment rule, whether the voice information in the to-be-recognized information contains noise; if the voice information in the to-be-recognized information does not contain noise, recognizing the voice information in the to-be-recognized information according to the text information acquisition model to obtain target text information corresponding to the to-be-recognized information; if the voice information in the to-be-recognized information contains noise, feeding back re-input prompt information to prompt the user of the user terminal to input the to-be-recognized information again in a low-noise environment.
In an embodiment, when executing the step of, if the voice information in the to-be-recognized information does not contain noise, recognizing the voice information in the to-be-recognized information according to the text information acquisition model to obtain target text information corresponding to the to-be-recognized information, the processor 502 performs the following operations: segmenting the to-be-recognized information according to the acoustic model in the text information acquisition model to obtain the multiple phonemes contained in the to-be-recognized information; matching the phonemes according to the voice feature dictionary in the text information acquisition model to convert the phonemes into pinyin information; performing semantic parsing on the pinyin information according to the semantic parsing model in the text information acquisition model to obtain target text information corresponding to the to-be-recognized information.
In an embodiment, when executing the step of obtaining a text feature vector corresponding to the target text information according to the pre-stored word processing rule, the processor 502 performs the following operations: screening the target text information according to the character screening rule to obtain screened text information; standardizing the screened text information according to the character length information to obtain corresponding to-be-converted text information; obtaining the text feature vector corresponding to the to-be-converted text information according to the character vector table in the word processing rule.
In an embodiment, when executing the step of inputting the text feature vector into the pre-stored emotion recognition model to obtain the emotion level corresponding to the text feature vector, the processor 502 performs the following operations: inputting the text feature vector into the long short-term memory network to obtain the corresponding memory network output information; computing the memory network output information according to the weight layer and the neural network to obtain the corresponding emotion level.
In an embodiment, when executing the step of obtaining matching corpus information from the pre-stored corpus information base according to the emotion level and the target text information as reply corpus information and feeding it back to the user terminal, the processor 502 performs the following operations: obtaining target corpus information corresponding to the target text information from the corpus information base; selecting, from the target corpus information, the corpus information corresponding to the emotion level as the reply corpus information and feeding it back to the user terminal.
In an embodiment, the processor 502 further performs the following operations: judging whether the emotion level is a negative emotion; if the emotion level is a negative emotion, sending the to-be-recognized information to a human customer service terminal.
Those skilled in the art can understand that the embodiment of the computer device shown in FIG. 10 does not constitute a limitation on the specific composition of the computer device; in other embodiments, the computer device may include more or fewer components than illustrated, combine certain components, or have a different arrangement of components. For example, in some embodiments the computer device may include only a memory and a processor; in such embodiments the structures and functions of the memory and the processor are consistent with the embodiment shown in FIG. 10 and are not repeated here.
It should be understood that in the embodiments of this application, the processor 502 may be a central processing unit (CPU), and the processor 502 may also be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general-purpose processor may be a microprocessor, or the processor may also be any conventional processor, etc.
Another embodiment of this application provides a computer-readable storage medium. The computer-readable storage medium may be non-volatile or volatile. The computer-readable storage medium stores a computer program which, when executed by a processor, implements the following steps: if to-be-recognized information from a user terminal is received, judging the information type of the to-be-recognized information, the information type including text information and voice information; if the to-be-recognized information is voice information, recognizing the voice information according to a pre-stored voice recognition model to obtain target text information corresponding to the to-be-recognized information; if the to-be-recognized information is text information, taking the to-be-recognized information as the target text information; obtaining a text feature vector corresponding to the target text information according to a pre-stored word processing rule; inputting the text feature vector into a pre-stored emotion recognition model to obtain an emotion level corresponding to the text feature vector; obtaining matching corpus information from a pre-stored corpus information base according to the emotion level and the target text information as reply corpus information and feeding it back to the user terminal.
In an embodiment, the step of, if the to-be-recognized information is voice information, recognizing the voice information according to the pre-stored voice recognition model to obtain target text information corresponding to the to-be-recognized information includes: judging, according to the noise judgment rule, whether the voice information in the to-be-recognized information contains noise; if the voice information in the to-be-recognized information does not contain noise, recognizing the voice information in the to-be-recognized information according to the text information acquisition model to obtain target text information corresponding to the to-be-recognized information; if the voice information in the to-be-recognized information contains noise, feeding back re-input prompt information to prompt the user of the user terminal to input the to-be-recognized information again in a low-noise environment.
In an embodiment, the step of, if the voice information in the to-be-recognized information does not contain noise, recognizing the voice information in the to-be-recognized information according to the text information acquisition model to obtain target text information corresponding to the to-be-recognized information includes: segmenting the to-be-recognized information according to the acoustic model in the text information acquisition model to obtain the multiple phonemes contained in the to-be-recognized information; matching the phonemes according to the voice feature dictionary in the text information acquisition model to convert the phonemes into pinyin information; performing semantic parsing on the pinyin information according to the semantic parsing model in the text information acquisition model to obtain target text information corresponding to the to-be-recognized information.
In an embodiment, the step of obtaining a text feature vector corresponding to the target text information according to the pre-stored word processing rule includes: screening the target text information according to the character screening rule to obtain screened text information; standardizing the screened text information according to the character length information to obtain corresponding to-be-converted text information; obtaining the text feature vector corresponding to the to-be-converted text information according to the character vector table in the word processing rule.
In an embodiment, the step of inputting the text feature vector into the pre-stored emotion recognition model to obtain the emotion level corresponding to the text feature vector includes: inputting the text feature vector into the long short-term memory network to obtain the corresponding memory network output information; computing the memory network output information according to the weight layer and the neural network to obtain the corresponding emotion level.
In an embodiment, the step of obtaining matching corpus information from the pre-stored corpus information base according to the emotion level and the target text information as reply corpus information and feeding it back to the user terminal includes: obtaining target corpus information corresponding to the target text information from the corpus information base; selecting, from the target corpus information, the corpus information corresponding to the emotion level as the reply corpus information and feeding it back to the user terminal.
In an embodiment, the method further includes: judging whether the emotion level is a negative emotion; if the emotion level is a negative emotion, sending the to-be-recognized information to a human customer service terminal.
Those skilled in the art can clearly understand that, for convenience and brevity of description, the specific working processes of the equipment, apparatuses and units described above may refer to the corresponding processes in the aforementioned method embodiments and are not repeated here. Those of ordinary skill in the art may realize that the units and algorithm steps of each example described in connection with the embodiments disclosed herein can be implemented by electronic hardware, computer software or a combination of the two; to clearly illustrate the interchangeability of hardware and software, the composition and steps of each example have been described above generally in terms of function. Whether these functions are executed in hardware or software depends on the specific application and design constraints of the technical solution. Professional technicians may use different methods for each specific application to implement the described functions, but such implementation should not be considered beyond the scope of this application.
In the several embodiments provided in this application, it should be understood that the disclosed equipment, apparatuses and methods may be implemented in other ways. For example, the apparatus embodiments described above are only illustrative; for example, the division of the units is only a logical function division, and in actual implementation there may be other division methods, or units with the same function may be combined into one unit; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the mutual coupling or direct coupling or communication connection shown or discussed may be indirect coupling or communication connection through some interfaces, apparatuses or units, and may also be electrical, mechanical or other forms of connection.
The units described as separate components may or may not be physically separated, and components displayed as units may or may not be physical units; that is, they may be located in one place or distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments of this application.
In addition, the functional units in the embodiments of this application may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit. The above integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this application, in essence or in the part that contributes to the existing technology, or in whole or in part, may be embodied in the form of a software product; the computer software product is stored in a computer-readable storage medium and includes several instructions to enable a computer device (which may be a personal computer, a server, a network device, etc.) to execute all or part of the steps of the methods described in the embodiments of this application.
The computer-readable storage medium is a tangible, non-transitory storage medium and may be an internal storage unit of the aforementioned device, such as a physical storage medium like the hard disk or memory of the device. The storage medium may also be an external storage device of the device, such as a plug-in hard disk equipped on the device, a smart media card (Smart Media Card, SMC), a Secure Digital (SD) card, a flash card (Flash Card) or another physical storage medium.
The above are only specific implementations of this application, but the protection scope of this application is not limited thereto. Any person familiar with the technical field can easily conceive of various equivalent modifications or substitutions within the technical scope disclosed in this application, and these modifications or substitutions shall be covered by the protection scope of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.

Claims (20)

  1. A reply method based on emotion recognition, applied to a management server, the management server communicating with at least one user terminal, wherein the method comprises:
    if to-be-recognized information from a user terminal is received, judging the information type of the to-be-recognized information, the information type including text information and voice information;
    if the to-be-recognized information is voice information, recognizing the voice information according to a pre-stored voice recognition model to obtain target text information corresponding to the to-be-recognized information;
    if the to-be-recognized information is text information, taking the to-be-recognized information as the target text information;
    obtaining a text feature vector corresponding to the target text information according to a pre-stored word processing rule;
    inputting the text feature vector into a pre-stored emotion recognition model to obtain an emotion level corresponding to the text feature vector;
    obtaining matching corpus information from a pre-stored corpus information base according to the emotion level and the target text information as reply corpus information, and feeding it back to the user terminal.
  2. The reply method based on emotion recognition according to claim 1, wherein the pre-stored voice recognition model includes a noise judgment rule and a text information acquisition model, and the recognizing the voice information according to a pre-stored voice recognition model to obtain target text information corresponding to the to-be-recognized information comprises:
    judging, according to the noise judgment rule, whether the voice information in the to-be-recognized information contains noise;
    if the voice information in the to-be-recognized information does not contain noise, recognizing the voice information in the to-be-recognized information according to the text information acquisition model to obtain the target text information corresponding to the to-be-recognized information;
    if the voice information in the to-be-recognized information contains noise, feeding back re-input prompt information to prompt the user of the user terminal to input the to-be-recognized information again in a low-noise environment.
  3. The reply method based on emotion recognition according to claim 2, wherein the text information acquisition model includes an acoustic model, a voice feature dictionary and a semantic parsing model, and the recognizing the voice information in the to-be-recognized information according to the text information acquisition model to obtain the target text information corresponding to the to-be-recognized information comprises:
    segmenting the to-be-recognized information according to the acoustic model in the text information acquisition model to obtain the multiple phonemes contained in the to-be-recognized information;
    matching the phonemes according to the voice feature dictionary in the text information acquisition model to convert the phonemes into pinyin information;
    performing semantic parsing on the pinyin information according to the semantic parsing model in the text information acquisition model to obtain the target text information corresponding to the to-be-recognized information.
  4. The reply method based on emotion recognition according to claim 1, wherein the pre-stored word processing rule includes a character screening rule, character length information and a character vector table, and the obtaining a text feature vector corresponding to the target text information according to a pre-stored word processing rule comprises:
    screening the target text information according to the character screening rule to obtain screened text information;
    standardizing the screened text information according to the character length information to obtain corresponding to-be-converted text information;
    obtaining the text feature vector corresponding to the to-be-converted text information according to the character vector table in the word processing rule.
  5. The reply method based on emotion recognition according to claim 1, wherein the emotion recognition model includes a long short-term memory network, a weight layer and a neural network, and the inputting the text feature vector into a pre-stored emotion recognition model to obtain an emotion level corresponding to the text feature vector comprises:
    inputting the text feature vector into the long short-term memory network to obtain the corresponding memory network output information;
    computing the memory network output information according to the weight layer and the neural network to obtain the corresponding emotion level.
  6. The reply method based on emotion recognition according to claim 1, wherein the obtaining matching corpus information from a pre-stored corpus information base according to the emotion level and the target text information as reply corpus information and feeding it back to the user terminal comprises:
    obtaining target corpus information corresponding to the target text information from the corpus information base;
    selecting, from the target corpus information, the corpus information corresponding to the emotion level as the reply corpus information and feeding it back to the user terminal.
  7. The reply method based on emotion recognition according to claim 1, further comprising:
    judging whether the emotion level is a negative emotion;
    if the emotion level is a negative emotion, sending the to-be-recognized information to a human customer service terminal.
  8. The reply method based on emotion recognition according to claim 2, wherein the judging, according to the noise judgment rule, whether the voice information in the to-be-recognized information contains noise comprises:
    based on the frequencies of the voiceprint signals in the voice information, obtaining from the voice information the voiceprint signals lying within a fixed frequency interval, and determining the average signal strength of the voiceprint signals lying within the fixed frequency interval as the target sound signal strength;
    obtaining from the voice information the voiceprint signals lying outside the fixed frequency interval, and determining the average strength of the voiceprint signals lying outside the fixed frequency interval as the background noise signal strength;
    if the ratio of the background noise signal strength to the target sound signal strength is greater than the judgment threshold corresponding to the noise judgment rule, determining that the voice information contains noise;
    if the ratio of the background noise signal strength to the target sound signal strength is not greater than the judgment threshold corresponding to the noise judgment rule, determining that the voice information does not contain noise.
  9. The reply method based on emotion recognition according to claim 4, wherein the character length information is N, and the standardizing the screened text information according to the character length information to obtain corresponding to-be-converted text information comprises:
    if the number of characters contained in the screened text information exceeds the character length information, intercepting the first N characters of the screened text information as the to-be-converted text information;
    if the number of characters contained in the screened text information is less than the character length information, padding the characters of the screened text information with empty characters to obtain the to-be-converted text information containing N characters;
    if the number of characters contained in the screened text information equals the character length information N, directly taking the screened text information as the to-be-converted text information.
  10. A reply apparatus based on emotion recognition, comprising:
    a to-be-recognized information judging unit, configured to, if to-be-recognized information from a user terminal is received, judge the information type of the to-be-recognized information, the information type including text information and voice information;
    a to-be-recognized information recognizing unit, configured to, if the to-be-recognized information is voice information, recognize the voice information according to a pre-stored voice recognition model to obtain target text information corresponding to the to-be-recognized information;
    a target text information obtaining unit, configured to, if the to-be-recognized information is text information, take the to-be-recognized information as the target text information;
    a text feature vector obtaining unit, configured to obtain a text feature vector corresponding to the target text information according to a pre-stored word processing rule;
    an emotion level obtaining unit, configured to input the text feature vector into a pre-stored emotion recognition model to obtain an emotion level corresponding to the text feature vector;
    a reply corpus information obtaining unit, configured to obtain matching corpus information from a pre-stored corpus information base according to the emotion level and the target text information as reply corpus information and feed it back to the user terminal.
  11. A computer device, comprising a memory, a processor and a computer program stored on the memory and runnable on the processor, wherein the processor, when executing the computer program, implements:
    if to-be-recognized information from a user terminal is received, judging the information type of the to-be-recognized information, the information type including text information and voice information;
    if the to-be-recognized information is voice information, recognizing the voice information according to a pre-stored voice recognition model to obtain target text information corresponding to the to-be-recognized information;
    if the to-be-recognized information is text information, taking the to-be-recognized information as the target text information;
    obtaining a text feature vector corresponding to the target text information according to a pre-stored word processing rule;
    inputting the text feature vector into a pre-stored emotion recognition model to obtain an emotion level corresponding to the text feature vector;
    obtaining matching corpus information from a pre-stored corpus information base according to the emotion level and the target text information as reply corpus information, and feeding it back to the user terminal.
  12. The computer device according to claim 11, wherein the pre-stored voice recognition model includes a noise judgment rule and a text information acquisition model, and the processor, when executing the computer program, implements:
    judging, according to the noise judgment rule, whether the voice information in the to-be-recognized information contains noise;
    if the voice information in the to-be-recognized information does not contain noise, recognizing the voice information in the to-be-recognized information according to the text information acquisition model to obtain the target text information corresponding to the to-be-recognized information;
    if the voice information in the to-be-recognized information contains noise, feeding back re-input prompt information to prompt the user of the user terminal to input the to-be-recognized information again in a low-noise environment.
  13. The computer device according to claim 12, wherein the text information acquisition model includes an acoustic model, a voice feature dictionary and a semantic parsing model, and the processor, when executing the computer program, implements:
    segmenting the to-be-recognized information according to the acoustic model in the text information acquisition model to obtain the multiple phonemes contained in the to-be-recognized information;
    matching the phonemes according to the voice feature dictionary in the text information acquisition model to convert the phonemes into pinyin information;
    performing semantic parsing on the pinyin information according to the semantic parsing model in the text information acquisition model to obtain the target text information corresponding to the to-be-recognized information.
  14. The computer device according to claim 11, wherein the pre-stored word processing rule includes a character screening rule, character length information and a character vector table, and the processor, when executing the computer program, implements:
    screening the target text information according to the character screening rule to obtain screened text information;
    standardizing the screened text information according to the character length information to obtain corresponding to-be-converted text information;
    obtaining the text feature vector corresponding to the to-be-converted text information according to the character vector table in the word processing rule.
  15. The computer device according to claim 11, wherein the emotion recognition model includes a long short-term memory network, a weight layer and a neural network, and the processor, when executing the computer program, implements:
    inputting the text feature vector into the long short-term memory network to obtain the corresponding memory network output information;
    computing the memory network output information according to the weight layer and the neural network to obtain the corresponding emotion level.
  16. The computer device according to claim 11, wherein the processor, when executing the computer program, implements:
    obtaining target corpus information corresponding to the target text information from the corpus information base;
    selecting, from the target corpus information, the corpus information corresponding to the emotion level as the reply corpus information and feeding it back to the user terminal.
  17. The computer device according to claim 11, wherein the processor, when executing the computer program, implements:
    judging whether the emotion level is a negative emotion;
    if the emotion level is a negative emotion, sending the to-be-recognized information to a human customer service terminal.
  18. The computer device according to claim 12, wherein the processor, when executing the computer program, implements:
    based on the frequencies of the voiceprint signals in the voice information, obtaining from the voice information the voiceprint signals lying within a fixed frequency interval, and determining the average signal strength of the voiceprint signals lying within the fixed frequency interval as the target sound signal strength;
    obtaining from the voice information the voiceprint signals lying outside the fixed frequency interval, and determining the average strength of the voiceprint signals lying outside the fixed frequency interval as the background noise signal strength;
    if the ratio of the background noise signal strength to the target sound signal strength is greater than the judgment threshold corresponding to the noise judgment rule, determining that the voice information contains noise;
    if the ratio of the background noise signal strength to the target sound signal strength is not greater than the judgment threshold corresponding to the noise judgment rule, determining that the voice information does not contain noise.
  19. The computer device according to claim 14, wherein the character length information is N, and the processor, when executing the computer program, implements:
    if the number of characters contained in the screened text information exceeds the character length information, intercepting the first N characters of the screened text information as the to-be-converted text information;
    if the number of characters contained in the screened text information is less than the character length information, padding the characters of the screened text information with empty characters to obtain the to-be-converted text information containing N characters;
    if the number of characters contained in the screened text information equals the character length information N, directly taking the screened text information as the to-be-converted text information.
  20. A computer-readable storage medium, wherein the computer-readable storage medium stores a computer program which, when executed by a processor, causes the processor to execute the reply method based on emotion recognition according to any one of claims 1 to 9.
PCT/CN2020/092977 2020-04-27 2020-05-28 Reply method and apparatus based on emotion recognition, computer device and storage medium WO2021217769A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010345920.2 2020-04-27
CN202010345920.2A CN111694938B (zh) 2020-04-27 2020-04-27 Reply method and apparatus based on emotion recognition, computer device and storage medium

Publications (1)

Publication Number Publication Date
WO2021217769A1 true WO2021217769A1 (zh) 2021-11-04

Family

ID=72476636

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/092977 WO2021217769A1 (zh) 2020-04-27 2020-05-28 Reply method and apparatus based on emotion recognition, computer device and storage medium

Country Status (2)

Country Link
CN (1) CN111694938B (zh)
WO (1) WO2021217769A1 (zh)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116233540A (zh) * 2023-03-10 2023-06-06 北京富通亚讯网络信息技术有限公司 Parallel signal processing method and system based on video image recognition
CN116434733A (zh) * 2023-04-25 2023-07-14 深圳市中诺智联科技有限公司 AI voice interaction processing method for a smart safety helmet
CN116741143A (zh) * 2023-08-14 2023-09-12 深圳市加推科技有限公司 Interaction method of a personalized AI business card based on a digital avatar, and related components

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112383593B (zh) * 2020-10-30 2023-06-02 中国平安人寿保险股份有限公司 Intelligent content push method and apparatus based on offline accompanying visits, and computer device
CN112951429A (zh) * 2021-03-25 2021-06-11 浙江连信科技有限公司 Information processing method and apparatus for psychological crisis screening of primary and secondary school students
CN113569031A (zh) * 2021-07-30 2021-10-29 北京达佳互联信息技术有限公司 Information interaction method and apparatus, electronic device and storage medium
CN113761206A (zh) * 2021-09-10 2021-12-07 平安科技(深圳)有限公司 Intelligent information query method, apparatus, device and medium based on intention recognition
CN114999024B (zh) * 2022-05-31 2023-12-19 合众新能源汽车股份有限公司 Method and apparatus for collecting vehicle user feedback information

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108009297A (zh) * 2017-12-27 2018-05-08 广州市云润大数据服务有限公司 Text sentiment analysis method and system based on natural language processing
WO2018169000A1 (ja) * 2017-03-16 2018-09-20 国立研究開発法人情報通信研究機構 Dialogue system and computer program therefor
CN109949071A (zh) * 2019-01-31 2019-06-28 平安科技(深圳)有限公司 Product recommendation method, apparatus, device and medium based on voice emotion analysis
CN110149450A (zh) * 2019-05-22 2019-08-20 欧冶云商股份有限公司 Intelligent customer service answering method and system
CN110379445A (zh) * 2019-06-20 2019-10-25 深圳壹账通智能科技有限公司 Business processing method, apparatus, device and storage medium based on emotion analysis
CN110459210A (zh) * 2019-07-30 2019-11-15 平安科技(深圳)有限公司 Question answering method, apparatus, device and storage medium based on voice analysis
CN110890088A (zh) * 2019-10-12 2020-03-17 中国平安财产保险股份有限公司 Voice information feedback method and apparatus, computer device and storage medium

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107705807B (zh) * 2017-08-24 2019-08-27 平安科技(深圳)有限公司 Voice quality inspection method, apparatus, device and storage medium based on emotion recognition
CN109582780B (zh) * 2018-12-20 2021-10-01 广东小天才科技有限公司 Intelligent question answering method and apparatus based on user emotion
CN109767765A (zh) * 2019-01-17 2019-05-17 平安科技(深圳)有限公司 Speech script matching method and apparatus, storage medium and computer device
CN110597952A (zh) * 2019-08-20 2019-12-20 深圳壹账通智能科技有限公司 Information processing method, server and computer storage medium
CN110570879A (zh) * 2019-09-11 2019-12-13 深圳壹账通智能科技有限公司 Intelligent conversation method and apparatus based on emotion recognition, and computer device
CN110910901B (zh) * 2019-10-08 2023-03-28 平安科技(深圳)有限公司 Emotion recognition method and apparatus, electronic device and readable storage medium
CN110751943A (zh) * 2019-11-07 2020-02-04 浙江同花顺智能科技有限公司 Voice emotion recognition method and apparatus, and related device

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116233540A (zh) * 2023-03-10 2023-06-06 北京富通亚讯网络信息技术有限公司 Parallel signal processing method and system based on video image recognition
CN116233540B (zh) * 2023-03-10 2024-04-02 北京富通亚讯网络信息技术有限公司 Parallel signal processing method and system based on video image recognition
CN116434733A (zh) * 2023-04-25 2023-07-14 深圳市中诺智联科技有限公司 AI voice interaction processing method for a smart safety helmet
CN116741143A (zh) * 2023-08-14 2023-09-12 深圳市加推科技有限公司 Interaction method of a personalized AI business card based on a digital avatar, and related components
CN116741143B (zh) * 2023-08-14 2023-10-31 深圳市加推科技有限公司 Interaction method of a personalized AI business card based on a digital avatar, and related components

Also Published As

Publication number Publication date
CN111694938B (zh) 2024-05-14
CN111694938A (zh) 2020-09-22

Similar Documents

Publication Publication Date Title
WO2021217769A1 (zh) 基于情绪识别的答复方法、装置、计算机设备及存储介质
CN111028827B (zh) 基于情绪识别的交互处理方法、装置、设备和存储介质
US11657234B2 (en) Computer-based interlocutor understanding using classifying conversation segments
CN109147770B (zh) 声音识别特征的优化、动态注册方法、客户端和服务器
WO2021218069A1 (zh) 基于场景动态配置的交互处理方法、装置、计算机设备
WO2003050799A1 (en) Method and system for non-intrusive speaker verification using behavior models
WO2022178942A1 (zh) 情绪识别方法、装置、计算机设备和存储介质
CN111680142A (zh) 基于文本识别的自动答复方法、装置、计算机设备
WO2021204017A1 (zh) 文本意图识别方法、装置以及相关设备
CN110475032A (zh) 多业务界面切换方法、装置、计算机装置及存储介质
US11095601B1 (en) Connection tier structure defining for control of multi-tier propagation of social network content
CN113407677B (zh) 评估咨询对话质量的方法、装置、设备和存储介质
WO2019227629A1 (zh) 文本信息的生成方法、装置、计算机设备及存储介质
JP4143541B2 (ja) 動作モデルを使用して非煩雑的に話者を検証するための方法及びシステム
US20180342240A1 (en) System and method for assessing audio files for transcription services
US11841932B2 (en) System and method for updating biometric evaluation systems
WO2023035529A1 (zh) 基于意图识别的信息智能查询方法、装置、设备及介质
CN112116165B (zh) 一种业务绩效确定方法和装置
CN112131369A (zh) 一种业务类别确定方法和装置
US20240161123A1 (en) Auditing user feedback data
US11978475B1 (en) Systems and methods for determining a next action based on a predicted emotion by weighting each portion of the action's reply
US20230085433A1 (en) Systems and methods for correcting automatic speech recognition errors
CN117951265A (zh) 营销话术实时生成方法、装置、设备及存储介质
CN117131093A (zh) 基于人工智能的业务数据处理方法、装置、设备及介质
WO2022212237A1 (en) Systems and methods for training natural language processing models in a contact center

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20933524

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20933524

Country of ref document: EP

Kind code of ref document: A1