WO2022257452A1 - Expression reply method, apparatus, device, and storage medium - Google Patents

Expression reply method, apparatus, device, and storage medium

Info

Publication number
WO2022257452A1
WO2022257452A1 · PCT/CN2022/071318
Authority
WO
WIPO (PCT)
Prior art keywords
information
result
replied
vector
reply
Prior art date
Application number
PCT/CN2022/071318
Other languages
English (en)
French (fr)
Inventor
杜振中
Original Assignee
平安科技(深圳)有限公司 (Ping An Technology (Shenzhen) Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司 (Ping An Technology (Shenzhen) Co., Ltd.)
Publication of WO2022257452A1 publication Critical patent/WO2022257452A1/zh

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/3331Query processing
    • G06F16/334Query execution
    • G06F16/3346Query execution using probabilistic model
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35Clustering; Classification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/205Parsing
    • G06F40/216Parsing using statistical methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/289Phrasal analysis, e.g. finite state techniques or chunking
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/01Social networking

Definitions

  • the present application relates to the technical field of artificial intelligence, and in particular to an expression reply method, apparatus, device, and storage medium.
  • the first aspect of the present application provides an expression replying method, the expression replying method comprising:
  • the classification result includes a target result, and the target result is used to indicate the need to reply with an expression;
  • if the classification result is the target result, detecting whether user expression information is included in the message to be replied, and obtaining a detection result;
  • if the reply score is greater than a preset threshold, extracting characteristic information of the information to be replied;
  • a second aspect of the present application provides an electronic device, the electronic device includes a processor and a memory, and the processor is configured to execute computer-readable instructions stored in the memory to implement the following steps:
  • the classification result includes a target result, and the target result is used to indicate the need to reply with an expression;
  • if the classification result is the target result, detecting whether user expression information is included in the message to be replied, and obtaining a detection result;
  • if the reply score is greater than a preset threshold, extracting characteristic information of the information to be replied;
  • a third aspect of the present application provides a computer-readable storage medium, on which at least one computer-readable instruction is stored, and the at least one computer-readable instruction is executed by a processor to implement the following steps:
  • the classification result includes a target result, and the target result is used to indicate the need to reply with an expression;
  • if the classification result is the target result, detecting whether user expression information is included in the message to be replied, and obtaining a detection result;
  • if the reply score is greater than a preset threshold, extracting characteristic information of the information to be replied;
  • the fourth aspect of the present application provides an expression replying device, which includes:
  • An obtaining unit configured to obtain information to be replied according to the reply request when the reply request is received;
  • a generating unit configured to generate an information vector according to the information to be replied
  • the input unit is used to input the information vector into the pre-trained classification model to obtain the classification result and the result probability of the classification result, where the classification result includes a target result, and the target result is used to indicate the need to reply with an expression;
  • a detection unit configured to detect whether the message to be replied contains user expression information if the classification result is the target result, and obtain a detection result
  • the generation unit is further configured to generate a reply score according to the result probability and the detection result;
  • An extracting unit configured to extract feature information of the information to be replied if the reply score is greater than a preset threshold
  • a recognition unit configured to perform emotion recognition on the characteristic information to obtain an emotion result, and perform intention recognition on the characteristic information to obtain an intention result;
  • the selection unit is configured to select a matching expression from a preset expression library according to the emotion result and the intention result as the reply expression of the message to be replied.
  • the present application uses the classification model to analyze the information to be replied and detects whether the information to be replied contains user expression information, then generates a reply score from the result probability and the detection result and compares the reply score with the preset threshold to determine whether the information to be replied needs an expression reply. Since the information to be replied is analyzed from multiple dimensions, and each dimension carries a different weight, the accuracy of determining whether an expression reply is needed is improved; further, when the reply score is greater than the preset threshold, the emotion and intention in the information to be replied are analyzed, so that an expression matching the emotion result and the intention result can be selected accurately.
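The summarized flow can be sketched end to end as follows. This is a hedged illustration only: every function body, name, weight, and library entry below is an assumption standing in for the pre-trained models and the preset expression library, which the patent does not publish.

```python
PRESET_THRESHOLD = 0.5

def classify(info_vector):
    # stand-in for the pre-trained classification model:
    # returns (classification result, result probability)
    return "target", 0.8

def detect(message):
    # stand-in for user-expression detection:
    # 1 if the message contains user expression information, -1 otherwise
    return 1

def reply_score(prob, detection, w1=0.2, w2=0.8):
    # multi-dimensional weighted score, one weight per dimension
    return w1 * prob + w2 * detection

def recognize(message):
    # stand-in for emotion recognition and intention recognition
    return "happy", "greeting"

# toy preset expression library keyed by (emotion result, intention result)
EXPRESSION_LIBRARY = {("happy", "greeting"): "smiling-wave emoticon"}

def pick_reply_expression(message, info_vector):
    result, prob = classify(info_vector)
    if result != "target":
        return None  # target result not obtained: no expression reply needed
    if reply_score(prob, detect(message)) <= PRESET_THRESHOLD:
        return None  # score too low: no expression reply needed
    emotion, intention = recognize(message)
    return EXPRESSION_LIBRARY.get((emotion, intention))

print(pick_reply_expression("are you happy today", None))
```

With the stub values above, the score is 0.2 × 0.8 + 0.8 × 1 = 0.96, which exceeds the threshold, so a matching expression is returned.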
  • Fig. 1 is a flow chart of a preferred embodiment of the expression replying method of the present application.
  • Fig. 2 is a functional block diagram of a preferred embodiment of the expression replying device of the present application.
  • FIG. 3 is a schematic structural diagram of an electronic device implementing a preferred embodiment of the expression reply method of the present application.
  • FIG. 1 is a flow chart of a preferred embodiment of the expression reply method of the present application. According to different requirements, the order of the steps in the flowchart may be changed, and some steps may be omitted.
  • the expression reply method is applied to one or more electronic devices. An electronic device is a device that can automatically perform numerical calculation and/or information processing according to preset or stored computer-readable instructions; its hardware includes, but is not limited to, microprocessors, application-specific integrated circuits (ASIC), field-programmable gate arrays (FPGA), digital signal processors (DSP), embedded devices, etc.
  • the electronic device may be any electronic product capable of human-machine interaction with the user, for example, a personal computer, a tablet computer, a smartphone, a personal digital assistant (PDA), a game console, an interactive Internet television (IPTV), smart wearable devices, etc.
  • the electronic devices may include network devices and/or user devices.
  • the network device includes, but is not limited to, a single network electronic device, an electronic device group composed of multiple network electronic devices, or a cloud composed of a large number of hosts or network electronic devices based on Cloud Computing.
  • the network where the electronic device is located includes, but is not limited to: the Internet, a wide area network, a metropolitan area network, a local area network, a virtual private network (Virtual Private Network, VPN) and the like.
  • the reply request is triggered and generated when input information from a user is received.
  • the information carried in the reply request includes, but is not limited to: log number and so on.
  • the information to be replied refers to information that needs to be replied, and the information to be replied may include, but is not limited to: information currently input by the user, information on multiple rounds of conversations between the user and the chat robot, and the like.
  • the information to be replied may be text information, image information, or voice information, and this application does not limit the specific form of the information to be replied.
  • the obtaining of the information to be replied by the electronic device according to the reply request includes:
  • the information carried in the method body in the target log is determined as the information to be replied.
  • the data information includes, but is not limited to: a label indicating a log, the log number, and the like.
  • the preset template refers to a preset statement capable of querying information, and the preset template may be a structured query statement.
  • Log information of multiple chat robots and users is stored in the log repository.
  • the method body refers to the dialog information between the chat robot and the user.
  • by parsing the message, the data information can be quickly obtained, so that the target log can be quickly obtained from the log repository according to the obtained log number, and then the information to be replied can be quickly obtained.
  • the information vector refers to a representation vector of the information to be replied.
  • the electronic device generating an information vector according to the information to be replied includes:
  • the image vector and the word segmentation vector are spliced according to the image position and the word segmentation position to obtain the information vector.
  • the target image may include an emoticon package sent by any terminal in the message to be replied, where a terminal may be a user terminal or a chat robot.
  • the stop words include words such as prepositions.
  • the image position refers to the position where the target image appears in the message to be replied, and the image position can be a serial number. For example, if the message to be replied is {user: are you happy today; chat robot: A (A is an emoticon package), what about you; user: I am very happy}, A is determined to be the target image, and since A is in the second sentence of the message to be replied, the image position is 2.
  • the word segmentation position refers to the position where an information word segment appears among all the word segments of the message to be replied. Following the above example, the information word segment "today" is in the second position among all the word segments, so the word segmentation position of "today" is 2.
  • through all the pixels, the image vector of the target image can be accurately generated; through the information word segmentation, the word segmentation vector can be quickly obtained; the information vector corresponding to the information to be replied can then be accurately generated according to the image position and the word segmentation position.
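The position-based splicing described above can be sketched as follows. The patent does not specify how image positions and word segmentation positions are merged into one ordering, so the shared position keys and all vector values here are illustrative assumptions.

```python
# word-segment position -> word segmentation vector (toy values)
segments = {1: [0.1, 0.2], 3: [0.5, 0.6]}
# image position -> image vector (toy values)
images = {2: [0.9, 0.9]}

def splice(segments, images):
    """Splice image and word-segment vectors into one information vector,
    ordered by their positions in the message to be replied."""
    merged = {**segments, **images}
    info_vector = []
    for pos in sorted(merged):
        info_vector.extend(merged[pos])
    return info_vector

print(splice(segments, images))  # [0.1, 0.2, 0.9, 0.9, 0.5, 0.6]
```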
  • the electronic device extracting the target image in the message to be replied includes:
  • the preset format may be any format indicating an image, for example, the preset format may be a JPG format, or the preset format may be a PNG format.
  • the target image can be quickly obtained from the information to be replied by using the preset format.
  • the electronic device generating the image vector of the target image according to all the pixels includes:
  • the image vector is obtained by splicing the vector value according to the pixel position of each pixel in the target image.
  • for example, if the target image has 10 pixels and the vector values corresponding to the pixels are 0, 1, 0, 0, 1, 0, 0, 1, 1, 1, the vector values are spliced according to the pixel positions to obtain the image vector [0, 1, 0, 0, 1, 0, 0, 1, 1, 1].
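The pixel-splicing step can be sketched as below. The patent only gives the resulting 10-value vector; the 2×5 pixel grid and the binarized-intensity rule for turning a pixel into a single value are illustrative assumptions chosen to reproduce that example.

```python
# toy 2x5 grayscale image (one intensity per pixel)
image = [
    [12, 200, 30, 40, 180],
    [25, 10, 190, 210, 255],
]

def image_vector(image, threshold=128):
    """One vector value per pixel (here: binarized intensity),
    spliced row-major, i.e. according to pixel position."""
    return [1 if px >= threshold else 0 for row in image for px in row]

print(image_vector(image))  # [0, 1, 0, 0, 1, 0, 0, 1, 1, 1]
```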
  • the electronic device uses the jieba ("stutter") word segmentation algorithm to perform word segmentation processing on the processed information to obtain information word segments.
  • the electronic device acquires a vector corresponding to the information word segmentation as a word segmentation vector from a vector mapping table.
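The two steps above — segmenting the processed text and looking each segment up in the vector mapping table — can be sketched as follows. In practice `jieba.lcut(text)` would produce the segments; here the segments, the toy mapping table, and the zero-vector fallback for unseen segments are all illustrative assumptions.

```python
# toy vector mapping table: word segment -> word segmentation vector
vector_mapping_table = {
    "今天": [0.1, 0.3],  # "today"
    "开心": [0.7, 0.9],  # "happy"
}

def segment_vectors(segments, table, dim=2):
    """Look up the word segmentation vector for each information word
    segment; out-of-vocabulary segments fall back to a zero vector
    (an assumption; the patent does not specify OOV handling)."""
    return [table.get(seg, [0.0] * dim) for seg in segments]

print(segment_vectors(["今天", "开心"], vector_mapping_table))
```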
  • S12 Input the information vector into a pre-trained classification model to obtain a classification result and a result probability of the classification result.
  • the classification result includes a target result, and the target result is used to indicate that an expression needs to be replied.
  • the classification model is used to detect whether the message to be replied requires an expression reply.
  • the result probability refers to the probability that the classification model classifies the information to be replied as the classification result.
  • the classification result may also include a characteristic result, and the characteristic result is used to indicate that a reply expression is not required.
  • the method before inputting the information vector into the pre-trained classification model, the method further includes:
  • the preset learner further includes a convolutional layer and a pooling layer, and each convolutional layer includes a plurality of convolution kernels of different sizes.
  • the fully connected layer is used to map the vector generated by the pooling layer.
  • the mapping accuracy of the classification learner can be improved, and the classification learner can then be verified with the verification data, which improves the classification learner as a whole, so the classification accuracy of the classification model can be improved.
  • the electronic device using the training data to adjust the parameters in the fully connected layer to obtain the classification learner includes:
  • the learning rate in the fully connected layer can be increased, thereby improving the classification accuracy of the classification learner.
  • generating a learning rate by the electronic device according to the user satisfaction degree and the output result includes:
  • the detection result includes two results: the information to be replied contains user expression information, and the information to be replied does not contain user expression information.
  • if the information to be replied includes user expression information, it means that the user depends heavily on emoticon packages, and the chat robot can use more expressions to communicate with the user.
  • if the information to be replied does not include user expression information, it means that the user has low dependence on emoticon packages, and the chat robot tries to avoid using expressions to communicate with the user.
  • the electronic device detects whether the message to be replied contains user expression information, and the detection result includes:
  • the detection result is determined to be that the user expression information is included in the message to be replied.
  • the target image includes the expression information sent by the user and the expression information sent by the chat robot.
  • the preset terminal library includes machine numbers of all chatbots.
  • the input address can be quickly obtained through the target log, the input terminal can be accurately determined through the input address, and the terminal number can be accurately obtained. Since symbols do not need to be compared one by one, the detection result can be quickly determined through the terminal number. In addition, by comparing the terminal number with all the machine numbers, the user expression information sent by the user can be accurately extracted from the target image, which helps detect whether the information to be replied needs an expression reply.
  • the electronic device obtaining the input address of the target image includes:
  • the information indicating the address is acquired from the target log as the input address of the target image.
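The terminal-number comparison described above can be sketched as follows: an expression whose input terminal number does not appear in the preset terminal library of chatbot machine numbers was sent by the user. The numbers themselves are illustrative assumptions.

```python
# preset terminal library: machine numbers of all chatbots (toy values)
CHATBOT_MACHINE_NUMBERS = {"bot-001", "bot-002"}

def is_user_expression(terminal_number):
    """An expression sent from a terminal that is not a chatbot
    is user expression information."""
    return terminal_number not in CHATBOT_MACHINE_NUMBERS

print(is_user_expression("user-17"))  # True: sent by the user
print(is_user_expression("bot-001"))  # False: sent by a chat robot
```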
  • the reply score indicates the score value for which the message to be replied needs to be replied through an emoticon package.
  • the electronic device generating a reply score according to the result probability and the detection result includes:
  • the detection value refers to a numerical value corresponding to the detection result, for example, if the detection result is that the message to be replied includes user expression information, then the detection value is 1.
  • for example, the first weight of the classification model is 0.2 and the result probability is 0.8. If detection result A is that the information to be replied includes the user expression information, the detection value corresponding to detection result A is 1 and the second weight is 0.8; the first score is then 0.16 and the second score is 0.8, so the reply score is 0.96.
  • if detection result B is that the user expression information is not included in the information to be replied, the detection value corresponding to detection result B is -1; the first score obtained after calculation is 0.16 and the second score is -0.8, so the reply score is -0.64.
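The two worked examples above can be verified directly. The weights (0.2 and 0.8), result probability (0.8), and detection values (+1/-1) are taken from the text; results are rounded for display only.

```python
FIRST_WEIGHT, SECOND_WEIGHT = 0.2, 0.8  # weights from the examples above

def reply_score(result_probability, detection_value):
    """Weighted sum of the classification dimension and the
    detection dimension."""
    first_score = FIRST_WEIGHT * result_probability    # 0.2 * 0.8 = 0.16
    second_score = SECOND_WEIGHT * detection_value     # 0.8 * (+/-1)
    return first_score + second_score

print(round(reply_score(0.8, 1), 2))   # detection result A: 0.96
print(round(reply_score(0.8, -1), 2))  # detection result B: -0.64
```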
  • the preset threshold can be customized, and the present application does not limit the value of the preset threshold.
  • the characteristic information refers to information that can characterize the semantics of the information to be replied.
  • the electronic device extracting the feature information of the information to be replied includes:
  • the information word segment corresponding to the word segment vector with the largest similarity and the target image are determined as the feature information.
  • the first preset matrix and the second preset matrix are respectively preset weight matrices.
  • the information word segmentation containing contextual semantics can be extracted from the information to be replied as the feature information, and the accuracy of determining the feature information can be improved.
  • since the target image can better represent the user's emotions, determining the target image as the feature information facilitates emotion recognition of the information to be replied.
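The feature-extraction step can be sketched as below: project each word segmentation vector with the two preset matrices, score the segments by similarity, and keep the highest-scoring segment (plus the target image) as feature information. The matrices, the dot-product similarity, and all values are illustrative assumptions; the patent does not give concrete formulas.

```python
W1 = [[0.5, 0.1], [0.2, 0.4]]  # first preset matrix (toy values)
W2 = [[0.3, 0.3], [0.6, 0.0]]  # second preset matrix (toy values)

def matvec(m, v):
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]

def similarity(v):
    """Similarity between the two projections of a word segmentation
    vector (dot product chosen as an illustrative similarity)."""
    a, b = matvec(W1, v), matvec(W2, v)
    return sum(ai * bi for ai, bi in zip(a, b))

segments = {"今天": [0.1, 0.3], "开心": [0.7, 0.9]}
# the segment with the largest similarity becomes feature information
best = max(segments, key=lambda s: similarity(segments[s]))
print(best)
```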
  • the emotional result may be positive emotions such as happiness, or negative emotions such as unhappiness.
  • the intention result refers to the intention of the user in the information to be replied.
  • the electronic device performs emotion recognition on the feature information through the pre-trained emotion recognition model to obtain an emotion result.
  • the training method of the emotion recognition model belongs to the prior art, which will not be repeated in this application.
  • the electronic device performing intention recognition on the characteristic information to obtain the intention result includes:
  • the feature vector is input into the pre-trained bidirectional long short-term memory (BiLSTM) network to obtain a semantic vector;
  • the semantic vector is processed by using the cascaded conditional random field to obtain the intention result.
  • the semantic information in the feature information can be obtained through the bidirectional long short-term memory network, and then the intention result can be accurately identified.
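The shape of this step can be illustrated with a minimal sketch: a bidirectional recurrent pass over the feature sequence yields a semantic vector per step, and a decoder maps it to an intention label. A real system would use a trained BiLSTM and Viterbi decoding for the cascaded conditional random field; the simple tanh cell, the thresholding decoder, the weights, and the labels below are all toy assumptions.

```python
import math

def rnn_pass(xs, w=0.5, u=0.3):
    """Simple recurrent cell (not a full LSTM) run over a sequence."""
    h, out = 0.0, []
    for x in xs:
        h = math.tanh(w * x + u * h)
        out.append(h)
    return out

def bidirectional(xs):
    """Concatenate forward and backward hidden states per step,
    giving a semantic vector for each position."""
    fwd = rnn_pass(xs)
    bwd = list(reversed(rnn_pass(list(reversed(xs)))))
    return [[f, b] for f, b in zip(fwd, bwd)]

def decode(semantic):
    """Greedy stand-in for the CRF decoder: threshold the mean
    activation to pick an intention label."""
    mean = sum(f + b for f, b in semantic) / (2 * len(semantic))
    return "positive-intent" if mean > 0 else "negative-intent"

features = [0.2, 0.9, 0.4]  # toy feature sequence
print(decode(bidirectional(features)))
```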
  • the above-mentioned reply emoticons can also be stored in nodes of a blockchain.
  • the preset expression database stores a plurality of predefined expressions.
  • the reply emoticon refers to an emoticon that needs to be replied to the message to be replied.
  • the electronic device selects a matching expression from a preset expression library according to the emotion result and the intention result as the reply expression of the message to be replied, including:
  • the target expression refers to an expression corresponding to the emotional result.
  • the reply emoticon can be accurately acquired from the preset emoticon database.
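The two-stage selection — first taking the target expressions corresponding to the emotion result, then matching the intention result — can be sketched as follows. The library contents and labels are illustrative assumptions.

```python
# toy preset expression library: emotion result -> target expressions,
# each keyed by intention result
EXPRESSION_LIBRARY = {
    "happy":   {"greeting": "waving smiley", "agreement": "thumbs-up smiley"},
    "unhappy": {"refusal": "head-shaking emoticon"},
}

def select_expression(emotion_result, intention_result):
    """First narrow to the target expressions for the emotion result,
    then pick the expression matching the intention result."""
    target_expressions = EXPRESSION_LIBRARY.get(emotion_result, {})
    return target_expressions.get(intention_result)

print(select_expression("happy", "agreement"))  # thumbs-up smiley
```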
  • the method also includes:
  • the intention result may be text information, for example, the intention result is "impossible".
  • the reply emoticon refers to an emoticon containing the text information (i.e., the intention result).
  • the replying expression can be automatically synthesized to improve comprehensiveness.
  • the reply information since the reply information includes the intention result, it can assist the user to accurately understand the meaning expressed by the reply emoticon.
  • the electronic device synthesizing the arbitrary expression with the intention result to obtain the reply expression includes:
  • the arbitrary position may include below the arbitrary expression, and may also include above the arbitrary expression.
  • in summary, the present application uses the classification model to analyze the information to be replied and detects whether the information to be replied contains user expression information, then generates a reply score from the result probability and the detection result and compares the reply score with the preset threshold to determine whether the information to be replied needs an expression reply. Since the information to be replied is analyzed from multiple dimensions, and each dimension carries a different weight, the accuracy of determining whether an expression reply is needed is improved; further, when the reply score is greater than the preset threshold, the emotion and intention in the information to be replied are analyzed, so that an expression matching the emotion result and the intention result can be selected accurately.
  • the expression reply device 11 includes an acquisition unit 110, a generation unit 111, an input unit 112, a detection unit 113, an extraction unit 114, a recognition unit 115, a selection unit 116, a division unit 117, an adjustment unit 118, a determination unit 119, and a synthesis unit 120.
  • the module/unit referred to in this application refers to a series of computer-readable instruction segments that can be acquired by the processor 13 and can perform fixed functions, and that are stored in the memory 12. In this embodiment, the functions of each module/unit will be described in detail in subsequent embodiments.
  • the acquiring unit 110 acquires information to be replied according to the reply request.
  • the reply request is triggered and generated when input information from a user is received.
  • the information carried in the reply request includes, but is not limited to: log number and so on.
  • the information to be replied refers to information that needs to be replied, and the information to be replied may include, but is not limited to: information currently input by the user, information on multiple rounds of conversations between the user and the chat robot, and the like.
  • the information to be replied may be text information, image information, or voice information, and this application does not limit the specific form of the information to be replied.
  • the acquiring unit 110 acquiring the information to be replied according to the reply request includes:
  • the information carried in the method body in the target log is determined as the information to be replied.
  • the data information includes, but is not limited to: a label indicating a log, the log number, and the like.
  • the preset template refers to a preset statement capable of querying information, and the preset template may be a structured query statement.
  • Log information of multiple chat robots and users is stored in the log repository.
  • the method body refers to the dialog information between the chat robot and the user.
  • by parsing the message, the data information can be quickly obtained, so that the target log can be quickly obtained from the log repository according to the obtained log number, and then the information to be replied can be quickly obtained.
  • the generating unit 111 generates an information vector according to the information to be replied.
  • the information vector refers to a representation vector of the information to be replied.
  • the generating unit 111 generating an information vector according to the information to be replied includes:
  • the image vector and the word segmentation vector are spliced according to the image position and the word segmentation position to obtain the information vector.
  • the target image may include an emoticon package sent by any terminal in the message to be replied, where a terminal may be a user terminal or a chat robot.
  • the stop words include words such as prepositions.
  • the image position refers to the position where the target image appears in the message to be replied, and the image position can be a serial number. For example, if the message to be replied is {user: are you happy today; chat robot: A (A is an emoticon package), what about you; user: I am very happy}, A is determined to be the target image, and since A is in the second sentence of the message to be replied, the image position is 2.
  • the word segmentation position refers to the position where an information word segment appears among all the word segments of the message to be replied. Following the above example, the information word segment "today" is in the second position among all the word segments, so the word segmentation position of "today" is 2.
  • through all the pixels, the image vector of the target image can be accurately generated; through the information word segmentation, the word segmentation vector can be quickly obtained; the information vector corresponding to the information to be replied can then be accurately generated according to the image position and the word segmentation position.
  • the generating unit 111 extracting the target image in the information to be replied includes:
  • the preset format may be any format indicating an image, for example, the preset format may be a JPG format, or the preset format may be a PNG format.
  • the target image can be quickly obtained from the information to be replied by using the preset format.
  • the generating unit 111 generating the image vector of the target image according to all the pixels includes:
  • the image vector is obtained by splicing the vector value according to the pixel position of each pixel in the target image.
  • for example, if the target image has 10 pixels and the vector values corresponding to the pixels are 0, 1, 0, 0, 1, 0, 0, 1, 1, 1, the vector values are spliced according to the pixel positions to obtain the image vector [0, 1, 0, 0, 1, 0, 0, 1, 1, 1].
  • the generating unit 111 uses the jieba ("stutter") word segmentation algorithm to perform word segmentation processing on the processed information to obtain information word segments.
  • the generating unit 111 acquires a vector corresponding to the information word segmentation as a word segmentation vector from the vector mapping table.
  • the input unit 112 inputs the information vector into a pre-trained classification model to obtain a classification result and a result probability of the classification result.
  • the classification result includes a target result, and the target result is used to indicate that an expression needs to be replied.
  • the classification model is used to detect whether the message to be replied requires an expression reply.
  • the result probability refers to the probability that the classification model classifies the information to be replied as the classification result.
  • the classification result may also include a characteristic result, and the characteristic result is used to indicate that a reply expression is not required.
  • the obtaining unit 110 before inputting the information vector into the pre-trained classification model, obtains a preset learner, and the preset learner includes a fully connected layer;
  • the acquiring unit 110 acquires historical sample data, which includes historical messages and user satisfaction;
  • the division unit 117 divides the historical sample data into training data and verification data
  • the adjustment unit 118 uses the training data to adjust the parameters in the fully connected layer to obtain a classification learner
  • the determination unit 119 determines the accuracy of the classification learner based on the verification data
  • the adjustment unit 118 adjusts the classification learner according to the verification data until the accuracy rate of the classification learner is greater than or equal to the preset accuracy, thereby obtaining the classification model.
  • the preset learner further includes a convolutional layer and a pooling layer, and each convolutional layer includes a plurality of convolution kernels of different sizes.
  • the fully connected layer is used to map the vector generated by the pooling layer.
  • the mapping accuracy of the classification learner can be improved, and the classification learner can then be verified with the verification data, which improves the classification learner as a whole, so the classification accuracy of the classification model can be improved.
  • the adjustment unit 118 using the training data to adjust the parameters in the fully connected layer to obtain the classification learner includes:
  • the learning rate in the fully connected layer can be increased, thereby improving the classification accuracy of the classification learner.
  • the adjustment unit 118 generating a learning rate according to the user satisfaction and the output results includes:
  • the detection unit 113 detects whether the information to be replied contains user expression information, and obtains a detection result.
  • the detection result is one of two outcomes: the information to be replied contains user expression information, or the information to be replied does not contain user expression information.
  • if the information to be replied includes user emoticons, the user depends heavily on emoticon packages, and the chat robot can use more expressions to communicate with the user; if the information to be replied does not include user emoticons, the user's dependence on emoticon packages is low, and the chat robot should avoid using expressions to communicate with the user.
  • the detection unit 113 detects whether the information to be replied contains user expression information by obtaining the input address of the target image, determining the terminal corresponding to the input address as the input terminal of the target image, obtaining the terminal number of the input terminal, and comparing the terminal number with all machine numbers in a preset terminal library;
  • if the terminal number differs from all of the machine numbers, the detection result is determined to be that the user expression information is included in the message to be replied.
  • the target image includes the expression information sent by the user and the expression information sent by the chat robot.
  • the preset terminal library includes machine numbers of all chatbots.
  • the input address can be obtained quickly from the target log, the input terminal can be determined accurately from the input address, and the terminal number can then be obtained accurately; since the symbols in the input address need not be compared one by one, the detection result can be determined quickly from the terminal number. In addition, by comparing the terminal number with all of the machine numbers, the user expression information sent by the user can be accurately extracted from the target image, which helps detect whether the information to be replied needs to be replied to with an expression.
  • the detection unit 113 obtains the input address of the target image by acquiring the information indicating an address from the target log.
  • the generation unit 111 generates a reply score according to the result probability and the detection result.
  • the reply score indicates the score value for which the message to be replied needs to be replied through an emoticon package.
  • the generating unit 111 generates the reply score by obtaining a first weight of the classification model, determining the product of the result probability and the first weight as a first score of the information to be replied, obtaining a detection value corresponding to the detection result and a second weight of the user expression information, determining the product of the detection value and the second weight as a second score, and summing the first score and the second score.
  • the detection value refers to a numerical value corresponding to the detection result, for example, if the detection result is that the message to be replied includes user expression information, then the detection value is 1.
  • for example, the first weight of the classification model is 0.2 and the result probability is 0.8; detection result A is that the information to be replied includes the user expression information, the detection value corresponding to detection result A is 1, and the second weight is 0.8; after calculation, the first score is 0.16 and the second score is 0.8, so the reply score is 0.96.
  • as another example, detection result B is that the user expression information is not included in the information to be replied, and the detection value corresponding to detection result B is -1; after calculation, the first score is 0.16 and the second score is -0.8, so the reply score is -0.64.
  • the extracting unit 114 extracts feature information of the information to be replied.
  • the preset threshold can be customized, and the present application does not limit the value of the preset threshold.
  • the characteristic information refers to information that can characterize the semantics of the information to be replied.
  • the extracting unit 114 extracts the feature information of the information to be replied by generating a context feature vector set for each information word segment from the word-segment vectors, computing operation vectors and an intermediate vector for each word segment, transforming the intermediate vector with a preset matrix into a target matrix whose column vectors each characterize a feature of the information to be replied, computing the similarity between each column vector and the word-segment vectors, and determining the information word segment corresponding to the word-segment vector with the greatest similarity, together with the target image, as the feature information.
  • the first preset matrix and the second preset matrix are respectively preset weight matrices.
  • the information word segments containing contextual semantics can be extracted from the information to be replied as the feature information, improving the accuracy of determining the feature information; meanwhile, because the target image better represents the user's emotions, determining the target image as part of the feature information facilitates emotion recognition of the information to be replied.
  • the recognition unit 115 performs emotion recognition on the characteristic information to obtain an emotion result, and performs intention recognition on the characteristic information to obtain an intention result.
  • the emotional result may be positive emotions such as happiness, or negative emotions such as unhappiness.
  • the intention result refers to the intention of the user in the information to be replied.
  • the recognition unit 115 performs emotion recognition on the feature information through the pre-trained emotion recognition model to obtain an emotion result.
  • the training method of the emotion recognition model belongs to the prior art, which will not be repeated in this application.
  • the identification unit 115 performs intention identification on the feature information as follows: the vector of the feature information is obtained from the word-segment vectors as a feature vector;
  • the feature vector is input into a pre-trained bidirectional long short-term memory (BiLSTM) network to obtain a semantic vector;
  • the semantic vector is processed by a cascaded conditional random field (CRF) to obtain the intention result.
  • the semantic information in the feature information can be obtained through the BiLSTM network, so the intention result can be identified accurately.
  • the selection unit 116 selects a matching expression from a preset expression library according to the emotion result and the intention result as the reply expression of the message to be replied.
  • the above-mentioned reply emoticons can also be stored in nodes of a block chain.
  • the preset expression database stores a plurality of predefined expressions.
  • the reply emoticon refers to an emoticon that needs to be replied to the message to be replied.
  • the selection unit 116 selects the reply expression by choosing target-class expressions from the preset expression library according to the emotion result, and screening the target-class expressions for the expression matching the intention result as the reply expression of the message to be replied.
  • the target-class expressions refer to the expressions corresponding to the emotion result.
  • in this way, the reply expression can be accurately acquired from the preset expression library.
  • if the preset expression library does not contain an expression matching the intention result, the acquisition unit 110 acquires an arbitrary expression from the target-class expressions;
  • the synthesis unit 120 synthesizes the arbitrary expression with the intention result to obtain the reply expression.
  • the intended result may be text information; for example, the intended result is "impossible".
  • the reply emoticon refers to an emoticon containing text information (i.e., the intended result).
  • when the preset expression library does not store a corresponding expression, the reply expression can be synthesized automatically, improving comprehensiveness.
  • in addition, since the reply expression includes the intention result, it can help the user accurately understand the meaning the reply expression conveys.
  • the synthesis unit 120 synthesizes the arbitrary expression with the intention result by entering the intention result at an arbitrary position of the arbitrary expression to obtain the reply expression;
  • the arbitrary position may be below the arbitrary expression or above the arbitrary expression.
  • the present application analyzes the information to be replied with the classification model and detects whether the information to be replied contains user expression information, then compares a reply score generated from the result probability and the detection result with a preset threshold to determine whether the information to be replied requires a reply expression. Because the information to be replied is analyzed from multiple dimensions, each dimension having its own weight, the accuracy of determining whether a reply expression is needed can be improved; further, when the reply score is greater than the preset threshold, the emotion and intention in the information to be replied are analyzed, so that an expression matching the emotion result and the intention result can be selected accurately.
  • FIG. 3 is a schematic structural diagram of an electronic device according to a preferred embodiment of the expression reply method of the present application.
  • the electronic device 1 includes, but is not limited to, a memory 12, a processor 13, and computer-readable instructions stored in the memory 12 and operable on the processor 13 , such as the emoji reply program.
  • the schematic diagram is only an example of the electronic device 1 and does not constitute a limitation on it; the electronic device 1 may include more or fewer components than shown, combine certain components, or use different components; for example, the electronic device 1 may also include input and output devices, network access devices, buses, and the like.
  • the processor 13 can be a central processing unit (Central Processing Unit, CPU), and can also be other general-purpose processors, digital signal processors (Digital Signal Processor, DSP), application specific integrated circuits (Application Specific Integrated Circuit, ASIC), Field-Programmable Gate Array (Field-Programmable Gate Array, FPGA) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, etc.
  • the general-purpose processor can be a microprocessor or the processor can also be any conventional processor, etc.
  • the processor 13 is the computing core and control center of the electronic device 1; it connects all parts of the electronic device 1 through various interfaces and lines, and executes the operating system of the electronic device 1 as well as the various installed applications, program code, and so on.
  • the computer-readable instructions may be divided into one or more modules/units, which are stored in the memory 12 and executed by the processor 13 to complete the present application.
  • the one or more modules/units may be a series of computer-readable instruction segments capable of accomplishing specific functions, and the computer-readable instruction segments are used to describe the execution process of the computer-readable instructions in the electronic device 1 .
  • the computer readable instructions may be divided into an acquisition unit 110, a generation unit 111, an input unit 112, a detection unit 113, an extraction unit 114, an identification unit 115, a selection unit 116, a division unit 117, an adjustment unit 118, a determination unit 119 and the synthesis unit 120.
  • the memory 12 can be used to store the computer-readable instructions and/or modules; the processor 13 implements the various functions of the electronic device 1 by running or executing the computer-readable instructions and/or modules stored in the memory 12 and invoking the data stored in the memory 12.
  • the memory 12 may mainly comprise a program storage area and a data storage area, wherein the program storage area may store the operating system and the application programs required by at least one function (such as a sound playback function or an image playback function), and the data storage area may store data created according to the use of the electronic device.
  • Memory 12 can comprise nonvolatile and volatile memory, for example: hard disk, internal memory, plug-in hard disk, smart memory card (Smart Media Card, SMC), secure digital (Secure Digital, SD) card, flash memory card (Flash Card), at least one magnetic disk storage device, flash memory device, or other storage device.
  • the memory 12 may be an external memory and/or an internal memory of the electronic device 1. Further, the memory 12 may be a memory in physical form, such as a memory stick or a TF card (Trans-flash Card).
  • if the integrated modules/units of the electronic device 1 are realized in the form of software functional units and sold or used as independent products, they can be stored in a computer-readable storage medium, which may be a non-volatile storage medium or a volatile storage medium.
  • all or part of the processes in the methods of the above embodiments of the present application can also be completed by instructing the relevant hardware through computer-readable instructions, which can be stored in a computer-readable storage medium; when executed by a processor, the computer-readable instructions can realize the steps of the above method embodiments.
  • the computer-readable instructions include computer-readable instruction codes
  • the computer-readable instruction codes may be in the form of source code, object code, executable file, or some intermediate form.
  • the computer-readable medium may include: any entity or device capable of carrying the computer-readable instruction code, a recording medium, a U disk, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM, Read-Only Memory), random access memory (RAM, Random Access Memory).
  • a blockchain is essentially a decentralized database, a chain of data blocks generated in association with each other using cryptographic methods; each data block contains a batch of network transaction information used to verify the validity of its information (anti-counterfeiting) and to generate the next block.
  • the blockchain can include the underlying platform of the blockchain, the platform product service layer, and the application service layer.
  • the memory 12 in the electronic device 1 stores computer-readable instructions to implement an expression reply method;
  • the processor 13 can execute the computer-readable instructions to achieve:
  • the classification result includes a target result, and the target result indicates that a reply expression is required;
  • if the classification result is the target result, detecting whether user expression information is included in the message to be replied, and obtaining a detection result;
  • if the reply score is greater than a preset threshold, extracting feature information of the information to be replied;
  • Computer-readable instructions are stored on the computer-readable storage medium, wherein the computer-readable instructions are used to implement the following steps when executed by the processor 13:
  • the classification result includes a target result, and the target result indicates that a reply expression is required;
  • if the classification result is the target result, detecting whether user expression information is included in the message to be replied, and obtaining a detection result;
  • if the reply score is greater than a preset threshold, extracting feature information of the information to be replied;
  • the modules described as separate components may or may not be physically separated, and the components shown as modules may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the modules can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • each functional module in each embodiment of the present application may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated units can be implemented in the form of hardware, or in the form of hardware plus software function modules.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Artificial Intelligence (AREA)
  • Databases & Information Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Computing Systems (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

An expression reply method, apparatus, device, and storage medium, relating to the field of artificial intelligence. The method can acquire information to be replied and generate an information vector according to the information to be replied (S11); input the information vector into a classification model to obtain a classification result and a result probability; if the classification result is a target result, detect whether the information to be replied contains user expression information to obtain a detection result; generate a reply score according to the result probability and the detection result (S14); if the reply score is greater than a preset threshold, extract feature information of the information to be replied (S15); perform emotion recognition on the feature information to obtain an emotion result, and perform intention recognition on the feature information to obtain an intention result (S16); and select a matching expression from a preset expression library according to the emotion result and the intention result as the reply expression of the information to be replied (S17). The method can accurately reply to user messages with expressions. In addition, the method, apparatus, device, and storage medium relate to blockchain technology, and the reply expression can be stored in a blockchain.

Description

Expression reply method, apparatus, device, and storage medium
This application claims priority to the Chinese patent application No. 202110645856.4, filed with the Chinese Patent Office on June 10, 2021 and entitled "Expression reply method, apparatus, device, and storage medium", the entire contents of which are incorporated herein by reference.
Technical field
This application relates to the field of artificial intelligence, and in particular to an expression reply method, apparatus, device, and storage medium.
Background
In social networking applications, emoticon packages enrich people's everyday emotional expression, and emoticon packages have therefore been added to chat robots. However, the inventor realized that when current chat robots reply with chat expressions, they cannot accurately analyze the user's chat emotion, and so cannot accurately determine whether an emoticon package should be used to reply to the current user question, nor which emoticon package should be used, with the result that user messages cannot be accurately replied to with expressions.
Summary
In view of the above, it is necessary to provide an expression reply method, apparatus, device, and storage medium capable of accurately replying to user messages with expressions.
A first aspect of this application provides an expression reply method, comprising:
when a reply request is received, acquiring information to be replied according to the reply request;
generating an information vector according to the information to be replied;
inputting the information vector into a pre-trained classification model to obtain a classification result and a result probability of the classification result, the classification result including a target result, the target result indicating that a reply expression is required;
if the classification result is the target result, detecting whether the information to be replied contains user expression information to obtain a detection result;
generating a reply score according to the result probability and the detection result;
if the reply score is greater than a preset threshold, extracting feature information of the information to be replied;
performing emotion recognition on the feature information to obtain an emotion result, and performing intention recognition on the feature information to obtain an intention result;
selecting a matching expression from a preset expression library according to the emotion result and the intention result as the reply expression of the information to be replied.
A second aspect of this application provides an electronic device, the electronic device comprising a processor and a memory, the processor being configured to execute computer-readable instructions stored in the memory to implement the following steps:
when a reply request is received, acquiring information to be replied according to the reply request;
generating an information vector according to the information to be replied;
inputting the information vector into a pre-trained classification model to obtain a classification result and a result probability of the classification result, the classification result including a target result, the target result indicating that a reply expression is required;
if the classification result is the target result, detecting whether the information to be replied contains user expression information to obtain a detection result;
generating a reply score according to the result probability and the detection result;
if the reply score is greater than a preset threshold, extracting feature information of the information to be replied;
performing emotion recognition on the feature information to obtain an emotion result, and performing intention recognition on the feature information to obtain an intention result;
selecting a matching expression from a preset expression library according to the emotion result and the intention result as the reply expression of the information to be replied.
A third aspect of this application provides a computer-readable storage medium storing at least one computer-readable instruction, the at least one computer-readable instruction being executed by a processor to implement the following steps:
when a reply request is received, acquiring information to be replied according to the reply request;
generating an information vector according to the information to be replied;
inputting the information vector into a pre-trained classification model to obtain a classification result and a result probability of the classification result, the classification result including a target result, the target result indicating that a reply expression is required;
if the classification result is the target result, detecting whether the information to be replied contains user expression information to obtain a detection result;
generating a reply score according to the result probability and the detection result;
if the reply score is greater than a preset threshold, extracting feature information of the information to be replied;
performing emotion recognition on the feature information to obtain an emotion result, and performing intention recognition on the feature information to obtain an intention result;
selecting a matching expression from a preset expression library according to the emotion result and the intention result as the reply expression of the information to be replied.
A fourth aspect of this application provides an expression reply apparatus, comprising:
an acquisition unit, configured to acquire, when a reply request is received, information to be replied according to the reply request;
a generation unit, configured to generate an information vector according to the information to be replied;
an input unit, configured to input the information vector into a pre-trained classification model to obtain a classification result and a result probability of the classification result, the classification result including a target result, the target result indicating that a reply expression is required;
a detection unit, configured to detect, if the classification result is the target result, whether the information to be replied contains user expression information to obtain a detection result;
the generation unit being further configured to generate a reply score according to the result probability and the detection result;
an extraction unit, configured to extract, if the reply score is greater than a preset threshold, feature information of the information to be replied;
a recognition unit, configured to perform emotion recognition on the feature information to obtain an emotion result, and perform intention recognition on the feature information to obtain an intention result;
a selection unit, configured to select a matching expression from a preset expression library according to the emotion result and the intention result as the reply expression of the information to be replied.
It can be seen from the above technical solutions that this application analyzes the information to be replied with the classification model and detects whether the information to be replied contains user expression information, then compares a reply score generated from the result probability and the detection result with a preset threshold to determine whether the information to be replied requires a reply expression. Because the information to be replied is analyzed from multiple dimensions, each dimension having its own weight, the accuracy of determining whether a reply expression is needed can be improved; further, when the reply score is greater than the preset threshold, the emotion and intention in the information to be replied are analyzed, so that an expression matching the emotion result and the intention result can be selected accurately.
Brief description of the drawings
FIG. 1 is a flowchart of a preferred embodiment of the expression reply method of this application.
FIG. 2 is a functional module diagram of a preferred embodiment of the expression reply apparatus of this application.
FIG. 3 is a schematic structural diagram of an electronic device implementing a preferred embodiment of the expression reply method of this application.
Detailed description
To make the purpose, technical solutions, and advantages of this application clearer, this application is described in detail below with reference to the accompanying drawings and specific embodiments.
As shown in FIG. 1, which is a flowchart of a preferred embodiment of the expression reply method of this application, the order of the steps in the flowchart may be changed and some steps may be omitted according to different needs.
The expression reply method is applied to one or more electronic devices. An electronic device is a device capable of automatically performing numerical calculation and/or information processing according to pre-set or stored computer-readable instructions; its hardware includes, but is not limited to, a microprocessor, an application specific integrated circuit (ASIC), a field-programmable gate array (FPGA), a digital signal processor (DSP), an embedded device, and the like.
The electronic device may be any electronic product capable of human-computer interaction with a user, for example, a personal computer, a tablet computer, a smart phone, a personal digital assistant (PDA), a game console, an interactive network television (Internet Protocol Television, IPTV), a smart wearable device, and the like.
The electronic device may include a network device and/or a user device, where the network device includes, but is not limited to, a single network electronic device, an electronic device group composed of multiple network electronic devices, or a cloud composed of a large number of hosts or network electronic devices based on cloud computing.
The network in which the electronic device is located includes, but is not limited to, the Internet, a wide area network, a metropolitan area network, a local area network, a virtual private network (VPN), and the like.
S10: when a reply request is received, acquire the information to be replied according to the reply request.
In at least one embodiment of this application, the reply request is triggered upon receiving the user's input information. The information carried in the reply request includes, but is not limited to, a log number and the like.
The information to be replied refers to information that requires a reply, and may include, but is not limited to, the information currently input by the user, multiple rounds of dialogue between the user and the chat robot, and the like.
The information to be replied may be text information, image information, or voice information; this application does not limit its specific form.
In at least one embodiment of this application, the electronic device acquiring the information to be replied according to the reply request includes:
parsing the message of the reply request to obtain the data information carried in the message;
acquiring, from the data information, the information indicating a log as the log number;
writing the log number into a preset template to obtain a query statement;
acquiring a log repository, and running the query statement in the log repository to obtain a target log;
determining the information carried by the method body in the target log as the information to be replied.
The data information includes, but is not limited to, a tag indicating a log, the log number, and the like.
The preset template is a pre-defined statement capable of information query; the preset template may be a structured query statement.
The log repository stores log information of multiple chat robots and users.
The method body refers to the dialogue information between the chat robot and the user.
By parsing the message, the data information can be obtained quickly, so that the target log can be retrieved quickly from the log repository according to the acquired log number, and the information to be replied can in turn be obtained quickly.
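The log-lookup flow above (parse the request message, take the log number, write it into the preset template to form a query statement) can be sketched as follows. The payload field name `log_id`, the table and column names, and the exact template text are hypothetical — the description only states that a structured query statement is built from a preset template:

```python
import json

# Hypothetical preset template; only the "write the log number into a
# pre-defined structured query statement" step comes from the description.
QUERY_TEMPLATE = "SELECT method_body FROM chat_logs WHERE log_id = '{log_id}'"

def build_query_statement(reply_request_message: str) -> str:
    """Parse the reply request message and write the log number into the template."""
    data_information = json.loads(reply_request_message)  # data carried in the message
    log_number = data_information["log_id"]               # information indicating the log
    return QUERY_TEMPLATE.format(log_id=log_number)
```

Running the returned statement against the log repository would yield the target log whose method body is the information to be replied.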
S11: generate an information vector according to the information to be replied.
In at least one embodiment of this application, the information vector is a representation vector of the information to be replied.
In at least one embodiment of this application, the electronic device generating the information vector according to the information to be replied includes:
extracting the target image from the information to be replied, and acquiring all pixels in the target image;
generating the image vector of the target image according to all of the pixels;
determining the information other than the target image in the information to be replied as information to be processed;
filtering the stop words in the information to be processed to obtain processed information;
performing word segmentation on the processed information to obtain information word segments, and acquiring the word-segment vectors of the information word segments;
determining the image position of the target image in the information to be replied, and determining the word-segment position of each information word segment in the information to be replied;
splicing the image vector and the word-segment vectors according to the image position and the word-segment positions to obtain the information vector.
The target image may include an emoticon package sent by either end in the information to be replied, where either end includes the user end and the chat robot.
The stop words include words whose part of speech is preposition and the like.
The image position refers to the position where the target image appears in the information to be replied, and may be a sequence number. For example, the information to be replied is {User: Are you happy today? Chat robot: A (A is an emoticon package), what about you? User: I am very happy}. It is determined that A is the target image; since A is in the second sentence of the information to be replied, the image position is 2.
The word-segment position refers to the position where the information word segment appears among all the word segments of the information to be replied. Continuing the above example, the information word segment "今天" ("today") is in the second position among all word segments, so its word-segment position is 2.
Through all of the pixels, the image vector of the target image can be generated accurately; through the information word segments, the word-segment vectors can be obtained quickly; and then, according to the image position and the word-segment positions, the information vector corresponding to the information to be replied can be generated accurately.
Specifically, the electronic device extracting the target image from the information to be replied includes:
acquiring, from the information to be replied, the information in the same format as a preset format as the target image.
The preset format may be any format indicating an image; for example, the preset format may be the JPG format or the PNG format.
Through the preset format, the target image can be quickly acquired from the information to be replied.
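A minimal sketch of extracting the target image by preset format, assuming the message is represented as a list of text fragments and image file names and that JPG/PNG are the preset formats named above:

```python
PRESET_FORMATS = (".jpg", ".png")  # preset image formats named in the description

def extract_target_images(message_items):
    """Return the items whose format matches a preset image format."""
    return [item for item in message_items
            if item.lower().endswith(PRESET_FORMATS)]
```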
Further, the electronic device generating the image vector of the target image according to all of the pixels includes:
acquiring the vector value corresponding to each pixel;
splicing the vector values according to the pixel position of each pixel in the target image to obtain the image vector.
For example, the target image has 10 pixels whose corresponding vector values are 0, 1, 0, 0, 1, 0, 0, 1, 1, 1; splicing the vector values according to the pixel positions yields the image vector [0, 1, 0, 0, 1, 0, 0, 1, 1, 1].
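The pixel-splicing example can be reproduced with a short sketch; the mapping from a pixel to its vector value is an assumption (a simple brightness threshold here), since the description does not specify how the per-pixel value is obtained:

```python
def pixel_vector_value(gray_level: int) -> int:
    # Hypothetical mapping: bright pixels -> 1, dark pixels -> 0.
    return 1 if gray_level >= 128 else 0

def image_vector(gray_pixels):
    """Splice the per-pixel vector values in pixel-position order."""
    return [pixel_vector_value(p) for p in gray_pixels]
```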
Further, the electronic device performs word segmentation on the processed information using the Jieba algorithm to obtain the information word segments.
Further, the electronic device acquires, from a vector mapping table, the vector corresponding to each information word segment as its word-segment vector.
S12: input the information vector into a pre-trained classification model to obtain a classification result and the result probability of the classification result, the classification result including a target result, the target result indicating that a reply expression is required.
In at least one embodiment of this application, the classification model is used to detect whether the information to be replied requires an expression reply.
The result probability refers to the probability with which the classification model classifies the information to be replied into the classification result.
The classification result may also include a characteristic result, which indicates that no reply expression is required.
In at least one embodiment of this application, before the information vector is input into the pre-trained classification model, the method further includes:
acquiring a preset learner, the preset learner including a fully connected layer;
acquiring historical sample data, the historical sample data including historical messages and user satisfaction;
dividing the historical sample data into training data and verification data;
adjusting the parameters in the fully connected layer using the training data to obtain a classification learner;
determining the accuracy of the classification learner based on the verification data;
if the accuracy is less than a preset accuracy, adjusting the classification learner according to the verification data until the accuracy of the classification learner is greater than or equal to the preset accuracy, thereby obtaining the classification model.
The preset learner further includes convolutional layers and a pooling layer, and each convolutional layer includes multiple convolution kernels of different sizes.
The fully connected layer is used to map the vector generated by the pooling layer.
By adjusting the parameters in the fully connected layer with the training data, the mapping accuracy of the classification learner can be improved; by then verifying the classification learner with the verification data, the classification accuracy of the classification model can be improved as a whole.
Specifically, the electronic device adjusting the parameters in the fully connected layer using the training data to obtain the classification learner includes:
for each piece of training data, inputting the historical message into the fully connected layer to obtain an output result;
generating a learning rate according to the user satisfaction and the output result;
determining the training data with the best learning rate as target training data;
adjusting the parameters according to the target training data to obtain the classification learner.
Through the above implementation, the learning rate in the fully connected layer can be increased, thereby improving the classification accuracy of the classification learner.
Specifically, the electronic device generating the learning rate according to the user satisfaction and the output result includes:
calculating the difference between the user satisfaction and the output result;
dividing the difference by the output result to obtain the learning rate.
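The learning-rate rule above (the difference between the user satisfaction and the output result, divided by the output result) is a one-liner:

```python
def learning_rate(user_satisfaction: float, output_result: float) -> float:
    """learning rate = (satisfaction - output) / output, per the steps above."""
    return (user_satisfaction - output_result) / output_result
```

For example, a satisfaction of 0.9 against an output of 0.6 yields a learning rate of 0.5.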
S13: if the classification result is the target result, detect whether the information to be replied contains user expression information to obtain a detection result.
In at least one embodiment of this application, the detection result is one of two outcomes: the information to be replied contains user expression information, or the information to be replied does not contain user expression information.
It should be noted that when the information to be replied contains user emoticons, the user depends heavily on emoticon packages, and the chat robot can use more expressions to communicate with the user; when the information to be replied does not contain user emoticons, the user's dependence on emoticon packages is low, and the chat robot should avoid using expressions to communicate with the user.
In at least one embodiment of this application, the electronic device detecting whether the information to be replied contains user expression information to obtain the detection result includes:
acquiring the input address of the target image;
determining the terminal corresponding to the input address as the input terminal of the target image, and acquiring the terminal number of the input terminal;
comparing the terminal number with all machine numbers in a preset terminal library;
if the terminal number differs from all of the machine numbers, determining the detection result to be that the information to be replied contains the user expression information.
The target image contains the expression information sent by the user and the expression information sent by the chat robot.
The preset terminal library contains the machine numbers of all chat robots.
Through the target log, the input address can be obtained quickly; through the input address, the input terminal can be determined accurately, and the terminal number can in turn be obtained accurately. Since each symbol in the input address need not be compared, the detection result can be determined quickly through the terminal number. In addition, by comparing the terminal number with all of the machine numbers, the user expression information sent by the user can be accurately extracted from the target image, which helps detect whether the information to be replied needs to be replied to with an expression.
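The terminal-number comparison above reduces to a set-membership test; the machine numbers below are hypothetical placeholders for the preset terminal library:

```python
PRESET_TERMINAL_LIBRARY = {"bot-001", "bot-002"}  # hypothetical chatbot machine numbers

def contains_user_expression(terminal_number: str) -> bool:
    """The expression was sent by a user if its terminal number matches no chatbot."""
    return terminal_number not in PRESET_TERMINAL_LIBRARY
```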
Specifically, the electronic device acquiring the input address of the target image includes:
acquiring, from the target log, the information indicating an address as the input address of the target image.
S14: generate a reply score according to the result probability and the detection result.
In at least one embodiment of this application, the reply score indicates the score value with which the information to be replied needs to be replied to through an emoticon package.
In at least one embodiment of this application, the electronic device generating the reply score according to the result probability and the detection result includes:
acquiring the first weight of the classification model;
determining the product of the result probability and the first weight as the first score of the information to be replied;
acquiring the detection value corresponding to the detection result, and acquiring the second weight of the user expression information;
determining the product of the detection value and the second weight as the second score of the information to be replied;
calculating the sum of the first score and the second score to obtain the reply score.
The detection value refers to the numerical value corresponding to the detection result; for example, if the detection result is that the information to be replied contains user expression information, the detection value is 1.
For example, the first weight of the classification model is 0.2 and the result probability is 0.8; detection result A is that the information to be replied contains the user expression information, the detection value corresponding to detection result A is 1, and the second weight is 0.8. After calculation, the first score is 0.16 and the second score is 0.8, so the reply score is 0.96.
As another example, detection result B is that the information to be replied does not contain the user expression information, and the detection value corresponding to detection result B is -1. After calculation, the first score is 0.16 and the second score is -0.8, so the reply score is -0.64.
Through the above implementation, whether the information to be replied requires an emoticon-package reply can be determined comprehensively from multiple dimensions, improving the determination accuracy.
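Both worked examples follow directly from the weighted-sum rule above:

```python
def reply_score(result_probability: float, first_weight: float,
                detection_value: int, second_weight: float) -> float:
    first_score = result_probability * first_weight    # classification dimension
    second_score = detection_value * second_weight     # detection dimension
    return first_score + second_score

# Detection result A (user expression present, detection value 1)  -> 0.96
# Detection result B (user expression absent, detection value -1)  -> -0.64
```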
S15: if the reply score is greater than a preset threshold, extract the feature information of the information to be replied.
In at least one embodiment of this application, the preset threshold can be customized; this application does not limit its value.
The feature information refers to information capable of characterizing the semantics of the information to be replied.
In at least one embodiment of this application, the electronic device extracting the feature information of the information to be replied includes:
generating a context feature vector set for each information word segment according to the word-segment vectors;
calculating the product of each word-segment vector in the context feature vector set and a first preset matrix to obtain multiple operation vectors of the information word segment, and calculating the average of the multiple operation vectors to obtain the intermediate vector of the information word segment;
multiplying the intermediate vector by a second preset matrix to obtain a target matrix, each column vector of the target matrix characterizing one feature of the information to be replied;
calculating the similarity between each column vector of the target matrix and the word-segment vectors;
determining the information word segment corresponding to the word-segment vector with the greatest similarity, together with the target image, as the feature information.
The first preset matrix and the second preset matrix are respectively pre-defined weight matrices.
Through the above implementation, the information word segments containing contextual semantics can be extracted from the information to be replied as the feature information, improving the accuracy of determining the feature information; meanwhile, since the target image better represents the user's emotions, determining the target image as part of the feature information facilitates emotion recognition of the information to be replied.
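A toy sketch of the extraction steps above, with 2-dimensional word-segment vectors and small stand-in matrices for the preset weight matrices (all numeric values here are hypothetical); cosine similarity is assumed, since the description does not name the similarity measure:

```python
from math import sqrt

WORD_VECTORS = {"today": [1.0, 0.2], "very": [0.3, 0.4], "happy": [0.1, 1.0]}
W1 = [[1.0, 0.0], [0.0, 1.0]]  # first preset matrix (hypothetical)
W2 = [[1.0, 0.1], [0.1, 1.0]]  # second preset matrix (hypothetical)

def mat_vec(matrix, vec):
    return [sum(a * b for a, b in zip(row, vec)) for row in matrix]

def mean_vec(vectors):
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(len(vectors[0]))]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na, nb = sqrt(sum(x * x for x in a)), sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def extract_feature_word():
    words = list(WORD_VECTORS)
    # 1. Context set of each word segment = vectors of the other segments;
    #    multiply each by W1 (operation vectors) and average (intermediate vector).
    intermediates = [mean_vec([mat_vec(W1, WORD_VECTORS[c]) for c in words if c != w])
                     for w in words]
    # 2. Transform each intermediate vector by W2; the resulting column
    #    vectors each characterize one feature of the message.
    columns = [mat_vec(W2, m) for m in intermediates]
    # 3. Return the word segment whose vector is most similar to any column.
    return max(words,
               key=lambda w: max(cosine(WORD_VECTORS[w], col) for col in columns))
```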
S16: perform emotion recognition on the feature information to obtain an emotion result, and perform intention recognition on the feature information to obtain an intention result.
In at least one embodiment of this application, the emotion result may be a positive emotion such as happiness, or a negative emotion such as unhappiness.
The intention result refers to the user's intention in the information to be replied.
In at least one embodiment of this application, the electronic device performs emotion recognition on the feature information through a pre-trained emotion recognition model to obtain the emotion result.
The training method of the emotion recognition model belongs to the prior art and is not repeated in this application.
In at least one embodiment of this application, the electronic device performing intention recognition on the feature information to obtain the intention result includes:
acquiring, from the word-segment vectors, the vector of the feature information as a feature vector;
inputting the feature vector into a pre-trained bidirectional long short-term memory (BiLSTM) network to obtain a semantic vector;
processing the semantic vector using a cascaded conditional random field (CRF) to obtain the intention result.
Through the BiLSTM network, the semantic information in the feature information can be obtained, and the intention result can then be identified accurately.
S17: select a matching expression from a preset expression library according to the emotion result and the intention result as the reply expression of the information to be replied.
It should be emphasized that, to further ensure the privacy and security of the reply expression, the reply expression may also be stored in a node of a blockchain.
In at least one embodiment of this application, the preset expression library stores multiple pre-defined expressions.
The reply expression refers to the expression with which the information to be replied needs to be replied to.
In at least one embodiment of this application, the electronic device selecting a matching expression from the preset expression library according to the emotion result and the intention result as the reply expression of the information to be replied includes:
selecting target-class expressions from the preset expression library according to the emotion result;
screening the target-class expressions for the expression matching the intention result as the reply expression of the information to be replied.
The target-class expressions refer to the expressions corresponding to the emotion result.
Through the above implementation, the reply expression can be accurately acquired from the preset expression library.
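The two-stage lookup above (emotion first, intention second) can be sketched over a toy expression library; the library contents and intention labels are hypothetical:

```python
PRESET_EXPRESSION_LIBRARY = {
    # emotion result -> {expression name: matching intention}
    "happy": {"thumbs_up": "encourage", "big_smile": "greet"},
    "unhappy": {"warm_hug": "comfort"},
}

def select_reply_expression(emotion_result: str, intention_result: str):
    target_class = PRESET_EXPRESSION_LIBRARY.get(emotion_result, {})  # step 1
    for name, intention in target_class.items():                      # step 2
        if intention == intention_result:
            return name
    return None  # no match: fall back to synthesizing a reply expression
```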
In at least one embodiment of this application, the method further includes:
if the preset expression library does not contain an expression matching the intention result, acquiring an arbitrary expression from the target-class expressions;
synthesizing the arbitrary expression with the intention result to obtain the reply expression.
The intention result may be text information; for example, the intention result is "无法做到" ("cannot be done").
Here the reply expression refers to an expression containing the text information (i.e., the intention result).
Through the above implementation, when the preset expression library does not store a corresponding expression, the reply expression can be synthesized automatically, improving comprehensiveness. In addition, since the reply expression contains the intention result, it can help the user accurately understand the meaning the reply expression conveys.
Specifically, the electronic device synthesizing the arbitrary expression with the intention result to obtain the reply expression includes:
entering the intention result at an arbitrary position of the arbitrary expression to obtain the reply expression.
The arbitrary position may be below the arbitrary expression or above the arbitrary expression.
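A stand-in sketch of the synthesis step: the description composites the intention text onto the expression image, which is modeled here with plain strings rather than image operations:

```python
def synthesize_reply_expression(expression: str, intention_result: str,
                                position: str = "below") -> str:
    """Enter the intention result below or above the arbitrary expression."""
    if position == "below":
        return f"[{expression}]\n{intention_result}"
    if position == "above":
        return f"{intention_result}\n[{expression}]"
    raise ValueError("position must be 'below' or 'above'")
```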
It can be seen from the above technical solutions that this application analyzes the information to be replied with the classification model and detects whether the information to be replied contains user expression information, then compares a reply score generated from the result probability and the detection result with a preset threshold to determine whether the information to be replied requires a reply expression. Because the information to be replied is analyzed from multiple dimensions, each dimension having its own weight, the accuracy of determining whether a reply expression is needed can be improved; further, when the reply score is greater than the preset threshold, the emotion and intention in the information to be replied are analyzed, so that an expression matching the emotion result and the intention result can be selected accurately.
As shown in FIG. 2, which is a functional module diagram of a preferred embodiment of the expression reply apparatus of this application, the expression reply apparatus 11 includes an acquisition unit 110, a generation unit 111, an input unit 112, a detection unit 113, an extraction unit 114, a recognition unit 115, a selection unit 116, a division unit 117, an adjustment unit 118, a determination unit 119, and a synthesis unit 120. A module/unit in this application refers to a series of computer-readable instruction segments that can be fetched by the processor 13 and can perform fixed functions, and that are stored in the memory 12. In this embodiment, the functions of the modules/units are described in detail in the following embodiments.
When a reply request is received, the acquisition unit 110 acquires the information to be replied according to the reply request.
In at least one embodiment of this application, the reply request is triggered upon receiving the user's input information. The information carried in the reply request includes, but is not limited to, a log number and the like.
The information to be replied refers to information that requires a reply, and may include, but is not limited to, the information currently input by the user, multiple rounds of dialogue between the user and the chat robot, and the like.
The information to be replied may be text information, image information, or voice information; this application does not limit its specific form.
In at least one embodiment of this application, the acquisition unit 110 acquiring the information to be replied according to the reply request includes:
parsing the message of the reply request to obtain the data information carried in the message;
acquiring, from the data information, the information indicating a log as the log number;
writing the log number into a preset template to obtain a query statement;
acquiring a log repository, and running the query statement in the log repository to obtain a target log;
determining the information carried by the method body in the target log as the information to be replied.
The data information includes, but is not limited to, a tag indicating a log, the log number, and the like.
The preset template is a pre-defined statement capable of information query; the preset template may be a structured query statement.
The log repository stores log information of multiple chat robots and users.
The method body refers to the dialogue information between the chat robot and the user.
By parsing the message, the data information can be obtained quickly, so that the target log can be retrieved quickly from the log repository according to the acquired log number, and the information to be replied can in turn be obtained quickly.
The generation unit 111 generates an information vector according to the information to be replied.
In at least one embodiment of this application, the information vector is a representation vector of the information to be replied.
In at least one embodiment of this application, the generation unit 111 generating the information vector according to the information to be replied includes:
extracting the target image from the information to be replied, and acquiring all pixels in the target image;
generating the image vector of the target image according to all of the pixels;
determining the information other than the target image in the information to be replied as information to be processed;
filtering the stop words in the information to be processed to obtain processed information;
performing word segmentation on the processed information to obtain information word segments, and acquiring the word-segment vectors of the information word segments;
determining the image position of the target image in the information to be replied, and determining the word-segment position of each information word segment in the information to be replied;
splicing the image vector and the word-segment vectors according to the image position and the word-segment positions to obtain the information vector.
The target image may include an emoticon package sent by either end in the information to be replied, where either end includes the user end and the chat robot.
The stop words include words whose part of speech is preposition and the like.
The image position refers to the position where the target image appears in the information to be replied, and may be a sequence number. For example, the information to be replied is {User: Are you happy today? Chat robot: A (A is an emoticon package), what about you? User: I am very happy}. It is determined that A is the target image; since A is in the second sentence of the information to be replied, the image position is 2.
The word-segment position refers to the position where the information word segment appears among all the word segments of the information to be replied. Continuing the above example, the information word segment "今天" ("today") is in the second position among all word segments, so its word-segment position is 2.
Through all of the pixels, the image vector of the target image can be generated accurately; through the information word segments, the word-segment vectors can be obtained quickly; and then, according to the image position and the word-segment positions, the information vector corresponding to the information to be replied can be generated accurately.
Specifically, the generation unit 111 extracting the target image from the information to be replied includes:
acquiring, from the information to be replied, the information in the same format as a preset format as the target image.
The preset format may be any format indicating an image; for example, the preset format may be the JPG format or the PNG format.
Through the preset format, the target image can be quickly acquired from the information to be replied.
Further, the generation unit 111 generating the image vector of the target image according to all of the pixels includes:
acquiring the vector value corresponding to each pixel;
splicing the vector values according to the pixel position of each pixel in the target image to obtain the image vector.
For example, the target image has 10 pixels whose corresponding vector values are 0, 1, 0, 0, 1, 0, 0, 1, 1, 1; splicing the vector values according to the pixel positions yields the image vector [0, 1, 0, 0, 1, 0, 0, 1, 1, 1].
Further, the generation unit 111 performs word segmentation on the processed information using the Jieba algorithm to obtain the information word segments.
Further, the generation unit 111 acquires, from a vector mapping table, the vector corresponding to each information word segment as its word-segment vector.
The input unit 112 inputs the information vector into a pre-trained classification model to obtain a classification result and the result probability of the classification result, the classification result including a target result, the target result indicating that a reply expression is required.
In at least one embodiment of this application, the classification model is used to detect whether the information to be replied requires an expression reply.
The result probability refers to the probability with which the classification model classifies the information to be replied into the classification result.
The classification result may also include a characteristic result, which indicates that no reply expression is required.
In at least one embodiment of this application, before the information vector is input into the pre-trained classification model, the acquisition unit 110 acquires a preset learner, the preset learner including a fully connected layer;
the acquisition unit 110 acquires historical sample data, the historical sample data including historical messages and user satisfaction;
the division unit 117 divides the historical sample data into training data and verification data;
the adjustment unit 118 adjusts the parameters in the fully connected layer using the training data to obtain a classification learner;
the determination unit 119 determines the accuracy of the classification learner based on the verification data;
if the accuracy is less than a preset accuracy, the adjustment unit 118 adjusts the classification learner according to the verification data until the accuracy of the classification learner is greater than or equal to the preset accuracy, thereby obtaining the classification model.
The preset learner further includes convolutional layers and a pooling layer, and each convolutional layer includes multiple convolution kernels of different sizes.
The fully connected layer is used to map the vector generated by the pooling layer.
By adjusting the parameters in the fully connected layer with the training data, the mapping accuracy of the classification learner can be improved; by then verifying the classification learner with the verification data, the classification accuracy of the classification model can be improved as a whole.
Specifically, the adjustment unit 118 adjusting the parameters in the fully connected layer using the training data to obtain the classification learner includes:
for each piece of training data, inputting the historical message into the fully connected layer to obtain an output result;
generating a learning rate according to the user satisfaction and the output result;
determining the training data with the best learning rate as target training data;
adjusting the parameters according to the target training data to obtain the classification learner.
Through the above implementation, the learning rate in the fully connected layer can be increased, thereby improving the classification accuracy of the classification learner.
Specifically, the adjustment unit 118 generating the learning rate according to the user satisfaction and the output result includes:
calculating the difference between the user satisfaction and the output result;
dividing the difference by the output result to obtain the learning rate.
If the classification result is the target result, the detection unit 113 detects whether the information to be replied contains user expression information to obtain a detection result.
In at least one embodiment of this application, the detection result is one of two outcomes: the information to be replied contains user expression information, or the information to be replied does not contain user expression information.
It should be noted that when the information to be replied contains user emoticons, the user depends heavily on emoticon packages, and the chat robot can use more expressions to communicate with the user; when the information to be replied does not contain user emoticons, the user's dependence on emoticon packages is low, and the chat robot should avoid using expressions to communicate with the user.
In at least one embodiment of this application, the detection unit 113 detecting whether the information to be replied contains user expression information to obtain the detection result includes:
acquiring the input address of the target image;
determining the terminal corresponding to the input address as the input terminal of the target image, and acquiring the terminal number of the input terminal;
comparing the terminal number with all machine numbers in a preset terminal library;
if the terminal number differs from all of the machine numbers, determining the detection result to be that the information to be replied contains the user expression information.
The target image contains the expression information sent by the user and the expression information sent by the chat robot.
The preset terminal library contains the machine numbers of all chat robots.
Through the target log, the input address can be obtained quickly; through the input address, the input terminal can be determined accurately, and the terminal number can in turn be obtained accurately. Since each symbol in the input address need not be compared, the detection result can be determined quickly through the terminal number. In addition, by comparing the terminal number with all of the machine numbers, the user expression information sent by the user can be accurately extracted from the target image, which helps detect whether the information to be replied needs to be replied to with an expression.
Specifically, the detection unit 113 acquiring the input address of the target image includes:
acquiring, from the target log, the information indicating an address as the input address of the target image.
The generation unit 111 generates a reply score according to the result probability and the detection result.
In at least one embodiment of this application, the reply score indicates the score value with which the information to be replied needs to be replied to through an emoticon package.
In at least one embodiment of this application, the generation unit 111 generating the reply score according to the result probability and the detection result includes:
acquiring the first weight of the classification model;
determining the product of the result probability and the first weight as the first score of the information to be replied;
acquiring the detection value corresponding to the detection result, and acquiring the second weight of the user expression information;
determining the product of the detection value and the second weight as the second score of the information to be replied;
calculating the sum of the first score and the second score to obtain the reply score.
The detection value refers to the numerical value corresponding to the detection result; for example, if the detection result is that the information to be replied contains user expression information, the detection value is 1.
For example, the first weight of the classification model is 0.2 and the result probability is 0.8; detection result A is that the information to be replied contains the user expression information, the detection value corresponding to detection result A is 1, and the second weight is 0.8. After calculation, the first score is 0.16 and the second score is 0.8, so the reply score is 0.96.
As another example, detection result B is that the information to be replied does not contain the user expression information, and the detection value corresponding to detection result B is -1. After calculation, the first score is 0.16 and the second score is -0.8, so the reply score is -0.64.
Through the above implementation, whether the information to be replied requires an emoticon-package reply can be determined comprehensively from multiple dimensions, improving the determination accuracy.
If the reply score is greater than a preset threshold, the extraction unit 114 extracts the feature information of the information to be replied.
In at least one embodiment of this application, the preset threshold can be customized; this application does not limit its value.
The feature information refers to information capable of characterizing the semantics of the information to be replied.
In at least one embodiment of this application, the extraction unit 114 extracting the feature information of the information to be replied includes:
generating a context feature vector set for each information word segment according to the word-segment vectors;
calculating the product of each word-segment vector in the context feature vector set and a first preset matrix to obtain multiple operation vectors of the information word segment, and calculating the average of the multiple operation vectors to obtain the intermediate vector of the information word segment;
multiplying the intermediate vector by a second preset matrix to obtain a target matrix, each column vector of the target matrix characterizing one feature of the information to be replied;
calculating the similarity between each column vector of the target matrix and the word-segment vectors;
determining the information word segment corresponding to the word-segment vector with the greatest similarity, together with the target image, as the feature information.
The first preset matrix and the second preset matrix are respectively pre-defined weight matrices.
Through the above implementation, the information word segments containing contextual semantics can be extracted from the information to be replied as the feature information, improving the accuracy of determining the feature information; meanwhile, since the target image better represents the user's emotions, determining the target image as part of the feature information facilitates emotion recognition of the information to be replied.
The recognition unit 115 performs emotion recognition on the feature information to obtain an emotion result, and performs intention recognition on the feature information to obtain an intention result.
In at least one embodiment of this application, the emotion result may be a positive emotion such as happiness, or a negative emotion such as unhappiness.
The intention result refers to the user's intention in the information to be replied.
In at least one embodiment of this application, the recognition unit 115 performs emotion recognition on the feature information through a pre-trained emotion recognition model to obtain the emotion result.
The training method of the emotion recognition model belongs to the prior art and is not repeated in this application.
In at least one embodiment of this application, the recognition unit 115 performing intention recognition on the feature information to obtain the intention result includes:
acquiring, from the word-segment vectors, the vector of the feature information as a feature vector;
inputting the feature vector into a pre-trained bidirectional long short-term memory (BiLSTM) network to obtain a semantic vector;
processing the semantic vector using a cascaded conditional random field (CRF) to obtain the intention result.
Through the BiLSTM network, the semantic information in the feature information can be obtained, and the intention result can then be identified accurately.
The selection unit 116 selects a matching expression from a preset expression library according to the emotion result and the intention result as the reply expression of the information to be replied.
It should be emphasized that, to further ensure the privacy and security of the reply expression, the reply expression may also be stored in a node of a blockchain.
In at least one embodiment of this application, the preset expression library stores multiple pre-defined expressions.
The reply expression refers to the expression with which the information to be replied needs to be replied to.
In at least one embodiment of this application, the selection unit 116 selecting a matching expression from the preset expression library according to the emotion result and the intention result as the reply expression of the information to be replied includes:
selecting target-class expressions from the preset expression library according to the emotion result;
screening the target-class expressions for the expression matching the intention result as the reply expression of the information to be replied.
The target-class expressions refer to the expressions corresponding to the emotion result.
Through the above implementation, the reply expression can be accurately acquired from the preset expression library.
In at least one embodiment of this application, if the preset expression library does not contain an expression matching the intention result, the acquisition unit 110 acquires an arbitrary expression from the target-class expressions;
the synthesis unit 120 synthesizes the arbitrary expression with the intention result to obtain the reply expression.
The intention result may be text information; for example, the intention result is "无法做到" ("cannot be done").
Here the reply expression refers to an expression containing the text information (i.e., the intention result).
Through the above implementation, when the preset expression library does not store a corresponding expression, the reply expression can be synthesized automatically, improving comprehensiveness. In addition, since the reply expression contains the intention result, it can help the user accurately understand the meaning the reply expression conveys.
Specifically, the synthesis unit 120 synthesizing the arbitrary expression with the intention result to obtain the reply expression includes:
entering the intention result at an arbitrary position of the arbitrary expression to obtain the reply expression.
The arbitrary position may be below the arbitrary expression or above the arbitrary expression.
It can be seen from the above technical solutions that this application analyzes the information to be replied with the classification model and detects whether the information to be replied contains user expression information, then compares a reply score generated from the result probability and the detection result with a preset threshold to determine whether the information to be replied requires a reply expression. Because the information to be replied is analyzed from multiple dimensions, each dimension having its own weight, the accuracy of determining whether a reply expression is needed can be improved; further, when the reply score is greater than the preset threshold, the emotion and intention in the information to be replied are analyzed, so that an expression matching the emotion result and the intention result can be selected accurately.
As shown in FIG. 3, which is a schematic structural diagram of an electronic device implementing a preferred embodiment of the expression reply method of this application.
In one embodiment of this application, the electronic device 1 includes, but is not limited to, a memory 12, a processor 13, and computer-readable instructions stored in the memory 12 and executable on the processor 13, such as an expression reply program.
Those skilled in the art will understand that the schematic diagram is only an example of the electronic device 1 and does not constitute a limitation on it; the electronic device 1 may include more or fewer components than shown, combine certain components, or use different components; for example, the electronic device 1 may also include input and output devices, network access devices, buses, and the like.
The processor 13 may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor. The processor 13 is the computing core and control center of the electronic device 1; it connects all parts of the electronic device 1 through various interfaces and lines, and executes the operating system of the electronic device 1 as well as the various installed applications, program code, and so on.
Exemplarily, the computer-readable instructions may be divided into one or more modules/units, which are stored in the memory 12 and executed by the processor 13 to complete this application. The one or more modules/units may be a series of computer-readable instruction segments capable of accomplishing specific functions, the instruction segments being used to describe the execution process of the computer-readable instructions in the electronic device 1. For example, the computer-readable instructions may be divided into the acquisition unit 110, the generation unit 111, the input unit 112, the detection unit 113, the extraction unit 114, the recognition unit 115, the selection unit 116, the division unit 117, the adjustment unit 118, the determination unit 119, and the synthesis unit 120.
The memory 12 can be used to store the computer-readable instructions and/or modules; the processor 13 implements the various functions of the electronic device 1 by running or executing the computer-readable instructions and/or modules stored in the memory 12 and invoking the data stored in the memory 12. The memory 12 may mainly comprise a program storage area and a data storage area, wherein the program storage area may store the operating system and the application programs required by at least one function (such as a sound playback function or an image playback function), and the data storage area may store data created according to the use of the electronic device. The memory 12 may include non-volatile and volatile memory, for example: a hard disk, an internal memory, a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, a flash card, at least one magnetic disk storage device, a flash memory device, or another storage device.
The memory 12 may be an external memory and/or an internal memory of the electronic device 1. Further, the memory 12 may be a memory in physical form, such as a memory stick or a TF card (Trans-flash Card).
If the integrated modules/units of the electronic device 1 are realized in the form of software functional units and sold or used as independent products, they can be stored in a computer-readable storage medium, which may be a non-volatile storage medium or a volatile storage medium. Based on this understanding, all or part of the processes in the methods of the above embodiments of this application can also be completed by instructing the relevant hardware through computer-readable instructions, which can be stored in a computer-readable storage medium; when executed by a processor, the computer-readable instructions can realize the steps of the above method embodiments.
The computer-readable instructions include computer-readable instruction code, which may be in source code form, object code form, an executable file, some intermediate form, or the like. The computer-readable medium may include: any entity or device capable of carrying the computer-readable instruction code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM), or a random access memory (RAM).
The blockchain referred to in this application is a new application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms, and encryption algorithms. A blockchain is essentially a decentralized database, a chain of data blocks generated in association with each other using cryptographic methods; each data block contains a batch of network transaction information used to verify the validity of its information (anti-counterfeiting) and to generate the next block. A blockchain may include a blockchain underlying platform, a platform product service layer, an application service layer, and the like.
结合图1,所述电子设备1中的所述存储器12存储计算机可读指令实现一种表情回复方法,所述处理器13可执行所述计算机可读指令从而实现:
当接收到回复请求时,根据所述回复请求获取待回复信息;
根据所述待回复信息生成信息向量;
将所述信息向量输入至预先训练好的分类模型中,得到分类结果及所述分类结果的结果概率,所述分类结果包括目标结果,所述目标结果用于指示需要回复表情;
若所述分类结果为所述目标结果,检测所述待回复信息中是否包含用户表情信息,得到检测结果;
根据所述结果概率及所述检测结果生成回复分数;
若所述回复分数大于预设阈值,提取所述待回复信息的特征信息;
对所述特征信息进行情感识别,得到情感结果,并对所述特征信息进行意图识别,得到意图结果;
根据所述情感结果及所述意图结果从预设表情库中选取匹配的表情作为所述待回复信息的回复表情。
具体地,所述处理器13对上述计算机可读指令的具体实现方法可参考图1对应实施例中相关步骤的描述,在此不赘述。
在本申请所提供的几个实施例中,应该理解到,所揭露的系统,装置和方法,可以通过其它的方式实现。例如,以上所描述的装置实施例仅仅是示意性的,例如,所述模块的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式。
所述计算机可读存储介质上存储有计算机可读指令,其中,所述计算机可读指令被处理器13执行时用以实现以下步骤:
当接收到回复请求时,根据所述回复请求获取待回复信息;
根据所述待回复信息生成信息向量;
将所述信息向量输入至预先训练好的分类模型中,得到分类结果及所述分类结果的结果概率,所述分类结果包括目标结果,所述目标结果用于指示需要回复表情;
若所述分类结果为所述目标结果,检测所述待回复信息中是否包含用户表情信息,得到检测结果;
根据所述结果概率及所述检测结果生成回复分数;
若所述回复分数大于预设阈值,提取所述待回复信息的特征信息;
对所述特征信息进行情感识别,得到情感结果,并对所述特征信息进行意图识别,得到意图结果;
根据所述情感结果及所述意图结果从预设表情库中选取匹配的表情作为所述待回复信息的回复表情。
所述作为分离部件说明的模块可以是或者也可以不是物理上分开的,作为模块显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部模块来实现本实施例方案的目的。
另外,在本申请各个实施例中的各功能模块可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。上述集成的单元既可以采用硬件的形式实现,也可以采用硬件加软件功能模块的形式实现。
因此,无论从哪一点来看,均应将实施例看作是示范性的,而且是非限制性的,本申请的范围由所附权利要求而不是上述说明限定,因此旨在将落在权利要求的等同要件的含义和范围内的所有变化涵括在本申请内。不应将权利要求中的任何附图标记视为限制所涉及的权利要求。
此外,显然“包括”一词不排除其他单元或步骤,单数不排除复数。所述的多个单元或装置也可以由一个单元或装置通过软件或者硬件来实现。第一、第二等词语用来表示名称,而并不表示任何特定的顺序。
最后应说明的是,以上实施例仅用以说明本申请的技术方案而非限制,尽管参照较佳实施例对本申请进行了详细说明,本领域的普通技术人员应当理解,可以对本申请的技术方案进行修改或等同替换,而不脱离本申请技术方案的精神和范围。

Claims (20)

  1. 一种表情回复方法,其中,所述表情回复方法包括:
    当接收到回复请求时,根据所述回复请求获取待回复信息;
    根据所述待回复信息生成信息向量;
    将所述信息向量输入至预先训练好的分类模型中,得到分类结果及所述分类结果的结果概率,所述分类结果包括目标结果,所述目标结果用于指示需要回复表情;
    若所述分类结果为所述目标结果,检测所述待回复信息中是否包含用户表情信息,得到检测结果;
    根据所述结果概率及所述检测结果生成回复分数;
    若所述回复分数大于预设阈值,提取所述待回复信息的特征信息;
    对所述特征信息进行情感识别,得到情感结果,并对所述特征信息进行意图识别,得到意图结果;
    根据所述情感结果及所述意图结果从预设表情库中选取匹配的表情作为所述待回复信息的回复表情。
  2. 根据权利要求1所述的表情回复方法,其中,所述根据所述待回复信息生成信息向量包括:
    提取所述待回复信息中的目标图像,并获取所述目标图像中的所有像素;
    根据所述所有像素生成所述目标图像的图像向量;
    将所述待回复信息中除所述目标图像外的信息确定为待处理信息;
    过滤所述待处理信息中的停用词,得到已处理信息;
    对所述已处理信息进行分词处理,得到信息分词,并获取所述信息分词的分词向量;
    确定所述目标图像在所述待回复信息中的图像位置,并确定所述信息分词在所述待回复信息中的分词位置;
    根据所述图像位置及所述分词位置拼接所述图像向量及所述分词向量,得到所述信息向量。
  3. 根据权利要求2所述的表情回复方法,其中,所述检测所述待回复信息中是否包含用户表情信息,得到检测结果包括:
    获取所述目标图像的输入地址;
    将与所述输入地址对应的终端确定为所述目标图像的输入终端,并获取所述输入终端的终端编号;
    将所述终端编号与预设终端库中所有机器编号进行比较;
    若所述终端编号与所述所有机器编号均不相同,将所述检测结果确定为所述待回复信息中包含所述用户表情信息。
  4. 根据权利要求2所述的表情回复方法,其中,所述提取所述待回复信息的特征信息包括:
    根据所述分词向量生成每个信息分词的上下文特征向量集;
    计算所述上下文特征向量集中每个分词向量与第一预设矩阵的乘积,得到所述信息分词的多个运算向量,并计算所述多个运算向量的平均值,得到所述信息分词的中间向量;
    将所述中间向量点乘第二预设矩阵,得到目标矩阵,所述目标矩阵中每列向量表征所述待回复信息的每个特征;
    计算所述目标矩阵中每列向量与所述分词向量的相似度;
    将所述相似度最大的分词向量对应的信息分词及所述目标图像确定为所述特征信息。
  5. 根据权利要求2所述的表情回复方法,其中,所述对所述特征信息进行意图识别,得到意图结果包括:
    从所述分词向量中获取所述特征信息的向量作为特征向量;
    将所述特征向量输入至预先训练好的双向长短期记忆网络中,得到语义向量;
    利用层叠条件随机场对所述语义向量进行处理,得到所述意图结果。
  6. 根据权利要求1所述的表情回复方法,其中,在将所述信息向量输入至预先训练好的分类模型中之前,所述方法还包括:
    获取预设学习器,所述预设学习器中包括全连接层;
    获取历史样本数据,所述历史样本数据中包括历史消息、用户满意度;
    将所述历史样本数据划分为训练数据及验证数据;
    利用所述训练数据调整所述全连接层中的参数,得到分类学习器;
    基于所述验证数据确定所述分类学习器的准确率;
    若所述准确率小于预设准确度,根据所述验证数据调整所述分类学习器,直至所述分类学习器的准确率大于或者等于所述预设准确度,得到所述分类模型。
  7. 根据权利要求1所述的表情回复方法,其中,所述根据所述结果概率及所述检测结果生成回复分数包括:
    获取所述分类模型的第一权值;
    将所述结果概率及所述第一权值的乘积确定为所述待回复信息的第一分数;
    获取与所述检测结果对应的检测值,并获取所述用户表情信息的第二权值;
    将所述检测值及所述第二权值的乘积确定为所述待回复信息的第二分数;
    计算所述第一分数与所述第二分数的总和,得到所述回复分数。
  8. 一种表情回复装置,其中,所述表情回复装置包括:
    获取单元,用于当接收到回复请求时,根据所述回复请求获取待回复信息;
    生成单元,用于根据所述待回复信息生成信息向量;
    输入单元,用于将所述信息向量输入至预先训练好的分类模型中,得到分类结果及所述分类结果的结果概率,所述分类结果包括目标结果,所述目标结果用于指示需要回复表情;
    检测单元,用于若所述分类结果为所述目标结果,检测所述待回复信息中是否包含用户表情信息,得到检测结果;
    所述生成单元,还用于根据所述结果概率及所述检测结果生成回复分数;
    提取单元,用于若所述回复分数大于预设阈值,提取所述待回复信息的特征信息;
    识别单元,用于对所述特征信息进行情感识别,得到情感结果,并对所述特征信息进行意图识别,得到意图结果;
    选取单元,用于根据所述情感结果及所述意图结果从预设表情库中选取匹配的表情作为所述待回复信息的回复表情。
  9. 一种电子设备,其中,所述电子设备包括处理器和存储器,所述处理器用于执行存储器中存储的至少一个计算机可读指令以实现以下步骤:
    当接收到回复请求时,根据所述回复请求获取待回复信息;
    根据所述待回复信息生成信息向量;
    将所述信息向量输入至预先训练好的分类模型中,得到分类结果及所述分类结果的结果概率,所述分类结果包括目标结果,所述目标结果用于指示需要回复表情;
    若所述分类结果为所述目标结果,检测所述待回复信息中是否包含用户表情信息,得到检测结果;
    根据所述结果概率及所述检测结果生成回复分数;
    若所述回复分数大于预设阈值,提取所述待回复信息的特征信息;
    对所述特征信息进行情感识别,得到情感结果,并对所述特征信息进行意图识别,得到意图结果;
    根据所述情感结果及所述意图结果从预设表情库中选取匹配的表情作为所述待回复信息的回复表情。
  10. 根据权利要求9所述的电子设备,其中,在所述根据所述待回复信息生成信息向量时,所述处理器执行所述至少一个计算机可读指令以实现以下步骤:
    提取所述待回复信息中的目标图像,并获取所述目标图像中的所有像素;
    根据所述所有像素生成所述目标图像的图像向量;
    将所述待回复信息中除所述目标图像外的信息确定为待处理信息;
    过滤所述待处理信息中的停用词,得到已处理信息;
    对所述已处理信息进行分词处理,得到信息分词,并获取所述信息分词的分词向量;
    确定所述目标图像在所述待回复信息中的图像位置,并确定所述信息分词在所述待回复信息中的分词位置;
    根据所述图像位置及所述分词位置拼接所述图像向量及所述分词向量,得到所述信息向量。
  11. 根据权利要求10所述的电子设备,其中,在所述检测所述待回复信息中是否包含用户表情信息,得到检测结果时,所述处理器执行所述至少一个计算机可读指令以实现以下步骤:
    获取所述目标图像的输入地址;
    将与所述输入地址对应的终端确定为所述目标图像的输入终端,并获取所述输入终端的终端编号;
    将所述终端编号与预设终端库中所有机器编号进行比较;
    若所述终端编号与所述所有机器编号均不相同,将所述检测结果确定为所述待回复信息中包含所述用户表情信息。
  12. 根据权利要求10所述的电子设备,其中,在所述提取所述待回复信息的特征信息时,所述处理器执行所述至少一个计算机可读指令以实现以下步骤:
    根据所述分词向量生成每个信息分词的上下文特征向量集;
    计算所述上下文特征向量集中每个分词向量与第一预设矩阵的乘积,得到所述信息分词的多个运算向量,并计算所述多个运算向量的平均值,得到所述信息分词的中间向量;
    将所述中间向量点乘第二预设矩阵,得到目标矩阵,所述目标矩阵中每列向量表征所述待回复信息的每个特征;
    计算所述目标矩阵中每列向量与所述分词向量的相似度;
    将所述相似度最大的分词向量对应的信息分词及所述目标图像确定为所述特征信息。
  13. 根据权利要求9所述的电子设备,其中,在将所述信息向量输入至预先训练好的分类模型中之前,所述处理器执行所述至少一个计算机可读指令还用以实现以下步骤:
    获取预设学习器,所述预设学习器中包括全连接层;
    获取历史样本数据,所述历史样本数据中包括历史消息、用户满意度;
    将所述历史样本数据划分为训练数据及验证数据;
    利用所述训练数据调整所述全连接层中的参数,得到分类学习器;
    基于所述验证数据确定所述分类学习器的准确率;
    若所述准确率小于预设准确度,根据所述验证数据调整所述分类学习器,直至所述分类学习器的准确率大于或者等于所述预设准确度,得到所述分类模型。
  14. 根据权利要求9所述的电子设备,其中,在所述根据所述结果概率及所述检测结果生成回复分数时,所述处理器执行所述至少一个计算机可读指令以实现以下步骤:
    获取所述分类模型的第一权值;
    将所述结果概率及所述第一权值的乘积确定为所述待回复信息的第一分数;
    获取与所述检测结果对应的检测值,并获取所述用户表情信息的第二权值;
    将所述检测值及所述第二权值的乘积确定为所述待回复信息的第二分数;
    计算所述第一分数与所述第二分数的总和,得到所述回复分数。
  15. 一种计算机可读存储介质,其中,所述计算机可读存储介质存储有至少一个计算机可读指令,所述至少一个计算机可读指令被处理器执行时实现以下步骤:
    当接收到回复请求时,根据所述回复请求获取待回复信息;
    根据所述待回复信息生成信息向量;
    将所述信息向量输入至预先训练好的分类模型中,得到分类结果及所述分类结果的结果概率,所述分类结果包括目标结果,所述目标结果用于指示需要回复表情;
    若所述分类结果为所述目标结果,检测所述待回复信息中是否包含用户表情信息,得到检测结果;
    根据所述结果概率及所述检测结果生成回复分数;
    若所述回复分数大于预设阈值,提取所述待回复信息的特征信息;
    对所述特征信息进行情感识别,得到情感结果,并对所述特征信息进行意图识别,得到意图结果;
    根据所述情感结果及所述意图结果从预设表情库中选取匹配的表情作为所述待回复信息的回复表情。
  16. 根据权利要求15所述的存储介质,其中,在所述根据所述待回复信息生成信息向量时,所述至少一个计算机可读指令被处理器执行以实现以下步骤:
    提取所述待回复信息中的目标图像,并获取所述目标图像中的所有像素;
    根据所述所有像素生成所述目标图像的图像向量;
    将所述待回复信息中除所述目标图像外的信息确定为待处理信息;
    过滤所述待处理信息中的停用词,得到已处理信息;
    对所述已处理信息进行分词处理,得到信息分词,并获取所述信息分词的分词向量;
    确定所述目标图像在所述待回复信息中的图像位置,并确定所述信息分词在所述待回复信息中的分词位置;
    根据所述图像位置及所述分词位置拼接所述图像向量及所述分词向量,得到所述信息向量。
  17. 根据权利要求16所述的存储介质,其中,在所述检测所述待回复信息中是否包含用户表情信息,得到检测结果时,所述至少一个计算机可读指令被处理器执行以实现以下步骤:
    获取所述目标图像的输入地址;
    将与所述输入地址对应的终端确定为所述目标图像的输入终端,并获取所述输入终端的终端编号;
    将所述终端编号与预设终端库中所有机器编号进行比较;
    若所述终端编号与所述所有机器编号均不相同,将所述检测结果确定为所述待回复信息中包含所述用户表情信息。
  18. 根据权利要求16所述的存储介质,其中,在所述提取所述待回复信息的特征信息时,所述至少一个计算机可读指令被处理器执行以实现以下步骤:
    根据所述分词向量生成每个信息分词的上下文特征向量集;
    计算所述上下文特征向量集中每个分词向量与第一预设矩阵的乘积,得到所述信息分词的多个运算向量,并计算所述多个运算向量的平均值,得到所述信息分词的中间向量;
    将所述中间向量点乘第二预设矩阵,得到目标矩阵,所述目标矩阵中每列向量表征所述待回复信息的每个特征;
    计算所述目标矩阵中每列向量与所述分词向量的相似度;
    将所述相似度最大的分词向量对应的信息分词及所述目标图像确定为所述特征信息。
  19. 根据权利要求16所述的存储介质,其中,在所述对所述特征信息进行意图识别,得到意图结果时,所述至少一个计算机可读指令被处理器执行以实现以下步骤:
    从所述分词向量中获取所述特征信息的向量作为特征向量;
    将所述特征向量输入至预先训练好的双向长短期记忆网络中,得到语义向量;
    利用层叠条件随机场对所述语义向量进行处理,得到所述意图结果。
  20. 根据权利要求15所述的存储介质,其中,在将所述信息向量输入至预先训练好的分类模型中之前,所述至少一个计算机可读指令被处理器执行还用以实现以下步骤:
    获取预设学习器,所述预设学习器中包括全连接层;
    获取历史样本数据,所述历史样本数据中包括历史消息、用户满意度;
    将所述历史样本数据划分为训练数据及验证数据;
    利用所述训练数据调整所述全连接层中的参数,得到分类学习器;
    基于所述验证数据确定所述分类学习器的准确率;
    若所述准确率小于预设准确度,根据所述验证数据调整所述分类学习器,直至所述分类学习器的准确率大于或者等于所述预设准确度,得到所述分类模型。
PCT/CN2022/071318 2021-06-10 2022-01-11 表情回复方法、装置、设备及存储介质 WO2022257452A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110645856.4 2021-06-10
CN202110645856.4A CN113094478B (zh) 2021-06-10 2021-06-10 表情回复方法、装置、设备及存储介质

Publications (1)

Publication Number Publication Date
WO2022257452A1 (zh) 2022-12-15



