CN113822062A - Text data processing method, device and equipment and readable storage medium - Google Patents
- Publication number
- CN113822062A (application number CN202111112498.7A)
- Authority
- CN
- China
- Prior art keywords
- user
- data
- score
- preset
- sensitive
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/284—Lexical analysis, e.g. tokenisation or collocates
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
- G06F40/35—Discourse or dialogue representation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/063—Operations research, analysis or management
- G06Q10/0639—Performance analysis of employees; Performance analysis of enterprise or organisation operations
- G06Q10/06393—Score-carding, benchmarking or key performance indicator [KPI] analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/10—Services
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Business, Economics & Management (AREA)
- Physics & Mathematics (AREA)
- Human Resources & Organizations (AREA)
- General Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Computational Linguistics (AREA)
- General Engineering & Computer Science (AREA)
- Tourism & Hospitality (AREA)
- Strategic Management (AREA)
- Economics (AREA)
- Software Systems (AREA)
- Biophysics (AREA)
- Computing Systems (AREA)
- Molecular Biology (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Evolutionary Computation (AREA)
- Data Mining & Analysis (AREA)
- Development Economics (AREA)
- Educational Administration (AREA)
- Mathematical Physics (AREA)
- Entrepreneurship & Innovation (AREA)
- General Business, Economics & Management (AREA)
- Biomedical Technology (AREA)
- Marketing (AREA)
- Life Sciences & Earth Sciences (AREA)
- Quality & Reliability (AREA)
- Operations Research (AREA)
- Game Theory and Decision Science (AREA)
- Primary Health Care (AREA)
- Information Transfer Between Computers (AREA)
Abstract
The embodiments of the application disclose a text data processing method, apparatus, device and readable storage medium, relating to natural language processing technology in the field of artificial intelligence. The method comprises: preprocessing chat data sent by a client, and determining whether sensitive data exists in the chat data, wherein the chat data comprises chat data between a user and an intelligent robot; if sensitive data exists, acquiring sensitive information in the sensitive data, matching the sensitive information with a preset portrait library to obtain a matching result, and determining the user portrait of the user according to the matching result; and sending the user portrait of the user to a customer service end so that the customer service end performs service processing based on the user portrait. By adopting the embodiments of the application, data processing efficiency can be improved.
Description
Technical Field
The present application relates to the technical field of natural language processing in artificial intelligence, and in particular, to a text data processing method, apparatus, device, and readable storage medium.
Background
Intelligent customer service robots play an important role in relieving the workload of human customer service agents. However, existing intelligent customer service has limited functions: it only answers the most basic questions for the customer and hands the conversation over to a human agent, and it cannot provide substantial assistance in the communication between the human agent and the customer.
The prior art provides technical schemes for switching from a robot to a human customer service agent, but it does not solve the problem of how to assist the human agent after the switch. The agent steps in without knowing the current state of the customer and is easily affected by the customer's agitated emotions, which can lead to conflict. Therefore, how to optimize the chat between the customer and the customer service agent, improve data processing efficiency, and thereby improve service processing efficiency is an urgent problem to be solved.
Disclosure of Invention
The embodiment of the application provides a text data processing method, a text data processing device, text data processing equipment and a readable storage medium, and the data processing efficiency can be improved.
In a first aspect, the present application provides a text data processing method, including:
preprocessing chatting data sent by a client, and determining whether sensitive data exists in the chatting data, wherein the chatting data comprises chatting data between a user and an intelligent robot;
if sensitive data exists, acquiring sensitive information in the sensitive data, matching the sensitive information with a preset portrait library to obtain a matching result, and determining the user portrait of the user according to the matching result;
and sending the user portrait of the user to a customer service end so that the customer service end performs service processing based on the user portrait.
With reference to the first aspect, in a possible implementation manner, the preprocessing the chat data sent by the client to determine whether sensitive data exists in the chat data includes:
splitting the chatting data sent by the client based on the target classification model to obtain at least one word;
performing part-of-speech analysis on the at least one word, and determining the part-of-speech category of each word;
performing semantic analysis on the chatting data based on the target classification model, and determining key information in the chatting data;
and determining whether sensitive data exists in the chat data or not based on the matching degree between the part of speech category of each word and the key information in the chat data.
With reference to the first aspect, in a possible implementation manner, the sensitive information in the sensitive data includes a number of occurrences of the sensitive data and/or a sensitivity level of the sensitive data, where the sensitivity level of the sensitive data is used to indicate a sensitivity degree of the sensitive data;
the matching the sensitive information with a preset portrait library to obtain a matching result, and determining the user portrait of the user according to the matching result, includes:
matching the occurrence frequency of the sensitive data with a preset frequency range in the preset portrait library, and determining a preset user portrait corresponding to the preset frequency range matched with the occurrence frequency of the sensitive data in the preset portrait library as the user portrait of the user; and/or
matching the sensitivity level of the sensitive data with a preset level in the preset portrait library, and determining a preset user portrait corresponding to the preset level matched with the sensitivity level of the sensitive data in the preset portrait library as the user portrait of the user.
With reference to the first aspect, in a possible implementation manner, the user portrait of the user includes a user score of the user, the preset frequency range includes a first preset frequency range, a second preset frequency range and a third preset frequency range, the preset level includes a first preset level, a second preset level and a third preset level, the first preset level is smaller than the second preset level, and the second preset level is smaller than the third preset level;
the matching the sensitive information with a preset portrait library to obtain a matching result, and determining the user portrait of the user according to the matching result, includes:
if the occurrence frequency of the sensitive data is matched with a first preset frequency range and/or the sensitivity level of the sensitive data is matched with a first preset level, determining that the user score is a first score;
if the occurrence frequency of the sensitive data is matched with a second preset frequency range and/or the sensitivity level of the sensitive data is matched with a second preset level, determining that the user score is a second score;
and if the occurrence frequency of the sensitive data is matched with a third preset frequency range and/or the sensitivity level of the sensitive data is matched with a third preset level, determining that the user score is a third score, wherein the first score is greater than the second score, and the second score is greater than the third score.
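The three-tier mapping above can be sketched as a small lookup. This is only an illustration: the concrete ranges, levels and score values below are assumptions (the source gives none), as are the names `SCORE_TIERS` and `user_score`. Because the claim allows "and/or" matching, the sketch returns the first tier matched by either the count or the level.

```python
from typing import Optional

# Hypothetical tiers: (preset frequency range, preset level, user score).
# The source only requires level 1 < level 2 < level 3 and
# first score > second score > third score; the numbers are invented.
SCORE_TIERS = [
    (range(1, 3), 1, 90),       # first range / first level  -> first score
    (range(3, 6), 2, 60),       # second range / second level -> second score
    (range(6, 10**9), 3, 30),   # third range / third level  -> third score
]


def user_score(count: Optional[int] = None,
               level: Optional[int] = None) -> Optional[int]:
    """Return the score of the first tier matched by the count and/or the level."""
    for count_range, preset_level, score in SCORE_TIERS:
        count_ok = count is not None and count in count_range
        level_ok = level is not None and level == preset_level
        if count_ok or level_ok:
            return score
    return None  # nothing matched, e.g. no sensitive data was found


print(user_score(count=2))            # matched by the first range
print(user_score(level=2))            # matched by the second level
print(user_score(count=7, level=3))   # matched by the third tier
```

If the count and the level point at different tiers, this sketch simply lets the earlier tier win; a real system would need an explicit rule for that case.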
With reference to the first aspect, in a possible implementation manner, the text data processing method further includes:
if the user score is smaller than or equal to the third score, generating low-score prompt information aiming at the user, wherein the low-score prompt information is used for reflecting that the user is a low-quality user;
and sending the low score prompt information to the customer service end.
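A minimal sketch of the low-score prompt check, assuming a hypothetical third score of 30 and a plain dictionary as the message sent to the customer service end (neither is specified in the source):

```python
from typing import Optional

THIRD_SCORE = 30  # hypothetical value of the "third score"


def low_score_prompt(user_id: str, score: int,
                     third_score: int = THIRD_SCORE) -> Optional[dict]:
    """Build low-score prompt information when score <= third score."""
    if score <= third_score:
        return {
            "type": "low_score_prompt",
            "user_id": user_id,
            "message": f"User {user_id} is flagged as a low-quality user "
                       f"(score {score}).",
        }
    return None  # no prompt is generated for higher scores


print(low_score_prompt("u42", 30))
print(low_score_prompt("u43", 60))
```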
With reference to the first aspect, in a possible implementation manner, the text data processing method further includes:
generating scoring prompt information aiming at the customer service end;
sending the scoring prompt information to the client to prompt the user to score the customer service end on the client;
receiving a score aiming at the customer service end sent by the client, acquiring the user score in the user portrait, and determining the user weight based on the user score;
and determining the total score of the customer service end based on the score of the customer service end and the user weight.
With reference to the first aspect, in a possible implementation manner, the determining a user weight based on the user score includes:
acquiring a corresponding relation between a user scoring threshold and a user weight;
determining a target scoring threshold value matched with the user score in the user portrait from the user scoring threshold values, determining a target weight corresponding to the target scoring threshold value based on the corresponding relation, and determining the target weight as the user weight.
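The weighting step can be sketched as follows. The source does not say how a user score "matches" a threshold or how the rating and the weight are combined, so this sketch assumes that a score matches the highest threshold it reaches and that the total score is a weighted average over all ratings; the table `THRESHOLD_WEIGHTS` and both function names are hypothetical.

```python
from typing import Iterable, Tuple

# Hypothetical correspondence between user scoring thresholds and user
# weights, sorted from the highest threshold to the lowest.
THRESHOLD_WEIGHTS = [(80, 1.0), (50, 0.6), (0, 0.2)]


def user_weight(score: float) -> float:
    """Return the weight of the highest threshold the user score reaches."""
    for threshold, weight in THRESHOLD_WEIGHTS:
        if score >= threshold:
            return weight
    return 0.0  # score below every threshold


def total_agent_score(ratings: Iterable[Tuple[float, float]]) -> float:
    """Weighted average of (agent_rating, user_score) pairs."""
    weighted_sum = 0.0
    weight_sum = 0.0
    for rating, score in ratings:
        w = user_weight(score)
        weighted_sum += rating * w
        weight_sum += w
    return weighted_sum / weight_sum if weight_sum else 0.0


# A 5-star rating from a high-score user counts more than a
# 1-star rating from a low-score user.
print(total_agent_score([(5, 90), (1, 30)]))
```

The design intent suggested by the text is that ratings from users with low user scores pull the customer service end's total score less than ratings from users with high user scores.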
In a second aspect, the present application provides a text data processing apparatus comprising:
the data processing module is used for preprocessing chatting data sent by the client and determining whether sensitive data exist in the chatting data, wherein the chatting data comprises chatting data between a user and the intelligent robot;
the portrait determining module is used for acquiring the sensitive information in the sensitive data if sensitive data exists, matching the sensitive information with a preset portrait library to obtain a matching result, and determining the user portrait of the user according to the matching result;
and the service processing module is used for sending the user portrait of the user to the customer service end so as to enable the customer service end to perform service processing based on the user portrait.
With reference to the second aspect, in one possible implementation manner, the data processing module includes:
the data splitting unit is used for splitting the chatting data sent by the client based on the target classification model to obtain at least one word;
the part of speech analysis unit is used for carrying out part of speech analysis on the at least one word and determining the part of speech category of each word;
the semantic analysis unit is used for carrying out semantic analysis on the chatting data based on the target classification model and determining key information in the chatting data;
and the data determining unit is used for determining whether sensitive data exists in the chat data or not based on the matching degree between the part of speech category of each word and the key information in the chat data.
With reference to the second aspect, in a possible implementation manner, the sensitive information in the sensitive data includes the number of occurrences of the sensitive data and/or a sensitivity level of the sensitive data, where the sensitivity level of the sensitive data is used to indicate a sensitivity degree of the sensitive data;
the portrait determination module is specifically configured to:
matching the occurrence frequency of the sensitive data with a preset frequency range in the preset portrait library, and determining a preset user portrait corresponding to the preset frequency range matched with the occurrence frequency of the sensitive data in the preset portrait library as the user portrait of the user; and/or
matching the sensitivity level of the sensitive data with a preset level in the preset portrait library, and determining a preset user portrait corresponding to the preset level matched with the sensitivity level of the sensitive data in the preset portrait library as the user portrait of the user.
With reference to the second aspect, in a possible implementation manner, the user portrait of the user includes a user score of the user, the preset frequency range includes a first preset frequency range, a second preset frequency range and a third preset frequency range, the preset level includes a first preset level, a second preset level and a third preset level, the first preset level is smaller than the second preset level, and the second preset level is smaller than the third preset level; the portrait determination module is specifically configured to:
if the occurrence frequency of the sensitive data is matched with a first preset frequency range and/or the sensitivity level of the sensitive data is matched with a first preset level, determining that the user score is a first score;
if the occurrence frequency of the sensitive data is matched with a second preset frequency range and/or the sensitivity level of the sensitive data is matched with a second preset level, determining that the user score is a second score;
and if the occurrence frequency of the sensitive data is matched with a third preset frequency range and/or the sensitivity level of the sensitive data is matched with a third preset level, determining that the user score is a third score, wherein the first score is greater than the second score, and the second score is greater than the third score.
With reference to the second aspect, in a possible implementation manner, the text data processing apparatus further includes:
the information prompting module is used for generating low-score prompting information aiming at the user if the user score is less than or equal to the third score, wherein the low-score prompting information is used for reflecting that the user is a low-quality user;
the information prompt module is used for sending the low score prompt information to the customer service end.
With reference to the second aspect, in a possible implementation manner, the text data processing apparatus further includes:
the user scoring module is used for generating scoring prompt information aiming at the customer service end;
the user scoring module is used for sending the scoring prompt information to the client so as to prompt the user to score the customer service terminal on the client;
the user scoring module is used for receiving the score aiming at the customer service end sent by the client, acquiring the user score in the user portrait and determining the user weight based on the user score;
the user scoring module is used for determining the total score of the customer service end based on the score of the customer service end and the user weight.
With reference to the second aspect, in a possible implementation manner, the user scoring module is specifically configured to obtain a correspondence between a user scoring threshold and a user weight;
the user scoring module is specifically configured to determine a target scoring threshold matching the user score in the user portrait from the user scoring thresholds, determine a target weight corresponding to the target scoring threshold based on the correspondence, and determine the target weight as the user weight.
In a third aspect, the present application provides a computer device comprising: a processor, a memory, a network interface;
the processor is connected with a memory and a network interface, wherein the network interface is used for providing a data communication function, the memory is used for storing a computer program, and the processor is used for calling the computer program so as to enable a computer device comprising the processor to execute the text data processing method.
In a fourth aspect, the present application provides a computer-readable storage medium having stored therein a computer program adapted to be loaded and executed by a processor, so as to cause a computer device having the processor to execute the above-mentioned text data processing method.
In a fifth aspect, the present application provides a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions to cause the computer device to perform the text data processing method provided in the various alternatives in the first aspect of the present application.
In the embodiments of the application, the chat data sent by the client is preprocessed to determine whether sensitive data exists in the chat data, wherein the chat data comprises chat data between a user and the intelligent robot; if sensitive data exists, the sensitive information in the sensitive data is acquired and matched with a preset portrait library to obtain a matching result, and the user portrait of the user is determined according to the matching result; and the user portrait of the user is sent to the customer service end so that the customer service end performs service processing based on the user portrait. By preprocessing the chat data sent by the client, the sensitive data in the chat data can be determined, and the user portrait of the user is determined based on the sensitive data. Since the user portrait can reflect the user's situation, such as the user's temperament and quality, sending the user portrait to the customer service end allows the customer service end to handle the user in a targeted manner. The chat between the user and the customer service agent can thus be optimized so that it returns to the matter at hand rather than emotional venting, and ineffective communication caused by not knowing the user's situation is reduced. As a result, data processing efficiency, service processing efficiency and user experience can all be improved.
Drawings
In order to illustrate the technical solutions in the embodiments of the present application more clearly, the drawings needed in the embodiments are briefly described below. The drawings in the following description show only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a block diagram of a text data processing system according to an embodiment of the present application;
FIG. 2 is a schematic flowchart of a text data processing method according to an embodiment of the present application;
FIG. 3 is a schematic flowchart of another text data processing method according to an embodiment of the present application;
FIG. 4 is a schematic structural diagram of a text data processing apparatus according to an embodiment of the present application;
FIG. 5 is a schematic structural diagram of a computer device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Artificial intelligence technology is a comprehensive discipline that covers a wide range of fields, including both hardware-level and software-level technologies. The basic artificial intelligence technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operation/interaction systems, mechatronics, and the like. Artificial intelligence software technologies mainly include computer vision, speech processing, natural language processing, and machine learning/deep learning.
Among them, Natural Language Processing (NLP) is an important direction in the fields of computer science and artificial intelligence. It studies theories and methods that enable effective communication between humans and computers in natural language. Natural language processing is a science integrating linguistics, computer science and mathematics. Research in this field involves natural language, that is, the language people use every day, so it is closely related to linguistics. Natural language processing techniques typically include text processing, semantic understanding, machine translation, robot question answering, knowledge graphs, and the like.
The application relates to natural language processing technology in artificial intelligence. Chat data between a user and an intelligent robot is preprocessed to determine whether sensitive data exists in it, which can improve the accuracy of data processing. Moreover, a user portrait is obtained after the chat data is processed, so that the customer service end can quickly understand the user based on the user portrait, which improves data processing efficiency and, in turn, service processing efficiency. The application is applicable to fields such as smart government and smart education, and is conducive to the construction of smart cities.
Referring to FIG. 1, FIG. 1 is a network architecture diagram of a text data processing system according to an embodiment of the present application. As shown in FIG. 1, the network architecture includes a data processing server 101, a client 102, and a customer service end 103, where the data processing server 101 may perform data interaction with the client 102 and the customer service end 103, and there may be one or more clients and customer service ends. Specifically, the data processing server 101 may preprocess the chat data sent by the client to determine whether sensitive data exists in the chat data. If the chat data contains sensitive data, the data processing server 101 can acquire the sensitive information in the sensitive data, match the sensitive information with a preset portrait library to obtain a matching result, determine a user portrait of the user according to the matching result, and send the user portrait of the user to the customer service end, so that the customer service end performs service processing based on the user portrait.
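The server-side flow described above can be sketched as one function that checks each message, builds a portrait only when sensitive data is found, and forwards it to the customer service end. Everything here is a hypothetical stand-in: `UserPortrait`, `handle_chat`, the injected `find_sensitive` and `send_to_agent` callables, and the rule that three or more hits yield the label "needs care" are assumptions made for illustration, not details from the application.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class UserPortrait:
    user_id: str
    sensitive_count: int
    label: str


def handle_chat(user_id: str,
                messages: List[str],
                find_sensitive: Callable[[str], bool],
                send_to_agent: Callable[[UserPortrait], None]
                ) -> Optional[UserPortrait]:
    """Preprocess chat data, build a portrait if needed, and forward it."""
    hits = [m for m in messages if find_sensitive(m)]
    if not hits:
        return None  # no sensitive data, so no portrait is produced
    # Stand-in for matching against a preset portrait library.
    label = "needs care" if len(hits) >= 3 else "normal"
    portrait = UserPortrait(user_id, len(hits), label)
    send_to_agent(portrait)
    return portrait


outbox: List[UserPortrait] = []
result = handle_chat("u1", ["hello", "this is rubbish"],
                     lambda m: "rubbish" in m, outbox.append)
print(result, len(outbox))
```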
The chat data between the user and the intelligent robot is obtained and preprocessed to determine whether sensitive data exists in it. If sensitive data exists, a user portrait of the user is determined based on the sensitive information in the sensitive data; the user portrait can reflect how the user has been chatting as well as information such as the user's temper. By determining the user portrait and sending it to the customer service end, the customer service end can handle the user in a targeted manner according to the user portrait, which reduces ineffective communication with the user caused by not knowing the user's situation, so that data processing efficiency and, in turn, service processing efficiency can be improved.
It can be understood that the data processing server 101 mentioned in the embodiments of the present application may be an independent physical server, a server cluster or distributed system formed by a plurality of physical servers, or a cloud server that provides basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, a Content Delivery Network (CDN), and big data and artificial intelligence platforms. The client 102 and the customer service end 103 may be electronic devices, including but not limited to mobile phones, tablet computers, desktop computers, notebook computers, palmtop computers, vehicle-mounted devices, Augmented Reality/Virtual Reality (AR/VR) devices, head-mounted displays, wearable devices, smart speakers, digital cameras, cameras, and other Mobile Internet Devices (MID) with network access capability.
Further, referring to FIG. 2, FIG. 2 is a schematic flowchart of a text data processing method provided in an embodiment of the present application. The text data processing method can be applied to a data processing server; as shown in FIG. 2, the method includes, but is not limited to, the following steps:
S101, preprocessing the chat data sent by the client and determining whether sensitive data exists in the chat data.
In the embodiment of the application, the data processing server may receive chat data sent by the client, wherein the chat data includes chat data between the user and the intelligent robot. Specifically, the client may establish a connection with the intelligent robot through the data processing server, so that the user may chat with the intelligent robot through the client. The client can send the chatting data to the intelligent robot, and the client and the intelligent robot are connected based on the data processing server, so that the data processing server can receive the chatting data sent by the client. The chat data is sent to the intelligent robot by the client, and the client is a terminal used by a user. Chat data may include data related to a business process, such as a user's desire to transact a business; alternatively, the chat data may be data related to a business consultation, such as a user's desire to consult a certain business; alternatively, the chat data may be data related to complaints, such as a customer's desire to complain about a business processing system or client, and so on. In a specific implementation, after the user inputs the chat data in the display interface of the client, the user can click the sending button or key, and then the data processing server receives the chat data sent by the client.
Optionally, the chat data may include voice data and/or text data, and if the chat data is voice data, the data processing server may process the voice data by using an Automatic Speech Recognition (ASR) technology to convert the voice data into the text data.
Further, the data processing server may preprocess the chat data sent by the client to determine whether sensitive data exists in the chat data. Sensitive data may include data containing personal attacks or violation vocabulary. Data containing personal attacks may include data with insults, slurs and the like, and data containing violation vocabulary may include data that violates industry regulations, laws and regulations, and the like.
Optionally, the data processing server may process the chat data based on the target classification model to determine whether sensitive data exists in the chat data. Specifically, the data processing server can split the chat data sent by the client based on the target classification model to obtain at least one word; performing part-of-speech analysis on at least one word, and determining the part-of-speech category of each word; performing semantic analysis on the chatting data based on the target classification model, and determining key information in the chatting data; and determining whether sensitive data exists in the chat data or not based on the matching degree between the part of speech category of each word and the key information in the chat data.
In a specific implementation, the data processing server can perform word segmentation on the chat data sent by the client based on the target classification model, dividing the chat data into at least one word, and perform part-of-speech analysis on each word to determine its part-of-speech category, where the part-of-speech category may include a derogatory category, a neutral category and a commendatory category. Further, the data processing server may perform semantic analysis on the chat data by using a Natural Language Processing (NLP) technology to determine key information in the chat data, where the key information reflects the essential meaning that the user expresses in the chat data. For example, if the chat data is "your service attitude is also too garbage", the key information may include "service attitude" and "garbage". Further, since the data processing server has determined the part-of-speech category of each word in the chat data and the key information in the chat data, it can determine whether the chat data contains sensitive data based on the degree of matching between the part-of-speech category of each word and the key information in the chat data.
Determining whether the chat data contains sensitive data based on the degree of matching between the part-of-speech category of each word and the key information in the chat data may be done as follows: determine the part-of-speech category of each word, and then judge whether the chat data contains sensitive data in combination with the meaning expressed by the key information. Specifically, if the part-of-speech category of a word is derogatory and the meaning expressed by the key information is commendatory, the part-of-speech category of the word does not match the key information, and it is determined in combination with the key information that no sensitive data exists in the chat data. If the part-of-speech category of a word is derogatory and the meaning expressed by the key information is derogatory, the part-of-speech category of the word matches the key information, and it is determined in combination with the key information that sensitive data exists in the chat data. If the part-of-speech category of a word is commendatory and the meaning expressed by the key information is derogatory, the part-of-speech category of the word does not match the key information, and it is determined in combination with the key information that sensitive data exists in the chat data. If the part-of-speech category of a word is commendatory and the meaning expressed by the key information is commendatory, the part-of-speech category of the word matches the key information, and it is determined in combination with the key information that no sensitive data exists in the chat data.
That is to say, in the embodiment of the present application, whether sensitive data exists in the chat data is determined not only according to the part-of-speech category of each word but also in combination with the key information in the chat data, so that the determination result is more accurate.
For example, the chat data is "your service attitude is also too garbage", and word segmentation of the chat data yields words such as "your", "service attitude", "also", "too" and "garbage", together with the sentence particles. The part-of-speech categories of "your", "service attitude", "also", "too" and the particles are all the neutral category, and the part-of-speech category of "garbage" is the derogatory category. The key information may include "service attitude" and "garbage", and the meaning expressed by the key information is derogatory, so the part-of-speech category of the word matches the key information, and it is determined in combination with the key information that the chat data includes sensitive data. Judging whether the chat data contains sensitive data by combining the part-of-speech category of each word with the key information in the chat data can improve the accuracy of determining sensitive data.
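The four matching cases described above can be sketched as a small lookup table. This is only an illustrative sketch: the label strings, the `DECISION_TABLE` name and the `has_sensitive_data` function are hypothetical, the categories and key meaning are assumed to have been produced already by the target classification model, and the outcome for a commendatory word with derogatory key information follows one reading of the text.

```python
# Hypothetical labels for the three part-of-speech categories named in the text.
DEROGATORY, NEUTRAL, COMMENDATORY = "derogatory", "neutral", "commendatory"

# (word category, meaning of key information) -> does sensitive data exist?
# The (COMMENDATORY, DEROGATORY) row is an assumed reading of the description.
DECISION_TABLE = {
    (DEROGATORY, COMMENDATORY): False,    # mismatch: no sensitive data
    (DEROGATORY, DEROGATORY): True,       # match: sensitive data exists
    (COMMENDATORY, DEROGATORY): True,     # mismatch: sensitive data exists
    (COMMENDATORY, COMMENDATORY): False,  # match: no sensitive data
}


def has_sensitive_data(word_categories, key_meaning):
    """Combine each non-neutral word category with the key-information meaning."""
    return any(
        DECISION_TABLE.get((category, key_meaning), False)
        for category in word_categories
        if category != NEUTRAL
    )


# The example sentence: every word is neutral except the derogatory "garbage",
# and the key information ("service attitude", "garbage") is derogatory.
categories = [NEUTRAL, NEUTRAL, NEUTRAL, NEUTRAL, DEROGATORY]
print(has_sensitive_data(categories, DEROGATORY))
```

Keeping the cases in a table rather than nested conditionals makes it easy to revise a single row if the intended outcome for that case differs.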
Optionally, the data processing server may pre-train the target classification model before processing the chat data with it. Specifically, the target classification model may be trained by supervised learning with a Long Short-Term Memory (LSTM) network, so that the trained target classification model can analyze and classify the parts of speech of words in text data and perform semantic analysis on the text data to determine the key information in it. LSTM is an improved form of the Recurrent Neural Network (RNN) and is used to address the difficulty RNNs have in handling long-range dependencies in long sequences. In a specific implementation, a plurality of training sets (namely sample chat data) are input into the LSTM in advance for deep learning, so that the target classification model acquires the ability to classify the parts of speech of words in chat data subsequently sent by a client and the ability to determine the key information in that chat data. In the subsequent use of the target classification model, the part-of-speech category of each word in the chat data can be output by inputting the chat data into the target classification model. Optionally, the target classification model may perform semantic analysis on the chat data as a whole sentence and remove redundant words in the chat data (i.e. irrelevant words; for example, if the chat data is "your service attitude is also too garbage", the redundant words are "your", "also", "too" and the sentence particles), so as to obtain the content that the user really wants to express, i.e. the key information.
S102, if sensitive data exists, acquiring sensitive information in the sensitive data, matching the sensitive information with a preset portrait library to obtain a matching result, and determining the user portrait of the user according to the matching result.
In the embodiment of the application, if sensitive data exists in the chat data, the data processing server obtains the sensitive information in the sensitive data, matches the sensitive information with the preset portrait library to obtain a matching result, and determines the user portrait of the user according to the matching result. The presence of sensitive data in the chat data indicates that the chat data sent by the client includes personal attacks or violation vocabulary. The sensitive information in the sensitive data may include the number of occurrences of the sensitive data, the sensitivity level of the sensitive data, and the like.
In a possible implementation manner, the sensitive information in the sensitive data may include the number of occurrences of the sensitive data. The data processing server may match the number of occurrences of the sensitive data against the preset number ranges in the preset portrait library, and determine the preset user portrait corresponding to the preset number range that matches the number of occurrences as the user portrait of the user. The preset number ranges may include a first preset number range, a second preset number range, and a third preset number range. Specifically, the user portrait of the user may include a user score of the user. If the number of occurrences of the sensitive data matches the first preset number range, the data processing server may determine that the user score is the first score. If the number of occurrences matches the second preset number range, the data processing server may determine that the user score is the second score. If the number of occurrences matches the third preset number range, the data processing server may determine that the user score is the third score. The first score is greater than the second score, and the second score is greater than the third score. The larger the user score, the fewer the occurrences of sensitive data in the user's chat data; the smaller the user score, the more the occurrences of sensitive data in the user's chat data.
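As a minimal sketch of this range matching, the lookup below maps an occurrence count to a user score. The range boundaries, the score values 90/60/30 and the names `PRESET_COUNT_RANGES` and `score_from_count` are all hypothetical; the description only requires that the first score exceed the second and the second exceed the third.

```python
# Hypothetical preset portrait library entries: (lowest count, highest count, score).
# A highest count of None means the range is open-ended.
PRESET_COUNT_RANGES = [
    (1, 2, 90),     # first preset number range  -> first score
    (3, 5, 60),     # second preset number range -> second score
    (6, None, 30),  # third preset number range  -> third score
]


def score_from_count(occurrences):
    """Return the user score whose preset range contains the occurrence count."""
    for low, high, score in PRESET_COUNT_RANGES:
        if occurrences >= low and (high is None or occurrences <= high):
            return score
    return None  # no preset range matches, e.g. zero occurrences


print(score_from_count(4))
```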
In another possible implementation manner, the sensitive information in the sensitive data may include a sensitivity level of the sensitive data, and the data processing server may match a preset level in a preset portrait library based on the sensitivity level of the sensitive data, and determine a preset user portrait corresponding to the preset level in the preset portrait library, where the preset level matches the sensitivity level of the sensitive data, as the user portrait of the user. The preset levels may include a first preset level, a second preset level, and a third preset level, where the first preset level is smaller than the second preset level, the second preset level is smaller than the third preset level, and the sensitivity level of the sensitive data may be determined according to the category of the sensitive data, for example, the sensitivity level of the sensitive data of the personal attack category may be greater than the sensitivity level of the sensitive data of the violation category, and the sensitivity level of the sensitive data of the category violating the regulations such as laws and regulations in the sensitive data of the violation category may be greater than the sensitivity level of the sensitive data of the category violating the industry regulations, and the like. For example, the sensitivity level of sensitive data of a category that violates an industry regulation may be a first preset level, the sensitivity level of sensitive data of a category that violates a regulation of law, etc. may be a second preset level, the sensitivity level of sensitive data of a personal attack category may be a third preset level, and so on.
Specifically, the user portrait of the user may include a user score of the user. If the sensitivity level of the sensitive data matches the first preset level, the data processing server may determine that the user score is the first score. If the sensitivity level of the sensitive data matches the second preset level, the data processing server may determine that the user score is the second score. If the sensitivity level of the sensitive data matches the third preset level, the data processing server may determine that the user score is the third score. The first score is greater than the second score, and the second score is greater than the third score. The larger the user score, the lower the sensitivity level of the sensitive data in the user's chat data; the smaller the user score, the higher the sensitivity level of the sensitive data in the user's chat data.
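The level-based matching can be sketched in the same way. The enum below follows the example ordering given above (industry-regulation violations lowest, personal attacks highest); the class name, member names and score values are hypothetical.

```python
from enum import IntEnum


class SensitivityLevel(IntEnum):
    """Hypothetical encoding of the three preset levels, lowest first."""
    INDUSTRY_VIOLATION = 1  # first preset level
    LEGAL_VIOLATION = 2     # second preset level
    PERSONAL_ATTACK = 3     # third preset level


# Illustrative scores only: first score > second score > third score.
LEVEL_TO_SCORE = {
    SensitivityLevel.INDUSTRY_VIOLATION: 90,
    SensitivityLevel.LEGAL_VIOLATION: 60,
    SensitivityLevel.PERSONAL_ATTACK: 30,
}


def score_from_level(level):
    """Return the user score associated with the matched preset level."""
    return LEVEL_TO_SCORE[SensitivityLevel(level)]


print(score_from_level(SensitivityLevel.PERSONAL_ATTACK))
```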
In yet another possible implementation manner, the sensitive information in the sensitive data may include both the number of occurrences of the sensitive data and the sensitivity level of the sensitive data. The data processing server may match the number of occurrences against the preset number ranges in the preset portrait library, match the sensitivity level against the preset levels in the preset portrait library, and determine, as the user portrait of the user, the preset user portrait that corresponds to both the matching preset number range and the matching preset level. Specifically, the user portrait of the user may include a user score of the user. If the number of occurrences of the sensitive data matches the first preset number range and the sensitivity level of the sensitive data matches the first preset level, the user score is determined to be the first score. If the number of occurrences matches the second preset number range and the sensitivity level matches the second preset level, the user score is determined to be the second score. If the number of occurrences matches the third preset number range and the sensitivity level matches the third preset level, the user score is determined to be the third score.
Through the above implementation manners, the data processing server can match against the preset portrait library based on the number of occurrences of the sensitive data and/or the sensitivity level of the sensitive data to obtain a matching result, and thereby determine the user portrait of the user. The user portrait may include the user score, the number of occurrences of the user's sensitive data, the sensitivity level of the user's sensitive data, and the like. The number of occurrences may cover the sensitive data in the current chat data between the user and the intelligent robot, and may also cover the sensitive data in the historical chat data between the user and the intelligent robot. Likewise, the sensitivity level may cover the sensitive data in the current chat data between the user and the intelligent robot, and may also cover the sensitive data in the historical chat data between the user and the intelligent robot.
Optionally, if the user score is less than or equal to the third score, the data processing server generates low-score prompt information for the user and sends the low-score prompt information to the customer service end, where the low-score prompt information is used to reflect that the user is a low-quality user. The low-score prompt information may include, for example, "the customer has posted m personal-attack statements within n days, please take note", where n and m are both non-negative numbers. A low-quality user is one whose chat data contains a large number of occurrences of sensitive data. Sending the low-score prompt information to the customer service end prompts the customer service end to take care when subsequently chatting with the user.
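The low-score prompt can be sketched as a threshold check followed by message formatting. The value of `THIRD_SCORE`, the function name and the exact English wording of the message are assumptions made for illustration.

```python
THIRD_SCORE = 30  # hypothetical value of the third score


def low_score_prompt(user_score, m, n):
    """Build the prompt for the customer service end, or None if not needed."""
    if user_score > THIRD_SCORE:
        return None  # the user is not treated as a low-quality user
    return (
        f"The customer has posted {m} personal-attack statements "
        f"within {n} days, please take note."
    )


print(low_score_prompt(30, m=4, n=7))
```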
Optionally, if sensitive data exists in the chat data, the data processing server may generate sensitive prompt information for the sensitive data, for example, "your current message involves personal attacks or violation data, please take note", and send the sensitive prompt information to the client to prompt the user to mind his or her language, so as to avoid affecting subsequent communication and thereby the efficiency of data processing. For example, some users may not notice that their chat data contains sensitive data, and generating sensitive prompt information for the sensitive data serves as a reminder. Optionally, the data processing server may also send the sensitive data contained in the chat data to the client, so that the user can quickly see which data is sensitive.
S103, sending the user portrait of the user to the customer service end so that the customer service end performs service processing based on the user portrait.
In the embodiment of the application, the data processing server can send the user portrait of the user to the customer service end, so that the customer service end can quickly learn about the user, for example the user's personality and quality, and carry out corresponding service processing accordingly. This avoids conflicts that arise when the customer service end does not understand the user's situation, which would affect the user's emotions and reduce service processing efficiency. The chat thus returns to its actual content rather than emotional venting, which improves service processing efficiency and user experience.
Optionally, the data processing server may also send the chat data between the user and the intelligent robot to the customer service end, so that the customer service end can understand the user's situation more clearly. Specifically, if the chat data includes sensitive data, the data processing server may obtain target replacement data corresponding to the sensitive data from a preset information replacement library, and send the target replacement data to the customer service end for display. The preset information replacement library is used for storing the correspondence between sensitive data and replacement data. For example, if the chat data is "your system is also too garbage", which contains sensitive data, the corresponding target replacement data may be "my mood is not very stable right now" or "I am angry right now". Optionally, the data processing server may also mask sensitive data, replacing it with a sign, e.g. "your system is also. Optionally, if the data processing server receives a sensitive data viewing instruction sent by the customer service end, the complete chat data including the sensitive data may be output at the customer service end. By processing the sensitive data before outputting it, the user experience can be improved.
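The three display options described above (replacement, masking, and full view on request) can be sketched as follows. The contents of `REPLACEMENT_LIBRARY`, the `mode` parameter, the asterisk mask and the fallback from replacement to masking are all hypothetical choices made for this sketch, not details taken from the description.

```python
# Hypothetical preset information replacement library:
# sensitive data -> replacement data shown to the customer service end.
REPLACEMENT_LIBRARY = {
    "garbage": "I am angry right now",
}


def display_text(chat_text, sensitive_words, mode="replace"):
    """Return the text to display at the customer service end."""
    found = [word for word in sensitive_words if word in chat_text]
    if not found or mode == "full":
        return chat_text  # nothing sensitive, or full view was requested
    if mode == "replace":
        for word in found:
            if word in REPLACEMENT_LIBRARY:
                return REPLACEMENT_LIBRARY[word]
    # mode == "mask", or no replacement entry exists: hide each sensitive word
    masked = chat_text
    for word in found:
        masked = masked.replace(word, "*" * len(word))
    return masked


print(display_text("your system is garbage", ["garbage"], mode="mask"))
```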
Optionally, if no sensitive data exists in the chat data, which indicates that the chat data sent by the client does not include personal attacks or violation vocabulary, the data processing server may continue to obtain chat data sent by the client and perform the processing of steps S101 to S103 on the newly obtained chat data.
It can be understood that, in the embodiment of the present application, the user portrait of any given user is obtained by processing the chat data between that user and the intelligent robot; if there are multiple users, the data processing server may apply the same processing method to obtain the user portrait of each of the multiple users.
In the embodiment of the application, the chat data sent by the client is preprocessed to determine whether sensitive data exists in the chat data, where the chat data includes chat data between a user and the intelligent robot; if sensitive data exists, sensitive information in the sensitive data is acquired and matched with a preset portrait library to obtain a matching result, and the user portrait of the user is determined according to the matching result; and the user portrait of the user is sent to the customer service end so that the customer service end performs service processing based on the user portrait. By preprocessing the chat data sent by the client, sensitive data in the chat data can be identified, and the user portrait of the user can be determined based on the sensitive data. Since the user portrait reflects the user's situation, such as the user's personality and quality, sending the user portrait to the customer service end allows the customer service end to handle the user in a targeted manner. The chat between the user and the customer service can thus be optimized so that it returns to its actual content rather than emotional venting, and invalid communication caused by not understanding the user's situation is reduced, which improves data processing efficiency, service processing efficiency, and user experience.
Optionally, please refer to fig. 3, where fig. 3 is a schematic flowchart of another text data processing method provided in the embodiment of the present application, and the text data processing method describes the processing procedures of a data processing server, a client, and a customer service end as a whole; as shown in fig. 3, the text data processing method includes, but is not limited to, the following steps:
S201, the client sends the chat data to the data processing server.
S202, the data processing server preprocesses the chatting data and determines whether sensitive data exists in the chatting data.
S203, if sensitive data exists, the data processing server obtains the sensitive information in the sensitive data, matches the sensitive information with a preset portrait library to obtain a matching result, and determines the user portrait of the user according to the matching result.
S204, the data processing server sends the user portrait of the user to the customer service end so that the customer service end performs service processing based on the user portrait.
In the embodiment of the present application, the implementation manners of steps S201 to S204 may refer to the implementation manners of steps S101 to S103 in fig. 2, and are not described herein again.
S205, the data processing server generates scoring prompt information aiming at the customer service end.
In this embodiment of the application, the data processing server may generate scoring prompt information for the customer service end, for example, when a service processing completion instruction sent by the customer service end is received, or when a service ending instruction sent by the client or the customer service end is received. The scoring prompt information is used to prompt the client to score the service processing of the customer service end.
S206, the data processing server sends the scoring prompt information to the client to prompt the user to score the customer service end on the client.
Optionally, the data processing server can also receive data such as complaints and suggestions from the client regarding the customer service end, so that the customer service end can make targeted improvements.
S207, the client sends the score aiming at the customer service end to the data processing server.
In a specific implementation, after the user inputs the score for the customer service end in the display interface of the client, the user can click a determination button or a key in the display interface of the client, and the client responds to the trigger instruction and sends the score for the customer service end to the data processing server.
S208, the data processing server obtains the user scores in the user portrait and determines the user weight based on the user scores.
The higher the user score of the user is, the larger the corresponding user weight is; the lower the user score of a user, the lower the corresponding user weight.
S209, the data processing server determines the total score of the customer service end based on the score of the customer service end and the user weight, and sends the total score of the customer service end to the customer service end.
In a specific implementation, the data processing server may obtain a product between the score of the customer service end and the user weight, and determine the product between the score of the customer service end and the user weight as a total score of the customer service end. Or the data processing server can also obtain the sum of the score of the customer service end and the user weight, and the sum of the score of the customer service end and the user weight is determined as the total score of the customer service end. Or an average value between the score of the customer service end and the user weight may also be determined as a total score of the customer service end, which is not limited in the embodiment of the present application.
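The three alternatives above (product, sum, or average of the customer service end's score and the user weight) can be sketched in one function. The function name and the `method` parameter are hypothetical; the description does not prescribe how the alternative is selected.

```python
def total_score(service_score, user_weight, method="product"):
    """Combine the score given to the customer service end with the user weight."""
    if method == "product":
        return service_score * user_weight
    if method == "sum":
        return service_score + user_weight
    if method == "average":
        return (service_score + user_weight) / 2
    raise ValueError(f"unknown method: {method}")


# A score of 4 from a user whose weight is 0.5, combined as a product.
print(total_score(4, 0.5))
```

With the product form, a weight below 1 scales down a score from a low-scoring user, which matches the stated aim of reducing the influence of possibly malicious ratings.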
In the embodiment of the application, the higher the user score of the user, the higher the corresponding user weight, and the higher the total score of the customer service end determined based on the score of the customer service end and the user weight. The lower the user score of the user, the lower the corresponding user weight, and the lower the total score of the customer service end determined based on the score of the customer service end and the user weight. Because a user whose user score in the user portrait is lower than a score threshold may engage in malicious rating, when the data processing server obtains the score given to the customer service end by such a user, it can adjust that score by reducing its weight, so that the score of the customer service end is more objective and accurate and has higher reference value, which facilitates subsequent improvement and correction of the customer service system.
In the embodiment of the application, the chat data sent by the client is preprocessed to determine whether sensitive data exists in the chat data, where the chat data includes chat data between a user and the intelligent robot; if sensitive data exists, sensitive information in the sensitive data is acquired and matched with a preset portrait library to obtain a matching result, and the user portrait of the user is determined according to the matching result; and the user portrait of the user is sent to the customer service end so that the customer service end performs service processing based on the user portrait. By preprocessing the chat data sent by the client, sensitive data in the chat data can be identified, and the user portrait of the user can be determined based on the sensitive data. Since the user portrait reflects the user's situation, such as the user's personality and quality, sending the user portrait to the customer service end allows the customer service end to handle the user in a targeted manner. The chat between the user and the customer service can thus be optimized so that it returns to its actual content rather than emotional venting, and invalid communication caused by not understanding the user's situation is reduced, which improves data processing efficiency, service processing efficiency, and user experience.
The method of the embodiments of the present application is described above, and the apparatus of the embodiments of the present application is described below.
Referring to fig. 4, fig. 4 is a schematic diagram illustrating a structure of a text data processing apparatus according to an embodiment of the present application, where the text data processing apparatus may be a computer program (including program code) running in a computer device, for example, the text data processing apparatus is an application software; the text data processing device can be used for executing corresponding steps in the text data processing method provided by the embodiment of the application. The text data processing device 40 includes:
the data processing module 41 is configured to pre-process chat data sent by a client, and determine whether sensitive data exists in the chat data, where the chat data includes chat data between a user and an intelligent robot;
the portrait determining module 42 is configured to, if sensitive data exists, obtain sensitive information in the sensitive data, perform matching with a preset portrait library based on the sensitive information to obtain a matching result, and determine a user portrait of the user according to the matching result;
and a service processing module 43, configured to send the user portrait of the user to a customer service end, so that the customer service end performs service processing based on the user portrait.
Optionally, the data processing module 41 includes:
the data splitting unit 411 is configured to split the chat data sent by the client based on the target classification model to obtain at least one word;
a part-of-speech analysis unit 412, configured to perform part-of-speech analysis on the at least one word, and determine a part-of-speech category of each word;
a semantic analysis unit 413, configured to perform semantic analysis on the chat data based on the target classification model, and determine key information in the chat data;
and a data determining unit 414, configured to determine whether sensitive data exists in the chat data based on a degree of matching between the part of speech category of each word and the key information in the chat data.
Optionally, the sensitive information in the sensitive data includes the number of occurrences of the sensitive data and/or a sensitivity level of the sensitive data, where the sensitivity level of the sensitive data is used to indicate a sensitivity degree of the sensitive data;
the portrait determination module 42 is specifically configured to:
matching the number of occurrences of the sensitive data with a preset number range in a preset portrait library, and determining the preset user portrait corresponding to the preset number range matched with the number of occurrences of the sensitive data in the preset portrait library as the user portrait of the user; and/or
matching the sensitivity level of the sensitive data with a preset level in the preset portrait library, and determining the preset user portrait corresponding to the preset level matched with the sensitivity level of the sensitive data in the preset portrait library as the user portrait of the user.
Optionally, the user portrait of the user includes a user score of the user, the preset number range includes a first preset number range, a second preset number range and a third preset number range, the preset level includes a first preset level, a second preset level and a third preset level, the first preset level is smaller than the second preset level, and the second preset level is smaller than the third preset level; the portrait determination module 42 is specifically configured to:
if the number of occurrences of the sensitive data matches the first preset number range and/or the sensitivity level of the sensitive data matches the first preset level, determining that the user score is a first score;
if the number of occurrences of the sensitive data matches the second preset number range and/or the sensitivity level of the sensitive data matches the second preset level, determining that the user score is a second score;
and if the number of occurrences of the sensitive data matches the third preset number range and/or the sensitivity level of the sensitive data matches the third preset level, determining that the user score is a third score, wherein the first score is greater than the second score, and the second score is greater than the third score.
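The three-tier scoring above can be sketched as follows. The application gives no numeric values, so the count ranges, level numbers, and scores in `TIERS` are illustrative assumptions. The text also leaves open how the two criteria combine ("and/or"); this sketch assumes that when both are supplied the lower (stricter) score is used.

```python
from typing import List, Optional, Tuple

# Hypothetical tiers: (inclusive count range, preset level, user score).
# All numeric values are assumptions, not taken from the application.
TIERS: List[Tuple[Tuple[float, float], int, int]] = [
    ((1, 2), 1, 90),             # first range / first level -> first score
    ((3, 5), 2, 60),             # second range / second level -> second score
    ((6, float("inf")), 3, 30),  # third range / third level -> third score
]


def score_from_count(count: int) -> Optional[int]:
    """Return the score of the tier whose count range contains `count`."""
    for (low, high), _level, score in TIERS:
        if low <= count <= high:
            return score
    return None


def score_from_level(level: int) -> Optional[int]:
    """Return the score of the tier whose preset level equals `level`."""
    for _range, preset_level, score in TIERS:
        if level == preset_level:
            return score
    return None


def user_score(count: Optional[int] = None,
               level: Optional[int] = None) -> Optional[int]:
    """Combine both criteria; take the lower score when both match (assumption)."""
    candidates = []
    if count is not None:
        candidates.append(score_from_count(count))
    if level is not None:
        candidates.append(score_from_level(level))
    matched = [s for s in candidates if s is not None]
    return min(matched) if matched else None
```

Because the first score is the highest, a user with few, mild sensitive messages keeps a high score, while frequent or severe ones fall into the third tier.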
Optionally, the text data processing apparatus 40 further includes:
an information prompt module 44, configured to generate low-score prompt information for the user if the user score is less than or equal to the third score, where the low-score prompt information is used to reflect that the user is a low-quality user;
the information prompt module 44 is configured to send the low score prompt information to the customer service end.
Optionally, the text data processing apparatus 40 further includes:
the user scoring module 45 is configured to generate scoring prompt information for the customer service end;
the user scoring module 45 is configured to send the scoring prompt information to the client, so as to prompt the user to score the customer service end on the client;
the user scoring module 45 is configured to receive a score for the customer service end sent by the client, obtain the user score in the user representation, and determine a user weight based on the user score;
the user scoring module 45 is configured to determine a total score of the customer service end based on the score of the customer service end and the user weight.
Optionally, the user scoring module 45 is specifically configured to obtain a corresponding relationship between a user scoring threshold and a user weight;
the user scoring module 45 is specifically configured to determine a target scoring threshold matching the user score in the user representation from the user scoring thresholds, determine a target weight corresponding to the target scoring threshold based on the corresponding relationship, and determine the target weight as the user weight.
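A minimal sketch of the weight lookup and total-score computation described above. The thresholds and weights in `SCORE_THRESHOLDS` and `USER_WEIGHTS` are hypothetical, and the application does not say how the score and the weight are combined; a weighted average over all received ratings is assumed here.

```python
from bisect import bisect_right
from typing import List, Tuple

# Hypothetical correspondence between user-score thresholds and user weights.
# Thresholds are ascending lower bounds; the values are assumptions.
SCORE_THRESHOLDS: List[float] = [0, 40, 70]
USER_WEIGHTS: List[float] = [0.2, 0.6, 1.0]


def user_weight(user_score: float) -> float:
    """Return the weight of the highest threshold not exceeding `user_score`."""
    idx = bisect_right(SCORE_THRESHOLDS, user_score) - 1
    return USER_WEIGHTS[max(idx, 0)]


def total_score(ratings: List[Tuple[float, float]]) -> float:
    """Weighted average of (rating of customer service, user score) pairs."""
    weighted = [(rating, user_weight(score)) for rating, score in ratings]
    weight_sum = sum(w for _, w in weighted)
    if weight_sum == 0:
        return 0.0
    return sum(r * w for r, w in weighted) / weight_sum
```

Under this assumption, a rating from a user with a low user score still counts, but contributes less to the customer service end's total than a rating from a high-scoring user.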
It should be noted that, for the content that is not mentioned in the embodiment corresponding to fig. 4, reference may be made to the description of the method embodiment, and details are not described here again.
In the embodiment of the application, the chat data sent by the client is preprocessed to determine whether sensitive data exists in the chat data, where the chat data includes chat data between a user and the intelligent robot; if sensitive data exists, sensitive information in the sensitive data is acquired, the sensitive information is matched against a preset portrait library to obtain a matching result, and the user portrait of the user is determined according to the matching result; and the user portrait of the user is sent to the customer service end so that the customer service end performs service processing based on the user portrait. By preprocessing the chat data sent by the client, sensitive data in the chat data can be identified, and the user portrait of the user can be determined based on the sensitive data. Since the user portrait reflects the user's situation, such as the user's personality and quality, sending the user portrait to the customer service end allows the customer service end to handle the user in a targeted manner. The chat between the user and the customer service can thus be optimized so that it returns to the content itself rather than serving as an outlet for emotion, and ineffective communication caused by not knowing the user's situation is reduced, which improves data processing efficiency, service processing efficiency, and user experience.
Referring to fig. 5, fig. 5 is a schematic structural diagram of a computer device according to an embodiment of the present disclosure. As shown in fig. 5, the computer device 50 may include a processor 501, a network interface 504, and a memory 505; the computer device 50 may further include a user interface 503 and at least one communication bus 502. The communication bus 502 is used to enable connection and communication between these components. The user interface 503 may include a display and a keyboard, and may optionally also include a standard wired interface and a standard wireless interface. The network interface 504 may optionally include a standard wired interface and a wireless interface (e.g., a WI-FI interface). The memory 505 may be a high-speed RAM or a non-volatile memory, such as at least one magnetic disk memory. Optionally, the memory 505 may also be at least one storage device located remotely from the processor 501. As shown in fig. 5, the memory 505, as a computer-readable storage medium, may include an operating system, a network communication module, a user interface module, and a device control application program.
In the computer device 50 shown in fig. 5, the network interface 504 may provide network communication functions; the user interface 503 mainly provides an input interface for the user; and the processor 501 may be used to invoke the device control application program stored in the memory 505 to implement:
preprocessing chatting data sent by a client, and determining whether sensitive data exists in the chatting data, wherein the chatting data comprises chatting data between a user and an intelligent robot;
if sensitive data exists, acquiring sensitive information in the sensitive data, matching the sensitive information with a preset portrait library to obtain a matching result, and determining the user portrait of the user according to the matching result;
and sending the user portrait of the user to a customer service end so that the customer service end performs service processing based on the user portrait.
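The three steps the processor carries out can be sketched as a small pipeline. The types `SensitiveInfo` and `UserPortrait` and the injected callables `detect`, `match_portrait`, and `send_to_customer_service` are hypothetical names introduced for illustration; the application does not prescribe these interfaces.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class SensitiveInfo:
    occurrences: int  # number of occurrences of the sensitive data
    level: int        # sensitivity level of the sensitive data


@dataclass
class UserPortrait:
    user_id: str
    score: int        # user score carried in the portrait


def process_chat(
    user_id: str,
    chat_messages: List[str],
    detect: Callable[[List[str]], Optional[SensitiveInfo]],
    match_portrait: Callable[[str, SensitiveInfo], UserPortrait],
    send_to_customer_service: Callable[[UserPortrait], None],
) -> Optional[UserPortrait]:
    """Preprocess chat data, build a portrait if sensitive data exists, and send it."""
    info = detect(chat_messages)              # step 1: preprocessing and detection
    if info is None:                          # no sensitive data: nothing to send
        return None
    portrait = match_portrait(user_id, info)  # step 2: match against portrait library
    send_to_customer_service(portrait)        # step 3: hand over to customer service
    return portrait
```

Passing the three stages in as callables keeps the sketch independent of any particular detection model, portrait library, or transport to the customer service end.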
It should be understood that the computer device 50 described in this embodiment can carry out the text data processing method described in the embodiments corresponding to fig. 2 and fig. 3, and can also implement the text data processing apparatus described in the embodiment corresponding to fig. 4; details are not repeated here. Likewise, the beneficial effects of the same method are not described again in detail.
In the embodiment of the application, the chat data sent by the client is preprocessed to determine whether sensitive data exists in the chat data, where the chat data includes chat data between a user and the intelligent robot; if sensitive data exists, sensitive information in the sensitive data is acquired, the sensitive information is matched against a preset portrait library to obtain a matching result, and the user portrait of the user is determined according to the matching result; and the user portrait of the user is sent to the customer service end so that the customer service end performs service processing based on the user portrait. By preprocessing the chat data sent by the client, sensitive data in the chat data can be identified, and the user portrait of the user can be determined based on the sensitive data. Since the user portrait reflects the user's situation, such as the user's personality and quality, sending the user portrait to the customer service end allows the customer service end to handle the user in a targeted manner. The chat between the user and the customer service can thus be optimized so that it returns to the content itself rather than serving as an outlet for emotion, and ineffective communication caused by not knowing the user's situation is reduced, which improves data processing efficiency, service processing efficiency, and user experience.
Embodiments of the present application also provide a computer-readable storage medium storing a computer program, the computer program comprising program instructions which, when executed by a computer, cause the computer to perform the method according to the foregoing embodiments; the computer may be a part of the above-mentioned computer device, such as the processor 501 described above. By way of example, the program instructions may be executed on one computer device, or on multiple computer devices located at one site, or on multiple computer devices distributed across multiple sites and interconnected by a communication network, which may comprise a blockchain network.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a computer-readable storage medium, and when executed, can include the processes of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like.
The above disclosure describes only preferred embodiments of the present application and is not to be construed as limiting the scope of the present application; equivalent variations and modifications made to the present application therefore remain within its scope.
Claims (10)
1. A text data processing method, comprising:
preprocessing chatting data sent by a client, and determining whether sensitive data exists in the chatting data, wherein the chatting data comprises chatting data between a user and an intelligent robot;
if sensitive data exists, acquiring sensitive information in the sensitive data, matching the sensitive information with a preset portrait library to obtain a matching result, and determining the user portrait of the user according to the matching result;
and sending the user portrait of the user to a customer service end so that the customer service end performs service processing on the basis of the user portrait.
2. The method of claim 1, wherein preprocessing the chat data sent by the client to determine whether sensitive data exists in the chat data comprises:
splitting the chatting data sent by the client based on a target classification model to obtain at least one word;
performing part-of-speech analysis on the at least one word, and determining the part-of-speech category of each word;
performing semantic analysis on the chatting data based on the target classification model, and determining key information in the chatting data;
and determining whether sensitive data exists in the chat data or not based on the matching degree between the part of speech category of each word and the key information in the chat data.
3. The method according to claim 1, wherein the sensitive information in the sensitive data comprises the number of occurrences of the sensitive data and/or a sensitivity level of the sensitive data, and the sensitivity level of the sensitive data is used for indicating the sensitivity degree of the sensitive data;
the matching of the sensitive information with a preset portrait library to obtain a matching result, and determining the user portrait of the user according to the matching result, comprises:
matching the number of occurrences of the sensitive data against the preset number ranges in the preset portrait library, and determining the preset user portrait corresponding to the preset number range that matches the number of occurrences of the sensitive data as the user portrait of the user; and/or
matching the sensitivity level of the sensitive data against the preset levels in the preset portrait library, and determining the preset user portrait corresponding to the preset level that matches the sensitivity level of the sensitive data as the user portrait of the user.
4. The method of claim 3, wherein the user portrait of the user comprises a user score of the user, the preset number range comprises a first preset number range, a second preset number range, and a third preset number range, the preset level comprises a first preset level, a second preset level, and a third preset level, the first preset level is less than the second preset level, and the second preset level is less than the third preset level;
the matching of the sensitive information with a preset portrait library to obtain a matching result, and determining the user portrait of the user according to the matching result, comprises:
if the number of occurrences of the sensitive data matches the first preset number range and/or the sensitivity level of the sensitive data matches the first preset level, determining that the user score is a first score;
if the number of occurrences of the sensitive data matches the second preset number range and/or the sensitivity level of the sensitive data matches the second preset level, determining that the user score is a second score;
and if the number of occurrences of the sensitive data matches the third preset number range and/or the sensitivity level of the sensitive data matches the third preset level, determining that the user score is a third score, wherein the first score is greater than the second score, and the second score is greater than the third score.
5. The method of claim 4, further comprising:
if the user score is smaller than or equal to the third score, generating low-score prompt information aiming at the user, wherein the low-score prompt information is used for reflecting that the user is a low-quality user;
and sending the low score prompt information to the customer service end.
6. The method of claim 1, further comprising:
generating scoring prompt information aiming at the customer service end;
sending the scoring prompt information to the client to prompt the user to score the customer service end on the client;
receiving a score for the customer service end sent by the client, acquiring the user score in the user portrait, and determining the user weight based on the user score;
and determining the total score of the customer service end based on the score of the customer service end and the user weight.
7. The method of claim 6, wherein determining a user weight based on the user score comprises:
acquiring a corresponding relation between a user scoring threshold and a user weight;
determining a target scoring threshold value matched with the user score in the user portrait from the user scoring threshold values, determining a target weight corresponding to the target scoring threshold value based on the corresponding relation, and determining the target weight as the user weight.
8. A text data processing apparatus, characterized by comprising:
the data processing module is used for preprocessing chatting data sent by a client and determining whether sensitive data exist in the chatting data, wherein the chatting data comprises chatting data between a user and the intelligent robot;
the portrait determining module is used for acquiring sensitive information in the sensitive data if sensitive data exists, matching the sensitive information with a preset portrait library to obtain a matching result, and determining the user portrait of the user according to the matching result;
and the service processing module is used for sending the user portrait of the user to a customer service end so as to enable the customer service end to perform service processing based on the user portrait.
9. A computer device, comprising: a processor, a memory, and a network interface;
the processor is coupled to the memory and the network interface, wherein the network interface is configured to provide data communication functionality, the memory is configured to store program code, and the processor is configured to invoke the program code to cause the computer device to perform the method of any of claims 1-7.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program adapted to be loaded and executed by a processor to cause a computer device having the processor to perform the method of any of claims 1-7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111112498.7A CN113822062A (en) | 2021-09-22 | 2021-09-22 | Text data processing method, device and equipment and readable storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113822062A (en) | 2021-12-21 |
Family
ID=78915142
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111112498.7A Pending CN113822062A (en) | 2021-09-22 | 2021-09-22 | Text data processing method, device and equipment and readable storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113822062A (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||