CN115080741A - Questionnaire survey analysis method, device, storage medium and equipment - Google Patents

Questionnaire survey analysis method, device, storage medium and equipment

Info

Publication number
CN115080741A
Authority
CN
China
Prior art keywords
data
comment
questionnaire
clustering
comment content
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210731392.3A
Other languages
Chinese (zh)
Inventor
史文鑫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Bank Co Ltd
Original Assignee
Ping An Bank Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Bank Co Ltd filed Critical Ping An Bank Co Ltd
Priority to CN202210731392.3A
Publication of CN115080741A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35Clustering; Classification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/34Browsing; Visualisation therefor
    • G06F16/345Summarisation for human users
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/205Parsing
    • G06F40/216Parsing using statistical methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce
    • G06Q30/02Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201Market modelling; Market analysis; Collecting market data
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/02Banking, e.g. interest calculation or account maintenance

Abstract

An embodiment of the present application provides a questionnaire survey analysis method, apparatus, storage medium and device. In the method, the response results of users to an NPS questionnaire are obtained, and the comment content in each response result is encoded with a preset model to obtain characterization features that represent the semantics of the corresponding comment content; the characterization features are then clustered, and key phrase extraction is performed on the clustered comment content, so that representative phrases of each category are obtained. The NPS questionnaire is thereby analyzed automatically, the viewpoints and intentions of users can be identified accurately, and effective reference information is provided for the decisions of enterprise managers.

Description

Questionnaire survey analysis method, device, storage medium and equipment
Technical Field
The present application relates to the field of data processing technologies, and in particular, to a questionnaire analysis method, apparatus, storage medium, and device.
Background
NPS (Net Promoter Score), also called the net recommendation value or net promoter score, is an index that measures how likely a customer is to recommend a business or service to others. An enterprise can use the net recommendation value of each business interaction object as an evaluation index to judge whether the quality of interaction with that object is good, which helps enterprise managers formulate corresponding service strategies.
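For reference only, as general background on the index rather than part of the claimed method, the net recommendation value is conventionally computed from the promoter and detractor proportions defined by the score segments described later:

```latex
\mathrm{NPS} = \frac{\#\text{promoters} - \#\text{detractors}}{\#\text{respondents}} \times 100\%
```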
Currently, NPS is generally collected by issuing an NPS questionnaire to users. The NPS questionnaire is usually divided into two parts: one part is a score, and the other is the reason the user gives for that score. In practice, the reason filled in by the user reflects the user's real preference better than the score itself, but this free-text content is not easy to analyze, and analyzing it manually consumes considerable time and labor.
Disclosure of Invention
An embodiment of the present application aims to provide a questionnaire analysis method, device, storage medium, and apparatus, so as to solve the problem that analysis of an NPS questionnaire in the related art is time-consuming and labor-consuming.
In a first aspect, an analysis method for questionnaires provided in an embodiment of the present application includes:
obtaining the reply result of each user to the NPS questionnaire, wherein the reply result comprises comment content;
coding each comment content based on a preset model to obtain a characterization feature corresponding to each comment content;
clustering the characteristic features of each comment content;
and extracting key phrases in each category aiming at the clustered comment contents.
In the implementation process, the response results of each user to the NPS questionnaire are obtained, and the comment content in each response result is encoded based on a preset model to obtain characterization features that represent the semantics of the corresponding comment content; the characterization features are then clustered, and key phrase extraction is performed on the clustered comment content, so that representative phrases of each category are obtained. The NPS questionnaire is thereby analyzed automatically, the viewpoints and intentions of users can be identified accurately, and effective reference information is provided for the decisions of enterprise managers.
Further, in some embodiments, the response results further include a score; the method further comprises the following steps:
carrying out sectional statistics on each response result based on the scoring scores;
and performing word cloud display on the comment content of each segment.
In the implementation process, the attitude of the user is rapidly displayed through segmented statistics and word cloud display.
Further, in some embodiments, the training data of the preset model includes initial data and retrieval data obtained by a search engine for the initial data; the preset model is obtained by training based on the following modes:
processing the initial data by using a pre-training model to generate predicted data;
constructing a first loss function based on the prediction data and the retrieval data, and constructing a second loss function based on the similarity between the prediction data and the retrieval data;
and accumulating the first loss function and the second loss function to obtain a target loss function, and training the pre-training model by using the target loss function.
In the implementation process, the model is trained by using a first loss function for controlling the generation content of the model and a second loss function for controlling the similarity of the generation content, so that the encoding result of the model can more accurately represent the semantics of the text.
Further, in some embodiments, the pre-trained model is a unified pre-trained language model UniLM.
In the implementation process, a selection scheme of the pre-training model is provided.
Further, in some embodiments, the training data is derived based on:
searching the initial data serving as a problem in a search engine to obtain a search result;
using a text similarity algorithm to carry out similarity sequencing on the retrieval results;
and determining the search results with the preset number sorted at the front as the search data.
In the implementation process, a solution for acquiring training data is provided.
Further, in some embodiments, the clustering the characterizing features of the comment contents includes:
and clustering the characteristic features of each comment content by using a K-means clustering algorithm, wherein the clustering number is determined based on an elbow rule.
In the implementation process, a clustering implementation mode is provided.
Further, in some embodiments, the key phrases are extracted using the TextRank algorithm.
In the implementation process, a solution for extracting key phrases is provided.
In a second aspect, an apparatus for analyzing questionnaire provided in an embodiment of the present application includes:
the system comprises an acquisition module, a query module and a query module, wherein the acquisition module is used for acquiring the reply result of each user to the NPS questionnaire, and the reply result comprises comment content;
the encoding module is used for encoding each comment content based on a preset model to obtain the characterization feature corresponding to each comment content;
the clustering module is used for clustering the characteristic features of the comment contents;
and the extraction module is used for extracting the key phrases in each category aiming at the clustered comment contents.
In a third aspect, an electronic device provided in an embodiment of the present application includes: memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing the steps of the method according to any of the first aspect when executing the computer program.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium having instructions stored thereon, which, when executed on a computer, cause the computer to perform the method according to any one of the first aspect.
In a fifth aspect, embodiments of the present application provide a computer program product, which when run on a computer, causes the computer to perform the method according to any one of the first aspect.
Additional features and advantages of the disclosure will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by the practice of the above-described techniques.
In order to make the aforementioned objects, features and advantages of the present application more comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are required to be used in the embodiments of the present application will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present application and therefore should not be considered as limiting the scope, and that those skilled in the art can also obtain other related drawings based on the drawings without inventive efforts.
Fig. 1 is a flowchart of a questionnaire analysis method provided in an embodiment of the present application;
FIG. 2 is a schematic diagram of an NPS questionnaire provided by an embodiment of the present application;
fig. 3 is a schematic diagram of a result of performing word cloud display on neutral comment data according to an embodiment of the present application;
FIG. 4 is a diagram of an Attention Mask of a model provided by an embodiment of the present application;
FIG. 5 is a schematic diagram illustrating a determination of the number of clusters based on elbow rules according to an embodiment of the present application;
fig. 6 is a block diagram of a questionnaire analysis apparatus according to an embodiment of the present application;
fig. 7 is a block diagram of an electronic device according to an embodiment of the present disclosure.
Detailed Description
The technical solutions in the embodiments of the present application will be described below with reference to the drawings in the embodiments of the present application.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined or explained in subsequent figures. Meanwhile, in the description of the present application, the terms "first", "second", and the like are used only for distinguishing the description, and are not to be construed as indicating or implying relative importance.
As described in the related art, the current analysis schemes for NPS questionnaires have the problems that the comment content is difficult to analyze and that the analysis is time-consuming and labor-consuming. Based on this, the embodiments of the present application provide a questionnaire survey analysis scheme to solve this problem.
As shown in fig. 1, fig. 1 is a flowchart of a questionnaire analysis method shown in an embodiment of the present application. The method can be applied to a terminal or a server, where the terminal can be any of various electronic devices including, but not limited to, a smart phone, a tablet computer, a laptop computer, a desktop computer, and the like, and the server may be a single server or a distributed server cluster consisting of a plurality of servers. It should be noted that the terminal/server may also be implemented as a plurality of software or software modules, or as a single piece of software or a single software module, which is not limited in this application.
The method comprises the following steps:
in step 101, obtaining the reply result of each user to the NPS questionnaire, wherein the reply result comprises comment content;
the NPS questionnaire referred to in this step is a common survey tool, generally consisting of two parts, the first part being a scoring question, usually asking the user to score a company, product or service from 0 to 10 points; the second part is the subsequent open questions, typically asking the customer why the score was given. Accordingly, the response result of the user to the NPS questionnaire also generally includes two parts, one part is a score and the other part is comment content for responding to the question.
After a client interacts with the business once, the enterprise system can push an NPS questionnaire to the client by means of a short message, WeChat, a pop-up window, or the like. Taking a bank system as an example, a customer going to a branch to handle business, an agent answering the customer's call, the customer logging in to the bank APP (Application), the customer following the enterprise's service account on an online public platform, or even the customer's click operation on a website all count as one interaction between the customer and the business. After such an interaction, in order to provide better service for the customer, the bank system can push an NPS questionnaire to the customer asking whether the customer is willing to recommend the business to others. After the customer answers and submits the NPS questionnaire, the enterprise system retrieves the questionnaire and extracts the response result from it.
In some embodiments, the response results further include a score, and the method further comprises: carrying out segmented statistics on the response results based on the scores, and performing word cloud display on the comment content of each segment. The NPS questionnaire asks the user to give a score between 0 and 10 according to the degree of willingness to recommend, where 10 indicates willingness to recommend and 0 indicates unwillingness; according to the score, users scoring 9-10 are promoters (recommenders), users scoring 7-8 are passives (neutrals), and users scoring 0-6 are detractors. After the response results of the users are obtained, segmented statistics can be performed on the response results based on the scores for ease of review, for example with a segment length of 2, giving the distribution of the numbers of 0-2, 3-4, 5-6, 7-8 and 9-10 results. Word cloud display is then performed on the comment content of each segment, so that the attitudes of users can be seen quickly; the recommended, neutral and non-recommended comment content can each be displayed as a word cloud, that is, when displaying the word clouds, the comment content corresponding to the three segments 0-2, 3-4 and 5-6 can be merged to obtain the non-recommended comment content. Specifically, the Jieba word segmentation tool can be used to segment each comment and remove stop words, and the WordCloud toolkit of Python can be used as the presentation tool, thereby obtaining the word cloud presentation result.
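The following is a minimal sketch (not the embodiment's own code) of the segmented statistics and word cloud display; the DataFrame column names `score` and `comment`, the stop-word set, the font path and the output file are illustrative assumptions.

```python
import jieba
import pandas as pd
from wordcloud import WordCloud

def segment_statistics(df: pd.DataFrame) -> pd.Series:
    """Count response results per score segment of length 2 (0-2, 3-4, 5-6, 7-8, 9-10)."""
    bins = [-1, 2, 4, 6, 8, 10]
    labels = ["0-2", "3-4", "5-6", "7-8", "9-10"]
    return pd.cut(df["score"], bins=bins, labels=labels).value_counts().sort_index()

def word_cloud_for_segment(comments, stopwords, out_path="wordcloud.png"):
    """Segment comments with Jieba, drop stop words, and render a word cloud image."""
    words = []
    for text in comments:
        words.extend(w for w in jieba.lcut(text) if w.strip() and w not in stopwords)
    wc = WordCloud(font_path="simhei.ttf",  # assumed font able to render Chinese text
                   width=800, height=400, background_color="white")
    wc.generate(" ".join(words)).to_file(out_path)

# Example usage (column names and file are hypothetical):
# df = pd.read_csv("nps_responses.csv")
# print(segment_statistics(df))
# word_cloud_for_segment(df[df["score"] <= 6]["comment"], stopwords={"的", "了", "是"})
```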
In step 102, coding each comment content based on a preset model to obtain a characterization feature corresponding to each comment content;
In this step, a preset model is used as an Encoder to encode each comment content and obtain the characterization feature corresponding to each comment content, where a characterization feature may be a feature value or a feature vector that characterizes the semantics of the comment content. In some examples, the preset model may be a BERT (Bidirectional Encoder Representations from Transformers) model; the BERT model is essentially a multi-layer bidirectional Encoder network built from Transformer blocks, and the sentence vectors it generates for a text can accurately represent the semantics of the sentences.
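As one concrete way to obtain such characterization features, the sketch below uses the Hugging Face transformers library and takes the [CLS] hidden state as the sentence vector; the checkpoint name `bert-base-chinese` and the pooling choice are assumptions, not details fixed by the embodiment.

```python
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")  # assumed checkpoint
model = BertModel.from_pretrained("bert-base-chinese")
model.eval()

@torch.no_grad()
def encode_comments(comments, max_length=128) -> torch.Tensor:
    """Encode each comment into a characterization feature vector of size hidden_size."""
    batch = tokenizer(comments, padding=True, truncation=True,
                      max_length=max_length, return_tensors="pt")
    outputs = model(**batch)
    return outputs.last_hidden_state[:, 0, :]  # [CLS] vector per comment: (n, hidden_size)

# features = encode_comments(list_of_comment_strings)   # e.g. shape (n_comments, 768)
```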
In order to improve how accurately the encoded characterization features represent the semantics of the corresponding comment content, in some other embodiments the training data of the preset model includes initial data and retrieval data obtained from a search engine for the initial data, and the preset model is trained as follows: the initial data is processed with a pre-trained model to generate predicted data; a first loss function is constructed based on the predicted data and the retrieval data, and a second loss function is constructed based on the similarity between the predicted data and the retrieval data; the first loss function and the second loss function are accumulated to obtain a target loss function, and the pre-trained model is trained with the target loss function. The pre-trained model may be the Unified pre-trained Language Model (UniLM), which can handle both language understanding and language generation. The initial data can be chosen according to the requirements of the specific scene; for example, if the method is applied to a bank system, data from the bank's FAQ (Frequently Asked Questions) can be used as the initial data, and the initial data is used as a question to be retrieved in a search engine to obtain the retrieval data. In addition, since a search engine returns many results, the search results can be ranked by similarity with a text similarity algorithm, and the preset number of top-ranked results are taken as the retrieval data; specifically, each sentence can be represented by the average of its word vectors, and the similarity between two sentences can then be measured with cosine similarity (a sketch of this ranking step follows this paragraph). The pre-trained model is trained with a multi-task method using two loss functions that respectively control the content generated by the model and the similarity of the generated content; the final target loss function is the sum of the first loss function and the second loss function, and the model is saved when the value of the target loss function drops below the previous minimum, at which point training is finished.
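A sketch of the similarity-ranking step used to build the retrieval data, under the assumption that pre-trained word vectors are available through a hypothetical `word_vectors` mapping; the averaging and cosine-similarity choices follow the description above.

```python
import jieba
import numpy as np

def sentence_vector(sentence: str, word_vectors: dict, dim: int = 300) -> np.ndarray:
    """Represent a sentence as the average of the word vectors of its tokens."""
    vecs = [word_vectors[w] for w in jieba.lcut(sentence) if w in word_vectors]
    return np.mean(vecs, axis=0) if vecs else np.zeros(dim)

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(np.dot(a, b) / denom) if denom else 0.0

def top_k_retrieval(query: str, candidates, word_vectors: dict, k: int = 5):
    """Rank the search-engine results by similarity to the query and keep the top k."""
    q = sentence_vector(query, word_vectors)
    scored = sorted(candidates,
                    key=lambda c: cosine(q, sentence_vector(c, word_vectors)),
                    reverse=True)
    return scored[:k]

# retrieval_data = top_k_retrieval(initial_question, search_results, word_vectors, k=5)
```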
In step 103, clustering the characterization features of the comment contents;
After the characterization features of the comment contents are obtained, the characterization features are clustered to realize text clustering. In some embodiments, the clustering algorithm may be the K-means clustering algorithm, an iterative clustering analysis algorithm that generally comprises the following steps: the data is to be divided into K groups; K objects are randomly selected as the initial cluster centers; the distance between each object and each cluster center is computed, and each object is assigned to the nearest cluster center. A cluster center together with the objects assigned to it represents a cluster. Each time an object is assigned, the cluster center of the affected cluster is recalculated based on the objects currently in the cluster. This process is repeated until some termination condition is met, for example that no (or only a minimal number of) objects are reassigned to different clusters, that no (or only a minimal number of) cluster centers change again, or that the sum of squared errors reaches a local minimum.
Because the data distribution of each type of comment content is different, it is difficult to use one fixed value as the number of clusters. Thus, in some embodiments, the number of clusters is determined based on the elbow rule. The elbow rule is computed from a cost function, namely the sum of the distortion degrees of the classes, where the distortion degree of a class equals the sum of squared distances between each member point and the class center (the more compact the members of a class are, the smaller its distortion degree; the more dispersed they are, the larger its distortion degree). When selecting the number of classes, the elbow rule plots the cost function value for different numbers of clusters. As the number of clusters increases, the number of samples contained in each class decreases, the samples move closer to their centers, and the average distortion degree decreases; as the number of clusters continues to increase, the improvement in the average distortion degree shrinks. The "elbow" is the value at which the improvement in distortion degree drops most sharply, and this value is the optimal number of classes. Specifically, when the elbow rule is used to find the optimal number of classes, the maximum and minimum numbers of classes are set, a step size is set as the interval between candidate numbers, and the optimal number is then screened using the SSE (sum of squared errors) as the evaluation index.
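A sketch of clustering with K-means and selecting K with the elbow rule, using scikit-learn's `KMeans`, whose `inertia_` attribute is exactly the SSE; the candidate range 1-8 matches the example given later, and picking the elbow by inspecting the curve is an assumption.

```python
import numpy as np
from sklearn.cluster import KMeans

def sse_curve(features: np.ndarray, k_min: int = 1, k_max: int = 8, step: int = 1) -> dict:
    """SSE (K-means inertia) for each candidate number of clusters."""
    return {k: KMeans(n_clusters=k, n_init=10, random_state=0).fit(features).inertia_
            for k in range(k_min, k_max + 1, step)}

def cluster_comments(features: np.ndarray, n_clusters: int) -> np.ndarray:
    """Cluster the characterization features and return one label per comment."""
    return KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(features)

# sse = sse_curve(features)             # plot K vs. SSE and pick the elbow, e.g. K = 4
# labels = cluster_comments(features, n_clusters=4)
```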
At step 104, for the clustered comment content, key phrases in each category are extracted.
After clustering, a plurality of key phrases can be extracted from each category to be used as the abstract of the category, so that the enterprise manager can determine the viewpoint and intention of each user conveniently.
In some embodiments, the key phrases in each category are extracted using the TextRank algorithm. The TextRank algorithm is a graph-based ranking algorithm for keyword extraction and document summarization; it can extract keywords and key phrases of a given text, and extract key sentences of the text with an extractive automatic summarization method. In implementation, therefore, all the comment content in each category is split by commas into sentence-like fragments and then fed to the algorithm; that is, the comment content of each category is converted into a text consisting of several sentences. When the TextRank graph is constructed, the sentences are used as nodes and weights are attached to the edges between nodes, where a weight represents the similarity of the two sentences; the importance of each node is then computed iteratively as follows:
WS(V_i) = (1 - d) + d \sum_{V_j \in In(V_i)} \frac{w_{ji}}{\sum_{V_k \in Out(V_j)} w_{jk}} WS(V_j)

where WS(V_i) characterizes the importance of node V_i; w_{ji} is the weight of the edge from node V_j to node V_i; d is a damping coefficient, generally taken as 0.85; In(V_i) is the set of nodes pointing to node V_i; and Out(V_i) is the set of nodes that node V_i points to. With the TextRank algorithm, key phrases can be extracted as the representative phrases of the corresponding categories. Of course, in other embodiments, other algorithms, such as the PageRank algorithm, may also be used to extract the key phrases, which is not limited in this application.
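A sketch of sentence-level TextRank for one cluster: the comma-separated fragments are the nodes, pairwise cosine similarity of their characterization features gives the edge weights, and the iteration above is run via networkx's weighted PageRank; the similarity measure and `top_n` are assumptions.

```python
import networkx as nx
import numpy as np

def textrank_phrases(fragments, vectors, top_n: int = 5, d: float = 0.85):
    """Return the top-ranked fragments of one cluster as its representative phrases."""
    graph = nx.Graph()
    graph.add_nodes_from(range(len(fragments)))
    for i in range(len(fragments)):
        for j in range(i + 1, len(fragments)):
            denom = np.linalg.norm(vectors[i]) * np.linalg.norm(vectors[j])
            sim = float(np.dot(vectors[i], vectors[j]) / denom) if denom else 0.0
            if sim > 0:
                graph.add_edge(i, j, weight=sim)  # edge weight = sentence similarity
    scores = nx.pagerank(graph, alpha=d, weight="weight")
    ranked = sorted(scores, key=scores.get, reverse=True)
    return [fragments[i] for i in ranked[:top_n]]

# fragments = [p for comment in comments_in_cluster for p in comment.split("，") if p.strip()]
# phrases = textrank_phrases(fragments, vectors_for(fragments))  # vectors_for: hypothetical helper
```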
According to the embodiments of the present application, the response results of each user to the NPS questionnaire are obtained, the comment content in each response result is encoded based on a preset model to obtain characterization features representing the semantics of the corresponding comment content, the characterization features are then clustered, and key phrase extraction is performed on the clustered comment content, so that representative phrases of each category are extracted. The NPS questionnaire is thereby analyzed automatically, the viewpoints and intentions of users can be identified accurately, and effective reference information is provided for the decisions of enterprise managers.
To illustrate the questionnaire analysis scheme of the present application in more detail, a specific example is described below:
the embodiment relates to a bank scene, and the automatic analysis process of a bank system aiming at an NPS questionnaire mainly comprises the following parts of questionnaire acquisition, statistical value distribution, word cloud display, text clustering and key phrase extraction, wherein:
in the questionnaire obtaining part, after the system interacts with the bank once, the system pushes an NPS questionnaire to the client, as shown in fig. 2, fig. 2 is a schematic diagram of an NPS questionnaire provided by an embodiment of the present application; the NPS questionnaire contains two questions, one is the invitation to give the NPS score and one is the reason why the query gives the score. After the customer completes the filling, the system recovers the questionnaire and acquires the reply result of the customer.
In the score distribution statistics part, the system adopts segmented statistics with a segment length of 2. In practical application, 2639 questionnaire results were recovered in total, and their distribution is shown in the following table:
Score range    0-2    3-4    5-6    7-8    9-10
Count           45    156    246    948    1244
In the word cloud display part, the system displays the non-recommended, neutral, and recommended comment data as word clouds respectively; specifically, the Jieba word segmentation tool is used to segment the comment data and remove stop words, and the WordCloud toolkit of Python is used as the display tool. As shown in fig. 3, fig. 3 is a schematic diagram of the result of word cloud display for neutral comment data according to an embodiment of the present application. Through the word cloud display, enterprise managers can conveniently see what users mainly discuss.
In the text clustering part, a model needs to be pre-trained in order to encode the comment data. Specifically, the main body of the model uses UniLM as the pre-trained model and is trained with a multi-task method, where task 1 generates SENT_b from SENT_a with loss function Loss1, and task 2 computes the similarity of SENT_a and SENT_b with loss function Loss2. The pre-training data comes from the retrieval data of a search engine; for example, the initial data "cancellation of bank card" is used as a query in the search engine to obtain retrieval data such as "how to cancel a bank card", "tips for cancelling a bank card", and "why a bank card cannot be cancelled".
The model gains stronger text comprehension capability through a special Attention Mask. The use of the Attention Mask is shown in fig. 4, where S1 is the source segment and S2 is the target segment, and tokens in the two segments are randomly masked: if the masked token belongs to S1, the model can attend to all tokens of S1; if the masked token belongs to S2, the model can attend only to all tokens of S1 plus the tokens of S2 to the left of, and including, the current token. In this way, the model implicitly learns both a bidirectional Encoder and a unidirectional Decoder.
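A minimal sketch of such a seq2seq-style attention mask for a concatenated input [S1; S2], where 1 means the row token may attend to the column token (the 0/1 convention and the helper name are illustrative assumptions):

```python
import torch

def unilm_seq2seq_mask(len_s1: int, len_s2: int) -> torch.Tensor:
    """S1 attends bidirectionally within S1; S2 attends to all of S1 and its own left context."""
    total = len_s1 + len_s2
    mask = torch.zeros(total, total, dtype=torch.long)
    mask[:, :len_s1] = 1                                               # every token sees all of S1
    mask[len_s1:, len_s1:] = torch.tril(torch.ones(len_s2, len_s2, dtype=torch.long))
    mask[:len_s1, len_s1:] = 0                                         # S1 never sees S2 (kept explicit)
    return mask

print(unilm_seq2seq_mask(3, 2))
```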
In order to enhance the NLU (Natural Language Understanding) capability of the model, two loss functions are used in model training. Loss1, which controls the content generated by the model, can be a seq2seq (sequence-to-sequence) loss function such as a cross-entropy loss; Loss2, which controls the similarity of the generated content, is calculated based on the following equations (1) to (5):
V \in R^{b \times d}    (1)

\tilde{V} = V / \lVert V \rVert_2, L2-normalized along dimension d    (2)

S = \tilde{V} \tilde{V}^{T} \in R^{b \times b}    (3)

S' = scale \cdot S, with the diagonal masked    (4)

Loss2 = CrossEntropy(softmax(S'), y), where y_i is the index of the sentence similar to sample i    (5)

Specifically, the procedure for obtaining Loss2 may be described as follows: all the [CLS] vectors of the whole batch are taken out to form a sentence vector matrix V ∈ R^{b×d} (b is batch_size, d is hidden_size); L2 norm normalization is then performed along dimension d to obtain \tilde{V}; pairwise inner products are computed to obtain the b×b similarity matrix S; the matrix is multiplied by a scale factor (optionally 50), which facilitates loss visualization during training, and the diagonal is masked; finally, softmax is applied to each row and the result is trained as a classification task in which the target label of each sample is its similar sentence, thereby obtaining the loss function Loss2.
The final loss function is Loss = α·Loss1 + β·Loss2, where α and β are adjustable hyper-parameters with values in the range [0, 1] that control how much the model emphasizes NLG versus NLU; multiple experiments show that the best effect is achieved when α = 1 and β = 1. In addition, during training the model uses batch_size = 64, AdamW is selected as the optimizer, and the model is saved whenever the loss falls below the current minimum.
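A sketch in PyTorch of Loss2 and the combined loss following the steps above; how the similar-sentence labels are laid out in the batch (adjacent pairs) and the large masking constant are assumptions.

```python
import torch
import torch.nn.functional as F

def similarity_loss(cls_vectors: torch.Tensor, labels: torch.Tensor, scale: float = 50.0):
    """Loss2: row-wise classification where each sample's target is its similar sentence."""
    v = F.normalize(cls_vectors, p=2, dim=1)                      # L2-normalize over hidden_size
    sim = v @ v.t() * scale                                       # b x b scaled similarity matrix
    sim = sim - torch.eye(sim.size(0), device=sim.device) * 1e12  # mask the diagonal
    return F.cross_entropy(sim, labels)

def total_loss(gen_logits, gen_targets, cls_vectors, sim_labels, alpha=1.0, beta=1.0):
    """Loss = alpha * Loss1 (seq2seq cross entropy) + beta * Loss2 (similarity)."""
    loss1 = F.cross_entropy(gen_logits.reshape(-1, gen_logits.size(-1)), gen_targets.reshape(-1))
    loss2 = similarity_loss(cls_vectors, sim_labels)
    return alpha * loss1 + beta * loss2

# Assumed batch layout: samples arrive in similar pairs (0,1), (2,3), ..., so the
# target of sample i is its partner i ^ 1:
# sim_labels = torch.arange(batch_size) ^ 1
```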
After the model is trained, it is used as an Encoder to extract features from each piece of comment data, and the K-means algorithm is then used to cluster the data. The number of clusters is determined with the elbow rule: specifically, the maximum value is set to 8, the minimum value to 1, and the step to 1 as the interval between candidate numbers of classes; with SSE as the evaluation index, the elbow-rule diagram shown in fig. 5 is obtained, where the value corresponding to the elbow is 4, that is, the best effect is achieved when the number of classes is 4. Therefore, the number of clusters is set to 4.
In the key phrase extraction part, the system extracts several key phrases from each category according to the clustering result as the representative phrases of the corresponding category, so that enterprise managers can quickly learn which aspects of the service clients care about most. Specifically, the system uses the TextRank algorithm to extract key phrases: all comment data in each category is split by commas into sentence fragments and fed to the algorithm, thereby obtaining the key phrases of each category, such as "too few branches" and "good service attitude". These key phrases can provide effective guidance for the relevant business departments.
As can be seen from the above, the solution of this embodiment has at least the following effects: first, starting from multiple dimensions, it not only displays the NPS recommendation value but also analyzes user comments with NLP sentiment analysis techniques and judges whether the comment content is positively correlated with the score; second, it clusters the NPS text data with deep learning techniques, finds the topics users care about, and provides guidance for the relevant business departments.
Corresponding to the foregoing method embodiments, the present application further provides embodiments of a questionnaire survey analysis apparatus and a terminal applied thereto:
as shown in fig. 6, fig. 6 is a block diagram of a questionnaire analysis apparatus according to an embodiment of the present application; the device comprises:
an obtaining module 61, configured to obtain a response result of each user to the NPS questionnaire, where the response result includes comment content;
the encoding module 62 is configured to encode each comment content based on a preset model to obtain a characterization feature corresponding to each comment content;
the clustering module 63 is used for clustering the characteristic features of the comment contents;
and the extracting module 64 is used for extracting the key phrases in each category according to the clustered comment contents.
Fig. 7 shows a block diagram of an electronic device according to an embodiment of the present disclosure, where fig. 7 is a block diagram of the electronic device. The electronic device may include a processor 710, a communication interface 720, a memory 730, and at least one communication bus 740. Wherein a communication bus 740 is used to enable direct, connected communication of these components. In this embodiment, the communication interface 720 of the electronic device is used for performing signaling or data communication with other node devices. Processor 710 may be an integrated circuit chip having signal processing capabilities.
The Processor 710 may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or discrete hardware components. The various methods, steps, and logic blocks disclosed in the embodiments of the present application may be implemented or performed by it. A general-purpose processor may be a microprocessor, or the processor 710 may be any conventional processor or the like.
The Memory 730 may be, but is not limited to, a Random Access Memory (RAM), a Read-Only Memory (ROM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), and the like. The memory 730 stores computer-readable instructions which, when executed by the processor 710, enable the electronic device to perform the steps described above in connection with the method embodiment of fig. 1.
Optionally, the electronic device may further include a memory controller, an input output unit.
The memory 730, the memory controller, the processor 710, the peripheral interface, and the input/output unit are electrically connected to each other directly or indirectly to implement data transmission or interaction. For example, these components may be electrically coupled to each other via one or more communication buses 740. The processor 710 is adapted to execute executable modules stored in the memory 730, such as software functional modules or computer programs comprised by the electronic device.
The input/output unit is used to allow a user to create a task and to set an optional time period or a preset execution time for the created task, so as to realize interaction between the user and the server. The input/output unit may be, but is not limited to, a mouse, a keyboard, and the like.
It will be appreciated that the configuration shown in fig. 7 is merely illustrative and that the electronic device may include more or fewer components than shown in fig. 7 or have a different configuration than shown in fig. 7. The components shown in fig. 7 may be implemented in hardware, software, or a combination thereof.
The embodiments of the present application further provide a storage medium in which instructions are stored; when the instructions are run on a computer, that is, when the computer program is executed by a processor, the method described in the method embodiments is implemented. To avoid repetition, details are not repeated here.
The present application also provides a computer program product which, when run on a computer, causes the computer to perform the method of the method embodiments.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method can be implemented in other ways. The apparatus embodiments described above are merely illustrative, and for example, the flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition, functional modules in the embodiments of the present application may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.
The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application or portions thereof that substantially contribute to the prior art may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk, and various media capable of storing program codes.
The above description is only an example of the present application and is not intended to limit the scope of the present application, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application shall be included in the protection scope of the present application. It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.
The above description is only for the specific embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present application, and shall be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.

Claims (10)

1. A questionnaire analysis method, characterized by comprising:
obtaining the reply result of each user to the NPS questionnaire, wherein the reply result comprises comment content;
coding each comment content based on a preset model to obtain a characterization feature corresponding to each comment content;
clustering the characteristic features of each comment content;
and extracting key phrases in each category aiming at the clustered comment contents.
2. The method of claim 1, wherein the response results further include a score; the method further comprises the following steps:
carrying out sectional statistics on each response result based on the scoring scores;
and performing word cloud display on the comment content of each segment.
3. The method according to claim 1, wherein the training data of the preset model comprises initial data and retrieval data obtained by a search engine for the initial data; the preset model is obtained by training based on the following modes:
processing the initial data by using a pre-training model to generate predicted data;
constructing a first loss function based on the prediction data and the retrieval data, and constructing a second loss function based on the similarity between the prediction data and the retrieval data;
and accumulating the first loss function and the second loss function to obtain a target loss function, and training the pre-training model by using the target loss function.
4. The method of claim 3, wherein the pre-trained model is a unified pre-trained language model (UniLM).
5. The method of claim 3, wherein the training data is derived based on:
searching the initial data serving as a problem in a search engine to obtain a search result;
using a text similarity algorithm to carry out similarity sequencing on the retrieval results;
and determining the search results with the preset number sorted at the front as the search data.
6. The method of claim 1, wherein clustering the characterizing features of the comment contents comprises:
and clustering the characteristic features of each comment content by using a K-means clustering algorithm, wherein the clustering number is determined based on an elbow rule.
7. The method of claim 1, wherein the key phrases are extracted using a TextRank algorithm.
8. A questionnaire analysis device characterized by comprising:
the system comprises an acquisition module, a query module and a query module, wherein the acquisition module is used for acquiring the reply result of each user to the NPS questionnaire, and the reply result comprises comment content;
the coding module is used for coding each comment content based on a preset model to obtain the characterization feature corresponding to each comment content;
the clustering module is used for clustering the characteristic features of the comment contents;
and the extraction module is used for extracting the key phrases in each category aiming at the clustered comment contents.
9. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the method according to any one of claims 1 to 7.
10. A computer device comprising a processor, a memory, and a computer program stored on the memory and executable on the processor, wherein the processor implements the method of any of claims 1 to 7 when executing the computer program.
CN202210731392.3A 2022-06-24 2022-06-24 Questionnaire survey analysis method, device, storage medium and equipment Pending CN115080741A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210731392.3A CN115080741A (en) 2022-06-24 2022-06-24 Questionnaire survey analysis method, device, storage medium and equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210731392.3A CN115080741A (en) 2022-06-24 2022-06-24 Questionnaire survey analysis method, device, storage medium and equipment

Publications (1)

Publication Number Publication Date
CN115080741A true CN115080741A (en) 2022-09-20

Family

ID=83256031

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210731392.3A Pending CN115080741A (en) 2022-06-24 2022-06-24 Questionnaire survey analysis method, device, storage medium and equipment

Country Status (1)

Country Link
CN (1) CN115080741A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116151872A (en) * 2022-11-28 2023-05-23 荣耀终端有限公司 Product characteristic analysis method and device
CN116151872B (en) * 2022-11-28 2023-11-14 荣耀终端有限公司 Product characteristic analysis method and device


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination