CN115099899A - Data processing method and device, electronic equipment and storage medium - Google Patents


Info

Publication number
CN115099899A
Authority
CN
China
Prior art keywords
user
determining
initial
keywords
derogation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210747502.5A
Other languages
Chinese (zh)
Inventor
瞿学新
陈涛
翟文博
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Bank Co Ltd
Original Assignee
Ping An Bank Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Bank Co Ltd filed Critical Ping An Bank Co Ltd
Priority to CN202210747502.5A priority Critical patent/CN115099899A/en
Publication of CN115099899A publication Critical patent/CN115099899A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G06Q30/0601 Electronic shopping [e-shopping]
    • G06Q30/0631 Item recommendations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/284 Lexical analysis, e.g. tokenisation or collocates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G06Q30/0601 Electronic shopping [e-shopping]
    • G06Q30/0605 Supply or demand aggregation

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Finance (AREA)
  • Accounting & Taxation (AREA)
  • Strategic Management (AREA)
  • Development Economics (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Marketing (AREA)
  • Economics (AREA)
  • General Business, Economics & Management (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Data Mining & Analysis (AREA)
  • Game Theory and Decision Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiments of the application disclose a data processing method and device, electronic equipment and a storage medium. In the method, the electronic equipment obtains initial feedback samples corresponding to a plurality of research users and extracts initial derogation keywords from the initial feedback samples to obtain an initial derogation keyword set; determines, among the plurality of research users, similar users that are similar to the predicted user, and determines first feedback samples corresponding to the similar users among the initial feedback samples; determines, in the first feedback samples, first derogation keywords that appear in the initial derogation keyword set; determines a confidence level of each first derogation keyword in the first feedback samples; and determines, according to the confidence levels, target derogation keywords corresponding to the predicted user from the first derogation keywords. The derogation reason of the predicted user is thereby determined from the research users and the initial feedback samples corresponding to the research users.

Description

Data processing method and device, electronic equipment and storage medium
Technical Field
The present application relates to the field of data processing, and in particular, to a data processing method and apparatus, an electronic device, and a storage medium.
Background
With the rapid development of the internet, a good reputation can bring an enterprise a large amount of traffic and improve user stickiness. Many internet companies use the Net Promoter Score (NPS) to measure how likely a user is to recommend a product or service to others.
However, the conventional way of surveying the Net Promoter Score is mainly the questionnaire. Questionnaires have limited coverage and are slow to be returned, so the derogation reason of a user cannot be determined accurately and the user cannot be retained in time.
Disclosure of Invention
The embodiments of the application provide a data processing method and device, electronic equipment and a storage medium. The data processing method can determine the derogation reason of the predicted user from the research users and the initial feedback samples corresponding to the research users.
In a first aspect, an embodiment of the present application provides a data processing method, including:
acquiring initial feedback samples corresponding to a plurality of research users, and extracting initial derogation keywords from the initial feedback samples to obtain an initial derogation keyword set;
determining, among the plurality of research users, similar users that are similar to the predicted user, and determining first feedback samples corresponding to the similar users among the initial feedback samples;
determining, in the first feedback samples, first derogation keywords that appear in the initial derogation keyword set;
determining a confidence level of each first derogation keyword in the first feedback samples;
and determining, according to the confidence levels, target derogation keywords corresponding to the predicted user from the first derogation keywords.
In a second aspect, an embodiment of the present application provides a data processing apparatus, including:
an acquisition module, configured to acquire initial feedback samples corresponding to a plurality of research users, and to extract initial derogation keywords from the initial feedback samples to obtain an initial derogation keyword set;
a first determining module, configured to determine, among the plurality of research users, similar users that are similar to the predicted user, and to determine first feedback samples corresponding to the similar users among the initial feedback samples;
a second determining module, configured to determine, in the first feedback samples, first derogation keywords that appear in the initial derogation keyword set;
a third determining module, configured to determine a confidence level of each first derogation keyword in the first feedback samples;
and a fourth determining module, configured to determine, according to the confidence levels, target derogation keywords corresponding to the predicted user from the first derogation keywords.
In a third aspect, an embodiment of the present application provides an electronic device, including: a memory storing executable program code, a processor coupled to the memory; the processor calls the executable program codes stored in the memory to execute the steps in the data processing method provided by the embodiment of the application.
In a fourth aspect, an embodiment of the present application provides a storage medium, where the storage medium stores multiple instructions, and the instructions are suitable for a processor to load, so as to implement steps in a data processing method provided in an embodiment of the present application.
In the embodiments of the application, the electronic device obtains initial feedback samples corresponding to a plurality of research users and extracts initial derogation keywords from the initial feedback samples to obtain an initial derogation keyword set; determines, among the plurality of research users, similar users that are similar to the predicted user, and determines first feedback samples corresponding to the similar users among the initial feedback samples; determines, in the first feedback samples, first derogation keywords that appear in the initial derogation keyword set; determines a confidence level of each first derogation keyword in the first feedback samples; and determines, according to the confidence levels, target derogation keywords corresponding to the predicted user from the first derogation keywords. The derogation reason of the predicted user is thereby determined from the research users and the initial feedback samples corresponding to the research users.
Drawings
In order to illustrate the technical solutions in the embodiments of the present application more clearly, the drawings needed in the description of the embodiments are briefly introduced below. The drawings in the following description show only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a first flowchart of a data processing method according to an embodiment of the present application.
Fig. 2 is a second flowchart of the data processing method according to the embodiment of the present application.
Fig. 3 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present application.
Fig. 4 is a schematic structural diagram of an electronic device provided in an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
With the rapid development of the internet, a good reputation can bring an enterprise a large amount of traffic and improve user stickiness. Many internet companies use the Net Promoter Score (NPS) to measure how likely a user is to recommend a product or service to others.
However, the conventional way of surveying the Net Promoter Score is mainly the questionnaire. Questionnaires have limited coverage and are slow to be returned, so the derogation reason of a user cannot be determined accurately and the user cannot be retained in time.
In order to solve this technical problem, embodiments of the present application provide a data processing method and apparatus, an electronic device, and a storage medium. The data processing method can determine the derogation reason of the predicted user from the research users and the initial feedback samples corresponding to the research users.
The data processing method provided by the embodiment of the application can be applied to electronic equipment with storage and calculation capabilities, such as computers, servers, workstations and the like.
Referring to fig. 1, fig. 1 is a first flowchart of a data processing method according to an embodiment of the present disclosure. The data processing method may include the steps of:
110. Acquiring initial feedback samples corresponding to a plurality of research users, and extracting initial derogation keywords from the initial feedback samples to obtain an initial derogation keyword set.
In some embodiments, the initial feedback samples of different research users may be obtained through questionnaire research over online or offline channels. For example, the evaluation text written by a research user may be used as an initial feedback sample, or the options the research user selected in a questionnaire may be used as an initial feedback sample.
Specifically, the electronic device may integrate the questionnaire contents of the same research user in the same time period, for example by extracting the text information in them, to obtain an initial feedback sample. The initial feedback sample contains the research user's evaluation feedback on the product or service.
It should be noted that one research user may correspond to a plurality of initial feedback samples at different times. For example, if the research user fills in one questionnaire in January and another in May, the January questionnaire generates one initial feedback sample for the research user and the May questionnaire generates another.
In some embodiments, after the initial feedback samples corresponding to the research users are obtained, initial derogation keywords may be extracted from the initial feedback samples to obtain an initial derogation keyword set.
Specifically, the electronic device may perform word segmentation and/or stop-word filtering on the initial feedback sample to obtain a first processing sample, input the first processing sample into the text processing model, and output the initial derogation keywords to obtain an initial derogation keyword set.
For example, the electronic device may first obtain all the text in the initial feedback sample and then perform word segmentation on it, for example with a word segmentation tool. The extracted keywords may be words or short phrases. The first processing sample is thereby obtained.
For another example, the electronic device may set stop words, for example words with a positive meaning such as "praise", "not bad" and "easy to use", and then filter these stop words out of all the text. The first processing sample is thereby obtained.
When processing the initial feedback sample, the electronic device may also combine the two operations: it extracts the text from the initial feedback sample to obtain a plurality of texts, removes the stop words from these texts, and then performs word segmentation on the filtered text to obtain the first processing sample.
Finally, the first processing sample is processed by the text processing model to obtain the initial derogation keywords, from which the initial derogation keyword set is formed. For example, the text processing model may produce word vectors for the different keywords in the first processing sample, the initial derogation keywords are then determined according to the word vectors, and the initial derogation keyword set is generated from them.
The text processing model may be a BERT model or a KeyBERT model, which performs keyword extraction on the first processing sample to obtain the initial derogation keywords. The initial derogation keywords are the words that research users use to evaluate the product or service negatively.
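The preprocessing in step 110 can be illustrated with a minimal sketch. It is not the implementation described in the application: the regular-expression split only stands in for the unnamed word segmentation tool, the stop-word list and the function name `to_first_processing_sample` are hypothetical, and the text processing model step is left out.

```python
import re

# Hypothetical stop-word list: the text only says that positive words
# may be configured as stop words.
STOP_WORDS = {"good", "great", "nice"}

def to_first_processing_sample(text, stop_words=STOP_WORDS):
    """Segment the text into lowercase word tokens and drop stop words."""
    # A regex split stands in for a real word segmentation tool.
    tokens = re.findall(r"[a-z']+", text.lower())
    return [t for t in tokens if t not in stop_words]

sample = "Good app, but login is slow and fees are high"
processed = to_first_processing_sample(sample)
print(processed)
```

The resulting token list is what would then be passed to the text processing model to pick out the initial derogation keywords.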
120. Determining, among the plurality of research users, similar users that are similar to the predicted user, and determining first feedback samples corresponding to the similar users among the initial feedback samples.
Each user has corresponding user characteristics, which may span multiple dimensions such as age, gender, consumption habits, economic strength and personal hobbies.
In some embodiments, the electronic device may acquire a first feature corresponding to each research user and a second feature of the predicted user, determine the similarity between each research user and the predicted user according to the first feature and the second feature, and determine the research users whose similarity is greater than a preset similarity threshold as similar users.
For example, the electronic device may determine the user characteristics of each research user from that user's personal data in the database, that is, personal data the user has authorized for storage and use, such as age, gender, consumption habits, economic strength and personal hobbies. The electronic device then generates the first feature of the research user from the vectors of the user characteristics of each dimension. The first feature can be understood as the sum of the vectors of the research user's multi-dimensional user characteristics.
In the same way, the electronic device may determine the user characteristics of the predicted user from the predicted user's personal data in the database, that is, personal data the user has authorized for storage and use, such as age, gender, consumption habits, economic strength and personal hobbies. The electronic device then generates the second feature of the predicted user from the vectors of the user characteristics of each dimension. The second feature can be understood as the sum of the vectors of the predicted user's multi-dimensional user characteristics.
The electronic device may then obtain the feature similarity between each first feature and the second feature. The feature similarity may be computed in several ways, for example with the cosine distance, the Euclidean distance or the Pearson correlation coefficient.
The electronic device may take the feature similarity between each first feature and the second feature as the similarity between the corresponding research user and the predicted user, and determine the research users whose similarity is greater than the preset similarity threshold as similar users.
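As a minimal sketch of the similar-user selection, the snippet below uses cosine similarity, one of the measures the text lists. The user identifiers, feature vectors, threshold value and the function name `find_similar_users` are illustrative assumptions, not values from the application.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)

def find_similar_users(first_features, second_feature, threshold):
    """Return {research user: similarity} for users above the threshold."""
    sims = {u: cosine_similarity(v, second_feature)
            for u, v in first_features.items()}
    return {u: s for u, s in sims.items() if s > threshold}

# Hypothetical first features (research users) and second feature (predicted user).
first_features = {"u1": [1.0, 0.0], "u2": [1.0, 1.0], "u3": [0.0, 1.0]}
second_feature = [1.0, 0.1]
similar = find_similar_users(first_features, second_feature, threshold=0.7)
print(sorted(similar))
```

Keeping the similarity alongside each similar user is convenient because the later, similarity-weighted confidence calculation needs those values again.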
In some embodiments, after the similar users are determined, the first feedback samples corresponding to the similar users may be determined among the initial feedback samples.
In some embodiments, the electronic device may calculate a net promoter score for each similar user from the first feedback sample corresponding to that user, and then determine first derogation users among the similar users according to the net promoter scores.
For example, the electronic device determines, among the similar users, the target similar users whose net promoter score is smaller than a preset net promoter score, and determines these target similar users as first derogation users.
In some embodiments, the electronic device may determine the proportion of first derogation users among the similar users; if the proportion is greater than a preset proportion threshold, the predicted user is determined to be a target derogation user.
That is to say, when the proportion of first derogation users among the similar users is greater than the preset proportion threshold, most of the users similar to the predicted user are derogation users, so the predicted user is also likely to be a derogation user.
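The proportion check can be sketched as follows. The per-user scores, the 0 to 10 scale, both threshold values and the function name are assumptions made for illustration; the application does not state how the per-user score is computed.

```python
def is_target_derogation_user(scores, score_threshold, ratio_threshold):
    """Decide whether the predicted user is likely a derogation user.

    scores maps each similar user to a (hypothetical) per-user score.
    """
    if not scores:
        return False
    # Similar users scoring below the preset value are first derogation users.
    derogation_users = [u for u, s in scores.items() if s < score_threshold]
    ratio = len(derogation_users) / len(scores)
    return ratio > ratio_threshold

# Illustrative scores on an assumed 0-10 scale.
scores = {"u1": 3, "u2": 9, "u3": 5, "u4": 6}
result = is_target_derogation_user(scores, score_threshold=7, ratio_threshold=0.5)
print(result)
```

Here three of the four similar users fall below the score threshold, so the proportion 0.75 exceeds the 0.5 proportion threshold.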
130. Determining, in the first feedback samples, first derogation keywords that appear in the initial derogation keyword set.
In some embodiments, after determining the first feedback samples corresponding to the similar users, the electronic device may obtain each keyword of the first feedback samples and determine the keywords that appear in the initial derogation keyword set as first derogation keywords.
For example, the electronic device may match each keyword against the initial derogation keyword set by keyword matching, and determine a keyword as a first derogation keyword if the match succeeds.
After determining the first derogation keywords, the electronic device may further determine the number of times each first derogation keyword appears in the first feedback samples. For example, a given first derogation keyword may be matched against each first feedback sample, its occurrences in each first feedback sample counted, and the counts added up to obtain the number of times it appears in all the first feedback samples.
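The matching and counting in step 130 can be sketched with a counter. The sample keywords and the function name `count_first_keywords` are hypothetical, and each first feedback sample is assumed to be already reduced to a list of keywords.

```python
from collections import Counter

def count_first_keywords(first_feedback_samples, initial_keyword_set):
    """Count occurrences of initial derogation keywords across all samples."""
    counts = Counter()
    for keywords in first_feedback_samples:
        for kw in keywords:
            if kw in initial_keyword_set:  # keyword matching against the set
                counts[kw] += 1
    return counts

# Illustrative data: one keyword list per first feedback sample.
initial_keyword_set = {"slow", "expensive", "crash"}
first_feedback_samples = [
    ["slow", "login", "slow"],
    ["expensive", "slow"],
    ["friendly"],
]
counts = count_first_keywords(first_feedback_samples, initial_keyword_set)
print(dict(counts))
```

Only keywords that actually occur in the first feedback samples end up in the counter, so its keys are exactly the first derogation keywords.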
140. Determining the confidence level of each first derogation keyword in the first feedback samples.
In some embodiments, the electronic device may select a target first derogation keyword from the first derogation keywords, determine a first total number of times that the target first derogation keyword appears in all the first feedback samples, determine a second total number of times that all the first derogation keywords appear in all the first feedback samples, and divide the first total number by the second total number to obtain the confidence level of the target first derogation keyword.
For example, suppose the first derogation keywords include keyword A, keyword B and keyword C, and keyword A is the target first derogation keyword. The electronic device obtains the first total number of times that keyword A appears in all the first feedback samples, and determines the second total number of times that keyword A, keyword B and keyword C together appear in all the first feedback samples. Dividing the first total number by the second total number gives the confidence level of keyword A.
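The ratio described above can be written as a short sketch. The counts for keywords A, B and C are made-up numbers, and the function name `keyword_confidences` is hypothetical.

```python
def keyword_confidences(counts):
    """Confidence of each keyword = its count / total count of all keywords."""
    second_total = sum(counts.values())
    if second_total == 0:
        return {}
    return {kw: first_total / second_total for kw, first_total in counts.items()}

# Illustrative occurrence counts in all first feedback samples.
counts = {"A": 6, "B": 3, "C": 1}
confidences = keyword_confidences(counts)
print(confidences)
```

With this definition the confidence levels of all first derogation keywords sum to one, so each value can be read as that keyword's share of all derogation mentions among the similar users.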
In some embodiments, the electronic device may further determine the first similar users corresponding to the target first derogation keyword, obtain the similarity between each first similar user and the predicted user, add these similarities to obtain a sum of first similarities, and multiply the sum of first similarities by the first total number of times that the target first derogation keyword appears in all the first feedback samples to obtain a first product result.
The electronic device also determines the similar users corresponding to all the first derogation keywords, adds the similarities between each of these similar users and the predicted user to obtain a sum of second similarities, and multiplies the sum of second similarities by the second total number of times that all the first derogation keywords appear in all the first feedback samples to obtain a second product result.
Finally, the first product result is divided by the second product result to obtain the confidence level of the target first derogation keyword.
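One possible reading of this similarity-weighted confidence is sketched below. The wording of the source is ambiguous, so the interpretation (similarity sums taken over the users who mentioned the keywords), the data and the function name `weighted_confidence` are all assumptions.

```python
def weighted_confidence(target, samples, similarities):
    """Similarity-weighted confidence of one target keyword.

    samples: list of (user_id, first derogation keywords in that user's sample)
    similarities: user_id -> similarity between that user and the predicted user
    """
    first_total = 0       # occurrences of the target keyword
    second_total = 0      # occurrences of all first derogation keywords
    target_users = set()  # users who mentioned the target keyword
    all_users = set()     # users who mentioned any first derogation keyword
    for user, keywords in samples:
        for kw in keywords:
            second_total += 1
            all_users.add(user)
            if kw == target:
                first_total += 1
                target_users.add(user)
    first_product = sum(similarities[u] for u in target_users) * first_total
    second_product = sum(similarities[u] for u in all_users) * second_total
    return first_product / second_product if second_product else 0.0

# Illustrative data.
samples = [("u1", ["A", "B"]), ("u2", ["A"]), ("u3", ["C"])]
similarities = {"u1": 0.9, "u2": 0.8, "u3": 0.3}
conf_a = weighted_confidence("A", samples, similarities)
print(round(conf_a, 3))
```

Compared with the plain count ratio, this variant gives more weight to keywords mentioned by users who are closer to the predicted user. Under this reading the confidence levels of the keywords no longer sum to one.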
In this way, the confidence level of each first derogation keyword in all the first feedback samples can be determined in turn.
150. Determining, according to the confidence levels, target derogation keywords corresponding to the predicted user from the first derogation keywords.
In some embodiments, the electronic device determines, from the first derogation keywords, second derogation keywords whose confidence level is greater than a preset confidence level, and determines the second derogation keywords as the target derogation keywords corresponding to the predicted user.
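The final selection is a simple threshold filter, sketched here with made-up confidence levels and a hypothetical preset confidence of 0.25. Sorting by confidence is an added convenience, not something the text requires.

```python
def select_target_keywords(confidences, preset_confidence):
    """Keep keywords whose confidence exceeds the preset confidence."""
    selected = [kw for kw, c in confidences.items() if c > preset_confidence]
    # Most confident keywords first, for easier reading of the result.
    return sorted(selected, key=lambda kw: confidences[kw], reverse=True)

confidences = {"A": 0.6, "B": 0.3, "C": 0.1}
targets = select_target_keywords(confidences, preset_confidence=0.25)
print(targets)
```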
That is to say, the second derogation keywords may be used as the derogation reason corresponding to the predicted user. If the predicted user is a derogation user, the service or product as a whole may be adjusted according to the derogation reason.
If the predicted user is not a derogation user, existing shortcomings in the product or the user experience may still be improved in subsequent product development according to the derogation reason of the predicted user.
In the embodiments of the application, the electronic device determines whether the predicted user is a derogation user, and determines the derogation reason corresponding to the predicted user, from the research users and the initial feedback samples corresponding to the research users. Defects in the product or service can thereby be identified and corrected in time so as to retain users.
In the embodiments of the application, the electronic device obtains initial feedback samples corresponding to a plurality of research users and extracts initial derogation keywords from the initial feedback samples to obtain an initial derogation keyword set; determines, among the plurality of research users, similar users that are similar to the predicted user, and determines first feedback samples corresponding to the similar users among the initial feedback samples; determines, in the first feedback samples, first derogation keywords that appear in the initial derogation keyword set; determines a confidence level of each first derogation keyword in the first feedback samples; and determines, according to the confidence levels, target derogation keywords corresponding to the predicted user from the first derogation keywords. The derogation reason of the predicted user is thereby determined from the research users and the initial feedback samples corresponding to the research users.
For a more detailed understanding of the data processing method provided in the embodiment of the present application, please refer to fig. 2, wherein fig. 2 is a second flowchart of the data processing method provided in the embodiment of the present application. The data processing method may include the steps of:
201. Acquiring initial feedback samples corresponding to a plurality of research users, and performing word segmentation and/or stop-word filtering on the initial feedback samples to obtain a first processing sample.
In some embodiments, the initial feedback samples of different research users may be obtained through questionnaire research over online or offline channels. For example, the evaluation text written by a research user may be used as an initial feedback sample, or the options the research user selected in a questionnaire may be used as an initial feedback sample.
Specifically, the electronic device may integrate the questionnaire contents of the same research user in the same time period, for example by extracting the text information in them, to obtain an initial feedback sample. The initial feedback sample contains the research user's evaluation feedback on the product or service.
It should be noted that one research user may correspond to a plurality of initial feedback samples at different times. For example, if the research user fills in one questionnaire in January and another in May, the January questionnaire generates one initial feedback sample for the research user and the May questionnaire generates another.
Specifically, the electronic device may perform word segmentation and/or stop-word filtering on the initial feedback sample to obtain a first processing sample, then input the first processing sample into the text processing model and output the initial derogation keywords to obtain an initial derogation keyword set.
For example, the electronic device may first obtain all the text in the initial feedback sample and then perform word segmentation on it, for example with a word segmentation tool. The extracted keywords may be words or short phrases. The first processing sample is thereby obtained.
For another example, the electronic device may set stop words, for example words with a positive meaning such as "praise", "not bad" and "easy to use", and then filter these stop words out of all the text. The first processing sample is thereby obtained.
When processing the initial feedback sample, the electronic device may also combine the two operations: it extracts the text from the initial feedback sample to obtain a plurality of texts, removes the stop words from these texts, and then performs word segmentation on the filtered text to obtain the first processing sample.
202. Inputting the first processing sample into a text processing model, and outputting the initial derogation keywords to obtain an initial derogation keyword set.
The first processing sample is processed by the text processing model to obtain the initial derogation keywords, from which the initial derogation keyword set is formed. For example, the text processing model may produce word vectors for the different keywords in the first processing sample, the initial derogation keywords are then determined according to the word vectors, and the initial derogation keyword set is generated from them.
The text processing model may be a BERT model or a KeyBERT model, which performs keyword extraction on the first processing sample to obtain the initial derogation keywords. The initial derogation keywords are the words that research users use to evaluate the product or service negatively.
203. And acquiring a first characteristic corresponding to each research user and a second characteristic of the predicted user.
For different users, each user has corresponding user characteristics, wherein the user characteristics can be user characteristics of multiple dimensions such as user age, gender, consumption habits, economic strength, personal hobbies and the like.
For example, the electronic device may determine the user characteristics of each research user according to the personal data corresponding to that research user in the database, for example, by acquiring personal data that the user has authorized for storage and use, such as age, gender, consumption habits, economic strength, and personal hobbies. Finally, a first feature corresponding to the research user is generated from the vector of the user characteristic of each dimension; the first feature can be understood as the sum of the vectors of the multi-dimensional research user characteristics.
Likewise, the electronic device may determine the user characteristics of the predicted user according to the personal data corresponding to each predicted user in the database, for example, by acquiring personal data that the user has authorized for storage and use, covering multiple dimensions such as age, gender, consumption habits, economic strength, and personal hobbies. Finally, a second feature corresponding to the predicted user is generated from the vector of the user characteristic of each dimension; the second feature can be obtained by adding the vectors of the multi-dimensional user characteristics.
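As an illustration of building a user feature by adding the vectors of the individual dimensions, a minimal sketch is given below. How each dimension is encoded as a vector is not specified in the text, so the example vectors, their width, and the function name `combine_features` are assumptions.

```python
def combine_features(dimension_vectors: dict) -> list:
    """Return the element-wise sum of equally sized per-dimension vectors."""
    vectors = list(dimension_vectors.values())
    if not vectors:
        raise ValueError("at least one dimension vector is required")
    width = len(vectors[0])
    if any(len(vector) != width for vector in vectors):
        raise ValueError("all dimension vectors must have the same length")
    # zip(*vectors) groups the i-th component of every dimension together.
    return [sum(components) for components in zip(*vectors)]


# Hypothetical encodings for one research user (assumption).
first_feature = combine_features({
    "age": [0.3, 0.0, 0.0, 0.0],
    "gender": [0.0, 1.0, 0.0, 0.0],
    "consumption_habits": [0.0, 0.0, 0.5, 0.25],
})
```

The same function could be applied to the predicted user's dimension vectors to obtain the second feature.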
204. And determining the similarity between each research user and the predicted user according to the first characteristic and the second characteristic, and determining the research users with the similarity larger than a preset similarity threshold as similar users.
Finally, the electronic device may obtain the feature similarity between each first feature and each second feature. The feature similarity may be computed in any of several ways, such as cosine distance, Euclidean distance, or the Pearson correlation coefficient.
The electronic device may determine a feature similarity between each first feature and each second feature as a similarity between each research user and the predicted user, and finally determine the research user with the similarity greater than a preset similarity threshold as a similar user.
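A minimal sketch of step 204 using cosine similarity, one of the measures listed above, is shown here. The user identifiers, the feature values, and the threshold of 0.5 are assumptions chosen for illustration.

```python
import math


def cosine_similarity(u: list, v: list) -> float:
    """Cosine of the angle between two equally sized vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # treat an all-zero vector as dissimilar to everything
    return dot / (norm_u * norm_v)


def select_similar_users(first_features: dict, second_feature: list,
                         threshold: float) -> dict:
    """Keep research users whose similarity to the predicted user exceeds the threshold."""
    similar = {}
    for user_id, feature in first_features.items():
        similarity = cosine_similarity(feature, second_feature)
        if similarity > threshold:
            similar[user_id] = similarity
    return similar


# Hypothetical first features of three research users (assumption).
research_features = {"u1": [1.0, 0.0], "u2": [0.0, 1.0], "u3": [1.0, 1.0]}
similar_users = select_similar_users(research_features, [1.0, 0.0], threshold=0.5)
```

The returned mapping keeps each similar user's similarity so that it can be reused later, for example when weighting keyword confidences.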
In some embodiments, the electronic device may further input the first feature and the second feature of each research user into a similarity calculation formula to obtain a corresponding similarity of each research user, where the similarity calculation formula is:
Figure BDA0003717359620000101
where s_j is the similarity, α is the decay factor, j is the research user, x is the predicted user, diff(j) is the time difference between the current date and the research date of the research user, and
Figure BDA0003717359620000111
is the cosine similarity between the research user and the predicted user.
For example, a research user may be surveyed multiple times within a period. If the research user is surveyed once a month over five months, the survey sample whose survey time is closest to the current time may be taken as the feedback sample of that research user for those five months.
In the above similarity calculation formula, the smaller the time difference between the current date and the research date of the research user, the larger the value of the first factor of the product in the formula, that is, the larger the value of
Figure BDA0003717359620000112
and therefore the greater the similarity value corresponding to the research user.
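The similarity calculation formula itself is given only as a figure, so its exact form is not reproduced here. The sketch below therefore uses an assumed decay term, α raised to the power diff(j) with 0 < α < 1, purely because it has the property described above: the smaller the time difference, the larger the first factor. The function name and the default value of α are likewise assumptions.

```python
def decayed_similarity(cosine_sim: float, days_since_survey: int,
                       alpha: float = 0.9) -> float:
    """Time-decayed similarity under an ASSUMED decay term alpha ** diff(j).

    This is not necessarily the formula in the figure; it is one form that
    gives more weight to recent surveys.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    if days_since_survey < 0:
        raise ValueError("days_since_survey must be non-negative")
    return (alpha ** days_since_survey) * cosine_sim


recent = decayed_similarity(0.8, days_since_survey=0)
older = decayed_similarity(0.8, days_since_survey=30)
```

With this assumed form, two research users with the same cosine similarity are ranked by how recently they were surveyed.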
205. And determining first feedback samples corresponding to the similar users in the initial feedback samples, and calculating a net recommendation value corresponding to each similar user according to the first feedback samples corresponding to each similar user.
In some embodiments, the electronic device may calculate a net recommendation value for each similar user using a corresponding NPS algorithm and a first feedback sample for each similar user.
For example, the net recommendation value may be a value in the range of 0 to 10.
206. And determining a first derogation user from the similar users according to the net recommendation value.
In some embodiments, a user whose net recommendation value is less than a preset net recommendation value is determined to be a first derogation user. For example, the preset net recommendation value may be set to 6, in which case any similar user whose net recommendation value is less than 6 is a first derogation user.
207. And determining the proportion of the first derogation user in the similar users, and if the proportion is greater than a preset proportion threshold, determining the predicted user as a target derogation user.
That is, when the proportion of first derogation users among the similar users is greater than the preset proportion threshold, most of the users similar to the predicted user are derogation users, so the predicted user is probably also a derogation user.
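Steps 206 and 207 can be sketched together as follows. The preset net recommendation value of 6 comes from the example in the text; the proportion threshold of 0.5, the user identifiers, and the function names are assumptions.

```python
def find_first_derogation_users(net_values: dict, preset_net_value: float = 6) -> list:
    """Similar users whose net recommendation value is below the preset value."""
    return [user for user, value in net_values.items() if value < preset_net_value]


def is_target_derogation_user(net_values: dict, preset_net_value: float = 6,
                              ratio_threshold: float = 0.5) -> bool:
    """True if the share of first derogation users exceeds the proportion threshold."""
    if not net_values:
        return False  # no similar users, so no basis for a prediction
    derogation_users = find_first_derogation_users(net_values, preset_net_value)
    ratio = len(derogation_users) / len(net_values)
    return ratio > ratio_threshold


# Hypothetical net recommendation values of three similar users (assumption).
similar_user_values = {"u1": 3, "u2": 5, "u3": 9}
predicted_is_derogation_user = is_target_derogation_user(similar_user_values)
```

Here two of the three similar users score below 6, so the proportion is about 0.67 and the predicted user is flagged.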
208. First detraction keywords that appear in the initial detraction keyword set are determined in the first feedback sample.
In some embodiments, after determining the first feedback sample corresponding to the similar user, the electronic device may further obtain each keyword of the first feedback sample, and then determine a keyword appearing in the initial detraction keyword set as the first detraction keyword.
For example, the electronic device may match each keyword with the set of initial detraction keywords by means of keyword matching, and determine a certain keyword as a first detraction keyword if the certain keyword is successfully matched.
After determining the first detraction keywords, the electronic device may further determine the number of times that each of the first detraction keywords appears in the first feedback sample. For example, a certain first detraction keyword may be matched with each first feedback sample, then the times of occurrence of the first detraction keyword in each first feedback sample are counted, and finally the times are added to obtain the times of occurrence of the first detraction keyword in all first feedback samples.
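The keyword matching and counting of step 208 can be illustrated with a short sketch. It assumes each first feedback sample has already been reduced to a list of keywords; the sample contents and the function name are hypothetical.

```python
from collections import Counter


def count_first_derogation_keywords(first_feedback_samples: list,
                                    initial_keyword_set: set) -> Counter:
    """Count, over all first feedback samples, the keywords found in the initial set."""
    counts = Counter()
    for keywords in first_feedback_samples:
        for keyword in keywords:
            if keyword in initial_keyword_set:  # successful keyword match
                counts[keyword] += 1
    return counts


# Hypothetical keyword lists of three first feedback samples (assumption).
samples = [["fee", "expensive", "slow"], ["slow", "login"], ["fee"]]
initial_set = {"expensive", "slow", "crash"}
first_keyword_counts = count_first_derogation_keywords(samples, initial_set)
```

Keywords in the initial set that never occur in the first feedback samples, such as "crash" here, simply do not appear in the result.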
209. And determining the confidence of each first derogatory keyword in the first feedback sample.
In some embodiments, the electronic device may determine, from the first detraction keywords, target first detraction keywords, determine a first total number of times that the target first detraction keywords appear in all the first feedback samples, then determine a second total number of times that all the first detraction keywords appear in all the first feedback samples, and finally divide the first total number of times by the second total number of times to obtain a confidence corresponding to the target first detraction keywords.
For example, the first derogatory keywords include a keyword A, a keyword B, and a keyword C, where the keyword A is the target first derogatory keyword. The electronic device obtains a first total number of times that the keyword A appears in all the first feedback samples, and determines a second total number of times that the keywords A, B, and C appear in all the first feedback samples. Finally, the first total number of times is divided by the second total number of times to obtain the confidence corresponding to the keyword A.
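The count-based confidence in this example can be sketched as follows. The occurrence counts for keywords A, B, and C are made-up values, and the function name is hypothetical.

```python
def keyword_confidence(counts: dict) -> dict:
    """Confidence of each keyword: its count divided by the total count of all keywords."""
    second_total = sum(counts.values())
    if second_total == 0:
        return {}
    return {keyword: count / second_total for keyword, count in counts.items()}


# Hypothetical occurrence counts in all first feedback samples (assumption).
confidences = keyword_confidence({"A": 6, "B": 3, "C": 1})
```

With these counts the second total is 10, so keyword A receives a confidence of 6 divided by 10, and the confidences of all keywords sum to 1.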
In some embodiments, the electronic device may further determine first similar users corresponding to the target first derogation keyword, then obtain similarities between each first similar user and the predicted user, add the similarities corresponding to each first similar user to obtain a first similarity sum, and then multiply the first similarity sum by the first total number of times that the target first derogation keyword appears in all the first feedback samples to obtain a first product result.
The electronic device then determines the similar users corresponding to all the first derogatory keywords, obtains the similarity between each of these similar users and the predicted user, and adds these similarities to obtain a second similarity sum. The second similarity sum is then multiplied by the second total number of times that all the first derogatory keywords appear in all the first feedback samples to obtain a second product result.
And finally, dividing the first product result by the second product result to obtain the confidence corresponding to the target first derogative keyword.
In this way, the confidence corresponding to each first derogatory keyword in all the first feedback samples can be determined in turn.
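The similarity-weighted variant of the confidence can be sketched as below. It reads "similar users corresponding to a keyword" as the similar users whose feedback contains that keyword, which is an interpretation rather than something the text states explicitly; the data and the function name are also assumptions.

```python
def weighted_confidence(target: str, user_keyword_counts: dict,
                        similarities: dict) -> float:
    """(similarity sum x count) for the target keyword over the same product for all keywords.

    user_keyword_counts maps each similar user to the counts of first
    derogatory keywords found in that user's first feedback sample.
    """
    users_with_target = [u for u, c in user_keyword_counts.items() if c.get(target, 0) > 0]
    users_with_any = [u for u, c in user_keyword_counts.items() if sum(c.values()) > 0]
    first_similarity_sum = sum(similarities[u] for u in users_with_target)
    second_similarity_sum = sum(similarities[u] for u in users_with_any)
    first_total = sum(c.get(target, 0) for c in user_keyword_counts.values())
    second_total = sum(sum(c.values()) for c in user_keyword_counts.values())
    second_product = second_similarity_sum * second_total
    if second_product == 0:
        return 0.0
    return (first_similarity_sum * first_total) / second_product


# Hypothetical per-user keyword counts and similarities (assumption).
counts_by_user = {"u1": {"A": 2}, "u2": {"A": 1, "B": 1}, "u3": {"B": 2}}
similarity_by_user = {"u1": 0.5, "u2": 0.5, "u3": 1.0}
confidence_a = weighted_confidence("A", counts_by_user, similarity_by_user)
```

In this example the first product is 1.0 × 3 and the second product is 2.0 × 6, giving a confidence of 0.25 for keyword A.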
210. And determining second derogation keywords with the confidence degrees larger than the preset confidence degrees from the first derogation keywords, and determining the second derogation keywords as target derogation keywords corresponding to the predicted user.
For example, the electronic device may set a preset confidence threshold in advance. When the confidence of a first detraction keyword is greater than the preset confidence threshold, the confidence of that keyword is considered high, and the first detraction keyword is determined to be a second detraction keyword, that is, a target detraction keyword corresponding to the predicted user.
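Step 210 amounts to a threshold filter over the confidences, as in the following sketch. The confidence values, the preset confidence of 0.25, and the function name are assumptions; sorting by confidence is an added convenience, not something the text requires.

```python
def select_target_keywords(confidences: dict, preset_confidence: float) -> list:
    """Keywords whose confidence exceeds the preset value, highest confidence first."""
    selected = [k for k, c in confidences.items() if c > preset_confidence]
    return sorted(selected, key=lambda k: confidences[k], reverse=True)


# Hypothetical confidences of three first derogatory keywords (assumption).
target_keywords = select_target_keywords({"A": 0.6, "B": 0.3, "C": 0.1}, 0.25)
```

The selected keywords are the ones reported as the likely derogation reasons of the predicted user.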
In the embodiment of the application, the electronic device acquires the research users and the initial feedback samples corresponding to the research users, determines whether the predicted user is a derogation user, and determines the derogation reason corresponding to the predicted user. In this way, defects in the products or services can be identified promptly and remedied in time to retain users.
In the embodiment of the application, the electronic device obtains initial feedback samples corresponding to a plurality of research users, and performs word segmentation and extraction and/or filtering stop words on the initial feedback samples to obtain a first processing sample. And inputting the first processing sample into a text processing model, and outputting the initial derogation keywords to obtain an initial derogation keyword set.
And then acquiring a first characteristic corresponding to each research user and a second characteristic of the predicted user. And determining the similarity between each research user and the predicted user according to the first characteristic and the second characteristic, and determining the research users with the similarity larger than a preset similarity threshold as similar users. And determining first feedback samples corresponding to the similar users in the initial feedback samples, and calculating a net recommendation value corresponding to each similar user according to the first feedback samples corresponding to each similar user. And determining a first derogative user from the similar users according to the net recommendation value.
And finally, determining the proportion of the first derogation user in the similar users, and if the proportion is greater than a preset proportion threshold, determining the predicted user as the target derogation user. First detraction keywords that appear in the initial detraction keyword set are determined in the first feedback sample. And determining the confidence of each derogatory keyword in the first feedback sample. And determining second derogation keywords with the confidence degrees larger than the preset confidence degrees from the first derogation keywords, and determining the second derogation keywords as target derogation keywords corresponding to the predicted user. Therefore, the derogation reason of the predicted user is determined according to the research user and the initial feedback sample corresponding to the research user.
Referring to fig. 3, fig. 3 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present disclosure. Wherein the data processing apparatus 300 may comprise:
the obtaining module 310 is configured to obtain initial feedback samples corresponding to multiple research users, and perform initial derogation keyword extraction on the initial feedback samples to obtain an initial derogation keyword set.
The obtaining module 310 is further configured to perform word segmentation and extraction and/or filter stop words on the initial feedback sample to obtain a first processing sample; and inputting the first processing sample into a text processing model, and outputting the initial derogation keywords to obtain an initial derogation keyword set.
The first determining module 320 is configured to determine similar users similar to the predicted user among the multiple research users, and determine a first feedback sample corresponding to the similar users in the initial feedback sample.
The first determining module 320 is further configured to calculate a net recommendation value corresponding to each similar user according to the first feedback sample corresponding to each similar user;
and determining a first derogation user from the similar users according to the net recommendation value.
The first determining module 320 is further configured to determine, among the similar users, a target similar user whose net recommendation value is smaller than a preset net recommendation value;
and determining the target similar user as a first derogation user.
A first determining module 320, further configured to determine a proportion of the first derogating user in the similar users;
and if the proportion is larger than a preset proportion threshold value, determining that the predicted user is the target derogation user.
The first determining module 320 is further configured to obtain a first feature corresponding to each research user and a second feature of the predicted user;
determining the similarity between each research user and the prediction user according to the first characteristic and the second characteristic;
and determining the research user with the similarity larger than a preset similarity threshold as a similar user.
The first determining module 320 is further configured to input the first feature and the second feature of each research user into a similarity calculation formula to obtain a similarity corresponding to each research user, where the similarity calculation formula is as follows:
Figure BDA0003717359620000151
where s_j is the similarity, α is the decay factor, j is the research user, x is the predicted user, diff(j) is the time difference between the current date and the research date of the research user, and
Figure BDA0003717359620000152
is the cosine similarity between the research user and the predicted user.
A second determining module 330, configured to determine, in the first feedback sample, a first detraction keyword that appears in the initial detraction keyword set.
A third determining module 340, configured to determine a confidence of each of the first derogation keywords in the first feedback sample.
The third determining module 340 is further configured to determine a target first detraction keyword from the first detraction keywords, and determine a first total number of times that the target first detraction keyword appears in all the first feedback samples;
determining a second total number of occurrences of all the first derogatory keywords in all the first feedback samples;
and dividing the first total times by the second total times to obtain the confidence corresponding to the target first derogative keyword.
A fourth determining module 350, configured to determine, according to the confidence, the target detraction keywords corresponding to the predicted user from the first detraction keywords.
The fourth determining module 350 is further configured to determine, from the first detraction keywords, second detraction keywords whose confidence degrees are greater than the preset confidence degrees, and determine the second detraction keywords as the target detraction keywords corresponding to the predicted user.
In the embodiment of the application, the electronic device obtains an initial derogation keyword set by obtaining initial feedback samples corresponding to a plurality of research users and extracting initial derogation keywords from the initial feedback samples; determining similar users similar to the predicted user from the multiple research users, and determining a first feedback sample corresponding to the similar users in the initial feedback sample; determining first derogatory keywords appearing in the initial derogatory keyword set in a first feedback sample; determining the confidence of each derogatory keyword in the first feedback sample; and determining the target derogatory keywords corresponding to the predicted user from the first derogatory keywords according to the confidence degree. Therefore, the derogation reason of the predicted user is determined according to the research user and the initial feedback sample corresponding to the research user.
Accordingly, as shown in fig. 4, an electronic device 400 may include a memory 401 comprising one or more computer-readable storage media, an input unit 402, a display unit 403, a sensor 404, a processor 405 including one or more processing cores, and a power supply 406. Those skilled in the art will appreciate that the electronic device configuration shown in fig. 4 does not constitute a limitation of the electronic device, which may include more or fewer components than shown, may combine some components, or may use a different arrangement of components. Wherein:
the memory 401 may be used to store software programs and modules, and the processor 405 executes various functional applications and data processing by operating the software programs and modules stored in the memory 401. The memory 401 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required by at least one function (such as a sound playing function, an image playing function, etc.), and the like; the storage data area may store data (such as audio data, a phonebook, etc.) created according to the use of the electronic device, and the like. Further, the memory 401 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid state storage device. Accordingly, the memory 401 may also include a memory controller to provide the processor 405 and the input unit 402 with access to the memory 401.
The input unit 402 may be used to receive input numeric or character information and generate keyboard, mouse, joystick, optical or trackball signal inputs related to user settings and function control. In particular, in one particular embodiment, input unit 402 may include a touch-sensitive surface as well as other input devices. The touch-sensitive surface, also referred to as a touch display screen or a touch pad, may collect touch operations by a user (e.g., operations by a user on or near the touch-sensitive surface using a finger, a stylus, or any other suitable object or attachment) thereon or nearby, and drive the corresponding connection device according to a predetermined program. Alternatively, the touch sensitive surface may comprise two parts, a touch detection means and a touch controller. The touch detection device detects the touch direction of a user, detects a signal brought by touch operation and transmits the signal to the touch controller; the touch controller receives touch information from the touch sensing device, converts the touch information into touch point coordinates, sends the touch point coordinates to the processor 405, and can receive and execute commands sent by the processor 405. In addition, touch sensitive surfaces may be implemented using various types of resistive, capacitive, infrared, and surface acoustic waves. The input unit 402 may include other input devices in addition to a touch-sensitive surface. In particular, other input devices may include, but are not limited to, one or more of a physical keyboard, function keys (such as volume control keys, switch keys, etc.), a trackball, a mouse, a joystick, and the like.
The display unit 403 may be used to display information input by or provided to a user and various graphical user interfaces of the electronic device, which may be made up of graphics, text, icons, video, and any combination thereof. The Display unit 403 may include a Display panel, and optionally, the Display panel may be configured in the form of a Liquid Crystal Display (LCD), an Organic Light-Emitting Diode (OLED), or the like. Further, the touch-sensitive surface may overlay the display panel, and when a touch operation is detected on or near the touch-sensitive surface, the touch operation is transmitted to the processor 405 to determine the type of touch event, and the processor 405 then provides a corresponding visual output on the display panel according to the type of touch event. Although in FIG. 4 the touch-sensitive surface and the display panel are shown as two separate components to implement input and output functions, in some embodiments the touch-sensitive surface may be integrated with the display panel to implement input and output functions.
The electronic device may also include at least one sensor 404, such as a light sensor, a motion sensor, and other sensors. In particular, the light sensor may include an ambient light sensor that may adjust the brightness of the display panel according to the brightness of ambient light, and a proximity sensor that may turn off the display panel and/or the backlight when the electronic device is moved to the ear. As one of the motion sensors, the gravity acceleration sensor can detect the magnitude of acceleration in each direction (generally, three axes), detect the magnitude and direction of gravity when the motion sensor is stationary, and can be used for applications (such as horizontal and vertical screen switching, related games, magnetometer attitude calibration) for recognizing the attitude of an electronic device, vibration recognition related functions (such as pedometer and tapping), and the like; as for other sensors such as a gyroscope, a barometer, a hygrometer, a thermometer, and an infrared sensor, which may be further configured to the electronic device, detailed descriptions thereof are omitted.
The processor 405 is a control center of the electronic device, connects various parts of the whole electronic device by using various interfaces and lines, performs various functions of the electronic device and processes data by operating or executing software programs and/or modules stored in the memory 401 and calling data stored in the memory 401, thereby performing overall monitoring of the electronic device. Alternatively, processor 405 may include one or more processing cores; preferably, the processor 405 may integrate an application processor, which mainly handles operating systems, user interfaces, application programs, etc., and a modem processor, which mainly handles wireless communications. It will be appreciated that the modem processor described above may not be integrated into the processor 405.
The electronic device also includes a power supply 406 (e.g., a battery) for powering the various components, which may preferably be logically coupled to the processor 405 via a power management system, such that functions such as managing charging, discharging, and power consumption are performed via the power management system. The power supply 406 may also include any component of one or more dc or ac power sources, recharging systems, power failure detection circuitry, power converters or inverters, power status indicators, and the like.
Although not shown, the electronic device may further include a camera, a Bluetooth module, and the like, which are not described in detail herein. Specifically, in this embodiment, the processor 405 in the electronic device loads the computer program stored in the memory 401 and runs it, thereby implementing the following functions:
acquiring initial feedback samples corresponding to a plurality of research users, and performing initial derogation keyword extraction on the initial feedback samples to obtain an initial derogation keyword set;
determining similar users similar to the predicted user from the multiple research users, and determining a first feedback sample corresponding to the similar users in the initial feedback sample;
determining a first derogatory keyword appearing in the initial derogatory keyword set in the first feedback sample;
determining a confidence level of each first derogative keyword in the first feedback sample;
and determining the target derogatory keywords corresponding to the predicted user from the first derogatory keywords according to the confidence degree.
It will be understood by those skilled in the art that all or part of the steps of the methods of the above embodiments may be performed by instructions or by associated hardware controlled by the instructions, which may be stored in a computer readable storage medium and loaded and executed by a processor.
To this end, embodiments of the present application provide a computer-readable storage medium, in which a plurality of instructions are stored, and the instructions can be loaded by a processor to execute the steps in any data processing method provided by the embodiments of the present application. For example, the instructions may perform the steps of:
acquiring initial feedback samples corresponding to a plurality of research users, and performing initial derogation keyword extraction on the initial feedback samples to obtain an initial derogation keyword set;
determining similar users similar to the predicted user from the multiple research users, and determining a first feedback sample corresponding to the similar users in the initial feedback sample;
determining first derogatory keywords appearing in the initial derogatory keyword set in a first feedback sample;
determining a confidence level of each first derogative keyword in the first feedback sample;
and determining the target derogatory keywords corresponding to the predicted user from the first derogatory keywords according to the confidence degree.
The above operations can be implemented in the foregoing embodiments, and are not described in detail herein.
Wherein the storage medium may include: Read-Only Memory (ROM), Random Access Memory (RAM), magnetic or optical disks, and the like.
Since the instructions stored in the storage medium can execute the steps in any data processing method provided in the embodiments of the present application, beneficial effects that can be achieved by any data processing method provided in the embodiments of the present application can be achieved, which are detailed in the foregoing embodiments and will not be described herein again.
The foregoing detailed description is directed to a data processing method, an apparatus, an electronic device, and a storage medium provided in the embodiments of the present application, and specific examples are applied in the present application to explain the principles and implementations of the present application, and the descriptions of the foregoing embodiments are only used to help understand the method and the core ideas of the present application; meanwhile, for those skilled in the art, according to the idea of the present application, there may be variations in the specific embodiments and the application scope, and in summary, the content of the present specification should not be construed as a limitation to the present application.

Claims (12)

1. A data processing method, comprising:
acquiring initial feedback samples corresponding to a plurality of research users, and performing initial derogation keyword extraction on the initial feedback samples to obtain an initial derogation keyword set;
determining similar users similar to the predicted user from the plurality of research users, and determining a first feedback sample corresponding to the similar users in the initial feedback sample;
determining, in the first feedback sample, first detraction keywords that appear in the initial detraction keyword set;
determining a confidence level of each of the first derogative keywords in the first feedback sample;
and determining the target derogation keywords corresponding to the predicted user in the first derogation keywords according to the confidence degrees.
2. The data processing method of claim 1, wherein after determining similar users from the plurality of research users that are similar to the predicted user, the method further comprises:
calculating a net recommendation value corresponding to each similar user according to a first feedback sample corresponding to each similar user;
and determining a first derogative user in the similar users according to the net recommendation value.
3. The data processing method according to claim 2, wherein the determining a derogative user among the similar users according to the net recommendation value comprises:
determining target similar users of which the net recommendation values are smaller than a preset net recommendation value from the similar users;
determining the target similar user as the first derogative user.
4. The data processing method according to claim 2, wherein after said determining a derogated user among said similar users according to said net recommendation value, the method further comprises:
determining the proportion of the first derogative user in the similar users;
and if the ratio is larger than a preset ratio threshold, determining that the predicted user is a target derogation user.
5. The data processing method of claim 1, wherein the performing of the initial derogation keyword extraction on the initial feedback samples to obtain an initial derogation keyword set comprises:
performing word segmentation extraction and/or filtering stop words on the initial feedback sample to obtain a first processing sample;
and inputting the first processing sample into a text processing model, and outputting the initial derogation keywords to obtain the initial derogation keyword set.
6. The data processing method of claim 1, wherein determining similar users among the plurality of research users that are similar to the predicted user comprises:
acquiring a first characteristic corresponding to each investigation user and a second characteristic of the prediction user;
determining a similarity between each of the research users and the predictive user according to the first feature and the second feature;
and determining the investigation user with the similarity larger than a preset similarity threshold as the similar user.
7. The data processing method of claim 6, wherein said determining a similarity between each of said research users and said predictive user based on said first feature and said second feature comprises:
inputting the first feature and the second feature of each research user into a similarity calculation formula to obtain a similarity corresponding to each research user, where the similarity calculation formula is:
Figure FDA0003717359610000021
where s_j is the similarity, α is the decay factor, j is the research user, x is the predicted user, diff(j) is the time difference between the current date and the research date of the research user, and
Figure FDA0003717359610000022
is the cosine similarity between the research user and the predicted user.
8. The data processing method of claim 1, wherein the determining the confidence level of each of the first derogation keywords in the first feedback sample comprises:
determining a target first derogation keyword among the first derogation keywords, and determining a first total number of times that the target first derogation keyword appears in all the first feedback samples;
determining a second total number of times that all the first derogation keywords appear in all the first feedback samples;
and dividing the first total number by the second total number to obtain the confidence corresponding to the target first derogation keyword.
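The ratio described in claim 8 can be sketched as below. It assumes that each first feedback sample has already been segmented into a list of tokens; the function name and data layout are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, List, Set


def keyword_confidences(
    first_feedback_samples: Iterable[List[str]],
    first_keywords: Set[str],
) -> Dict[str, float]:
    """Confidence of a keyword = its occurrences / occurrences of all keywords."""
    counts: Counter = Counter()
    for tokens in first_feedback_samples:
        for token in tokens:
            if token in first_keywords:
                counts[token] += 1
    total = sum(counts.values())  # the "second total number" of claim 8
    if total == 0:
        return {}
    return {kw: counts[kw] / total for kw in first_keywords}
```

For instance, if "slow" appears 3 times and "expensive" and "confusing" appear once each, the confidences are 3/5, 1/5, and 1/5, which sum to 1.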
9. The data processing method according to any one of claims 1 to 8, wherein the determining, according to the confidence, the target derogation keyword corresponding to the predicted user from the first derogation keywords comprises:
determining, from the first derogation keywords, second derogation keywords whose confidence is greater than a preset confidence, and determining the second derogation keywords as the target derogation keywords corresponding to the predicted user.
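The selection in claim 9 amounts to a strict threshold filter over the confidences. A minimal sketch, assuming the confidences are held in a dictionary keyed by keyword; the descending sort is an illustrative convenience and not part of the claim.

```python
from typing import Dict, List


def select_target_keywords(
    confidences: Dict[str, float],
    preset_confidence: float,
) -> List[str]:
    """Keep keywords whose confidence is strictly greater than the preset value."""
    selected = [kw for kw, c in confidences.items() if c > preset_confidence]
    # Highest confidence first; ties are broken alphabetically for determinism.
    return sorted(selected, key=lambda kw: (-confidences[kw], kw))
```

Because the comparison is strict, a keyword whose confidence equals the preset value is not selected.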
10. A data processing apparatus, comprising:
an acquisition module, configured to acquire initial feedback samples corresponding to a plurality of research users, and to extract initial derogation keywords from the initial feedback samples to obtain an initial derogation keyword set;
a first determining module, configured to determine, among the multiple research users, similar users similar to the predicted user, and determine a first feedback sample corresponding to the similar users in the initial feedback sample;
a second determining module, configured to determine, in the first feedback sample, a first derogation keyword that appears in the initial derogation keyword set;
a third determining module, configured to determine a confidence level of each of the first derogation keywords in the first feedback sample;
a fourth determining module, configured to determine, according to the confidence, a target derogation keyword corresponding to the predicted user from the first derogation keywords.
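The work of the first and second determining modules (restricting the feedback to similar users, then finding which initial derogation keywords occur in it) can be sketched as follows. The sketch assumes that feedback is stored as a mapping from user ID to a token list; the data layout and function names are hypothetical.

```python
from typing import Dict, Iterable, List, Set


def first_feedback_samples(
    initial_samples: Dict[str, List[str]],
    similar_users: Iterable[str],
) -> Dict[str, List[str]]:
    """Keep only the feedback samples that belong to the similar users."""
    wanted = set(similar_users)
    return {uid: toks for uid, toks in initial_samples.items() if uid in wanted}


def first_derogation_keywords(
    first_samples: Dict[str, List[str]],
    initial_keyword_set: Set[str],
) -> Set[str]:
    """Collect the initial derogation keywords that occur in the first samples."""
    found: Set[str] = set()
    for tokens in first_samples.values():
        found.update(t for t in tokens if t in initial_keyword_set)
    return found
```

Keeping each module as a small pure function mirrors the module split in the claim and makes each step testable in isolation.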
11. An electronic device, comprising:
a memory storing executable program code, a processor coupled with the memory;
the processor calls the executable program code stored in the memory to perform the steps in the data processing method according to any one of claims 1 to 9.
12. A storage medium storing a plurality of instructions adapted to be loaded by a processor to perform the steps of the data processing method according to any one of claims 1 to 9.
CN202210747502.5A 2022-06-28 2022-06-28 Data processing method and device, electronic equipment and storage medium Pending CN115099899A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210747502.5A CN115099899A (en) 2022-06-28 2022-06-28 Data processing method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210747502.5A CN115099899A (en) 2022-06-28 2022-06-28 Data processing method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN115099899A (en) 2022-09-23

Family

ID=83295178

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210747502.5A Pending CN115099899A (en) 2022-06-28 2022-06-28 Data processing method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN115099899A (en)

Similar Documents

Publication Publication Date Title
US11829720B2 (en) Analysis and validation of language models
CN106776673B (en) Multimedia document summarization
US10037367B2 (en) Modeling actions, consequences and goal achievement from social media and other digital traces
CN108227564B (en) Information processing method, terminal and computer readable medium
CN111475729A (en) Search content recommendation method and device
CN109408829B (en) Method, device, equipment and medium for determining readability of article
CN114791982B (en) Object recommendation method and device
CN114357278A (en) Topic recommendation method, device and equipment
US10229212B2 (en) Identifying Abandonment Using Gesture Movement
CN113407738B (en) Similar text retrieval method and device, electronic equipment and storage medium
US9933861B2 (en) Method and apparatus for generating a user interface for taking or viewing the results of predicted actions
US11947758B2 (en) Diffusion-based handedness classification for touch-based input
CN116307394A (en) Product user experience scoring method, device, medium and equipment
US20220269935A1 (en) Personalizing Digital Experiences Based On Predicted User Cognitive Style
CN115099899A (en) Data processing method and device, electronic equipment and storage medium
KR102406634B1 (en) Personalized recommendation method and system based on future interaction prediction
CN114970562A (en) Semantic understanding method, device, medium and equipment
US20140365454A1 (en) Entity relevance for search queries
CN114547242A (en) Questionnaire investigation method and device, electronic equipment and readable storage medium
CN111159558B (en) Recommendation list generation method and device and electronic equipment
CN113392177A (en) Keyword acquisition method and device, electronic equipment and storage medium
CN114398501B (en) Multimedia resource grouping method, device, equipment and storage medium
CN117710020B (en) Big data-based user preference analysis method
CN117572991A (en) Input interface display method and device, electronic equipment and readable storage medium
CN114065023A (en) Content recommendation method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination