WO2020000875A1 - Data processing method and electronic device - Google Patents

Data processing method and electronic device

Info

Publication number
WO2020000875A1
WO2020000875A1 PCT/CN2018/116169 CN2018116169W WO2020000875A1 WO 2020000875 A1 WO2020000875 A1 WO 2020000875A1 CN 2018116169 W CN2018116169 W CN 2018116169W WO 2020000875 A1 WO2020000875 A1 WO 2020000875A1
Authority
WO
WIPO (PCT)
Prior art keywords
user
cluster
users
question
information
Prior art date
Application number
PCT/CN2018/116169
Other languages
French (fr)
Chinese (zh)
Inventor
缪庆亮
胡长建
李杨
Original Assignee
联想(北京)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 联想(北京)有限公司
Publication of WO2020000875A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/01 Customer relationship services

Definitions

  • The invention relates to the field of data processing, and in particular to a data processing method and an electronic device.
  • The present invention provides a data processing method and an electronic device to solve the prior-art problem that targeted answers cannot be provided to questions raised by different users.
  • the specific solutions are as follows:
  • a data processing method includes:
  • determining the first user tag cluster according to the user information of the first user includes:
  • the user relationship graph includes: no less than two users and similarity between each two users;
  • a first user label cluster of the first user is determined according to a user label cluster of an initial user in the user relationship graph.
  • determining the first user tag cluster according to the user information of the first user includes:
  • the first user tag cluster is determined according to the similarity.
  • determining the first user tag cluster of the first user according to the user tag cluster of the initial user in the user relationship graph includes:
  • a user tag cluster of other users except the initial user in the user relationship graph is determined.
  • determining the initial user from no less than two users in the user relationship diagram includes:
  • An initial user is determined according to the number of initial users corresponding to each question cluster.
  • the question information is combined with other question information.
  • An electronic device includes a processor and a memory, wherein:
  • the memory is configured to store a user tag cluster and a technical level parameter corresponding to the user tag cluster
  • the processor is configured to obtain user information of the first user when receiving the question information sent by the first user, determine a first user tag cluster according to the user information of the first user, determine the first user tag cluster as the user tag cluster of the first user, and reply to the question information sent by the first user according to the technical level parameter of the first user tag cluster.
  • determining, by the processor according to the user information of the first user, a first user tag cluster includes:
  • the processor searches for a user relationship graph, where the user relationship graph includes no less than two users and the similarity between each two users, and, when the user relationship graph includes the first user, determines the first user tag cluster of the first user according to the user tag cluster of the initial user in the user relationship graph.
  • determining, by the processor according to the user information of the first user, a first user tag cluster includes:
  • the processor determines the similarity ranking of the first user and the first number of user tag clusters according to the user information of the first user, and determines the first user tag cluster according to the similarity.
  • the determining, by the processor according to the user tag cluster of the initial user in the user relationship graph, the first user tag cluster of the first user includes:
  • the processor determines an initial user from the no less than two users in the user relationship graph, sets a user tag cluster for the initial user, and determines, according to the similarity between each two users in the user relationship graph and an iterative function, the user tag clusters of the users other than the initial user among the no less than two users in the user relationship graph.
  • With the data processing method and electronic device disclosed in this application, when the question information sent by the first user is received, the user information of the first user is obtained, the first user tag cluster is determined according to the user information of the first user, the first user tag cluster is determined as the user tag cluster of the first user, and the question information sent by the first user is replied to according to the technical level parameter of the first user tag cluster.
  • This solution determines the user tag cluster of each user according to that user's information and replies to the user's question according to the technical level parameter of that cluster, so that answers are targeted to the different professional levels of different users.
  • FIG. 1 is a flowchart of a data processing method disclosed by an embodiment of the present invention
  • FIG. 2 is a flowchart of a data processing method disclosed by an embodiment of the present invention.
  • FIG. 3 is a flowchart of a data processing method disclosed by an embodiment of the present invention.
  • FIG. 5 is a flowchart of a data processing method disclosed by an embodiment of the present invention.
  • FIG. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
  • the invention discloses a data processing method.
  • the flowchart is shown in FIG. 1 and includes:
  • Step S11 When the question information sent by the first user is received, the user information of the first user is acquired;
  • a user asks a question
  • the user asking the question can be logged in.
  • the user's user information can be obtained from the account that the user logs in to.
  • the user information may be personal information filled in by the user when registering or supplementing the logged-in account, and may also be information such as questions or speeches previously issued by the account in which the user is logged in.
  • the user sends a question to the customer service.
  • the user can first log in to the account and, after logging in, send the question to the customer service; the customer service system then obtains the user's user information from the account, such as age, how long the product has been used, the number of products used, and questions the user has previously asked.
  • Step S12 Determine a first user tag cluster according to the user information of the first user, and determine the first user tag cluster as a user tag cluster of the first user;
  • multiple user tag clusters can be set in advance, different user tag clusters correspond to different technical level parameters, and the user's technical level in the corresponding user tag cluster is represented by different technical level parameters.
  • three user tag clusters are set in advance, which are the first user tag cluster, the second user tag cluster, and the third user tag cluster.
  • the first user tag cluster corresponds to a high technical level, and its technical level parameter may be 1; the second user tag cluster corresponds to a medium technical level, and its technical level parameter may be 2; the third user tag cluster corresponds to a low technical level, and its technical level parameter may be 3.
  • Determining the first user tag cluster according to the user information of the first user may be specifically: comprehensively evaluating the user tag cluster of the user according to the multiple user information corresponding to the first user.
  • For example, the product the user consults about is an electronic product, the user is 30 years old, has been using the electronic product for 3 years, and has previously asked relatively professional questions. Since young people generally understand electronic products better, and this user has used the product for 3 years and asked relatively professional questions, it can be determined that the user belongs to the first user tag cluster, with a high technical level;
  • as another example, the product the user consults about is an electronic product, the user is 65 years old, and has used the electronic product for 3 months. Since the elderly generally know less about electronic products and this user has used the product for only a short time, it can be determined that the user belongs to the third user tag cluster, with a low technical level.
  • the user tag cluster of the user may also be determined by other methods, which is not specifically limited herein.
  • Step S13 Reply to the question information sent by the first user according to the technical level parameters of the first user tag clustering.
  • the technical level of the users belonging to a user tag cluster can be determined from its technical level parameter, and the questions raised by different users are then answered according to their different technical levels.
  • Different technical level parameters correspond to different ways of answering questions. For example, for users with a low technical level, replies use simple language, keep professional terms to a minimum, and are longer; for users with a high technical level, replies can use more specialized terminology and are mainly short.
  • In this embodiment, when the question information sent by the first user is received, the user information of the first user is acquired, the first user tag cluster is determined according to the user information of the first user, the first user tag cluster is determined as the user tag cluster of the first user, and the question information sent by the first user is replied to according to the technical level parameter of the first user tag cluster.
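The link between a technical level parameter and the way a question is answered can be sketched as a simple lookup. This is only an illustration: the parameter values 1, 2, and 3 follow the example in the text, while `ReplyStyle`, `STYLE_BY_LEVEL`, `choose_reply`, and the idea of pre-written answer variants are hypothetical names and assumptions, not part of the disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReplyStyle:
    use_jargon: bool  # whether specialized terminology is acceptable
    verbosity: str    # "short", "medium" or "long"


# Hypothetical mapping from technical level parameter to reply style.
# 1 = high level (short, technical), 3 = low level (long, plain language).
# The style chosen for parameter 2 is an assumption.
STYLE_BY_LEVEL = {
    1: ReplyStyle(use_jargon=True, verbosity="short"),
    2: ReplyStyle(use_jargon=True, verbosity="medium"),
    3: ReplyStyle(use_jargon=False, verbosity="long"),
}


def choose_reply(level, answers):
    """Pick the pre-written answer variant matching the user's technical level."""
    style = STYLE_BY_LEVEL[level]
    return answers[style.verbosity]


answers = {
    "short": "Clear the cache partition from recovery mode.",
    "medium": "Boot into recovery mode and clear the cache partition.",
    "long": "Turn the phone off, hold the power and volume-up buttons until "
            "a menu appears, then choose the option that clears the cache.",
}
reply_for_expert = choose_reply(1, answers)
reply_for_novice = choose_reply(3, answers)
```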
  • This embodiment discloses a data processing method.
  • the flowchart is shown in FIG. 2 and includes:
  • Step S21 When the question information sent by the first user is received, the user information of the first user is acquired;
  • Step S22 Finding a user relationship graph, the user relationship graph includes: no less than two users and similarity between each two users;
  • a user relationship graph is stored in advance, and the user relationship graph includes: no less than two users, and a similarity between each two users.
  • constructing a user relationship graph requires first extracting user features and calculating feature values, and then constructing a user and feature matrix.
  • Extracting user features can be specifically: extracting predefined user features and question features, as shown in Table 1 user feature description and Table 2 user question feature description:
  • Table 1. User feature description
      Feature name | Feature value | Feature description
      User age | Integer value | Age estimated from the information provided when the user registered
      Number of mobile phones used by the user | Integer value | Number of times the user has bought a mobile phone
      Time span of the user's mobile phone use | Integer value | Time interval between the user's first purchase of a mobile phone and the current question
  • Table 1 includes: the name, feature value, and feature description of each user feature. When the user feature is the user's age, it is the age estimated from the information provided at registration; when the user feature is the number of mobile phones used by the user, it is determined from the number of times the user has purchased a mobile phone; when the user feature is the time span of the user's mobile phone use, it is determined from the time interval between the user's first purchase of a mobile phone and the current question.
  • Table 2 includes: the name, feature value, and feature description of each user question feature. When the user question feature is the professional word frequency in the user's questions, it is determined from the frequency of professional vocabulary in the user's historical questions;
  • when the user question feature is the representativeness of the user's question, it is determined from the number of samples in the cluster in which the question is located;
  • when the user question feature is the level of detail of the answers to the user's questions, it is determined from the number of characters in the answers;
  • when the user question feature is the number of user questions, it is determined from the number of the user's historical questions;
  • when the user question feature is the user's dialogue time, it is determined from the average time taken by the user's historical question dialogues in the customer service system;
  • when the user question feature is the number of user interaction rounds, it is determined from the number of rounds of interaction between the user and customer service in the user's historical questions in the customer service system.
  • a user and feature matrix M is constructed, each row of the matrix represents a user, each column of the matrix represents a one-dimensional feature, and then each column is normalized.
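A minimal sketch of building the user and feature matrix and normalizing each column is given below. The text does not say which normalization is used, so min-max scaling to [0, 1] is an assumption here, as are the function name `normalize_columns`, the sample values, and the use of months as the unit for the time span.

```python
def normalize_columns(matrix):
    """Min-max normalize each column of a row-major matrix to the range [0, 1]."""
    columns = list(zip(*matrix))
    scaled_columns = []
    for col in columns:
        lo, hi = min(col), max(col)
        span = hi - lo
        # A constant column carries no information, so it is mapped to 0.0.
        scaled_columns.append([(x - lo) / span if span else 0.0 for x in col])
    # Transpose back so that each row is a user again.
    return [list(row) for row in zip(*scaled_columns)]


# Each row is one user; columns are age, number of phones bought,
# and months between the first purchase and the current question.
users = [
    [30, 3, 36],
    [65, 1, 3],
    [45, 2, 14],
]
M = normalize_columns(users)
```

After this step every feature lies on the same scale, so no single feature such as age dominates the similarity computed later.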
  • User relationship graph G is composed of user U and user similarity, where G's nodes are users and edges are similarity between users.
  • $V_i = \langle U_i, Q_i \rangle$
  • where $V_i$ is the $i$-th node, $U_i$ is the user information part of $V_i$, and $Q_i$ is the question information part of $V_i$; the edges between nodes represent the similarity between nodes.
  • $\mathrm{sim}(V_i, V_j) = \alpha \, \mathrm{sim}(U_i, U_j) + (1 - \alpha) \, \mathrm{sim}(Q_i, Q_j)$
  • $U_i$, $U_j$, $Q_i$, and $Q_j$ in formulas (1), (2), and (3) are normalized feature values, and the weighting coefficients, such as $\alpha$, are constants.
  • the user relationship graph can be constructed.
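The graph construction can be sketched as follows, combining the user-feature similarity and the question-feature similarity with a constant weight. The formulas for the two component similarities are not reproduced in the text, so cosine similarity is used purely as an assumption; `cosine`, `node_similarity`, `build_graph`, and the default weight `alpha=0.5` are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def node_similarity(node_i, node_j, alpha=0.5):
    """Weighted sum of user-feature similarity and question-feature similarity."""
    u_i, q_i = node_i
    u_j, q_j = node_j
    return alpha * cosine(u_i, u_j) + (1 - alpha) * cosine(q_i, q_j)


def build_graph(nodes, alpha=0.5):
    """Return the symmetric edge weight matrix of the user relationship graph."""
    n = len(nodes)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            s = node_similarity(nodes[i], nodes[j], alpha)
            weights[i][j] = weights[j][i] = s
    return weights


# Each node is a pair (normalized user features, normalized question features).
nodes = [
    ([1.0, 0.0], [1.0, 0.0]),
    ([1.0, 0.0], [0.0, 1.0]),
]
W = build_graph(nodes, alpha=0.5)
```

Self-similarity is left at 0.0 on the diagonal so that a node does not propagate labels to itself; that is a design choice of this sketch, not something the text specifies.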
  • Step S23 When the first user is included in the user relationship graph, determine the first user label cluster of the first user according to the user label cluster of the initial user in the user relationship graph;
  • When the users in the user relationship graph constructed in the above steps include the first user, that is, the user who sent the question information, the first user tag cluster of the first user is determined directly according to the user tag cluster of the initial user in the user relationship graph.
  • Specifically, an initial user is determined from the no less than two users in the user relationship graph, a user tag cluster is set for the initial user, and the user tag clusters of the users other than the initial user among the no less than two users are determined based on the similarity between each two users in the user relationship graph and the iterative function.
  • determining the user tag clustering of users other than the initial user among the two or more users in the user relationship graph according to the similarity between each two users in the user relationship graph and the iterative function may be specifically:
  • Let the $n \times n$ matrix $M$ be the edge weight matrix of the user relationship graph $G$.
  • The element $m_{ij}$ of the matrix represents the similarity of the nodes $r_i$ and $r_j$.
  • Each row vector of $M$ is normalized to obtain the matrix $M'$.
  • Each element of $M'$ is calculated by formula (4), so that the terms of each row vector of $M'$ sum to 1.
  • After each iteration, the category vector of each node with an initially labeled category is restored to its initial vector so that it stays consistent with the labeled category.
  • For each node $r$ in the graph, the category with the largest value in its category information vector $v = (p(c_1), p(c_2), \ldots, p(c_n))$ is taken as the category of the node.
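The steps above resemble standard label propagation on a graph, which can be sketched as below. The exact iterative function is not reproduced in the text, so this is an assumption-based illustration: `row_normalize`, `propagate_labels`, the uniform starting vectors, and the convergence tolerance are all hypothetical choices.

```python
def row_normalize(weights):
    """Scale each row so its entries sum to 1 (rows of zeros are left as zeros)."""
    out = []
    for row in weights:
        total = sum(row)
        out.append([w / total if total else 0.0 for w in row])
    return out


def propagate_labels(weights, seeds, num_classes, max_iter=100, tol=1e-9):
    """Spread the classes of the seed (initial) users to all other users.

    seeds maps node index -> class index for the initially labeled users.
    Returns the class index with the largest probability for every node.
    """
    n = len(weights)
    m = row_normalize(weights)

    def clamp(vectors):
        # Restore the labeled nodes to their initial one-hot vectors.
        for node, cls in seeds.items():
            vectors[node] = [1.0 if c == cls else 0.0 for c in range(num_classes)]

    vectors = [[1.0 / num_classes] * num_classes for _ in range(n)]
    clamp(vectors)
    for _ in range(max_iter):
        new = [
            [sum(m[i][j] * vectors[j][c] for j in range(n)) for c in range(num_classes)]
            for i in range(n)
        ]
        clamp(new)
        delta = max(abs(new[i][c] - vectors[i][c])
                    for i in range(n) for c in range(num_classes))
        vectors = new
        if delta < tol:
            break
    return [max(range(num_classes), key=lambda c: v[c]) for v in vectors]


# Users 0 and 3 are initial users with clusters 0 and 1 respectively.
W = [
    [0.0, 0.9, 0.1, 0.0],
    [0.9, 0.0, 0.2, 0.1],
    [0.1, 0.2, 0.0, 0.9],
    [0.0, 0.1, 0.9, 0.0],
]
labels = propagate_labels(W, seeds={0: 0, 3: 1}, num_classes=2)
```

In this example user 1 is most similar to the initial user 0 and user 2 to the initial user 3, so each inherits the cluster of its closest labeled neighbour.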
  • Step S24 Reply to the question information sent by the first user according to the technical level parameters of the first user tag clustering.
  • In this embodiment, when the question information sent by the first user is received, the user information of the first user is acquired, the first user tag cluster is determined according to the user information of the first user, the first user tag cluster is determined as the user tag cluster of the first user, and the question information sent by the first user is replied to according to the technical level parameter of the first user tag cluster.
  • A flowchart of the method is shown in FIG. 3 and includes:
  • Step S31 When the question information sent by the first user is received, the user information of the first user is obtained;
  • Step S32 Find a user relationship graph.
  • the user relationship graph includes: no less than two users and the similarity between each two users;
  • Step S33 When the first user is included in the user relationship graph, question clusters are set for the question information sent by each of the no less than two users in the user relationship graph.
  • Step S34 Determine the proportion of the number of people in each question cluster, where the proportion of the number of people in a question cluster is the ratio of the number of users corresponding to the questions under that question cluster to the number of users corresponding to all the questions under all question clusters;
  • Step S35 Determine the initial number of users corresponding to each question cluster according to the proportion of the number of people in each question cluster;
  • Step S36 Determine an initial user according to the number of initial users corresponding to each question cluster, and set a user tag cluster for the initial user;
  • For example, there are five question clusters: the first question cluster includes 100 questions, the second includes 300 questions, the third includes 200 questions, the fourth includes 400 questions, and the fifth includes 500 questions.
  • Assume that one question corresponds to one user.
  • The 100 questions under the first question cluster then correspond to 100 users, that is, 100 users have asked questions that belong to the first question cluster;
  • the 300 questions under the second question cluster correspond to 300 users, that is, 300 users have asked questions that belong to the second question cluster.
  • The proportion of the number of people in each question cluster is the ratio of the number of users corresponding to the questions under that question cluster to the number of users corresponding to all the questions under all question clusters. Here the 5 question clusters contain 1500 questions in total, corresponding to 1500 users.
  • The proportion of the number of people is therefore 100/1500 = 1/15 for the first question cluster, 300/1500 = 3/15 for the second, 200/1500 = 2/15 for the third, 400/1500 = 4/15 for the fourth, and 500/1500 = 5/15 for the fifth.
  • The number of users extracted from the first question cluster accounts for 1/15 of the total number of initial users. That is, if there are 15 initial users in total, one user is selected as an initial user from the first question cluster, three from the second, two from the third, four from the fourth, and five from the fifth. In other words, the number of initial users extracted from each question cluster is directly proportional to that cluster's share of the number of people across all question clusters.
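The proportional selection of initial users can be reproduced with a few lines of arithmetic. The function name `allocate_initial_users` is hypothetical, and the largest-remainder rounding used when the shares are not whole numbers is an assumption, since the text only covers the case where they divide evenly.

```python
from fractions import Fraction


def allocate_initial_users(users_per_cluster, total_initial):
    """Split total_initial seed users across clusters in proportion to cluster size."""
    total_users = sum(users_per_cluster)
    exact = [Fraction(count * total_initial, total_users)
             for count in users_per_cluster]
    quotas = [int(x) for x in exact]  # whole part of each exact share
    # Hand out any remaining seats to the clusters with the largest remainders.
    leftover = total_initial - sum(quotas)
    by_remainder = sorted(range(len(exact)),
                          key=lambda i: exact[i] - quotas[i],
                          reverse=True)
    for i in by_remainder[:leftover]:
        quotas[i] += 1
    return quotas


print(allocate_initial_users([100, 300, 200, 400, 500], 15))  # [1, 3, 2, 4, 5]
```

With the five clusters from the example and 15 initial users, the result matches the 1, 3, 2, 4, and 5 users described in the text.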
  • Step S37 Determine user tag clusters of users other than the initial user among not less than two users in the user relationship graph according to the similarity between each two users in the user relationship graph and the iterative function;
  • Step S38 Reply to the question information sent by the first user according to the technical level parameter of the first user tag cluster corresponding to the first user.
  • In this embodiment, when the question information sent by the first user is received, the user information of the first user is acquired, the first user tag cluster is determined according to the user information of the first user, the first user tag cluster is determined as the user tag cluster of the first user, and the question information sent by the first user is replied to according to the technical level parameter of the first user tag cluster.
  • This embodiment discloses a data processing method.
  • the flowchart is shown in FIG. 4 and includes:
  • Step S41 When the question information sent by the first user is received, the user information of the first user is acquired;
  • Step S42 Determine the similarity ranking of the first user and the first number of user tag clusters according to the user information of the first user;
  • Step S43 Determine the first user tag cluster according to the similarity, and determine the first user tag cluster as the user tag cluster of the first user;
  • a first number of user tag clusters are set in advance, and the first number of user tag clusters are set according to user characteristics.
  • the similarity ranking of the first user and the first number of user tag clusters is determined according to the user information of the first user.
  • Specifically, the user features of the first user are determined according to the user information of the first user, the similarity between those user features and each of the preset user tag clusters is determined, and the similarities are ranked. The user tag cluster with the highest similarity is selected and determined as the first user tag cluster, that is, the user tag cluster of the first user.
  • For example, 5 user tag clusters A, B, C, D, and E are set in advance.
  • If the similarities are ranked as follows: C user tag cluster > D user tag cluster > A user tag cluster > E user tag cluster > B user tag cluster, then the user tag cluster with the highest similarity to the user features of the first user is the C user tag cluster and the one with the lowest similarity is the B user tag cluster. The C user tag cluster is set as the first user tag cluster, that is, the C user tag cluster is determined as the user tag cluster of the first user.
  • At fixed intervals, the user tag clusters are reset according to the user features of all users. That is, as the number of users who ask questions increases and the user base in the user tag clusters grows, new user tag clusters are determined from the user features of all users.
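Ranking the preset user tag clusters by similarity to a new user can be sketched as follows. The text does not state how a cluster is represented or which similarity measure is used, so representing each cluster by a feature centroid and using cosine similarity are assumptions; `rank_clusters` and the sample centroids are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_clusters(user_features, centroids):
    """Return cluster names ordered from most to least similar to the user."""
    scores = {name: cosine(user_features, centroid)
              for name, centroid in centroids.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical centroids of three preset user tag clusters.
centroids = {
    "A": [1.0, 0.0],
    "B": [0.0, 1.0],
    "C": [1.0, 1.0],
}
ranking = rank_clusters([0.9, 0.8], centroids)
first_user_tag_cluster = ranking[0]
```

Only the top entry of the ranking is needed to assign the user; the full ranking is kept here because the text describes ranking all similarities before selecting the highest.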
  • Step S44 Reply to the question information sent by the first user according to the technical level parameters of the first user tag clustering.
  • In this embodiment, when the question information sent by the first user is received, the user information of the first user is acquired, the first user tag cluster is determined according to the user information of the first user, the first user tag cluster is determined as the user tag cluster of the first user, and the question information sent by the first user is replied to according to the technical level parameter of the first user tag cluster.
  • This embodiment discloses a data processing method.
  • the flowchart is shown in FIG. 5 and includes:
  • Step S51 When receiving the question information sent by the first user, determine whether other question information is received within the first time interval when the question information is received;
  • Step S52 When other question information is received within the first time interval of receiving the question information, the question information is combined with the other question information;
  • When the question information sent by the first user is received, it is first determined whether other question information is received within the first time interval in which the question information is received. The first time interval may consist of a first predetermined period before the moment the question information is received and a second predetermined period after that moment. If other question information is also received within the first time interval, the question information is merged with the other question information so that the questions can be answered together.
  • In this way, multiple separate replies are not needed, and a question that the user has split into several messages does not become unclear.
  • Question information sent by the user whose length is less than a first threshold, such as greetings like "Hi" or "Hello", may also be filtered out.
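Merging question messages that arrive close together and dropping very short ones can be sketched as below. This simplifies the description: instead of separate periods before and after the receipt moment, a single maximum gap between consecutive messages is used. The function name `merge_questions` and the default values of `window_seconds` and `min_length` are assumptions.

```python
def merge_questions(messages, window_seconds=30.0, min_length=6):
    """Merge one user's messages that arrive within window_seconds of each other.

    messages is a list of (timestamp_in_seconds, text) pairs in any order.
    Messages shorter than min_length characters are treated as greetings
    and filtered out before merging.
    """
    merged = []
    last_time = None
    for timestamp, text in sorted(messages):
        text = text.strip()
        if len(text) < min_length:
            continue  # drop greetings such as "Hi" or "Hello"
        if merged and timestamp - last_time <= window_seconds:
            merged[-1] = merged[-1] + " " + text
        else:
            merged.append(text)
        last_time = timestamp
    return merged


messages = [
    (0, "Hi"),
    (2, "My phone will not charge."),
    (10, "It is two years old."),
    (500, "How do I reset it?"),
]
questions = merge_questions(messages)
```

Each element of the result is then answered once, so a question that was split across two messages receives a single combined reply.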
  • Step S53 Acquire user information of the first user.
  • Step S54 Determine the first user tag cluster according to the user information of the first user, and determine the first user tag cluster as the user tag cluster of the first user;
  • Step S55 Reply to the combined question information sent by the first user according to the technical level parameters of the first user tag clustering.
  • In this embodiment, when the question information sent by the first user is received, the user information of the first user is acquired, the first user tag cluster is determined according to the user information of the first user, the first user tag cluster is determined as the user tag cluster of the first user, and the combined question information sent by the first user is replied to according to the technical level parameter of the first user tag cluster.
  • This embodiment discloses an electronic device.
  • the structure diagram is shown in FIG. 6 and includes:
  • a processor 61 and a memory 62, where the processor 61 is connected to the memory 62.
  • the memory 62 is configured to store user tag clusters and technical level parameters corresponding to the user tag clusters.
  • the processor 61 is configured to obtain user information of the first user when receiving the question information sent by the first user, determine the first user tag cluster according to the user information of the first user, determine the first user tag cluster as the user tag cluster of the first user, and reply to the question information sent by the first user according to the technical level parameter of the first user tag cluster.
  • a user asks a question
  • the user asking the question can be logged in.
  • the user's user information can be obtained from the account that the user logs in to.
  • the user information may be personal information filled in by the user when registering or supplementing the logged-in account, and may also be information such as questions or speeches previously issued by the account in which the user is logged in.
  • the user sends a question to the customer service.
  • the user can first log in to the account and, after logging in, send the question to the customer service; the customer service system then obtains the user's user information from the account, such as age, how long the product has been used, the number of products used, and questions the user has previously asked.
  • multiple user tag clusters can be set in advance, different user tag clusters correspond to different technical level parameters, and the user's technical level in the corresponding user tag cluster is represented by different technical level parameters.
  • three user tag clusters are set in advance, which are the first user tag cluster, the second user tag cluster, and the third user tag cluster.
  • the first user tag cluster corresponds to a high technical level, and its technical level parameter may be 1; the second user tag cluster corresponds to a medium technical level, and its technical level parameter may be 2; the third user tag cluster corresponds to a low technical level, and its technical level parameter may be 3.
  • Determining the first user tag cluster according to the user information of the first user may be specifically: comprehensively evaluating the user tag cluster of the user according to the multiple user information corresponding to the first user.
  • For example, the product the user consults about is an electronic product, the user is 30 years old, has been using the electronic product for 3 years, and has previously asked relatively professional questions. Since young people generally understand electronic products better, and this user has used the product for 3 years and asked relatively professional questions, it can be determined that the user belongs to the first user tag cluster, with a high technical level;
  • as another example, the product the user consults about is an electronic product, the user is 65 years old, and has used the electronic product for 3 months. Since the elderly generally know less about electronic products and this user has used the product for only a short time, it can be determined that the user belongs to the third user tag cluster, with a low technical level.
  • the user tag cluster of the user may also be determined by other methods, which is not specifically limited herein.
  • the technical level of the users belonging to a user tag cluster can be determined from its technical level parameter, and the questions raised by different users are then answered according to their different technical levels.
  • Different technical level parameters correspond to different ways of answering questions. For example, for users with a low technical level, replies use simple language, keep professional terms to a minimum, and are longer; for users with a high technical level, replies can use more specialized terminology and are mainly short.
  • the processor 61 determining the first user tag cluster according to the user information of the first user includes:
  • the processor is used to find the user relationship graph, where the user relationship graph includes no less than two users and the similarity between each two users, and, when the user relationship graph includes the first user, to determine the first user tag cluster of the first user according to the user tag cluster of the initial user in the user relationship graph.
  • a user relationship graph is stored in advance, and the user relationship graph includes: no less than two users, and the similarity between each two users.
  • constructing a user relationship graph requires first extracting user features and calculating feature values, and then constructing a user and feature matrix.
  • Extracting user features can be specifically: extracting predefined user features and question features, as shown in Table 1 user feature description and Table 2 user question feature description:
  • Table 1. User feature description
      Feature name | Feature value | Feature description
      User age | Integer value | Age estimated from the information provided when the user registered
      Number of mobile phones used by the user | Integer value | Number of times the user has bought a mobile phone
      Time span of the user's mobile phone use | Integer value | Time interval between the user's first purchase of a mobile phone and the current question
  • Table 1 includes the name, feature value, and feature description of each user feature.
  • when the user feature is the user's age, it is the age estimated from the information provided at registration; when the user feature is the number of mobile phones used by the user, it is determined from the number of times the user has purchased a mobile phone; when the user feature is the time span of the user's mobile phone use, it is determined from the time interval between the user's first purchase of a mobile phone and the current question.
  • Table 2 includes the name, feature value, and feature description of each user question feature.
  • when the user question feature is the professional word frequency of the user's questions, it is determined from the frequency of professional vocabulary in the user's historical questions;
  • when the user question feature is the representativeness of the user's questions, it is determined from the number of samples in the cluster where the user's question is located;
  • when the user question feature is the level of detail of the answers to the user's questions, it is determined from the number of characters in those answers;
  • when the user question feature is the number of user questions, it is determined from the number of the user's historical questions;
  • when the user question feature is the user's dialogue time, it is determined from the average time taken by the user's historical question dialogues in the customer service system;
  • when the user question feature is the number of user interaction rounds, it is determined from the number of rounds of interaction between the user and customer service for historical questions in the customer service system.
  • a user and feature matrix M is constructed, each row of the matrix represents a user, each column of the matrix represents a one-dimensional feature, and then each column is normalized.
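  • A minimal sketch of building and normalizing such a user-feature matrix is shown below. The text does not say which normalization is used, so min-max scaling to [0, 1] is an assumption here, and the function name `normalize_columns` and the sample values are hypothetical.

```python
def normalize_columns(matrix):
    """Min-max normalize each column of a row-major matrix to [0, 1].

    Each row is one user and each column is one feature. A constant
    column is mapped to 0.0 to avoid division by zero.
    """
    if not matrix:
        return []
    normalized_columns = []
    for col in zip(*matrix):  # iterate over columns
        lo, hi = min(col), max(col)
        span = hi - lo
        normalized_columns.append(
            [0.0 if span == 0 else (x - lo) / span for x in col]
        )
    # transpose back so that rows are users again
    return [list(row) for row in zip(*normalized_columns)]


# Hypothetical users: (age, phones purchased, months of use)
M = [
    [30, 3, 36],
    [65, 1, 3],
    [45, 2, 12],
]
M_norm = normalize_columns(M)
```

    Normalizing per column keeps features with large numeric ranges, such as months of use, from dominating the similarity computed later.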
  • User relationship graph G is composed of user U and user similarity, where G's nodes are users and edges are similarity between users.
  • $V_i = \langle U_i, Q_i \rangle$
  • where $V_i$ is the $i$-th node, $U_i$ is the user information part of $V_i$, and $Q_i$ is the question information part of $V_i$; the edges between nodes represent the similarity between nodes.
  • $\mathrm{sim}(V_i, V_j) = \alpha\,\mathrm{sim}(U_i, U_j) + (1 - \alpha)\,\mathrm{sim}(Q_i, Q_j)$
  • $U_i$, $U_j$, $Q_i$, and $Q_j$ in formulae (1), (2), and (3) are normalized feature values, and $\alpha$ and $\beta$ are constants.
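  • The weighted combination of user similarity and question similarity can be sketched as below. The component similarity functions are not recoverable from the text, so cosine similarity is used here purely as an assumption; the weighting constant is called `alpha`, and its default value of 0.5 is hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def node_similarity(u_i, q_i, u_j, q_j, alpha=0.5):
    """Combine user-feature and question-feature similarity for two nodes.

    u_* are normalized user feature vectors, q_* are normalized question
    feature vectors; alpha weights the user part against the question part.
    """
    return alpha * cosine(u_i, u_j) + (1 - alpha) * cosine(q_i, q_j)
```

    Computing `node_similarity` for every pair of users yields the edge weights of the user relationship graph.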
  • the user relationship graph can be constructed.
  • when the users in the user relationship graph constructed in the above steps include the first user, that is, the user who sent the question information, the first user tag cluster of the first user is determined directly according to the user tag cluster of the initial user in the user relationship graph.
  • an initial user is determined from not less than two users in the user relationship graph
  • a user tag cluster is set for the initial user
  • the user tag clusters of the users other than the initial user among the no fewer than two users in the user relationship graph are determined based on the similarity between each two users in the user relationship graph and the iterative function.
  • determining the user tag clustering of users other than the initial user among the two or more users in the user relationship graph according to the similarity between each two users in the user relationship graph and the iterative function may be specifically:
  • let the $n \times n$ matrix $M$ be the edge weight matrix of the user relationship graph $G$.
  • the element $m_{ij}$ of the matrix represents the similarity between nodes $r_i$ and $r_j$.
  • each row vector of $M$ is normalized to obtain the matrix $M'$.
  • each element of $M'$ is calculated by formula (4), so that the terms of each row vector of $M'$ sum to 1.
  • the class vector of each node with an initially labeled category is restored to its initially set vector so that it stays consistent with the labeled category.
  • for each node $r$ in the graph, with category information vector $v = (p(c_1), p(c_2), \ldots, p(c_n))$, the category with the largest value in the vector is taken as the category of the node.
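  • The steps above resemble standard label propagation on a graph, which can be sketched as follows. This is an illustration under stated assumptions rather than the exact iterative function of the application: row normalization is taken to divide each element by its row sum (which makes each row sum to 1), unlabeled nodes start from a uniform class vector, and the names `row_normalize`, `propagate_labels`, and the fixed iteration count are hypothetical.

```python
def row_normalize(m):
    """Divide each element by its row sum so that every row sums to 1."""
    out = []
    for row in m:
        total = sum(row)
        out.append([x / total if total else 0.0 for x in row])
    return out


def propagate_labels(weights, seeds, n_classes, iterations=20):
    """Label propagation over a similarity graph.

    weights: n x n edge weight (similarity) matrix.
    seeds:   {node index: class index} for the initial labeled users.
    Returns the class index with the largest score for every node.
    """
    n = len(weights)
    m = row_normalize(weights)

    def one_hot(c):
        return [1.0 if k == c else 0.0 for k in range(n_classes)]

    # Seeds get a one-hot vector; other nodes start uniform (assumption).
    vectors = [
        one_hot(seeds[i]) if i in seeds else [1.0 / n_classes] * n_classes
        for i in range(n)
    ]
    for _ in range(iterations):
        new = [
            [sum(m[i][j] * vectors[j][k] for j in range(n))
             for k in range(n_classes)]
            for i in range(n)
        ]
        # Restore the seed nodes so they stay consistent with their labels.
        for i, c in seeds.items():
            new[i] = one_hot(c)
        vectors = new
    return [max(range(n_classes), key=lambda k: v[k]) for v in vectors]


# Hypothetical graph: node 1 is close to seed 0, node 2 is close to seed 3.
W = [
    [0.0, 0.9, 0.1, 0.1],
    [0.9, 0.0, 0.2, 0.1],
    [0.1, 0.2, 0.0, 0.8],
    [0.1, 0.1, 0.8, 0.0],
]
labels = propagate_labels(W, seeds={0: 0, 3: 1}, n_classes=2)
```

    In this example the unlabeled nodes inherit the class of the seed they are most similar to.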
  • the processor 61 determines the initial user from no less than two users in the user relationship diagram, including:
  • the processor is configured to: set a problem cluster for the problem information sent by each user in the user relationship graph; determine the proportion of the number of people in each problem cluster, where the proportion for a problem cluster is the ratio of the number of users who raised the questions under that cluster to the number of users who raised all the questions under all problem clusters; determine the number of initial users corresponding to each problem cluster according to that proportion; and determine the initial users according to the number of initial users corresponding to each problem cluster.
  • the first problem cluster includes 100 questions
  • the second problem cluster includes 300 questions
  • the third problem cluster includes 200 questions
  • the fourth problem cluster includes 400 questions, and the fifth problem cluster includes 500 questions.
  • one question corresponds to one user.
  • 100 questions under the first problem cluster correspond to 100 users, that is, 100 users have asked questions that belong to the first problem cluster
  • the 300 questions under the second question cluster correspond to 300 users, that is, 300 users have raised questions that belong to the second question cluster.
  • the proportion of the number of people in each question cluster is the ratio of the number of users who raised the questions under that cluster to the number of users who raised all the questions under all question clusters.
  • the number of users corresponding to all the questions under all question clusters is 1500, since the 5 question clusters contain 1500 questions in total.
  • the proportion of the number of people in the first question cluster is 100/1500, which is 1/15
  • the proportion of the number of people in the second question cluster is 300/1500, which is 3/15
  • the proportion of the number of people in the third question cluster is 200/1500, which is 2/15
  • the proportion of the number of people in the fourth question cluster is 400/1500, which is 4/15
  • the proportion of the number of people in the fifth question cluster is 500/1500, which is 5/15.
  • the number of users extracted from the first problem cluster accounts for 1/15 of the total number of initial users. That is, if there are 15 initial users in total, one user is selected as an initial user from the first problem cluster, three from the second problem cluster, two from the third problem cluster, four from the fourth problem cluster, and five from the fifth problem cluster. In other words, the number of initial users extracted from each problem cluster is directly proportional to that cluster's share of the number of people across all problem clusters.
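  • The proportional allocation in this example can be checked with a short sketch. The function name `allocate_initial_users` is hypothetical, and the largest-remainder rule used when the shares do not divide evenly is an assumption, since the text only covers the evenly divisible case.

```python
def allocate_initial_users(cluster_sizes, total_initial):
    """Split total_initial seed users across clusters in proportion to size.

    cluster_sizes: {cluster name: number of users who asked questions in it}.
    Uses integer arithmetic; any leftover seats go to the clusters with the
    largest remainders (an assumption for the non-divisible case).
    """
    total = sum(cluster_sizes.values())
    shares = {
        name: divmod(total_initial * size, total)
        for name, size in cluster_sizes.items()
    }
    counts = {name: quotient for name, (quotient, _) in shares.items()}
    leftover = total_initial - sum(counts.values())
    by_remainder = sorted(shares, key=lambda c: shares[c][1], reverse=True)
    for name in by_remainder[:leftover]:
        counts[name] += 1
    return counts


sizes = {"first": 100, "second": 300, "third": 200, "fourth": 400, "fifth": 500}
allocation = allocate_initial_users(sizes, total_initial=15)
```

    With the figures from the example, the allocation is 1, 3, 2, 4, and 5 initial users for the first through fifth clusters.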
  • the processor 61 determining the first user tag cluster according to the user information of the first user includes:
  • the processor determines the similarity ranking of the first user and the first number of user tag clusters according to the user information of the first user, and determines the first user tag cluster according to the similarity.
  • a first number of user tag clusters are set in advance, and the first number of user tag clusters are set according to user characteristics.
  • the similarity ranking of the first user and the first number of user tag clusters is determined according to the user information of the first user.
  • the user characteristics of the first user are determined according to the user information of the first user; the similarity between those characteristics and each of the multiple user tag clusters is determined; the similarities are ranked; and the user tag cluster with the highest similarity is selected and determined as the first user tag cluster, that is, the user tag cluster of the first user.
  • 5 user tag clusters are set in advance.
  • the similarity is ranked as follows: C user tag cluster > D user tag cluster > A user tag cluster > E user tag cluster > B user tag cluster. The user tag cluster with the highest similarity to the user features of the first user is therefore the C user tag cluster, and the one with the lowest similarity is the B user tag cluster. The C user tag cluster is set as the first user tag cluster, that is, it is determined as the user tag cluster of the first user.
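  • Ranking clusters by similarity and picking the top one can be sketched as below. Representing each user tag cluster by a centroid vector and using a distance-based similarity are both assumptions, as are the names `similarity`, `rank_clusters`, and all numeric values.

```python
import math


def similarity(a, b):
    """Map Euclidean distance to a similarity in (0, 1]; closer is higher."""
    return 1.0 / (1.0 + math.dist(a, b))


def rank_clusters(user_features, centroids):
    """Return (cluster name, similarity) pairs, most similar first."""
    scored = [
        (name, similarity(user_features, centroid))
        for name, centroid in centroids.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical centroids for the five preset user tag clusters.
centroids = {
    "A": [0.9, 0.5],
    "B": [1.5, 1.5],
    "C": [0.5, 0.6],
    "D": [0.5, 0.8],
    "E": [0.0, 0.0],
}
ranking = rank_clusters([0.5, 0.5], centroids)
first_user_tag_cluster = ranking[0][0]
```

    With these values the ranking reproduces the order C, D, A, E, B from the example, so cluster C is assigned to the first user.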
  • at fixed intervals, the user tag clusters are reset according to the user characteristics of all users. That is, as the number of users who ask questions grows and the user base in the user tag clusters increases, new user tag clusters are determined according to the user characteristics of all users.
  • the processor 61 is further configured to receive the question information sent by the first user, determine whether other question information is received within a first time interval around the receipt of the question information, and, if so, merge the question information with the other question information.
  • when the question information sent by the first user is received, it is first determined whether other question information is received within the first time interval, where the first time interval may span a first predetermined period before the moment the question information is received and a second predetermined period after that moment. If other question information is also received within the first time interval, the question information is merged with the other question information so that the questions can be answered together.
  • there is then no need to reply to the user multiple times, and a question that the user has split across several messages does not become unclear.
  • in addition, information whose length is less than a first threshold, such as greetings like "Hi" or "Hello", may be filtered out of the question information sent by the user.
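  • The merging and filtering behaviour can be sketched as follows. This is a simplification under stated assumptions: instead of separate periods before and after each message, consecutive messages from one user are merged when the gap between them is at most `window` seconds. The function name `merge_questions`, the 30-second window, and the 6-character threshold are hypothetical.

```python
def merge_questions(messages, window=30.0, min_length=6):
    """Merge messages sent close together and drop very short ones.

    messages: list of (timestamp_seconds, text) from one user, sorted by time.
    Messages shorter than min_length characters (for example greetings)
    are filtered out before merging.
    """
    merged = []
    last_time = None
    for timestamp, text in messages:
        text = text.strip()
        if len(text) < min_length:
            continue  # drop greetings such as "Hi" or "Hello"
        if merged and timestamp - last_time <= window:
            merged[-1] = merged[-1] + " " + text  # same question, continued
        else:
            merged.append(text)  # a new question starts here
        last_time = timestamp
    return merged


questions = merge_questions([
    (0, "Hi"),
    (2, "My phone will not turn on."),
    (10, "It is fully charged."),
    (300, "How do I reset the password?"),
])
```

    Here the greeting is discarded, the two messages sent within the window are answered as one question, and the later message is treated as a separate question.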
  • the electronic device disclosed in this embodiment includes a memory and a processor.
  • the processor is configured to obtain the user information of the first user when the question information sent by the first user is received, determine the first user tag cluster according to the user information of the first user, determine the first user tag cluster as the user tag cluster of the first user, and reply to the question information sent by the first user according to the technical level parameter of the first user tag cluster.
  • this solution determines the user tag cluster corresponding to each user according to that user's user information, so that each user's questions are answered according to the technical level parameter of the user's tag cluster, providing targeted answers for users with different levels of expertise.
  • the processor 61 may include, for example, a general-purpose microprocessor, an instruction set processor and / or an associated chipset and / or a special-purpose microprocessor (for example, an application-specific integrated circuit (ASIC)), and so on.
  • the processor 61 may also include on-board memory for caching purposes.
  • the memory 62 may be, for example, a non-volatile computer-readable storage medium, and specific examples include, but are not limited to: a magnetic storage device such as a magnetic tape or a hard disk (HDD); an optical storage device such as a compact disc (CD-ROM); a memory such as Random Access Memory (RAM) or Flash; etc.
  • RAM random access memory
  • ROM read-only memory
  • electrically programmable ROM, electrically erasable programmable ROM
  • registers, hard disks, removable disks, CD-ROMs, or any other form of storage medium known in the technical field.

Landscapes

  • Business, Economics & Management (AREA)
  • Accounting & Taxation (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Finance (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

Disclosed are a data processing method and an electronic device. The method comprises: acquiring user information of a first user when problem information sent by the first user is received; determining a first user tag cluster according to the user information of the first user; determining the first user tag cluster as a user tag cluster of the first user; and responding to the problem information sent by the first user according to parameters on the technical level of the first user tag cluster. By means of the solution, according to user information of a different user, a different user tag cluster corresponding to the user is determined so that the problem raised by each user is responded to according to parameters on the technical level corresponding to the user tag cluster of each user, and the targeted solution is achieved according to different professional levels of different users.

Description

Data processing method and electronic device
Technical field
The invention relates to the field of processing, and in particular to a data processing method and an electronic device.
Background
In a customer service system, users enter the questions they want to ask through the system, and customer service staff answer them.
However, users differ in technical level and in product knowledge. For customer service staff, some users' questions are easy to answer while others are complex; for users, those with a lower technical level or less product knowledge need more detailed answers, while those with a higher technical level or more product knowledge do not need much explanation.
To provide each user with targeted answers, the technical level of different users needs to be evaluated.
Summary of the invention
In view of this, the present invention provides a data processing method and an electronic device to solve the problem in the prior art that targeted answers cannot be provided for the questions raised by different users. The specific solutions are as follows:
A data processing method includes:
when question information sent by a first user is received, obtaining user information of the first user;
determining a first user tag cluster according to the user information of the first user, and determining the first user tag cluster as the user tag cluster of the first user;
replying to the question information sent by the first user according to the technical level parameter of the first user tag cluster.
Further, determining the first user tag cluster according to the user information of the first user includes:
looking up a user relationship graph, where the user relationship graph includes no fewer than two users and the similarity between each two users;
when the user relationship graph includes the first user, determining the first user tag cluster of the first user according to the user tag cluster of an initial user in the user relationship graph.
Further, determining the first user tag cluster according to the user information of the first user includes:
determining, according to the user information of the first user, a ranking of the similarity between the first user and a first number of user tag clusters;
determining the first user tag cluster according to the similarity.
Further, determining the first user tag cluster of the first user according to the user tag cluster of the initial user in the user relationship graph includes:
determining an initial user from the no fewer than two users in the user relationship graph, and setting a user tag cluster for the initial user;
determining, according to the similarity between each two users in the user relationship graph and an iterative function, the user tag clusters of the users other than the initial user among the no fewer than two users in the user relationship graph.
Further, determining the initial user from the no fewer than two users in the user relationship graph includes:
setting a question cluster for the question information sent by each of the no fewer than two users in the user relationship graph;
determining the proportion of the number of people in each question cluster, where the proportion for a question cluster is the ratio of the number of users who raised the questions under that question cluster to the number of users who raised all the questions under all question clusters;
determining the number of initial users corresponding to each question cluster according to the proportion of the number of people in that question cluster;
determining the initial users according to the number of initial users corresponding to each question cluster.
Further, the method also includes:
receiving the question information sent by the first user, and determining whether other question information is received within a first time interval of receiving the question information;
when other question information is received within the first time interval of receiving the question information, merging the question information with the other question information.
An electronic device includes a processor and a memory, wherein:
the memory is configured to store user tag clusters and the technical level parameters corresponding to the user tag clusters;
the processor is configured to obtain the user information of the first user when the question information sent by the first user is received, determine a first user tag cluster according to the user information of the first user, determine the first user tag cluster as the user tag cluster of the first user, and reply to the question information sent by the first user according to the technical level parameter of the first user tag cluster.
Further, the processor determining the first user tag cluster according to the user information of the first user includes:
the processor looks up a user relationship graph, where the user relationship graph includes no fewer than two users and the similarity between each two users, and, when the user relationship graph includes the first user, determines the first user tag cluster of the first user according to the user tag cluster of an initial user in the user relationship graph.
Further, the processor determining the first user tag cluster according to the user information of the first user includes:
the processor determines, according to the user information of the first user, a ranking of the similarity between the first user and a first number of user tag clusters, and determines the first user tag cluster according to the similarity.
Further, the processor determining the first user tag cluster of the first user according to the user tag cluster of the initial user in the user relationship graph includes:
the processor determines an initial user from the no fewer than two users in the user relationship graph, sets a user tag cluster for the initial user, and determines, according to the similarity between each two users in the user relationship graph and an iterative function, the user tag clusters of the users other than the initial user among the no fewer than two users in the user relationship graph.
As can be seen from the above technical solutions, in the data processing method and electronic device disclosed in this application, when the question information sent by the first user is received, the user information of the first user is obtained, the first user tag cluster is determined according to that user information, the first user tag cluster is determined as the user tag cluster of the first user, and the question information sent by the first user is answered according to the technical level parameter of the first user tag cluster. This solution determines the user tag cluster corresponding to each user according to that user's user information, so that each user's questions are answered according to the technical level parameter of the user's tag cluster, providing targeted answers for users with different levels of expertise.
Brief description of the drawings
To explain the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. The drawings described below are merely some embodiments of the present invention; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
FIG. 1 is a flowchart of a data processing method disclosed by an embodiment of the present invention;
FIG. 2 is a flowchart of a data processing method disclosed by an embodiment of the present invention;
FIG. 3 is a flowchart of a data processing method disclosed by an embodiment of the present invention;
FIG. 4 is a flowchart of a data processing method disclosed by an embodiment of the present invention;
FIG. 5 is a flowchart of a data processing method disclosed by an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of an electronic device disclosed by an embodiment of the present invention.
Detailed description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments. The described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
In the following detailed description, for ease of explanation, many specific details are set forth to provide a comprehensive understanding of the embodiments of the present disclosure. However, one or more embodiments may also be practiced without these specific details. In addition, descriptions of well-known structures and techniques are omitted below to avoid unnecessarily obscuring the concepts of the present disclosure.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit the present disclosure. The terms "including", "comprising", and the like, as used herein, indicate the presence of the stated features, steps, operations, and/or components, but do not preclude the presence or addition of one or more other features, steps, operations, or components.
All terms used herein (including technical and scientific terms) have the meaning commonly understood by those skilled in the art unless otherwise defined. The terms used herein should be interpreted as having meanings consistent with the context of this specification and should not be interpreted in an idealized or overly rigid manner.
Where an expression such as "at least one of A, B, and C, etc." is used, it should generally be interpreted in the sense commonly understood by those skilled in the art (for example, "a system having at least one of A, B, and C" includes, but is not limited to, systems having A alone, B alone, C alone, A and B, A and C, B and C, and/or A, B, and C). Where an expression such as "at least one of A, B, or C, etc." is used, it should likewise be interpreted in the sense commonly understood by those skilled in the art (for example, "a system having at least one of A, B, or C" includes, but is not limited to, systems having A alone, B alone, C alone, A and B, A and C, B and C, and/or A, B, and C).
Some block diagrams and/or flowcharts are shown in the drawings. Some blocks in the block diagrams and/or flowcharts, or combinations of them, may be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, or another programmable data processing device, so that the instructions, when executed by the processor, create means for implementing the functions/operations illustrated in the block diagrams and/or flowcharts. The techniques of this disclosure may be implemented in hardware and/or software (including firmware, microcode, etc.). In addition, the techniques of this disclosure may take the form of a computer program product on a computer-readable storage medium storing instructions, which may be used by or in conjunction with an instruction execution system.
The invention discloses a data processing method. Its flowchart is shown in FIG. 1, and the method includes:
Step S11: when question information sent by a first user is received, obtain the user information of the first user;
In a customer service system, web page, forum, or similar system, a user who asks a question may be logged in, so that when the user sends the question, the user's user information can be obtained from the logged-in account.
The user information may be the personal information the user filled in when registering or updating the logged-in account, or information such as questions or posts previously sent from that account.
For example, in a customer service system, before sending a question to customer service, the user may first log in to an account and then send the question. The customer service system then obtains the user's user information from the account, such as age, how long the product has been used, the number of products used, and the questions the user has previously asked.
Step S12: determine a first user tag cluster according to the user information of the first user, and determine the first user tag cluster as the user tag cluster of the first user;
In embodiments of the present invention, multiple user tag clusters can be set in advance. Different user tag clusters correspond to different technical level parameters, and each technical level parameter represents the technical level of the users in the corresponding user tag cluster.
For example, three user tag clusters are set in advance: a first, a second, and a third user tag cluster. The first user tag cluster corresponds to a high technical level, and its technical level parameter may be 1; the second corresponds to a medium technical level, and its technical level parameter may be 2; the third corresponds to a low technical level, and its technical level parameter may be 3.
Determining the first user tag cluster according to the user information of the first user may specifically be: comprehensively evaluating the user's user tag cluster based on the multiple items of user information corresponding to the first user.
For example, the product the user asks about is an electronic product, the user is 30 years old, the user has used the product for 3 years, and the questions the user has previously asked are fairly professional. Because younger users tend to understand electronic products well, the user has used the product for 3 years, and the user's past questions are fairly professional, it can be determined that the user belongs to the first user tag cluster, which corresponds to a high technical level;
If the product the user asks about is an electronic product, the user is 65 years old, and the user has used the product for only 3 months, then, because older users tend to know less about electronic products and the usage time is short, it can be determined that the user belongs to the third user tag cluster, which corresponds to a low technical level.
In the data processing method disclosed in this embodiment, the user tag cluster of a user may also be determined in other ways, which are not specifically limited here.
Step S13: Reply to the question information sent by the first user according to the technical level parameter of the first user tag cluster.
Because different user tag clusters correspond to different technical level parameters, the technical level of the users belonging to a cluster can be determined from that cluster's parameter, and each user's question is then answered according to that user's technical level.
Different technical level parameters correspond to different ways of replying. For a user with a low technical level, the reply uses plain language, keeps technical terms to a minimum, and is longer; for a user with a high technical level, the reply may use more specialized terminology and is kept short.
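The mapping from technical level parameter to reply style described above can be sketched as a simple lookup. This is a minimal illustration, not the claimed implementation: the names `ReplyStyle`, `REPLY_STYLES`, and `reply_style_for` are hypothetical, and the style for the medium level is an assumption, since the text only describes the high and low levels.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReplyStyle:
    use_jargon: bool  # whether specialized terminology is acceptable
    length: str       # "short" or "long"


# Parameters follow the example in the text: 1 = high, 2 = medium, 3 = low.
# The medium-level style is an assumption; the text describes only high and low.
REPLY_STYLES = {
    1: ReplyStyle(use_jargon=True, length="short"),
    2: ReplyStyle(use_jargon=True, length="long"),
    3: ReplyStyle(use_jargon=False, length="long"),
}


def reply_style_for(level_param: int) -> ReplyStyle:
    """Return the reply style for a user tag cluster's technical level parameter."""
    return REPLY_STYLES[level_param]
```

A reply generator could then consult `reply_style_for(3)` to decide that a low-level user should receive a longer answer with few technical terms.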
In the data processing method disclosed in this embodiment, when question information sent by a first user is received, the user information of the first user is obtained, a first user tag cluster is determined according to that user information and taken as the user tag cluster of the first user, and the question information is answered according to the technical level parameter of the first user tag cluster. By determining the user tag cluster of each user from that user's information, the solution answers each question according to the technical level parameter of the asker's cluster, and thus provides answers targeted to users' different levels of expertise.
This embodiment discloses a data processing method, whose flowchart is shown in FIG. 2, including:
Step S21: When question information sent by a first user is received, obtain the user information of the first user;
Step S22: Look up a user relationship graph, which includes no fewer than two users and the similarity between every two users;
The user relationship graph is stored in advance and includes no fewer than two users and the similarity between every two users.
Specifically, constructing the user relationship graph requires first extracting user features and computing feature values, and then constructing a user-feature matrix.
Extracting user features may specifically be: extracting predefined user features and question features, as described in Table 1 (user features) and Table 2 (user question features):
Table 1

| User feature | Feature value | Description |
| --- | --- | --- |
| User age | Integer | Age estimated from the information provided at registration |
| Number of phones used | Integer | Number of times the user has purchased a phone |
| Phone usage time span | Integer | Interval between the user's first phone purchase and the question |

Table 2

| User question feature | Feature value | Description |
| --- | --- | --- |
| Frequency of technical terms in user questions | Integer | Frequency of technical terms in the user's past questions |
| Representativeness of the user's question | Integer | Number of samples in the cluster containing the user's question |
| Level of detail of answers | Integer | Number of characters in the answer to the user's question |
| Number of user questions | Integer | Number of the user's past questions |
| User conversation time | Integer | Average duration of the user's conversations |
| Number of interaction rounds | Integer | Number of rounds of interaction between the user and customer service staff |
Table 1 lists the name, feature value, and description of each user feature. For example, when the question the user sends in the customer service system concerns a mobile phone: the user-age feature is the age estimated from the information provided at registration; the number-of-phones feature is determined by the number of times the user has purchased a phone; and the phone-usage time span feature is determined by the interval between the user's first phone purchase and the current question.
Table 2 lists the name, feature value, and description of each user question feature. For example, when the question the user sends in the customer service system concerns a mobile phone: the technical-term frequency feature is determined by how often technical terms appear in the user's past questions; the representativeness feature is determined by the number of samples in the cluster containing the user's question; the answer-detail feature is determined by the number of characters in the answer to the user's question; the question-count feature is determined by the number of the user's past questions; the conversation-time feature is determined by the average duration of the user's past conversations in the customer service system; and the interaction-rounds feature is determined by the number of rounds of interaction between the user and customer service in those past conversations.
A user-feature matrix M is constructed, in which each row represents a user and each column represents one feature dimension; each column is then normalized.
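The column normalization of the user-feature matrix can be sketched as follows. The text does not say which normalization is used, so min-max scaling to [0, 1] is an assumption here; the function name `normalize_columns` and the sample values are illustrative only.

```python
def normalize_columns(matrix):
    """Min-max scale each column of a row-major matrix to [0, 1] (assumed scheme)."""
    if not matrix:
        return []
    scaled_columns = []
    for col in zip(*matrix):
        lo, hi = min(col), max(col)
        span = hi - lo
        # A constant column carries no information; map it to 0.0.
        scaled_columns.append([(x - lo) / span if span else 0.0 for x in col])
    # Transpose back so that each row is again one user.
    return [list(row) for row in zip(*scaled_columns)]


# Rows: users. Columns: age, phones purchased, usage span in months (illustrative values).
M = [[30, 3, 36], [65, 1, 3], [45, 2, 12]]
M_norm = normalize_columns(M)
```

After scaling, features measured in different units (years, counts, months) contribute on a comparable scale when similarities between users are computed.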
Assume the user relationship graph G is fully connected:
The user relationship graph G consists of users U and user similarities: the nodes of G are users and the edges are the similarities between users. Define a node $V_i = \langle U_i, Q_i \rangle$, where $V_i$ is the i-th node, $U_i$ is the user information part of $V_i$, and $Q_i$ is the user question part of $V_i$; the edge between two nodes represents their similarity. The similarity between nodes $V_i = \langle U_i, Q_i \rangle$ and $V_j = \langle U_j, Q_j \rangle$ is computed as follows:
$$\mathrm{sim}(V_i, V_j) = \alpha\,\mathrm{sim}(U_i, U_j) + (1 - \alpha)\,\mathrm{sim}(Q_i, Q_j) \qquad \text{Formula (1)}$$
Figure PCTCN2018116169-appb-000001 [Formula (2)]
Figure PCTCN2018116169-appb-000002 [Formula (3)]
Here $U_i$, $U_j$, $Q_i$, and $Q_j$ in formulas (1), (2), and (3) are all normalized feature values, and δ and γ are constants.
The user relationship graph is constructed through the above steps.
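The construction of the edge weights from formula (1) can be sketched as below. Formulas (2) and (3) survive only as images, so the component similarities are unknown; a Gaussian kernel is substituted here purely as an assumption, with δ and γ used as its bandwidths. All function names and the default α = 0.5 are hypothetical.

```python
import math


def gaussian_similarity(a, b, bandwidth):
    """Assumed stand-in for formulas (2)/(3): exp(-||a - b||^2 / bandwidth)."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-sq_dist / bandwidth)


def node_similarity(u_i, q_i, u_j, q_j, alpha=0.5, delta=1.0, gamma=1.0):
    """Formula (1): alpha * sim(U_i, U_j) + (1 - alpha) * sim(Q_i, Q_j)."""
    sim_u = gaussian_similarity(u_i, u_j, delta)  # user information part
    sim_q = gaussian_similarity(q_i, q_j, gamma)  # user question part
    return alpha * sim_u + (1 - alpha) * sim_q


def build_similarity_matrix(nodes, alpha=0.5):
    """nodes: list of (U, Q) pairs of normalized feature vectors.

    Returns the n x n edge weight matrix of a fully connected graph.
    """
    n = len(nodes)
    return [
        [node_similarity(*nodes[i], *nodes[j], alpha=alpha) for j in range(n)]
        for i in range(n)
    ]


# Three illustrative nodes: the first two are identical, the third differs.
nodes = [([0.0, 1.0], [0.2]), ([0.0, 1.0], [0.2]), ([1.0, 0.0], [0.9])]
W = build_similarity_matrix(nodes)
```

The weight α controls how much the user information part counts relative to the question part; identical nodes receive similarity 1 under the assumed kernel.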
Step S23: When the user relationship graph includes the first user, determine the first user tag cluster of the first user according to the user tag clusters of the initial users in the graph;
When the users in the user relationship graph constructed above include the first user, that is, the user who sent the question information, the first user tag cluster of the first user is determined directly from the user tag clusters of the initial users in the graph.
Specifically, initial users are determined in advance from the no fewer than two users in the graph, a user tag cluster is set for each initial user, and the user tag clusters of the remaining users are determined according to the similarity between every two users in the graph and an iterative function.
Once the user tag clusters of all users in the graph have been determined, the user tag cluster to which the first user belongs has been determined as well.
Further, determining the user tag clusters of the users other than the initial users according to the similarity between every two users in the graph and the iterative function may specifically be:
Let the $n \times n$ matrix M be the edge weight matrix of the user relationship graph G, whose element $m_{ij}$ is the similarity between nodes $r_i$ and $r_j$. Each row vector of M is then normalized to obtain the matrix M′; each element of M′ is computed by formula (4), so that the entries in each row of M′ sum to 1.
Figure PCTCN2018116169-appb-000003 [Formula (4)]
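The row normalization that produces M′ can be sketched as dividing each entry by the sum of its row, which is the natural reading of the requirement that every row of M′ sums to 1. Since formula (4) itself is only available as an image, this form is an assumption, and `row_normalize` is a hypothetical name.

```python
def row_normalize(weights):
    """Divide each entry by its row sum so that every row sums to 1."""
    normalized = []
    for row in weights:
        total = sum(row)
        # An all-zero row (an isolated node) is left as zeros.
        normalized.append([w / total for w in row] if total else [0.0] * len(row))
    return normalized


M_prime = row_normalize([[1, 3], [2, 2]])
```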
A category information vector is set for each node in the graph. For a node with an initially labeled category, its category vector is set to $v = (0, \ldots, 1_t, \ldots, 0)_n$.
Take n = 2 as an example:
For a node with a labeled category, let its category vector be $v = (0, \ldots, 1_t, \ldots, 0)_n$, whose t-th dimension is 1 and whose other dimensions are 0. In step k+1 of the iteration, the category vector v of each node r is rewritten as $v^{k+1} = M' v^{k}$.
During category diffusion, after the category vector of every node has been updated in an iteration, the category vectors of the initially labeled nodes are restored to their initial values so that they remain consistent with the labeled categories. For each unlabeled node, after the i-th iteration the cosine similarity $\mathrm{sim}(v_i, v_{i+1})$ between the node's category vectors before and after the iteration is computed, and the impact of the i-th iteration on the node is recorded as $\mathrm{impact}(v_i) = 1 - \mathrm{sim}(v_i, v_{i+1})$.
The average impact over all nodes after the i-th iteration, average_impact(i), is used as the criterion for whether the category diffusion has reached equilibrium:
Figure PCTCN2018116169-appb-000004 [definition of average_impact(i)]
If the average impact over the nodes after the i-th iteration is below a given threshold, the diffusion is considered to have reached equilibrium and the iterative category diffusion process is terminated.
When diffusion reaches equilibrium, for the category information vector $v = (p(c_1), p(c_2), \ldots, p(c_n))$ of each node r in the graph, the category corresponding to the largest dimension of the vector is taken as the category of that node: $\mathrm{type}(v) = \arg\max_{c_i} p(c_i)$.
Different categories correspond to different technical levels.
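The category diffusion procedure described above can be sketched in a few lines. This is an illustration under stated assumptions: the per-node category vectors are stacked as rows so that one update computes each node's new vector as the M′-weighted sum of all nodes' vectors; the average impact is taken as the plain mean over all nodes (its defining formula is only available as an image); and the threshold, iteration cap, and all names are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity; defined as 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def propagate_labels(m_norm, seeds, num_classes, threshold=1e-4, max_iters=1000):
    """m_norm: row-normalized n x n weights; seeds: {node index: category index}."""
    n = len(m_norm)
    vectors = [[0.0] * num_classes for _ in range(n)]
    for node, cls in seeds.items():
        vectors[node][cls] = 1.0  # one-hot vector for initially labeled nodes
    for _ in range(max_iters):
        # One diffusion step: each node takes the weighted sum of all vectors.
        updated = [
            [sum(m_norm[i][j] * vectors[j][c] for j in range(n))
             for c in range(num_classes)]
            for i in range(n)
        ]
        # Restore the labeled nodes so they stay consistent with their labels.
        for node, cls in seeds.items():
            updated[node] = [1.0 if c == cls else 0.0 for c in range(num_classes)]
        impacts = [1.0 - cosine(vectors[i], updated[i]) for i in range(n)]
        vectors = updated
        if n == 0 or sum(impacts) / n < threshold:
            break  # diffusion has reached equilibrium
    # Each node takes the category of its largest component.
    return [max(range(num_classes), key=lambda c: v[c]) for v in vectors]


m_norm = [[0.5, 0.4, 0.1], [0.6, 0.3, 0.1], [0.1, 0.2, 0.7]]
labels = propagate_labels(m_norm, seeds={0: 0, 2: 1}, num_classes=2)
```

In this example node 1 is unlabeled and more strongly connected to node 0 than to node 2, so it ends up in node 0's category.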
Step S24: Reply to the question information sent by the first user according to the technical level parameter of the first user tag cluster.
In the data processing method disclosed in this embodiment, when question information sent by a first user is received, the user information of the first user is obtained, a first user tag cluster is determined according to that user information and taken as the user tag cluster of the first user, and the question information is answered according to the technical level parameter of the first user tag cluster. By determining the user tag cluster of each user from that user's information, the solution answers each question according to the technical level parameter of the asker's cluster, and thus provides answers targeted to users' different levels of expertise.
This embodiment discloses a data processing method, whose flowchart is shown in FIG. 3, including:
Step S31: When question information sent by a first user is received, obtain the user information of the first user;
Step S32: Look up a user relationship graph, which includes no fewer than two users and the similarity between every two users;
Step S33: When the user relationship graph includes the first user, set question clusters for the question information sent by each of the no fewer than two users in the graph;
Step S34: Determine the proportion of users in each question cluster, where the proportion of users in a question cluster is the ratio of the number of users who asked questions in that cluster to the number of users who asked questions in all question clusters;
Step S35: Determine the number of initial users for each question cluster according to the proportion of users in that cluster;
Step S36: Determine the initial users according to the number of initial users for each question cluster, and set a user tag cluster for each initial user;
Question clusters are set for all questions asked by all users in the user relationship graph, and each question cluster contains no fewer than one question.
For example, five question clusters are set: the first contains 100 questions, the second 300, the third 200, the fourth 400, and the fifth 500.
Within each question cluster, one question corresponds to one user. The 100 questions in the first cluster therefore correspond to 100 users, that is, 100 users have asked questions belonging to the first cluster; the 300 questions in the second cluster correspond to 300 users, that is, 300 users have asked questions belonging to the second cluster.
The proportion of users in each question cluster is the ratio of the number of users who asked questions in that cluster to the number of users who asked questions in all clusters. Here the five clusters contain 1500 questions in total, corresponding to 1500 users, so the proportion for the first cluster is 100/1500, that is, 1/15; for the second, 300/1500, that is, 3/15; for the third, 200/1500, that is, 2/15; for the fourth, 400/1500, that is, 4/15; and for the fifth, 500/1500, that is, 5/15.
The number of initial users for each question cluster is determined by that cluster's proportion of users. The first cluster's proportion is 1/15, so the users drawn from the first cluster make up 1/15 of all initial users. If there are 15 initial users in total, 1 user is selected from the first cluster, 3 from the second, 2 from the third, 4 from the fourth, and 5 from the fifth. In other words, the number of initial users drawn from each question cluster is directly proportional to that cluster's share of the users across all question clusters.
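The proportional allocation in the example above can be sketched as follows. The function name `allocate_initial_users` is hypothetical, and the handling of shares that do not divide evenly (a largest-remainder rule) is an assumption, since the text only gives an example where the shares are whole numbers.

```python
def allocate_initial_users(cluster_user_counts, total_initial_users):
    """Split the initial users across question clusters in proportion to cluster size."""
    total = sum(cluster_user_counts)
    # Integer part of each cluster's proportional share.
    quotas = [total_initial_users * c // total for c in cluster_user_counts]
    remainders = [total_initial_users * c % total for c in cluster_user_counts]
    # Assumed tie-breaking: hand leftover seats to the largest remainders.
    leftover = total_initial_users - sum(quotas)
    by_remainder = sorted(range(len(quotas)), key=lambda i: remainders[i], reverse=True)
    for idx in by_remainder[:leftover]:
        quotas[idx] += 1
    return quotas


# The five question clusters from the text, with 15 initial users in total.
allocation = allocate_initial_users([100, 300, 200, 400, 500], 15)
```

With the cluster sizes from the text, `allocation` is `[1, 3, 2, 4, 5]`, matching the worked example.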
Step S37: Determine the user tag clusters of the users other than the initial users according to the similarity between every two users in the user relationship graph and the iterative function;
Step S38: Reply to the question information sent by the first user according to the technical level parameter of the first user tag cluster corresponding to the first user.
In the data processing method disclosed in this embodiment, when question information sent by a first user is received, the user information of the first user is obtained, a first user tag cluster is determined according to that user information and taken as the user tag cluster of the first user, and the question information is answered according to the technical level parameter of the first user tag cluster. By determining the user tag cluster of each user from that user's information, the solution answers each question according to the technical level parameter of the asker's cluster, and thus provides answers targeted to users' different levels of expertise.
This embodiment discloses a data processing method, whose flowchart is shown in FIG. 4, including:
Step S41: When question information sent by a first user is received, obtain the user information of the first user;
Step S42: Determine, according to the user information of the first user, a ranking of the similarity between the first user and each of a first number of user tag clusters;
Step S43: Determine the first user tag cluster according to the similarity, and determine the first user tag cluster as the user tag cluster of the first user;
A first number of user tag clusters are set in advance according to user features.
When the question information sent by the first user is received, a ranking of the similarity between the first user and the first number of user tag clusters is determined according to the user information of the first user.
That is, the user features of the first user are determined from the user information, the similarity between those features and each of the user tag clusters is computed, the similarities are ranked from high to low, and the user tag cluster with the highest similarity is selected as the first user tag cluster, that is, the user tag cluster of the first user.
For example, five user tag clusters are set in advance, and the similarity between the first user's features and the five clusters, ranked from high to low, is: user tag cluster C → user tag cluster D → user tag cluster A → user tag cluster E → user tag cluster B. Cluster C has the highest similarity to the first user's features and cluster B the lowest, so cluster C is set as the first user tag cluster, that is, cluster C is determined as the user tag cluster of the first user.
Alternatively, the user tag cluster with the highest similarity to the first user's features may be selected directly and determined as the user tag cluster of the first user, without ranking the similarities.
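Both variants, ranking all clusters or picking the best one directly, can be sketched as below. The text does not specify how a cluster is represented or which similarity measure is used; representing each cluster by a centroid feature vector and using cosine similarity are assumptions, and the names and sample values are illustrative.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity; defined as 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_clusters(user_features, cluster_centroids):
    """Return cluster names ordered from most to least similar to the user."""
    scores = {
        name: cosine_similarity(user_features, centroid)
        for name, centroid in cluster_centroids.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


def assign_cluster(user_features, cluster_centroids):
    """Pick the most similar cluster directly, without building a ranking."""
    return max(
        cluster_centroids,
        key=lambda name: cosine_similarity(user_features, cluster_centroids[name]),
    )


# Hypothetical centroids for three clusters and one user's normalized features.
centroids = {"A": [1.0, 0.0], "B": [0.0, 1.0], "C": [0.7, 0.7]}
ranking = rank_clusters([0.6, 0.8], centroids)
best = assign_cluster([0.6, 0.8], centroids)
```

The direct selection avoids sorting and gives the same result as taking the first entry of the ranking.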
Further, at fixed intervals the user tag clusters are reset according to the user features of all users. That is, as more users ask questions, the user base behind the user tag clusters grows, and new user tag clusters are determined from the user features of all users, both new and existing.
Step S44: Reply to the question information sent by the first user according to the technical level parameter of the first user tag cluster.
In the data processing method disclosed in this embodiment, when question information sent by a first user is received, the user information of the first user is obtained, a first user tag cluster is determined according to that user information and taken as the user tag cluster of the first user, and the question information is answered according to the technical level parameter of the first user tag cluster. By determining the user tag cluster of each user from that user's information, the solution answers each question according to the technical level parameter of the asker's cluster, and thus provides answers targeted to users' different levels of expertise.
This embodiment discloses a data processing method, whose flowchart is shown in FIG. 5, including:
Step S51: When question information sent by a first user is received, determine whether other question information is received within a first time interval around the receipt of the question information;
Step S52: When other question information is received within the first time interval, merge the question information with the other question information;
When the question information sent by the first user is received, it is first determined whether other question information is received within the first time interval around its receipt. The first time interval may consist of a first predetermined duration before the moment the question information is received and a second predetermined duration after that moment. If other question information is received within the first time interval, it is merged with the question information, so that the user can be answered once rather than several times, and so that a question the user splits across several messages does not become unclear.
Further, messages sent by the user whose length is below a first threshold may be filtered out, such as greetings and small talk, for example "Hi" or "Hello".
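The merging and filtering steps can be sketched as follows. The `Message` type, the 30-second window on each side, the minimum length of 6 characters, and joining the parts with spaces are all assumptions for illustration; the sketch also assumes that only messages from the same user are merged, which the text implies but does not state.

```python
from dataclasses import dataclass


@dataclass
class Message:
    user_id: str
    timestamp: float  # seconds
    text: str


def merge_question(messages, target, before=30.0, after=30.0, min_length=6):
    """Merge the target message with the same user's messages inside the window.

    The window spans `before` seconds ahead of the target and `after` seconds
    behind it. Messages shorter than `min_length` characters are dropped.
    """
    start = target.timestamp - before
    end = target.timestamp + after
    parts = [
        m for m in messages
        if m.user_id == target.user_id
        and start <= m.timestamp <= end
        and len(m.text.strip()) >= min_length  # filters greetings such as "Hi"
    ]
    parts.sort(key=lambda m: m.timestamp)
    return " ".join(m.text.strip() for m in parts)


msgs = [
    Message("u1", 0.0, "Hello"),
    Message("u1", 5.0, "My phone will not turn on"),
    Message("u1", 12.0, "after the latest update"),
    Message("u2", 6.0, "Different user question"),
    Message("u1", 500.0, "Unrelated later question"),
]
merged = merge_question(msgs, msgs[1])
```

Here the greeting is dropped, the two fragments sent 7 seconds apart are joined into one question, and the other user's message and the much later message are left out.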
Step S53: Obtain the user information of the first user;
Step S54: Determine a first user tag cluster according to the user information of the first user, and determine the first user tag cluster as the user tag cluster of the first user;
Step S55: Reply to the merged question information sent by the first user according to the technical level parameter of the first user tag cluster.
In the data processing method disclosed in this embodiment, when question information sent by a first user is received, the user information of the first user is obtained, a first user tag cluster is determined according to that user information and taken as the user tag cluster of the first user, and the question information is answered according to the technical level parameter of the first user tag cluster. By determining the user tag cluster of each user from that user's information, the solution answers each question according to the technical level parameter of the asker's cluster, and thus provides answers targeted to users' different levels of expertise.
本实施例公开了一种电子设备,其结构示意图如图6所示,包括:This embodiment discloses an electronic device. The structure diagram is shown in FIG. 6 and includes:
处理器61及存储器62。The processor 61 and the memory 62.
其中,存储器62用于存储用户标签聚类及与用户标签聚类对应的技术水平参数。The memory 62 is configured to store user tag clusters and technical level parameters corresponding to the user tag clusters.
处理器61用于当接收到第一用户发送的问题信息时,获取第一用户的用户信息,根据第一用户的用户信息确定第一用户标签聚类,将第一用户标签聚类确定为第一用户的用户标签聚类,依据第一用户标签聚类的技术水平参数回复第一用户发送的问题信息。The processor 61 is configured to obtain user information of the first user when receiving the question information sent by the first user, determine the first user tag cluster according to the user information of the first user, and determine the first user tag cluster as the first The user tag clustering of a user responds to the question information sent by the first user according to the technical level parameters of the first user tag clustering.
In a customer service system, web page, forum, or similar system, a user who asks a question may be logged in. When the user sends question information, the user's information can be obtained from the account the user is logged in to.
The user information may be personal information that the user filled in when registering or updating the logged-in account, or information such as questions or posts previously sent from that account.
For example, in a customer service system, a user may first log in to an account and then send a question to customer service. The customer service system then obtains the user's information from the account, such as age, how long the user has used the product, the number of products used, and questions the user has asked before.
In this embodiment of the present invention, multiple user tag clusters may be set in advance. Different user tag clusters correspond to different technical level parameters, and each technical level parameter indicates the technical level of the users in the corresponding user tag cluster.
For example, three user tag clusters are set in advance: a first, a second, and a third user tag cluster. The first user tag cluster corresponds to a high technical level, and its technical level parameter may be 1; the second corresponds to a medium technical level, and its parameter may be 2; the third corresponds to a low technical level, and its parameter may be 3.
Determining the first user tag cluster according to the user information of the first user may specifically be: comprehensively evaluating the user's tag cluster based on the multiple items of user information corresponding to the first user.
For example, the product the user asks about is an electronic product, the user is 30 years old and has used the product for 3 years, and the questions the user has asked before are fairly specialized. Because younger people tend to understand electronic products well, the user has used the product for 3 years, and the user's past questions are fairly specialized, it can be determined that the user belongs to the first user tag cluster, with a high technical level.
If instead the product the user asks about is an electronic product, the user is 65 years old, and the user has used the product for 3 months, then, because older people tend to know less about electronic products and the user has used the product only briefly, it can be determined that the user belongs to the third user tag cluster, with a low technical level.
In the data processing method disclosed in this embodiment, the user tag cluster of a user may also be determined in other ways, which are not specifically limited here.
Because different user tag clusters correspond to different technical level parameters, the technical level of the users in a cluster can be determined from its technical level parameter, and each user's question is then answered according to that user's technical level.
Different technical level parameters correspond to different ways of answering. For example, for a user with a low technical level, the answer uses plain language, keeps technical terms to a minimum, and is longer; for a user with a high technical level, the answer may use more specialized terminology and is mainly brief.
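The mapping from technical level parameter to answer style can be sketched as a simple lookup. This is only an illustration: the names `REPLY_STYLES` and `choose_reply`, the style labels, and the style assigned to the medium level are assumptions that do not appear in the source; only the parameter values 1 (high) and 3 (low) and their described styles follow the example above.

```python
# Hypothetical lookup from technical level parameter to an answer style.
# Levels 1 and 3 follow the example in the text; level 2 is an assumed middle ground.
REPLY_STYLES = {
    1: {"vocabulary": "technical", "length": "short"},   # high technical level
    2: {"vocabulary": "mixed", "length": "medium"},      # medium level (assumed style)
    3: {"vocabulary": "plain", "length": "long"},        # low technical level
}


def choose_reply(level, answers):
    """Return the answer variant whose vocabulary matches the given level.

    `answers` maps a vocabulary label ("technical", "mixed", "plain")
    to a prepared answer text for the same question.
    """
    style = REPLY_STYLES[level]
    return answers[style["vocabulary"]]


answers = {
    "technical": "Re-flash the firmware over USB.",
    "mixed": "Reinstall the system software using the USB update tool.",
    "plain": "Connect the phone to your computer with the cable, open the "
             "update program, and follow the steps shown on screen.",
}
reply_for_expert = choose_reply(1, answers)
reply_for_novice = choose_reply(3, answers)
```

A real system would more likely generate or select answers per style rather than store three fixed variants; the lookup only shows where the technical level parameter enters the decision.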
The processor 61 determining the first user tag cluster according to the user information of the first user includes:
The processor is configured to look up a user relationship graph, which includes no fewer than two users and the similarity between every two users; when the user relationship graph includes the first user, the first user tag cluster of the first user is determined according to the user tag clusters of the initial users in the user relationship graph.
A user relationship graph is stored in advance. The graph contains no fewer than two users and the similarity between every two users.
Specifically, constructing the user relationship graph requires first extracting user features and computing the feature values, and then constructing a user-feature matrix.
Extracting user features may specifically be: extracting predefined user features and question features, as shown in Table 1 (user features) and Table 2 (user question features):
Table 1

| User feature | Feature value | Feature description |
| --- | --- | --- |
| User age | Integer value | Age estimated from the information the user filled in at registration |
| Number of phones used | Integer value | Number of times the user has purchased a phone |
| Phone usage time span | Integer value | Interval between the user's first phone purchase and the question |
Table 2

| User question feature | Feature value | Feature description |
| --- | --- | --- |
| Frequency of technical terms in user questions | Integer value | How often technical terms appear in the user's past questions |
| Representativeness of the user question | Integer value | Number of samples in the cluster that contains the user's question |
| Level of detail of the answer | Integer value | Number of characters in the answer to the user's question |
| Number of user questions | Integer value | Number of the user's past questions |
| User conversation time | Integer value | Average duration of the user's conversations |
| Number of interaction rounds | Integer value | Number of rounds of interaction between the user and customer service staff |
Table 1 lists the name, value, and description of each user feature. For example, when the question sent by the user in the customer service system is an inquiry about a mobile phone: the user age feature is estimated from the information the user filled in at registration; the number-of-phones feature is determined from the number of times the user has purchased a phone; and the phone usage time span feature is determined from the interval between the user's first phone purchase and the current question.
Table 2 lists the name, value, and description of each user question feature. For example, when the question sent by the user in the customer service system is an inquiry about a mobile phone: the technical term frequency feature is determined from how often technical terms appear in the user's past questions; the representativeness feature is determined from the number of samples in the cluster that contains the user's question; the answer detail feature is determined from the number of characters in the answer to the user's question; the number-of-questions feature is determined from the number of the user's past questions; the conversation time feature is determined from the average duration of the user's past question conversations in the customer service system; and the interaction rounds feature is determined from the number of rounds of interaction between the user and customer service in the user's past questions.
A user-feature matrix M is constructed, in which each row represents one user and each column represents one feature dimension; each column is then normalized.
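As a minimal sketch of this step, the following builds a small user-feature matrix and normalizes each column. The source does not say which normalization is used; min-max scaling to [0, 1] is an assumption here, and the function name `normalize_columns` and the sample values are hypothetical.

```python
def normalize_columns(matrix):
    """Min-max scale each column of a user-by-feature matrix to [0, 1].

    Min-max scaling is an assumed choice; the source only says that
    each column is normalized.
    """
    scaled_columns = []
    for column in zip(*matrix):
        low, high = min(column), max(column)
        span = high - low
        # A constant column carries no information, so map it to 0.0.
        scaled_columns.append([(x - low) / span if span else 0.0 for x in column])
    # Transpose back so that each row is again one user.
    return [list(row) for row in zip(*scaled_columns)]


# Rows are users; columns are age, phones used, months since first purchase.
M = [
    [30, 2, 36],
    [65, 1, 3],
    [40, 3, 60],
]
normalized = normalize_columns(M)
```

Normalizing per column keeps a feature with a large numeric range, such as the usage time span, from dominating a similarity computed over the whole row.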
Assume the user relationship graph G is fully connected:
The user relationship graph G consists of users U and the similarities between users: the nodes of G are users, and the edges are the similarities between users. A node is defined as $V_i = \langle U_i, Q_i \rangle$, where $V_i$ is the i-th node, $U_i$ is the user information part of $V_i$, and $Q_i$ is the user question part of $V_i$. The edge between two nodes represents their similarity. The similarity between nodes $V_i = \langle U_i, Q_i \rangle$ and $V_j = \langle U_j, Q_j \rangle$ is calculated as follows:
$\mathrm{sim}(V_i, V_j) = \alpha\,\mathrm{sim}(U_i, U_j) + (1 - \alpha)\,\mathrm{sim}(Q_i, Q_j)$  Formula (1)
Figure PCTCN2018116169-appb-000005
Figure PCTCN2018116169-appb-000006
Here, $U_i$, $U_j$, $Q_i$, and $Q_j$ in formulas (1), (2), and (3) are normalized feature values, and δ and γ are constants.
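Formula (1) can be sketched as follows. Formulas (2) and (3) are available only as images, so the component similarities below use a Gaussian kernel purely as a stand-in: the kernel itself, the use of δ and γ as kernel widths, and the names `gaussian_similarity` and `node_similarity` are all assumptions, not the source's definitions.

```python
import math


def gaussian_similarity(a, b, width):
    """Assumed kernel exp(-||a - b||^2 / (2 * width^2)); not taken from the source."""
    squared_distance = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-squared_distance / (2 * width ** 2))


def node_similarity(u_i, q_i, u_j, q_j, alpha=0.5, delta=1.0, gamma=1.0):
    """Formula (1): alpha * sim(U_i, U_j) + (1 - alpha) * sim(Q_i, Q_j).

    u_* are normalized user feature vectors and q_* are normalized
    question feature vectors. delta and gamma are treated here as the
    widths of the two assumed kernels.
    """
    user_part = gaussian_similarity(u_i, u_j, delta)
    question_part = gaussian_similarity(q_i, q_j, gamma)
    return alpha * user_part + (1 - alpha) * question_part


similarity = node_similarity([0.0, 0.5], [0.2, 0.1], [1.0, 0.0], [0.3, 0.1])
```

Whatever component similarities are used, α weights the user information part against the question part, so α = 1 ignores question features and α = 0 ignores user features.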
The above steps complete the construction of the user relationship graph.
When the users in the constructed user relationship graph include the first user, that is, the user who sent the question information, the first user tag cluster of the first user is determined directly from the user tag clusters of the initial users in the user relationship graph.
Specifically, initial users are determined in advance from the no fewer than two users in the user relationship graph, user tag clusters are set for the initial users, and the user tag clusters of the remaining users in the graph are determined according to the similarity between every two users in the graph and an iterative function.
Once the user tag clusters of all users in the user relationship graph have been determined, the user tag cluster to which the first user belongs has been determined as well.
Further, determining the user tag clusters of the users other than the initial users according to the similarity between every two users in the user relationship graph and the iterative function may specifically be:
Let the n×n matrix M be the edge weight matrix of the user relationship graph G, where the element $m_{ij}$ represents the similarity between nodes $r_i$ and $r_j$. Each row vector of M is then normalized to obtain the matrix M′; each element of M′ is calculated by formula (4), so that the entries of each row vector of M′ sum to 1.
Figure PCTCN2018116169-appb-000007
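The formula image is not reproduced in this text. One row normalization consistent with the stated property, that the entries of every row of M′ sum to 1, would be the following. It is a reconstruction from the surrounding description, not a transcription of the original figure:

```latex
m'_{ij} = \frac{m_{ij}}{\sum_{k=1}^{n} m_{ik}},
\qquad \text{so that} \qquad
\sum_{j=1}^{n} m'_{ij} = 1 \quad \text{for every row } i .
```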
A class information vector is set for each node in the graph. For an initially labeled node, its class vector is set to $v = (0, \ldots, 1_t, \ldots, 0)_n$.
Take n = 2 as an example:
For a labeled node, let its class vector be $v = (0, \ldots, 1_t, \ldots, 0)_n$; the t-th dimension of the vector is 1 and all other dimensions are 0. In step k+1 of the iteration, the class vector v of each node r is rewritten as $v^{k+1} = M' v^{k}$.
During class diffusion, after the class vector of every node has been updated in an iteration, the class vectors of the initially labeled nodes are restored to their initial settings so that they remain consistent with their labels. For every other, unlabeled node, after the i-th iteration the cosine similarity $\mathrm{sim}(v_i, v_{i+1})$ between the node's class vectors before and after the iteration is calculated, and the impact of the i-th iteration on the node is recorded as $\mathrm{impact}(v_i) = 1 - \mathrm{sim}(v_i, v_{i+1})$.
The average impact over all nodes after the i-th iteration, average_impact(i), serves as the criterion for whether class diffusion has reached equilibrium:
Figure PCTCN2018116169-appb-000008
If the average node impact after the i-th iteration is below a given threshold, diffusion is considered to have reached equilibrium, and the iterative class diffusion process terminates.
When diffusion reaches equilibrium, for the class information vector $v = (p(c_1), p(c_2), \ldots, p(c_n))$ of each node r in the graph, the class corresponding to the largest dimension of the vector is taken as the class of that node: $\mathrm{type}(v) = \arg\max_i p(c_i)$.
Here, the different classes are the different technical levels.
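The diffusion procedure described above can be sketched in plain Python. Several details are assumptions rather than statements from the source: unlabeled nodes start from a uniform class vector, the threshold and iteration cap are arbitrary, and the names `row_normalize`, `cosine`, and `propagate_labels` are hypothetical.

```python
import math


def row_normalize(weights):
    """Scale each row of the edge weight matrix so its entries sum to 1."""
    return [[w / sum(row) for w in row] for row in weights]


def cosine(a, b):
    """Cosine similarity of two vectors; returns 1.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 1.0


def propagate_labels(weights, seeds, num_classes, threshold=1e-4, max_iter=1000):
    """Spread class labels from initial users to all nodes of the graph.

    `seeds` maps node index -> class index for the initially labeled users.
    Returns the class index with the largest score for every node.
    """
    n = len(weights)
    m = row_normalize(weights)

    def one_hot(t):
        return [1.0 if c == t else 0.0 for c in range(num_classes)]

    # Assumed start: labeled nodes are one-hot, the rest are uniform.
    vectors = [one_hot(seeds[i]) if i in seeds else [1.0 / num_classes] * num_classes
               for i in range(n)]
    for _ in range(max_iter):
        # Each node takes the similarity-weighted average of all class vectors.
        updated = [[sum(m[i][j] * vectors[j][c] for j in range(n))
                    for c in range(num_classes)] for i in range(n)]
        # Restore the labeled nodes to their initial vectors.
        for i, t in seeds.items():
            updated[i] = one_hot(t)
        # impact = 1 - cosine similarity of the vectors before and after.
        impacts = [1.0 - cosine(vectors[i], updated[i]) for i in range(n)]
        vectors = updated
        if sum(impacts) / n < threshold:
            break  # diffusion is considered to have reached equilibrium
    return [max(range(num_classes), key=lambda c: v[c]) for v in vectors]


weights = [
    [1.0, 0.9, 0.1, 0.1],
    [0.9, 1.0, 0.2, 0.1],
    [0.1, 0.2, 1.0, 0.8],
    [0.1, 0.1, 0.8, 1.0],
]
labels = propagate_labels(weights, seeds={0: 0, 3: 1}, num_classes=2)
```

In this toy graph, users 0 and 3 are the initial users; user 1 is most similar to user 0 and user 2 to user 3, so each unlabeled user ends up with the class of its closest initial user.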
The processor 61 determining the initial users from the no fewer than two users in the user relationship graph includes:
The processor is configured to set question clusters for the question information sent by each of the no fewer than two users in the user relationship graph, and to determine the proportion of people in each question cluster, where the proportion of people in a question cluster is the ratio of the number of users who raised questions in that cluster to the number of users who raised questions in all question clusters. The number of initial users for each question cluster is determined according to that cluster's proportion, and the initial users are determined according to the number of initial users for each question cluster.
Question clusters are set for all the questions raised by all users in the user relationship graph, and each question cluster contains at least one question.
For example, five question clusters are set: the first contains 100 questions, the second 300, the third 200, the fourth 400, and the fifth 500.
Within each question cluster, one question corresponds to one user. The 100 questions in the first question cluster therefore correspond to 100 users, that is, 100 users have raised questions belonging to the first question cluster; likewise, the 300 questions in the second question cluster correspond to 300 users.
The proportion of people in each question cluster is then determined, that is, the ratio of the number of users who raised questions in that cluster to the number of users who raised questions in all clusters. Here the five question clusters contain 1500 questions in total, corresponding to 1500 users, so the proportion for the first question cluster is 100/1500, or 1/15; for the second, 300/1500, or 3/15; for the third, 200/1500, or 2/15; for the fourth, 400/1500, or 4/15; and for the fifth, 500/1500, or 5/15.
The number of initial users for each question cluster is determined from that cluster's proportion. Since the proportion for the first question cluster is 1/15, the users drawn from the first question cluster make up 1/15 of all initial users. If there are 15 initial users in total, 1 user is selected from the first question cluster as an initial user, 3 from the second, 2 from the third, 4 from the fourth, and 5 from the fifth. In other words, the number of initial users drawn from each question cluster is directly proportional to that cluster's share of the users across all question clusters.
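The proportional allocation in this example can be sketched as below. The source does not say what to do when the proportions do not divide evenly; the largest-remainder rounding used here is an assumption, as is the function name `allocate_initial_users`.

```python
def allocate_initial_users(cluster_sizes, total_initial):
    """Split `total_initial` initial users across question clusters.

    Each cluster receives a share proportional to its number of users.
    Leftover seats go to the clusters with the largest remainders
    (an assumed tie-breaking rule; the source does not specify one).
    """
    total = sum(cluster_sizes)
    shares = [divmod(size * total_initial, total) for size in cluster_sizes]
    counts = [quotient for quotient, _ in shares]
    by_remainder = sorted(range(len(shares)),
                          key=lambda i: shares[i][1], reverse=True)
    for i in by_remainder[: total_initial - sum(counts)]:
        counts[i] += 1
    return counts


# The five question clusters from the example, with 15 initial users in total.
counts = allocate_initial_users([100, 300, 200, 400, 500], 15)
print(counts)  # [1, 3, 2, 4, 5]
```

Which specific users are picked within each cluster is left open by the source; random sampling within the cluster would be one option.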
The processor 61 determining the first user tag cluster according to the user information of the first user includes:
The processor determines, according to the user information of the first user, a ranking of the similarity between the first user and a first number of user tag clusters, and determines the first user tag cluster according to the similarity.
A first number of user tag clusters are set in advance according to user features.
When the question information sent by the first user is received, the ranking of the similarity between the first user and the first number of user tag clusters is determined according to the user information of the first user.
That is, the user features of the first user are determined from the user information of the first user, the similarity between those features and each of the user tag clusters is determined, and the similarities are ranked from high to low. The user tag cluster with the highest similarity is selected and determined to be the first user tag cluster, that is, the user tag cluster of the first user.
For example, five user tag clusters are set in advance, and the clusters ranked from highest to lowest similarity to the first user's features are: user tag cluster C → user tag cluster D → user tag cluster A → user tag cluster E → user tag cluster B. Cluster C is then the most similar to the first user's features and cluster B the least similar, so cluster C is set as the first user tag cluster, that is, cluster C is determined to be the user tag cluster of the first user.
Further, the user tag cluster with the highest similarity to the first user's features may instead be selected directly and determined to be the user tag cluster of the first user, without ranking the similarities.
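Both variants, ranking all clusters and picking the best one directly, can be sketched as follows. The source does not define how the similarity between a user and a cluster is computed; representing each cluster by a centroid and using negative Euclidean distance as the similarity are assumptions, as are the names `rank_clusters` and `most_similar_cluster` and the sample values.

```python
import math


def cluster_similarity(user_features, centroid):
    """Assumed similarity: the closer the centroid, the higher the score."""
    return -math.dist(user_features, centroid)


def rank_clusters(user_features, centroids):
    """Return cluster names ordered from most to least similar."""
    return sorted(centroids,
                  key=lambda name: cluster_similarity(user_features, centroids[name]),
                  reverse=True)


def most_similar_cluster(user_features, centroids):
    """Pick the best cluster directly, without building the full ranking."""
    return max(centroids,
               key=lambda name: cluster_similarity(user_features, centroids[name]))


# Hypothetical one-dimensional centroids for five user tag clusters.
centroids = {"A": [0.3], "B": [0.0], "C": [0.5], "D": [0.6], "E": [0.9]}
ranking = rank_clusters([0.5], centroids)
best = most_similar_cluster([0.5], centroids)
```

With these sample values the ranking reproduces the order in the example (C, D, A, E, B), and the direct selection returns C without sorting.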
Further, at fixed intervals, the user tag clusters are reset according to the user features of all users. That is, as more users ask questions, the user base of the user tag clusters grows, and new user tag clusters are determined from the user features of all users, new and existing.
The processor 61 is further configured to: receive the question information sent by the first user; determine whether other question information is received within a first time interval of receiving the question information; and, when other question information is received within the first time interval, merge the question information with the other question information.
When the question information sent by the first user is received, it is first determined whether other question information is received within the first time interval. The first time interval may consist of a first predetermined duration before the moment the question information is received and a second predetermined duration after that moment. If other question information is also received within the first time interval, the question information is merged with it, so that the user can be answered once rather than several times, and so that a question the user splits across several messages does not become unclear.
Further, messages from the user whose length is below a first threshold may be filtered out, such as greetings and pleasantries, for example "Hi" or "Hello".
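The merging and filtering steps can be sketched together. This is a simplification under stated assumptions: the window before and after each message is approximated by a maximum gap between consecutive messages, and the function name `merge_questions`, the 30-second gap, and the length threshold of 6 characters are hypothetical values.

```python
def merge_questions(messages, window_seconds=30.0, min_length=6):
    """Merge one user's messages that arrive close together in time.

    `messages` is a list of (timestamp_in_seconds, text) pairs sorted by
    time. Messages shorter than `min_length` characters (greetings such
    as "Hi" or "Hello") are dropped before merging.
    """
    merged = []
    last_time = None
    for timestamp, text in messages:
        text = text.strip()
        if len(text) < min_length:
            continue  # filter out short greetings
        if merged and timestamp - last_time <= window_seconds:
            merged[-1] = merged[-1] + " " + text  # same question, continued
        else:
            merged.append(text)  # a new question starts here
        last_time = timestamp
    return merged


messages = [
    (0, "Hi"),
    (2, "My phone will not charge"),
    (10, "since the last update"),
    (300, "How do I reset it?"),
]
questions = merge_questions(messages)
```

Here the greeting is dropped, the two messages sent 8 seconds apart are answered as one question, and the message sent several minutes later is treated as a separate question.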
The electronic device disclosed in this embodiment includes a memory and a processor. The processor is configured to: when question information sent by a first user is received, obtain the user information of the first user; determine a first user tag cluster according to that user information; determine the first user tag cluster to be the user tag cluster of the first user; and answer the question information sent by the first user according to the technical level parameter of the first user tag cluster. By determining the user tag cluster that corresponds to each user from that user's information, this solution answers each user's question according to the technical level parameter of the user's tag cluster, and thus provides targeted answers for users of different professional levels.
According to an embodiment of the present invention, the processor 61 may include, for example, a general-purpose microprocessor, an instruction set processor and/or an associated chipset, and/or a special-purpose microprocessor (for example, an application-specific integrated circuit (ASIC)). The processor 61 may also include on-board memory for caching. The memory 62 may be, for example, a non-volatile computer-readable storage medium; specific examples include, but are not limited to: magnetic storage devices, such as magnetic tape or a hard disk drive (HDD); optical storage devices, such as a compact disc (CD-ROM); and memory, such as random access memory (RAM) or flash memory.
The embodiments in this specification are described progressively. Each embodiment focuses on its differences from the other embodiments, and the parts that are the same or similar across embodiments can be cross-referenced. Because the device disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief; for the relevant details, refer to the description of the method.
Those skilled in the art will further appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To illustrate the interchangeability of hardware and software clearly, the composition and steps of each example have been described above in general terms according to their functions. Whether these functions are performed in hardware or software depends on the specific application and the design constraints of the technical solution. Those skilled in the art may use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of the present invention.
The steps of the method or algorithm described in connection with the embodiments disclosed herein may be implemented directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in random access memory (RAM), memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The above description of the disclosed embodiments enables those skilled in the art to implement or use the present invention. Various modifications to these embodiments will be apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the invention. Therefore, the present invention is not limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

  1. A data processing method, comprising:
    when question information sent by a first user is received, obtaining user information of the first user;
    determining a first user tag cluster according to the user information of the first user, and determining the first user tag cluster to be the user tag cluster of the first user;
    answering the question information sent by the first user according to a technical level parameter of the first user tag cluster.
  2. The method according to claim 1, wherein determining the first user tag cluster according to the user information of the first user comprises:
    looking up a user relationship graph, the user relationship graph comprising no fewer than two users and the similarity between every two users;
    when the user relationship graph includes the first user, determining the first user tag cluster of the first user according to the user tag cluster of an initial user in the user relationship graph.
  3. The method according to claim 1, wherein determining the first user tag cluster according to the user information of the first user comprises:
    determining, according to the user information of the first user, a ranking of the similarity between the first user and a first number of user tag clusters;
    determining the first user tag cluster according to the similarity.
  4. The method according to claim 2, wherein determining the first user tag cluster of the first user according to the user tag cluster of the initial user in the user relationship graph comprises:
    determining an initial user from the no fewer than two users in the user relationship graph, and setting a user tag cluster for the initial user;
    determining, according to the similarity between every two users in the user relationship graph and an iterative function, the user tag clusters of the users other than the initial user among the no fewer than two users in the user relationship graph.
  5. The method according to claim 4, wherein determining the initial user from among the no fewer than two users in the user relationship graph comprises:
    setting a question cluster for the question information sent by each of the no fewer than two users in the user relationship graph;
    determining the proportion of users in each question cluster, wherein the proportion of users in a question cluster is the ratio of the number of users who raised questions under that question cluster to the number of users who raised questions under all question clusters;
    determining the number of initial users corresponding to each question cluster according to the proportion of users in that question cluster;
    determining the initial users according to the number of initial users corresponding to each question cluster.
  6. The method according to claim 1, further comprising:
    receiving the question information sent by the first user, and determining whether other question information is received within a first time interval of receiving the question information;
    when other question information is received within the first time interval of receiving the question information, merging the question information with the other question information.
  7. An electronic device, comprising a processor and a memory, wherein:
    the memory is configured to store user tag clusters and the technical level parameters corresponding to the user tag clusters;
    the processor is configured to: upon receiving question information sent by a first user, obtain user information of the first user; determine a first user tag cluster according to the user information of the first user; take the first user tag cluster as the user tag cluster of the first user; and reply to the question information sent by the first user according to the technical level parameter of the first user tag cluster.
  8. The electronic device according to claim 7, wherein the processor determining the first user tag cluster according to the user information of the first user comprises:
    the processor looking up a user relationship graph, the user relationship graph comprising no fewer than two users and the similarity between every two users, and, when the user relationship graph includes the first user, determining the first user tag cluster of the first user according to the user tag cluster of an initial user in the user relationship graph.
  9. The electronic device according to claim 7, wherein the processor determining the first user tag cluster according to the user information of the first user comprises:
    the processor determining, according to the user information of the first user, a similarity ranking between the first user and a first number of user tag clusters, and determining the first user tag cluster according to the similarity.
  10. The electronic device according to claim 8, wherein the processor determining the first user tag cluster of the first user according to the user tag cluster of the initial user in the user relationship graph comprises:
    the processor determining an initial user from among the no fewer than two users in the user relationship graph, setting a user tag cluster for the initial user, and determining, according to the similarity between every two users in the user relationship graph and an iterative function, the user tag clusters of the users other than the initial user among the no fewer than two users in the user relationship graph.
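
The iterative determination recited in claims 4 and 10 can be illustrated with a minimal sketch. The claims do not specify the iterative function, so the code below assumes a simple similarity-weighted label propagation: initial users keep their tag clusters, and every other user repeatedly adopts the tag cluster with the largest total similarity among its already-labelled neighbours. The function name `propagate_labels`, the tuple-keyed similarity dictionary, the user names and the cluster names are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def propagate_labels(similarity, seeds, max_iters=20):
    """Assign a tag cluster to every user in a user relationship graph.

    similarity: dict mapping (user_a, user_b) -> similarity weight (assumed format)
    seeds:      dict mapping initial user -> tag cluster; seeds are never changed
    """
    # Build a symmetric adjacency map from the pairwise similarities.
    neighbors = defaultdict(dict)
    for (u, v), weight in similarity.items():
        neighbors[u][v] = weight
        neighbors[v][u] = weight

    labels = dict(seeds)
    for _ in range(max_iters):
        changed = False
        for user in sorted(neighbors):
            if user in seeds:
                continue  # initial users keep their assigned tag cluster
            # Sum similarity per tag cluster over neighbours that have a label.
            votes = defaultdict(float)
            for other, weight in neighbors[user].items():
                if other in labels:
                    votes[labels[other]] += weight
            if not votes:
                continue  # no labelled neighbour yet; retry on the next pass
            # Highest vote wins; ties go to the alphabetically first cluster.
            best = max(sorted(votes), key=votes.get)
            if labels.get(user) != best:
                labels[user] = best
                changed = True
        if not changed:
            break  # converged: no user changed cluster in this pass
    return labels


# Hypothetical graph: four users, two of them initial users.
similarity = {
    ("alice", "bob"): 0.9,
    ("bob", "carol"): 0.2,
    ("carol", "dave"): 0.8,
    ("alice", "carol"): 0.1,
}
seeds = {"alice": "novice", "dave": "expert"}
result = propagate_labels(similarity, seeds)
```

In this example `bob` ends up in the `novice` cluster and `carol` in the `expert` cluster, because each is most similar to the initial user carrying that tag. A different iterative function (for example, one that propagates probability distributions rather than hard labels) would fit the claim language equally well.
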
PCT/CN2018/116169 2018-06-28 2018-11-19 Data processing method and electronic device WO2020000875A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810685775.5 2018-06-28
CN201810685775.5A CN108876407B (en) 2018-06-28 2018-06-28 Data processing method and electronic equipment

Publications (1)

Publication Number Publication Date
WO2020000875A1 true WO2020000875A1 (en) 2020-01-02

Family

ID=64295491

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/116169 WO2020000875A1 (en) 2018-06-28 2018-11-19 Data processing method and electronic device

Country Status (2)

Country Link
CN (1) CN108876407B (en)
WO (1) WO2020000875A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105103169A (en) * 2013-03-29 2015-11-25 三菱电机株式会社 Information processing device and information processing system
CN105740463A (en) * 2016-03-03 2016-07-06 世纪禾光科技发展(北京)有限公司 Website message allocation and management method, device and system
CN106022976A (en) * 2016-07-25 2016-10-12 杭州凯达电力建设有限公司 Demand side oriented power consumer classification method and system

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150099254A1 (en) * 2012-07-26 2015-04-09 Sony Corporation Information processing device, information processing method, and system
CN104268290B (en) * 2014-10-22 2017-08-08 武汉科技大学 A kind of recommendation method based on user clustering
CN104915861A (en) * 2015-06-15 2015-09-16 浙江经贸职业技术学院 An electronic commerce recommendation method for a user group model constructed based on scores and labels
CN106447066A (en) * 2016-06-01 2017-02-22 上海坤士合生信息科技有限公司 Big data feature extraction method and device
CN107291815A (en) * 2017-05-22 2017-10-24 四川大学 Recommend method in Ask-Answer Community based on cross-platform tag fusion

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111461118A (en) * 2020-03-31 2020-07-28 中国移动通信集团黑龙江有限公司 Interest feature determination method, device, equipment and storage medium
CN111461118B (en) * 2020-03-31 2023-11-24 中国移动通信集团黑龙江有限公司 Interest feature determining method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN108876407B (en) 2022-04-19
CN108876407A (en) 2018-11-23

Similar Documents

Publication Publication Date Title
US10536579B2 (en) System, method and marketplace for real-time interactive video/voice services using artificial intelligence
WO2020029590A1 (en) Sample prediction method and device based on federated training, and storage medium
US20200294111A1 (en) Determining target user group
US20200126540A1 (en) Machine Learning Tool for Navigating a Dialogue Flow
US11676093B2 (en) Inferring missing customer data in assigning a ticket to a customer, and preventing reopening of the ticket in response of determining trivial data
US10565525B2 (en) Collaborative filtering method, apparatus, server and storage medium in combination with time factor
US20170308903A1 (en) Satisfaction metric for customer tickets
CN107809550A (en) The method and apparatus of adjustment business speech play order
US20200051143A1 (en) Price estimation system
CN110866767A (en) Method, device, equipment and medium for predicting satisfaction degree of telecommunication user
US20190333176A1 (en) Recording recommendation method, device, apparatus and computer-readable storage medium
WO2015175835A1 (en) Click through ratio estimation model
US20210342744A1 (en) Recommendation method and system and method and system for improving a machine learning system
WO2020000875A1 (en) Data processing method and electronic device
CN109451334B (en) User portrait generation processing method and device and electronic equipment
US11521601B2 (en) Detecting extraneous topic information using artificial intelligence models
JP2018190462A (en) Providing device, providing method, and providing program
US20180315130A1 (en) Intelligent data gathering
US20210110410A1 (en) Methods, systems, and apparatuses for providing data insight and analytics
WO2017016403A1 (en) Method and apparatus for determining brand index information about service object
CN109241249B (en) Method and device for determining burst problem
Rand-Hendriksen et al. A shortcut to mean-based time tradeoff tariffs for the EQ-5D?
Fleissig Return on investment from training programs and intensive services
US11068236B2 (en) Identification of users across multiple platforms
KR102480239B1 (en) A method of managing and matching merchant for user

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18923826

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205 DATED 30/03/2021)

122 Ep: pct application non-entry in european phase

Ref document number: 18923826

Country of ref document: EP

Kind code of ref document: A1