CN113919437A

CN113919437A - Method, device, equipment and storage medium for generating client portrait

Info

Publication number: CN113919437A
Application number: CN202111231874.4A
Authority: CN
Inventors: 林国武; 徐宝莲
Original assignee: Ping An Technology Shenzhen Co Ltd
Current assignee: Ping An Technology Shenzhen Co Ltd
Priority date: 2021-10-22
Filing date: 2021-10-22
Publication date: 2022-01-11

Abstract

The application relates to the technical field of artificial intelligence and provides a method, a device, equipment and a storage medium for generating a customer portrait. The method comprises the following steps: collecting data information of a customer, the data information comprising a plurality of groups of target data from different sources; processing each group of target data through a natural language processing model to obtain the corresponding customer labels; processing the customer labels through a credibility scoring model to obtain a credibility score for each customer label; and generating a customer portrait of the customer according to the credibility scores and the customer labels corresponding to each group of target data. Because target data is collected from different sources of the customer, the collected data is more comprehensive and the resulting customer portrait is more accurate. The target data is processed through the natural language processing model to obtain the corresponding customer labels, and the customer labels are then processed through the credibility scoring model, so that customer labels with high credibility and validity can be selected and a real, valid and accurate customer portrait can be constructed from them.

Description

Method, device, equipment and storage medium for generating client portrait

Technical Field

The present application relates to the field of artificial intelligence technology, and in particular, to a method, an apparatus, a device, and a storage medium for generating a customer portrait.

Background

With the continuous development of internet technology and the steady advancement of data mining technology, people can extract data segments of interest from massive data. Analysis tools are used to find the relations between these data segments, and the data segments are then used to predict the development of businesses and industries.

Constructing customer portraits based on big data means dividing customers into different groups. Within each group, the customers have very similar characteristics, while customer characteristics differ greatly between groups. By using customer portraits to distinguish the groups, each group can be managed effectively, which facilitates business expansion targeted at each group.

For example, a bank may build a customer portrait of a customer to better support precision marketing and risk identification.

However, in the existing method for constructing the client portrait, the acquired data is not comprehensive and the credibility is not high, so that the constructed client portrait is not accurate.

Disclosure of Invention

In view of this, embodiments of the present application provide a method, an apparatus, a device, and a storage medium for generating a client portrait, so as to solve the problem that a constructed client portrait is inaccurate due to incomplete collected data and low reliability in the existing method for constructing a client portrait.

A first aspect of an embodiment of the present application provides a method of generating a client representation, the method comprising:

collecting data information of a client, wherein the data information comprises a plurality of groups of target data, and the sources of the target data of each group are different;

processing each group of target data through a trained natural language processing model to obtain a client label corresponding to each group of target data;

processing the client label corresponding to each group of target data through a trained credibility scoring model to obtain credibility score of the client label corresponding to each group of target data;

and generating a customer portrait of the customer according to the credibility scores of each group of the target data and the customer labels corresponding to each group of the target data.

Optionally, the processing, by using the trained credibility score model, the client label corresponding to each group of target data to obtain a credibility score of the client label corresponding to each group of target data includes:

calculating, through the credibility scoring model, an elapsed-time factor score corresponding to each client label;

calculating the source score of each group of target data through the credibility scoring model;

calculating the label update count score corresponding to each client label through the credibility scoring model;

calculating the label feedback count score corresponding to each client label through the credibility scoring model;

and determining the credibility score of the client label corresponding to each group of target data according to the elapsed-time factor score, the label update count score and the label feedback count score corresponding to each client label, and the source score of each group of target data.

Optionally, the calculating, through the credibility scoring model, an elapsed-time factor score corresponding to each client label includes:

calculating an elapsed-time coefficient corresponding to each client label through a time attenuation function;

and calculating the elapsed-time factor score corresponding to each client label according to the elapsed-time coefficient and the credibility scoring model.
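The source does not state the form of the time attenuation function. As a minimal sketch, one could assume an exponential decay in which the coefficient halves after a fixed period, and assume that the factor score is the coefficient scaled to a full score. The names `time_decay_coefficient`, `elapsed_time_factor_score`, `half_life_days` and `full_score` are hypothetical and not taken from the application.

```python
import math


def time_decay_coefficient(days_elapsed: float, half_life_days: float = 180.0) -> float:
    """Return a coefficient in (0, 1] that halves every `half_life_days`.

    Exponential decay is an assumption; the application only says that a
    time attenuation function is used.
    """
    if days_elapsed < 0:
        raise ValueError("days_elapsed must be non-negative")
    return math.exp(-math.log(2) * days_elapsed / half_life_days)


def elapsed_time_factor_score(
    days_elapsed: float, full_score: float = 100.0, half_life_days: float = 180.0
) -> float:
    """Scale the coefficient to a score (assumed 100-point scale)."""
    return full_score * time_decay_coefficient(days_elapsed, half_life_days)
```

Under these assumptions, a label derived from freshly collected data keeps the full score, and older data contributes progressively less.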

Optionally, the processing each set of target data through the trained natural language processing model to obtain a client tag corresponding to each set of target data includes:

extracting keywords corresponding to each group of target data through the natural language processing model;

and determining the client label corresponding to each group of target data according to the keyword corresponding to each group of target data.
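How keywords are turned into client labels is not detailed at this point in the source. One possible reading, sketched below, is a lookup from extracted keywords to labels. The table `KEYWORD_TO_LABEL`, its entries and the function `labels_from_keywords` are hypothetical illustrations only.

```python
# Hypothetical keyword-to-label table; a real system would learn or curate this.
KEYWORD_TO_LABEL = {
    "fund": "investment preference: funds",
    "stock": "investment preference: stocks",
    "loss": "risk tolerance",
}


def labels_from_keywords(keywords):
    """Map extracted keywords to client labels, keeping first-seen order."""
    labels = []
    for keyword in keywords:
        label = KEYWORD_TO_LABEL.get(keyword.lower())
        # Skip unknown keywords and avoid emitting the same label twice.
        if label is not None and label not in labels:
            labels.append(label)
    return labels
```

A group of target data can therefore yield one label, several labels, or none, depending on which keywords it contains.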

Optionally, after the client label corresponding to each group of target data is processed through the trained credibility score model to obtain the credibility score of the client label corresponding to each group of target data, the method further includes:

collecting feedback data on the customer;

and updating the credibility scoring model according to the feedback data.

Optionally, the updating the credibility scoring model according to the feedback data includes:

setting a first label corresponding to the positive feedback data and a second label corresponding to the negative feedback data;

inputting the positive feedback data, the negative feedback data, the first label and the second label into the credibility scoring model for training;

and updating the parameters of the credibility scoring model according to the training result to obtain an updated credibility scoring model.
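The source does not specify the model family or how feedback is encoded. The sketch below therefore swaps in a plain logistic regression trained by stochastic gradient descent as a stand-in, and assumes that each piece of feedback has already been converted to a numeric feature vector. `FIRST_LABEL`, `SECOND_LABEL`, `update_model` and the learning-rate and epoch settings are all hypothetical.

```python
import math

FIRST_LABEL = 1   # assumed encoding of the first label (positive feedback)
SECOND_LABEL = 0  # assumed encoding of the second label (negative feedback)


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def update_model(weights, bias, positive, negative, lr=0.1, epochs=200):
    """Retrain a stand-in logistic model on labeled feedback vectors.

    `positive` and `negative` are lists of numeric feature vectors.
    Returns the updated (weights, bias).
    """
    samples = [(x, FIRST_LABEL) for x in positive]
    samples += [(x, SECOND_LABEL) for x in negative]
    w, b = list(weights), bias
    for _ in range(epochs):
        for x, y in samples:
            pred = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            err = pred - y  # gradient of the log loss w.r.t. the logit
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b
```

The point of the sketch is only the data flow: feedback is split by polarity, each polarity receives its own label, and the labeled pairs drive a parameter update.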

Optionally, before the processing each set of target data through the trained natural language processing model to obtain the customer label corresponding to each set of target data, the method further includes:

acquiring a sample training set, wherein the sample training set comprises a plurality of groups of sample data and sample labels corresponding to the sample data of each group;

training an initial natural language processing network based on the sample training set, and updating parameters of the initial natural language processing network based on a training result;

and when it is detected that the loss function corresponding to the initial natural language processing network has converged, obtaining the natural language processing model.
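The source does not say how convergence of the loss function is detected. A common convention, assumed here, is to stop once the loss has improved by less than a tolerance for several consecutive epochs. The function `train_until_converged` and its parameters `train_step`, `tol`, `patience` and `max_epochs` are hypothetical; `train_step` stands for one training pass that updates the network parameters and returns the current loss.

```python
from typing import Callable


def train_until_converged(
    train_step: Callable[[], float],
    tol: float = 1e-4,
    patience: int = 3,
    max_epochs: int = 1000,
) -> int:
    """Call `train_step` until the loss stops improving; return epochs run.

    Convergence (assumed criterion): the loss improves by less than `tol`
    for `patience` consecutive epochs.
    """
    previous = float("inf")
    stalled = 0
    for epoch in range(1, max_epochs + 1):
        loss = train_step()
        if previous - loss < tol:
            stalled += 1
            if stalled >= patience:
                return epoch
        else:
            stalled = 0
        previous = loss
    return max_epochs
```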

A second aspect of an embodiment of the present application provides an apparatus for generating a client representation, comprising:

the collecting unit is used for collecting data information of a client, wherein the data information comprises a plurality of groups of target data, and the sources of the target data of each group are different;

the first processing unit is used for processing each group of target data through a trained natural language processing model to obtain a client label corresponding to each group of target data;

the second processing unit is used for processing the client labels corresponding to each group of target data through a trained credibility score model to obtain credibility scores of the client labels corresponding to each group of target data;

and the generating unit is used for generating a customer portrait of the customer according to the credibility scores of each group of target data and the customer labels corresponding to each group of target data.

A third aspect of embodiments of the present application provides a device for generating a client representation, comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, performs the steps of the method for generating a client representation as described in the first aspect.

A fourth aspect of embodiments of the present application provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the method of generating a customer representation as described above in relation to the first aspect.

A fifth aspect of embodiments of the present application provides a computer program product, which, when run on an apparatus, causes the apparatus to perform the steps of the method of generating a client representation as described in the first aspect above.

The method, the device, the equipment and the storage medium for generating the client portrait have the following beneficial effects:

Data information of a client is collected, wherein the data information comprises a plurality of groups of target data, and the sources of the target data of each group are different; each group of target data is processed through the trained natural language processing model to obtain a client label corresponding to each group of target data; the client label corresponding to each group of target data is processed through the trained credibility scoring model to obtain the credibility score of that client label; and a client portrait of the client is generated according to the credibility scores and the client labels corresponding to each group of target data. In this scheme, target data is collected from different sources of the client, so the collected data is more comprehensive and the resulting client portrait is more accurate. The target data from different sources is processed through the trained natural language processing model to obtain the client labels corresponding to the target data of each source, and these client labels are then processed through the trained credibility scoring model. When the client labels obtained from multiple sources are inconsistent, the client labels with high credibility and validity can be selected, and a real, valid and accurate client portrait can be constructed from them. This in turn facilitates subsequent client management, precision marketing, risk identification and the like based on the client portrait.

Drawings

In order to illustrate the technical solutions in the embodiments of the present application more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present application, and those skilled in the art can obtain other drawings from these drawings without inventive effort.

FIG. 1 is a schematic flow chart diagram of a method of generating a client representation as provided by an exemplary embodiment of the present application;

FIG. 2 is a flowchart detailing a step S102 of a method for generating a client representation according to yet another exemplary embodiment of the present application;

FIG. 3 is a flowchart detailing a step S103 of a method for generating a client representation according to an exemplary embodiment of the present application;

FIG. 4 is a schematic flow chart diagram illustrating a method for generating a client representation in accordance with yet another exemplary embodiment of the present application;

FIG. 5 is a schematic diagram of an apparatus for generating a client representation according to an embodiment of the present application;

FIG. 6 is a schematic diagram of an apparatus for generating a client representation according to another embodiment of the present application.

Detailed Description

In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.

In the description of the embodiments of the present application, "/" means "or" unless otherwise specified; for example, A/B may mean A or B. "And/or" herein merely describes an association between associated objects and indicates that three relationships may exist; for example, A and/or B may mean: A exists alone, A and B exist simultaneously, or B exists alone. In addition, in the description of the embodiments of the present application, "a plurality" means two or more than two.

In the following, the terms "first", "second" are used for descriptive purposes only and are not to be understood as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of that feature. In the description of the present embodiment, "a plurality" means two or more unless otherwise specified.

For example, a bank can construct a client portrait for a client, that is, tag the client's information. By collecting and analyzing data such as the client's basic information, assets and liabilities, transaction data, behavior information and third-party data, the bank establishes a tag library and presents a 360-degree panoramic view of each client, which can be used in scenarios such as precision marketing and risk identification.

However, existing methods for constructing a client portrait mainly collect only basic data of the client, such as the client's interests, hobbies, character, impressions and intentions. The collected client data is therefore not comprehensive and has low credibility, so the constructed client portrait is inaccurate, which makes it harder for banks to achieve precision marketing and risk identification.

In view of this, the present application provides a method for generating a client portrait, which collects target data from different sources of a client, so that the collected data is more comprehensive and the resulting client portrait is more accurate. The target data from different sources is processed through a trained natural language processing model to obtain the client labels corresponding to the target data of each source, and these client labels are then processed through a trained credibility scoring model. When the client labels obtained from multiple sources are inconsistent, the client labels with high credibility and validity can be selected, and a real, valid and accurate client portrait can be constructed from them. This in turn facilitates subsequent client management, precision marketing, risk identification and the like based on the client portrait.

The embodiment of the application can acquire and process related data based on an artificial intelligence technology. Among them, Artificial Intelligence (AI) is a theory, method, technique and application system that simulates, extends and expands human Intelligence using a digital computer or a machine controlled by a digital computer, senses the environment, acquires knowledge and uses the knowledge to obtain the best result.

The artificial intelligence infrastructure generally includes technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. The artificial intelligence software technology mainly comprises a computer vision technology, a robot technology, a biological recognition technology, a voice processing technology, a natural language processing technology, machine learning/deep learning/supervised learning and the like.

Referring to FIG. 1, FIG. 1 is a schematic flow chart diagram illustrating a method for generating a client representation according to an exemplary embodiment of the present application. The execution subject of the method for generating the client portrait provided by the present application is a device, wherein the device includes, but is not limited to, a terminal such as a smart phone, a tablet computer, a Personal Digital Assistant (PDA), or a desktop computer, and may further include various types of servers. For example, the server may be an independent server, or may be a cloud server that provides basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, web services, cloud communication, middleware services, domain name services, security services, a Content Delivery Network (CDN), and big data and artificial intelligence platforms.

In the embodiments of the present application, a computer terminal is taken as an example of the execution subject.

A method of generating a client representation as shown in FIG. 1 may include S101 to S104, as follows:

S101: Collect data information of a client, wherein the data information comprises a plurality of groups of target data, and the sources of the target data of each group are different.

Illustratively, the data information is any information related to the customer. The data information includes a plurality of groups of target data, and each group of target data may include any one or any combination of the customer's basic information, business information, transaction information, account opening information, behavior information, and the like. The actual content depends on the information actually collected and is not limited herein.

The basic information may include the name, sex, age, identification number, hobbies, character features, contact information, family address, working condition, financing intention, financing preference, assets, liabilities, risk tolerance, etc. of the client.

The business information may include the business that the customer has transacted (e.g., any type of financial business), the business that is desired to transact, the previously transacted business, etc.

The account opening information may include transaction account numbers which are opened by the customer at different banks according to different services.

The transaction information may include data generated when the customer conducts any business transaction. For example, the customer purchases a financial product (e.g., purchases funds, stocks or bonds, or makes a time deposit), redeems a financial product (e.g., redeems funds, stocks, bonds or deposits), and so forth.

The behavior information may include the client's operations on various services (such as adding to favorites, liking, following, or marking as disliked). For example, a customer has added a fund or a stock to favorites, or has liked an article by a fund manager.

Optionally, the target data may also include an impression of the customer. For example, the administrator's evaluation of the customer, the evaluation of the customer by other customers, the evaluation of the customer by the business, and the like.

The groups of target data have different sources, which can also be understood as coming from different channels. For example, sources may include financial systems, financial software, social software, questionnaires, telephone communication, online questioning (e.g., asking the customer questions through an application (App), an applet, a web page or the like, or through an automated question-and-answer robot), third-party software, customer service follow-up visits, etc.

Illustratively, the data information of the customer is collected from these sources. A questionnaire is taken as an example. The questionnaire is preset: it may be a simple questionnaire (covering, for example, the customer's basic information, hobbies, personality traits, customer impressions and customer intentions) or a complex questionnaire (covering, for example, the customer's risk tolerance, family education and family background). The target data of the client may be collected through an online questionnaire or through an offline questionnaire, which is not limited herein.

It should be noted that a complex questionnaire may configure cascade relationships between questions (for example, one question asks whether the customer is married, and when the customer selects married, a further question asks whether the customer has children), custom conditions, and so on, so as to collect richer and more comprehensive target data of the customer.

Telephone communication also makes it convenient to collect richer and more comprehensive target data of the client. For example, while communicating with the customer by telephone, the service person may ask the customer questions covering many aspects. The description is given for illustrative purposes only and is not intended to be limiting.
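The cascade relationship between questions can be pictured as each question listing the follow-up questions triggered by a given answer. The sketch below is one hypothetical way to represent this; the `QUESTIONNAIRE` structure, the question identifiers and the function `run_questionnaire` are illustrative and not part of the application.

```python
# Hypothetical questionnaire: each question maps an answer to follow-up question ids.
QUESTIONNAIRE = {
    "married": {
        "text": "Are you married?",
        "follow_ups": {"yes": ["has_children"]},
    },
    "has_children": {
        "text": "Do you have children?",
        "follow_ups": {},
    },
}


def run_questionnaire(start_ids, answer_fn):
    """Ask questions in order, queueing follow-ups triggered by each answer.

    `answer_fn` receives the question text and returns the answer string.
    """
    answers = {}
    pending = list(start_ids)
    while pending:
        question_id = pending.pop(0)
        question = QUESTIONNAIRE[question_id]
        answer = answer_fn(question["text"])
        answers[question_id] = answer
        pending.extend(question["follow_ups"].get(answer, []))
    return answers
```

With this structure, a customer who answers "no" to the marriage question is never asked about children, while a "yes" pulls the follow-up question into the queue.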

In the embodiment, the data information of the client is collected through different channels, so that the collected data information of the client is more comprehensive and richer.

S102: Process each group of target data through the trained natural language processing model to obtain a client label corresponding to each group of target data.

Illustratively, a Natural Language Processing (NLP) model is trained on a sample training set using machine learning. The sample training set comprises a plurality of groups of sample data and sample labels corresponding to each group of sample data.

It can be understood that the NLP model may be trained in advance by the terminal, or a file corresponding to the NLP model may be transplanted to the terminal after being trained in advance by another device. That is, the execution subject for training the NLP model may be the same as or different from the execution subject for using the NLP model.

For example, when the initial natural language processing network is trained by another device, the parameters of the network are fixed once training is complete, yielding a file corresponding to the trained natural language processing model. The file is then migrated to the terminal.

For example, in order to improve the accuracy of the processing result of the natural language processing model, each group of target data may be preprocessed. The preprocessing may extract the valid characters in each group of target data, or it may remove the redundant information in each group of target data.

Valid characters are information that has practical meaning in the target data. For each group of target data, when the preprocessing extracts valid characters, the preprocessed target data is generated by combining the valid characters in the order in which they were extracted.

Redundant information is information that has no practical meaning in the target data. For example, the redundant information may be stop words, punctuation marks, etc. in the target data. Stop words are words without practical meaning, usually determiners, modal particles, adverbs, prepositions, conjunctions, English characters, numbers, mathematical characters, and the like.

For each group of target data, when the preprocessing removes redundant information, the preprocessed target data is generated by combining, in their original order, the data remaining after the redundant information is removed.
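The redundancy-removal variant of preprocessing can be sketched as follows for English text. The stop-word set `STOP_WORDS` is a small hypothetical sample, and the function name `remove_redundant` is illustrative; a production system would use a full stop-word list suited to the language of the target data.

```python
import re

# Hypothetical, deliberately tiny stop-word list for illustration.
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "very"}


def remove_redundant(text: str) -> str:
    """Drop punctuation, digits and stop words, keeping the original order."""
    # Keep runs of letters only, so punctuation and numbers are discarded.
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    kept = [token for token in tokens if token not in STOP_WORDS]
    return " ".join(kept)
```

For instance, `remove_redundant("The customer bought 3 funds, and a bond!")` yields `"customer bought funds bond"`: the remaining words are joined in the order in which they appeared.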

For each group of target data, the preprocessed target data is input into the trained natural language processing model for processing, and the natural language processing model outputs the client label corresponding to that group of target data.

The client tag is a representation of the client characteristics, and the client tag corresponding to each group of target data can be one or more. In different application scenarios, the obtained client tags are different.

For example, to better support precision marketing and risk identification, a bank collects data information of a client by the method in S101 and processes a preprocessed group of target data through the natural language processing model to obtain client labels. The client labels may include the client's investment preference (e.g., funds, stocks, futures), risk tolerance (e.g., able to bear a 20% loss, able to bear a 50% loss, unable to bear any loss), assets, credit, social attributes (e.g., occupation, usual place of residence), spending power, activity level, physical health status, etc. The description is given for illustrative purposes only and is not intended to be limiting.

S103: Process the client label corresponding to each group of target data through the trained credibility scoring model to obtain the credibility score of the client label corresponding to each group of target data.

The credibility scoring model is a multi-level comprehensive scoring model that takes into account the influence of different dimensions on the credibility of a client label, so the resulting credibility score is accurate. The credibility scores of the client labels corresponding to each group of target data make it possible to subsequently select the client label with the highest credibility, together with the target data corresponding to that client label. A client portrait is then generated from that client label and its target data, which improves the accuracy of the client portrait.

Illustratively, the credibility scoring model considers the influence of multiple dimensions on the credibility of a client label, such as the elapsed time, the collection source, the number of label updates and the number of label feedbacks.

The elapsed time refers to the length of time from the acquisition of each group of target data to the present; the collection source refers to the source of each group of target data; the number of label updates refers to the number of times the client label corresponding to each group of target data has been updated; and the number of label feedbacks refers to the number of times staff, managers, the client, other clients and the like have given feedback on the client label corresponding to each group of target data.

For example, a weight may be set in advance for each dimension. For the client label corresponding to each group of target data, the credibility scoring model calculates an initial score for each dimension, multiplies each initial score by the weight of its dimension, and sums the products to obtain the credibility score of that client label.

For example, the weights corresponding to the elapsed time, the collection source, the number of label updates and the number of label feedbacks are m, n, o and p, respectively. The sum of m, n, o and p is 1, and the specific values may be set by a user according to the actual situation, which is not limited herein. For example, m, n, o and p are 0.3, 0.3, 0.2 and 0.2, respectively.

For the client label corresponding to each group of target data, the credibility scoring model calculates an initial score a for the elapsed time, an initial score b for the collection source, an initial score c for the number of label updates, and an initial score d for the number of label feedbacks. The values of a, b, c and d may be on a 100-point scale or a 10-point scale, and the specific scale may be set by the user according to the actual situation, which is not limited herein.

The initial score a corresponding to the elapsed time is multiplied by 0.3, the initial score b corresponding to the collection source is multiplied by 0.3, the initial score c corresponding to the number of label updates is multiplied by 0.2, and the initial score d corresponding to the number of label feedbacks is multiplied by 0.2. All the products are then added, and the sum is the credibility score of the client label corresponding to that group of target data. The description is given for illustrative purposes only and is not intended to be limiting.
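The weighted sum described above can be written out directly. The sketch below uses the example weights 0.3, 0.3, 0.2 and 0.2; the dictionary keys, the function name `credibility_score` and the sample initial scores in the usage note are hypothetical.

```python
# Example weights from the description (m, n, o, p); the keys are illustrative names.
WEIGHTS = {
    "elapsed_time": 0.3,
    "source": 0.3,
    "update_count": 0.2,
    "feedback_count": 0.2,
}


def credibility_score(initial_scores, weights=WEIGHTS):
    """Return the sum of weight * initial score over all dimensions."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(weights[dim] * initial_scores[dim] for dim in weights)
```

With assumed initial scores a = 80, b = 90, c = 70 and d = 60 on a 100-point scale, the credibility score is 0.3 × 80 + 0.3 × 90 + 0.2 × 70 + 0.2 × 60 = 77.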

S104: Generate a customer portrait of the customer according to the credibility scores of each group of target data and the customer labels corresponding to each group of target data.

Illustratively, the customer label with the highest credibility score is selected as the customer portrait of the customer. For example, the client labels corresponding to each group of target data may be sorted from high to low by credibility score, and the first-ranked client label is selected as the client portrait of the client. Alternatively, the client labels corresponding to each group of target data may be sorted from low to high by credibility score, and the last-ranked client label is selected as the client portrait of the client. The description is given for illustrative purposes only and is not intended to be limiting.
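The selection step amounts to ranking the labels by credibility score and taking the top one. A minimal sketch follows, assuming each label is paired with its score in a tuple; the function name `select_top_label` and the sample labels in the usage note are hypothetical.

```python
def select_top_label(scored_labels):
    """Return the label with the highest credibility score.

    `scored_labels` is a list of (label, credibility_score) pairs.
    """
    if not scored_labels:
        raise ValueError("no client labels to choose from")
    # Sort from high to low by score and take the first-ranked label.
    ranked = sorted(scored_labels, key=lambda item: item[1], reverse=True)
    return ranked[0][0]
```

For example, given `[("prefers funds", 77.0), ("prefers stocks", 64.5), ("prefers bonds", 81.0)]`, the function returns `"prefers bonds"`. Sorting from low to high and taking the last element selects the same label when scores are distinct.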

Optionally, in a possible implementation manner, the client label with the highest credibility score and the target data corresponding to that client label may be selected to generate the client portrait of the client together. For example, some keywords that are not included in the client label are selected from the target data corresponding to the client label, and the selected keywords are used together with the client label as the client portrait of the client. The description is given for illustrative purposes only and is not intended to be limiting.
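One way to picture this variant is to keep the keywords of the target data that do not already appear in the selected label. The sketch below assumes the keywords have already been extracted and uses a crude case-insensitive substring test; the function name `build_portrait` and the returned dictionary layout are hypothetical.

```python
def build_portrait(label, target_keywords):
    """Combine the top label with keywords it does not already contain.

    The substring test is a simplification; a real system would compare
    normalized terms rather than raw text.
    """
    extra = [kw for kw in target_keywords if kw.lower() not in label.lower()]
    return {"label": label, "keywords": extra}
```

For example, `build_portrait("investment preference: funds", ["funds", "monthly", "Shenzhen"])` keeps `"monthly"` and `"Shenzhen"` and drops `"funds"`, which the label already covers.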

Alternatively, the bank may perform management, precision marketing, risk identification, etc. for the customer according to the customer image of the customer. For example, if the client image of the client shows that the credit rating of the client is extremely low, the bank refuses to transact the client when the client transacts a business such as loan or credit card. As another example, a customer representation of the customer shows that the customer prefers funds to make money, and a bank may recommend funds to the customer for money products. The description is given for illustrative purposes only and is not intended to be limiting.

In the prior art, usually only objective data of a client, such as the client's interests, hobbies, and character traits, are collected. Such data is relatively limited in scope, so the resulting client portrait is inaccurate. In the present scheme, target data of the client from different sources are collected, so that the collected data are more comprehensive and the resulting client portrait is more accurate. The target data from different sources are processed through a trained natural language processing model to obtain the client label corresponding to the target data of each source, and the client labels are then processed through a trained credibility scoring model. When the client labels obtained from multiple sources are inconsistent, the client label with high credibility and validity can be selected, and a real, valid, and accurate client portrait is constructed from that label. This facilitates subsequent customer management, precision marketing, risk identification, and the like based on the client portrait.

Referring to FIG. 2, FIG. 2 is a flowchart detailing a step S102 of a method for generating a client representation according to yet another exemplary embodiment of the present application; optionally, in some possible implementations of the present application, the S102 may include S1021 to S1022, which are as follows:

S1021: and extracting the keywords corresponding to each group of target data through a natural language processing model.

Illustratively, each group of target data is processed one by one. Word segmentation is performed on a group of target data through the natural language processing model to obtain a plurality of segmented words corresponding to that group. The natural language processing model may include a word segmentation algorithm, which segments the group of target data into a plurality of segmented words.

Specifically, a dictionary tree can be generated through a ditt.txt dictionary in a word segmentation algorithm, a directed acyclic graph is generated according to target data and the dictionary tree, a maximum probability path is searched in the directed acyclic graph, a word segmentation mode is determined, word segmentation is performed on the target data according to the word segmentation mode, and a plurality of words are obtained.
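The directed-acyclic-graph segmentation described above can be sketched as follows. This is a simplified illustration under stated assumptions: a plain word-frequency dictionary stands in for the dictionary tree, the toy vocabulary `toy_freq` is hypothetical, the dictionary is assumed to be non-empty, and unknown single characters are given a frequency of 1.

```python
import math
from typing import Dict, List, Tuple


def build_dag(text: str, freq: Dict[str, int]) -> Dict[int, List[int]]:
    """For each start index, list the end indices of dictionary words starting there."""
    dag: Dict[int, List[int]] = {}
    for i in range(len(text)):
        ends = [j for j in range(i, len(text)) if text[i:j + 1] in freq]
        dag[i] = ends or [i]  # fall back to a single character
    return dag


def segment(text: str, freq: Dict[str, int]) -> List[str]:
    """Segment text along the maximum-probability path through the DAG."""
    total = sum(freq.values())
    dag = build_dag(text, freq)
    n = len(text)
    # route[i] = (best log-probability of text[i:], end index of the first word)
    route: Dict[int, Tuple[float, int]] = {n: (0.0, n)}
    for i in range(n - 1, -1, -1):
        route[i] = max(
            (math.log(freq.get(text[i:j + 1], 1) / total) + route[j + 1][0], j)
            for j in dag[i]
        )
    words: List[str] = []
    i = 0
    while i < n:
        j = route[i][1]
        words.append(text[i:j + 1])
        i = j + 1
    return words


# Hypothetical toy dictionary: word -> frequency.
toy_freq = {"a": 5, "b": 5, "ab": 20, "c": 5, "cd": 30, "d": 5, "bcd": 1}
tokens = segment("abcd", toy_freq)
```

The dynamic program runs from the end of the text backwards, so each position needs only the best score of the suffix that follows a candidate word.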

A keyword corresponding to the group of target data is then determined among the plurality of segmented words. Illustratively, the keywords may be determined through the term frequency-inverse document frequency (TF-IDF) algorithm, a weighting technique commonly used in information retrieval and data mining. The specific processing method can refer to the prior art, and is not described herein again.
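A minimal TF-IDF keyword ranking might look like the sketch below. The smoothed inverse document frequency used here is one common variant and is an assumption, since the application defers to the prior art; the corpus and the function name are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List


def tfidf_keywords(doc: List[str], corpus: List[List[str]], top_k: int = 2) -> List[str]:
    """Rank the segmented words of doc by TF-IDF and return the top_k words."""
    term_counts = Counter(doc)
    n_docs = len(corpus)
    scores: Dict[str, float] = {}
    for word, count in term_counts.items():
        doc_freq = sum(1 for other in corpus if word in other)
        # Smoothed IDF (an assumed variant) avoids division by zero.
        idf = math.log((1 + n_docs) / (1 + doc_freq)) + 1.0
        scores[word] = (count / len(doc)) * idf
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda w: (-scores[w], w))[:top_k]


# Hypothetical segmented target data, one list per group.
corpus = [
    ["customer", "buys", "funds", "funds"],
    ["customer", "applies", "loan"],
    ["customer", "opens", "account"],
]
keywords = tfidf_keywords(corpus[0], corpus)
```

Words that appear in every group, such as "customer" here, receive a low weight and are unlikely to be chosen as keywords.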

S1022: and determining the client label corresponding to each group of target data according to the keyword corresponding to each group of target data.

Each keyword is mapped through a network layer in the natural language processing model, that is, each keyword is mapped to a common semantic space, and a word vector feature corresponding to each keyword is output. The word vector features are classified through a fully connected layer in the natural language processing model to obtain a client label corresponding to each word vector feature. The obtained client labels are combined to obtain the client label corresponding to the group of target data.
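The mapping and classification steps can be illustrated with a toy sketch. The two-dimensional word vectors, the weights, and the tag names below are all hypothetical stand-ins for parameters that a trained model would learn.

```python
from typing import Dict, List

# Hypothetical word vectors standing in for the learned mapping layer.
EMBEDDINGS: Dict[str, List[float]] = {
    "funds": [1.0, 0.0],
    "loan": [0.0, 1.0],
}
# Hypothetical fully connected layer: one weight row and one bias per client tag.
TAGS = ["investment-oriented", "credit-oriented"]
WEIGHTS = [[2.0, -1.0], [-1.0, 2.0]]
BIASES = [0.0, 0.0]


def classify_keyword(keyword: str) -> str:
    """Map a keyword to its vector, apply the linear layer, and pick the best tag."""
    vector = EMBEDDINGS[keyword]
    logits = [
        sum(w * x for w, x in zip(row, vector)) + bias
        for row, bias in zip(WEIGHTS, BIASES)
    ]
    return TAGS[logits.index(max(logits))]


def tags_for_group(keywords: List[str]) -> List[str]:
    """Classify every keyword and merge the resulting tags, dropping duplicates."""
    merged: List[str] = []
    for keyword in keywords:
        tag = classify_keyword(keyword)
        if tag not in merged:
            merged.append(tag)
    return merged


group_tags = tags_for_group(["funds", "loan", "funds"])
```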

The above processing is performed on each group of target data to obtain the client label corresponding to each group of target data.

Optionally, after the client tag corresponding to each group of target data is obtained, the client tag corresponding to each group of target data may be adjusted to obtain the client tag corresponding to each group of target data after adjustment. For example, an administrator or a worker manually adds tags in batches, adds tags through a business process, adds tags through preset rules, supplements tags through external links, and the like.

In this scheme, each group of target data is processed through the trained natural language processing model, so that the client label corresponding to each group of target data can be obtained accurately and the efficiency of determining client labels is improved. Furthermore, the determined client labels can be adjusted and refined, so that the final client labels best reflect the characteristics of the client, which facilitates drawing an accurate portrait of the client subsequently.

Referring to FIG. 3, FIG. 3 is a flowchart detailing step S103 of a method for generating a client representation according to an exemplary embodiment of the present application; optionally, in some possible implementations of the present application, the S103 may include S1031 to S1035, which are as follows:

S1031: and calculating the time factor score up to the present corresponding to each client label through a credibility scoring model.

Illustratively, the time-to-date factor score is the score assigned to the time-to-date duration corresponding to the customer label. The time-to-date duration corresponding to a customer label is the length of time between the collection of the target data corresponding to that label and the current time.

For example, the time length of each group of target data to the present can be counted one by one, and each counted time length is calculated through a credibility scoring model to obtain the time length factor score corresponding to each client label.

Optionally, in some possible implementation manners of the present application, S1031 may include S10311 to S10312, which are specifically as follows:

S10311: and calculating a time-length coefficient from the current time corresponding to each customer label through a time attenuation function.

Illustratively, the credibility score model may include a time decay function, and the time-to-date coefficient corresponding to each customer label is calculated through the time decay function. The time decay function is as follows:

In the above formula (1), n_i(T) represents the time-to-date coefficient corresponding to the set of target data, T represents the current time, t represents the time at which the set of target data was collected, and α represents the attenuation coefficient.

The longer ago the set of target data was collected, the larger the difference between the current time and the collection time, and the smaller the time-to-date coefficient, so the finally calculated time-to-date factor score is also smaller. In plain terms, the older a set of target data is, the less it influences the credibility calculation of the customer label.
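The behaviour described above can be illustrated with a short sketch. The text of formula (1) is not reproduced here, so the exponential form below is an assumption chosen because it is a common time decay function that matches the stated behaviour; the function and parameter names are hypothetical.

```python
import math


def time_decay_coefficient(now: float, collected_at: float, alpha: float) -> float:
    """Assumed exponential decay: exp(-alpha * (now - collected_at)).

    now and collected_at are times in the same unit (for example, days);
    alpha is the attenuation coefficient.
    """
    elapsed = max(0.0, now - collected_at)
    return math.exp(-alpha * elapsed)


recent = time_decay_coefficient(now=20.0, collected_at=10.0, alpha=0.1)
old = time_decay_coefficient(now=100.0, collected_at=10.0, alpha=0.1)
```

With this form, data collected at the current time has a coefficient of 1, and the coefficient shrinks towards 0 as the data ages.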

S10312: and calculating to obtain the time factor score of the current time corresponding to each client label according to the time factor of the current time and the credibility scoring model.

Illustratively, scores corresponding to different duration segments are preset and stored in the credibility scoring model. For each time-to-date duration, the segment it falls into is determined and the score corresponding to that segment is obtained, giving a score for each group of target data. For each group of target data, the score of the group is multiplied by the time-to-date coefficient corresponding to the group, and the product is the time-to-date factor score of the client label corresponding to that group.
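The segment lookup and multiplication can be sketched as follows. The segment boundaries and scores are hypothetical values chosen for illustration; the application only states that such scores are preset.

```python
from typing import List, Tuple

# Hypothetical duration segments: (upper bound in days, preset score).
SEGMENTS: List[Tuple[float, float]] = [(30.0, 100.0), (180.0, 80.0), (365.0, 60.0)]
DEFAULT_SCORE = 40.0  # hypothetical score for data older than the last bound


def segment_score(elapsed_days: float) -> float:
    """Return the preset score of the segment that the duration falls into."""
    for upper_bound, score in SEGMENTS:
        if elapsed_days <= upper_bound:
            return score
    return DEFAULT_SCORE


def time_factor_score(elapsed_days: float, coefficient: float) -> float:
    """Multiply the segment score by the time-to-date coefficient."""
    return segment_score(elapsed_days) * coefficient


example = time_factor_score(elapsed_days=10.0, coefficient=0.5)
```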

S1032: and calculating the source score of each group of target data through a credibility scoring model.

Illustratively, scores corresponding to different sources are preset and stored in the credibility scoring model. For example, the score for data filled in by an administrator is greater than the score for data filled in by the customer, which in turn is greater than the score for data obtained by machine identification, and so on. The description is given for illustrative purposes only and is not intended to be limiting.

And for each group of target data, multiplying the calculated score of the source of the group of target data by the source weight value corresponding to the group of target data, and obtaining the product which is the source score corresponding to the group of target data.

Alternatively, a coefficient may be set for the source according to actual conditions; the coefficient plays a role similar to the weight value. For each group of target data, the score of the source of the group is multiplied by the source coefficient corresponding to the group, and the product is the source score corresponding to the group of target data. The description is given for illustrative purposes only and is not intended to be limiting.
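A source score computed this way might be sketched as below. The source names, preset scores, and weight are hypothetical; only the ordering (administrator above customer above machine identification) follows the example in the text.

```python
from typing import Dict

# Hypothetical preset scores per source.
SOURCE_SCORES: Dict[str, float] = {
    "administrator": 90.0,
    "customer": 70.0,
    "machine": 50.0,
}


def source_score(source: str, weight: float) -> float:
    """Multiply the preset score of the source by its weight (or coefficient)."""
    return SOURCE_SCORES[source] * weight


admin_score = source_score("administrator", weight=0.5)
```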

In the prior art, when data about the same customer is collected through multiple channels and the customer labels determined from those channels are inconsistent, the appropriate customer label cannot be determined. In this embodiment, the influence of the channel dimension on the authenticity of the customer label is taken into account, so that the resulting credibility score of the customer label is more accurate.

S1033: and calculating the label updating times score corresponding to each client label through a credibility scoring model.

Illustratively, scores corresponding to different updating times of the tags are preset and stored in the credibility score model. For example, a score x is assigned when the number of tag updates is 3, and a score y is assigned when the number of tag updates is 5.

The number of tag updates corresponding to each client tag is counted. Each change to the client tag, whether made by the client, an administrator, a staff member, or the natural language processing model, is counted as one separate tag update.

And for each group of target data, multiplying the score corresponding to the tag updating times of the group of target data obtained by calculation by the weighted value of the tag updating times, wherein the obtained product is the tag updating time score of the client tag corresponding to the group of target data.

Optionally, a coefficient may also be set for the tag update times according to the actual situation, and the coefficient is similar to the weight value. And for each group of target data, multiplying the calculated score corresponding to the label updating times of the group of target data by the coefficient of the label updating times, wherein the obtained product is the label updating time score of the client label corresponding to the group of target data. The description is given for illustrative purposes only and is not intended to be limiting.

S1034: and calculating the label feedback frequency score corresponding to each client label through a credibility scoring model.

Illustratively, scores corresponding to different label feedback times are preset and stored in the credibility score model. For example, a score X is associated with a tag feedback count of 3, and a score Y is associated with a tag feedback count of 4.

The number of label feedbacks corresponding to each client label is counted. Each piece of feedback on the client label, whether from the client, an administrator, or a staff member, is counted as one separate label feedback.

And for each group of target data, multiplying the score corresponding to the label feedback times of the group of target data obtained by calculation by the weighted value of the label feedback times, wherein the obtained product is the label feedback time score of the client label corresponding to the group of target data.

Optionally, a coefficient may also be set for the tag feedback times according to the actual situation; the coefficient plays a role similar to the weight value. For each group of target data, the score corresponding to the tag feedback times of the group is multiplied by the coefficient of the tag feedback times, and the product is the tag feedback time score of the client tag corresponding to the group of target data. The description is given for illustrative purposes only and is not intended to be limiting.

S1035: and determining the credibility score of the client label corresponding to each group of target data according to the current time length factor score, the label updating frequency score, the label feedback frequency score and the source score of each group of target data corresponding to each client label.

Illustratively, calculating the sum of the time factor score, the tag updating frequency score, the tag feedback frequency score and the source score of each group of target data corresponding to each client tag to obtain the credibility score of the client tag corresponding to each group of target data.
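The summation can be written down directly. The container type `FactorScores` and the sample values below are hypothetical and only illustrate how the four factor scores combine.

```python
from dataclasses import dataclass


@dataclass
class FactorScores:
    """The four factor scores of one client tag (hypothetical container)."""
    time_to_date: float
    source: float
    tag_updates: float
    tag_feedback: float


def credibility_score(factors: FactorScores) -> float:
    """Credibility score as the plain sum of the four factor scores."""
    return (
        factors.time_to_date
        + factors.source
        + factors.tag_updates
        + factors.tag_feedback
    )


total = credibility_score(FactorScores(50.0, 45.0, 10.0, 5.0))
```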

Optionally, the credibility score of the customer label corresponding to each set of target data can also be calculated through a logistic regression algorithm. The formula is as follows:

$$h_\theta(x) = \frac{1}{1 + e^{-\theta^{T}x}} \qquad (2)$$

In the above formula (2), $h_\theta(x)$ represents the credibility score of the customer label corresponding to each set of target data, and $\theta^{T}x = \theta_1x_1 + \theta_2x_2 + \theta_3x_3 + \theta_4x_4 + \dots + \theta_nx_n$. $\theta_1$, $\theta_2$, $\theta_3$ and $\theta_4$ respectively represent the time-to-date coefficient, the source coefficient, the coefficient of the tag updating times, and the coefficient of the tag feedback times. $x_1$, $x_2$, $x_3$ and $x_4$ respectively represent the score corresponding to the time-to-date duration, the score of the source of the target data, the score corresponding to the tag updating times, and the score corresponding to the tag feedback times. $\theta_n$ represents the coefficient corresponding to another dimension, and $x_n$ represents the score corresponding to that dimension.
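The logistic regression variant can be sketched as follows, assuming that formula (2) is the standard logistic (sigmoid) function applied to the weighted sum, as the notation for the hypothesis and the weighted sum suggests. The function name and the sample coefficients and scores are hypothetical.

```python
import math
from typing import Sequence


def logistic_credibility(theta: Sequence[float], x: Sequence[float]) -> float:
    """Sigmoid of the weighted sum theta_1*x_1 + ... + theta_n*x_n."""
    if len(theta) != len(x):
        raise ValueError("theta and x must have the same length")
    z = sum(t * xi for t, xi in zip(theta, x))
    # Numerically stable sigmoid: never exponentiates a large positive number.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


# Hypothetical coefficients and (normalised) scores for the four dimensions.
score = logistic_credibility([0.5, 0.3, 0.1, 0.1], [0.8, 0.9, 0.2, 0.4])
```

Unlike the plain sum, the result always lies between 0 and 1, which makes scores from different customers directly comparable.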

In this implementation, the client labels corresponding to each group of target data are processed through the credibility scoring model, and the influence of different dimensions on the credibility of the client labels is fully considered, so that the resulting credibility scores are more accurate. This makes it easier to subsequently select, according to the credibility scores, the client label with the highest credibility and its corresponding target data. A client portrait is then generated from that client label and target data, which improves the accuracy of the client portrait.

Turning to FIG. 4, FIG. 4 is a schematic flow chart diagram illustrating a method for generating a client representation in accordance with yet another exemplary embodiment of the present application. The embodiment of the present invention differs from the embodiment corresponding to fig. 1 in that after S204, the embodiment further includes S205 to S206, where S201 to S204 in the present embodiment are completely the same as S101 to S104 in the embodiment corresponding to fig. 1, and reference is specifically made to the description related to S101 to S104 in the previous embodiment, which is not repeated herein. S205-S206 are specifically as follows:

S205: collecting feedback data on the customer.

The feedback data may include feedback on the customer tags of the customers, feedback on various sets of target data for the customers. The feedback data may be data matching the previously obtained client tag and the target data of each group of the client, or may be correction data for the previously obtained client tag and the target data of each group of the client.

Illustratively, feedback data on the customer may be collected from an administrator, a staff member, the customer, or other customers. Taking the administrator as an example, when the administrator reviews the client labels and each group of target data of the client and finds that some client labels and/or some data do not match the actual situation of the client, the administrator compiles the non-matching data into feedback data. Alternatively, when the administrator reviews the client labels and each group of target data and finds that the information matches the actual situation of the client, the administrator confirms the client labels and each group of target data and compiles the confirmed data into feedback data.

For example, the administrator may review each customer label, and use the target data of customer labels that have no problem as positive feedback data and the target data of customer labels that have a problem as negative feedback data.

S206: and updating the credibility scoring model according to the feedback data.

And inputting the feedback data into the credibility scoring model for training, and updating the credibility scoring model according to the training result to obtain an updated credibility scoring model. And then, processing the client label of the client by using the updated credibility scoring model to obtain a new credibility score.

If a new customer label is determined based on the new credibility score, the customer representation is updated based on the new customer label accordingly. If the customer label determined from the new credibility score is consistent with the previously determined customer label, the customer representation is not updated.

In this scheme, the credibility scoring model is continuously updated and adjusted according to the feedback data, so that the resulting customer portrait is more refined and more accurate.

Optionally, in some possible implementations of the present application, the S206 may include S2061 to S2063, which are as follows:

S2061: and setting a first label corresponding to the positive feedback data and a second label corresponding to the negative feedback data.

Illustratively, the feedback data includes positive feedback data and negative feedback data. The positive feedback data is data that confirms the current client tag and each set of target data. The negative feedback data is data that is inconsistent with the current client tag and/or the sets of target data.

Illustratively, the first label corresponding to the positive feedback data is set to 0, and the second label corresponding to the negative feedback data is set to 1.

S2062: and inputting the positive feedback data, the negative feedback data, the first label and the second label into a credibility scoring model for training.

Illustratively, the positive feedback data and the first label corresponding to the positive feedback data are input into the credibility score model for training. And inputting the negative feedback data and a second label corresponding to the negative feedback data into the credibility rating model for training.

The processing of these data by the confidence score model in the training process may refer to the description in S103, and is not described herein again.

S2063: and updating the parameters of the credibility rating model according to the training result to obtain the updated credibility rating model.

Illustratively, when the training data is positive feedback data and the first label corresponding to the positive feedback data, the training result is consistent with the previous training result, and the parameters of the credibility score model are not updated.

When the training data is negative feedback data and the second label corresponding to the negative feedback data, the training result changes, and the parameters of the credibility scoring model are updated (for example, the weight values of the credibility scoring model are adjusted). After this update, when the training data is positive feedback data and the corresponding first label, the training result is again consistent with the previous training result, and the parameters of the credibility scoring model are not updated.

In the above manner, the credibility scoring model continues to be trained through the positive feedback data, the negative feedback data, the first label and the second label, and parameters in the credibility scoring model are updated, so that the scoring function of the credibility scoring model is more and more accurate.
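The update behaviour described above can be illustrated with a mistake-driven sketch. The application does not specify the training algorithm, so the logistic model, the 0.5 decision threshold, the learning rate, and the gradient step below are all assumptions; here the model output is read as the probability that a tag is disputed (label 1).

```python
import math
from typing import List, Sequence, Tuple


def predict(weights: Sequence[float], x: Sequence[float]) -> float:
    """Assumed logistic model: probability that the feedback label is 1."""
    z = sum(w * xi for w, xi in zip(weights, x))
    z = max(-700.0, min(700.0, z))  # clamp to keep math.exp in range
    return 1.0 / (1.0 + math.exp(-z))


def update_on_feedback(
    weights: Sequence[float],
    samples: List[Tuple[Sequence[float], int]],
    learning_rate: float = 0.1,
) -> List[float]:
    """Adjust weights only for samples whose prediction disagrees with the label.

    Each sample is (feature vector, label); label 0 marks positive feedback
    and label 1 marks negative feedback, as in S2061.
    """
    updated = list(weights)
    for x, label in samples:
        p = predict(updated, x)
        if (p >= 0.5) == (label == 1):
            continue  # result consistent with feedback: parameters unchanged
        for k in range(len(updated)):
            # One gradient step on the log-loss for this sample.
            updated[k] -= learning_rate * (p - label) * x[k]
    return updated


unchanged = update_on_feedback([-1.0, 0.0], [([1.0, 0.0], 0)])
changed = update_on_feedback([0.0, 0.0], [([1.0, 0.0], 0)])
```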

Optionally, in some possible implementations of the present application, before performing the method shown in fig. 1, a method of training a natural language processing model may be further included, and the method of training the natural language processing model may include: s301 to S303 are as follows:

S301: and acquiring a sample training set.

Illustratively, the sample training set includes a plurality of sets of sample data, and a sample label corresponding to each set of sample data. The sample data is data information of the client.

Multiple groups of sample data can be collected in a network, a bank database and the like, and a sample label is marked on each group of sample data.

Optionally, a part of the data in the sample training set may be used as a sample test set, to facilitate subsequent testing of the natural language processing model during training. For example, a plurality of sets of sample data and their corresponding sample labels are selected from the sample training set to form the sample test set.

S302: training the initial natural language processing network based on the sample training set, and updating parameters of the initial natural language processing network based on training results.

Exemplarily, each group of sample data of the sample training set is processed through an initial natural language processing network (a natural language processing model before training), so as to obtain an actual tag corresponding to each group of sample data. The specific process of the initial natural language processing network for processing the sample data may refer to the process in S102, which is not described herein again.

And when the preset training times are reached, testing the initial natural language processing network at the moment. Exemplarily, the sample data in the sample test set is input into the initial natural language processing network at this time for processing, and the initial natural language processing network at this time outputs an actual tag corresponding to the sample data. And calculating a loss value between the actual label and a sample label corresponding to the sample data in the sample test set based on a preset loss function.

And when the loss value is detected not to meet the preset condition, adjusting the parameters of the initial natural language processing network, and continuing to train the initial natural language processing network. And when the loss value meets the preset condition, stopping training the initial natural language processing network, and taking the trained initial natural language processing network as a trained natural language processing model.

S303: and when detecting that the loss function corresponding to the initial natural language processing network is converged, obtaining the natural language processing model.

For example, assume that the preset condition is that the loss value is less than or equal to a preset loss value threshold. Then, when the loss value is greater than the loss value threshold, parameters of the initial natural language processing network are adjusted and training of the initial natural language processing network is continued. And when the loss value is less than or equal to the loss value threshold value, stopping training the initial natural language processing network, and taking the trained initial natural language processing network as a trained natural language processing model. The description is given for illustrative purposes only and is not intended to be limiting.

Illustratively, in the process of training the initial natural language processing network, the convergence condition of the loss function corresponding to the initial natural language processing network may also be observed. When the loss function is not converged, adjusting parameters of the initial natural language processing network, and continuing to train the initial natural language processing network based on the sample training set. And when the loss function is converged, stopping training the initial natural language processing network, and taking the trained initial natural language processing network as a trained natural language processing model. Wherein, the convergence of the loss function means that the value of the loss function tends to be stable. The description is given for illustrative purposes only and is not intended to be limiting.
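The two stopping criteria described above, a loss threshold and loss convergence, can be sketched as simple checks on the recorded loss values. The window size, the tolerance, and the function names are hypothetical; the application only states that convergence means the loss value tends to be stable.

```python
from typing import List


def below_threshold(loss: float, threshold: float) -> bool:
    """Preset-condition check: the loss is less than or equal to the threshold."""
    return loss <= threshold


def has_converged(losses: List[float], window: int = 3, tolerance: float = 1e-3) -> bool:
    """Treat the loss as stable when the last `window` changes are all tiny."""
    if len(losses) < window + 1:
        return False
    recent = losses[-(window + 1):]
    return all(abs(recent[i] - recent[i + 1]) < tolerance for i in range(window))


def should_stop(losses: List[float], threshold: float) -> bool:
    """Stop training when either criterion is met."""
    return bool(losses) and (below_threshold(losses[-1], threshold) or has_converged(losses))


history = [1.0, 0.5, 0.2501, 0.2500, 0.2500, 0.2500]
stop_now = should_stop(history, threshold=0.1)
```

In this example the loss never drops below the threshold, but training still stops because the loss has stopped changing.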

In this embodiment, the natural language processing model is trained so that each group of target data can subsequently be processed accurately with the model, which indirectly improves the speed and accuracy of generating the customer portrait.

Optionally, in some possible implementations of the present application, the method for generating a client representation provided by the present application may be applied in the medical field. For example, the method is used to generate portraits of patients, so that doctors can better manage and treat them. This indirectly improves patient satisfaction and helps improve the doctor-patient relationship.

Referring to FIG. 5, FIG. 5 is a diagram illustrating an apparatus for generating a client representation according to an embodiment of the present application. The device comprises units for performing the steps in the embodiments corresponding to fig. 1-4. Please refer to the related description of the embodiments corresponding to fig. 1 to 4. For convenience of explanation, only the portions related to the present embodiment are shown. Referring to fig. 5, it includes:

an acquisition unit 410, configured to acquire data information of a client, where the data information includes a plurality of groups of target data and the source of each group of target data is different;

a first processing unit 420, configured to process each set of the target data through a trained natural language processing model to obtain a client tag corresponding to each set of the target data;

the second processing unit 430 is configured to process, through a trained credibility score model, the client label corresponding to each group of target data to obtain a credibility score of the client label corresponding to each group of target data;

and the generating unit 440 is configured to generate a customer portrait of the customer according to each set of the target data and the credibility score of the customer label corresponding to each set of the target data.
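The cooperation of units 410 to 440 can be sketched as a small pipeline. This is an illustrative wiring only: the class name `PortraitDevice`, the injected callables, and the stub data are hypothetical, and the trained models are replaced by trivial stand-ins.

```python
from typing import Callable, Dict


class PortraitDevice:
    """Hypothetical wiring of units 410 to 440; each unit is an injected callable."""

    def __init__(
        self,
        acquire: Callable[[str], Dict[str, str]],
        tag_model: Callable[[str], str],
        score_model: Callable[[str, str], float],
    ) -> None:
        self.acquire = acquire          # acquisition unit 410
        self.tag_model = tag_model      # first processing unit 420
        self.score_model = score_model  # second processing unit 430

    def generate(self, customer_id: str) -> str:
        """Generating unit 440: return the tag with the highest credibility score."""
        groups = self.acquire(customer_id)  # source -> target data
        tags = {source: self.tag_model(text) for source, text in groups.items()}
        scores = {source: self.score_model(source, tag) for source, tag in tags.items()}
        best_source = max(scores, key=lambda source: scores[source])
        return tags[best_source]


# Trivial stand-ins for the data sources and the trained models.
def fake_acquire(customer_id: str) -> Dict[str, str]:
    return {"administrator": "buys funds monthly", "machine": "asked about loans"}


def fake_tag_model(text: str) -> str:
    return "prefers funds" if "funds" in text else "needs credit"


def fake_score_model(source: str, tag: str) -> float:
    return {"administrator": 0.9, "machine": 0.4}[source]


device = PortraitDevice(fake_acquire, fake_tag_model, fake_score_model)
portrait = device.generate("customer-001")
```

Passing the units in as callables keeps the sketch independent of any particular model implementation.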

Optionally, the first processing unit 420 is specifically configured to:

Optionally, the second processing unit 430 is specifically configured to:

Optionally, the second processing unit 430 is further configured to:

Optionally, the apparatus further comprises:

the feedback data acquisition unit is used for acquiring feedback data of the client;

and the updating unit is used for updating the credibility scoring model according to the feedback data.

Optionally, the updating unit is specifically configured to:

Optionally, the apparatus further comprises a training unit configured to:

Referring to FIG. 6, FIG. 6 is a diagram illustrating an apparatus for generating a client representation according to another embodiment of the present application. As shown in fig. 6, the apparatus 5 of this embodiment includes: a processor 50, a memory 51 and a computer program 52 stored in said memory 51 and executable on said processor 50. The processor 50, when executing the computer program 52, implements the steps in the various method embodiments described above for generating a client representation, such as S101-S104 shown in FIG. 1. Alternatively, the processor 50 implements the functions of the units in the above embodiments, such as the functions of the units 410 to 440 shown in fig. 5, when executing the computer program 52.

Illustratively, the computer program 52 may be divided into one or more units, which are stored in the memory 51 and executed by the processor 50 to accomplish the present application. The one or more units may be a series of computer instruction segments capable of performing specific functions, which are used to describe the execution of the computer program 52 in the device 5. For example, the computer program 52 may be divided into an acquisition unit, a first processing unit, a second processing unit, and a generation unit, each unit functioning specifically as described above.

The device may include, but is not limited to, a processor 50 and a memory 51. Those skilled in the art will appreciate that fig. 6 is merely an example of the device 5 and does not constitute a limitation of the device; the device may include more or fewer components than shown, may combine some components, or may include different components. For example, the device may also include input/output devices, network access devices, buses, and the like.

The processor 50 may be a Central Processing Unit (CPU), another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, and the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.

The memory 51 may be an internal storage unit of the device, such as a hard disk or a memory of the device. The memory 51 may also be an external storage device of the device, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, a flash memory card (Flash Card), and the like provided on the device. Further, the memory 51 may include both an internal storage unit and an external storage device of the device. The memory 51 is used for storing the computer instructions and other programs and data required by the device. The memory 51 may also be used to temporarily store data that has been output or is to be output.

The present application further provides a computer storage medium, which may be non-volatile or volatile, and the computer storage medium stores a computer program, and when the computer program is executed by a processor, the computer program implements the steps in the above-mentioned method embodiments for generating a client representation.

The present application also provides a computer program product for causing a device to perform the steps in the various above-described method embodiments for generating a client representation when the computer program product is run on the device.

An embodiment of the present application further provides a chip or an integrated circuit, where the chip or the integrated circuit includes: a processor for calling and running the computer program from the memory to cause the device on which the chip or integrated circuit is installed to perform the steps of the above-described method embodiments for generating a client representation.

It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-mentioned division of the functional units and modules is illustrated, and in practical applications, the above-mentioned function distribution may be performed by different functional units and modules according to needs, that is, the internal structure of the apparatus is divided into different functional units or modules, so as to perform all or part of the functions described above. Each functional unit and module in the embodiments may be integrated in one processing unit, or each unit may exist alone physically, or two or more units are integrated in one unit, and the integrated unit may be implemented in a form of hardware, or in a form of software functional unit. In addition, specific names of the functional units and modules are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present application. The specific working processes of the units and modules in the system may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.

In the above embodiments, each embodiment is described with its own emphasis. For parts that are not described or illustrated in detail in one embodiment, reference may be made to the related descriptions of the other embodiments.

Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.

The above embodiments are intended only to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be replaced with equivalents; such modifications and substitutions do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present application, and are intended to be included within the scope of protection of the present application.

Claims

1. A method of generating a customer portrait, comprising:

2. The method of claim 1, wherein the processing, by the trained credibility score model, the client label corresponding to each set of the target data to obtain the credibility score of the client label corresponding to each set of the target data comprises:

3. The method of claim 2, wherein said calculating, via said credibility score model, a time-to-date long factor score for each of said customer tags comprises:

4. The method of claim 1, wherein the processing each set of the target data through the trained natural language processing model to obtain the customer label corresponding to each set of the target data comprises:

5. The method of claim 1, wherein after processing the customer label corresponding to each set of the target data through the trained credibility score model to obtain the credibility score of the customer label corresponding to each set of the target data, the method further comprises:

collecting feedback data to the customer;

and updating the credibility score model according to the feedback data.

6. The method of claim 5, wherein the feedback data includes positive feedback data and negative feedback data, and wherein updating the credibility score model based on the feedback data comprises:

7. The method of any one of claims 1 to 6, wherein before processing each set of the target data through the trained natural language processing model to obtain the customer label corresponding to each set of the target data, the method further comprises:

8. An apparatus for generating a customer portrait, comprising:

9. An apparatus for generating a customer portrait, comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor implements the method of any one of claims 1 to 7 when executing the computer program.

10. A computer-readable storage medium storing a computer program which, when executed by a processor, implements the method of any one of claims 1 to 7.