CN112131475B - Interpretable and interactive user portrayal method and device - Google Patents


Info

Publication number
CN112131475B
CN112131475B (publication) · CN202011024688.9A (application)
Authority
CN
China
Prior art keywords
user
feedback
label
labels
portrait
Prior art date
Legal status
Active
Application number
CN202011024688.9A
Other languages
Chinese (zh)
Other versions
CN112131475A (en)
Inventor
郑驰
蔡苗
夏燕
张金凤
Current Assignee
Chongqing University of Post and Telecommunications
Original Assignee
Chongqing University of Post and Telecommunications
Priority date
Filing date
Publication date
Application filed by Chongqing University of Post and Telecommunications filed Critical Chongqing University of Post and Telecommunications
Priority to CN202011024688.9A priority Critical patent/CN112131475B/en
Publication of CN112131475A publication Critical patent/CN112131475A/en
Application granted granted Critical
Publication of CN112131475B publication Critical patent/CN112131475B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/216 Parsing using statistical methods
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The application relates to an interpretable and interactive user portrait method and device, and belongs to the technical field of computers. First, a user portrait tag system is constructed in an interpretable way; Hive is then used to store the data related to the user portrait tags; ECharts is used to make the user portrait visible to the user and to collect user feedback; the user's opinions, expressed through adjustments of the portrait, are fed back to the system for optimization; finally, anti-discrimination and accuracy tests are used to check the performance of the user portrait. By constructing the user portrait tags in an interpretable way, the method improves the understandability of the user portrait, supports the user in adjusting the portrait, protects the user's right to know, autonomy and privacy, and can also prevent problems such as big-data price discrimination against regular customers and regional discrimination.

Description

Interpretable and interactive user portrayal method and device
Technical Field
The application belongs to the technical field of computers, and relates to an interpretable and interactive user portrait method and device.
Background
For user attribute judgment, behavior prediction and risk assessment, data must be collected and analyzed for associations, so as to acquire new knowledge, optimize processes and improve decision-making. User portrait technology can describe user characteristics, mine potentially valuable information, improve decision-making, and support precise services and modern governance. Building a user portrait generally requires constructing a user portrait tag system. A prior patent specification discloses a privacy-preserving user portrait generation method which, when clustering a user's tag data set, processes and protects the count value of each rectangular unit of the tag data, thereby protecting user privacy.
However, even if privacy is protected by this method, the user's information autonomy, right to know, and right to equal treatment can still be impaired when the user portrait is applied. The user portrait is generally not disclosed to the user: the user directly receives personalized recommendations and risk-assessment results based on the portrait, without actively participating in, interacting with, or supervising its construction. Unaware of the portrait, the user can only passively accept the recommendation and assessment results, which infringes the user's autonomy in selecting information, and the accuracy of the portrait is difficult to verify directly. An inaccurate user portrait degrades the user experience and reduces user stickiness, and unfair portrait rules can even cause social problems such as regional discrimination and racial discrimination.
Disclosure of Invention
Accordingly, the present application is directed to an interpretable and interactive user portrait method and apparatus.
In order to achieve the above purpose, the present application provides the following technical solutions:
an interpretable, interactive user portrayal method, the method comprising:
s10: constructing a user portrait tag according to an interpretable method;
s20: storing user portrait tag related data by using Hive;
s30: the ECharts is utilized to enable the user portrait to be visual and feedback for the user;
s40: according to the adjustment of the user image, feeding back user opinion to the system for optimization;
s50: and checking the performance of the user portrait by adopting anti-discrimination and accuracy tests.
Optionally, the step S10 specifically includes:
determining the types of the used labels, including statistics type labels, rule type labels and mining type labels;
when constructing the user portrait, performing natural language interpretation on the labels of the user portrait, including interpretation on label categories, label data sources and label reasoning rules;
the proportion of the labels is determined to be 50% of the statistics type labels, 30% of the rule type labels and 20% of the mining type labels according to the interpretation difficulty.
Optionally, among the tag categories determined for use, the mining-type tags mine data using the latent factor model (LFM) and TF-IDF, and a Spark task is submitted for calculation.
Optionally, the step S20 specifically includes:
establishing a Hive user tag table, and determining the name, content and explanation column of the tag;
the calculated user tag vector values are inserted into the content column of the Hive user tag table, and the natural-language interpretation of each tag is written into the interpretation column of the table.
Optionally, the step S30 specifically includes:
introducing an ECharts file, and designating to use a radar chart;
the indicator of the radar chart is a user label stored in the Hive data warehouse, and the data of the radar chart is the score obtained by each user on the corresponding user label;
setting an expansion field on the axis of the radar chart, wherein the keyword of the expansion field is 'interpretation', the content is corresponding interpretation, and the expansion field is derived from an interpretation column in a Hive user tag table;
setting a click event on the radar chart indicator, wherein a message list appears after clicking, and a user can input objection and other feedback of the user portrait;
the feedback interface is laid out using LinearLayout and the feedback box is specified to fit the screen.
Optionally, the step S40 specifically includes:
using a python tool to perform text word segmentation processing on feedback opinions of the user;
extracting subject terms from the feedback opinion based on a TF-IDF algorithm to obtain a feedback data file of the user;
based on the feedback data, adjusting a label vector value of the user, and updating the user portrait;
and recommending based on the updated user portrait.
Optionally, the step S50 specifically includes:
testing the anti-discrimination, accuracy and feedback mechanism of the user portrait, wherein equal-size samples of the ordinary population and the vulnerable population are drawn, the average differences between the two populations on sensitive tags such as the violence-index, credit-index and crime-likelihood-index tags are compared, and the anti-discrimination of the user portrait with respect to the vulnerable population is thereby tested;
using cross-validation to test the accuracy of the statistics-type and rule-type tags, and using sampling validation to test the accuracy of the mining-type tags;
classifying the feedback content of users, drawing a sample of feedback users from each category, and dividing each category into groups A and B, where group A is the activity level before user feedback and group B is the activity level after user feedback; comparing groups A and B tests the operating effect of the feedback mechanism.
An interpretable, interactive user representation apparatus, the apparatus comprising:
the user portrait interpretation module, configured to perform natural-language interpretation of the user portrait tags when the portrait is constructed, and comprising an interpretation content unit and a proportion unit; the interpretation content unit performs natural-language interpretation of the tags, covering three items: tag category, tag data source and tag reasoning rule, and determines the tag categories used, including statistics-type, rule-type and mining-type tags; the proportion unit sets, according to interpretation difficulty, 50% statistics-type tags, 30% rule-type tags and 20% mining-type tags;
the user portrait storage module is used for establishing a Hive user tag table and storing the names, contents and interpretations of the tags;
the user-oriented visualization and feedback module, configured to build an ECharts radar chart so that the user portrait and its interpretation are visible to the user, and to support the user in submitting objections and other feedback on the portrait;
the user portrait optimization module is used for adjusting the label vector value of the user by using a python tool and a TF-IDF algorithm according to the feedback of the user to the user portrait and updating the user portrait;
the verification module, configured to test the anti-discrimination, accuracy and feedback mechanism of the user portrait, and comprising an anti-discrimination verification unit, an accuracy verification unit and a feedback-mechanism verification unit; the anti-discrimination verification unit tests anti-discrimination via the average difference between the ordinary population and the vulnerable population on sensitive tags; the accuracy verification unit tests the accuracy of statistics-type and rule-type tags with cross-validation and the accuracy of mining-type tags with sampling validation; the feedback-mechanism verification unit tests the effect of the feedback mechanism by comparing groups A and B.
An electronic device having stored therein computer program instructions which, when read and executed by a processor, perform the steps of the method of any of claims 1 to 7.
The application has the beneficial effects that:
the interpretable and interactive user portrait method provided by the application has the advantages that the user portrait label system with strong interpretability is constructed, the user portrait and the interpretation thereof are visible to the user, the user's right of knowledge is protected, and the user can understand the reason for decision making conveniently. Meanwhile, through supporting user feedback and adopting optimization based on feedback, the user can adjust the portrait result, on one hand, the information autonomy of the user is protected, the user is not trapped in an information cocoon room, and meanwhile, the user can monitor the possible problems of discrimination and the like in the portrait and avoid risks in advance; on the other hand, the personalized pushing and accurate service effects can be improved through interaction, and the cost is reduced.
Additional advantages, objects, and features of the application will be set forth in part in the description which follows and in part will become apparent to those having ordinary skill in the art upon examination of the following or may be learned from practice of the application. The objects and other advantages of the application may be realized and obtained by means of the instrumentalities and combinations particularly pointed out in the specification.
Drawings
For the purpose of making the objects, technical solutions and advantages of the present application more apparent, the present application will be described in detail below with reference to the accompanying drawings, in which:
FIG. 1 is a flow chart of an interpretable, interactive user representation method provided by a first embodiment of the present application;
FIG. 2 is a specific flow chart for constructing a user portrait tag according to an interpretable method according to a first embodiment of the present application;
FIG. 3 is a flowchart showing a user-oriented visualization and feedback of user portraits using ECharts according to a first embodiment of the present application;
FIG. 4 is a flowchart showing an embodiment of a reverse discrimination test for checking user image performance using reverse discrimination and accuracy tests according to the first embodiment of the present application;
FIG. 5 is a block diagram of an interpretable, interactive user representation device according to a second embodiment of the present application;
fig. 6 is a schematic block diagram of an electronic device according to a third embodiment of the present application.
Reference numerals: 100-interpretable and interactive user portrayal device; 110-a user portrayal interpretation module; 120-user portrayal storage module; 130-user-oriented visualization and feedback module; 140-user portrayal optimization module; 150-a verification module; 200-an electronic device; 201-CPU; 202-memory; 203-an input device; 204-output means.
Detailed Description
Other advantages and effects of the present application will become apparent to those skilled in the art from the following disclosure, which describes the embodiments of the present application with reference to specific examples. The application may be practiced or carried out in other embodiments that depart from the specific details, and the details of the present description may be modified or varied without departing from the spirit and scope of the present application. It should be noted that the illustrations provided in the following embodiments merely illustrate the basic idea of the present application, and the following embodiments and the features in the embodiments may be combined with each other without conflict.
Wherein the drawings are for illustrative purposes only and are shown in schematic, non-physical, and not intended to limit the application; for the purpose of better illustrating embodiments of the application, certain elements of the drawings may be omitted, enlarged or reduced and do not represent the size of the actual product; it will be appreciated by those skilled in the art that certain well-known structures in the drawings and descriptions thereof may be omitted.
The same or similar reference numbers in the drawings of embodiments of the application correspond to the same or similar components; in the description of the present application, it should be understood that, if there are terms such as "upper", "lower", "left", "right", "front", "rear", etc., that indicate an azimuth or a positional relationship based on the azimuth or the positional relationship shown in the drawings, it is only for convenience of describing the present application and simplifying the description, but not for indicating or suggesting that the referred device or element must have a specific azimuth, be constructed and operated in a specific azimuth, so that the terms describing the positional relationship in the drawings are merely for exemplary illustration and should not be construed as limiting the present application, and that the specific meaning of the above terms may be understood by those of ordinary skill in the art according to the specific circumstances.
First embodiment
The present inventors have found that user portraits are used in numerous scenarios, such as personalized recommendation, credit scoring, and sentencing evaluation. Because user portraits analyze large amounts of user data to obtain predictive and judgmental knowledge, embodied as the series of tags forming the portrait, institutions can make more scientific and fine-grained decisions based on portraits covering that knowledge, so user portraits are widely applied. For example, if a user frequently browses automobile videos, the video website judges that the user has a preference for automobiles, marks the user with the preference tag "likes watching automobiles", and recommends more automobile-related videos based on the user preference portrait. However, existing user portraits are mainly observed and used inside an organization and are not disclosed to users; the users themselves lack active participation, interaction and supervision in the construction of the portrait, so the accuracy of the portrait is hard to verify directly. An inaccurate portrait may degrade the user experience and reduce user stickiness, and an unfair portrait may even infringe the user's right to equal treatment. To solve the above problems, a first embodiment of the present application provides an interpretable and interactive user portrait method and apparatus.
Referring to fig. 1, fig. 1 is a flowchart of the interpretable and interactive user portrait method according to the first embodiment of the present application. The specific steps of the method are as follows:
step S10: the user portrayal tag is constructed in an interpretable way.
Step S20: hive is used to store relevant data such as user portrait tags.
Step S30: user portraits are visualized and feedback-enabled to the user using ECharts.
Step S40: and feeding back user opinion to the system for optimization according to the user selection and adjustment of the user portrait.
Step S50: and checking the performance of the user portrait by adopting anti-discrimination and accuracy tests.
For step S10, a user portrait tag is constructed in an interpretable way. The user portrait label is constructed according to an interpretable method, natural language interpretation is carried out on the label of the user portrait when the user portrait is constructed, and the proportion of the label is determined according to the difficulty of interpretation. As an alternative implementation manner, please refer to fig. 2, fig. 2 is a specific flowchart of a step of constructing a user portrait tag according to an interpretable method according to a first embodiment of the present application. Specifically, constructing user portrait tag S10 in an interpretable manner may include the sub-steps of:
s11, the method determines the tag types used, including statistics-type tags, rule-type tags and mining-type tags. A statistics-type tag is obtained by quantitative calculation over the data and objectively describes the user; a rule-type tag is designed according to platform business needs with manually set rules, and may therefore carry discrimination; a mining-type tag is obtained by data mining and automatic machine learning, and suffers from the black-box problem.
S12, interpreting the user portrait tags, namely interpreting the tag category, the tag data source and the tag reasoning rule. For example, the tag "shopped 3 times in the last week" is interpreted as a statistics-type tag. The tag data come from the user's shopping behavior: records whose payment date falls within the 7 days up to and including yesterday are extracted from the user log and counted. The reasoning rule is that a statistics-type tag objectively describes the user; "shopped 3 times in the last week" is obtained by summing the purchase count, i.e. the user made 3 purchases in the last week. Rule-type tags are likewise interpreted by tag category, tag data source and tag reasoning rule. For example, a user whose cumulative consumption in the last week reached 6,100 yuan is given the "high-rate consumption user" tag, interpreted as a rule-type tag. A rule-type tag is designed according to platform business needs with manually set rules; the "high-rate consumption user" tag indicates strong consumption capacity, and the tag data come from the user's shopping behavior. The reasoning rule extracts, from the user's order list, cumulative consumption amounts whose payment date falls within the 7 days up to and including yesterday, and adds the "high-rate consumption user" tag to users whose cumulative consumption in the last week exceeds 5,000 yuan.
As for the rule that a cumulative consumption of more than 5,000 yuan in the last week qualifies a user as a "high-rate consumption user", the 5,000-yuan standard is set by the business side according to consumption trends: the weekly consumption of all users is averaged, and the mean consumption of the top fifth of users is taken, yielding a weekly consumption standard of 5,000 yuan; the rule that a cumulative consumption of more than 5,000 yuan in the last week marks a "high-rate consumption user" is thereby determined.
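The rule-type tag just described can be sketched in a few lines of Python; the time window, the 5,000-yuan threshold and the field layout of `orders` follow the text above, while the concrete user IDs, dates and amounts are illustrative assumptions.

```python
from datetime import date, timedelta

def high_rate_consumption_users(orders, today, threshold=5000):
    """Return users earning the 'high-rate consumption user' tag:
    cumulative spend over the 7 days up to and including yesterday
    exceeding the threshold. `orders` holds (user_id, pay_date, amount)."""
    start = today - timedelta(days=7)
    yesterday = today - timedelta(days=1)
    totals = {}
    for user_id, pay_date, amount in orders:
        if start <= pay_date <= yesterday:
            totals[user_id] = totals.get(user_id, 0) + amount
    return {u for u, t in totals.items() if t > threshold}

# hypothetical order rows: (user_id, payment date, amount in yuan)
orders = [
    ("u1", date(2020, 9, 20), 3000),
    ("u1", date(2020, 9, 23), 3100),  # u1 totals 6100 inside the window
    ("u2", date(2020, 9, 22), 1200),
    ("u3", date(2020, 9, 10), 9000),  # outside the 7-day window
]
print(high_rate_consumption_users(orders, today=date(2020, 9, 25)))  # → {'u1'}
```

With these sample orders, only u1 crosses the 5,000-yuan line inside the window, matching the 6,100-yuan example in the text.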
S13, to ensure the interpretability of the results, the method fixes the proportions of the three tag types at 50% statistics-type, 30% rule-type and 20% mining-type. The interpretability of statistics-type, rule-type and mining-type tags decreases in that order, so their proportions decrease accordingly. Statistics-type tags are obtained by counting objective data; rule-type tags involve the designer's subjective choices; mining-type tags involve machine learning. Rule-type tags are designed by developers according to business needs and retain some interpretability, but carry the risk of developer bias: for example, if a developer selects ethnicity as a reference factor for the violence-index tag without giving an appropriate reason, the violence index of certain ethnic groups may be scored too high, which is unfair. Because mining-type tags involve machine learning, they carry a "black box" risk: neither developer nor user can understand the system's operation or results, deviations are hard to discover and correct, and the user's rights to know, object, correct and delete are hard to exercise. The proportions of the three tag types are therefore set to decrease in order (50%, 30%, 20%), avoiding portrait results that are misinterpreted or hard to interpret.
The data mining technique involved in the mining-type tags of S14 of the method is the latent factor model (LFM). The LFM is used to mine the user's implicit interests. The formula used is:

preference(u, i) = Σ_f p_uf · q_if

where p_uf measures the relationship between user u's interest (or risk, etc.) tag and hidden factor f, and q_if measures the relationship between item i and hidden factor f. The two parameters are obtained by learning on a training set containing positive and negative samples, finally yielding the user's portrait with respect to i.
S14, interpreting the mining-type tags, namely interpreting the tag category, the tag data source and the tag reasoning rule. For example, the user viewing-preference tag "comedy" is interpreted as a mining-type tag, meaning the user has a preference for comedy. The tag data come from historical data such as the user's viewing records. The reasoning rule is that a mining-type tag is obtained automatically by data mining and machine-learning techniques; the user's implicit interests, i.e. hidden factors, are mined with the LFM. The formula used is:

preference(u, i) = Σ_f p_uf · q_if

where p_uf measures the relationship between user u's interest and hidden factor f, and q_if measures the relationship between film i and hidden factor f. The two parameters are obtained by learning on a training set containing positive and negative samples, finally yielding the user's degree of preference for each film. With hidden-factor values ranging from 0 to 10, the user's preference for comedy is 9, while other genres such as horror and action score 4 and 3, so the user receives the viewing-preference tag "comedy".
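The LFM prediction step reduces to a dot product over hidden factors. A minimal sketch follows; the factor vectors are invented for illustration, and the learning of p and q from positive and negative samples is omitted.

```python
def lfm_predict(p_u, q_i):
    """LFM preference of user u for item i: sum over hidden factors f
    of p_uf * q_if. In the method both vectors would be learned from a
    training set containing positive and negative samples."""
    return sum(pf * qf for pf, qf in zip(p_u, q_i))

# hypothetical learned vectors over F = 3 hidden factors
p_u      = [0.9, 0.1, 0.2]  # user u's affinity to each hidden factor
q_comedy = [0.8, 0.0, 0.1]  # comedy's loading on each hidden factor
q_horror = [0.1, 0.2, 0.9]

print(lfm_predict(p_u, q_comedy))  # higher score: stronger comedy preference
print(lfm_predict(p_u, q_horror))
```

Ranking the scores across genres is what yields the viewing-preference tag (here, "comedy" wins).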
For step S20: Hive is used to store the data related to the user portrait tags. Hive is a Hadoop-based data-warehouse tool used for extracting, transforming and loading data; it can store, query and analyze large-scale data stored in Hadoop. The Hive data-warehouse tool can map a structured data file onto a database table, provides SQL query functions, and converts SQL statements into MapReduce tasks for execution. As an alternative embodiment, the application uses Hive as the data warehouse for the data related to the user portrait tags. First, a Hive user tag table is created and the tag name, content and interpretation columns are determined; data related to the tags are extracted from the user log table and a Spark task is submitted for calculation; the calculated vector values are inserted into the Hive table, and the natural-language interpretation of each tag is written into the interpretation column of the user tag table.
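As an illustrative stand-in, the shape of such a user tag table can be sketched as follows. SQLite replaces Hive here purely so the sketch is self-contained; in the method itself this would be HiveQL against the Hadoop warehouse, and the table and column names are assumptions, not taken from the patent.

```python
import sqlite3

# SQLite stands in for Hive for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE user_tag (
        user_id        TEXT,
        tag_name       TEXT,
        tag_value      REAL,  -- calculated tag vector value
        interpretation TEXT   -- natural-language explanation of the tag
    )""")
conn.execute(
    "INSERT INTO user_tag VALUES (?, ?, ?, ?)",
    ("u1", "weekly_purchases", 3.0,
     "Statistics-type tag: 3 purchases counted from the last week of shopping logs."))
row = conn.execute(
    "SELECT tag_value, interpretation FROM user_tag WHERE user_id = 'u1'"
).fetchone()
print(row)
```

The interpretation column is what the visualization layer later surfaces as the expansion field of the radar chart.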
For step S30: ECharts is used to make the user portrait visible to the user and to collect feedback. Visualization guarantees the user's right to know, and feedback channels guarantee the right to object; through the interactivity of the system, stakeholders can learn about the portrait and respond to it, discover potential harm to their rights and interests, and solve problems before damage occurs, reducing costs for all parties. In one embodiment, the user portrait is presented to the user as a radar chart: each vertex of the chart displays a corresponding user portrait tag, and expanding a tag shows an interpretation page, i.e. the natural-language interpretation of the tag. Each tag can be clicked; after clicking, the user can enter objections and other feedback on the portrait and submit them to the system. As an alternative implementation, please refer to fig. 3, which is a specific flowchart of using ECharts to make the user portrait visible to the user and to collect feedback, according to the first embodiment of the present application. Specifically, the user-facing visualization and feedback step S30 may comprise the following sub-steps:
s31, the user portrait is realized as a radar chart using ECharts. First, the ECharts file is imported and a div container with width and height set is prepared for it; then the configuration items and data of the chart are specified, including the chart title and the positions, widths, heights and orientations of the legend component and the radar coordinate-system component.
S32, the indicators of the radar chart are the user tags stored in the database, and the data of the radar chart are the scores each user obtains on the corresponding tags.
S33, an expansion field is set on the axes of the radar chart; the keyword of the expansion field is "interpretation" and its content is the corresponding interpretation.
S34, setting a click event on the radar chart indicator, wherein a message list appears after clicking, and a user can input objection and other feedback of the user portrait.
S35, the feedback interface is laid out with LinearLayout, and the feedback box is specified to adapt to the screen so that it is never covered by the soft keyboard. A user feedback table is created in the data warehouse, containing the five fields "user id", "user nickname", "feedback content", "feedback date" and "feedback time", with corresponding columns "user_id", "user_name", "user_content", "modification_date" and "modification_time"; user feedback is stored in this table.
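The radar-chart configuration of S31 to S33 can be sketched as the ECharts `option` object, built here as a Python dict and serialized to JSON. The tag names, scores and the `interpretation` key (standing in for the expansion field described above) are illustrative assumptions.

```python
import json

# hypothetical user tags read from the tag table, one per radar axis
tags = [
    {"name": "activity", "max": 10,
     "interpretation": "Statistics-type tag: counted from usage logs."},
    {"name": "credit", "max": 10,
     "interpretation": "Rule-type tag: derived from payment history."},
    {"name": "comedy preference", "max": 10,
     "interpretation": "Mining-type tag: LFM hidden-factor score."},
]
option = {
    "title": {"text": "User portrait"},
    "radar": {"indicator": tags},  # one indicator (axis) per user tag
    "series": [{
        "type": "radar",
        "data": [{"value": [7, 8, 9], "name": "u1"}],  # u1's tag scores
    }],
}
print(json.dumps(option)[:60])  # JSON handed to the ECharts front end
```

A click handler on the indicators would then open the message list of S34 and write the user's objection into the feedback table of S35.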
For step S40: the user's opinions, expressed through selections and adjustments of the portrait, are fed back to the system for optimization. The user's feedback opinions must be converted into a data basis for adjusting the portrait, which can be done with text word segmentation: segmentation is the process of recombining a continuous character sequence into a word sequence according to certain specifications. Using statistical probabilities built from a corpus, the segmentation with the maximal joint probability, i.e. the optimal segmentation, can be found for a new sentence by computing the joint probability of each candidate segmentation. In one implementation, the user's feedback is segmented with a python tool, and subject terms are extracted from the feedback with the TF-IDF algorithm, yielding the user's feedback data file. python is a cross-platform computer programming language, a high-level scripting language combining interpreted, compiled, interactive and object-oriented features. The TF-IDF algorithm measures the importance of a word by term frequency; in statistical terms, it assigns each word an importance weight on top of its frequency. After the feedback data file is obtained, the feedback text is represented as word vectors according to term frequency, the user's tag vector values are adjusted accordingly, and the user's portrait is updated.
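A minimal TF-IDF subject-term extraction over already-segmented feedback can be sketched as follows. The toy corpus is invented; in the method itself a python segmentation tool would first split the (Chinese) feedback text into the token lists used here.

```python
import math
from collections import Counter

def tfidf_keywords(docs, doc_index, top_k=2):
    """Rank the terms of one tokenized feedback document by TF-IDF:
    term frequency in the document times log inverse document frequency."""
    n = len(docs)
    df = Counter()                      # document frequency per term
    for doc in docs:
        df.update(set(doc))
    tf = Counter(docs[doc_index])
    total = len(docs[doc_index])
    scores = {t: (tf[t] / total) * math.log(n / df[t]) for t in tf}
    return [t for t, _ in sorted(scores.items(), key=lambda kv: -kv[1])[:top_k]]

# hypothetical already-segmented feedback texts
docs = [
    ["portrait", "label", "wrong", "comedy", "dislike"],
    ["recommend", "too", "many", "car", "videos"],
    ["portrait", "label", "good"],
]
print(tfidf_keywords(docs, 0))
```

The extracted subject terms (here, objection words like "wrong") are what drive the adjustment of the corresponding tag vector values.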
For step S50: the performance of the user portrait is checked with anti-discrimination and accuracy tests — that is, the anti-discrimination, the accuracy, and the feedback mechanism of the user portrait are each tested. For accuracy, cross-validation is used to test the statistics-class and rule-class labels; the cross-validation should be preceded by a basic acceptance check of magnitudes and ratios. To further verify accuracy, reverse derivation is used: starting from one or more labels, category and exact searches locate the group of users carrying the same label or labels, after which their subsequent purchasing behavior is analyzed and derived backwards. If the result is consistent with the idea behind creating the label, the label design is correct and the user portrait model is accurate. For mining-class labels, after the user portrait system is built, sampling verification is used: a subset of experimental users is invited to participate in activities of a specific category, the consistency between their behavior and their labels is observed, and the experimental results are analyzed to further optimize the system-building process. To test the feedback mechanism, user feedback content is classified, samples of feedback users are drawn from each category, and each category is divided into two groups, A and B, where group A is the activity level before user feedback and group B the activity level after user feedback. If, comparing the two groups, the activity has changed, the user portrait feedback mechanism is shown to be effective.
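The A/B test of the feedback mechanism can be sketched as follows. The function name and the 5% effectiveness threshold are illustrative assumptions; the grouping (activity before vs. after feedback) follows the embodiment.

```python
from statistics import mean

def feedback_effect(before, after, threshold=0.05):
    """A/B check of the feedback mechanism described in S50.

    before: activity scores of the sampled users prior to feedback (group A)
    after:  activity scores of the same category after feedback (group B)
    Returns (relative_change, effective); the mechanism is judged
    effective when mean activity changed by more than `threshold`
    (the threshold value is an assumption, not from the embodiment).
    """
    a, b = mean(before), mean(after)
    change = (b - a) / a
    return change, abs(change) > threshold
```

A category whose post-feedback activity rises or falls noticeably relative to its pre-feedback baseline would thus be counted as evidence that the feedback loop is working.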
As an alternative implementation, please refer to fig. 4, which is a specific flowchart of the anti-discrimination test for the user portrait according to the first embodiment of the present application. Specifically, testing the anti-discrimination of the user portrait may comprise the following sub-steps:
S51: sample equal numbers of users from the normal group and the vulnerable group, and query their user portrait labels;
S52: calculate the averages h1, h2, h3, etc. of the normal group's sensitive indices, such as the violence index, credit index, and crime likelihood;
S53: calculate the averages H1, H2, H3, etc. of the vulnerable group's sensitive indices, such as the violence index, credit index, and crime likelihood;
S54: compare each h with the corresponding H; if the gap is too large, the user portrait system may be discriminatory.
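Sub-steps S51–S54 can be sketched as a per-index comparison of group averages. The index names, scores, and the gap tolerance below are illustrative assumptions; the structure (equal samples, averages h_i vs. H_i, flag when the gap is too large) follows the sub-steps.

```python
from statistics import mean

def discrimination_gap(normal, vulnerable, tolerance=0.1):
    """Compare per-index averages of two equal-sized group samples (S51-S54).

    normal / vulnerable: dicts mapping a sensitive index name (e.g.
    'violence', 'credit', 'crime_likelihood') to the sampled scores.
    Returns the indices whose average gap |H_i - h_i| exceeds
    `tolerance` (the tolerance value is an assumption); a non-empty
    result suggests the portrait system may be discriminatory.
    """
    flagged = {}
    for name in normal:
        h = mean(normal[name])        # h1, h2, h3, ... (normal group)
        H = mean(vulnerable[name])    # H1, H2, H3, ... (vulnerable group)
        if abs(H - h) > tolerance:
            flagged[name] = (h, H)
    return flagged
```

An empty result means no sensitive index differs markedly between the groups; each flagged index would then be traced back to its label reasoning rule.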
Second embodiment
To complement the interpretable and interactive user portrait method provided by the first embodiment of the present application, a second embodiment of the present application further provides an interpretable and interactive user portrait device.
Referring to fig. 5, fig. 5 is a block diagram of an interpretable and interactive user portrait device according to a second embodiment of the present application.
The interpretable, interactive user portrait device 100 includes a user portrait interpretation module 110, a user portrait storage module 120, a user-oriented visualization and feedback module 130, a user portrait optimization module 140, and a verification module 150.
The user portrait interpretation module 110 is used for performing natural language interpretation on the labels of the user portrait when the portrait is constructed, covering three items: the label category, the label data source, and the label reasoning rule. It also determines the label types used — statistics-class, rule-class, and mining-class labels — and sets the label proportions by interpretation difficulty to 50% statistics-class, 30% rule-class, and 20% mining-class.
The user portrait storage module 120 is used for creating a Hive user tag table and storing the names, contents and interpretations of the tags.
The user-oriented visualization and feedback module 130 is configured to build an ECharts radar chart whose data are the user's label scores; the expansion field on each radar-chart axis is the corresponding interpretation, drawn from the interpretation column of the Hive user tag table. A click event is set on the radar-chart indicators: after a click, a message list appears in which the user can enter objections and other feedback on the user portrait.
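The configuration this module hands to ECharts can be sketched as the following Python dict mirroring an ECharts radar `option` object. The label names, scores, and interpretation texts are illustrative; the `interpretation` key on each indicator is the custom expansion field described in the embodiment, not a built-in ECharts field.

```python
import json

# Illustrative labels, scores, and interpretations; in the device these
# would be read from the Hive user tag table.
tags = [
    ("shopping preference", 82, "derived from purchase records (rule-class label)"),
    ("activity", 67, "login-frequency statistics (statistics-class label)"),
    ("price sensitivity", 45, "mined with LFM/TF-IDF (mining-class label)"),
]

option = {
    "radar": {
        "indicator": [
            # 'interpretation' is the expansion field on each axis
            {"name": name, "max": 100, "interpretation": expl}
            for name, _, expl in tags
        ],
    },
    "series": [{
        "type": "radar",
        "data": [{
            "value": [score for _, score, _ in tags],
            "name": "user portrait",
        }],
    }],
}

# Serialized for handing to the ECharts setOption call on the page.
echarts_option_js = json.dumps(option, ensure_ascii=False)
```

The front end would bind a click handler to the indicators and open the feedback message list, passing the clicked indicator's `interpretation` text along.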
The user portrait optimization module 140 is used for obtaining the user's feedback data file with a Python tool and the TF-IDF algorithm according to the user's feedback on the portrait, and for adjusting the user's label vector values based on the feedback data to update the user portrait.
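The tag-vector adjustment this module performs can be sketched as a damped update from feedback keyword weights. The signed-weight convention (negative when the user objects to a label) and the damping rate are assumptions for illustration; the embodiment only specifies that feedback-derived data adjusts the tag vector values.

```python
def update_tag_vector(tag_vector, feedback_weights, rate=0.2):
    """Adjust the user's label vector from feedback keyword weights (S40).

    tag_vector: dict mapping label name -> current score.
    feedback_weights: dict mapping label name -> signed weight derived
    from the TF-IDF keywords of the feedback (assumption: negative
    when the user objects to a label, positive when they endorse it).
    rate: damping factor so a single feedback cannot swing a score
    too far (the value 0.2 is an assumption).
    """
    return {
        label: score + rate * feedback_weights.get(label, 0.0)
        for label, score in tag_vector.items()
    }
```

Labels not mentioned in the feedback keep their scores; the updated vector is then written back to the Hive user tag table and the radar chart re-rendered.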
Further, considering that the established interpretation system and the optimized user portrait may not accurately reflect the user's characteristics, and that the label reasoning rules may be improperly designed, the interpretable and interactive user portrait device should further comprise a verification module for verifying whether the user portrait is accurate and whether any discrimination exists.
The verification module comprises an anti-discrimination verification unit, an accuracy verification unit, and a feedback-mechanism verification unit. The anti-discrimination verification unit tests for discrimination via the difference in average sensitive-label indices between the normal group and the vulnerable group. The accuracy verification unit tests the accuracy of statistics-class and rule-class labels with cross-validation and the accuracy of mining-class labels with sampling verification; the feedback-mechanism verification unit tests the effect of the feedback mechanism with an A/B group comparison.
It will be clear to those skilled in the art that, for convenience and brevity of description, reference may be made to the corresponding procedure in the foregoing method for the specific working procedure of the apparatus described above, and this will not be repeated here.
Third embodiment
To implement the interpretable and interactive user portrait method described above, a third embodiment of the present application provides an electronic device 200. Referring to fig. 6, fig. 6 is a schematic diagram of an electronic device according to a third embodiment of the present application.
Fig. 6 is a schematic structural diagram of an electronic device according to a third embodiment of the present application, where, as shown in fig. 6, the electronic device includes a CPU, a memory, an input device, and an output device.
The memory stores a program controlling the operation of the computer; the memory may be, but is not limited to, Random Access Memory (RAM) or Read-Only Memory (ROM). The memory stores the program instructions or modules corresponding to the interpretable and interactive user portrait in the embodiments of the application — for example, the modules of the interpretable and interactive user portrait device: the user portrait interpretation module, the user portrait storage module, the user-oriented visualization and feedback module, the user portrait optimization module, and the verification module.
The CPU comprises a control unit, an Arithmetic Logic Unit (ALU), registers (small memory areas implemented with D flip-flops), and a program counter. During execution, the control unit fetches program instructions or modules from memory, using the program counter to determine their location. An instruction or module is decoded into a form the ALU understands, the operands it requires are fetched from memory and placed into registers, the ALU executes the instruction or module, and the result is written back to memory.
The input device is used for inputting data into the system, realizing interaction between the user and the system. Signal conversion between the input device and the bus is realized through an external interface. The input device may be, but is not limited to, a mouse, keyboard, or touch screen.
The output device is used for outputting system data to the user, likewise realizing interaction between the user and the system. Signal conversion between the output device and the bus is realized through an external interface. The output device may be, but is not limited to, a display, speakers, or printer. In this embodiment, the input/output devices may adopt an interrupt-driven input/output control scheme, which communicates through an intermediate interrupt controller: when a process wants to start an input/output device, the CPU issues an input/output command to the controller and immediately returns to its original task, and the controller then drives the specified device according to the command's requirements.
It is to be understood that the configuration shown in fig. 6 is illustrative only, and that the electronic device 200 may also include more or fewer components than those shown in fig. 6, or have a different configuration than that shown in fig. 6. The components shown in fig. 6 may be implemented in hardware, software, or a combination thereof.
It will be clear to those skilled in the art that, for convenience and brevity of description, reference may be made to the corresponding procedure in the foregoing method for the specific working procedure of the apparatus described above, and this will not be repeated here.
In summary, the embodiments of the application provide an interpretable and interactive user portrait method and device. The method constructs a highly interpretable user portrait tag system so that the portrait and its interpretation are visible to the user, protecting the user's right to know and helping the user understand the reasons behind decisions. Meanwhile, by supporting user feedback and feedback-based optimization, the user can adjust the portrait result: on one hand, this protects the user's informational autonomy, keeps the user out of an information cocoon, and lets the user monitor possible problems in the portrait, such as discrimination, and avert risks in advance; on the other hand, the interaction improves the effect of personalized pushing and precise services while reducing cost.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other manners. The apparatus embodiments described above are merely illustrative, for example, of the flowcharts and block diagrams in the figures that illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition, functional modules in the embodiments of the present application may be integrated together to form a single part, or each module may exist alone, or two or more modules may be integrated to form a single part.
The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application may be embodied essentially or in a part contributing to the prior art or in a part of the technical solution, in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present application.
The above description is only of the preferred embodiments of the present application and is not intended to limit the present application, but various modifications and variations can be made to the present application by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the protection scope of the present application. It should be noted that: like reference numerals and letters denote like items in the following figures, and thus once an item is defined in one figure, no further definition or explanation thereof is necessary in the following figures.
The foregoing is merely illustrative of the present application, and the present application is not limited thereto, and any person skilled in the art will readily recognize that variations or substitutions are within the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
It is noted that relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
Finally, it is noted that the above embodiments are only for illustrating the technical solution of the present application and not for limiting the same, and although the present application has been described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications and equivalents may be made thereto without departing from the spirit and scope of the present application, which is intended to be covered by the claims of the present application.

Claims (6)

1. An interpretable, interactive user portrayal method, characterized by: the method comprises the following steps:
s10: constructing a user portrait tag according to an interpretable method;
s20: storing user portrait tag related data by using Hive;
s30: the ECharts is utilized to enable the user portrait to be visual and feedback for the user;
s40: according to the user's adjustment of the user portrait, feeding back the user's opinions to the system for optimization;
s50: checking the performance of the user portrait by adopting anti-discrimination and accuracy tests;
the step S20 specifically includes:
establishing a Hive user tag table, and determining the name, content and explanation column of the tag;
inserting the calculated user tag vector values into the content of the Hive user tag table, and writing the natural language interpretation of each tag into the interpretation column of the user tag table;
the step S30 is specifically as follows:
introducing an ECharts file, and designating to use a radar chart;
the indicator of the radar chart is a user label stored in the Hive data warehouse, and the data of the radar chart is the score obtained by each user on the corresponding user label;
setting an expansion field on the axis of the radar chart, wherein the keyword of the expansion field is 'interpretation', the content is corresponding interpretation, and the expansion field is derived from an interpretation column in a Hive user tag table;
setting a click event on the radar chart indicator, wherein a message list appears after clicking, and a user can input objection and other feedback of the user portrait;
the feedback interface is laid out by using a LinearLayout, and a feedback frame is specified to adapt to a screen;
the step S50 is specifically as follows:
testing the anti-discrimination, accuracy and feedback mechanism of the user portrait, wherein the samples of the normal population and the weak population with the same quantity are extracted, the average index difference of the sensitive labels of the normal population and the weak population in violence index labels, credit indexes and crime possibility indexes is compared, and the anti-discrimination of the user portrait to the weak population is tested;
using cross validation to test the accuracy of statistics class labels and rule class labels, and using sampling validation to test the accuracy of development mining class labels;
classifying feedback contents of users, extracting samples of certain feedback users from each category, classifying each category into A, B groups, wherein A groups are liveness before user feedback, B groups are liveness after user feedback, comparing A, B groups, and testing operation effect of a feedback mechanism.
2. An interpretable and interactive user representation method according to claim 1, wherein: the step S10 is specifically as follows:
determining the types of the used labels, including statistics type labels, rule type labels and mining type labels;
when constructing the user portrait, performing natural language interpretation on the labels of the user portrait, including interpretation on label categories, label data sources and label reasoning rules;
the proportion of the labels is determined to be 50% of the statistics type labels, 30% of the rule type labels and 20% of the mining type labels according to the interpretation difficulty.
3. An interpretable and interactive user representation method according to claim 2, wherein: and in the label type used for determination, mining class labels utilize a hidden factor model LFM and TF-IDF to mine data, and submitting Spark tasks for calculation.
4. An interpretable and interactive user representation method according to claim 1, wherein: the step S40 is specifically as follows:
using a python tool to perform text word segmentation processing on feedback opinions of the user;
extracting subject terms from the feedback opinion based on a TF-IDF algorithm to obtain a feedback data file of the user;
based on the feedback data, adjusting a label vector value of the user, and updating the user portrait;
and recommending based on the updated user portrait.
5. An interpretable, interactive user portrayal device, characterized by: the device comprises:
the user portrait interpretation module is used for performing natural language interpretation on the labels of the user portraits when the user portraits are constructed, and comprises an interpretation content unit and a proportion unit; the interpretation content unit comprises natural language interpretation of labels of user portraits, determining three items of interpretation label categories, label data sources and label reasoning rules, and determining the used label categories, including statistics label, rule label and mining label; the proportion unit comprises a statistics type label 50%, a rule type label 30% and an excavation type label 20% according to the interpretation difficulty;
the user portrait storage module is used for establishing a Hive user tag table and storing the names, contents and interpretations of the tags; inserting the calculated user tag vector value into the content of the Hive user tag table, and releasing the natural language solution of the tag into an explanation column in the user tag table;
the user-oriented visualization and feedback module is used for establishing an ECharts radar chart, enabling the user portrait and interpretation thereof to be visible to a user, and supporting objection and other feedback of the user portrait input to the user; the indicator of the radar chart is a user label stored in the Hive data warehouse, and the data of the radar chart is the score obtained by each user on the corresponding user label; setting an expansion field on the axis of the radar chart, wherein the keyword of the expansion field is 'interpretation', the content is corresponding interpretation, and the expansion field is derived from an interpretation column in a Hive user tag table; setting a click event on the radar chart indicator, wherein a message list appears after clicking, and a user can input objection and other feedback of the user portrait; the feedback interface is laid out by using a LinearLayout, and a feedback frame is specified to adapt to a screen;
the user portrait optimization module is used for adjusting the label vector value of the user by using a python tool and a TF-IDF algorithm according to the feedback of the user to the user portrait and updating the user portrait;
the verification module is used for testing the anti-discrimination, the accuracy and the feedback mechanism of the user portrait and comprises an anti-discrimination verification unit, an accuracy verification unit and a feedback mechanism verification unit; the anti-discrimination verification unit is used for testing the anti-discrimination of the average index difference of the normal population and the weak population on the sensitive label; the accuracy verification unit is used for testing the accuracy of the statistics type labels and the rule type labels by using cross verification, developing the accuracy of the mining type labels by using sampling verification test, and testing the feedback mechanism effect by using A, B grouping comparison;
the anti-discrimination verification unit is used for extracting samples of normal groups and weak groups with the same quantity, comparing average index differences of sensitive labels of the normal groups and the weak groups in violence index labels, credit indexes and crime possibility indexes, and testing anti-discrimination of user figures on the weak groups;
the accuracy verification unit uses cross verification to test the accuracy of the statistics class labels and the rule class labels, and uses sampling verification to test the accuracy of the development mining class labels;
the feedback mechanism verification unit classifies feedback content of users, samples of certain feedback users are extracted for each category, each category is divided into A, B groups, A groups are liveness before user feedback, B groups are liveness after user feedback, A, B groups are compared, and operation effects of the feedback mechanism are tested.
6. An electronic device having stored therein computer program instructions which, when read and executed by a processor, perform the steps of the method of any of claims 1-4.
CN202011024688.9A 2020-09-25 2020-09-25 Interpretable and interactive user portrayal method and device Active CN112131475B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011024688.9A CN112131475B (en) 2020-09-25 2020-09-25 Interpretable and interactive user portrayal method and device


Publications (2)

Publication Number Publication Date
CN112131475A CN112131475A (en) 2020-12-25
CN112131475B true CN112131475B (en) 2023-10-10

Family

ID=73839435

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011024688.9A Active CN112131475B (en) 2020-09-25 2020-09-25 Interpretable and interactive user portrayal method and device

Country Status (1)

Country Link
CN (1) CN112131475B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113391867B (en) * 2021-06-16 2022-07-01 刘叶 Big data service processing method and service server based on digitization and visualization
CN117807190A (en) * 2024-02-28 2024-04-02 青岛他坦科技服务有限公司 Intelligent identification method for sensitive data of energy big data

Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2016054908A1 (en) * 2014-10-10 2016-04-14 中兴通讯股份有限公司 Internet of things big data platform-based intelligent user profiling method and apparatus
WO2017041372A1 (en) * 2015-09-07 2017-03-16 百度在线网络技术(北京)有限公司 Man-machine interaction method and system based on artificial intelligence
CN106651424A (en) * 2016-09-28 2017-05-10 国网山东省电力公司电力科学研究院 Electric power user figure establishment and analysis method based on big data technology
WO2017080176A1 (en) * 2015-11-12 2017-05-18 乐视控股(北京)有限公司 Individual user profiling method and system
CN109992982A (en) * 2019-04-11 2019-07-09 北京信息科技大学 Big data access authorization methods, device and big data platform
WO2019232891A1 (en) * 2018-06-06 2019-12-12 平安科技(深圳)有限公司 Method and device for acquiring user portrait, computer apparatus and storage medium
CN110796470A (en) * 2019-08-13 2020-02-14 广州中国科学院软件应用技术研究所 Market subject supervision and service oriented data analysis system
CN111159276A (en) * 2018-11-08 2020-05-15 北京航天长峰科技工业集团有限公司 Holographic image system construction method based on hybrid storage mode
CN111191125A (en) * 2019-12-24 2020-05-22 长威信息科技发展股份有限公司 Data analysis method based on tagging
CN111210326A (en) * 2019-12-27 2020-05-29 大象慧云信息技术有限公司 Method and system for constructing user portrait
CN111339409A (en) * 2020-02-20 2020-06-26 深圳壹账通智能科技有限公司 Map display method and system
CN111368548A (en) * 2018-12-07 2020-07-03 北京京东尚科信息技术有限公司 Semantic recognition method and device, electronic equipment and computer-readable storage medium
CN111444368A (en) * 2020-03-25 2020-07-24 平安科技(深圳)有限公司 Method and device for constructing user portrait, computer equipment and storage medium
CN111444236A (en) * 2020-03-23 2020-07-24 华南理工大学 Mobile terminal user portrait construction method and system based on big data


Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
Fred Hohman et al. Summit: Scaling Deep Learning Interpretability by Visualizing Activation and Attribution Summarizations. IEEE Transactions on Visualization and Computer Graphics. 2019, 26(1): 1096-1106. *
Zhang Yu; Ruan Xueling. Research on the construction method of mobile user portraits in a big data environment. China Informatization. 2020, (4): 65-68. *
Zhang Yuhang et al. A survey of personalized recommendation systems. Value Engineering. 2020, 39(2): 287-292. *
Qin Zhaojing. Design of a big-data-based customer label computing system for the securities industry. China Science and Technology Information. 2020, (12): 84-85. *
Zheng Chi. Research on the risks and legal regulation of user portraits. China Masters' Theses Full-text Database, Social Sciences I. 2022, (7): G116-32. *

Also Published As

Publication number Publication date
CN112131475A (en) 2020-12-25

Similar Documents

Publication Publication Date Title
US11182564B2 (en) Text recommendation method and apparatus, and electronic device
US11625407B2 (en) Website scoring system
US20220188708A1 (en) Systems and methods for predictive coding
US20220284327A1 (en) Resource pushing method and apparatus, device, and storage medium
US11238225B2 (en) Reading difficulty level based resource recommendation
US20100257182A1 (en) Automated dynamic style guard for electronic documents
CN112131475B (en) Interpretable and interactive user portrayal method and device
US11520990B2 (en) Systems and methods for dynamically displaying a user interface of an evaluation system processing textual data
EP3803628A1 (en) Language agnostic data insight handling for user application data
US10664927B2 (en) Automation of crowd-sourced polling
Jha et al. Reputation systems: Evaluating reputation among all good sellers
US20190019094A1 (en) Determining suitability for presentation as a testimonial about an entity
Solainayagi et al. Trust discovery and information retrieval using artificial intelligence tools from multiple conflicting sources of web cloud computing and e-commerce users
Liu et al. Supporting features updating of apps by analyzing similar products in App stores
Peng et al. An approach of extracting feature requests from app reviews
JP2023533723A (en) Evaluate interpretation of search queries
Wang et al. UISMiner: Mining UI suggestions from user reviews
Wang et al. Missing standard features compared with similar apps? A feature recommendation method based on the knowledge from user interface
EP4116898A1 (en) Document evaluation program, document evaluation method, and document evaluation device
Shi et al. Prediction model of consumer price preference based on machine learning
Bauer et al. A Taxonomy of User Behavior Model (UBM) Tools for UI Design and User Research
Yu et al. Design of a tool for checking academic integrity and content consistency of paper abstracts
CN114021000A (en) Commodity recommendation method and device, storage medium and electronic equipment
Da Silva et al. Algorithms for the Development of Adaptive Web Interfaces: A Systematic Literature Review
CN113901996A (en) Equipment screen perspective detection model training method and equipment screen perspective detection method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant