CN114090854B

CN114090854B - Intelligent label weight updating method and system based on information entropy and computer equipment

Info

Publication number: CN114090854B
Application number: CN202210076732.3A
Authority: CN
Inventors: 姜磊; 朱振航; 杨钊; 严海龙
Original assignee: Brilliant Data Analytics Inc
Current assignee: Brilliant Data Analytics Inc
Priority date: 2022-01-24
Filing date: 2022-01-24
Publication date: 2022-04-19
Anticipated expiration: 2042-01-24
Also published as: CN114090854A

Abstract

The invention belongs to the field of big data label technology and relates to an intelligent label weight updating method, system and computer device based on information entropy. The method comprises the following steps: acquiring source data comprising a label set, the label coverage, a set of label usage behavior counts, a set of label behavior weights and a service scene coefficient; taking the overall distribution of label coverage into account by introducing a label coverage reference value as the base of the logarithm in the information amount formula, and using the improved formula to generate the label information weight; automatically updating the label usage weight coefficient based on the label usage behavior counts and the label behavior weights; calculating the attenuation coefficient of the label weight; and generating and dynamically updating the label weight according to the attenuation mode of the label weight by combining the label information weight, the label usage weight coefficient and the service scene coefficient. The invention allows the coefficients involved in label weight updating to be adjusted dynamically, and solves the problem that the prior art has difficulty ensuring the accuracy and validity of label weights.

Description

Intelligent label weight updating method and system based on information entropy and computer equipment

Technical Field

The invention belongs to the technical field of big data labels, and particularly relates to an intelligent label weight updating method and system based on information entropy and computer equipment.

Background

A big data label is a characteristic mark obtained by abstracting, summarizing, analyzing and mining data. It expresses a conclusion or judgment about an object and serves as a bridge between data and business: accurate strategies can be formulated and decisions made quickly on the basis of labels, so labels play an increasingly important role in the digital era.

The traditional method of constructing customer labels is generally based on various types of data inside and outside an enterprise: a label is attached to a customer according to a specific rule, and when the label is updated its weight is computed with a fixed label weight formula that has no means of dynamic updating. As services develop and data change rapidly, a weight calculation that cannot be updated dynamically gradually fails to meet the requirements of precise services after a period of operation. Moreover, once a label weight has been generated it can only be adjusted when the label itself is updated, so the weight cannot reflect the latest characteristics of the business object in a timely manner while it is in use.

In addition, because business requirements, usage environments and feedback data change, the value of a label is strongly affected by time; intuitively, the value of a label gradually decreases over time. To address this, either the label update frequency is increased or an attenuation coefficient is added to the label weight calculation. Increasing the update frequency is only suitable when data changes can be acquired quickly and computing capacity is ample. For some labels, especially manually assigned ones, changes in the corresponding source data cannot be acquired after the label is created, so an attenuation coefficient is needed to keep the label weight accurate. Conventional implementations are generally based on Newton's law of cooling: they use an exponential decay function as the time decay factor and can only specify a fixed date as the decay start time, which is not easily understood by business personnel and cannot satisfy the decay characteristics of all labels.

Disclosure of Invention

The invention provides an intelligent label weight updating method, system and computer device based on information entropy. By introducing a label information weight, a label usage weight coefficient, a service scene coefficient and the like, it improves how the label information weight and the label weight attenuation are calculated and constructs an intelligent label weight updating scheme, so that the parameters and coefficients involved in label weight updating can be adjusted flexibly and dynamically. This solves the technical problem that the prior art lacks intelligent means of dynamic adjustment and has difficulty keeping label weights continuously accurate and valid.

In one aspect, the intelligent label weight updating method based on information entropy comprises the following steps:

S1, acquiring source data used for label weight calculation, and preprocessing the source data; the source data includes: the label set A1, the total number T of business objects corresponding to the labels, the label coverage P(t_i), the label usage behavior count set A2(t_i), the label behavior weight set A3 and the service scene coefficient CS_i(t_i);

S2, improving the information amount formula of the label: taking the overall distribution of label coverage into account, a label coverage reference value is introduced as the base of the logarithm in the information amount formula, with the label coverage as the argument of the logarithm; the improved formula is used as the label information weight generation formula to generate the label information weight;

S3, automatically updating the label usage weight coefficient based on the label usage behavior counts and the label behavior weights;

S4, calculating the attenuation coefficient of the label weight according to the label usage scenario and a manual adjustment coefficient;

S5, on the basis of steps S1 to S4, generating and dynamically updating the label weight according to the attenuation mode of the label weight by combining the label information weight, the label usage weight coefficient and the service scene coefficient.

In another aspect, the intelligent label weight updating system based on information entropy comprises:

a source data acquisition module, used for acquiring source data used for label weight calculation and preprocessing the source data; the source data includes: the label set A1, the total number T of business objects corresponding to the labels, the label coverage P(t_i), the label usage behavior count set A2(t_i), the label behavior weight set A3 and the service scene coefficient CS_i(t_i);

a label information weight generation module, used for improving the information amount formula of the label: taking the overall distribution of label coverage into account, a label coverage reference value is introduced as the base of the logarithm in the information amount formula, with the label coverage as the argument of the logarithm; the improved formula is used as the label information weight generation formula to generate the label information weight;

a label usage weight coefficient updating module, used for automatically updating the label usage weight coefficient based on the label usage behavior counts and the label behavior weights;

a label weight attenuation coefficient calculation module, used for calculating the attenuation coefficient of the label weight according to the label usage scenario and the manual adjustment coefficient;

and a label weight dynamic updating module, used for generating and dynamically updating the label weight according to the attenuation mode of the label weight by combining the label information weight, the label usage weight coefficient and the service scene coefficient.

In still another aspect, a computer device according to the present invention includes a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor executes the computer program to perform the steps of the intelligent tag weight updating method according to the present invention.

Compared with the prior art, the invention has the following beneficial effects:

1. Based on the improved information amount formula, the label coverage reference value is calculated dynamically and automatically, the information amount of each label relative to that reference is obtained, and from it the label information weight. This optimizes the overall distribution of label weights and makes them more reasonable.

2. A label usage weight coefficient is added so that the label weight reflects the usage value of the label, making the label weight more comprehensive. The usage weight is computed automatically from behavior counts and behavior weights and can be re-weighted and updated automatically from label usage data, which keeps the weight coefficient accurate.

3. A label basic weight is added to solve the cold-start problem of label weight calculation when data is lacking: the label weight remains usable without initial data, and the user can adjust the weight manually to meet various application requirements.

4. The label weight attenuation calculation is improved: by supporting multiple attenuation modes and attenuation dates and adding a manual adjustment factor, the user can adjust the attenuation coefficient dynamically based on business understanding to match the attenuation characteristics of different label types. In addition, once training samples are available, the corresponding attenuation weight can be generated automatically, which makes label weights with attenuation characteristics more reasonable.

5. A service scene coefficient CS_i(t_i) is proposed to evaluate the importance of a label in different scenarios, which improves the flexibility and range of application of the label weight. Once service scene coefficients are set, the same label has a different coefficient in each application scenario, i.e. the same label has several different weight coefficients. The service scene coefficient can thus be adjusted flexibly as the scenario changes, so that the label weight is adjusted and updated dynamically per scenario.

Drawings

FIG. 1 is a flowchart of an intelligent tag weight updating method based on information entropy according to an embodiment of the present invention;

FIG. 2 is a schematic structural diagram of an intelligent tag weight updating system based on information entropy according to an embodiment of the present invention.

Detailed Description

The present invention will be described in further detail with reference to examples and drawings, but the present invention is not limited thereto.

Example 1

The embodiment provides an intelligent updating method of label weight based on information entropy, which comprises the following steps:

S1, acquiring source data used for label weight calculation, and preprocessing the source data; the source data includes: the label set A1, the total number T of business objects corresponding to the labels, the label coverage P(t_i), the label basic weight C(t_i), the label usage behavior count set A2(t_i), the label behavior weight set A3, the service scene coefficient CS_i(t_i), and so on.

The label set A1 = {t₁, t₂, t₃, ..., t_n} refers to a collection of labels of some kind, including but not limited to: 1. all labels under a certain business object, such as all labels of the customer business entity; 2. all labels under a certain service scene; 3. all labels under a specific classification, such as behavior-class labels; 4. the full set of labels. Here t₁, t₂, t₃, ..., t_n are specific labels.

The total number T of business objects corresponding to a label refers to the total number of business objects to which the label whose weight is being calculated could apply. For example, for the "high-risk customer" label it is the total number of business objects that could be labeled "high-risk customer", i.e. the total number of customers with corresponding data in the system.

The label coverage P(t_i) is the ratio of the number of business objects that have label t_i to the total number of business objects corresponding to the label: P(t_i) = T(t_i) / T, where T(t_i) is the number of business objects that have label t_i.

The label basic weight C(t_i) is a basic weight entered by the user; it is usually greater than 0 and its initial value is 0. It can be used to adjust the label weight manually and to solve the cold-start problem: on the one hand the label weight remains usable when initial data is lacking, and on the other hand it provides an entry point for manual weight adjustment.

The label usage behavior count set A2(t_i) = {a₁, a₂, ..., a_n} records the number of operation behavior records for label t_i, where a₁, a₂, ..., a_n are the counts of the respective behaviors. The operation behavior data includes, but is not limited to, the number of times the label information is queried, the number of times the label result is called, the number of likes the label receives, the number of label evaluation records, and so on. In this embodiment, the counts may cover only the behavior records in a specified time period, such as the last three months, or the counts may be weighted by the time at which each behavior occurred, to reflect that historical records matter progressively less to the present. For example, behavior records from three months to one year ago are counted with a weight of 0.5, and behavior records more than one year old are counted with a weight of 0.1.
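The time-weighted counting described above can be sketched in Python as follows. The weights 0.5 and 0.1 come from the example in the text; the weight of 1.0 for records from the last three months, the 90-day and 365-day boundaries, and the function name `weighted_behavior_count` are assumptions made for illustration.

```python
from datetime import date


def weighted_behavior_count(record_dates, today):
    """Sum behavior records with a weight that depends on their age.

    Assumed buckets: up to 90 days old -> 1.0 (assumption),
    91 to 365 days old -> 0.5, older than 365 days -> 0.1.
    """
    total = 0.0
    for d in record_dates:
        age_days = (today - d).days
        if age_days <= 90:
            total += 1.0
        elif age_days <= 365:
            total += 0.5
        else:
            total += 0.1
    return total


# Hypothetical query records for one label: recent, mid-aged and old.
query_dates = [date(2022, 1, 10), date(2021, 8, 1), date(2020, 5, 1)]
a_query = weighted_behavior_count(query_dates, today=date(2022, 1, 24))
```

The resulting value would take the place of the raw count of that behavior in A2(t_i).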

The label behavior weight set A3 = {b₁, b₂, ..., b_n} is the set of weight coefficients for the corresponding behavior records, reflecting how important each behavior is to the label weight. These coefficients are mainly obtained from historical experience; b₁ is the behavior weight for behavior a₁, and so on. In this embodiment, the label behavior weight set may be adjusted manually according to the characteristics of each scenario, i.e. different scenarios may have different label behavior weight sets.

The service scene coefficient CS_i(t_i) is the weighting coefficient of a label in a certain scenario; it is greater than 0 and its initial value is 1. It can be set in three ways. The first is manual input: the user enters the scene weight coefficient of a label for a given scenario in the system interface. The second is automatic calculation: the system weights the coefficient according to how often the label is used in a scenario; for example, if label A is used frequently in a first scenario, the system automatically increases the scene weight coefficient of label A in that scenario. The third is designating key labels, which resembles manual input except that the user does not enter coefficients one by one; the user only designates the key labels for a scenario, and the system automatically increases the scene weight coefficients of the designated key labels and of similar labels in that scenario.

In this embodiment, taking the calculation of the label weight of the "high-value customer" label as an example, step S1 is described in detail as follows:

S101, obtain the customer label set A1 = {"high-value customer", "medium-value customer", "low-value customer", "high-risk customer", ..., t_n}; the label set A1 contains the full set of customer labels, n in total.

S102, obtain the total number of customers T, i.e. the total number T of business objects corresponding to the labels.

S103, obtain the number of customers T(t_{high-value customer}) that have the "high-value customer" label, and obtain the coverage of the "high-value customer" label: P(t_{high-value customer}) = T(t_{high-value customer}) / T.

S104, obtain the label basic weight C(t_i) entered by the user; if nothing has been entered, the default label basic weight is 0.

S105, obtain the usage behavior count set of the "high-value customer" label, A2(t_{high-value customer}) = {a_query, a_call, a_like, a_evaluation, a_dislike}, where a_query is the number of times the label information is queried, a_call is the number of times the label result is called, a_like is the number of likes the label receives, a_evaluation is the number of label evaluation records, and a_dislike is the number of dislikes the label receives.

S106, obtain the label behavior weight set A3 = {b_query, b_call, b_like, b_evaluation, b_dislike}, where b_query is the behavior weight for querying the label information, b_call is the behavior weight for calling the label result, b_like is the behavior weight for liking the label, b_evaluation is the behavior weight for evaluating the label, and b_dislike is the behavior weight for disliking the label. All weights are positive except b_dislike, which is negative.

S107, obtain the service scene coefficient CS(t_{high-value customer}).

S2, improving the information amount formula of the label: taking the overall distribution of label coverage into account, a label coverage reference value is introduced as the base of the logarithm in the information amount formula, with the label coverage as the argument of the logarithm; the improved formula is used as the label information weight generation formula to generate the label information weight.

The information entropy is a measure designed to quantify the uncertainty of information, and when an event occurs with a small probability, the information amount of the event is large, and when an event occurs with a large probability, the information amount of the event is small. By taking the idea of information entropy as a reference, the label can also be regarded as one piece of information of the business object, and if a large number of business objects possess the label, the amount of information contained in the label can be regarded as low; if only a small number of customers have the tag, the tag may be considered to contain a relatively high amount of information.

Because a label has only two states, "marked" and "unmarked", the information entropy of a label is conventionally calculated with the information amount formula I(x) = -log(P(x)) alone, where the label coverage P(x) serves as the probability that the label occurs. This formula, however, does not take the overall distribution of label coverage into account and lacks a corresponding reference. For example, the weight should be 1 when the label coverage approaches the label coverage reference value, and the weight should change more markedly the further the coverage deviates from the reference value.

Based on this, the embodiment improves the conventional information amount formula by using the label coverage reference value as the base of the logarithm and the label coverage as its argument, so as to keep the label weight reasonable. The improved label information weight is calculated as follows:

$$I(t_i) = \log_{P(\text{reference})}\bigl(P(t_i)\bigr) = \log_{\frac{1}{n}\sum_{j=1}^{n} P(t_j)}\bigl(P(t_i)\bigr)$$

where I(t_i) is the label information weight, P(reference) is the label coverage reference value, and n is the number of labels in the current label set. P(reference) can be obtained as the average coverage of the current label set, that is,

$$P(\text{reference}) = \frac{1}{n}\sum_{i=1}^{n} P(t_i).$$

Alternatively, a value such as the median or a quartile of the label coverage may be chosen as the reference value according to the distribution of the labels, or the user may designate a certain label as the reference label, in which case its latest coverage is used as the reference value. The system can switch among these three ways of calculating the reference value according to how often the user manually adjusts the label basic weight.
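A minimal Python sketch of the improved information weight under the definitions above. The function names, the sample coverage values and the choice of the first quartile in the quartile mode are illustrative assumptions, not part of the described method.

```python
import math
import statistics


def coverage_reference(coverages, mode="mean"):
    """Return a label coverage reference value P(reference)."""
    values = list(coverages)
    if mode == "mean":
        return statistics.mean(values)
    if mode == "median":
        return statistics.median(values)
    if mode == "quartile":
        # Which quartile to use is not stated in the text; the first is assumed.
        return statistics.quantiles(values, n=4)[0]
    raise ValueError(f"unknown mode: {mode}")


def information_weight(coverage, reference):
    """I(t_i): logarithm of the coverage P(t_i) taken to base P(reference)."""
    if not (0 < coverage <= 1 and 0 < reference < 1):
        raise ValueError("need 0 < coverage <= 1 and 0 < reference < 1")
    return math.log(coverage, reference)


# Hypothetical coverages for three labels.
coverages = {
    "high-value customer": 0.05,
    "medium-value customer": 0.25,
    "low-value customer": 0.60,
}
p_ref = coverage_reference(coverages.values())  # mean of the three values
weights = {label: information_weight(p, p_ref) for label, p in coverages.items()}
```

With these sample values, labels whose coverage is below the reference receive a weight above 1 and labels whose coverage is above the reference receive a weight below 1, while a coverage equal to the reference gives a weight of 1.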

The label coverage reference value P(reference) and the label basic weight C(t_i) both affect the final result of label weight generation or updating. If the reference value is set reasonably, the label weights generated from it can be considered appropriate, and the user has no need to adjust the label basic weight manually to influence the final result. If the user often needs to adjust the label basic weight manually, the way the reference value is calculated is probably unsuitable. In that case the system can automatically change the calculation mode of the reference value. For example, if the median label coverage was originally used as the reference value, the system can switch to a quartile of the label coverage; if the user then makes fewer manual adjustments, the change has proved effective, and if the user makes more, the result has become worse and the system continues to adjust.

In practice, whether the calculation mode of the reference value suits the user's current situation can thus be evaluated indirectly from how often the user manually adjusts the label basic weight. If the user is found to adjust the label basic weight frequently to correct the final weights of labels, the system automatically changes the calculation mode so as to bring the reference value to its best effect. For example, the average label coverage is used as the reference value at the initial stage; when frequent manual adjustment is detected, the system can automatically switch to the median label coverage, compare the frequency of manual adjustments after the switch with that before it, and decide whether to switch again, revert, or prompt the user to set the reference value manually.
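One possible way to organize this switching logic is sketched below in Python. The text does not specify the decision rule, so everything here is an assumption: the class name `ReferenceModeSelector`, the per-window adjustment count, the threshold for "frequent" adjustment, and the fallback to the mode with the fewest adjustments (which stands in for the revert step) before prompting the user.

```python
class ReferenceModeSelector:
    """Switch the reference calculation mode based on manual adjustment counts."""

    def __init__(self, modes=("mean", "median", "quartile"), threshold=5):
        self.modes = list(modes)
        self.threshold = threshold  # assumed limit for "frequent" adjustment
        self.current = self.modes[0]
        self.history = {}  # mode -> adjustments observed under that mode

    def observe(self, manual_adjustments):
        """Record one observation window and choose the mode for the next one."""
        self.history[self.current] = manual_adjustments
        if manual_adjustments <= self.threshold:
            return "keep"
        untried = [m for m in self.modes if m not in self.history]
        if untried:
            self.current = untried[0]
            return "switch"
        # Every mode has been tried: fall back to the one that needed the
        # fewest manual adjustments and ask the user to set the value.
        self.current = min(self.history, key=self.history.get)
        return "prompt"


selector = ReferenceModeSelector()
actions = [selector.observe(n) for n in (12, 15, 9)]
```

In this run the selector switches from the mean to the median, then to the quartile, and finally prompts the user while staying on the mode that required the fewest manual adjustments.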

In this embodiment, the process of obtaining the weight of the tag information amount by using the improved information amount calculation formula is as follows:

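S201, calculating the label coverage reference value

$$P(\text{reference}) = \frac{1}{n}\sum_{i=1}^{n} P(t_i).$$

In this embodiment it is obtained by calculating the average label coverage of all the labels in the customer label set A1.

S202, calculating the label information weight

$$I(t_{\text{high-value customer}}) = \log_{P(\text{reference})}\bigl(P(t_{\text{high-value customer}})\bigr) = \log_{\frac{1}{n}\sum_{j=1}^{n} P(t_j)}\bigl(P(t_{\text{high-value customer}})\bigr).$$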

S3, automatically updating the label usage weight coefficient based on the label usage behavior counts and the label behavior weights.

Besides its information content, the importance of a label depends on whether the label is valuable to users, which can be judged from how the label is used. This embodiment automatically calculates the label usage weight coefficient from the label usage behavior counts and the label behavior weights, so that the label weight reflects the usage value of the label.

The label usage weight coefficient is calculated as follows:

$$UW(t_i) = \sum_{j=1}^{n} a_j b_j$$

where UW(t_i) is the label usage weight coefficient, a_j is the count of the j-th behavior for label t_i, and b_j is the corresponding behavior weight. An upper limit is also set on the weighted contribution of each behavior to prevent extreme values: if the upper limit for behavior j is set to 1, then even if a_j b_j is greater than 1, the contribution of that behavior counts as only 1.
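The capped weighted sum can be sketched in Python as follows. The behavior names, counts and weights are made-up sample values, and clipping negative contributions symmetrically at the negative of the upper limit is an assumption; the text only speaks of an upper limit per behavior.

```python
def usage_weight(counts, behavior_weights, cap=1.0):
    """Sum count * weight over behaviors, limiting each term to [-cap, cap]."""
    total = 0.0
    for behavior, count in counts.items():
        term = count * behavior_weights[behavior]
        # Clip the contribution of a single behavior to avoid extreme values.
        total += max(-cap, min(cap, term))
    return total


# Hypothetical counts and behavior weights for one label.
counts = {"query": 120, "call": 40, "like": 6, "evaluation": 3, "dislike": 2}
behavior_weights = {
    "query": 0.005,
    "call": 0.01,
    "like": 0.2,
    "evaluation": 0.1,
    "dislike": -0.3,  # the only negative weight, as in the text
}
uw = usage_weight(counts, behavior_weights, cap=1.0)
```

Here the "like" term would be 1.2 without the limit and is clipped to 1, so a single heavily used behavior cannot dominate the coefficient.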

In this embodiment, because the label behavior weights can be adjusted according to scenario characteristics, the calculated label usage weight coefficient adapts to changes in the service scene. The influence of the service scene on label generation and updating is not expressed through the classification of labels; instead, the same label is updated dynamically as the service scene switches or changes, which improves the accuracy of the label weight coefficient. Concretely, the label weight coefficient is first coarsely adjusted through the service scene coefficient and then finely adjusted through the weights of the different label behaviors in that scene.

The update period of the tag usage weight coefficient can be real-time update, but is usually set to update according to a specified period in consideration of consuming system computing resources, such as: the updating is carried out according to hours and days.

In this embodiment, the label usage weight coefficient of the "high-value customer" label is updated automatically as follows:

S301, calculating the label usage weight coefficient:

$$UW(t_{\text{high-value customer}}) = \sum_{j} a_j b_j = a_{\text{query}} b_{\text{query}} + a_{\text{call}} b_{\text{call}} + a_{\text{like}} b_{\text{like}} + a_{\text{evaluation}} b_{\text{evaluation}} + a_{\text{dislike}} b_{\text{dislike}}$$

where a_j is the count of a given behavior for the label and b_j is the corresponding behavior weight. In this embodiment the upper weight limit of each behavior is assumed to be k, and the absolute values of a_query b_query, a_call b_call, a_like b_like, a_evaluation b_evaluation and a_dislike b_dislike are all less than k.

S302, if the label usage weight coefficient is updated hourly, it is recalculated every hour from the latest behavior counts.

S4, calculating the attenuation coefficient of the label weight according to the label usage scenario and the manual adjustment coefficient. This specifically comprises the following steps:

S401, when the system is first started, the user determines from business experience which labels are subject to weight attenuation, and configures the attenuation start date, attenuation period, attenuation coefficient and attenuation mode.

The attenuation start date includes, but is not limited to, the label update date, a fixed date, or a dynamic date, where a dynamic date is one that varies with the state of a specific business object; for example, attenuation may start from the customer's birthday, which differs from customer to customer. The attenuation period is a time period such as a day, a week or a month. The attenuation coefficient is a decimal greater than 0 and less than 1. The attenuation mode can be fixed multiple, exponential, fixed value and so on, to match the value attenuation characteristics of different labels.

S402, calculating the attenuation coefficient of the label weight according to the attenuation mode.

The attenuation coefficient is calculated differently for different attenuation modes:

When the attenuation mode is fixed multiple or exponential, the attenuation coefficient is AW(t_i) = ((current date - attenuation start date) / attenuation period) × attenuation coefficient, or, for the exponential mode, AW(t_i) is given by a corresponding exponential expression, which is not reproduced in this text. The above formula is only a simple implementation example; in practice, a person skilled in the art can fit the attenuation coefficient formula to the observed degree of value attenuation of the label.

When the attenuation mode is fixed value, the attenuation coefficient is A(t_i) = ((current date - attenuation start date) / attenuation period) × attenuation coefficient.

Here AW(t_i) is the attenuation coefficient when the attenuation mode is fixed multiple or exponential, and A(t_i) is the attenuation coefficient when the attenuation mode is fixed value.

In this embodiment, assuming that the attenuation start date is D, the attenuation period is one week, the attenuation coefficient is ks and the attenuation mode is fixed multiple, the attenuation coefficient is calculated as AW(t_{high-value customer}) = ((current date - D) / 7) × ks.
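The weekly example can be reproduced with a short Python sketch. It implements the linear expression literally as stated; the function name, the concrete dates and the value of ks are assumptions chosen for illustration, and the attenuation period is assumed to be expressed in days.

```python
from datetime import date


def attenuation_coefficient(current, start, period_days, coefficient):
    """((current date - start date) / attenuation period) * attenuation coefficient."""
    if not 0 < coefficient < 1:
        raise ValueError("attenuation coefficient must lie between 0 and 1")
    if current < start:
        raise ValueError("attenuation has not started yet")
    elapsed_days = (current - start).days
    return (elapsed_days / period_days) * coefficient


D = date(2022, 1, 3)  # hypothetical attenuation start date
ks = 0.1              # hypothetical attenuation coefficient
aw = attenuation_coefficient(date(2022, 1, 24), D, period_days=7, coefficient=ks)
```

Three weekly periods have elapsed in this example, so the result is three times ks. As written in the text, the value grows linearly with elapsed time; the sketch reproduces the expression without interpreting it further.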

Preferably, once a certain amount of training sample data is available, the attribute information of labels can be converted into vectors and the similarity between the current label and labels whose attenuation coefficient has already been configured can be calculated; if a sufficiently similar label exists, its attenuation coefficient is adopted automatically.
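A hedged Python sketch of this similarity lookup follows. The text does not name a similarity measure or say how label attributes are turned into vectors; cosine similarity, the 0.9 threshold, the function names and the sample vectors are all assumptions.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def borrow_attenuation_coefficient(label_vector, configured, threshold=0.9):
    """Return the coefficient of the most similar configured label, or None."""
    best_name, best_sim = None, threshold
    for name, (vector, _coefficient) in configured.items():
        sim = cosine_similarity(label_vector, vector)
        if sim >= best_sim:
            best_name, best_sim = name, sim
    return configured[best_name][1] if best_name is not None else None


# Hypothetical attribute vectors and configured attenuation coefficients.
configured = {
    "high-risk customer": ([1.0, 0.0, 1.0], 0.8),
    "promotion-sensitive customer": ([0.0, 1.0, 0.0], 0.5),
}
borrowed = borrow_attenuation_coefficient([0.9, 0.1, 1.0], configured)
not_found = borrow_attenuation_coefficient([0.0, 1.0, 1.0], configured)
```

The first query is close to the "high-risk customer" vector and borrows its coefficient; the second is not similar enough to any configured label, so no coefficient is borrowed and the user would configure one manually.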

S5, on the basis of steps S1 to S4, generating and dynamically updating the label weight according to the attenuation mode of the label weight by combining the label information weight, the label usage weight coefficient, the label basic weight and the service scene coefficient.

This step dynamically generates and updates the label weight H(t_i) in one of three cases:

1. When the label weight is not attenuated: H(t_i) = (I(t_i) * UW(t_i) + C(t_i)) * CS_i(t_i);

2. When the attenuation mode of the label weight is fixed multiple or exponential: H(t_i) = (I(t_i) * UW(t_i) + C(t_i)) * CS_i(t_i) * AW(t_i);

3. When the attenuation mode of the label weight is fixed value: H(t_i) = (I(t_i) * UW(t_i) + C(t_i)) * CS_i(t_i) + A(t_i);

where I(t_i) is the label information weight generated in step S2, UW(t_i) is the label usage weight coefficient automatically updated in step S3, C(t_i) is the label basic weight, CS_i(t_i) is the service scene coefficient, and AW(t_i) and A(t_i) are the attenuation coefficients of the label weight calculated in step S4.

The label weight formula matching the attenuation mode is selected to calculate and update the value of H(t_i) dynamically. Label weights may be updated in real time, but to limit the consumption of system computing resources they are usually updated on a specified period, such as every 15 minutes, every hour or every day.
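The three cases can be combined into one Python function, sketched below. The function name, the mode strings and the sample input values are illustrative assumptions; the arithmetic follows the three expressions for H(t_i) given above.

```python
def label_weight(info_weight, usage_weight, basic_weight, scene_coefficient,
                 attenuation_mode=None, attenuation_value=None):
    """Combine I, UW, C and CS into H, applying attenuation according to the mode."""
    core = (info_weight * usage_weight + basic_weight) * scene_coefficient
    if attenuation_mode is None:
        return core  # case 1: no attenuation
    if attenuation_value is None:
        raise ValueError("attenuation_value is required when a mode is given")
    if attenuation_mode in ("fixed_multiple", "exponential"):
        return core * attenuation_value  # case 2: multiply by AW
    if attenuation_mode == "fixed_value":
        return core + attenuation_value  # case 3: add A
    raise ValueError(f"unknown attenuation mode: {attenuation_mode}")


# Hypothetical inputs: I = 2.0, UW = 1.5, C = 0.5, CS = 1.2.
h_plain = label_weight(2.0, 1.5, 0.5, 1.2)
h_multiple = label_weight(2.0, 1.5, 0.5, 1.2, "fixed_multiple", 0.5)
h_fixed = label_weight(2.0, 1.5, 0.5, 1.2, "fixed_value", 0.3)
```

Keeping the shared term in one place means a periodic job only needs to refresh the inputs and call the function again with the configured mode.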

In this embodiment, the tag weight of the "high-value customer" is generated and automatically updated as follows:

S501. Generating the label weight: H(t_{High value customer}) = (I(t_{High value customer}) * UW(t_{High value customer}) + C(t_{High value customer})) * CS(t_{High value customer}) * AW(t_{High value customer}).

S502. In this embodiment, if the update period of the label weight is set to one hour, the corresponding label weight is automatically updated every hour.

Preferably, steps S1 to S4 need not be pre-calculated. Instead, the calculation formula of step S5 is formed directly and the corresponding data are substituted into it in a single calculation, so that each update of the label weight is completed using the latest data.
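A minimal sketch of this one-pass evaluation follows, using the multiplicative attenuation case. The function name `compute_label_weight`, the `fetch_latest` callback and the dictionary keys are hypothetical. As a further assumption, only the information amount of step S2 is computed inline here (as the logarithm of the label coverage rate with the reference value as its base), while the results of steps S3 and S4 are read from the fetched data.

```python
import math
from typing import Callable, Dict


def compute_label_weight(fetch_latest: Callable[[], Dict[str, float]]) -> float:
    """Evaluate the step S5 formula in one pass from freshly fetched data."""
    d = fetch_latest()  # read the newest values at update time
    # Step S2 inlined: logarithm of the coverage with the reference as base.
    info = math.log(d["coverage"], d["reference_coverage"])
    # Steps S3 and S4 are taken from the data as-is (assumption of this sketch).
    return ((info * d["use_weight"] + d["base_weight"])
            * d["scene_coeff"] * d["attenuation"])


# Hypothetical source data for one label.
store = {
    "coverage": 0.25, "reference_coverage": 0.5, "use_weight": 1.5,
    "base_weight": 1.0, "scene_coeff": 0.5, "attenuation": 1.0,
}
first = compute_label_weight(lambda: store)
store["coverage"] = 0.5          # the source data changes before the next update
second = compute_label_weight(lambda: store)
print(first, second)
```

Because nothing is cached between calls, the second evaluation reflects the changed coverage rate without any intermediate results having to be invalidated.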

Example 2

Based on the same inventive concept as embodiment 1, this embodiment provides an intelligent tag weight updating system based on information entropy, which includes the following modules:

Further, the source data acquired by the source data acquisition module further includes a label basic weight C(t_i). The label weight dynamic updating module integrates the label information amount weight, the label use weight coefficient, the label basic weight and the service scene coefficient according to the attenuation mode of the label weight, and generates and dynamically updates the label weight.

Further, in the tag information amount weight generating module, the introduced tag coverage rate reference value includes the following calculation modes: obtaining by calculating the average coverage rate of the current label set, or selecting a median value or a quartile value of the label coverage rate as a label coverage rate reference value according to a label distribution rule, or selecting the latest coverage rate of a reference label designated by a user as a label coverage rate reference value; the label information weight generating module also switches the calculation mode of the label coverage rate reference value among the calculation modes according to the manual adjustment condition of the label basic weight.
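The calculation modes for the reference value, and its use as the base of the logarithm, can be sketched as follows. The function names, the mode strings and the sample coverage rates are hypothetical. The description does not say which quartile is meant, so the first quartile is used here as an assumption, and the rule for switching modes after a manual adjustment of the label basic weight is not modeled.

```python
import math
import statistics


def coverage_reference(coverages, mode="mean", user_reference=None):
    """Return the label coverage rate reference value for the chosen mode."""
    if mode == "mean":
        return statistics.fmean(coverages)
    if mode == "median":
        return statistics.median(coverages)
    if mode == "quartile":
        # Assumption: first quartile, default 'exclusive' method.
        return statistics.quantiles(coverages, n=4)[0]
    if mode == "user":
        if user_reference is None:
            raise ValueError("user mode needs the reference label's coverage")
        return user_reference
    raise ValueError(f"unknown mode: {mode}")


def information_weight(coverage, reference):
    """Logarithm of the coverage rate with the reference value as base."""
    return math.log(coverage, reference)


# Hypothetical coverage rates of four labels.
coverages = [0.1, 0.2, 0.3, 0.4]
ref_mean = coverage_reference(coverages, "mean")
ref_quartile = coverage_reference(coverages, "quartile")
print(ref_mean, ref_quartile, information_weight(0.1, ref_mean))
```

With a reference value below 1, a label whose coverage rate is lower than the reference receives an information amount greater than 1, and a label with higher coverage receives less than 1.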

Example 3

Based on the same inventive concept as that of embodiment 1, this embodiment provides a computer device, which includes a memory, a processor, and a computer program stored on the memory and running on the processor, and when the processor executes the computer program, the steps of the intelligent update method for tag weights in embodiment 1 are implemented.

The above embodiments are preferred embodiments of the present invention, but the present invention is not limited to the above embodiments, and any other changes, modifications, substitutions, combinations, and simplifications which do not depart from the spirit and principle of the present invention should be construed as equivalents thereof, and all such changes, modifications, substitutions, combinations, and simplifications are intended to be included in the scope of the present invention.

Claims

1. An intelligent label weight updating method based on information entropy, characterized by comprising the following steps:

2. The intelligent updating method for label weight according to claim 1, wherein the source data obtained in step S1 further includes a label basic weight C(t_i); in step S5, the label information amount weight, the label use weight coefficient, the label basic weight and the service scene coefficient are integrated according to the attenuation mode of the label weight, and the label weight is generated and dynamically updated.

3. The intelligent updating method for label weight according to claim 2, wherein the label coverage reference value of step S2 includes the following calculation methods: obtaining by calculating the average coverage rate of the current label set, or selecting a median value or a quartile value of the label coverage rate as a label coverage rate reference value according to a label distribution rule, or selecting the latest coverage rate of a reference label designated by a user as a label coverage rate reference value; and switching the calculation mode of the label coverage rate reference value among the calculation modes according to the manual adjustment condition of the label basic weight.

4. The intelligent updating method for label weight according to claim 3, wherein when the label coverage reference value is obtained by calculating the average value of the coverage of the current label set in step S2, the improved information amount is calculated by the following formula:

I(t_i) = log_{P(reference)}(P(t_i)) = log_{(1/N) * Σ_{j=1}^{N} P(t_j)}(P(t_i))

Wherein I(t_i) is the label information amount weight, P(reference) is the label coverage rate reference value, P(t_i) is the coverage rate of label t_i, and N is the number of labels in the current label set.

5. The intelligent updating method for label weight according to claim 1, wherein the formula for calculating the label use weight coefficient in step S3 is as follows:

UW(t_i)=

wherein UW(t_i) is the label use weight coefficient,

is the number of times a given behavior occurs for label t_i, and b_i is the corresponding behavior weight; an upper limit value is also set for each behavior weight.

6. The intelligent tag weight updating method according to claim 1, wherein the step S4 comprises the following steps:

S401. At initial start-up, determining, according to business experience, which labels are subject to weight attenuation, and configuring for them an attenuation start date, an attenuation period, an attenuation coefficient and an attenuation mode;

S402. Calculating the attenuation coefficient of the label weight according to the attenuation mode;

；

When the attenuation mode is a fixed value, the attenuation coefficient is as follows: A(t_i) = ((current date - attenuation start date) / attenuation period) × attenuation coefficient;

7. The intelligent updating method for label weight according to claim 6, wherein the source data obtained in step S1 further includes a label basic weight C(t_i); in step S5, the cases in which the label weight H(t_i) is generated and dynamically updated include:

when the label weight is not attenuated, it is calculated as follows: H(t_i) = (I(t_i) * UW(t_i) + C(t_i)) * CS_i(t_i);

when the attenuation mode of the label weight is a fixed multiple or an exponential, it is calculated as follows: H(t_i) = (I(t_i) * UW(t_i) + C(t_i)) * CS_i(t_i) * AW(t_i);

when the attenuation mode of the label weight is a fixed value, it is calculated as follows: H(t_i) = (I(t_i) * UW(t_i) + C(t_i)) * CS_i(t_i) + A(t_i);

wherein I(t_i) is the label information amount weight generated in step S2, UW(t_i) is the automatically updated label use weight coefficient of step S3, and AW(t_i) and A(t_i) are the attenuation coefficients of the label weight calculated in step S4.

8. An intelligent label weight updating system based on information entropy, characterized by comprising:

the source data acquisition module is used for acquiring source data used for label weight calculation and preprocessing the source data; the source data includes: the label set A1, the total number T of service objects corresponding to the labels, the label coverage rate P(t_i), the label use behavior count set A2(t_i), the label behavior weight set A3 and the service scene coefficient CS_i(t_i);

9. The intelligent label weight updating system of claim 8, wherein the obtained source data further comprises a label basic weight C(t_i);

the label weight dynamic updating module generates and dynamically updates the label weight according to the attenuation mode of the label weight, by integrating the label information amount weight, the label use weight coefficient, the label basic weight and the service scene coefficient;

in the tag information weight generating module, the introduced tag coverage rate reference value comprises the following calculation modes: obtaining by calculating the average coverage rate of the current label set, or selecting a median value or a quartile value of the label coverage rate as a label coverage rate reference value according to a label distribution rule, or selecting the latest coverage rate of a reference label designated by a user as a label coverage rate reference value; the label information weight generating module also switches the calculation mode of the label coverage rate reference value among the calculation modes according to the manual adjustment condition of the label basic weight.

10. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor, when executing the computer program, carries out the steps of the intelligent label weight updating method as claimed in any one of claims 1-7.