CN111178534A - Method and device for determining value distribution function, electronic equipment and readable medium - Google Patents


Info

Publication number
CN111178534A
Authority
CN
China
Prior art keywords
distribution function
value
determining
feature
value distribution
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811347255.XA
Other languages
Chinese (zh)
Inventor
冉世伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing ByteDance Network Technology Co Ltd
Original Assignee
Beijing ByteDance Network Technology Co Ltd
Application filed by Beijing ByteDance Network Technology Co Ltd filed Critical Beijing ByteDance Network Technology Co Ltd
Priority to CN201811347255.XA priority Critical patent/CN111178534A/en
Publication of CN111178534A publication Critical patent/CN111178534A/en

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Embodiments of the present disclosure provide a method and a device for determining a value distribution function, an electronic device, and a readable medium. The method comprises: determining a value distribution function divergence of at least one feature according to the feature selectable values of the at least one feature of sample data; constructing a decision tree according to the value distribution function divergence of the at least one feature; and determining a value distribution function of target data according to the decision tree. With this technical solution, the value distribution function of information display can be determined from existing data.

Description

Method and device for determining value distribution function, electronic equipment and readable medium
Technical Field
Embodiments of the present disclosure relate to the technical field of information display, and in particular to a method and a device for determining a value distribution function, an electronic device, and a readable medium.
Background
In an era of rapid economic development and explosive growth of information, understanding the value of information display has become an important technology in the industry.
The value distribution function is a function that takes the value as the abscissa and the number of distributions at each value as the ordinate. From the value distribution function, the number of distributions at each value can be determined, as can the probability that a given value obtains information display, so the value distribution function is particularly important for information display. However, the industry currently has no method for determining the value distribution function of information display traffic.
Disclosure of Invention
Embodiments of the present disclosure provide a method and a device for determining a value distribution function, an electronic device, and a readable medium, which make it possible to determine the value distribution function of information display from existing data.
In a first aspect, an embodiment of the present disclosure provides a method for determining a value distribution function, where the method includes:
determining a value distribution function divergence of at least one feature according to the feature selectable values of the at least one feature of sample data;
constructing a decision tree according to the value distribution function divergence of the at least one feature;
and determining a value distribution function of target data according to the decision tree.
Further, determining the value distribution function divergence of at least one feature according to the feature selectable values of the at least one feature of the sample data includes:
for each feature in the sample data, determining a value distribution function of each feature selectable value of the feature;
performing candidate grouping on the feature selectable values of the feature to form a plurality of candidate set pairs, and determining a divergence value of the value distribution function between the two sets of each candidate set pair;
and determining the value distribution function divergence of the feature according to the divergence values of the value distribution function of the candidate set pairs.
Further, constructing the decision tree according to the value distribution function divergence of the at least one feature includes:
for a current node, splitting the current node into two child nodes, taking the feature with the maximum value distribution function divergence as the splitting basis;
and traversing all nodes to construct a binary decision tree.
Further, before splitting the current node into two child nodes with the feature with the maximum value distribution function divergence as the splitting basis, the method further includes:
stopping splitting of the current node if the number of sample data in the current node is smaller than a preset number threshold, or if the height of the current node reaches a preset height threshold.
Further, determining the value distribution function of the target data according to the decision tree includes:
determining the value distribution function of the target data according to the feature selectable values of each feature of the sample data in the nodes of the decision tree.
Further, determining the value distribution function of the target data according to the feature selectable values of each feature of the sample data in the nodes of the decision tree includes:
acquiring a target selectable value of a target feature of the target data;
determining a matching node that matches the target data according to the similarity between the target selectable value and the feature selectable values of each feature of the sample data in the nodes of the decision tree;
and determining the value distribution function of the target data according to the value distribution function of the sample data in the matching node.
Further, after determining the value distribution function of the target data according to the decision tree, the method further includes:
determining an expected value of the target data according to the value distribution function of the target data.
In a second aspect, an embodiment of the present disclosure further provides a device for determining a value distribution function, where the device includes:
a value distribution function divergence determining module, configured to determine a value distribution function divergence of at least one feature according to the feature selectable values of the at least one feature of sample data;
a decision tree construction module, configured to construct a decision tree according to the value distribution function divergence of the at least one feature;
and a value distribution function determining module, configured to determine a value distribution function of target data according to the decision tree.
Further, the value distribution function divergence determining module includes:
a feature selectable value distribution determining unit, configured to determine, for each feature in the sample data, a value distribution function of each feature selectable value of the feature;
a candidate set pair divergence value determining unit, configured to perform candidate grouping on the feature selectable values of the feature to form a plurality of candidate set pairs, and to determine a divergence value of the value distribution function between the two sets of each candidate set pair;
and a value distribution function divergence determining unit, configured to determine the value distribution function divergence of the feature according to the divergence values of the value distribution function of the candidate set pairs.
Further, the decision tree construction module includes:
a node splitting unit, configured to split a current node into two child nodes, taking the feature with the maximum value distribution function divergence as the splitting basis;
and a binary decision tree construction unit, configured to traverse all nodes to construct a binary decision tree.
Further, the decision tree construction module further includes:
a splitting termination determining unit, configured to stop splitting the current node if the number of sample data in the current node is smaller than a preset number threshold, or if the height of the current node reaches a preset height threshold.
Further, the value distribution function determining module includes:
a value distribution function determining unit, configured to determine the value distribution function of the target data according to the feature selectable values of each feature of the sample data in the nodes of the decision tree.
Further, the value distribution function determining unit includes:
a target selectable value determining subunit, configured to acquire a target selectable value of a target feature of the target data;
a target selectable value matching subunit, configured to determine a matching node that matches the target data according to the similarity between the target selectable value and the feature selectable values of each feature of the sample data in the nodes of the decision tree;
and a value distribution function determining subunit, configured to determine the value distribution function of the target data according to the value distribution function of the sample data in the matching node.
Further, the device further includes:
an expected value determining module, configured to determine an expected value of the target data according to the value distribution function of the target data.
In a third aspect, an embodiment of the present disclosure provides an electronic device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor, when executing the computer program, implements the method for determining a value distribution function according to an embodiment of the present disclosure.
In a fourth aspect, an embodiment of the present disclosure provides a computer-readable medium on which a computer program is stored, where the computer program, when executed by a processor, implements the method for determining a value distribution function according to an embodiment of the present disclosure.
According to the technical solution provided by the embodiments of the present disclosure, the value distribution function divergence of at least one feature is determined according to the feature selectable values of the at least one feature of the sample data; a decision tree is constructed according to the value distribution function divergence of the at least one feature; and the value distribution function of the target data is determined according to the decision tree. With this technical solution, the value distribution function of information display can be determined from existing data.
Drawings
FIG. 1 is a flowchart of a method for determining a value distribution function according to an embodiment of the present disclosure;
FIG. 2 is a flowchart of a method for determining a value distribution function according to an embodiment of the present disclosure;
FIG. 3 is a flowchart of a method for determining a value distribution function according to an embodiment of the present disclosure;
FIG. 4 is a schematic structural diagram of a device for determining a value distribution function according to a second embodiment of the present disclosure;
FIG. 5 is a schematic structural diagram of an electronic device according to a third embodiment of the present disclosure.
Detailed Description
The present disclosure is described in further detail below with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the disclosure and are not limiting of the disclosure. It should be further noted that, for the convenience of description, only some of the structures relevant to the present disclosure are shown in the drawings, not all of them.
Before discussing exemplary embodiments in more detail, it should be noted that some exemplary embodiments are described as processes or methods depicted as flowcharts. Although a flowchart may describe the steps as a sequential process, many of the steps can be performed in parallel, concurrently or simultaneously. In addition, the order of the steps may be rearranged. The process may be terminated when its operations are completed, but may have additional steps not included in the figure. The processes may correspond to methods, functions, procedures, subroutines, and the like.
In the following embodiments, optional features and examples are provided in each embodiment, and various features described in the embodiments may be combined to form a plurality of alternatives, and each numbered embodiment should not be regarded as only one technical solution.
Example one
Fig. 1 is a flowchart of a method for determining a value distribution function according to an embodiment of the present disclosure, where the method is applicable to information presentation, and the method may be executed by a device for determining a value distribution function according to an embodiment of the present disclosure, where the device may be implemented by software and/or hardware, and may be integrated in electronic devices such as a client, a terminal, and a server.
As shown in Fig. 1, the method for determining the value distribution function includes:
S110, determining the value distribution function divergence of at least one feature according to the feature selectable values of the at least one feature of the sample data.
The sample data may be obtained by randomly sampling historical information display data. Information display may involve one or more features. A feature may be a feature of the user viewing the displayed information, for example an age feature (such as 16-18, 19-22, or 23-26 years old), a province feature (such as Beijing, Shanghai, or Guangdong Province), a gender feature (male or female), or another feature divided in any manner. A feature may also be a feature of the displayed information itself, such as its type, for example e-commerce, food and lifestyle, or games. In this embodiment, features may be divided according to the labels of each dimension contained in the historical information display data.
As can be seen from the above, each feature may include two or more feature selectable values, and different feature selectable values may affect the value distribution function. For example, if the displayed information is of the game type, then for the gender feature the value distribution function for male users may differ from that for female users: male users may pay more attention to game-type display information and have a higher download conversion rate, so the value determined for game-type information displayed to male users may be higher.
The value distribution function divergence of at least one feature is determined according to the feature selectable values of the at least one feature of the sample data. The divergence of the value distribution function may be the relative entropy of the value distribution functions, i.e., the KL divergence (Kullback-Leibler divergence). Relative entropy measures the difference between two probability distributions: it is zero when the two distributions are identical and grows as the difference between them grows. Relative entropy can therefore be used to compare how similar two distributions are. By determining the value distribution function divergence of a feature through this statistical divergence, it is possible to identify which feature selectable values have value distribution functions that are close to one another and which have value distribution functions that differ greatly, and thus to determine the difference between the value distribution functions of the feature selectable values within one feature.
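As a minimal illustrative sketch, the relative entropy between two discretized value distribution functions can be computed as follows. The function name `kl_divergence`, the smoothing constant `eps`, and the example distributions are assumptions introduced here for illustration and do not appear in the disclosure.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """Relative entropy D_KL(p || q) of two discrete distributions.

    p and q are lists of probabilities over the same value bins.
    eps is an assumed smoothing constant that guards against division by zero.
    """
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0:  # bins with zero probability in p contribute nothing
            total += pi * math.log(pi / max(qi, eps))
    return total


# Hypothetical value distribution functions over three value bins.
same = kl_divergence([0.2, 0.5, 0.3], [0.2, 0.5, 0.3])
close = kl_divergence([0.2, 0.5, 0.3], [0.25, 0.45, 0.3])
far = kl_divergence([0.2, 0.5, 0.3], [0.7, 0.2, 0.1])
```

Note that relative entropy is not symmetric, so the order of the two arguments matters; the disclosure does not state which direction is used.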
S120, constructing a decision tree according to the value distribution function divergence of the at least one feature.
The decision tree may include a root node and at least two leaf nodes, where, if the root node 1 is split into a node 2 and a node 3, and the node 3 is split into a node 4 and a node 5, the node 3 may be referred to as an intermediate node, and the node 2, the node 4, and the node 5 may be referred to as leaf nodes. It follows that the decision tree may not include intermediate nodes if splitting is performed only once, or intermediate nodes if splitting is performed multiple times. The height of the node may be the number of times of splitting from the root node to the current node, for example, the heights of the nodes 2 and 3 are level 1, the heights of the nodes 4 and 5 are level 2, or the height of the root node 1 may be level 1, the heights of the nodes 2 and 3 are level 2, and so on for other nodes.
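The five-node example above can be sketched as follows, using the first height convention (the root has height 0 and each split adds one level). The class `Node` and the helpers `node_heights` and `leaf_names` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Node:
    name: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None

    def is_leaf(self) -> bool:
        return self.left is None and self.right is None


def node_heights(node: Node, height: int = 0) -> Dict[int, int]:
    """Map each node name to the number of splits between it and the root."""
    result = {node.name: height}
    for child in (node.left, node.right):
        if child is not None:
            result.update(node_heights(child, height + 1))
    return result


def leaf_names(node: Node) -> list:
    """Collect the names of all leaf nodes, left to right."""
    if node.is_leaf():
        return [node.name]
    names = []
    for child in (node.left, node.right):
        if child is not None:
            names.extend(leaf_names(child))
    return names


# Node 1 splits into nodes 2 and 3; node 3 splits into nodes 4 and 5.
root = Node(1, Node(2), Node(3, Node(4), Node(5)))
heights = node_heights(root)
leaves = leaf_names(root)
```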
In this technical solution, the grouping with the maximum divergence value between the value distribution function of one set of feature selectable values of a feature and that of the remaining feature selectable values of the same feature may be selected as the splitting basis of the root node or of an intermediate node of the decision tree. When there are multiple features, the calculation may be performed for each of them, the grouping with the maximum divergence value is taken as the splitting basis of the current node, and the child nodes obtained by the split retain the other features for further splitting. For example, suppose that in the root node the feature selectable values of feature A are a1 and a2, those of feature B are B1, B2, and B3, and those of feature C are C1 and C2. If the divergence between the value distribution functions of B1 and B2+B3 is the largest, the root node is split into B1 and B2+B3. The first of the two child nodes then contains a1 and a2, B1, and C1 and C2, and the second contains a1 and a2, B2 and B3, and C1 and C2, so that both child nodes can continue to be split in the manner described above.
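The split in this example can be sketched as a simple partition of the sample data. The function `split_node` and the dictionary layout of the samples are assumptions made for illustration; the disclosure does not prescribe a data structure.

```python
def split_node(samples, feature, left_values):
    """Partition samples by whether their value for `feature` is in `left_values`."""
    left = [s for s in samples if s[feature] in left_values]
    right = [s for s in samples if s[feature] not in left_values]
    return left, right


# Hypothetical sample data carrying features A, B, and C.
samples = [
    {"A": "a1", "B": "B1", "C": "C1"},
    {"A": "a2", "B": "B1", "C": "C2"},
    {"A": "a1", "B": "B2", "C": "C2"},
    {"A": "a2", "B": "B3", "C": "C1"},
]

# Split the root node into B1 versus B2+B3.
left, right = split_node(samples, "B", {"B1"})
```

Both child nodes still carry every selectable value of features A and C that occurs in their samples, so the same procedure can be applied to them again.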
In this technical solution, optionally, constructing the decision tree according to the value distribution function divergence of the at least one feature includes: for the current node, splitting the current node into two child nodes, taking the feature with the maximum value distribution function divergence as the splitting basis; and traversing all nodes to construct a binary decision tree. For a node that currently needs to be split, the feature with the maximum value distribution function divergence may be used as the splitting basis. In the above example, if feature B has the largest value distribution function divergence among features A, B, and C, the node is split on feature B, and specifically according to the grouping of feature selectable values that yields the maximum divergence value, such as B1 and B2+B3 above. In this technical solution, the constructed decision tree may be a binary decision tree, which is characterized in that a node, if split, is always split into exactly two child nodes.
In this technical solution, optionally, before splitting the current node into two child nodes with the feature with the maximum value distribution function divergence as the splitting basis, the method further includes: stopping splitting of the current node if the number of sample data in the current node is smaller than a preset number threshold, or if the height of the current node reaches a preset height threshold. The advantage of this arrangement is that the sample data in each node remains representative. If nodes are split blindly, so that a node contains too few samples or lies too deep in the tree, the sample data in that node becomes less representative and can no longer reflect the value distribution function of sample data sharing common features, which reduces the accuracy of the value distribution function determined for the target data through the decision tree.
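The stopping condition can be sketched as a single predicate. The threshold values `min_samples=50` and `max_height=4` are hypothetical defaults chosen for illustration; the disclosure only states that both thresholds are preset.

```python
def should_stop_splitting(num_samples, height, min_samples=50, max_height=4):
    """Return True when the current node must not be split further.

    min_samples and max_height are assumed example thresholds.
    """
    return num_samples < min_samples or height >= max_height


too_few = should_stop_splitting(num_samples=30, height=2)
too_deep = should_stop_splitting(num_samples=500, height=4)
can_split = not should_stop_splitting(num_samples=500, height=2)
```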
S130, determining a value distribution function of the target data according to the decision tree.
During construction of the decision tree, each split is made on the grouping whose value distribution functions have the largest divergence, so the value distribution functions of the sample data within each resulting intermediate node or leaf node are as close to one another as possible. The value distribution of the target data can therefore be determined from the value distribution function of each node.
In this technical solution, optionally, determining the value distribution function of the target data according to the decision tree includes: determining the value distribution function of the target data according to the feature selectable values of each feature of the sample data in the nodes of the decision tree. The sample data in each node may be labeled with its features and feature selectable values. Once the target data is obtained, the node or nodes in the decision tree whose features and feature selectable values are closest to those of the target data can be identified, and the value distribution function of the target data is then determined from the value distribution function of that node or those nodes.
According to the technical solution provided by the embodiments of the present disclosure, the value distribution function divergence of at least one feature is determined according to the feature selectable values of the at least one feature of the sample data; a decision tree is constructed according to the value distribution function divergence of the at least one feature; and the value distribution function of the target data is determined according to the decision tree. With this technical solution, the value distribution function of information display can be determined from existing data.
On the basis of the above technical solutions, optionally, after the value distribution function of the target data is determined according to the decision tree, the method further includes: determining an expected value of the target data according to the value distribution function of the target data. Once the value distribution function is determined, the expected value of the current target data can be determined, which helps the platform evaluate the value of the target data in advance. The information presenter can also use the value distribution function of the target data to determine the probability that the value currently offered for the target data will obtain an information display opportunity. This helps the platform manage information display and helps the information presenter adjust the offered value according to the actual situation.
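As a small sketch, the expected value can be taken as the probability-weighted mean of the values in a discretized value distribution function. The name `expected_value` and the distribution `target_dist` are hypothetical, and treating the expected value as this weighted mean is an assumption, since the disclosure does not give a formula.

```python
def expected_value(dist):
    """Probability-weighted mean of a discrete value distribution.

    dist maps each value to its probability; probabilities are assumed to sum to 1.
    """
    return sum(value * prob for value, prob in dist.items())


# Hypothetical value distribution function of some target data (values in yuan).
target_dist = {1: 0.1, 3: 0.2, 5: 0.4, 7: 0.2, 9: 0.1}
ev = expected_value(target_dist)
```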
Fig. 2 is a flowchart of a method for determining a value distribution function according to an embodiment of the present disclosure. This embodiment is a refinement based on the alternatives in the above embodiments. Specifically, determining the value distribution function divergence of at least one feature according to the feature selectable values of the at least one feature of the sample data includes: for each feature in the sample data, determining a value distribution function of each feature selectable value of the feature; performing candidate grouping on the feature selectable values of the feature to form a plurality of candidate set pairs, and determining a divergence value of the value distribution function between the two sets of each candidate set pair; and determining the value distribution function divergence of the feature according to the divergence values of the value distribution function of the candidate set pairs.
As shown in Fig. 2, the method for determining the value distribution function includes:
S210, for each feature in the sample data, determining a value distribution function of each feature selectable value of the feature.
When there are multiple features, a value distribution function of each feature selectable value can be determined for each feature. Since value data is greater than 0, the abscissa of the value distribution function may start from 0, and the ordinate may be the number of sample data falling on the current value, expressed as a proportion of all sample data. For example, if there are 100 sample data and 10 of them have a value of 5 yuan, the ordinate of the value distribution function at 5 yuan is 0.1. In general, the value distribution function as a whole resembles an incomplete normal distribution curve. For a given value on the abscissa, such as 5 yuan, the value distribution function gives the probability of obtaining information display if the information presenter competes with 5 yuan, namely the sum of the probabilities of all values smaller than 5 yuan in the value distribution function.
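The numerical example above can be reproduced with a short sketch that builds a discrete value distribution function from raw sample values and sums the probabilities below a given offer. The names `value_distribution` and `probability_below` and the sample values are assumptions for illustration.

```python
from collections import Counter


def value_distribution(values):
    """Map each observed value to the fraction of samples that take it."""
    counts = Counter(values)
    total = len(values)
    return {v: counts[v] / total for v in sorted(counts)}


def probability_below(dist, offer):
    """Sum of the probabilities of all values strictly smaller than `offer`."""
    return sum(p for v, p in dist.items() if v < offer)


# 100 hypothetical samples (values in yuan), 10 of which are at 5 yuan.
sample_values = [1] * 20 + [3] * 30 + [5] * 10 + [7] * 40
dist = value_distribution(sample_values)
p_display = probability_below(dist, 5)
```

In this sketch `dist[5]` is the ordinate at 5 yuan and `p_display` is the probability mass of all values below 5 yuan.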
S220, performing candidate grouping on the feature selectable values of the feature to form a plurality of candidate set pairs, and determining a divergence value of the value distribution function between the two sets of each candidate set pair.
If a feature has three feature selectable values, for example a1, a2, and a3 of feature a, the candidate set pairs may be: the a1 set and the a2+a3 set; the a2 set and the a1+a3 set; and the a3 set and the a1+a2 set. With three feature selectable values, three candidate set pairs can thus be constructed. It will be understood that if a feature has only two feature selectable values, only one set pair can be formed; the word "candidate" then has no practical meaning, and only the divergence value of that one set pair needs to be calculated.
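The grouping described here amounts to pairing each feature selectable value with the set of all remaining values. A minimal sketch follows; the function name `candidate_set_pairs` is hypothetical, and one-versus-rest grouping is the only scheme shown because it is the one the example describes.

```python
def candidate_set_pairs(values):
    """Build one-versus-rest candidate set pairs from feature selectable values."""
    values = list(values)
    if len(values) < 2:
        return []  # a single value cannot be split
    if len(values) == 2:
        # With two values, both one-versus-rest pairs describe the same split.
        return [({values[0]}, {values[1]})]
    return [({v}, set(values) - {v}) for v in values]


pairs_three = candidate_set_pairs(["a1", "a2", "a3"])
pairs_two = candidate_set_pairs(["male", "female"])
```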
After the candidate set pairs are obtained, the divergence value of the value distribution function is determined for each candidate set pair, that is, between the two sets of the pair. In this embodiment, the divergence value may be calculated directly, or it may be determined from the value distribution function of each feature selectable value together with the proportion that the feature selectable value accounts for in its set. The advantage of this arrangement is that it speeds up calculation of the divergence value of the value distribution function between the two sets of a candidate set pair.
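One possible reading of the second option is that the value distribution function of a set is the mixture of the per-value distribution functions, weighted by each value's share of the samples in the set, so that per-value distributions are computed once and reused for every candidate set pair. The sketch below follows that reading; it is an interpretation rather than a formula given in the disclosure, and `pooled_distribution` and the example numbers are hypothetical.

```python
def pooled_distribution(dists, counts, members):
    """Count-weighted mixture of the distributions of the values in `members`.

    dists maps a feature selectable value to its distribution over value bins;
    counts maps it to its number of samples.
    """
    total = sum(counts[m] for m in members)
    bins = len(next(iter(dists.values())))
    return [
        sum(dists[m][i] * counts[m] for m in members) / total
        for i in range(bins)
    ]


# Hypothetical per-value distributions over two value bins.
dists = {"a2": [0.5, 0.5], "a3": [0.1, 0.9]}
counts = {"a2": 100, "a3": 300}
pooled = pooled_distribution(dists, counts, ["a2", "a3"])
```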
S230, determining the value distribution function divergence of the feature according to the divergence values of the value distribution function of the candidate set pairs.
In this embodiment, the maximum of the divergence values of the value distribution function over all candidate set pairs may be used as the value distribution function divergence of the feature.
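Putting steps S210 to S230 together, the divergence of one feature can be sketched as the maximum KL divergence over its one-versus-rest candidate set pairs. The helper names, the count-weighted pooling of each set, the direction of the KL divergence, and the example distributions are all assumptions made for illustration.

```python
import math


def kl(p, q, eps=1e-12):
    """Relative entropy D_KL(p || q); eps is an assumed smoothing constant."""
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)


def pooled(dists, counts, members):
    """Count-weighted mixture of the distributions of the values in `members`."""
    total = sum(counts[m] for m in members)
    bins = len(next(iter(dists.values())))
    return [sum(dists[m][i] * counts[m] for m in members) / total for i in range(bins)]


def feature_divergence(dists, counts):
    """Return (maximum divergence, best candidate set pair) for one feature."""
    values = sorted(dists)
    best_div, best_pair = 0.0, None
    for v in values:
        rest = [u for u in values if u != v]
        if not rest:
            continue
        d = kl(pooled(dists, counts, [v]), pooled(dists, counts, rest))
        if best_pair is None or d > best_div:
            best_div, best_pair = d, ([v], rest)
    return best_div, best_pair


# Hypothetical distributions of feature B over three value bins.
dists = {"B1": [0.7, 0.2, 0.1], "B2": [0.2, 0.5, 0.3], "B3": [0.25, 0.45, 0.3]}
counts = {"B1": 100, "B2": 100, "B3": 100}
div, pair = feature_divergence(dists, counts)
```

Here B2 and B3 have similar distributions while B1 differs from both, so the pair B1 versus B2+B3 yields the largest divergence value.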
S240, constructing a decision tree according to the value distribution function divergence of the at least one feature.
In this technical solution, after the value distribution function divergence of one feature is calculated, all features can be traversed to obtain a ranking of the value distribution function divergences of all features, and the feature with the maximum divergence can be used as the splitting basis of the current node so as to build the decision tree.
S250, determining a value distribution function of the target data according to the decision tree.
On the basis of the above technical solutions, this technical solution provides a method of grouping the feature selectable values of each feature in the sample data into candidate set pairs and determining the value distribution function divergence of the feature. The advantage is that, during construction of the decision tree, each node is split on the condition with the largest difference, so that the value distribution functions of the sample data within each resulting node are as close to one another as possible, which facilitates determining the value distribution function of the target data through the constructed decision tree.
Fig. 3 is a flowchart of a method for determining a value distribution function according to an embodiment of the present disclosure. This embodiment is a refinement based on the alternatives in the above embodiments. Specifically, determining the value distribution function of the target data according to the feature selectable values of each feature of the sample data in the nodes of the decision tree includes: acquiring a target selectable value of a target feature of the target data; determining a matching node that matches the target data according to the similarity between the target selectable value and the feature selectable values of each feature of the sample data in the nodes of the decision tree; and determining the value distribution function of the target data according to the value distribution function of the sample data in the matching node.
As shown in Fig. 3, the method for determining the value distribution function includes:
S310, determining the value distribution function divergence of at least one feature according to the feature selectable values of the at least one feature of the sample data.
S320, constructing a decision tree according to the value distribution function divergence of the at least one feature.
S330, acquiring a target selectable value of a target feature of the target data.
The target features may be features consistent with those involved in constructing the current decision tree. It will be understood that if the target data has a feature D that was not involved in constructing the decision tree, the prediction made through the decision tree cannot accurately reflect feature D. The target selectable value of a target feature is the specific value that the target data takes for that feature, for example male or female for the gender feature.
S340, determining a matching node that matches the target data according to the similarity between the target selectable value and the feature selectable values of each feature of the sample data in the nodes of the decision tree.
In this technical solution, the features of the sample data in a node of the decision tree need not be unique, but any feature that is the same as or similar to a target feature can be used to match the target data. The similarity between the target selectable values and the data in each node may be computed node by node, and the node with the highest similarity taken as the matching node. The number of matching nodes is not limited to one: when there are several matching nodes, their value distribution functions can be combined by weighted averaging to obtain the value distribution function of the matching nodes. The weights may be based on the number of sample data in each matching node, or on the similarity to the target data, with a higher similarity giving a higher weight.
S350, determining a value distribution function of the target data according to the value distribution function of the sample data in the matching node.
After the matching node is determined, a value distribution function of the target data can be estimated according to the value distribution function of the node.
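Steps S330 to S350 can be sketched in Python as follows. This is only an illustration under stated assumptions: each node is represented as a dictionary with hypothetical keys `options` (the feature selectable values seen in the node), `count` (the number of sample data) and `dist` (the value distribution function as a mapping from value to probability), and the similarity is assumed to be the fraction of shared features whose target selectable value occurs in the node. The text does not fix a similarity measure.

```python
def similarity(node_options, target):
    """Fraction of shared features whose target value occurs in the node (assumed measure)."""
    shared = [f for f in target if f in node_options]
    if not shared:
        return 0.0
    hits = sum(1 for f in shared if target[f] in node_options[f])
    return hits / len(shared)


def match_nodes(nodes, target):
    """Return every node that attains the highest similarity to the target data."""
    scores = [similarity(node["options"], target) for node in nodes]
    top = max(scores)
    return [node for node, s in zip(nodes, scores) if s == top]


def blend(matching_nodes):
    """Average the matching nodes' distributions, weighted by sample count."""
    total = sum(node["count"] for node in matching_nodes)
    blended = {}
    for node in matching_nodes:
        weight = node["count"] / total
        for value, prob in node["dist"].items():
            blended[value] = blended.get(value, 0.0) + weight * prob
    return blended
```

Weighting by similarity instead of sample count, the other option mentioned above, would only change how `weight` is computed.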
Building on the preceding technical solutions, this solution provides a specific method for determining the value distribution function of the target data. By determining the matching nodes and deriving the value distribution function of the target data from them, the value distribution function of the target data can be determined from historical data. This solves the technical problem that the value distribution function cannot be determined in the prior art, allows the platform to manage and control the value of the target data, and also helps an information presenter determine the information presentation value of the target data according to its own requirements.
Example two
Fig. 4 is a schematic structural diagram of an apparatus for determining a value distribution function according to a second embodiment of the present disclosure. As shown in fig. 4, the apparatus for determining the value distribution function includes:
a value distribution function divergence determining module 410, configured to determine a value distribution function divergence of at least one feature according to a feature selectable value of the at least one feature of the sample data;
a decision tree construction module 420, configured to construct a decision tree according to the divergence of the value distribution function of the at least one feature;
and a value distribution function determining module 430 for determining a value distribution function of the target data according to the decision tree.
According to the technical scheme provided by the embodiment of the disclosure, the value distribution function divergence of at least one characteristic is determined according to the characteristic selectable value of the at least one characteristic of the sample data; constructing a decision tree according to the value distribution function divergence of the at least one feature; and determining a value distribution function of the target data according to the decision tree. By adopting the technical scheme provided by the disclosure, the effect of determining the value distribution function of information display through the existing data can be realized.
On the basis of the above technical solutions, optionally, the value distribution function divergence determining module 410 includes:
the characteristic selectable value distribution determining unit is used for determining a value distribution function of each characteristic selectable value of the characteristic aiming at each characteristic in the sample data;
a candidate set pair divergence value determining unit, configured to perform candidate grouping on each feature selectable value of the feature to form a plurality of candidate set pairs, and determine divergence values of a value distribution function between the candidate set pairs;
and the value distribution function divergence determining unit is used for determining the value distribution function divergence of the characteristic according to the divergence value of the value distribution function between each candidate set pair.
On the basis of the above technical solutions, optionally, the decision tree building module 420 includes:
the node splitting unit is used for splitting the current node into two sub-nodes, taking the feature with the maximum value distribution function divergence as the splitting basis;
and the binary decision tree construction unit is used for traversing all the nodes to construct a binary decision tree.
On the basis of the above technical solutions, optionally, the decision tree building module 420 further includes:
and the splitting termination determining unit is used for stopping splitting the current node if the number of the sample data in the current node is less than a preset number threshold or if the height of the current node reaches a preset height threshold.
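The node splitting and splitting termination logic described for the decision tree building module can be sketched in Python as follows. This is a minimal illustration, not the disclosed implementation: the threshold constants are hypothetical, the total variation distance stands in for the unspecified divergence measure, and the height of a node is counted as its depth below the root.

```python
from collections import Counter
from itertools import combinations

MIN_SAMPLES = 4  # hypothetical preset number threshold
MAX_HEIGHT = 3   # hypothetical preset height threshold


def tv_distance(a, b):
    """Total variation distance between the empirical distributions of two value lists."""
    ca, cb = Counter(a), Counter(b)
    return 0.5 * sum(abs(ca[k] / len(a) - cb[k] / len(b)) for k in set(ca) | set(cb))


def best_split(samples):
    """Return (divergence, feature, left_options) of the most divergent candidate set pair."""
    best = (0.0, None, None)
    for feature in samples[0][0]:
        options = sorted({f[feature] for f, _ in samples})
        for r in range(1, len(options)):  # both groups stay non-empty
            for left in combinations(options, r):
                left_values = [v for f, v in samples if f[feature] in left]
                right_values = [v for f, v in samples if f[feature] not in left]
                d = tv_distance(left_values, right_values)
                if d > best[0]:
                    best = (d, feature, set(left))
    return best


def build(samples, height=0):
    """Recursively build a binary tree over samples given as (feature_dict, value) pairs."""
    node = {"samples": samples, "feature": None}
    # Splitting termination: too few samples, or the height threshold is reached.
    if len(samples) < MIN_SAMPLES or height >= MAX_HEIGHT:
        return node
    _, feature, left = best_split(samples)
    if feature is None:  # no candidate set pair separates the value distributions
        return node
    node["feature"] = feature
    node["left_options"] = left
    node["left"] = build([s for s in samples if s[0][feature] in left], height + 1)
    node["right"] = build([s for s in samples if s[0][feature] not in left], height + 1)
    return node
```

The recursion visits every node once, which corresponds to traversing all nodes to construct the binary decision tree.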
On the basis of the above technical solutions, optionally, the value distribution function determining module 430 includes:
and the value distribution function determining unit is used for determining the value distribution function of the target data according to the feature selectable value of each feature of the sample data in the node in the decision tree.
On the basis of the above technical solutions, optionally, the value distribution function determining unit includes:
a target selectable value determination subunit, configured to obtain a target selectable value of a target feature of the target data;
the target optional value matching subunit is used for determining a matching node matched with the target data according to the similarity between the feature optional value of each feature of the sample data in the node in the decision tree and the target optional value;
and the value distribution function determining subunit is used for determining a value distribution function of the target data according to the value distribution function of the sample data in the matching node.
On the basis of the above technical solutions, optionally, the apparatus further includes:
and the expected value determining module is used for determining the expected value of the target data according to the value distribution function of the target data.
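As a small illustration of what the expected value determining module computes, the Python sketch below takes a discrete value distribution function, represented as a hypothetical mapping from value to probability, and returns its expectation. The discrete representation and the renormalisation step are assumptions made for the example.

```python
def expected_value(dist):
    """Expectation of a discrete value distribution given as {value: probability}."""
    total = sum(dist.values())
    if total <= 0:
        raise ValueError("distribution has no probability mass")
    # Dividing by the total tolerates distributions that are not exactly normalised.
    return sum(value * prob for value, prob in dist.items()) / total
```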
The apparatus can execute the method provided by any embodiment of the disclosure, and has the functional modules and beneficial effects corresponding to the executed method.
Example three
Fig. 5 is a schematic structural diagram of an electronic device according to a third embodiment of the present disclosure. Referring now to FIG. 5, a block diagram of an electronic device 500 suitable for use in implementing embodiments of the present disclosure is shown. The electronic devices in the embodiments of the present disclosure may include, but are not limited to, electronic devices such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players), in-vehicle electronic devices (e.g., in-vehicle navigation electronic devices), and the like, and stationary electronic devices such as digital TVs, desktop computers, and the like. The electronic device shown in fig. 5 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present disclosure.
As shown in fig. 5, the electronic device 500 may include a processing device (e.g., central processing unit, graphics processor, etc.) 501 that may perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM) 502 or a program loaded from a storage device 508 into a Random Access Memory (RAM) 503. The RAM 503 also stores various programs and data necessary for the operation of the electronic device 500. The processing device 501, the ROM 502, and the RAM 503 are connected to each other through a bus 504. An input/output (I/O) interface 505 is also connected to the bus 504.
Generally, the following devices may be connected to the I/O interface 505: input devices 506 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; output devices 507 including, for example, a Liquid Crystal Display (LCD), speakers, vibrators, and the like; storage devices 508 including, for example, magnetic tape, hard disk, etc.; and a communication device 509. The communication means 509 may allow the electronic device 500 to communicate with other devices wirelessly or by wire to exchange data. While fig. 5 illustrates an electronic device 500 having various means, it is to be understood that not all illustrated means are required to be implemented or provided. More or fewer devices may alternatively be implemented or provided.
In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication means 509, or installed from the storage means 508, or installed from the ROM 502. The computer program performs the above-described functions defined in the methods of the embodiments of the present disclosure when executed by the processing device 501.
It should be noted that the computer readable medium in the present disclosure can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. In contrast, in the present disclosure, a computer readable signal medium may comprise a propagated data signal with computer readable program code embodied therein, either in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, optical cables, RF (radio frequency), etc., or any suitable combination of the foregoing.
The computer readable medium may be embodied in the electronic device; or may exist separately without being assembled into the electronic device.
The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: determining the value distribution function divergence of at least one characteristic according to the characteristic selectable value of the at least one characteristic of the sample data; constructing a decision tree according to the value distribution function divergence of the at least one feature; and determining a value distribution function of the target data according to the decision tree.
Computer program code for carrying out operations for aspects of the present disclosure may be written in any combination of one or more programming languages, including object oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in the embodiments of the present disclosure may be implemented by software or hardware. In some cases, the names of the modules and units do not constitute a limitation on the modules and units themselves.
The foregoing description is only exemplary of the preferred embodiments of the disclosure and is illustrative of the principles of the technology employed. It will be appreciated by those skilled in the art that the scope of the disclosure herein is not limited to the particular combination of features described above, but also encompasses other embodiments formed by any combination of the features described above or their equivalents without departing from the spirit of the disclosure. One example is a technical solution formed by interchanging the above features with features of similar function disclosed in (but not limited to) this disclosure.

Claims (10)

1. A method for determining a value distribution function, comprising:
determining the value distribution function divergence of at least one characteristic according to the characteristic selectable value of the at least one characteristic of the sample data;
constructing a decision tree according to the value distribution function divergence of the at least one feature;
and determining a value distribution function of the target data according to the decision tree.
2. The method of claim 1, wherein determining the value distribution function divergence of at least one feature of the sample data according to the feature selectable value of the at least one feature comprises:
for each feature in the sample data, determining a value distribution function of each feature selectable value of the feature;
performing candidate grouping on the feature selectable values of the feature to form a plurality of candidate set pairs, and determining divergence values of the value distribution function between the candidate set pairs;
and determining the divergence of the value distribution function of the characteristic according to the divergence value of the value distribution function between each candidate set pair.
3. The method of claim 1, wherein constructing a decision tree based on a divergence of a value distribution function of the at least one feature comprises:
for the current node, taking the feature with the maximum value distribution function divergence as the splitting basis, and splitting the current node into two sub-nodes;
and traversing all the nodes to construct a binary decision tree.
4. The method of claim 3, wherein before splitting the current node into two sub-nodes with the feature having the maximum value distribution function divergence as the splitting basis, the method further comprises:
and if the number of the sample data in the current node is smaller than a preset number threshold, or if the height of the current node reaches a preset height threshold, stopping splitting the current node.
5. The method of claim 1, wherein determining a value distribution function for target data from the decision tree comprises:
and determining a value distribution function of the target data according to the feature selectable value of each feature of the sample data in the nodes in the decision tree.
6. The method of claim 5, wherein determining the value distribution function of the target data according to the feature selectable value of each feature of the sample data in the node in the decision tree comprises:
acquiring a target selectable value of a target feature of target data;
determining a matching node matched with the target data according to the similarity between the feature selectable value of each feature of the sample data in the node in the decision tree and the target selectable value;
and determining a value distribution function of the target data according to the value distribution function of the sample data in the matching node.
7. The method of claim 1, wherein after determining a value distribution function for target data from the decision tree, the method further comprises:
and determining the expected value of the target data according to the value distribution function of the target data.
8. An apparatus for determining a value distribution function, comprising:
the value distribution function divergence determining module is used for determining the value distribution function divergence of at least one characteristic according to the characteristic selectable value of the at least one characteristic of the sample data;
the decision tree construction module is used for constructing a decision tree according to the value distribution function divergence of the at least one characteristic;
and the value distribution function determining module is used for determining a value distribution function of the target data according to the decision tree.
9. A computer-readable medium, on which a computer program is stored which, when being executed by a processor, carries out the method of determining a value distribution function according to any one of claims 1 to 7.
10. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the method of determining a value distribution function according to any of claims 1-7 when executing the computer program.
CN201811347255.XA 2018-11-13 2018-11-13 Method and device for determining value distribution function, electronic equipment and readable medium Pending CN111178534A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811347255.XA CN111178534A (en) 2018-11-13 2018-11-13 Method and device for determining value distribution function, electronic equipment and readable medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811347255.XA CN111178534A (en) 2018-11-13 2018-11-13 Method and device for determining value distribution function, electronic equipment and readable medium

Publications (1)

Publication Number Publication Date
CN111178534A true CN111178534A (en) 2020-05-19

Family

ID=70647348

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811347255.XA Pending CN111178534A (en) 2018-11-13 2018-11-13 Method and device for determining value distribution function, electronic equipment and readable medium

Country Status (1)

Country Link
CN (1) CN111178534A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination