WO2018040561A1 - Data processing method, device, and system - Google Patents


Info

Publication number
WO2018040561A1
WO2018040561A1 PCT/CN2017/079791 CN2017079791W WO2018040561A1 WO 2018040561 A1 WO2018040561 A1 WO 2018040561A1 CN 2017079791 W CN2017079791 W CN 2017079791W WO 2018040561 A1 WO2018040561 A1 WO 2018040561A1
Authority
WO
WIPO (PCT)
Prior art keywords
feature
target
algorithm
data
parameters
Prior art date
Application number
PCT/CN2017/079791
Other languages
English (en)
Chinese (zh)
Inventor
刘冬
Original Assignee
华为技术有限公司
Priority date
Filing date
Publication date
Application filed by 华为技术有限公司
Publication of WO2018040561A1

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 - Computing arrangements using knowledge-based models
    • G06N5/02 - Knowledge representation; Symbolic representation
    • G06N5/022 - Knowledge engineering; Knowledge acquisition

Definitions

  • the present application relates to the field of computer technologies, and in particular, to a data processing method, apparatus, and system.
  • an operator can process user data generated on the network side manually, but because the amount of data to be processed is large, manual processing is inefficient. Therefore, in the related art, multiple user data are processed according to a feature selection algorithm and a machine learning algorithm to determine whether each of the multiple user data has a preset feature, and in turn whether the user corresponding to each user data has a preset attribute. For example, when multiple users of a communication carrier (such as China Mobile) communicate over the network provided by that carrier, the network side generates a large amount of user data, such as the user's fees (which reflect the user's consumption level) and the user's bills (which reflect the user's use of China Mobile's services).
  • the communication carrier may substitute the multiple user data generated by the network side into a feature selection algorithm (such as a feature space algorithm) to determine a feature set, and then substitute the feature set into a machine learning algorithm to determine, among the multiple user data, first user data that has the preset feature (the service the user uses most frequently is the preset service) and second user data that does not have the preset feature. Offer information related to the preset service is then sent to the user corresponding to the first user data.
  • the present application provides a data processing method, device and system.
  • the technical solution is as follows:
  • a data processing method comprising:
  • acquiring data to be processed, where a set of data parameters of the data to be processed is a target parameter group;
  • after the data to be processed is acquired, substituting the target parameter group into a preset algorithm model to determine the target algorithm corresponding to the target parameter group, where the target algorithm is the algorithm that obtains the optimal evaluation value when at least one algorithm corresponding to the target parameter group is evaluated according to a preset evaluation algorithm;
  • after the target algorithm corresponding to the target parameter group is determined, processing the data to be processed according to that target algorithm to determine the attributes of the data to be processed.
  • the data parameter is used to describe a feature of the data
  • the target parameter group is used to describe a set of features of the to-be-processed data.
  • the target algorithm corresponding to the target parameter group may be determined directly according to the preset algorithm model. The target algorithm indicated by the preset algorithm model is the algorithm that obtains the optimal evaluation value when at least one algorithm corresponding to the target parameter group is evaluated according to the preset evaluation algorithm. The attributes determined by processing the data to be processed with this target algorithm are therefore the most accurate, which improves the accuracy of the determined attributes of the data to be processed.
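  • a minimal sketch of the lookup-then-process flow described above, not the application's implementation. The dictionary-based model, the parameter-group tuples, and the two toy algorithms (`high_usage`, `any_peak`) are all assumptions made for illustration:

```python
from typing import Callable, Dict, Sequence, Tuple

# Hypothetical types: a parameter group is a tuple of metadata values, and an
# algorithm maps the data to be processed to an attribute label.
ParamGroup = Tuple[str, ...]
Algorithm = Callable[[Sequence[float]], str]


def high_usage(data: Sequence[float]) -> str:
    """Toy algorithm: label the data by its mean value."""
    return "preset" if sum(data) / len(data) > 0.5 else "other"


def any_peak(data: Sequence[float]) -> str:
    """Toy algorithm: label the data by its maximum value."""
    return "preset" if max(data) > 0.9 else "other"


# Preset algorithm model (assumed to be a plain mapping): each known parameter
# group points at the algorithm that scored best for it in an earlier evaluation.
PRESET_ALGORITHM_MODEL: Dict[ParamGroup, Algorithm] = {
    ("carrier_1", "billing"): high_usage,
    ("carrier_2", "billing"): any_peak,
}


def determine_attribute(data: Sequence[float], target_params: ParamGroup) -> str:
    """Look up the target algorithm for the parameter group, then apply it."""
    target_algorithm = PRESET_ALGORITHM_MODEL[target_params]
    return target_algorithm(data)


attribute = determine_attribute([0.2, 0.95, 0.1], ("carrier_2", "billing"))
```

  • the point of the sketch is that no evaluation happens at processing time: the comparison of candidate algorithms is done once, when the model is built, and processing only performs a lookup.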
  • the target algorithm may include a target feature selection algorithm and a target machine learning algorithm. Before the target parameter group is substituted into the preset algorithm model, n sample sets may also be acquired. Each of the n sample sets has a set of data parameters, so the n sample sets have n sets of data parameters; the n sets of data parameters may include the target parameter group, and n is an integer greater than or equal to 1. The target feature selection algorithm and the target machine learning algorithm corresponding to each of the n sets of data parameters are then determined; for example, after each sample set is acquired, the target feature selection algorithm and the target machine learning algorithm corresponding to the set of data parameters of that sample set may be determined. Finally, the preset algorithm model is determined according to the target feature selection algorithm and the target machine learning algorithm corresponding to each of the n sets of data parameters.
  • the preset algorithm model can indicate the target algorithm corresponding to at least one set of data parameters, so when data to be processed is handled, its target algorithm can be determined quickly from the preset algorithm model, which improves the speed and efficiency of data processing.
  • the first sample set is any one of the n sample sets
  • the at least one feature selection algorithm and the at least one machine learning algorithm corresponding to the first sample set may be used to process the first sample set.
  • Determining the target feature selection algorithm and the target machine learning algorithm corresponding to each of the n sets of data parameters may include: first, substituting the first sample set into at least one feature selection algorithm (i.e., the at least one feature selection algorithm corresponding to the first sample set) to obtain at least one feature set, and determining the obtained at least one feature set as the at least one feature set corresponding to the set of data parameters of the first sample set; then, substituting each of those feature sets into at least one machine learning algorithm to obtain at least one processing model, and determining the obtained at least one processing model as the at least one processing model corresponding to the set of data parameters of the first sample set; finally, determining, according to a preset evaluation algorithm, an evaluation value corresponding to each processing model in the at least one processing model, and taking the feature selection algorithm and the machine learning algorithm corresponding to the processing model with the optimal evaluation value as the target feature selection algorithm and the target machine learning algorithm corresponding to the set of data parameters of the first sample set.
  • the first sample set is any one of the n sample sets; that is, for the process of determining the target feature selection algorithm and the target machine learning algorithm corresponding to each of the n sample sets, reference may be made to the process described above for the first sample set.
  • the target feature selection algorithm and the target machine learning algorithm corresponding to the target parameter group of the data to be processed may be determined directly according to the preset algorithm model. The whole process takes little time and therefore improves the speed and efficiency of data processing.
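  • the construction of the preset algorithm model described above can be sketched as an exhaustive search over (feature selection algorithm, machine learning algorithm) pairs. This is an illustration under stated assumptions only: the two selectors, the two toy learners, and the use of plain accuracy on the sample set as the "preset evaluation algorithm" are hypothetical stand-ins, since the text does not fix any of them:

```python
from itertools import product

# A sample set is assumed to be a list of (feature_vector, label) pairs.


def select_all(samples):
    """Hypothetical selector: keep every feature index."""
    return list(range(len(samples[0][0])))


def select_first(samples):
    """Hypothetical selector: keep only the first feature."""
    return [0]


def train_threshold(samples, features):
    """Toy learner: predict 1 when the mean of the selected features exceeds 0.5."""
    def predict(x):
        return int(sum(x[i] for i in features) / len(features) > 0.5)
    return predict


def train_majority(samples, features):
    """Toy learner: always predict the most common label in the sample set."""
    labels = [y for _, y in samples]
    majority = max(set(labels), key=labels.count)
    return lambda x: majority


def evaluate(model, samples):
    """Stand-in evaluation algorithm: accuracy on the sample set itself."""
    return sum(model(x) == y for x, y in samples) / len(samples)


def best_pair(samples, selectors, learners):
    """Evaluate every selector/learner pair and return the best-scoring names."""
    scored = []
    for (s_name, select), (l_name, train) in product(selectors.items(), learners.items()):
        model = train(samples, select(samples))
        scored.append((evaluate(model, samples), s_name, l_name))
    _, s_name, l_name = max(scored)
    return s_name, l_name


def build_preset_model(sample_sets, selectors, learners):
    """Map each sample set's parameter group to its target algorithm pair."""
    return {params: best_pair(samples, selectors, learners)
            for params, samples in sample_sets.items()}


selectors = {"all": select_all, "first": select_first}
learners = {"threshold": train_threshold, "majority": train_majority}
sample_sets = {
    ("scenario_a",): [([0.9, 0.1], 1), ([0.8, 0.2], 1), ([0.1, 0.9], 0), ([0.2, 0.8], 0)],
}
preset_model = build_preset_model(sample_sets, selectors, learners)
```

  • a real evaluation would normally score each processing model on held-out data rather than on the sample set it was built from; the sketch skips that to stay short.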
  • the target algorithm may include a target feature selection algorithm and a target machine learning algorithm, and determining the attributes of the data to be processed according to the target algorithm corresponding to the target parameter group includes: first, substituting the data to be processed into the target feature selection algorithm corresponding to the target parameter group to obtain a feature set, and determining the obtained feature set as a target feature set, where the target feature set includes p features, each of the p features has a set of feature parameters (so the p features have p sets of feature parameters), p is an integer greater than or equal to 1, and each feature in the feature set has a weight;
  • then, substituting the p sets of feature parameters of the p features into the preset weight change model to determine the weight change value corresponding to each of the p sets of feature parameters;
  • next, updating the weight of each feature in the target feature set according to the determined weight change values, that is, taking the sum of a feature's weight and the weight change value corresponding to that feature's set of feature parameters as the updated weight of the feature;
  • finally, determining the attributes of the data to be processed according to the target feature set with updated weights and the target machine learning algorithm corresponding to the target parameter group.
  • the preset weight change model may be established in advance according to the empirical values of the staff. Because the preset weight change model is determined in advance, after the target feature set is obtained with the automatic feature selection algorithm, the weights of the features in the target feature set can also be updated with reference to the staff's empirical values, so that the processing model obtained by substituting the updated target feature set into the machine learning algorithm has a better processing effect.
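  • a minimal sketch of the weight update step, assuming the preset weight change model can be represented as a mapping from a feature's parameter set (its metadata) to a weight change value. The feature names, the parameter tuples, and the default change of 0.0 for parameter sets the model does not know are hypothetical choices:

```python
def update_weights(weights, feature_params, weight_change_model):
    """Return new weights: old weight plus the change looked up by feature parameters."""
    updated = {}
    for feature, weight in weights.items():
        # Assumption: a parameter set missing from the model leaves the weight unchanged.
        change = weight_change_model.get(feature_params[feature], 0.0)
        updated[feature] = weight + change
    return updated


# Hypothetical target feature set with p = 2 features and their weights.
weights = {"monthly_fee": 0.5, "call_minutes": 0.25}
# One set of feature parameters (metadata) per feature.
feature_params = {
    "monthly_fee": ("numeric", "billing"),
    "call_minutes": ("numeric", "usage"),
}
# Preset weight change model: parameter set -> weight change value.
weight_change_model = {("numeric", "billing"): 0.25}

updated = update_weights(weights, feature_params, weight_change_model)
```

  • keying the model on feature parameters rather than on feature names is what lets a change learned on one sample set apply to a different feature with the same metadata in the data to be processed.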
  • the method may further include: acquiring m sample sets, where the m sets of data parameters of the m sample sets include the target parameter group and m is an integer greater than or equal to 1 (m may or may not be equal to n); after the m sample sets are acquired, determining the target feature selection algorithm corresponding to each of the m sets of data parameters; then determining an initial feature set, which consists of the features in the feature sets obtained by substituting each of the m sample sets into the target feature selection algorithm corresponding to that sample set's set of data parameters, that is, each sample set is substituted into its corresponding target feature selection algorithm to obtain one feature set, the m sample sets yield m feature sets in total, and all the distinct features in the m feature sets make up the initial feature set; next, determining a reference feature set, which consists of the features in the feature sets obtained by substituting each of the m sample sets into a reference feature selection algorithm; and finally, comparing the reference feature set with the initial feature set, that is, determining, according to the reference feature set, the weight change value corresponding to the set of feature parameters of each feature in the initial feature set, and determining the preset weight change model according to the determined weight change values.
  • the preset weight change model is configured so that the weight change value corresponding to at least one set of feature parameters can be determined from it. When data to be processed is handled, the weight change value corresponding to each feature in the feature set of the data to be processed can be determined quickly according to the preset weight change model, and the data to be processed is then processed according to the feature set with updated weights, which improves the speed and efficiency of data processing.
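  • how the initial feature set is assembled from the m per-sample-set feature sets can be sketched as an order-preserving union. The selector functions and feature names below are hypothetical, and each selector is assumed to return a list of feature names:

```python
def initial_feature_set(sample_sets, target_selectors):
    """Union of the distinct features selected for each of the m sample sets."""
    distinct = []
    for params, samples in sample_sets.items():
        # Each sample set goes through the target selector for its own parameter group.
        for feature in target_selectors[params](samples):
            if feature not in distinct:
                distinct.append(feature)
    return distinct


# m = 2 hypothetical sample sets keyed by their data parameter groups.
sample_sets = {("carrier_1",): ["rows..."], ("carrier_2",): ["rows..."]}
target_selectors = {
    ("carrier_1",): lambda samples: ["fee", "bill"],
    ("carrier_2",): lambda samples: ["bill", "roaming"],
}

initial = initial_feature_set(sample_sets, target_selectors)
```

  • the reference feature set could be built the same way by passing one reference selector for every parameter group.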
  • determining, according to the reference feature set, the weight change value corresponding to the set of feature parameters of each feature in the initial feature set includes: substituting the initial feature set into a preset machine learning algorithm to determine a first processing model; substituting the reference feature set into the preset machine learning algorithm to determine a second processing model; evaluating the first processing model according to the preset evaluation algorithm to determine a first evaluation value; evaluating the second processing model according to the preset evaluation algorithm to determine a second evaluation value; and, after the first and second evaluation values are obtained, determining whether the second evaluation value is greater than the first evaluation value.
  • if the second evaluation value is greater than the first evaluation value and the reference feature set includes the first feature in the initial feature set, it may be determined that the reference feature selection algorithm has a better processing effect than the target feature selection algorithm corresponding to the set of data parameters of the first sample set, and the difference between the weight of the first feature in the reference feature set and its weight in the initial feature set is taken as the weight change value corresponding to the set of feature parameters of the first feature.
  • if the second evaluation value is greater than the first evaluation value and the reference feature set does not include the first feature, a preset weight change value is used as the weight change value corresponding to the first feature; that is, when the reference feature selection algorithm has a better processing effect than the target feature selection algorithm corresponding to the set of data parameters of the first sample set but the reference feature set does not include the first feature, a preset empirical value is set as the weight change value corresponding to the first feature.
  • if the second evaluation value is not greater than the first evaluation value, it may be determined that the processing effect of the target feature selection algorithm corresponding to the set of data parameters of the first sample set is at least as good as that of the reference feature selection algorithm; in this case, the weight change value corresponding to the first feature may be determined to be zero.
  • the processing model obtained with the target feature selection algorithm and the processing model obtained with the reference feature selection algorithm are evaluated separately. If the first evaluation value is greater than or equal to the second evaluation value, it may be determined that processing the target sample with the target feature selection algorithm is better than, or the same as, processing it with the reference feature selection algorithm; in this case, the staff's empirical values need not be consulted. If the first evaluation value is smaller than the second evaluation value, it may be determined that processing the target sample with the reference feature selection algorithm is better than processing it with the target feature selection algorithm; in this case, the weights of the features in the initial feature set are updated with reference to the empirical values, so that the processing model obtained by substituting the updated initial feature set into the machine learning algorithm has a better processing effect on the data to be processed.
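  • the three cases for the weight change value can be written as one small function. This is a sketch of the decision rule as described in the text; the function name, the argument layout, and the example numbers are assumptions:

```python
def weight_change(feature, initial_weights, reference_weights,
                  first_eval, second_eval, preset_change):
    """Weight change value for one feature of the initial feature set.

    first_eval:  evaluation value of the model built from the initial feature set.
    second_eval: evaluation value of the model built from the reference feature set.
    """
    if second_eval <= first_eval:
        # The target selector is at least as good: leave the weight alone.
        return 0.0
    if feature in reference_weights:
        # The reference selector is better and also selected this feature:
        # move the weight by the difference between the two weights.
        return reference_weights[feature] - initial_weights[feature]
    # The reference selector is better but did not select this feature:
    # fall back to the preset (empirical) weight change value.
    return preset_change


initial_weights = {"fee": 0.5, "bill": 0.25}
reference_weights = {"fee": 0.75}

change_fee = weight_change("fee", initial_weights, reference_weights, 0.8, 0.9, -0.125)
change_bill = weight_change("bill", initial_weights, reference_weights, 0.8, 0.9, -0.125)
change_none = weight_change("fee", initial_weights, reference_weights, 0.9, 0.9, -0.125)
```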
  • the target algorithm includes a target feature selection algorithm and a target machine learning algorithm, and according to the preset algorithm model, the target feature selection algorithm corresponding to a preset machine learning algorithm and each set of data parameters in at least one set of data parameters can be determined. After the target feature selection algorithm and the target machine learning algorithm corresponding to each set of data parameters in the at least one set of data parameters are determined, a preset machine learning algorithm and the target feature selection algorithm corresponding to each set of data parameters in the at least one set of data parameters are determined from them, thereby obtaining the preset algorithm model; the target feature set corresponding to the target parameter group and the preset machine learning algorithm is then determined according to the preset machine learning algorithm, the target parameter group, and the preset algorithm model.
  • the target algorithm includes a target feature selection algorithm and a target machine learning algorithm, and according to the preset algorithm model, the target feature selection algorithm and the target machine learning algorithm corresponding to each set of data parameters in at least one set of data parameters can be determined. After the target feature selection algorithm and the target machine learning algorithm corresponding to each set of data parameters in the at least one set of data parameters are determined, the preset algorithm model is obtained from them, and the target feature selection algorithm and the target machine learning algorithm corresponding to the target parameter group are obtained according to the target parameter group and the preset algorithm model.
  • the target feature selection algorithm corresponding to the target parameter group may include a feature selection algorithm based on information entropy or a feature selection algorithm based on inter-feature correlation; the target machine learning algorithm corresponding to the target parameter group includes a random forest (RF) machine learning algorithm, a logistic regression (LR) machine learning algorithm, or a support vector machine (SVM) machine learning algorithm.
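  • as one illustration of a feature selection algorithm based on information entropy, the sketch below ranks categorical features by information gain and keeps the top k. The text does not say which entropy-based criterion is meant, so information gain is an assumption here, as are the column names and the toy data:

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(values, labels):
    """Reduction in label entropy after splitting on a categorical feature."""
    total = len(labels)
    remainder = 0.0
    for v in set(values):
        subset = [y for x, y in zip(values, labels) if x == v]
        remainder += len(subset) / total * entropy(subset)
    return entropy(labels) - remainder


def select_features(columns, labels, k):
    """Return the names of the k features with the highest information gain."""
    ranked = sorted(columns,
                    key=lambda name: information_gain(columns[name], labels),
                    reverse=True)
    return ranked[:k]


# Hypothetical sample set: two categorical features and a binary label.
labels = [1, 1, 0, 0]
columns = {
    "plan": ["a", "a", "b", "b"],      # perfectly predicts the label
    "region": ["n", "s", "n", "s"],    # carries no information about the label
}
selected = select_features(columns, labels, k=1)
```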
  • a set of data parameters of the data is composed of a set of metadata of the data
  • a set of feature parameters of each feature is composed of a set of metadata of the feature
  • the target algorithm includes at least one of a target feature selection algorithm or a target machine learning algorithm. That is, the determined target algorithm corresponding to the target parameter group may be: the target feature selection algorithm corresponding to the target parameter group; or the target machine learning algorithm corresponding to the target parameter group; or both the target feature selection algorithm and the target machine learning algorithm corresponding to the target parameter group.
  • a data processing apparatus includes a first acquiring module, a first determining module, and a second determining module. The first acquiring module is configured to acquire data to be processed, where a set of data parameters of the data to be processed is a target parameter group; the first determining module may be configured to substitute the target parameter group into a preset algorithm model to determine a target algorithm corresponding to the target parameter group, where the target algorithm is the algorithm that obtains the optimal evaluation value when at least one algorithm corresponding to the target parameter group is evaluated according to a preset evaluation algorithm; and the second determining module may be configured to determine the attributes of the data to be processed according to the target algorithm corresponding to the target parameter group.
  • the target algorithm includes a target feature selection algorithm and a target machine learning algorithm, and the data processing apparatus further includes a second acquiring module, a third determining module, and a fourth determining module. The second acquiring module may be configured to acquire n sample sets, where the n sets of data parameters of the n sample sets include the target parameter group and n is an integer greater than or equal to 1; the third determining module may be configured to determine the target feature selection algorithm and the target machine learning algorithm corresponding to each of the n sets of data parameters; and the fourth determining module may be configured to determine the preset algorithm model according to the target feature selection algorithm and the target machine learning algorithm corresponding to each of the n sets of data parameters.
  • the first sample set is any one of the n sample sets
  • the third determining module is further configured to: substitute the first sample set into at least one feature selection algorithm to determine at least one feature set corresponding to the set of data parameters of the first sample set; substitute each of the at least one feature set corresponding to the set of data parameters of the first sample set into at least one machine learning algorithm to determine at least one processing model corresponding to the set of data parameters of the first sample set; and determine, according to a preset evaluation algorithm, an evaluation value corresponding to each processing model in the at least one processing model, and take the feature selection algorithm and the machine learning algorithm corresponding to the processing model with the optimal evaluation value as the target feature selection algorithm and the target machine learning algorithm corresponding to the set of data parameters of the first sample set.
  • the target algorithm includes a target feature selection algorithm and a target machine learning algorithm, and the second determining module includes a first determining unit, a second determining unit, an updating unit, and a third determining unit. The first determining unit may be configured to substitute the data to be processed into the target feature selection algorithm corresponding to the target parameter group to determine a target feature set, where the target feature set includes p features, each of the p features has a set of feature parameters, p is an integer greater than or equal to 1, and each feature in the feature set has a weight; the second determining unit may be configured to substitute the p sets of feature parameters of the p features into the preset weight change model to determine the weight change value corresponding to each of the p sets of feature parameters; the updating unit may be configured to update the weight of each feature in the target feature set according to the determined weight change values; and the third determining unit may be configured to determine the attributes of the data to be processed according to the updated target feature set and the target machine learning algorithm corresponding to the target parameter group.
  • the data processing apparatus further includes a third acquiring module, a fifth determining module, a sixth determining module, a seventh determining module, an eighth determining module, and a ninth determining module. The third acquiring module is configured to acquire m sample sets, where the m sets of data parameters of the m sample sets include the target parameter group and m is an integer greater than or equal to 1; the fifth determining module may be configured to determine the target feature selection algorithm corresponding to each of the m sets of data parameters; the sixth determining module may be configured to determine an initial feature set, which consists of the features in the feature sets obtained by substituting each of the m sample sets into the target feature selection algorithm corresponding to that sample set's set of data parameters; the seventh determining module may be configured to determine a reference feature set, which consists of the features in the feature sets obtained by substituting each of the m sample sets into a reference feature selection algorithm; the eighth determining module may be configured to determine, according to the reference feature set, the weight change value corresponding to the set of feature parameters of each feature in the initial feature set; and the ninth determining module may be configured to determine the preset weight change model according to the determined weight change values.
  • the eighth determining module is further configured to: substitute the initial feature set into a preset machine learning algorithm to determine a first processing model; substitute the reference feature set into the preset machine learning algorithm to determine a second processing model; evaluate the first processing model according to the preset evaluation algorithm to determine a first evaluation value; evaluate the second processing model according to the preset evaluation algorithm to determine a second evaluation value; determine whether the second evaluation value is greater than the first evaluation value; and, if the second evaluation value is greater than the first evaluation value and the reference feature set includes the first feature in the initial feature set, determine that the difference between the weight of the first feature in the reference feature set and the weight of the first feature in the initial feature set is the weight change value corresponding to the set of feature parameters of the first feature.
  • the target algorithm includes: a target feature selection algorithm or a target machine learning algorithm.
  • a data processing system comprising the data processing apparatus of the second aspect.
  • a data processing apparatus comprising: at least one processor, at least one network interface, a memory, and at least one bus, wherein the memory and the network interface are each connected to the processor through the bus; the processor is configured to execute instructions stored in the memory; and by executing the instructions, the processor implements the data processing method provided by the first aspect or by any possible implementation of the first aspect.
  • a data processing system comprising the data processing apparatus of the fourth aspect.
  • the present application provides a data processing method, apparatus, and system.
  • the target algorithm corresponding to the target parameter group (a set of data parameters of the data to be processed) is determined directly according to the preset algorithm model, and the target algorithm determined in this way is the algorithm that obtains the optimal evaluation value when at least one algorithm corresponding to the target parameter group is evaluated according to a preset evaluation algorithm. The attributes of the data to be processed that are determined with this target algorithm are therefore the most accurate, so the attributes determined according to the target algorithm corresponding to the target parameter group have higher accuracy.
  • FIG. 1 is a schematic diagram of an application scenario of a data processing method according to an embodiment of the present invention.
  • FIG. 2 is a flowchart of a data processing method according to an embodiment of the present invention.
  • FIG. 3-1 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present invention.
  • FIG. 3-2 is a schematic structural diagram of another data processing apparatus according to an embodiment of the present invention.
  • FIG. 3-3 is a schematic structural diagram of a second determining module according to an embodiment of the present invention.
  • FIG. 3-4 is a schematic structural diagram of still another data processing apparatus according to an embodiment of the present invention.
  • FIG. 4 is a schematic structural diagram of still another data processing apparatus according to an embodiment of the present invention.
  • FIG. 1 is a schematic diagram of an application scenario of a data processing method according to an embodiment of the present invention. In this scenario, the terminals used by user A, user B, user C, and user D all access the network, so all four users are network users. User A and user B are users of a first communication carrier (such as China Mobile); that is, both access the network provided by the first communication carrier. The service that user A uses most is the first service provided by the first communication carrier, and the service that user B uses most is the second service provided by the first communication carrier. User C and user D are users of a second communication carrier (such as China Telecom); that is, both access the network provided by the second communication carrier. The service that user C uses most is the third service provided by the second communication carrier, and the service that user D uses most is the fourth service provided by the second communication carrier.
  • when user A communicates using the network provided by the first communication carrier, the network side generates user data 1; when user B communicates using the network provided by the first communication carrier, the network side generates user data 2; when user C communicates using the network provided by the second communication carrier, the network side generates user data 3; and when user D communicates using the network provided by the second communication carrier, the network side generates user data 4.
  • two user data (user data 1 and user data 2) can be acquired and substituted into a feature selection algorithm to determine the feature set corresponding to the two user data. Specifically, sample data may be collected from the two user data and substituted into the feature space algorithm to obtain a feature set (the obtained feature set is usually a subset of the feature set of the sample data, so it may also be called a feature subset). The feature set is then substituted into a machine learning algorithm to obtain a processing model.
  • the sample data can be divided into multiple parts, the attributes of each part of the sample data are determined according to the processing model, and these attributes are substituted into a preset evaluation algorithm (such as an evaluation method based on a multi-fold cross-validation mechanism) to obtain an evaluation value corresponding to the attributes of the multiple parts of sample data, that is, the evaluation value corresponding to the processing model. If the evaluation value is greater than the evaluation threshold, the feature set is kept as the determined feature set; if the evaluation value is less than or equal to the evaluation threshold, the feature space algorithm must be run again to obtain another feature set, until the obtained evaluation value is greater than the evaluation threshold.
  • the determined feature set is substituted into a machine learning algorithm to determine a processing model.
  • According to the processing model, it is determined that user data 1 of the two pieces of user data has a preset feature (that is, user data 1 indicates that the service most frequently used by user A is the first service) and that user data 2 does not have the preset feature (that is, user data 2 indicates that the service most frequently used by user B is not the first service). The first communication carrier can then send preferential information related to the first service to the terminal used by user A.
  • The user data generated using the network provided by the first communication carrier (user data 1 and user data 2) and the user data generated using the network provided by the second communication carrier (user data 3 and user data 4) are data generated in different scenarios, and the same machine learning algorithm cannot be applied to user data generated in different scenarios. If the operator of the second communication carrier, when processing the user data generated on the network side (user data 3 and user data 4), still uses the same feature selection algorithm and machine learning algorithm as the first communication carrier, the attributes of user data 3 and user data 4 determined by the second communication carrier may be biased, and the attributes of the processed user data are less accurate.
  • an embodiment of the present invention provides another data processing method, where the data processing method may include:
  • Step 201 Acquire multiple sample sets.
  • each sample set in the plurality of sample sets may be data generated in a scenario, and the plurality of sample sets may include a target sample set, and a set of data parameters of the target sample set may be a target parameter set.
  • the data parameter of the data is used to reflect the characteristics of the data, and each of the data parameters of a sample set can reflect a feature of the sample set, and a set of data parameters of a sample set can reflect the sample set.
  • a set of data parameters of a sample set may be composed of a set of metadata (including at least one metadata) of the sample set. If the two sample sets are different, the two sets of metadata of the two sample sets are different.
  • a set of data parameters of a sample set may include a mean of the sample set, a variance of the sample set, a maximum value of the sample set, a minimum value of the sample set, and the like, which are not limited by the embodiment of the present invention.
  • Sample set | Metadata
    1 | 1st metadata, 2nd metadata, ..., Xth metadata
    2 | (X+1)th metadata, (X+2)th metadata, ..., Yth metadata
    3 | (Y+1)th metadata, (Y+2)th metadata, ..., Zth metadata
    4 |
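  • As a minimal sketch of how one set of data parameters could be computed for a numeric sample set, the following uses the mean, variance, maximum, and minimum named above. The function name `data_parameters`, the dictionary layout, and the choice of population variance are assumptions made for illustration; the embodiment does not limit the parameters to these.

```python
from statistics import mean, pvariance


def data_parameters(sample_set):
    """Return one set of data parameters (metadata) describing a sample set."""
    return {
        "mean": mean(sample_set),
        "variance": pvariance(sample_set),  # population variance (an assumption)
        "max": max(sample_set),
        "min": min(sample_set),
    }


# Hypothetical numeric sample set used only to exercise the function.
params = data_parameters([1, 2, 3, 4])
```

  • Two different sample sets would normally yield different dictionaries, which is what allows a set of data parameters to stand in for the scenario that produced the data.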
  • Step 202 Determine a target feature selection algorithm and a target machine learning algorithm corresponding to a set of data parameters of each sample set in the plurality of sample sets.
  • A set of data parameters may correspond to multiple feature selection algorithms and multiple machine learning algorithms (that is, when processing data having that set of data parameters, any one of the multiple feature selection algorithms and any one of the multiple machine learning algorithms may be used). Selecting one feature selection algorithm from the multiple feature selection algorithms corresponding to a set of data parameters, and one machine learning algorithm from the multiple machine learning algorithms corresponding to that set, forms one algorithm corresponding to the set of data parameters. Therefore, a set of data parameters can correspond to multiple algorithms.
  • When the multiple algorithms corresponding to a set of data parameters are evaluated according to a preset evaluation algorithm, multiple evaluation values are obtained. The algorithm corresponding to the optimal evaluation value among them is the target algorithm corresponding to the set of data parameters, and the feature selection algorithm and the machine learning algorithm that make up the target algorithm are the target feature selection algorithm and the target machine learning algorithm corresponding to the set of data parameters.
  • When a feature selection algorithm and a machine learning algorithm are used to process a sample set, it can be determined whether the sample set has a preset feature, which determines the attribute of the sample set; that is, the attribute of the sample set is either "has the preset feature" or "does not have the preset feature". For example, it may be determined that the user corresponding to the sample set is female, or is not female.
  • The preset evaluation algorithm can evaluate parameters such as the accuracy or the error rate of the process of "determining the attributes of the sample set by using a certain feature selection algorithm and a certain machine learning algorithm" and express the result as a numerical value, which can be called the evaluation value of the preset evaluation algorithm.
  • the preset evaluation algorithm may be an evaluation method based on the multiple cross-validation mechanism, and the preset evaluation algorithm may also be other evaluation algorithms, which is not limited by the embodiment of the present invention.
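  • A cross-validation-based evaluation of this kind could look like the following sketch, which returns the fraction of held-out samples whose attribute is determined correctly. The names `k_fold_accuracy` and `train_threshold`, the interleaved fold assignment, and the toy one-dimensional data are all assumptions for illustration, not the evaluation algorithm of the embodiment.

```python
def k_fold_accuracy(samples, labels, train, k=5):
    """Fraction of held-out samples classified correctly across k folds."""
    n = len(samples)
    correct = 0
    for fold in range(k):
        held_out = set(range(fold, n, k))  # fold i holds out every k-th sample
        train_x = [samples[i] for i in range(n) if i not in held_out]
        train_y = [labels[i] for i in range(n) if i not in held_out]
        predict = train(train_x, train_y)
        correct += sum(predict(samples[i]) == labels[i] for i in held_out)
    return correct / n


def train_threshold(xs, ys):
    """Toy learner (hypothetical): cut halfway between the two classes."""
    lo = max(x for x, y in zip(xs, ys) if not y)
    hi = min(x for x, y in zip(xs, ys) if y)
    cut = (lo + hi) / 2
    return lambda x: x > cut


# Hypothetical data: the "preset feature" is present when the value is >= 10.
samples = [0, 1, 2, 3, 4, 10, 11, 12, 13, 14]
labels = [x >= 10 for x in samples]
evaluation_value = k_fold_accuracy(samples, labels, train_threshold)
```

  • Here a higher evaluation value is better because it measures accuracy; if an error rate were used instead, the optimal evaluation value would be the lowest one.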
  • The embodiment of the present invention describes only the determination of the target feature selection algorithm and the target machine learning algorithm corresponding to the target parameter group as an example.
  • For the specific steps of determining the target feature selection algorithm and the target machine learning algorithm corresponding to the other sets of data parameters, refer to the specific steps of determining the target feature selection algorithm and the target machine learning algorithm corresponding to the target parameter group; details are not repeated in the embodiment of the present invention.
  • determining a target feature selection algorithm and a target machine learning algorithm corresponding to the target parameter group may include:
  • the target sample set is substituted into at least one feature selection algorithm to determine at least one feature set corresponding to the target parameter set.
  • The at least one feature selection algorithm may include a feature selection algorithm based on information entropy or a feature selection algorithm based on inter-feature correlation. It should be noted that the at least one feature selection algorithm may further include other feature selection algorithms, which are not listed one by one in the embodiment of the present invention.
  • at least one feature set corresponding to the target parameter group may be substituted into at least one machine learning algorithm to determine at least one processing model corresponding to the target parameter group. For example, if the target parameter group corresponds to A feature sets, the A feature sets are respectively substituted into B machine learning algorithms, and A × B processing models are determined.
  • The evaluation value corresponding to each of the at least one processing model may be determined according to a preset evaluation algorithm, and the feature selection algorithm and the machine learning algorithm corresponding to the processing model with the optimal evaluation value are used as the target feature selection algorithm and the target machine learning algorithm corresponding to the target parameter group. For example, if A × B equals 6 and the evaluation values corresponding to the six processing models are 10, 20, 30, 40, 50, and 60, the feature selection algorithm and the machine learning algorithm corresponding to the processing model with the evaluation value of 60 may be selected as the target feature selection algorithm and the target machine learning algorithm corresponding to the target parameter group.
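  • The selection described above can be sketched as a search over every pairing of a feature selection algorithm with a machine learning algorithm. The evaluation values 10 to 60 come from the example above; which pairing receives which value, the algorithm labels, and the function name `pick_target_algorithm` are assumptions for illustration, and the sketch treats the largest evaluation value as optimal.

```python
from itertools import product


def pick_target_algorithm(feature_selectors, learners, evaluate):
    """Return the (feature selector, learner) pair with the best evaluation value."""
    best_pair, best_value = None, float("-inf")
    for pair in product(feature_selectors, learners):
        value = evaluate(*pair)
        if value > best_value:
            best_pair, best_value = pair, value
    return best_pair, best_value


# Hypothetical assignment of the six example evaluation values (A = 2, B = 3).
EVALUATION_VALUES = {
    ("entropy", "RF"): 10, ("entropy", "LR"): 20, ("entropy", "SVM"): 30,
    ("correlation", "RF"): 40, ("correlation", "LR"): 50, ("correlation", "SVM"): 60,
}

best_pair, best_value = pick_target_algorithm(
    ["entropy", "correlation"],
    ["RF", "LR", "SVM"],
    lambda fs, ml: EVALUATION_VALUES[(fs, ml)],
)
```

  • In practice `evaluate` would build the processing model for the pair and score it with the preset evaluation algorithm rather than read a stored value.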
  • the target feature selection algorithm corresponding to the target parameter group may include: a feature selection algorithm based on information entropy, or a feature selection algorithm based on correlation between features;
  • the target machine learning algorithm corresponding to the target parameter group may include: a Random Forest (RF) machine learning algorithm, a Logistic Regression (LR) machine learning algorithm, or a Support Vector Machine (SVM) machine learning algorithm.
  • a list of target feature selection algorithms and target machine learning algorithms corresponding to each set of data parameters may be created.
  • The list may be as shown in Table 2. The data parameters 1st metadata, 2nd metadata, ..., Xth metadata (the set of data parameters of sample set 1) correspond to target feature selection algorithm 2 and target machine learning algorithm 3; the data parameters (X+1)th metadata, (X+2)th metadata, ..., Yth metadata (the set of data parameters of sample set 2) correspond to target feature selection algorithm 2 and target machine learning algorithm 2; the data parameters (Y+1)th metadata, (Y+2)th metadata, ..., Zth metadata (the set of data parameters of sample set 3) correspond to target feature selection algorithm 1 and target machine learning algorithm 2; and the data parameters (Z+1)th metadata, (Z+2)th metadata, ..., Wth metadata (the set of data parameters of sample set 4) correspond to target feature selection algorithm 1 and target machine learning algorithm 3. It should be noted that only the identifier of the target feature selection algorithm and the identifier of the target machine learning algorithm may be recorded in the list.
  • Data parameters | Target feature selection algorithm | Target machine learning algorithm
    1st metadata, 2nd metadata, ..., Xth metadata | 2 | 3
    (X+1)th metadata, (X+2)th metadata, ..., Yth metadata | 2 | 2
    (Y+1)th metadata, (Y+2)th metadata, ..., Zth metadata | 1 | 2
  • Step 203 Determine a preset algorithm model according to a target feature selection algorithm and a target machine learning algorithm corresponding to each set of data parameters.
  • Sample sets can be acquired continuously. Each time a sample set is acquired in step 201, the target feature selection algorithm and the target machine learning algorithm corresponding to the set of data parameters of that sample set are determined in step 202. When the number of sample sets acquired in step 201 reaches n, step 203 can be performed, where n may be an integer greater than or equal to 1, and the n sample sets have n sets of data parameters.
  • the preset algorithm model may be determined according to the target feature selection algorithm and the target machine learning algorithm corresponding to each set of data parameters.
  • a preset algorithm model capable of determining a target feature selection algorithm and a target machine learning algorithm corresponding to each set of data parameters of at least one set of data parameters may be derived.
  • the preset algorithm model may be a correspondence relationship record table, wherein the correspondence relationship record table records at least one set of data parameters, and a target feature selection algorithm and a target machine learning algorithm corresponding to each set of data parameters in the at least one set of data parameters, That is, according to the correspondence relationship record table (preset algorithm model), the target feature selection algorithm and the target machine learning algorithm corresponding to each set of data parameters can be determined.
  • The preset algorithm model need not be a correspondence record table. For example, the preset algorithm model may also be a curve in three-dimensional coordinates, in which the x variable is the data parameter group, the y variable is the target feature selection algorithm, and the z variable is the target machine learning algorithm; the three-dimensional curve can correspond to at least one set of data parameters. It should be noted that the preset algorithm model may also be expressed in other forms, which is not limited by the embodiment of the present invention.
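  • When the preset algorithm model takes the form of a correspondence record table, it can be sketched as a plain mapping from a set of data parameters to a pair of algorithm identifiers. The identifiers below follow Table 2; the string keys standing in for each set of data parameters and the function name `lookup_target_algorithm` are assumptions for illustration.

```python
# Each key stands in for one set of data parameters (hypothetical labels);
# each value is (target feature selection algorithm id, target ML algorithm id).
PRESET_ALGORITHM_MODEL = {
    "data parameters of sample set 1": (2, 3),
    "data parameters of sample set 2": (2, 2),
    "data parameters of sample set 3": (1, 2),
    "data parameters of sample set 4": (1, 3),
}


def lookup_target_algorithm(parameter_group):
    """Return the recorded algorithm pair, or None if the group is unknown."""
    return PRESET_ALGORITHM_MODEL.get(parameter_group)


pair = lookup_target_algorithm("data parameters of sample set 2")
```

  • Because the table stores only identifiers, the lookup is a constant-time operation and no feature selection or training has to be repeated to find the target algorithm.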
  • On the one hand, if the n sets of data parameters are all different from one another, the target algorithm corresponding to each of the n sets of data parameters can be determined according to the preset algorithm model determined in step 203; on the other hand, if at least two of the n sets of data parameters are identical, the target algorithm corresponding to each of L sets of data parameters can be determined according to the preset algorithm model determined in step 203, where L is an integer less than n.
  • If it is specified that a first machine learning algorithm must be used when processing data, then after the target feature selection algorithm and the target machine learning algorithm corresponding to each of the n sets of data parameters are determined, the preset algorithm model may be determined according to the target feature selection algorithm and the target machine learning algorithm corresponding to each set of data parameters together with the first machine learning algorithm, so that the target feature selection algorithm corresponding to the first machine learning algorithm and to each set of data parameters in the at least one set of data parameters can be determined according to the preset algorithm model.
  • Step 204 Determine a preset weight change model according to a target feature selection algorithm corresponding to each set of data parameters.
  • Sample sets may be acquired continuously. Each time a sample set is acquired in step 201, the target feature selection algorithm and the target machine learning algorithm corresponding to the set of data parameters of that sample set are determined in step 202. When the number of sample sets acquired in step 201 reaches m, step 204 may be performed, where m may be an integer greater than or equal to 1, and the m sample sets have m sets of data parameters. The m in step 204 may be the same as or different from the n in step 203, which is not limited by the embodiment of the present invention.
  • the preset weight change model may be determined according to the target feature selection algorithm corresponding to each set of data parameters.
  • Each of the m sample sets may be substituted into the target feature selection algorithm corresponding to the set of data parameters of that sample set to obtain m feature sets, and the initial feature set is determined according to the m feature sets obtained. The initial feature set may include all the features (q features) in the m feature sets. For example, if the m feature sets are (feature 1, feature 2, feature 3), (feature 1, feature 3, feature 4), and (feature 1, feature 2, feature 5), the initial feature set may be determined to be (feature 1, feature 2, feature 3, feature 4, feature 5).
  • The features in the initial feature set may be sorted according to a preset sorting algorithm, and each feature in the initial feature set is given a weight. For example, the weight of feature 1 may be 5, the weight of feature 2 may be 3, the weight of feature 3 may be 2.5, the weight of feature 4 may be 1, and the weight of feature 5 may be 0.5.
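  • Building the initial feature set from the m feature sets amounts to taking their union, as in the following sketch. Features are represented by their numbers, and the function name `merge_feature_sets` is an assumption for illustration; the weights are taken directly from the example above rather than computed, since the preset sorting algorithm is not specified.

```python
def merge_feature_sets(feature_sets):
    """Union of all features appearing in any of the given feature sets."""
    return sorted(set().union(*feature_sets))


# The three example feature sets, with features written as their numbers.
initial_feature_set = merge_feature_sets([(1, 2, 3), (1, 3, 4), (1, 2, 5)])

# Example weights from the text, keyed by feature number.
initial_weights = {1: 5, 2: 3, 3: 2.5, 4: 1, 5: 0.5}
```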
  • The m sample sets may be substituted into the reference feature selection algorithm to obtain m feature sets, and the reference feature set may be determined according to the m feature sets obtained. The reference feature set may include all the features in the m feature sets. For example, if the m feature sets are (feature 1, feature 2, feature 3), (feature 1, feature 3, feature 6), and (feature 1, feature 2, feature 5), the reference feature set may be determined to be (feature 1, feature 2, feature 3, feature 5, feature 6). It should be noted that, after the reference feature set is determined, the features in the reference feature set may be sorted according to a preset sorting algorithm, and each feature in the reference feature set is given a weight.
  • For example, the weight of feature 1 may be 5, the weight of feature 2 may be 2.5, the weight of feature 3 may be 0.9, the weight of feature 5 may be 1, and the weight of feature 6 may be 0.6.
  • The reference feature selection algorithm may be a manual feature selection algorithm; that is, staff analyze and judge each sample on the basis of their experience and then determine the reference feature set. The staff may further sort all the features in the reference feature set on the basis of their experience, so that each feature in the reference feature set is given a weight.
  • the weight change value corresponding to each feature in the initial feature set may be determined, and the weight change value of each feature is determined as the weight change value corresponding to the set of feature parameters of the feature.
  • the initial feature set may be substituted into a preset machine learning algorithm, the first processing model is determined, and the reference feature set is substituted into a preset machine learning algorithm to determine the second processing model. And evaluating the first processing model according to the preset evaluation algorithm, determining the first evaluation value, and evaluating the second processing model according to the preset evaluation algorithm to determine the second evaluation value.
  • It may then be determined whether the second evaluation value is greater than the first evaluation value, that is, whether processing the target sample set with the reference feature selection algorithm gives a better result than processing it with the target feature selection algorithm corresponding to the target parameter group. If the second evaluation value is greater than the first evaluation value and the reference feature set includes a first feature of the initial feature set, the difference between the weight of the first feature in the reference feature set and the weight of the first feature in the initial feature set is used as the weight change value corresponding to the set of feature parameters of the first feature.
  • If the second evaluation value is greater than the first evaluation value and the reference feature set does not include the first feature, a preset weight change value is used as the weight change value corresponding to the set of feature parameters of the first feature; if the second evaluation value is not greater than the first evaluation value, the weight change value corresponding to the set of feature parameters of the first feature is determined to be zero.
  • For example, if the second evaluation value is less than or equal to the first evaluation value, it may be determined that the weight change values corresponding to features 1, 2, 3, 4, and 5 are all 0. If the second evaluation value is greater than the first evaluation value, then for feature 1 in the initial feature set, the reference feature set includes feature 1, so the difference 0 between the weight 5 of feature 1 in the reference feature set and the weight 5 of feature 1 in the initial feature set can be used as the weight change value corresponding to the set of feature parameters of feature 1 (1st metadata, 2nd metadata, ..., Cth metadata). For feature 2 in the initial feature set, the reference feature set includes feature 2, so the difference between the weight 2.5 of feature 2 in the reference feature set and the weight 3 of feature 2 in the initial feature set can be used as the weight change value corresponding to the set of feature parameters of feature 2.
  • For feature 3 in the initial feature set, the reference feature set includes feature 3, so the difference between the weight 0.9 of feature 3 in the reference feature set and the weight 2.5 of feature 3 in the initial feature set can be used as the weight change value corresponding to the set of feature parameters of feature 3 ((D+1)th metadata, (D+2)th metadata, ..., Eth metadata).
  • For feature 4 in the initial feature set, the reference feature set does not include feature 4, so the preset weight change value (such as -0.2) can be used as the weight change value corresponding to the set of feature parameters of feature 4 ((E+1)th metadata, (E+2)th metadata, ..., Fth metadata).
  • For feature 5 in the initial feature set, the reference feature set includes feature 5, so the difference between the weight 1 of feature 5 in the reference feature set and the weight 0.5 of feature 5 in the initial feature set can be used as the weight change value corresponding to the set of feature parameters of feature 5 ((F+1)th metadata, (F+2)th metadata, ..., Gth metadata).
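  • The rule for deriving weight change values can be sketched as follows, using the weights from the worked example for features 1 to 5. The function name `weight_change_values`, the two evaluation values passed in, and the rounding to six decimal places (used only to suppress floating-point noise) are assumptions for illustration.

```python
def weight_change_values(initial, reference, first_eval, second_eval,
                         preset_change=-0.2):
    """Weight change value per feature of the initial feature set."""
    if second_eval <= first_eval:
        # The reference selection did not do better: no weight is changed.
        return {feature: 0.0 for feature in initial}
    changes = {}
    for feature, weight in initial.items():
        if feature in reference:
            changes[feature] = round(reference[feature] - weight, 6)
        else:
            changes[feature] = preset_change  # feature missing from reference set
    return changes


initial = {1: 5, 2: 3, 3: 2.5, 4: 1, 5: 0.5}      # initial feature set weights
reference = {1: 5, 2: 2.5, 3: 0.9, 5: 1, 6: 0.6}  # reference feature set weights

# Hypothetical evaluation values in which the reference selection scores higher.
changes = weight_change_values(initial, reference, first_eval=0.7, second_eval=0.8)
```

  • Feature 6 appears only in the reference feature set, so it receives no weight change value; the rule is defined per feature of the initial feature set.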
  • A simple descent algorithm may be used to distribute a total weight of "1" among the features, that is, to assign a weight change value to each of the multiple features such that the sum of the weight change values of the multiple features is 1.
  • a list may be used to record the weight change values corresponding to a set of feature parameters of each feature in the initial feature set.
  • Table 3 records the weight change values corresponding to the set of feature parameters of each feature in the initial feature set. It should be noted that the embodiment of the present invention takes the case where the initial feature set contains 5 features only as an example. In practical applications, the number of features in the initial feature set may not be 5.
  • The preset weight change model may be determined according to the weight change value corresponding to each set of feature parameters; that is, the preset weight change model may be derived from Table 3.
  • Step 205 Acquire data to be processed, and a set of data parameters of the data to be processed is a target parameter group.
  • Data whose set of data parameters is any set of data parameters that can be determined according to the preset algorithm model may be processed.
  • For example, the data to be processed acquired in step 205 may include: user data 1 generated by the network side while user A shown in the figure communicates using the network provided by the first communication carrier, and user data 2 generated by the network side while user B communicates using the network provided by the first communication carrier. Alternatively, the data to be processed acquired in step 205 may include: user data 3 generated by the network side while user C communicates using the network provided by the second communication carrier, and user data 4 generated by the network side while user D communicates using the network provided by the second communication carrier.
  • a set of data parameters of the data to be processed may be a target parameter group.
  • The processing of data to be processed whose data parameters are the target parameter group is taken as an example for detailed explanation.
  • For the processing of data to be processed whose data parameters are another set of data parameters that can be determined according to the preset algorithm model, refer to the processing of data to be processed whose data parameters are the target parameter group; details are not described again here.
  • Step 206 Substituting the target parameter group into a preset algorithm model, and determining a target algorithm corresponding to the target parameter group.
  • the target algorithm determined in step 206 may include: at least one of a target feature selection algorithm and a target machine learning algorithm, that is, the target algorithm corresponding to the determined target parameter group may be: a target parameter group corresponding to a target feature selection algorithm; or a target machine learning algorithm corresponding to the target parameter group; or a target feature selection algorithm and a target machine learning algorithm corresponding to the target parameter group.
  • The embodiment of the present invention is described using the example in which the target algorithm includes both a target feature selection algorithm and a target machine learning algorithm.
  • When step 206 is performed, if it is specified that the first machine learning algorithm must be used in processing the data to be processed, the first machine learning algorithm and the target parameter group may be substituted into the preset algorithm model to obtain the target feature selection algorithm corresponding to the first machine learning algorithm and the target parameter group, and the obtained target feature selection algorithm and the first machine learning algorithm are used as the target feature selection algorithm and the target machine learning algorithm corresponding to the target parameter group.
  • When step 206 is performed, if it is not explicitly specified that a certain machine learning algorithm must be used in processing the data to be processed, the target parameter group can be directly substituted into the preset algorithm model to obtain the target feature selection algorithm and the target machine learning algorithm corresponding to the target parameter group.
  • If only the target feature selection algorithm corresponding to the target parameter group is determined in step 206, a machine learning algorithm may be determined according to the related art as the target machine learning algorithm corresponding to the target parameter group. If only the target machine learning algorithm corresponding to the target parameter group is determined in step 206, a feature selection algorithm may be determined according to the related art as the target feature selection algorithm corresponding to the target parameter group.
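  • The two cases of step 206, with and without a mandated first machine learning algorithm, can be sketched with a model that keeps an evaluation value for every pairing recorded for a parameter group. That storage layout, the function name `resolve_target_algorithm`, and the evaluation values shown are assumptions for illustration; the embodiment does not state how the preset algorithm model encodes the constraint.

```python
def resolve_target_algorithm(model, parameter_group, required_ml=None):
    """Best (feature selector, learner) pair, optionally with a fixed learner."""
    values = model[parameter_group]  # {(feature selector, learner): evaluation value}
    candidates = {
        pair: value
        for pair, value in values.items()
        if required_ml is None or pair[1] == required_ml
    }
    return max(candidates, key=candidates.get)


# Hypothetical evaluation values recorded for the target parameter group.
MODEL = {
    "target parameter group": {
        ("entropy", "RF"): 0.7,
        ("entropy", "LR"): 0.9,
        ("correlation", "RF"): 0.8,
        ("correlation", "LR"): 0.6,
    }
}

unconstrained = resolve_target_algorithm(MODEL, "target parameter group")
constrained = resolve_target_algorithm(MODEL, "target parameter group", required_ml="RF")
```

  • Fixing the learner can change the selected feature selection algorithm, which is why the constrained case is resolved through the model rather than by reusing the unconstrained result.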
  • Step 207 Determine an attribute of the data to be processed according to the target algorithm corresponding to the target parameter group and the preset weight change model.
  • the data to be processed may be substituted into a target feature selection algorithm corresponding to the target parameter group to determine a target feature set.
  • the initial feature set in step 204 may include a target feature set, that is, each feature in the target feature set belongs to the initial feature set.
  • each feature in the target feature set may also be sorted by using a preset sorting algorithm to determine the weight of each feature in the target feature set.
  • For example, the features in the target feature set are feature 1, feature 2, feature 3, feature 4, and feature 5; the weight of feature 1 may be 5, the weight of feature 2 may be 3, the weight of feature 3 may be 2.5, the weight of feature 4 may be 1, and the weight of feature 5 may be 0.5. Sorted by weight, the features in the target feature set are: feature 1, feature 2, feature 3, feature 4, and feature 5.
  • the weight change value corresponding to a set of feature parameters of each feature in the target feature set may be determined according to the preset weight change model determined in step 204.
  • For example, the five sets of feature parameters of feature 1, feature 2, feature 3, feature 4, and feature 5 may be substituted into the preset weight change model to determine the weight change value corresponding to each set of feature parameters.
  • The weight corresponding to each feature in the target feature set may be updated according to the weight change value corresponding to each set of feature parameters. Specifically, the sum of the weight of each feature and the weight change value corresponding to the set of feature parameters of that feature may be used as the updated weight of the feature.
  • For example, if the weight of feature 1 in the target feature set is 5 and the weight change value corresponding to the set of feature parameters of feature 1 is 0, the updated weight of feature 1 is 5; if the weight of feature 2 in the target feature set is 3 and the weight change value corresponding to the set of feature parameters of feature 2 is -0.5, the updated weight of feature 2 is 2.5; if the weight of feature 3 in the target feature set is 2.5 and the weight change value corresponding to the set of feature parameters of feature 3 is -1.6, the updated weight of feature 3 is 0.9; if the weight of feature 4 in the target feature set is 1 and the weight change value corresponding to the set of feature parameters of feature 4 is -0.2, the updated weight of feature 4 is 0.8; and if the weight of feature 5 in the target feature set is 0.5 and the weight change value corresponding to the set of feature parameters of feature 5 is 0.5, the updated weight of feature 5 is 1. Therefore, sorted by weight, the features of the updated target feature set are: feature 1, feature 2, feature 5, feature 3, and feature 4.
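  • The update and re-sorting in this example can be reproduced with a few lines. The function name `update_weights` and the rounding to six decimal places (only to suppress floating-point noise) are assumptions for illustration; the weights and weight change values are those of the example above.

```python
def update_weights(weights, changes):
    """Add each feature's weight change value and rank features by new weight."""
    updated = {
        feature: round(weight + changes.get(feature, 0.0), 6)
        for feature, weight in weights.items()
    }
    ranking = sorted(updated, key=updated.get, reverse=True)
    return updated, ranking


weights = {1: 5, 2: 3, 3: 2.5, 4: 1, 5: 0.5}       # target feature set weights
changes = {1: 0, 2: -0.5, 3: -1.6, 4: -0.2, 5: 0.5}  # from the weight change model

updated, ranking = update_weights(weights, changes)
```

  • Feature 5 overtakes features 3 and 4 after the update, which is the effect the weight change model is meant to have on the automatically selected features.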
  • The attribute of the data to be processed may be determined according to the updated target feature set and the target machine learning algorithm corresponding to the target parameter group. Specifically, the updated target feature set may be substituted into the target machine learning algorithm corresponding to the target parameter group to obtain a processing model, and the data to be processed is substituted into the processing model to determine the attribute of the data to be processed.
  • The two pieces of user data may be substituted into a feature selection algorithm to obtain an initial feature set, and feature selection weak classifiers are then iterated on the basis of the initial feature set and the Boosting algorithm (an algorithm for improving the accuracy of weak classification algorithms). If the attributes of the two pieces of user data obtained with the current feature selection weak classifier are not accurate, the weak classifier is replaced with another feature selection weak classifier, and the parameters of that classifier are adjusted. If the attributes of the two pieces of user data obtained with the current feature selection weak classifier are accurate, the current feature selection weak classifier is used as the feature selection strong classifier, and the attributes of the two pieces of user data are determined by the feature selection strong classifier and a machine learning algorithm. However, repeatedly iterating over multiple feature selection weak classifiers on the basis of the Boosting algorithm takes a long time, so the data processing speed is slow and the data processing efficiency is low.
  • the target feature selection algorithm and the target machine learning algorithm corresponding to the data to be processed may be determined directly according to the preset algorithm model, and the whole process takes less time, which increases the speed and efficiency of data processing.
  • the data to be processed may be substituted into an automatic feature selection algorithm (such as an information gain based or correlation-based feature selection algorithm) to determine a target feature set.
  • The automatic feature selection algorithm is essentially an algorithm based on mathematical statistics theory; that is, according to the values in the data to be processed, it can determine which features of the data to be processed best discriminate a certain label. In a practical sense, however, such a feature is not necessarily the best distinguishing feature, for example an identity (ID) class feature. In this case, the processing model obtained by substituting the selected feature set into a machine learning algorithm may process the data to be processed poorly.
  • ID International Mobile Identification
  • The features selected by staff on the basis of their experience from the feature values of the data to be processed may differ from the features determined by the automatic feature selection algorithm, but the processing model obtained by substituting the staff-selected features into a machine learning algorithm processes the data to be processed better.
  • A preset weight change model is established in advance, so that after the feature set is obtained with the automatic feature selection algorithm, the weights of the feature set can be updated with reference to the experience of the staff, and the processing model obtained by substituting the updated feature set into the machine learning algorithm processes the data to be processed better.
  • the target algorithm corresponding to the target parameter group (a set of data parameters of the data to be processed) can be determined directly according to the preset algorithm model. That target algorithm is the algorithm corresponding to the optimal evaluation value determined by evaluating, according to the preset evaluation algorithm, at least one algorithm corresponding to the target parameter group. The attribute of the data to be processed determined according to this target algorithm is therefore the most accurate, so the determined attribute has higher accuracy.
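
The lookup described above can be pictured as a table from parameter groups to algorithm pairs. The sketch below is only an illustration: the contents of a parameter group, the algorithm names, and the dictionary representation of the preset algorithm model are all assumptions, since the text does not fix a concrete data structure.

```python
from typing import Dict, Tuple

# Hypothetical parameter group: coarse descriptors of a data set.
ParameterGroup = Tuple[str, str, str]
# (feature selection algorithm name, machine learning algorithm name)
TargetAlgorithm = Tuple[str, str]

# Hypothetical preset algorithm model, assumed to have been built offline by
# evaluating candidate algorithms for each parameter group.
PRESET_ALGORITHM_MODEL: Dict[ParameterGroup, TargetAlgorithm] = {
    ("small", "few_features", "binary"): ("information_gain", "logistic_regression"),
    ("large", "many_features", "binary"): ("correlation_based", "random_forest"),
}


def determine_target_algorithm(parameter_group: ParameterGroup) -> TargetAlgorithm:
    """Return the algorithm pair stored for the given parameter group."""
    try:
        return PRESET_ALGORITHM_MODEL[parameter_group]
    except KeyError:
        raise LookupError(f"no target algorithm stored for {parameter_group!r}")
```

Because the choice is a single lookup rather than an iterative search, the time cost at processing time does not depend on how many candidate algorithms were evaluated when the model was built.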
  • the embodiment of the present invention provides a data processing device 30, which may include:
  • a first acquiring module 301 configured to acquire data to be processed, where a set of data parameters of the data to be processed is a target parameter group;
  • the first determining module 302 is configured to substitute the target parameter group into the preset algorithm model and determine a target algorithm corresponding to the target parameter group, where the target algorithm is the algorithm corresponding to the optimal evaluation value determined by evaluating, according to the preset evaluation algorithm, at least one algorithm corresponding to the target parameter group;
  • the second determining module 303 is configured to determine an attribute of the data to be processed according to the target algorithm corresponding to the target parameter group.
  • the first determining module can directly determine, according to the preset algorithm model, the target algorithm corresponding to the target parameter group (a set of data parameters of the data to be processed). That target algorithm is the algorithm corresponding to the optimal evaluation value determined by evaluating, according to the preset evaluation algorithm, at least one algorithm corresponding to the target parameter group. The attribute that the second determining module determines for the data to be processed according to this target algorithm is therefore the most accurate, so the determined attribute has higher accuracy.
  • the target algorithm includes: a target feature selection algorithm and a target machine learning algorithm.
  • the embodiment of the present invention provides another data processing device 30. On the basis of FIG. 3-1, the data processing device 30 further includes:
  • a second obtaining module 304 configured to acquire n sample sets, where the n sets of data parameters of the n sample sets include the target parameter group, and n is an integer greater than or equal to 1;
  • a third determining module 305 configured to determine a target feature selection algorithm and a target machine learning algorithm corresponding to each set of data parameters of the n sets of data parameters;
  • the fourth determining module 306 is configured to determine a preset algorithm model according to a target feature selection algorithm and a target machine learning algorithm corresponding to each set of data parameters of the n sets of data parameters;
  • the first sample set is any sample set in n sample sets, and the third determining module 305 can also be used to:
  • the feature selection algorithm and the machine learning algorithm corresponding to the optimal processing model are used as the target feature selection algorithm and the target machine learning algorithm corresponding to a set of data parameters of the first sample set.
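
One way to picture how the preset algorithm model could be built is an exhaustive search over candidate pairs for each sample set. The sketch below is a simplified illustration under several assumptions: the two toy feature selectors, the two toy learners, the use of training accuracy as the preset evaluation algorithm, and all names are hypothetical and are not taken from the application.

```python
from itertools import product


def keep_all(samples):
    """Toy feature selector: keep every feature."""
    return sorted(samples[0][0])


def drop_id_like(samples):
    """Toy feature selector: drop features whose value is unique per sample."""
    names = sorted(samples[0][0])
    return [n for n in names if len({x[n] for x, _ in samples}) < len(samples)]


def majority_class(samples, features):
    """Toy learner: always predict the most frequent label."""
    labels = [y for _, y in samples]
    majority = max(sorted(set(labels)), key=labels.count)
    return lambda x: majority


def threshold_on_mean(samples, features):
    """Toy learner: predict 1 when the mean selected value exceeds the training mean."""
    if not features:
        return majority_class(samples, features)

    def score(x):
        return sum(x[f] for f in features) / len(features)

    cut = sum(score(x) for x, _ in samples) / len(samples)
    return lambda x: int(score(x) > cut)


def accuracy(model, samples):
    """Stand-in for the preset evaluation algorithm (assumption)."""
    return sum(model(x) == y for x, y in samples) / len(samples)


SELECTORS = {"keep_all": keep_all, "drop_id_like": drop_id_like}
LEARNERS = {"majority_class": majority_class, "threshold_on_mean": threshold_on_mean}


def best_combination(samples):
    """Evaluate every selector/learner pair and return the best pair of names."""
    best_value, best_pair = -1.0, None
    for (s_name, select), (l_name, learn) in product(SELECTORS.items(), LEARNERS.items()):
        model = learn(samples, select(samples))
        value = accuracy(model, samples)
        if value > best_value:
            best_value, best_pair = value, (s_name, l_name)
    return best_pair


def build_preset_algorithm_model(sample_sets):
    """Map each parameter group to the best pair found for its sample set."""
    return {group: best_combination(samples) for group, samples in sample_sets.items()}
```

A real implementation would evaluate on held-out data rather than on the training samples; the sketch scores on the training samples only to stay short.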
  • the target algorithm includes: a target feature selection algorithm and a target machine learning algorithm.
  • the second determining module 303 may include:
  • the first determining unit 3031 is configured to substitute the data to be processed into the target feature selection algorithm corresponding to the target parameter group and determine a target feature set, where the target feature set includes p features, each of the p features has a set of feature parameters, p is an integer greater than or equal to 1, and each feature in the feature set has a weight;
  • the second determining unit 3032 is configured to substitute the p group feature parameters of the p features into the preset weight change model, and determine a weight change value corresponding to each set of the feature parameters of the p group feature parameters;
  • the updating unit 3033 is configured to update, according to the determined weight change value, a weight corresponding to each feature in the target feature set;
  • the third determining unit 3034 is configured to determine an attribute of the data to be processed according to the updated target feature set and the target machine learning algorithm corresponding to the target parameter group.
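
The weight update performed by units 3032 and 3033 can be sketched as follows. This is a minimal illustration that assumes the update is additive and that the preset weight change model is a plain function of a feature's parameters; the `distinct_ratio` parameter and the example model are hypothetical.

```python
def update_weights(target_feature_set, feature_parameters, weight_change_model):
    """Add the weight change predicted for each feature to its current weight.

    target_feature_set: {feature name: weight} from the feature selection step.
    feature_parameters: {feature name: parameters of that feature}.
    weight_change_model: callable mapping feature parameters to a weight change
        (stands in for the preset weight change model; an assumption).
    """
    updated = {}
    for name, weight in target_feature_set.items():
        change = weight_change_model(feature_parameters[name])
        updated[name] = weight + change
    return updated


def example_weight_change_model(parameters):
    """Hypothetical model: penalise features whose values are almost all distinct."""
    return -0.5 if parameters["distinct_ratio"] > 0.9 else 0.0


if __name__ == "__main__":
    weights = {"user_id": 0.75, "monthly_fee": 0.25}
    parameters = {
        "user_id": {"distinct_ratio": 1.0},
        "monthly_fee": {"distinct_ratio": 0.5},
    }
    print(update_weights(weights, parameters, example_weight_change_model))
```

The updated feature set would then be passed to the target machine learning algorithm in place of the weights produced by the automatic feature selection algorithm.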
  • the data processing apparatus 30 may further include:
  • a third obtaining module 307 configured to acquire m sample sets, where the m sets of data parameters of the m sample sets include the target parameter group, and m is an integer greater than or equal to 1;
  • a fifth determining module 308, configured to determine a target feature selection algorithm corresponding to each set of data parameters in the m sets of data parameters;
  • the sixth determining module 309 is configured to determine an initial feature set, where the initial feature set includes: a feature set obtained by substituting each sample set in the m sample sets into the target feature selection algorithm corresponding to the set of data parameters of that sample set;
  • the seventh determining module 310 is configured to determine a reference feature set, where the reference feature set includes: a feature set obtained by substituting each sample set in the m sample sets into the reference feature selection algorithm;
  • the eighth determining module 311 is configured to determine, according to the reference feature set, a weight change value corresponding to a set of feature parameters of each feature in the initial feature set;
  • the ninth determining module 312 is configured to determine a preset weight change model according to the weight change value corresponding to the set of feature parameters of each feature.
  • the eighth determining module 311 is further configured to:
  • the first processing model is evaluated according to a preset evaluation algorithm to determine a first evaluation value;
  • the second processing model is evaluated according to a preset evaluation algorithm to determine a second evaluation value;
  • the difference between the weight of the first feature in the reference feature set and the weight of the first feature in the initial feature set is used as a weight change value corresponding to a set of feature parameters of the first feature.
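
The weight change value computed by the eighth determining module can be sketched as a per-feature difference. The condition under which the difference is taken is not spelled out in the steps above, so the sketch assumes it is used only when the second evaluation value is better than the first; that condition, the function name and the treatment of missing features are assumptions.

```python
def weight_change_values(initial, reference, first_evaluation, second_evaluation):
    """Return {feature name: weight change value} for the initial feature set.

    initial / reference: {feature name: weight} for the initial and reference
        feature sets.
    first_evaluation / second_evaluation: evaluation values of the first and
        second processing models. Assumption: the reference weights are adopted
        only when the second value is strictly better; otherwise the change is 0.
    """
    if second_evaluation <= first_evaluation:
        return {name: 0.0 for name in initial}
    # A feature absent from the reference set is treated as having weight 0
    # there (assumption).
    return {
        name: reference.get(name, 0.0) - weight
        for name, weight in initial.items()
    }
```

Pairs of feature parameters and weight change values collected this way over the m sample sets could then serve as the data from which the preset weight change model is determined.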
  • the target algorithm comprises: a target feature selection algorithm or a target machine learning algorithm.
  • the first determining module can directly determine, according to the preset algorithm model, the target algorithm corresponding to the target parameter group (a set of data parameters of the data to be processed). That target algorithm is the algorithm corresponding to the optimal evaluation value determined by evaluating, according to the preset evaluation algorithm, at least one algorithm corresponding to the target parameter group. The attribute that the second determining module determines for the data to be processed according to this target algorithm is therefore the most accurate, so the determined attribute has high accuracy.
  • an embodiment of the present invention provides another network adjustment apparatus, which may include at least one processor 401 (such as a CPU), at least one network interface 402 or other communication interface, a memory 403, and at least one communication bus 404. The communication bus 404 is used to implement connection and communication between these components.
  • the processor 401 is configured to execute an executable module stored in the memory 403, such as a computer program. The memory 403 may include a high-speed random access memory (RAM), and may also include a non-volatile memory, such as at least one disk storage.
  • the communication connection between the network adjustment device and at least one other network element is implemented through the at least one network interface 402 (which may be wired or wireless), and may use the Internet, a wide area network, a local area network, a metropolitan area network, or the like.
  • the memory 403 stores a program 4031
  • the program 4031 can be executed by the processor 401
  • the data processing method shown in FIG. 2 can be implemented by the processor 401 executing the program 4031.
  • after acquiring the data to be processed, the processor directly determines, according to the preset algorithm model, the target algorithm corresponding to the target parameter group (a set of data parameters of the data to be processed). That target algorithm is the algorithm corresponding to the optimal evaluation value determined by evaluating, according to the preset evaluation algorithm, at least one algorithm corresponding to the target parameter group. The attribute of the data to be processed determined according to this target algorithm is therefore the most accurate, so the determined attribute has higher accuracy.
  • the embodiment of the invention provides a data processing system, which may include the data processing device shown in FIG. 3-1, FIG. 3-2, FIG. 3-4 or FIG.
  • the first determining module can directly determine, according to the preset algorithm model, the target algorithm corresponding to the target parameter group (a set of data parameters of the data to be processed). That target algorithm is the algorithm corresponding to the optimal evaluation value determined by evaluating, according to the preset evaluation algorithm, at least one algorithm corresponding to the target parameter group. The attribute that the second determining module determines for the data to be processed according to this target algorithm is therefore the most accurate, so the determined attribute is more accurate.

Abstract

Disclosed are a data processing method, device and system, belonging to the technical field of computers. The method comprises: obtaining data to be processed, a set of data parameters of the data to be processed being a target parameter group (205); substituting the target parameter group into a preset algorithm model to determine a target algorithm corresponding to the target parameter group (206), the target algorithm being the algorithm corresponding to an optimal evaluation value determined by evaluating at least one algorithm corresponding to the target parameter group according to a preset evaluation algorithm; and determining, according to the target algorithm corresponding to the target parameter group, an attribute of the data to be processed. The method is used for data processing and solves the problem of poor data processing effect, thereby improving the data processing effect.
PCT/CN2017/079791 2016-08-31 2017-04-07 Data processing method, device and system WO2018040561A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201610797641.3 2016-08-31
CN201610797641.3A CN107784363B (zh) 2016-08-31 2016-08-31 Data processing method, device and system

Publications (1)

Publication Number Publication Date
WO2018040561A1 (fr) 2018-03-08

Family

ID=61299990

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/079791 WO2018040561A1 (fr) 2016-08-31 2017-04-07 Data processing method, device and system

Country Status (2)

Country Link
CN (1) CN107784363B (fr)
WO (1) WO2018040561A1 (fr)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109615144B (zh) * 2018-12-20 2022-11-01 中华全国供销合作总社郑州棉麻工程技术设计研究所 棉花回潮率目标值的设定方法、装置、设备及存储介质
CN112036569B (zh) * 2020-07-30 2021-07-23 第四范式(北京)技术有限公司 知识内容的标注方法、装置、计算机装置和可读存储介质

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101782976A (zh) * 2010-01-15 2010-07-21 南京邮电大学 一种云计算环境下机器学习自动选择方法
CN103761426A (zh) * 2014-01-02 2014-04-30 中国科学院数学与系统科学研究院 一种在高维数据中快速识别特征组合的方法及系统
US20140310208A1 (en) * 2013-04-10 2014-10-16 Machine Perception Technologies Inc. Facilitating Operation of a Machine Learning Environment
CN104200087A (zh) * 2014-06-05 2014-12-10 清华大学 用于机器学习的参数寻优及特征调优的方法及系统
CN105389639A (zh) * 2015-12-15 2016-03-09 上海汽车集团股份有限公司 基于机器学习的物流运输路径规划方法、装置及系统

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101872347B (zh) * 2009-04-22 2012-09-26 富士通株式会社 判断网页类型的方法和装置
CN103123649B (zh) * 2013-01-29 2016-04-20 广州一找网络科技有限公司 一种基于微博平台的消息搜索方法及系统
CN104239351B (zh) * 2013-06-20 2017-12-19 阿里巴巴集团控股有限公司 一种用户行为的机器学习模型的训练方法及装置
CN103778913A (zh) * 2014-01-22 2014-05-07 苏州大学 一种病理嗓音的识别方法
CN104462487A (zh) * 2014-12-19 2015-03-25 南开大学 一种融合多信息源的个性化在线新闻评论情绪预测方法
CN104573741A (zh) * 2014-12-24 2015-04-29 杭州华为数字技术有限公司 一种特征选择方法及装置

Also Published As

Publication number Publication date
CN107784363A (zh) 2018-03-09
CN107784363B (zh) 2021-02-09

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17844869

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 17844869

Country of ref document: EP

Kind code of ref document: A1