CN111178488A - Data processing method and device - Google Patents

Data processing method and device

Info

Publication number
CN111178488A
CN111178488A
Authority
CN
China
Prior art keywords
individuals
individual
lambda
optimal
parameter
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN201911336654.0A
Other languages
Chinese (zh)
Inventor
范慧婷
卢亿雷
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Enyike Beijing Data Technology Co ltd
Original Assignee
Enyike Beijing Data Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Enyike Beijing Data Technology Co ltd
Priority to CN201911336654.0A
Publication of CN111178488A

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06N — COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 — Computing arrangements based on biological models
    • G06N 3/004 — Artificial life, i.e. computing arrangements simulating life
    • G06N 3/006 — Artificial life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]

Abstract

The embodiments of the present application disclose a data processing method and device. The method comprises the following steps: acquiring an initialization population corresponding to m parameters to be adjusted, wherein the initialization population comprises λ individuals, each individual comprises initial values corresponding to the m parameters, and m is a positive integer; iteratively processing the initialization population to obtain λ optimal individuals; calculating the mutation probability of each optimal individual; performing a mutation operation on at least one parameter of at least one optimal individual according to the mutation probability, to obtain a new population generated by the mutation operation; and selecting, from the new population generated by the mutation operation, individuals that conform to a preset optimal-selection strategy, thereby determining the parameter values corresponding to the m parameters.

Description

Data processing method and device
Technical Field
The present disclosure relates to the field of information processing, and more particularly, to a data processing method and apparatus.
Background
Machine learning algorithms typically model collected sample data to discover regularities in the data in order to solve a given problem. The problem to be solved often has no exact solution and generally needs to be converted into an optimization problem that is solved by successively approaching an optimal solution. The performance of a model is closely related to its parameters; in other words, whether the posed problem can be solved well depends on efficient and accurate adjustment of the model's parameters. In machine learning these are algorithm parameters, generally divided into model parameters and model hyper-parameters, and they are central to the algorithm. Model parameters are learned from the data and do not need to be set manually, such as the support vectors in a support vector machine or the coefficients in a logistic regression. Model hyper-parameters are configured manually by the model user, such as the proportion of features that each node of a decision tree uses.
For the hyper-parameters of a model, parameter adjustment in the related art is performed mainly by manual tuning or automatic tuning. Manual tuning relies on experience with the model to decide how the parameters should change so that the model's evaluation index improves; the model user therefore needs solid domain knowledge and substantial practical experience, otherwise tuning is inefficient, and the time cost grows with the number of hyper-parameters. To overcome these disadvantages of manual tuning, automatic parameter-adjustment algorithms have been proposed, mainly search algorithms and Bayesian optimization. To obtain the global optimum of the model's objective function, a search algorithm automatically traverses the value space of each parameter to find the point that optimizes the objective function, but the search cost is very high and it is difficult to approach the optimal solution efficiently and accurately. Bayesian optimization learns a prior over the objective function, fuses the prior with sample information, and uses the Bayes formula to obtain posterior information about the objective function, from which the location of the optimal solution in parameter space can be inferred. However, Bayesian optimization assumes that the model's objective function follows a Gaussian distribution, which greatly limits the generality of the method.
Therefore, how to adjust parameters efficiently and accurately is a problem that urgently needs to be solved.
Disclosure of Invention
To address the above technical problem, embodiments of the present application provide a data processing method and apparatus.
To achieve the purpose of the embodiment of the present application, an embodiment of the present application provides a data processing method, including:
acquiring an initialization population corresponding to m parameters to be adjusted, wherein the initialization population comprises λ individuals, each individual comprises initial values corresponding to the m parameters, and m is a positive integer;
iteratively processing the initialization population to obtain λ optimal individuals;
calculating the mutation probability of each optimal individual;
performing a mutation operation on at least one parameter of at least one optimal individual according to the mutation probability, to obtain a new population generated by the mutation operation;
and selecting, from the new population generated by the mutation operation, individuals that conform to a preset optimal-selection strategy, and determining the parameter values corresponding to the m parameters.
In an exemplary embodiment, the initial values corresponding to the m parameters in an individual are obtained as follows:
obtaining the value space [a_i, b_i] of the parameter p_i, where i is an integer with 1 ≤ i ≤ m and a_i, b_i are real numbers;
dividing the value space [a_i, b_i] with equal width to obtain n_i = ⌊√(b_i − a_i)⌋ + 1 initial value intervals;
selecting one of the n_i value intervals of the parameter p_i, and selecting a numerical value from the selected interval as the initial value corresponding to p_i.
In an exemplary embodiment, iteratively processing the initialization population to obtain λ optimal individuals comprises:
repeatedly executing the following steps until λ optimal individuals are obtained or the number of iterations reaches a preset maximum number of iterations T:
selecting λ pairs of parent individuals from the λ individuals in the initialization population;
determining λ offspring individuals corresponding to the λ pairs of parent individuals;
selecting λ optimal individuals from the individuals of the λ pairs of parent individuals and the λ offspring individuals.
In one exemplary embodiment, selecting λ pairs of parent individuals from the λ individuals in the initialization population comprises:
calculating fitness information for each individual according to a preset fitness-calculation strategy;
selecting n individuals from the λ individuals, selecting from those n individuals the 2 whose fitness information conforms to a preset optimal-selection strategy as 1 pair of parent individuals, and so on, until λ pairs of parent individuals have been selected, where n is an integer greater than 2.
In an exemplary embodiment, an offspring individual is obtained by a crossover formula (the equation image is not reproduced in the source), in which x_i^(t) and x_j^(t) respectively denote parent individuals in the t-th generation population, with 1 ≤ i, j ≤ λ, and ω_i, ω_j are the normalized fitness values of those parent individuals in the t-th generation population, where t is a positive integer less than or equal to the maximum number of iterations T.
In an exemplary embodiment, selecting λ optimal individuals from the individuals of the λ pairs of parent individuals and the λ offspring individuals comprises:
determining the fitness information corresponding to each individual of the λ pairs of parent individuals and the λ offspring individuals, the fitness information being determined according to a preset fitness-calculation strategy;
and selecting, from the individuals of the λ pairs of parent individuals and the λ offspring individuals, the λ optimal individuals whose fitness information conforms to a preset optimal-selection strategy.
In an exemplary embodiment, the mutation probability of an optimal individual is calculated by an adaptive formula (the equation image is not reproduced in the source), in which i denotes the i-th individual, t denotes the t-th generation population, f_max^(t) and f_avg^(t) are respectively the maximum fitness and the average fitness of the t-th generation population, σ^(t) is the variance of the population fitness of the t-th generation, f_i^(t) is the fitness of the i-th individual in the t-th generation, and k^(t) is the variation factor of the t-th generation, a constant.
In an exemplary embodiment, the mutation operation is performed on an optimal individual as follows:
judging whether the mutation probability of the optimal individual conforms to a preset mutation-judgment strategy, to obtain a judgment result;
if the judgment result is that the mutation-judgment strategy is met, selecting at least one parameter of the optimal individual as a mutation parameter;
calculating the noise corresponding to the mutation parameter according to the value range corresponding to the mutation parameter and a preset noise-calculation strategy;
and if the generated noise lies within the value range corresponding to the mutation parameter, adding the noise to that component to obtain a new, mutated individual.
In an exemplary embodiment, if the value range of the mutation parameter is [l, u], where l and u are real numbers, the range of the Laplacian noise added to the mutation parameter is given by a corresponding formula (the equation image is not reproduced in the source).
A data processing apparatus comprising a processor and a memory, the memory storing a computer program, the processor calling the computer program in the memory to implement the method of any one of the above.
According to the scheme provided by the embodiments of the present application, an initialization population corresponding to m parameters to be adjusted is acquired, wherein the initialization population comprises λ individuals, each individual comprises initial values corresponding to the m parameters, and m is a positive integer; the initialization population is iteratively processed to obtain λ optimal individuals; the mutation probability of each optimal individual is calculated; a mutation operation is performed on at least one parameter of at least one optimal individual according to the mutation probability, to obtain a new population generated by the mutation operation; and individuals conforming to a preset optimal-selection strategy are selected from the new population, determining the parameter values corresponding to the m parameters. By constructing the next group of solutions through mutation of the parameters of individuals in the population, a next-generation population is obtained and the parameter values are determined, achieving efficient and accurate parameter adjustment.
Additional features and advantages of the embodiments of the application will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by the practice of the embodiments of the application. The objectives and other advantages of the embodiments of the application may be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
Drawings
The accompanying drawings are included to provide a further understanding of the embodiments of the present application and are incorporated in and constitute a part of this specification; they illustrate embodiments of the present application and, together with the description, serve to explain them, without constituting a limitation of the embodiments of the present application.
Fig. 1 is a flowchart of a data processing method provided in an embodiment of the present application;
fig. 2 is a flowchart of an automatic parameter adjustment method according to an embodiment of the present disclosure;
fig. 3 is a flowchart of a method for parameter mutation processing according to an embodiment of the present disclosure.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application more apparent, the embodiments of the present application will be described in detail below with reference to the accompanying drawings. It should be noted that, in the embodiments of the present application, features in the embodiments and the examples may be arbitrarily combined with each other without conflict.
In order to adjust model parameters automatically, efficiently, accurately and universally, the present application provides a general algorithm that implements automatic parameter adjustment for classification-algorithm models. The algorithm can automatically tune the parameters of an algorithm model for different data, even training data containing noise. In addition, if the quality of a model cannot be judged by a single objective optimization function, the algorithm can also be applied when multi-objective optimization is required.
The parameters referred to in the present application refer to common parameters and hyper-parameters of a classification algorithm, the common parameters include input feature variables and value selection thresholds used by each internal node of a decision tree, weights of each edge of a neural network, support vectors in a support vector machine, and the like, and the hyper-parameters include minimum sample number of each leaf node of the decision tree, the number of hidden layers of the neural network, kernel functions in the support vector machine, and the like.
Fig. 1 is a flowchart of a data processing method according to an embodiment of the present application. The method shown in fig. 1 comprises:
101, acquiring an initialization population corresponding to m parameters to be adjusted, wherein the initialization population comprises λ individuals, each individual comprises initial values corresponding to the m parameters, and m is a positive integer;
in an exemplary embodiment, the initial values corresponding to the m parameters in an individual are obtained as follows:
obtaining the value space [a_i, b_i] of the parameter p_i, where i is an integer with 1 ≤ i ≤ m and a_i, b_i are real numbers;
dividing the value space [a_i, b_i] with equal width to obtain n_i = ⌊√(b_i − a_i)⌋ + 1 initial value intervals;
selecting one of the n_i value intervals of the parameter p_i, and selecting a numerical value from the selected interval as the initial value corresponding to p_i.
At least one of the selection of the value interval and the selection of the value within that interval may be performed randomly, or according to a preset selection rule.
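A minimal sketch of this initialization, assuming the interval count ⌊√(b − a)⌋ + 1 stated in the detailed embodiment and random selection for both choices; all function and variable names are illustrative, not from the patent:

```python
import math
import random

def init_value(a: float, b: float, rng: random.Random) -> float:
    """Draw one initial parameter value from the value space [a, b]:
    divide [a, b] into floor(sqrt(b - a)) + 1 equal-width intervals,
    pick an interval at random, then pick a uniform value inside it."""
    n = math.floor(math.sqrt(b - a)) + 1      # number of initial intervals
    width = (b - a) / n
    k = rng.randrange(n)                      # randomly selected interval
    lo = a + k * width
    return rng.uniform(lo, lo + width)

def init_individual(spaces, rng):
    """One individual: one initial value per parameter space [a_i, b_i]."""
    return [init_value(a, b, rng) for a, b in spaces]
```

An initial population would then be λ calls to `init_individual`.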
102, iteratively processing the initialization population to obtain λ optimal individuals;
In one exemplary embodiment, the λ optimal individuals may be determined, through iterative processing of the initialization population, from the individuals produced by the iterative operations.
In an exemplary embodiment, the selection of the optimal individuals may be implemented with a genetic algorithm, specifically comprising:
repeatedly executing the following steps until λ optimal individuals are obtained or the number of iterations reaches a preset maximum number of iterations T:
selecting λ pairs of parent individuals from the λ individuals in the initialization population;
determining λ offspring individuals corresponding to the λ pairs of parent individuals;
selecting λ optimal individuals from the individuals of the λ pairs of parent individuals and the λ offspring individuals.
In one exemplary embodiment, selecting λ pairs of parent individuals from the λ individuals in the initialization population comprises:
calculating fitness information for each individual according to a preset fitness-calculation strategy;
selecting n individuals from the λ individuals, selecting from those n individuals the 2 whose fitness information conforms to a preset optimal-selection strategy as 1 pair of parent individuals, and so on, until λ pairs of parent individuals have been selected, where n is an integer greater than 2.
In an exemplary embodiment, selection may also be performed with replacement of parent individuals. For example, if the X-th pair of parent individuals consists of the 1st and 3rd individuals, then before the (X+1)-th pair is selected, at least one of the 1st and 3rd individuals may be put back into the population as a candidate, so as to improve the diversity of the parent individuals.
In an exemplary embodiment, an offspring individual is obtained by a crossover formula (the equation image is not reproduced in the source), in which x_i^(t) and x_j^(t) respectively denote parent individuals in the t-th generation population, with 1 ≤ i, j ≤ λ, and ω_i, ω_j are the normalized fitness values of those parent individuals in the t-th generation population, where t is a positive integer less than or equal to the maximum number of iterations T.
In an exemplary embodiment, selecting λ optimal individuals from the individuals of the λ pairs of parent individuals and the λ offspring individuals comprises:
determining the fitness information corresponding to each individual of the λ pairs of parent individuals and the λ offspring individuals, the fitness information being determined according to a preset fitness-calculation strategy;
and selecting, from the individuals of the λ pairs of parent individuals and the λ offspring individuals, the λ optimal individuals whose fitness information conforms to a preset optimal-selection strategy.
103, calculating the mutation probability of the optimal individuals;
in an exemplary embodiment, the mutation probability of an optimal individual is calculated by:
Figure BDA0002331135960000073
wherein i represents the ith individual, t represents the tth population,
Figure BDA0002331135960000074
the maximum fitness and the average fitness, sigma, of the t-th generation species respectively(t)Is the variance of the population fitness of the t-th generation,
Figure BDA0002331135960000075
is the fitness of the ith individual in the t generation, k(t)Is a variation factor of the t generation and is a constant.
104, performing a mutation operation on at least one parameter of at least one optimal individual according to the mutation probability, to obtain a new population generated by the mutation operation;
in an exemplary embodiment, the mutation operation is performed on an optimal individual by the following methods, including:
calculating the variation probability of the optimal individual;
judging whether the variation probability of the optimal individual accords with a preset variation judgment strategy or not to obtain a judgment result;
if the judgment result is that the mutation judgment strategy is met, selecting at least one parameter from the optimal individual as a mutation parameter;
calculating the noise corresponding to the variation parameter according to the value range corresponding to the variation parameter and a preset noise calculation strategy;
and if the generated noise range is in the value range corresponding to the variation parameter, adding the noise to the component to obtain a new individual after the variation processing.
If the value range of the mutation parameter is [l, u], where l and u are real numbers, the range of the Laplacian noise added to the mutation parameter is given by a corresponding formula (the equation image is not reproduced in the source).
105, selecting, from the new population generated by the mutation operation, individuals that conform to a preset optimal-selection strategy, and determining the parameter values corresponding to the m parameters.
The optimal-selection strategy may be determined according to the fitness values.
According to the method provided by the embodiments of the present application, an initialization population corresponding to m parameters to be adjusted is acquired, wherein the initialization population comprises λ individuals, each individual comprises initial values corresponding to the m parameters, and m is a positive integer; the initialization population is iteratively processed to obtain λ optimal individuals; a mutation operation is performed on at least one parameter of at least one optimal individual, to obtain a new population generated by the mutation operation; and individuals conforming to a preset optimal-selection strategy are selected from the new population, determining the parameter values corresponding to the m parameters. By constructing the next group of solutions through mutation of the parameters of individuals in the population, a next-generation population is obtained, the parameter values are determined, and the parameters are adjusted efficiently and accurately.
In addition, following the basic principle of genetic algorithms, a solution is first obtained by randomly selecting a combination in the parameter space; the solution is a vector, and each dimension of the vector corresponds to one parameter. This random process is repeated many times to obtain a group of solutions, i.e., a randomly selected group of solutions of the optimization objective function (fitness function), which may also be called the parent population. Pairs of parent individuals are selected from this group of solutions, and the next group of solutions is constructed by crossover and mutation of the parent individuals, yielding the next-generation population. The above process is repeated until an optimal solution is obtained or the number of iterations reaches a threshold.
The method provided by the embodiments of the present application can greatly improve the efficiency with which a model user adjusts parameters, saving time and cost and leaving more time for thinking at the model level about how to better solve the practical problem; the parameters in the model behave like a black box requiring no user intervention. In addition, the genetic algorithm employed automatically eliminates parameter values that cannot optimize the objective function and retains those that continue to optimize it, achieving efficient and accurate parameter adjustment. Meanwhile, the two random choices made when initializing the population ensure the diversity of the initial population, and if the data contain noise points, the influence of the noise on the parameters can be reduced in the crossover and mutation stages. The crossover and mutation algorithms proposed here allow the search for the optimal parameter solution to converge more efficiently toward the global optimum while efficiently escaping local optima.
Fig. 2 is a flowchart of a method for automatically adjusting parameters according to an embodiment of the present application. As shown in fig. 2, the method includes:
step 201, performing adaptive discretization on parameters;
in one exemplary embodiment, there are m parameters to be adjusted in total: p_1, p_2, ..., p_m. Each individual in the population is a vector of m real-valued variables, and the population always maintains λ individuals; that is, the number of individuals in the initial population, and in the new offspring population generated in each generation, is kept at λ. The population is set to evolve for T generations in total, and the algorithm ends after T iterations. The parameters are real numbers with certain constraints, i.e., value ranges, so real-number encoding can be adopted.
According to the value space [a_i, b_i] of the parameter p_i, the space is divided with equal width to obtain the initial (generation-1) value intervals; the number of intervals is ⌊√(b_i − a_i)⌋ + 1, i.e., the square root of the difference between the maximum and the minimum of the value space is taken, the result is rounded down, and 1 is added;
step 202, randomly selecting a value interval for each parameter, and selecting a random value from that interval, to obtain an initialized individual;
in an exemplary embodiment, each of the initial λ individuals is obtained by randomly selecting one interval, and one value within it, from the corresponding discretized value space of each parameter.
Step 203, repeating the content of step 202 λ times to obtain the initial population;
in one exemplary embodiment, any one of the λ individuals of the initial population may be represented as a vector x_k = (p_1^(k), p_2^(k), ..., p_m^(k)), where k = 1, 2, ..., λ and p_1^(k) is the initial value of the first parameter p_1;
step 204, calculating the fitness, namely calculating the value of the loss function;
in one exemplary embodiment, the fitness function is the objective cost function C(p_1, p_2, ..., p_m) of the algorithm, and individuals are selected according to this objective cost function. After the initialization parameters are obtained, the model is trained and the fitness is calculated. The fitness of all individuals is normalized so that each individual's fitness lies within [0, 1]; the normalized fitness values are denoted (ω_1, ω_2, ..., ω_λ).
Step 205, running the selection algorithm to obtain parent individuals;
in an exemplary embodiment, 4 individuals are first selected at random; then, ranked by their fitness, the best 2 are selected as one pair of next-generation parent individuals, i.e., 2 solutions of the optimized loss function are chosen, after which the fitness of these two individuals is exponentially decayed. For example, for an original fitness value of 0.8 and the exponential decay function f(x) = x_0·e^(−θx) with θ = 0.24, the first decay (x = 1) gives a fitness of about 0.629. The more often an individual is selected, the lower its fitness becomes, which preserves diversity of selection: individuals with low fitness still have a probability of being chosen, while the parameters of better models are still carried over to the next generation. The selection process is repeated until λ pairs of next-generation parent individuals have been selected.
In one exemplary embodiment, the input is the λ individuals of the t-th generation, X^(t), and the output is the λ pairs of parent individuals of the (t+1)-th generation, Ψ^(t+1), where Ψ^(t+1) is initialized as an empty set;
the fitness of all λ individuals of the t-th generation is normalized so that each individual has a new fitness with a value range between [0, 1];
the following steps are cycled over all individuals:
randomly select 4 individuals (4 groups of parameters) and choose the 2 individuals, x_i^(t) and x_j^(t), that minimize the cost function;
if the size of Ψ^(t+1) is smaller than λ, add (x_i^(t), x_j^(t)) to Ψ^(t+1) as a parent pair, and recalculate the fitness of the selected x_i^(t) and x_j^(t) with the exponential decay function; if the size of Ψ^(t+1) equals λ, the loop exits and selection ends.
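The selection loop above can be sketched as follows; the tournament size of 4, the choice of the best 2, and the decay constant θ = 0.24 come from the text, while function and variable names are illustrative:

```python
import math
import random

def select_parents(fitness, lam, rng, theta=0.24):
    """Tournament selection sketch: draw 4 candidates, keep the best 2 as a
    parent pair, then exponentially decay the winners' fitness so that
    frequently chosen individuals lose selection pressure.  `fitness` holds
    the normalized fitness values (higher is better); returns lam pairs of
    indices.  The per-selection decay factor e**(-theta) with theta = 0.24
    matches the worked example in the text (0.8 -> about 0.629)."""
    fit = list(fitness)                              # working copy, decays in place
    pairs = []
    while len(pairs) < lam:
        cand = rng.sample(range(len(fit)), 4)        # 4 random individuals
        best = sorted(cand, key=lambda i: fit[i], reverse=True)[:2]
        pairs.append((best[0], best[1]))
        for i in best:                               # exponential fitness decay
            fit[i] *= math.exp(-theta)
    return pairs
```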
Step 206, obtaining λ offspring through crossover according to equation (1);
in an exemplary embodiment, each pair of parents is assumed to produce only one offspring. For a pair of parent individuals x_i^(t) and x_j^(t), with 1 ≤ i, j ≤ λ, the offspring of the two individuals is given by the crossover formula of equation (1) (the equation image is not reproduced in the source), where ω_i and ω_j are the parents' normalized fitness values. Applying this crossover operation to all parent pairs yields an offspring set comprising λ individuals.
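The crossover equation itself is an unreproduced image, so its exact form is unknown; a fitness-weighted average of the two parents, using the normalized fitness values ω_i and ω_j as weights, is one plausible reading and is assumed in this sketch (it is not confirmed by the source):

```python
def crossover(xi, xj, wi, wj):
    """Crossover sketch (ASSUMED form, not the patent's unreproduced
    equation): blend two parent vectors xi, xj component-wise, weighted
    by their normalized fitness values wi, wj, producing one offspring."""
    total = wi + wj
    return [(wi * a + wj * b) / total for a, b in zip(xi, xj)]
```

A parent with higher normalized fitness pulls the offspring closer to itself, which is consistent with fitter parameters being carried forward.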
Step 207, selecting the optimal λ individuals from the λ offspring and the individuals of the λ pairs of parent individuals;
in an exemplary embodiment, the population has λ offspring and λ pairs of parent individuals, μ individuals in total; the optimal λ individuals are selected according to fitness as the new population of the next generation.
Step 208, mutating some of the individuals according to the mutation algorithm;
fig. 3 is a flowchart of a method for parameter mutation processing according to an embodiment of the present disclosure. As shown in fig. 3, the method includes:
the variation is consistent with the biological world rule, the variation probability of a new population obtained after crossing in the scheme is very low, the variation probability of each individual is changed in a self-adaptive manner, and the calculation formula of the self-adaptive variation probability of any individual is as follows:
Figure BDA0002331135960000113
wherein i represents the ith individual, t represents the tth population,
Figure BDA0002331135960000114
the maximum fitness and the average fitness, sigma, of the t-th generation species respectively(t)Is the variance of the population fitness of the t-th generation,
Figure BDA0002331135960000115
is the fitness of the ith individual in the t generation, k(t)Is a variation factor of the t generation and is a constant.
When the fitness values of the individuals in the population are close to one another, the population may be trapped in a local optimum; the mutation probability formula then tends to increase the mutation probability of individuals whose fitness is above the average, so that the search space of the genetic algorithm is not confined to a particular region. When the fitness values fluctuate greatly, the mutation probability of all individuals becomes low, which accelerates convergence toward the optimal solution while still preserving the diversity of individuals in the population.
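The adaptive probability itself appears only as an image in the source; the sketch below is a hypothetical stand-in that reproduces the described behavior (above-average fitness combined with a small population spread raises an individual's mutation probability, while a large spread pushes every individual toward a common low baseline), with `k` standing in for the mutation factor k^(t):

```python
import math

def mutation_probability(f_i, f_avg, sigma, k=0.05, eps=1e-12):
    """Hypothetical adaptive mutation probability (the patent's exact
    formula is not recoverable): the fitness deviation is standardized
    by the population's spread, then squashed into the interval (0, k)."""
    score = (f_i - f_avg) / (sigma + eps)
    return k / (1.0 + math.exp(-score))
```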
In the scenario of the present solution, a mutation changes one component of an m-dimensional solution, that is, a single parameter is mutated. Following the statistical notion of a small probability, generally taken to be at most 0.05 or 0.01, it is set that if the mutation probability of an individual is greater than or equal to 0.03, one of the individual's components (genes) is randomly selected for mutation. The mutation is achieved by adding Laplace noise within a limited range, the limit depending on the value range of the component (specific parameter) to be mutated: if the parameter to be mutated takes values in [a_i, b_i], the added Laplace noise is restricted to a range determined by [a_i, b_i] (the range formula is shown as an image in the original). The Laplace noise is a randomly generated number; if the generated noise is within the limited range it is used, otherwise random numbers continue to be generated until one falls within the limited range.
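The Laplace-noise mutation can be sketched as follows (assumed details: the exponential-difference sampler, the noise scale of one tenth of the range width, and the resample-until-in-range loop implementing the "continue generating" rule):

```python
import random

def laplace(scale=1.0):
    """Laplace(0, scale) sample as the difference of two independent
    exponential samples with rate 1/scale."""
    rate = 1.0 / scale
    return random.expovariate(rate) - random.expovariate(rate)

def mutate_component(individual, ranges):
    """Mutate one randomly chosen component (gene) by adding Laplace
    noise; the noise is redrawn until the mutated value stays inside
    that parameter's value range [a, b]."""
    idx = random.randrange(len(individual))
    a, b = ranges[idx]
    child = list(individual)
    scale = (b - a) / 10.0  # assumed: noise scale tied to the range width
    while True:
        noise = laplace(scale)
        if a <= child[idx] + noise <= b:
            child[idx] += noise
            return child
```

Because small noise values are the most likely under a Laplace distribution, the resampling loop terminates quickly in practice.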
Step 209, obtaining the next-generation population generated after the mutation, namely the solution for the next iteration;
Step 210, judging whether the optimal solution is reached or the iteration number reaches a threshold value;
if yes, the process ends; otherwise, return to step 204.
In this embodiment, by drawing on the genetic algorithm for parameter adjustment (an evolutionary algorithm, or an algorithm derived from the genetic algorithm such as neuroevolution, may also be used), efficient and accurate automatic parameter adjustment is achieved, and the algorithm remains robust even when the data contains noise.
A data processing apparatus comprising a processor and a memory, the memory storing a computer program, the processor calling the computer program in the memory to implement the method of any one of the above.
It will be understood by those of ordinary skill in the art that all or some of the steps of the methods, systems, functional modules/units in the devices disclosed above may be implemented as software, firmware, hardware, and suitable combinations thereof. In a hardware implementation, the division between functional modules/units mentioned in the above description does not necessarily correspond to the division of physical components; for example, one physical component may have multiple functions, or one function or step may be performed by several physical components in cooperation. Some or all of the components may be implemented as software executed by a processor, such as a digital signal processor or microprocessor, or as hardware, or as an integrated circuit, such as an application specific integrated circuit. Such software may be distributed on computer readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). The term computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data, as is well known to those of ordinary skill in the art. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, Digital Versatile Disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by a computer. In addition, communication media typically embodies computer readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media as known to those skilled in the art.

Claims (10)

1. A data processing method, comprising:
acquiring an initialization population corresponding to m parameters to be adjusted, wherein the initialization population comprises λ individuals, each individual comprises initial values corresponding to the m parameters, and m is a positive integer;
carrying out iterative processing on the initialization population to obtain λ optimal individuals;
calculating a mutation probability of each optimal individual;
performing a mutation operation on at least one parameter in at least one optimal individual according to the mutation probability to obtain a new population generated by the mutation operation; and
selecting, from the new population generated by the mutation operation, individuals according with a preset optimal selection strategy, and determining parameter values corresponding to the m parameters.
2. The method of claim 1, wherein the initial values corresponding to the m parameters in an individual are obtained by:
obtaining a value space [a_i, b_i] of a parameter p_i, wherein i is an integer greater than or equal to 1 and less than or equal to m, and a_i, b_i are real numbers;
dividing the value space [a_i, b_i] into initial value intervals of equal width (the number of intervals is given as a formula image in the original); and
selecting one value interval from the value intervals of the parameter p_i, and selecting a numerical value from the selected value interval as the initial value corresponding to the parameter p_i.
3. The method of claim 1, wherein iteratively processing the initialization population to obtain λ optimal individuals comprises:
repeatedly executing the following steps until λ optimal individuals are obtained or the iteration number reaches a preset maximum iteration number T:
selecting λ pairs of parent individuals from the λ individuals in the initialization population;
determining λ offspring individuals corresponding to the λ pairs of parent individuals; and
selecting λ optimal individuals from the individuals in the λ pairs of parent individuals and the λ offspring individuals.
4. The method of claim 3, wherein selecting λ pairs of parent individuals from the λ individuals in the initialization population comprises:
calculating fitness information of each individual according to a preset fitness calculation strategy; and
selecting n individuals from the λ individuals, selecting from the n individuals 2 individuals whose fitness information accords with a preset optimal selection strategy as 1 pair of parent individuals, and so on until λ pairs of parent individuals are selected, wherein n is an integer greater than 2.
5. The method of claim 3, wherein an offspring individual is obtained by:
x' = ω_i·x_i^(t) + ω_j·x_j^(t)
wherein x_i^(t) and x_j^(t) respectively represent parent individuals in the t-th generation population, 1 ≤ i, j ≤ λ; ω_i and ω_j are respectively the normalized fitness values of the parent individuals in the t-th generation population; and t is a positive integer less than or equal to the maximum iteration number T.
6. The method of claim 1, wherein selecting λ optimal individuals from the λ pairs of parent individuals and the λ offspring individuals comprises:
determining fitness information of each individual in the λ pairs of parent individuals and the λ offspring individuals, wherein the fitness information is determined according to a preset fitness calculation strategy; and
selecting, from the individuals in the λ pairs of parent individuals and the λ offspring individuals, the λ optimal individuals whose fitness information accords with a preset optimal selection strategy.
7. The method of claim 1, wherein the mutation probability of an optimal individual is calculated by a formula (given as an image in the original) in which i represents the i-th individual, t represents the t-th generation, f_max^(t) and f_avg^(t) are respectively the maximum fitness and the average fitness of the t-th generation, σ^(t) is the variance of the population fitness of the t-th generation, f_i^(t) is the fitness of the i-th individual in the t-th generation, and k^(t) is the mutation factor of the t-th generation, a constant.
8. The method of claim 1, wherein performing the mutation operation on the optimal individual comprises:
judging whether the mutation probability of the optimal individual accords with a preset mutation judgment strategy, to obtain a judgment result;
if the judgment result is that the mutation judgment strategy is met, selecting at least one parameter from the optimal individual as a mutation parameter;
calculating noise corresponding to the mutation parameter according to the value range corresponding to the mutation parameter and a preset noise calculation strategy; and
if the generated noise is within the value range corresponding to the mutation parameter, adding the noise to the corresponding component to obtain a new individual after the mutation processing.
9. The method of claim 8, wherein:
if the value range of the mutation parameter is [a_i, b_i], the Laplace noise added to the mutation parameter is restricted to a range determined by [a_i, b_i] (the range formula is given as an image in the original), wherein a_i and b_i are real numbers.
10. A data processing apparatus comprising a processor and a memory, the memory storing a computer program, the processor calling the computer program in the memory to implement the method of any one of claims 1 to 9.
CN201911336654.0A 2019-12-23 2019-12-23 Data processing method and device Withdrawn CN111178488A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911336654.0A CN111178488A (en) 2019-12-23 2019-12-23 Data processing method and device


Publications (1)

Publication Number Publication Date
CN111178488A true CN111178488A (en) 2020-05-19

Family

ID=70655599

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911336654.0A Withdrawn CN111178488A (en) 2019-12-23 2019-12-23 Data processing method and device

Country Status (1)

Country Link
CN (1) CN111178488A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114385256A (en) * 2020-10-22 2022-04-22 华为云计算技术有限公司 Method and device for configuring system parameters


Similar Documents

Publication Publication Date Title
CN111177792B (en) Method and device for determining target business model based on privacy protection
CN109961098B (en) Training data selection method for machine learning
US9058564B2 (en) Controlling quarantining and biasing in cataclysms for optimization simulations
Kanan et al. Feature selection using ant colony optimization (ACO): a new method and comparative study in the application of face recognition system
CN112508243A (en) Training method and device for multi-fault prediction network model of power information system
CN111178416A (en) Parameter adjusting method and device
CN114328048A (en) Disk fault prediction method and device
CN112598062A (en) Image identification method and device
CN111178488A (en) Data processing method and device
CN112801231B (en) Decision model training method and device for business object classification
Farooq Genetic algorithm technique in hybrid intelligent systems for pattern recognition
Bhadouria et al. A study on genetic expression programming-based approach for impulse noise reduction in images
KR100869554B1 (en) Domain density description based incremental pattern classification method
CN116993548A (en) Incremental learning-based education training institution credit assessment method and system for LightGBM-SVM
CN116956160A (en) Data classification prediction method based on self-adaptive tree species algorithm
Azghani et al. Intelligent modified mean shift tracking using genetic algorithm
CN113141272B (en) Network security situation analysis method based on iteration optimization RBF neural network
JP7468088B2 (en) Image processing system and image processing program
CN112036566A (en) Method and apparatus for feature selection using genetic algorithm
Windarto An implementation of continuous genetic algorithm in parameter estimation of predator-prey model
CN108664992B (en) Classification method and device based on genetic optimization and kernel extreme learning machine
Kordos et al. Increasing speed of genetic algorithm-based instance selection
JP5652250B2 (en) Image processing program and image processing apparatus
CN116703529B (en) Contrast learning recommendation method based on feature space semantic enhancement
US20220172105A1 (en) Efficient and scalable computation of global feature importance explanations

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20200519