CN111310901A - Method and device for obtaining a sample - Google Patents


Info

Publication number
CN111310901A
CN111310901A
Authority
CN
China
Prior art keywords
sample
candidate data
data
data sample
screening controller
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010112532.XA
Other languages
Chinese (zh)
Other versions
CN111310901B (en)
Inventor
希滕
张刚
温圣召
Current Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd filed Critical Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202010112532.XA
Publication of CN111310901A
Application granted
Publication of CN111310901B
Status: Active


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/044: Recurrent networks, e.g. Hopfield networks

Abstract

Embodiments of the present disclosure disclose methods and apparatus for obtaining a sample. One embodiment of the method comprises: updating a preset sample screening controller with a current feedback reward value to obtain an updated sample screening controller, and generating a candidate data sample set from a sample space through the updated sample screening controller; detecting the candidate data sample set based on a preset reference model and determining a sample loss function of the candidate data samples; updating the feedback reward value based on the sample loss function; and determining the current candidate data samples as valid data samples in response to the feedback reward value reaching a preset convergence condition or the cumulative number of iterative operations reaching a preset iteration-count threshold. This implementation improves the efficiency of obtaining valid data samples, reduces the amount of data to be processed, and saves hardware memory.

Description

Method and device for obtaining a sample
Technical Field
The embodiment of the disclosure relates to the technical field of computers, in particular to a method and a device for obtaining a sample.
Background
With the development of science and technology, many problems can be solved by corresponding data models. Data samples are needed to train these models, and technicians obtain appropriate data samples to train the corresponding models. In practice, the number of available samples usually differs across problems and domains, and this difference directly affects the efficiency of model training.
Disclosure of Invention
Embodiments of the present disclosure provide methods and apparatuses for obtaining a sample.
In a first aspect, embodiments of the present disclosure provide a method for obtaining a sample, the method comprising: updating a preset sample screening controller with a current feedback reward value to obtain an updated sample screening controller, and generating a candidate data sample set from a sample space through the updated sample screening controller, wherein the candidate data sample set meets a set data-balance constraint condition, and the feedback reward value is used to characterize the degree to which the candidate data samples meet the data-balance constraint condition; detecting the candidate data sample set based on a preset reference model and determining a sample loss function of the candidate data samples, wherein the reference model is used to detect the validity of data samples; updating the feedback reward value based on the sample loss function; and determining the current candidate data samples as valid data samples in response to the feedback reward value reaching a preset convergence condition or the cumulative number of iterative operations reaching a preset iteration-count threshold.
In some embodiments, the sample screening controller comprises a recurrent neural network; and the updating of the preset sample screening controller with the current feedback reward value includes: updating the parameters of the recurrent neural network through the current feedback reward value so that the updated sample screening controller generates candidate data samples that increase the feedback reward value.
In some embodiments, the updating the preset sample screening controller by the current feedback reward value includes: generating a plurality of sample populations based on a preset sample space, and taking the current feedback reward value as the fitness of a candidate data sample generated in the current iteration operation in the sample populations; and updating the sample screening controller based on the fitness of the candidate data sample in the sample population so that the sample screening controller generates a data sample with increased fitness in the next iteration operation as the candidate data sample of the next iteration operation.
In some embodiments, the data samples in the sample space are configured with sample codes, and the generating of the candidate data sample set from the sample space through the updated sample screening controller includes: determining a sample type parameter from the sample space through the updated sample screening controller, wherein the sample type parameter is used to characterize the proportional relationship of the data samples corresponding to each sample type in the sample space; and generating a sample coding sequence from the sample type parameters, wherein the sample coding sequence is used to characterize the candidate data sample set.
In some embodiments, the determining the sample type parameter from the sample space by the updated sample screening controller includes: dividing the data samples in the sample space into at least one data sample set according to at least one sample type; counting the number of the data samples in each data sample set in the at least one data sample set; and determining the sample type parameter of the data sample in the sample space according to the sample number corresponding to the at least one data sample set.
In a second aspect, embodiments of the present disclosure provide an apparatus for obtaining a sample, the apparatus comprising: a candidate data sample set generating unit configured to update a preset sample screening controller with a current feedback reward value to obtain an updated sample screening controller, and to generate a candidate data sample set from a sample space through the updated sample screening controller, wherein the candidate data sample set meets a set data-balance constraint condition, and the feedback reward value is used to characterize the degree to which the candidate data samples meet the data-balance constraint condition; a sample loss function determining unit configured to detect the candidate data sample set based on a preset reference model and determine a sample loss function of the candidate data samples, wherein the reference model is used to detect the validity of data samples; a feedback reward value updating unit configured to update the feedback reward value based on the sample loss function; and a sample acquisition unit configured to determine the current candidate data samples as valid data samples in response to the feedback reward value reaching a preset convergence condition or the cumulative number of iterative operations reaching a preset iteration-count threshold.
In some embodiments, the sample screening controller comprises a recurrent neural network; and the candidate data sample set generating unit includes: and the first candidate data sample generating subunit is configured to update the parameter of the recurrent neural network through the current feedback reward value, so that the updated sample screening controller generates a candidate data sample with an increased feedback reward value.
In some embodiments, the candidate data sample set generating unit includes: the fitness acquiring subunit is configured to generate a plurality of sample populations based on a preset sample space, and use the current feedback reward value as the fitness of the candidate data sample generated in the current iteration operation in the sample populations; and a second candidate data sample generation subunit configured to update the sample screening controller based on the fitness of the candidate data sample in the sample population, so that the sample screening controller generates a data sample with increased fitness in a next iteration operation as a candidate data sample for the next iteration operation.
In some embodiments, the data samples in the sample space are configured with sample codes, and the candidate data sample set generating unit includes: the sample type parameter determining subunit is configured to determine a sample type parameter from a sample space through the updated sample screening controller, wherein the sample type parameter is used for representing a proportional relationship of a data sample corresponding to a sample type in the sample space; and the candidate data sample set generating subunit is configured to generate a sample coding sequence through the sample type parameter, wherein the sample coding sequence is used for characterizing the candidate data sample set.
In some embodiments, the sample type parameter determining subunit includes: a data sample set dividing module configured to divide the data samples in the sample space into at least one data sample set according to at least one sample type; a sample counting module configured to count the number of samples of the data samples in each of the at least one data sample set; and the sample type parameter determining module is configured to determine the sample type parameter of the data sample in the sample space according to the number of samples corresponding to the at least one data sample set.
In a third aspect, an embodiment of the present disclosure provides an electronic device, including: one or more processors; memory having one or more programs stored thereon which, when executed by the one or more processors, cause the one or more processors to perform the method for obtaining a sample of the first aspect.
In a fourth aspect, an embodiment of the present disclosure provides a computer-readable medium, on which a computer program is stored, wherein the program, when executed by a processor, implements the method for obtaining a sample of the first aspect described above.
According to the method and apparatus for obtaining a sample, the sample screening controller is updated through the feedback reward value, and the candidate data sample set is generated through the updated sample screening controller; the candidate data sample set is then detected by the reference model to obtain a sample loss function, through which the feedback reward value is updated; finally, after the feedback reward value meets the convergence condition or the number of iterations reaches its threshold, the candidate data samples are taken as valid data samples. In this way, samples that meet the constraint conditions and can represent the characteristics of the sample space can be selected from it, which improves the efficiency of obtaining valid data samples, reduces the amount of data to be processed, and saves hardware memory.
Drawings
Other features, objects and advantages of the disclosure will become more apparent upon reading of the following detailed description of non-limiting embodiments thereof, made with reference to the accompanying drawings in which:
FIG. 1 is an exemplary system architecture diagram in which one embodiment of the present disclosure may be applied;
FIG. 2 is a flow diagram of one embodiment of a method for obtaining a sample according to the present disclosure;
FIG. 3 is a schematic diagram of one application scenario of a method for obtaining a sample according to the present disclosure;
FIG. 4 is a flow chart of yet another embodiment of a method for obtaining a sample according to the present disclosure;
FIG. 5 is a schematic structural diagram of one embodiment of an apparatus for obtaining a sample according to the present disclosure;
FIG. 6 is a schematic diagram of an electronic device suitable for use in implementing embodiments of the present disclosure.
Detailed Description
The present disclosure is described in further detail below with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the relevant invention and not restrictive of the invention. It should be noted that, for convenience of description, only the portions related to the related invention are shown in the drawings.
It should be noted that, in the present disclosure, the embodiments and features of the embodiments may be combined with each other without conflict. The present disclosure will be described in detail below with reference to the accompanying drawings in conjunction with embodiments.
Fig. 1 illustrates an exemplary system architecture 100 of a method for acquiring a sample or an apparatus for acquiring a sample to which embodiments of the present disclosure may be applied.
As shown in fig. 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104, a sample acquisition server 105, and a sample server 106. Network 104 is used to provide a medium for communication links between terminal devices 101, 102, 103, sample acquisition server 105, and sample server 106. Network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few.
The terminal devices 101, 102, 103 interact with the sample acquisition server 105 via the network 104 to receive or send messages or the like. The terminal devices 101, 102, 103 may have various communication client applications installed thereon, such as a web browser application, a shopping application, a search application, an instant messaging tool, a mailbox client, social platform software, and the like.
The terminal apparatuses 101, 102, and 103 may be hardware or software. When the terminal devices 101, 102, 103 are hardware, they may be various electronic devices having a display screen and supporting web browsing, including but not limited to smart phones, tablet computers, e-book readers, MP3 players (Moving Picture Experts Group Audio Layer III, mpeg compression standard Audio Layer 3), MP4 players (Moving Picture Experts Group Audio Layer IV, mpeg compression standard Audio Layer 4), laptop portable computers, desktop computers, and the like. When the terminal apparatuses 101, 102, 103 are software, they can be installed in the electronic apparatuses listed above. It may be implemented as a plurality of software or software modules (for example, for providing distributed services), or as a single software or software module, which is not specifically limited herein.
The sample server 106 may be a server that stores a large number of samples. For example, the sample server 106 may obtain samples of a specified type via a network, database, or the like, and store the samples locally. The sample server 106 may also accept data access by the sample acquisition server 105 and provide samples to the sample acquisition server 105.
The sample server 106 may be hardware or software. When the sample server 106 is hardware, it may be implemented as a distributed server cluster composed of a plurality of servers, or may be implemented as a single server. When the sample server 106 is software, it may be implemented as a plurality of software or software modules (for example, for providing distributed services), or may be implemented as a single software or software module, and is not limited in particular.
The sample acquisition server 105 may be a server that provides various services, for example a server that acquires, from the sample server 106, samples characterizing the entirety of the samples stored on the sample server 106. The sample acquisition server 105 may analyze and process data such as the collected samples, and feed a processing result (e.g., a trained model) back to the terminal devices 101, 102, and 103 so that the model runs on the terminal devices 101, 102, and 103.
It should be noted that the method for obtaining a sample provided by the embodiment of the present disclosure is generally performed by the sample collection server 105, and accordingly, the apparatus for obtaining a sample is generally disposed in the sample collection server 105.
The server may be hardware or software. When the server is hardware, it may be implemented as a distributed server cluster formed by multiple servers, or may be implemented as a single server. When the server is software, it may be implemented as a plurality of software or software modules (for example, to provide distributed services), or may be implemented as a single software or software module, and is not limited specifically herein.
It should be understood that the number of terminal devices, networks, sample acquisition servers, and sample servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, sample acquisition servers, and sample servers, as required by the implementation.
With continued reference to fig. 2, a flow 200 of one embodiment of a method for obtaining a sample according to the present disclosure is shown. The method for obtaining a sample comprises the steps of:
step 201, updating a preset sample screening controller through a current feedback reward value to obtain an updated sample screening controller, and generating a candidate data sample set from a sample space through the updated sample screening controller.
In this embodiment, the execution subject of the method for obtaining a sample (e.g., sample acquisition server 105 shown in fig. 1) may be in data communication with sample server 106 via a wired or wireless connection to select samples from a sample space on sample server 106. The execution subject may then monitor the feedback reward value corresponding to the selected samples, and may thereby obtain new candidate data samples. It should be noted that the wireless connection may include, but is not limited to, 3G/4G, Wi-Fi, Bluetooth, WiMAX, ZigBee, and UWB (Ultra-Wideband) connections, as well as other wireless connections now known or developed in the future.
In practice, the amount of data in a sample space may be very large; using all of its samples to train a model makes sample selection indiscriminate. Moreover, many samples are highly similar to one another; using all of them reduces the effectiveness of the samples while increasing both the difficulty of training the model and the time the training takes.
Therefore, the present application discloses a method for obtaining a sample, in which the execution subject may update the parameters of a preset sample screening controller according to the current feedback reward value to obtain an updated sample screening controller. The execution subject then selects samples from the sample space through the updated sample screening controller to generate a candidate data sample set. The candidate data sample set satisfies a set data-balance constraint condition; that is, the candidate data samples in the set represent the characteristics of the sample space as fully as possible, and the number of candidate data samples in the set depends on actual requirements. The sample screening controller may be any of various models and is configured to select a set number of samples from the massive samples in the sample space on the sample server 106, so as to reduce the amount of data needed to train a model based on that sample space while ensuring that the selected samples represent the overall characteristics of the samples in the sample space. The feedback reward value may be used to characterize the degree to which the candidate data samples satisfy the data-balance constraint condition, and may also characterize the accuracy and effectiveness with which the current sample screening controller selects samples from the sample space. The data-balance constraint condition can be set according to the data requirements. For example, the constraint may be that the samples collected by the sample screening controller contain all of the sample types in the sample space. Suppose the sample space is a picture space whose pictures span different sample types such as animal, plant, landscape, and architectural pictures, and the numbers of pictures of the various sample types differ widely.
In this case, the samples collected by the sample screening controller need to include pictures of all of these sample types. Further, the data-balance constraint may require both that the collected samples contain all sample types in the sample space and that the proportions among the sample types in the collected samples are similar to the proportions among those types in the sample space. The data-balance constraint can also take other forms, depending on actual needs.
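The data-balance constraint just described can be made concrete with a small check. The sketch below is illustrative only; it assumes the "cover every type, keep proportions close" form of the constraint, and the function name and tolerance value are assumptions rather than part of the disclosure:

```python
from collections import Counter

def satisfies_balance(space_types, candidate_types, tolerance=0.1):
    """Check the assumed data-balance constraint: every sample type in the
    sample space appears among the candidates, and each type's share of the
    candidate set stays within `tolerance` of its share of the space."""
    space = Counter(space_types)
    cand = Counter(candidate_types)
    if set(space) - set(cand):  # some sample type is missing entirely
        return False
    return all(
        abs(cand[t] / len(candidate_types) - space[t] / len(space_types))
        <= tolerance
        for t in space
    )
```

For the picture-space example, a candidate set that keeps every picture type at roughly its original proportion passes, while one containing only animal pictures fails.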
The feedback reward value may be used to guide the update trend of the sample screening controller. Its initial value is preset. For example, the feedback reward value may be 0 when the sample screening controller first runs, meaning that it does not yet influence the update of the sample screening controller.
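The iterative procedure of steps 201 to 204 (update the controller, generate candidates, score them with the reference model, update the reward, and stop on convergence or an iteration cap) can be sketched as a loop. Everything below is a minimal illustration; the three helper functions are deterministic placeholders, not the models described in this disclosure:

```python
def update_controller(controller, reward):
    # Placeholder for step 201's controller update (e.g. a gradient step).
    return controller

def generate_candidates(controller, sample_space, k=4):
    # Placeholder for drawing a candidate set from the sample space.
    return sample_space[:k]

def reference_loss(candidates):
    # Placeholder for step 202's reference-model loss.
    return 1.0 / max(len(set(candidates)), 1)

def obtain_valid_samples(sample_space, max_iterations=50, epsilon=1e-6):
    reward, prev_reward, candidates = 0.0, None, None  # reward starts preset at 0
    controller = object()  # opaque controller state
    for _ in range(max_iterations):
        controller = update_controller(controller, reward)   # step 201
        candidates = generate_candidates(controller, sample_space)
        loss = reference_loss(candidates)                    # step 202
        reward = -loss                                       # step 203: lower loss, higher reward
        # Step 204: stop once the reward has effectively converged.
        if prev_reward is not None and abs(reward - prev_reward) < epsilon:
            break
        prev_reward = reward
    return candidates  # the current candidates become the valid data samples
```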
In some optional implementations of this embodiment, the sample screening controller may include a recurrent neural network; and, the updating of the preset sample screening controller by the current feedback reward value includes: and updating the parameters of the recurrent neural network through the current feedback reward value so that the updated sample screening controller generates a candidate data sample which enables the feedback reward value to be increased.
The feedback reward value may be calculated from the sample loss function of the previous iteration. It can be back-propagated, and the parameters of the recurrent neural network adjusted by gradient descent, so that the value of the sample loss function of the new candidate data samples generated by the adjusted recurrent neural network decreases and the corresponding feedback reward value increases.
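As a toy stand-in for that gradient-based adjustment, the snippet below applies one REINFORCE-style update to a softmax policy over sample choices: the update raises the probability of a choice in proportion to its reward. A real implementation would update recurrent-network weights instead; every name and the learning rate here are illustrative assumptions:

```python
import math

def softmax(logits):
    m = max(logits)
    exp = [math.exp(x - m) for x in logits]
    total = sum(exp)
    return [e / total for e in exp]

def reinforce_step(logits, chosen, reward, lr=0.5):
    """One gradient-ascent step on reward * log pi(chosen); the gradient of
    log-softmax at index i is (1 if i == chosen else 0) - p_i."""
    probs = softmax(logits)
    return [
        logit + lr * reward * ((1.0 if i == chosen else 0.0) - p)
        for i, (logit, p) in enumerate(zip(logits, probs))
    ]
```

After one positive-reward update for a chosen index, the policy assigns that index a higher probability, mirroring how the controller is steered toward candidates that increase the feedback reward value.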
In some optional implementations of this embodiment, updating the preset sample screening controller with the current feedback reward value may include the following steps:
First, generate a plurality of sample populations based on a preset sample space, and take the current feedback reward value as the fitness, within the sample populations, of the candidate data samples generated in the current iteration.
The execution subject can divide the sample space into a plurality of sample populations by means of random selection and the like. And then, taking the current feedback reward value as the fitness of the candidate data sample generated in the current iteration operation in the sample population.
Second, update the sample screening controller based on the fitness of the candidate data samples in the sample populations, so that in the next iteration it generates data samples with increased fitness as the candidate data samples for that iteration.
After obtaining the fitness, the execution subject may update the sample screening controller according to it, so that in the next iteration the sample screening controller generates data samples with increased fitness as the candidate data samples for that iteration. That is, the fitness guides the update trend of the sample screening controller.
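A minimal sketch of that fitness-guided update, treating candidate sample sets as individuals in a population; the fitness and mutation functions are supplied by the caller, and everything here is an illustrative simplification rather than the disclosed controller:

```python
import random

def evolve(population, fitness, mutate, generations=10, seed=0):
    """Keep the fitter half each generation and refill with mutated copies,
    so the best surviving candidate's fitness never decreases."""
    rng = random.Random(seed)
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        survivors = population[: len(population) // 2]
        population = survivors + [mutate(rng, s) for s in survivors]
    return max(population, key=fitness)
```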
In some optional implementations of this embodiment, the data samples in the sample space are configured with sample codes, and generating a candidate data sample set from the sample space through the updated sample screening controller may include the following steps:
First, determine the sample type parameter from the sample space through the updated sample screening controller.
After obtaining the updated sample screening controller, the execution subject may determine the sample type parameter from the sample space through the updated sample screening controller. The sample type parameter may be used to characterize a proportional relationship of the data sample corresponding to the sample type in the sample space. Therefore, the acquired candidate data samples can meet the data balance constraint condition, and the distribution of the data samples in the sample space and the overall characteristics of the samples can be truly reflected under the condition that the number of the candidate data samples is less than that of the data samples contained in the sample space.
Second, generate a sample coding sequence from the sample type parameters.
Typically, the data samples in the sample space accessible to the sample screening controller carry corresponding codes for storage and retrieval. After the execution subject obtains the sample type parameters, the sample screening controller can generate a sample coding sequence from them. The sample coding sequence may be used to characterize the candidate data sample set.
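A hedged sketch of turning sample type parameters into a sample coding sequence; the head-of-list selection and all names are illustrative simplifications, not the disclosed encoding scheme:

```python
def build_code_sequence(codes_by_type, type_params):
    """codes_by_type maps each sample type to its list of sample codes;
    type_params maps each type to the fraction of its codes to include."""
    sequence = []
    for sample_type, codes in sorted(codes_by_type.items()):
        # Take at least one code per type so every type stays represented.
        k = max(1, round(len(codes) * type_params[sample_type]))
        sequence.extend(codes[:k])
    return sequence
```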
In some optional implementations of this embodiment, determining the sample type parameter from the sample space through the updated sample screening controller may include the following steps:
First, divide the data samples in the sample space into at least one data sample set according to at least one sample type.
The execution subject may obtain the sample types within a sample space and then divide the sample space into multiple data sample sets according to those types. The sample space may be divided in various ways. For example, suppose the sample space contains 10 different sample types. The execution subject may divide it into 10 data sample sets, one per sample type. Alternatively, it may place the samples of one of the 10 types into one data sample set and the samples of the other 9 types into another. The specific way of dividing the data sample sets can be determined according to actual needs.
Second, count the number of data samples in each of the at least one data sample set.
After the data sample sets are determined, the executing subject may count the number of samples of the data samples in each data sample set in the at least one data sample set.
Third, determine the sample type parameter of the data samples in the sample space according to the number of samples corresponding to the at least one data sample set.
The execution subject may determine the sample type parameter from each data sample set at a set proportion (for example, 2%). Different sample type parameters may also be set per data sample set for specific data volumes. For example, suppose the first data sample set contains 100,000 data samples and the second contains 100. The execution subject may set the sample type parameter of the first data sample set to 0.1% and that of the second to 2%, to avoid losing the feature information of the second set's data samples in subsequent iterations or when training other models based on the candidate data samples. In this way, the magnitude of the candidate data samples is reduced (the original sample space holds far more data samples than the candidate set) while the candidate data samples remain valid.
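The worked example above reduces to a one-line computation; the function name is an assumption for illustration:

```python
def samples_to_draw(set_sizes, type_params):
    """Number of candidate samples to take from each data sample set,
    given each set's size and its per-set sample type parameter."""
    return {name: round(size * type_params[name])
            for name, size in set_sizes.items()}
```

Sampling the 100,000-sample set at 0.1% and the 100-sample set at 2% yields 100 and 2 candidates respectively, so the small set's feature information survives.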
Step 202, detecting the candidate data sample set based on a preset reference model, and determining a sample loss function of the candidate data sample.
The reference model is used to detect the validity of data samples. It is generally already trained and may be used to detect the feature information of the candidate data samples in the candidate data sample set. When the feature information of a candidate data sample is not salient or falls below a certain amount, its validity can be considered low; the candidate data samples are then divided into different validity classes to obtain a sample loss function. The sample loss function may be, for example, a piecewise function.
In step 203, the feedback reward value is updated based on the sample loss function.
After obtaining the sample loss function, the execution subject may set it directly as the feedback reward value, according to which the corresponding parameters of the sample screening controller are adjusted. The feedback reward value can also be derived from the sample loss function (for example, different value intervals of the sample loss function may map to different feedback reward values), after which the corresponding parameters of the sample screening controller are adjusted accordingly.
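One way to realize the interval-based variant, in which different value intervals of the sample loss function map to different feedback reward values; the thresholds and reward levels here are assumptions for illustration only:

```python
def loss_to_reward(loss):
    # Assumed piecewise mapping: smaller loss, larger reward.
    if loss < 0.1:
        return 1.0   # candidates judged highly valid
    if loss < 0.5:
        return 0.5
    return 0.0       # candidates judged ineffective
```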
Step 204, in response to that the feedback reward value reaches a preset convergence condition or the cumulative number of the iterative operations reaches a preset iteration number threshold, determining the current candidate data sample as an effective data sample.
As the feedback reward value increases, the candidate data samples generated by the sample screening controller conform ever more closely to the data-balance constraint condition. When the feedback reward value changes little over multiple iterations (i.e., the convergence condition is met), the candidate data samples corresponding to it can be considered to satisfy the data-balance constraint condition and to represent the characteristics of the data samples in the sample space. At that point, the current candidate data samples may be determined to be valid data samples. Likewise, when the cumulative number of iterative operations reaches the preset iteration-count threshold, the current candidate data samples can be considered to satisfy the data-balance constraint condition and to represent those characteristics, and the execution subject may determine them to be valid data samples. Because the candidate data samples are a small number of data samples, selected from the massive data samples in the sample space, that can represent the characteristics of those data samples, using the valid data samples they yield to train other models not only produces a more accurate model but also reduces the amount of data the training must process, speeds up training, saves hardware memory, and improves the hardware's data-processing efficiency.
With continued reference to fig. 3, fig. 3 is a schematic diagram of an application scenario of the method for acquiring a sample according to the present embodiment. In the application scenario of fig. 3, the sample acquisition server 105 obtains data samples in a sample space on the sample server 106 via the network 104. It updates the preset sample screening controller with the current feedback reward value to obtain an updated sample screening controller, and then generates a candidate data sample set from the sample space on the sample server 106 through the updated sample screening controller. Next, it detects the candidate data sample set through the reference model, determines a sample loss function, and updates the feedback reward value according to the sample loss function. Finally, when the feedback reward value reaches the preset convergence condition or the cumulative number of iterative operations reaches the preset iteration count threshold, it determines the current candidate data samples to be valid data samples. The sample acquisition server 105 may send the valid data samples to the terminal device 103 so that the terminal device 103 trains a model with them, or the sample acquisition server 105 may itself train the model with the valid data samples and send the trained model to the terminal device 103.
In the method provided by this embodiment of the disclosure, the sample screening controller is first updated with the feedback reward value, and a candidate data sample set is generated by the updated sample screening controller; the candidate data sample set is then detected by the reference model to obtain a sample loss function, and the feedback reward value is updated from that loss function; finally, once the feedback reward value meets the convergence condition or the iteration count threshold is reached, the candidate data samples are taken as valid data samples. In this way, samples that satisfy the constraint condition and represent the characteristics of the sample space can be selected from the sample space, which improves the efficiency of obtaining valid data samples, reduces the amount of data processing, and saves hardware memory.
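The loop summarized above can be sketched as follows. This is a schematic implementation under stated assumptions: the controller update, candidate generation, and reference-model loss are passed in as callables, and the simple mapping reward = -loss stands in for whichever reward scheme is actually used.

```python
def acquire_valid_samples(sample_space, update_controller, generate_candidates,
                          reference_loss, reward_delta=1e-4, max_iters=100):
    """Iteratively refine candidate data samples until the feedback
    reward converges or the iteration budget is exhausted."""
    reward, prev_reward = 0.0, float("inf")
    candidates = []
    for _ in range(max_iters):
        controller = update_controller(reward)                      # update controller
        candidates = generate_candidates(controller, sample_space)  # generate candidates
        loss = reference_loss(candidates)                           # detect with reference model
        reward = -loss                                              # update feedback reward
        if abs(reward - prev_reward) < reward_delta:                # convergence check
            break
        prev_reward = reward
    return candidates  # the valid data samples
```

For example, with a generator that always returns the first ten samples and a fixed reference loss, the loop detects convergence on the second iteration and returns those ten samples.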
With further reference to FIG. 4, a flow 400 of yet another embodiment of a method for training a model is illustrated. The process 400 of the method for training a model includes the steps of:
step 401, updating a preset sample screening controller through a current feedback reward value to obtain an updated sample screening controller, and generating a candidate data sample set from a sample space through the updated sample screening controller.
The content of step 401 is the same as that of step 201, and is not described in detail here.
Step 402, detecting the candidate data sample set based on a preset reference model, and determining a sample loss function of the candidate data sample.
The content of step 402 is the same as that of step 202, and is not described in detail here.
Step 403, updating the feedback reward value based on the sample loss function.
The content of step 403 is the same as that of step 203, and is not described in detail here.
Step 404, in response to the feedback reward value reaching a preset convergence condition or the cumulative number of iterative operations reaching a preset iteration count threshold, determining the current candidate data samples as valid data samples.
The content of step 404 is the same as that of step 204, and is not described in detail here.
Step 405, training the target model according to the effective data sample.
As can be seen from the above description, the valid data samples can represent the characteristics of the data samples in the sample space, while their data volume is much smaller than that of the sample space. Therefore, when other models are trained with the valid data samples, an accurate model can still be obtained, the amount of data processed during training is reduced, and training is accelerated. The execution subject may train the target model with the valid data samples according to actual requirements.
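As a sketch of step 405, the snippet below fits a trivial one-parameter model to the valid data samples; the model and update rule are placeholders for whatever target model is trained in practice.

```python
def train_target_model(valid_samples, epochs=5, lr=0.1):
    """Fit a single parameter (an estimate of the sample mean) by
    gradient descent on squared error over the valid data samples --
    a stand-in for training any target model on the small, balanced
    valid sample set instead of the full sample space."""
    theta = 0.0
    for _ in range(epochs):
        for x in valid_samples:
            grad = 2.0 * (theta - x)   # d/dtheta of (theta - x)^2
            theta -= lr * grad
    return theta
```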
With further reference to fig. 5, as an implementation of the methods shown in the above figures, the present disclosure provides an embodiment of an apparatus for obtaining a sample, which corresponds to the method embodiment shown in fig. 2, and which is particularly applicable in various electronic devices.
As shown in fig. 5, the apparatus 500 for obtaining a sample of the present embodiment may include: a candidate data sample set generating unit 501, a sample loss function determining unit 502, a feedback reward value updating unit 503, and a sample acquisition unit 504. The candidate data sample set generating unit 501 is configured to update a preset sample screening controller with the current feedback reward value to obtain an updated sample screening controller, and to generate a candidate data sample set from a sample space through the updated sample screening controller, where the candidate data sample set satisfies a set data balance constraint condition and the feedback reward value represents the degree to which the candidate data samples satisfy that condition. The sample loss function determining unit 502 is configured to detect the candidate data sample set based on a preset reference model and determine a sample loss function of the candidate data samples, where the reference model is used to detect the validity of data samples. The feedback reward value updating unit 503 is configured to update the feedback reward value based on the sample loss function. The sample acquisition unit 504 is configured to determine the current candidate data samples as valid data samples in response to the feedback reward value reaching a preset convergence condition or the cumulative number of iterations reaching a preset iteration count threshold.
In some optional implementations of this embodiment, the sample screening controller includes a recurrent neural network; and, the candidate data sample set generating unit 501 may include: a first candidate data sample generating subunit (not shown) configured to update the parameter of the recurrent neural network with the current feedback reward value, so that the updated sample screening controller generates a candidate data sample with an increased feedback reward value.
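The disclosure does not pin down the exact parameter-update rule for the recurrent-network controller, but one common realization is a REINFORCE-style policy-gradient step, sketched below for a simple softmax controller (the logit parameterization and learning rate are assumptions for illustration, not part of the disclosure):

```python
import math

def reinforce_step(logits, chosen, reward, lr=0.05):
    """One policy-gradient step: raise the log-probability of the
    choice that produced the current feedback reward, in proportion
    to that reward, so the controller drifts toward generating
    candidate data samples that increase the reward."""
    exps = [math.exp(t) for t in logits]
    z = sum(exps)
    probs = [e / z for e in exps]
    # gradient of log softmax: indicator(chosen) - probability
    return [t + lr * reward * ((1.0 if i == chosen else 0.0) - p)
            for i, (t, p) in enumerate(zip(logits, probs))]
```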
In some optional implementations of the present embodiment, the candidate data sample set generating unit 501 may include: a fitness obtaining subunit (not shown) and a second candidate data sample generating subunit (not shown). The fitness acquiring subunit is configured to generate a plurality of sample populations based on a preset sample space, and use the current feedback reward value as the fitness of the candidate data sample generated in the current iteration operation in the sample populations; the second candidate data sample generating subunit is configured to update the sample screening controller based on the fitness of the candidate data sample in the sample population, so that the sample screening controller generates a data sample with increased fitness in a next iteration operation as a candidate data sample for the next iteration operation.
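A minimal sketch of that population-based variant, where the elite fraction and the mutation operator are illustrative assumptions:

```python
import random

def evolve_population(population, fitness, mutate, elite_frac=0.5):
    """One generation: rank candidate data samples by fitness (the
    current feedback reward), keep the fittest fraction, and refill
    the population with mutated copies of the survivors, so the next
    iteration's candidates tend to have increased fitness."""
    ranked = sorted(population, key=fitness, reverse=True)
    n_elite = max(1, int(len(ranked) * elite_frac))
    elites = ranked[:n_elite]
    children = [mutate(random.choice(elites))
                for _ in range(len(population) - n_elite)]
    return elites + children
```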
In some optional implementations of this embodiment, the data samples in the sample space are configured with sample codes, and the candidate data sample set generating unit 501 may include: the sample type parameter determining subunit (not shown) and the candidate data sample set generating subunit (not shown). The sample type parameter determining subunit is configured to determine a sample type parameter from a sample space through the updated sample screening controller, where the sample type parameter is used to characterize a proportional relationship of a data sample corresponding to a sample type in the sample space; the candidate data sample set generating subunit is configured to generate a sample encoding sequence by using the sample type parameter, wherein the sample encoding sequence is used for characterizing the candidate data sample set.
In some optional implementations of this embodiment, the sample type parameter determining subunit may include: the device comprises a data sample set dividing module, a sample counting module and a sample type parameter determining module. The data sample set dividing module is configured to divide the data samples in the sample space into at least one data sample set according to at least one sample type; the sample counting module is configured to count the number of samples of the data samples in each data sample set in the at least one data sample set; the sample type parameter determining module is configured to determine a sample type parameter of the data sample in the sample space according to the number of samples corresponding to the at least one data sample set.
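The three modules above amount to a small counting routine; a sketch follows, in which the `type_of` labeling function is an assumed stand-in for however sample types are assigned:

```python
from collections import Counter

def sample_type_parameters(sample_space, type_of):
    """Divide the data samples into sets by sample type, count each
    set, and return each type's proportion of the sample space --
    the sample type parameter described above."""
    counts = Counter(type_of(s) for s in sample_space)
    total = sum(counts.values())
    return {t: n / total for t, n in counts.items()}
```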
The present embodiment also provides an electronic device, including: one or more processors; a memory having one or more programs stored thereon that, when executed by the one or more processors, cause the one or more processors to perform the method for obtaining a sample described above.
The present embodiment also provides a computer-readable medium, on which a computer program is stored which, when being executed by a processor, carries out the above-mentioned method for obtaining a sample.
Referring now to FIG. 6, a block diagram of a computer system 600 of an electronic device (e.g., sample acquisition server 105 of FIG. 1) suitable for use in implementing embodiments of the present disclosure is shown. The electronic device shown in fig. 6 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present disclosure.
As shown in fig. 6, the electronic device 600 may include a processing means (e.g., a central processing unit, a graphics processor, etc.) 601 that may perform various appropriate actions and processes in accordance with a program stored in a read-only memory (ROM) 602 or a program loaded from a storage means 608 into a random access memory (RAM) 603. The RAM 603 also stores various programs and data necessary for the operation of the electronic device 600. The processing device 601, the ROM 602, and the RAM 603 are connected to each other via a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.
Generally, the following devices may be connected to the I/O interface 605: input devices 606 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; output devices 607 including, for example, a Liquid Crystal Display (LCD), a speaker, a vibrator, and the like; storage 608 including, for example, tape, hard disk, etc.; and a communication device 609. The communication means 609 may allow the electronic device 600 to communicate with other devices wirelessly or by wire to exchange data. While fig. 6 illustrates an electronic device 600 having various means, it is to be understood that not all illustrated means are required to be implemented or provided. More or fewer devices may alternatively be implemented or provided. Each block shown in fig. 6 may represent one device or may represent multiple devices as desired.
In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication means 609, or may be installed from the storage means 608, or may be installed from the ROM 602. The computer program, when executed by the processing device 601, performs the above-described functions defined in the methods of embodiments of the present disclosure.
It should be noted that the computer readable medium mentioned above in the embodiments of the present disclosure may be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In embodiments of the disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In embodiments of the present disclosure, however, a computer readable signal medium may comprise a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. 
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, optical cables, RF (radio frequency), etc., or any suitable combination of the foregoing.
The computer readable medium may be embodied in the electronic device; or may exist separately without being assembled into the electronic device. The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: updating a preset sample screening controller through a current feedback reward value to obtain an updated sample screening controller, and generating a candidate data sample set from a sample space through the updated sample screening controller, wherein the candidate data sample set meets a set data balance constraint condition, and the feedback reward value is used for representing the degree that the candidate data sample meets the data balance constraint condition; detecting the candidate data sample set based on a preset reference model, and determining a sample loss function of the candidate data sample, wherein the reference model is used for detecting the validity of the data sample; updating the feedback reward value based on the sample loss function; and determining the current candidate data sample as an effective data sample in response to the feedback reward value reaching a preset convergence condition or the accumulated times of the iterative operation reaching a preset iteration time threshold value.
Computer program code for carrying out operations for embodiments of the present disclosure may be written in any combination of one or more programming languages, including an object-oriented programming language such as Java, Smalltalk, or C++, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in the embodiments of the present disclosure may be implemented by software or hardware. The described units may also be provided in a processor, and may be described as: a processor includes a candidate data sample set generation unit, a sample loss function determination unit, a feedback reward value update unit, and a sample acquisition unit. The names of the units do not in some cases constitute a limitation on the units themselves, and for example, the sample acquisition unit may also be described as a "unit that determines a candidate data sample as a valid data sample when a set condition is satisfied".
The foregoing description presents only the preferred embodiments of the disclosure and illustrates the technical principles employed. Those skilled in the art will appreciate that the scope of the invention in the present disclosure is not limited to the specific combinations of the above features, and also encompasses other technical solutions formed by any combination of the above features or their equivalents without departing from the inventive concept, for example, solutions in which the above features are interchanged with (but not limited to) features having similar functions disclosed in this disclosure.

Claims (12)

1. A method for obtaining a sample, comprising:
updating a preset sample screening controller through a current feedback reward value to obtain an updated sample screening controller, and generating a candidate data sample set from a sample space through the updated sample screening controller, wherein the candidate data sample set meets a set data balance constraint condition, and the feedback reward value is used for representing the degree that the candidate data sample meets the data balance constraint condition;
detecting the candidate data sample set based on a preset reference model, and determining a sample loss function of the candidate data sample, wherein the reference model is used for detecting the validity of the data sample;
updating the feedback reward value based on the sample loss function;
and determining the current candidate data sample as a valid data sample in response to the feedback reward value reaching a preset convergence condition or the cumulative number of iterative operations reaching a preset iteration count threshold.
2. The method of claim 1, wherein the sample screening controller comprises a recurrent neural network; and
the updating of the preset sample screening controller through the current feedback reward value comprises the following steps:
and updating the parameters of the recurrent neural network through the current feedback reward value so that the updated sample screening controller generates candidate data samples which enable the feedback reward value to be increased.
3. The method according to claim 1 or 2, wherein the updating the preset sample screening controller by the current feedback reward value comprises:
generating a plurality of sample populations based on a preset sample space, and taking the current feedback reward value as the fitness of the candidate data sample generated in the current iteration operation in the sample populations;
and updating the sample screening controller based on the fitness of the candidate data sample in the sample population so that the sample screening controller generates a data sample with increased fitness in the next iteration operation as the candidate data sample of the next iteration operation.
4. The method of claim 1, wherein the data samples within the sample space are configured with sample coding; and
The generating, by the updated sample screening controller, a set of candidate data samples from a sample space comprises:
determining a sample type parameter from a sample space through the updated sample screening controller, wherein the sample type parameter is used for representing a proportional relation of a data sample corresponding to a sample type in the sample space;
and generating a sample coding sequence through the sample type parameters, wherein the sample coding sequence is used for characterizing a candidate data sample set.
5. The method of claim 4, the determining, by the updated sample screening controller, a sample type parameter from a sample space, comprising:
dividing the data samples in the sample space into at least one data sample set according to at least one sample type;
counting the number of data samples in each data sample set in the at least one data sample set;
and determining the sample type parameter of the data sample in the sample space according to the sample number corresponding to the at least one data sample set.
6. An apparatus for obtaining a sample, comprising:
the candidate data sample set generating unit is configured to update a preset sample screening controller through a current feedback reward value to obtain an updated sample screening controller, and generate a candidate data sample set from a sample space through the updated sample screening controller, wherein the candidate data sample set meets a set data balance constraint condition, and the feedback reward value is used for representing the degree that the candidate data sample meets the data balance constraint condition;
a sample loss function determining unit configured to detect the candidate data sample set based on a preset reference model, and determine a sample loss function of the candidate data sample, wherein the reference model is used for detecting validity of the data sample;
a feedback reward value updating unit configured to update the feedback reward value based on the sample loss function;
and a sample acquisition unit configured to determine the current candidate data sample as a valid data sample in response to the feedback reward value reaching a preset convergence condition or the cumulative number of iterative operations reaching a preset iteration count threshold.
7. The apparatus of claim 6, wherein the sample screening controller comprises a recurrent neural network; and
the candidate data sample set generating unit includes:
a first candidate data sample generating subunit configured to update the parameter of the recurrent neural network with the current feedback reward value, so that the updated sample screening controller generates a candidate data sample that increases the feedback reward value.
8. The apparatus according to claim 6 or 7, wherein the candidate data sample set generating unit comprises:
the fitness obtaining subunit is configured to generate a plurality of sample populations based on a preset sample space, and use the current feedback reward value as the fitness of the candidate data samples generated in the current iteration operation in the sample populations;
a second candidate data sample generating subunit configured to update the sample screening controller based on the fitness of the candidate data sample in the sample population, so that the sample screening controller generates a data sample with increased fitness in a next iteration operation as a candidate data sample for the next iteration operation.
9. The apparatus of claim 6, wherein the data samples within the sample space are configured with sample coding; and
The candidate data sample set generating unit includes:
the sample type parameter determining subunit is configured to determine a sample type parameter from a sample space through the updated sample screening controller, wherein the sample type parameter is used for characterizing a proportional relationship of a data sample corresponding to a sample type in the sample space;
a candidate data sample set generating subunit configured to generate a sample encoding sequence by the sample type parameter, the sample encoding sequence being used for characterizing the candidate data sample set.
10. The apparatus of claim 9, the sample type parameter determination subunit comprising:
a data sample set dividing module configured to divide the data samples in the sample space into at least one data sample set according to at least one sample type;
a sample counting module configured to count the number of samples of the data samples in each of the at least one data sample set;
a sample type parameter determining module configured to determine a sample type parameter of the data sample in the sample space according to the number of samples corresponding to the at least one data sample set.
11. An electronic device, comprising:
one or more processors;
a memory having one or more programs stored thereon,
the one or more programs, when executed by the one or more processors, cause the one or more processors to perform the method of any of claims 1-5.
12. A computer-readable medium, on which a computer program is stored which, when being executed by a processor, carries out the method according to any one of claims 1 to 5.
CN202010112532.XA 2020-02-24 2020-02-24 Method and device for acquiring samples Active CN111310901B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010112532.XA CN111310901B (en) 2020-02-24 2020-02-24 Method and device for acquiring samples


Publications (2)

Publication Number Publication Date
CN111310901A true CN111310901A (en) 2020-06-19
CN111310901B CN111310901B (en) 2023-10-10

Family

ID=71161798

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010112532.XA Active CN111310901B (en) 2020-02-24 2020-02-24 Method and device for acquiring samples

Country Status (1)

Country Link
CN (1) CN111310901B (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090049002A1 (en) * 2007-08-13 2009-02-19 Yahoo! Inc. System and method for selecting a training sample from a sample test
CN107798390A (en) * 2017-11-22 2018-03-13 阿里巴巴集团控股有限公司 A kind of training method of machine learning model, device and electronic equipment
WO2018184195A1 (en) * 2017-04-07 2018-10-11 Intel Corporation Joint training of neural networks using multi-scale hard example mining
CN109711452A (en) * 2018-12-20 2019-05-03 四川新网银行股份有限公司 It is a kind of based on WGAN-GP model to the uneven classification method of user behavior
CN110110860A (en) * 2019-05-06 2019-08-09 南京大学 A kind of self-adapting data method of sampling for accelerating machine learning to train
CN110413893A (en) * 2019-07-31 2019-11-05 腾讯科技(北京)有限公司 Object Push method, apparatus, computer equipment and storage medium
CN110428004A (en) * 2019-07-31 2019-11-08 中南大学 Component of machine method for diagnosing faults under data are unbalance based on deep learning
CN110766142A (en) * 2019-10-30 2020-02-07 北京百度网讯科技有限公司 Model generation method and device

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
NAN WANG 等: "Iterative Metric Learning for Imbalance Data Classification", 《PROCEEDINGS OF THE TWENTY-SEVENTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE (IJCAI-18)》, pages 2805 - 2811 *
于旭: "模式分类中训练样本集的构造方法研究", 《中国博士学位论文全文数据库 信息科技辑》, no. 04, pages 140 - 10 *
李国和 等: "面向机器学习的训练数据集均衡化方法", 《计算机工程与设计》, vol. 40, no. 3, pages 812 - 818 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant