CN113821498A - Data screening method, device, equipment and medium

Data screening method, device, equipment and medium

Info

Publication number
CN113821498A
CN113821498A
Authority
CN
China
Prior art keywords
data
network
classification
initial data
noise reduction
Prior art date
Legal status
Pending
Application number
CN202110821398.5A
Other languages
Chinese (zh)
Inventor
李悦翔
何楠君
马锴
郑冶枫
Current Assignee
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN202110821398.5A
Publication of CN113821498A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F 16/21 Design, administration or maintenance of databases
    • G06F 16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G06F 16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods


Abstract

The application discloses a data screening method, apparatus, device, and medium, relating to the technical field of machine learning. The method comprises the following steps: acquiring initial data and a first classification network; determining the cross entropy loss corresponding to the initial data based on a prediction result obtained by the first classification network performing classification prediction on the initial data; performing feature sampling on the initial data and determining reconstruction error information corresponding to the initial data, where the reconstruction error information is used to indicate the degree of association between the initial data and the classification task; and screening the initial data based on the cross entropy loss and the reconstruction error information to obtain noise reduction data, where the noise reduction data is used for iteratively training the first classification network to obtain a second classification network for processing the classification task. By combining the cross entropy loss with the reconstruction error information, the initial data is screened jointly from the uncertainty dimension and the representativeness dimension, helping the network resist the interference of noise data during training.

Description

Data screening method, device, equipment and medium
Technical Field
The present application relates to the field of machine learning technologies, and in particular, to a method, an apparatus, a device, and a medium for screening data.
Background
In the training process of a deep learning network, the selection of training data is particularly important: the quality of the training data directly influences the training effect of the network. Generally, annotators label the data in a data set according to the training task of the network to be trained to obtain training data. However, manual labeling may introduce mislabeling due to carelessness or insufficient expertise of the annotators; in a classification task, for example, this manifests as a wrong category label being assigned to the data. Such mislabeled data, i.e., noise data, can greatly affect the training of the deep learning network.
In the related art, the noise problem in training data is mitigated by a Decoupling method during network training. That is, two networks that differ only in their initialization parameters are trained simultaneously, and back-propagation updates are performed only on samples where the two networks disagree, so as to improve the robustness of the networks.
However, in a network trained in the above manner, if the batch data initially input to the networks contains biased training samples, the errors of a single network are reinforced in subsequent training passes, so the noise data mis-trains the model parameters and the trained network has low accuracy.
Disclosure of Invention
The embodiment of the application provides a data screening method, a data screening device, data screening equipment and a data screening medium, which can reduce the negative influence of noise data in training data on network training. The technical scheme is as follows:
in one aspect, a method for screening data is provided, the method comprising:
acquiring initial data and a first classification network, wherein the first classification network is a network to be trained for processing classification tasks;
determining cross entropy loss corresponding to the initial data based on a prediction result obtained by performing classification prediction on the initial data by the first classification network;
performing feature sampling on the initial data, and determining reconstruction error information corresponding to the initial data, wherein the reconstruction error information is used for indicating the correlation degree between the initial data and the classification task;
and screening the initial data based on the cross entropy loss and the reconstruction error information to obtain noise reduction data, wherein the noise reduction data is used for performing iterative training on the first classification network to obtain a second classification network for processing the classification task.
In another aspect, an apparatus for screening data is provided, the apparatus comprising:
an acquisition module, configured to acquire initial data and a first classification network, wherein the first classification network is a network to be trained for processing a classification task;
the determining module is used for determining cross entropy loss corresponding to the initial data based on a prediction result obtained by performing classification prediction on the initial data by the first classification network;
the determining module is further configured to perform feature sampling on the initial data, and determine reconstruction error information corresponding to the initial data, where the reconstruction error information is used to indicate a degree of association between the initial data and the classification task;
and the screening module is used for screening the initial data based on the cross entropy loss and the reconstruction error information to obtain noise reduction data, and the noise reduction data is used for performing iterative training on the first classification network to obtain a second classification network for processing the classification task.
In another aspect, a computer device is provided, which includes a processor and a memory, where at least one instruction, at least one program, a set of codes, or a set of instructions is stored in the memory, and the at least one instruction, the at least one program, the set of codes, or the set of instructions is loaded and executed by the processor to implement the method for screening data according to any one of the embodiments of the present application.
In another aspect, a computer-readable storage medium is provided, in which at least one program code is stored, the program code being loaded and executed by a processor to implement the method for screening data described in any one of the embodiments of the present application.
In another aspect, a computer program product or computer program is provided, the computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer readable storage medium, and the processor executes the computer instructions to cause the computer device to execute the method for screening data as described in any of the above embodiments.
The technical scheme provided by the application at least comprises the following beneficial effects:
in the training process of a neural network for a classification task, the initial data is input into the network to be trained, and the cross entropy loss corresponding to the initial data is obtained according to the output prediction result; meanwhile, feature sampling is performed on the initial data, and reconstruction error information indicating the degree of association between the initial data and the classification task is determined. The initial data is screened based on the cross entropy loss and the reconstruction error information to obtain noise reduction data, and the noise reduction data is used for iteratively training the first classification network to obtain a second classification network for realizing the classification task. That is, by combining the cross entropy loss with the reconstruction error information, the initial data is screened jointly from the uncertainty dimension and the representativeness dimension to obtain noise reduction data for network training, helping the network resist the interference of noise data during training.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings used in the description of the embodiments are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present application, and those skilled in the art can obtain other drawings based on these drawings without creative effort.
FIG. 1 is a schematic illustration of an implementation environment provided by an exemplary embodiment of the present application;
FIG. 2 is a flow chart of a method for screening data provided by an exemplary embodiment of the present application;
FIG. 3 is a schematic diagram of a self-encoder provided by an exemplary embodiment of the present application;
FIG. 4 is a schematic diagram of a training framework provided by an exemplary embodiment of the present application;
FIG. 5 is a flow chart of a method for screening data provided by another exemplary embodiment of the present application;
FIG. 6 is a flow chart of a method for screening data provided by another exemplary embodiment of the present application;
FIG. 7 is a schematic diagram of a network training framework provided by an exemplary embodiment of the present application;
FIG. 8 is a block diagram of an apparatus for screening data provided in an exemplary embodiment of the present application;
FIG. 9 is a block diagram of an apparatus for screening data provided in another exemplary embodiment of the present application;
fig. 10 is a schematic structural diagram of a server according to an exemplary embodiment of the present application.
Detailed Description
To make the objects, technical solutions and advantages of the present application more clear, embodiments of the present application will be described in further detail below with reference to the accompanying drawings.
First, terms referred to in the embodiments of the present application are briefly described:
artificial Intelligence (AI) is a theory, method, technique and application system that uses a digital computer or a machine controlled by a digital computer to simulate, extend and expand human Intelligence, perceive the environment, acquire knowledge and use the knowledge to obtain the best results. In other words, artificial intelligence is a comprehensive technique of computer science that attempts to understand the essence of intelligence and produce a new intelligent machine that can react in a manner similar to human intelligence. Artificial intelligence is the research of the design principle and the realization method of various intelligent machines, so that the machines have the functions of perception, reasoning and decision making.
Artificial intelligence technology is a comprehensive discipline involving a wide range of fields, covering both hardware-level and software-level technologies. Basic artificial intelligence technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operation/interaction systems, mechatronics, and the like. Artificial intelligence software technologies mainly include computer vision, speech processing, natural language processing, machine learning/deep learning, automatic driving, intelligent transportation, and the like.
Machine Learning (ML) is a multi-field interdisciplinary subject involving probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory, and other disciplines. It specially studies how a computer simulates or realizes human learning behaviors to acquire new knowledge or skills and to reorganize existing knowledge structures so as to continuously improve its performance. Machine learning is the core of artificial intelligence and the fundamental way to make computers intelligent, and it is applied in all fields of artificial intelligence. Machine learning and deep learning generally include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and learning from demonstration.
Computer Vision (CV) is a science that studies how to make machines "see"; it refers to using cameras and computers instead of human eyes to perform machine vision tasks such as recognition, tracking, and measurement on targets, and to further process the images into a form more suitable for human eyes to observe or for transmission to instruments for detection. As a scientific discipline, computer vision studies related theories and techniques in an attempt to build artificial intelligence systems that can capture information from images or multidimensional data. Computer vision technologies generally include image processing, image recognition, image semantic understanding, image retrieval, OCR, video processing, video semantic understanding, video content/behavior recognition, three-dimensional object reconstruction, 3D technology, virtual reality, augmented reality, simultaneous localization and mapping, automatic driving, intelligent transportation, and the like, as well as common biometric technologies such as face recognition and fingerprint recognition.
Natural Language Processing (NLP) is an important direction in the fields of computer science and artificial intelligence. It studies various theories and methods that enable effective communication between humans and computers using natural language. Natural language processing is a science integrating linguistics, computer science, and mathematics; research in this field involves natural language, i.e., the language people use every day, so it is closely related to the study of linguistics. Natural language processing techniques typically include text processing, semantic understanding, machine translation, question answering, knowledge graphs, and the like.
The data screening method provided by the embodiment of the present application involves machine learning technology in artificial intelligence and can be applied to computer vision scenarios and natural language processing scenarios. The data screening method is mainly used in network training for classification tasks, for example, an image classification task in a computer vision scenario or a text classification task in a natural language processing scenario. Illustratively, the data screening method provided by the embodiment of the present application is explained as applied to the following scenarios:
Firstly, the data screening method is applied to image recognition. Taking an application scenario of medical image processing as an example, in order to automatically recognize lesion information in medical images through artificial intelligence, a recognition model for recognizing medical images needs to be trained, and the recognition model corresponds to a recognition task. Taking melanoma recognition as an example, during the training of the recognition model, sample images for training are acquired and labeled as "healthy/benign/malignant" to obtain training images. Due to limits of manual experience, some training images are mislabeled during annotation. In the training process, a self-encoder is trained on the training images, and the training images are input into the self-encoder to obtain their reconstruction errors; a network to be trained is generated according to the melanoma recognition task, the training images are input into the network to be trained to obtain their cross entropy losses relative to the network, and the network parameters are trained through the cross entropy loss. Noise reduction images for iterative training are then determined from the training images according to the reconstruction errors and the cross entropy losses, so that the noise reduction image set contains fewer mislabeled images than the full training image set, and a network model capable of melanoma recognition is finally obtained. A medical image to be recognized is input into the network model, which judges the skin condition indicated by the image and outputs one of healthy, benign melanoma, or malignant melanoma.
Secondly, the data screening method is applied to text recognition. Taking an application scenario of automatic spam detection as an example, artificial intelligence performs the function of automatically recognizing received text information and filtering out spam. A model to be trained for classifying input text information is established, and a certain number of training samples are acquired; the training samples are labeled by annotators as spam or not, for example, texts carrying words such as "discount", "consumption", and "lottery" are labeled as spam. The training samples are input into the model to be trained and the self-encoder for data screening and model training, finally yielding a spam recognition model with high recognition accuracy. The spam recognition model can be applied to scenarios such as filtering spam short messages and recognizing spam in social media.
Thirdly, the data screening method is applied to speech recognition. Taking an application scenario of speech-to-text as an example, artificial intelligence performs the function of converting speech information into text information. In the process of converting speech into text, the speech features recognized in the speech data need to be classified according to pronunciation, which corresponds to a speech classification task. During model training, manually labeled training samples are input into the model to be trained and the self-encoder for data screening and model training, finally yielding a speech recognition model with high recognition accuracy. The speech recognition model can be applied to scenarios such as converting speech messages into text in social software or speech input in an input method.
The data screening method may also be applied to other application scenarios; only the above three application scenarios are described here, and the specific application scenario is not limited.
The implementation environment of the embodiments of the present application is described with reference to the above noun explanations and application scenarios. Referring to fig. 1, the implementation environment includes a terminal 101, a server 102, and a communication network 103.
The terminal 101 may be an electronic device such as a mobile phone, a tablet computer, an e-book reader, a multimedia playing device, a wearable device, a laptop portable computer, a desktop computer, or an image/text/voice recognition all-in-one machine. Illustratively, an annotator labels samples for model training through the terminal 101 to obtain initial data and uploads the initial data to the server 102, and the server 102 trains the network to be trained with the initial data. The terminal 101 is further configured to input test data and upload it to the server 102, where the test data is recognized by a classification network and the server 102 returns the recognition result to the terminal 101; the classification network is obtained by denoising and training the network to be trained through the initial data.
The server 102 is configured to provide a model training function for the terminal 101. In the process of iteratively training the model, noise data in the initial data is screened out so that the data used in each training iteration contains less noise data, which reduces the influence of noise data on model training and improves model accuracy, finally yielding a target model capable of processing the classification task. The server 102 may transmit the target model to the terminal 101; alternatively, the server 102 may receive data to be recognized from the terminal 101, complete the recognition of that data with the target model on the server, and return only the recognition result to the terminal 101.
It should be noted that the server 102 may be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, Content Delivery Network (CDN), and big data and artificial intelligence platforms.
Cloud Technology is a hosting technology that unifies a series of resources such as hardware, software, and networks in a wide area network or a local area network to realize the computation, storage, processing, and sharing of data. Cloud technology is a general term for the network technology, information technology, integration technology, management platform technology, application technology, and the like applied on the basis of the cloud computing business model; it can form a resource pool that is used on demand, flexibly and conveniently. Cloud computing technology will become an important support: background services of technical network systems, such as video websites, image websites, and portal websites, require a large amount of computing and storage resources. With the rapid development and application of the internet industry, each item may come to have its own identification mark that needs to be transmitted to a background system for logical processing; data at different levels are processed separately, and all kinds of industry data require strong system background support, which can only be realized through cloud computing.
In some embodiments, the server 102 may also be implemented as a node in a blockchain system. The Blockchain (Blockchain) is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, a consensus mechanism, an encryption algorithm and the like. The block chain, which is essentially a decentralized database, is a string of data blocks associated by using a cryptographic method, and each data block contains information of a batch of network transactions, which is used for verifying the validity (anti-counterfeiting) of the information and generating a next block. The blockchain may include a blockchain underlying platform, a platform product services layer, and an application services layer.
The blockchain underlying platform can comprise processing modules such as user management, basic services, smart contracts, and operation monitoring. The user management module is responsible for the identity information management of all blockchain participants, including the generation and maintenance of public and private keys (account management), key management, maintenance of the correspondence between users' real identities and blockchain addresses (authority management), and, when authorized, supervision and auditing of the transactions of certain real identities as well as rule configuration for risk control (risk-control auditing). The basic service module is deployed on all blockchain node devices and used to verify the validity of service requests and record valid requests to storage after consensus is completed; for a new service request, the basic service first performs interface adaptation analysis and authentication (interface adaptation), then encrypts the service information through a consensus algorithm (consensus management), transmits it completely and consistently to the shared ledger after encryption (network communication), and records and stores it. The smart contract module is responsible for contract registration and issuance, contract triggering, and contract execution; developers can define contract logic through a programming language, publish it to the blockchain (contract registration), and invoke keys or other events to trigger execution according to the logic of the contract clauses, completing the contract logic, while the module also provides functions for upgrading and canceling contracts. The operation monitoring module is mainly responsible for deployment, configuration modification, contract setting, and cloud adaptation during product release, as well as visual output of the real-time status during product operation.
Illustratively, the terminal 101 and the server 102 are connected via a communication network 103.
Referring to fig. 2, a data screening method according to an embodiment of the present application is shown. In the embodiment of the present application, the method is described as applied to the server shown in fig. 1; illustratively, the method may also be implemented in a terminal as a functional module. The embodiment of the present application takes the server as an example and does not limit the specific implementation environment. The method includes:
step 201, initial data and a first classification network are obtained.
The first classification network is a network to be trained for processing classification tasks, and the initial data is data for training network parameters of the first classification network.
The initial data is training data labeled according to the classification task. Illustratively, the training data may come from a database in the server, that is, the database stores labeled training data; or it may be training data uploaded by the terminal, that is, annotators label the training data through the terminal and upload the labeled training data to the server as the initial data of the first classification network.
The initial data may be at least one of text data, image data, voice data, video data, and the like, which is not limited in the embodiment of the present application.
Illustratively, the network structure of the first classification network may be a network structure pre-stored in a database or a network structure acquired from a terminal. In some embodiments, the first classification network may be at least one of a Convolutional Neural Network (CNN), a VGG (Visual Geometry Group) network, a Deep Residual Network (ResNet), and other networks capable of handling classification tasks.
The classification task corresponding to the first classification network may be a task stored in a database in correspondence with the network configuration of the first classification network, or may be a task instructed by the terminal. The classification task may be at least one of an image classification task, a text classification task, a voice classification task, a video classification task, and the like, where a classification target indicated by the classification task has a corresponding relationship with the initial data, for example, if the classification target indicated by the classification task is to classify an image, the initial data is image data.
The initial network parameter of the first classification network may be a randomly initialized parameter or a preset parameter, which is not limited herein. The number of the first classification networks may be one or more.
Step 202, based on a prediction result obtained by performing classification prediction on the initial data by the first classification network, determining a cross entropy loss corresponding to the initial data.
The cross-entropy loss of the initial data is used to measure the uncertainty of the initial data. After the initial data is input into the first classification network, classification prediction is carried out through the first classification network to obtain a prediction result, and at the moment, the parameter of the first classification network is the initial network parameter. The cross entropy loss corresponding to the initial data can be obtained through the prediction result.
In some embodiments, the cross entropy loss $L_{CE}$ is obtained by formula one, where $p$ represents the true value ($p$ is a one-hot vector in formula one) and $q$ represents the predicted value.
Formula one: $L_{CE} = -\sum_{i} p_i \log q_i$
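As a minimal numeric illustration of formula one (a sketch; the small epsilon guard and the example probabilities are assumptions, not part of the patent):

```python
import numpy as np

def cross_entropy_loss(p_onehot: np.ndarray, q_pred: np.ndarray,
                       eps: float = 1e-12) -> float:
    """Formula one: L_CE = -sum_i p_i * log(q_i), where p is the one-hot
    true value and q is the predicted class distribution."""
    return float(-np.sum(p_onehot * np.log(q_pred + eps)))

# True class is index 1; the network predicts it with probability 0.7.
p = np.array([0.0, 1.0, 0.0])
q = np.array([0.2, 0.7, 0.1])
print(cross_entropy_loss(p, q))  # ~0.357, i.e. -log(0.7)
```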
Illustratively, after the cross entropy loss is determined according to the prediction result obtained by the first classification network performing classification prediction on the initial data, the parameters of the first classification network are updated according to the cross entropy loss, realizing the first round of training of the first classification network.
Step 203, performing feature sampling on the initial data, and determining reconstruction error information corresponding to the initial data.
The reconstruction error information is used to indicate the degree of association between the initial data and the classification task.
In some embodiments, the feature sampling of the initial data is performed by a self-encoder, and the reconstruction error information corresponding to the initial data is determined from the output of the self-encoder. Illustratively, the self-encoder may be at least one of a linear self-encoder, a sparse self-encoder, a stacked self-encoder, a denoising self-encoder, and the like. The self-encoder realizes compression and reconstruction of the initial data.
The self-encoder is trained on an initial data set that includes all of the initial data used for training the first classification network. Illustratively, the training process of the self-encoder includes: acquiring an initial self-encoder; inputting the initial data into the initial self-encoder to obtain an encoding result; and performing supervised training on the initial self-encoder according to the reconstruction error between the initial data and the encoding result to obtain the self-encoder. In one example, the initial data I is input into the initial self-encoder M, which predicts I′ as the encoding result; the initial self-encoder M is supervised-trained through the reconstruction error, finally yielding the self-encoder, where the reconstruction error $L_{REC}$ is calculated by formula two.
Formula two: $L_{REC} = |I - I'|$
Reconstruction error information corresponding to the initial data is determined based on the sampling result obtained by the trained self-encoder performing feature sampling on the initial data. As shown in fig. 3, the initial data I is input into the self-encoder 300, which outputs the sampling result I′; the self-encoder 300 compresses and reconstructs the initial data I. Taking image input as an example, the initial input image passes through the self-encoder, which performs a preset number of down-sampling and up-sampling operations to output a predicted image, and the reconstruction error information of the initial input image is determined from the difference between the predicted image and the initial input image.
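For concreteness, the following is a minimal sketch of such a self-encoder in PyTorch; the two down-sampling and two up-sampling steps, layer sizes, optimizer, and stand-in training data are illustrative assumptions rather than the architecture specified by the patent:

```python
import torch
import torch.nn as nn

class SimpleAutoEncoder(nn.Module):
    """Minimal convolutional self-encoder: two down-sampling steps followed by
    two up-sampling steps, so the output I' has the same shape as the input I."""

    def __init__(self, channels: int = 3):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(channels, 16, 3, stride=2, padding=1), nn.ReLU(),  # 64 -> 32
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),        # 32 -> 16
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),       # 16 -> 32
            nn.ConvTranspose2d(16, channels, 4, stride=2, padding=1), nn.Sigmoid(),  # 32 -> 64
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.decoder(self.encoder(x))

# Supervised training of the initial self-encoder M on the initial data set.
model = SimpleAutoEncoder()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
images = torch.rand(8, 3, 64, 64)  # stand-in for a batch of initial data I

for _ in range(10):
    recon = model(images)                         # I' = M(I)
    loss = torch.mean(torch.abs(images - recon))  # formula two: L_REC = |I - I'|
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

The L1 loss inside the loop is exactly formula two, so after training the per-sample reconstruction error of this model can be reused as the reconstruction error information of step 203.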
Step 204, screening the initial data based on the cross entropy loss and the reconstruction error information to obtain noise reduction data, where the noise reduction data is used for iteratively training the first classification network to obtain a second classification network for processing the classification task.
The method of screening the initial data through the cross entropy loss and the reconstruction error information to obtain the noise reduction data includes, but is not limited to, one of the following:
and (I) sequencing the initial data in the initial data set according to the cross entropy loss and the reconstruction error information respectively so as to screen the initial data. Namely, the initial data in the initial data set is sorted through the cross entropy loss to obtain a cross entropy queue, and the initial data in the initial data set is sorted through the reconstruction error information to obtain a reconstruction error queue, wherein the cross entropy queue and the reconstruction error queue are both arranged in a reverse order (namely, the lower the numerical value, the higher the ranking is). Then, a preset amount of initial data is obtained from the cross entropy queue and the reconstruction error queue respectively, for example, initial data located 25% of the front of the queue is obtained from the cross entropy queue, initial data located 25% of the front of the queue is obtained from the reconstruction error queue, and the initial data are integrated to obtain noise reduction data for iterative training.
(2) Calculating value information of the initial data through the cross entropy loss and the reconstruction error information, and sorting the initial data in the initial data set according to the value information so as to screen the initial data, where the value information measures the training value of the initial data relative to the first classification network. Illustratively, the cross entropy loss and the reconstruction error information corresponding to the initial data are added according to a preset weight relationship to obtain the value information, the initial data in the initial data set is sorted by the value information, and the noise reduction data is obtained by screening the sorted initial data according to a preset proportion. Taking the case where the two weights are equal as an example, the value information $L_{Joint}$ is calculated by formula three, where $L_{REC}$ represents the reconstruction error information and $L_{CE}$ represents the cross entropy loss.
Formula three: $L_{Joint} = L_{REC} + L_{CE}$
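Formula three can be written with optional weights to express the preset weight relationship; equal weights recover the formula as stated (names are illustrative):

```python
import numpy as np

def joint_value(l_rec: np.ndarray, l_ce: np.ndarray,
                w_rec: float = 1.0, w_ce: float = 1.0) -> np.ndarray:
    """Formula three under a preset weight relationship; equal weights (1.0)
    recover L_Joint = L_REC + L_CE. Lower values mean higher training value."""
    return w_rec * l_rec + w_ce * l_ce
```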
In the embodiment of the present application, taking the case where the number of first classification networks is one as an example, the corresponding training framework is shown in fig. 4. The initial data 401 is input into the self-encoder 410 and the first classification network 420 respectively; the reconstruction error information corresponding to the initial data 401 is determined by the self-encoder 410, and the cross entropy loss corresponding to the initial data 401 is determined by the first classification network 420. Noise reduction data 1 is determined from the reconstruction error information and the cross entropy loss corresponding to the initial data 401, and is input into the first classification network 420 and the self-encoder 410 respectively; the network parameters of the first classification network 420 are updated through the cross entropy loss corresponding to the initial data 401. Noise reduction data 1 is then screened in the same manner as above to obtain noise reduction data 2, which is input into the first classification network 420. The iterative training of the first classification network and the screening of the data input into the network proceed simultaneously, finally training a second classification network capable of processing the classification task.
In summary, in the data screening method provided by the embodiment of the present application, during the training of a neural network for a classification task, the initial data is input into the network to be trained, and the cross entropy loss corresponding to the initial data is obtained according to the output prediction result; meanwhile, feature sampling is performed on the initial data, and reconstruction error information indicating the degree of association between the initial data and the classification task is determined. The initial data is screened based on the cross entropy loss and the reconstruction error information to obtain noise reduction data, and the noise reduction data is used for iteratively training the first classification network to obtain a second classification network for realizing the classification task. That is, by combining the cross entropy loss with the reconstruction error information, the initial data is screened jointly from the uncertainty dimension and the representativeness dimension to obtain noise reduction data for network training, helping the network resist the interference of noise data during training.
Referring to fig. 5, a data screening method according to an embodiment of the present application is shown, in which the screening process of the initial data is explained. The method includes:
step 501, initial data and a first classification network are obtained.
The first classification network is a network to be trained for processing classification tasks, and the initial data is data for training network parameters of the first classification network. The initial data is training data labeled according to the classification task. The initial data may be at least one of text data, image data, voice data, video data, and the like, which is not limited in the embodiment of the present application.
The initial network parameter of the first classification network may be a randomly initialized parameter or a preset parameter, which is not limited herein. The number of the first classification networks may be one or more. In some embodiments, the first classification network may be at least one of a convolutional neural network, a VGG network, a deep residual network, etc., capable of handling classification tasks.
The classification task may be at least one of an image classification task, a text classification task, a voice classification task, a video classification task, and the like, where a classification target indicated by the classification task has a corresponding relationship with the initial data, for example, if the classification target indicated by the classification task is to classify an image, the initial data is image data.
Step 502, based on a prediction result obtained by performing classification prediction on the initial data by the first classification network, determining a cross entropy loss corresponding to the initial data.
The cross entropy loss of the initial data is used to measure the uncertainty of the initial data. Illustratively, the first classification network performs classification prediction on the initial data and outputs a prediction result corresponding to the current network parameters, and the cross entropy loss of the initial data is determined according to the prediction result, where the cross entropy loss is calculated by formula one.
When the cross entropy loss is determined, the first classification network updates its parameters according to the cross entropy loss to obtain a training sub-network.
Step 503, determining reconstruction error information corresponding to the initial data based on a sampling result obtained by performing feature sampling on the initial data by the self-encoder.
The reconstruction error information is used to indicate the degree of association between the initial data and the classification task. The self-encoder may be at least one of a linear self-encoder, a sparse self-encoder, a stacked self-encoder, a denoising self-encoder, and the like. The self-encoder realizes compression and reconstruction of the initial data, and is trained on an initial data set that includes all of the initial data used for training the first classification network.
The initial data is input into the self-encoder for feature sampling, the self-encoder outputs a sampling result, and the reconstruction error information is determined according to the difference between the sampling result and the initial data. In one example, the reconstruction error information is the reconstruction error calculated by formula two.
Step 504, determining value information corresponding to the initial data based on the cross entropy loss and the reconstruction error information.
The value information is used to measure the training value of the initial data relative to the first classification network. Illustratively, the cross entropy loss and the reconstruction error information corresponding to the initial data are added according to a preset weight relationship to obtain the value information. The preset weight relationship may be stored in the database in correspondence with the first classification network, or may be input by the terminal. In some embodiments, the classification task carries information indicating the preset weight relationship; the preset weight relationship may also be set manually according to the actual emphasis required.
In one example, the two weights are the same and the value information is calculated by formula three; the larger the value corresponding to the value information, the lower the training value of the initial data relative to the first classification network.
Step 505, sorting the initial data according to the value information.
The initial data is arranged in reverse order according to the value information to obtain a corresponding data queue.
Step 506, screening the sorted initial data according to a preset proportion to obtain noise reduction data.
In some embodiments, the preset proportion is indicated by the classification task; for example, if the preset proportion is 25% of the total data, the front 25% of the initial data is taken from the data queue as the noise reduction data. Illustratively, the preset proportion may also be determined according to the proportion of noise data in the initial data set, where the proportion of noise data is negatively correlated with the preset proportion; that is, the more noise data the initial data contains, the lower the preset proportion is set and the more noise data is screened out each time.
The noise reduction data obtained through screening is used for iteratively training the first classification network. Illustratively, the noise reduction data is input into the first classification network and steps 502 to 506 are repeated; that is, the first classification network is trained through the noise reduction data while the noise reduction data is screened again according to its reconstruction error information and cross entropy loss to obtain noise reduction data with a lower noise ratio. The above steps are repeated until the first classification network is trained to convergence, obtaining the second classification network capable of processing the classification task.
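Putting steps 502 to 506 together, the iterative screening loop might look like the sketch below; the per-sample losses are random stand-ins for what the first classification network and the self-encoder would actually return, and the keep fraction and round count are assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
data_idx = np.arange(1000)     # indices into the initial data set
keep_frac, n_rounds = 0.75, 4  # would follow the noise ratio / task in practice

for _ in range(n_rounds):
    # Stand-ins for the per-sample cross entropy loss and reconstruction error.
    l_ce = rng.random(len(data_idx))
    l_rec = rng.random(len(data_idx))
    l_joint = l_rec + l_ce           # formula three (equal weights)
    order = np.argsort(l_joint)      # lower value ranks higher
    data_idx = data_idx[order[: int(len(order) * keep_frac)]]
    # ...train the first classification network on data_idx here...

print(len(data_idx))  # 315 kept after four screening rounds
```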
In summary, in the data screening method provided by the embodiment of the present application, during the training of a neural network for a classification task, the initial data is input into the network to be trained, and the cross entropy loss corresponding to the initial data is obtained according to the output prediction result; meanwhile, feature sampling is performed on the initial data, and reconstruction error information indicating the degree of association between the initial data and the classification task is determined. The initial data is screened based on the cross entropy loss and the reconstruction error information to obtain noise reduction data, and the noise reduction data is used for iteratively training the first classification network to obtain a second classification network for realizing the classification task. That is, by combining the cross entropy loss with the reconstruction error information, the initial data is screened jointly from the uncertainty dimension and the representativeness dimension to obtain noise reduction data for network training, helping the network resist the interference of noise data during training.
Referring to fig. 6, a data screening method according to an embodiment of the present application is shown. The embodiment of the present application takes the case where the number of first classification networks is 2 as an example to describe how data screening improves the generalization of the networks. The method includes:
step 601, initial data, a first classification sub-network and a second classification sub-network are obtained.
In the embodiment of the application, the first classification network comprises a first classification sub-network and a second classification sub-network; the two sub-networks have different initialization parameters, and the classification tasks for the first classification sub-network and the second classification sub-network are the same.
Step 602, determining a first cross entropy loss corresponding to the initial data based on a first prediction result obtained by performing classification prediction on the initial data by the first classification sub-network.
After the initial data are input into the first classification sub-network, classification prediction is carried out through the first classification sub-network to obtain a first prediction result, and a first cross entropy loss is obtained according to the cross entropy between the first prediction result and the input initial data.
Step 603, determining a second cross entropy loss corresponding to the initial data based on a second prediction result obtained by performing classification prediction on the initial data by the second classification sub-network.
After the initial data are input into the second classification sub-network, classification prediction is carried out through the second classification sub-network to obtain a second prediction result, and a second cross entropy loss is obtained according to the cross entropy between the second prediction result and the input initial data.
In some embodiments, the cross-entropy loss is determined based on the first cross-entropy loss and the second cross-entropy loss. That is, the cross entropy loss corresponding to the initial data is determined jointly by the first cross entropy loss and the second cross entropy loss, for example, the first cross entropy loss and the second cross entropy loss of the initial data are weighted and summed according to a preset weight, so as to obtain the cross entropy loss.
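One hedged reading of this weighted combination (the weights w1 and w2 are illustrative; the patent only states that a preset weight is used):

```python
import numpy as np

def combined_cross_entropy(l_ce1: np.ndarray, l_ce2: np.ndarray,
                           w1: float = 0.5, w2: float = 0.5) -> np.ndarray:
    """Weighted sum of the per-sample cross entropy losses of the two
    classification sub-networks."""
    return w1 * l_ce1 + w2 * l_ce2
```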
Step 604, performing feature sampling on the initial data, and determining reconstruction error information corresponding to the initial data.
The reconstruction error information is used to indicate the degree of correlation between the original data and the classification task. In the embodiment of the application, the characteristic sampling of the initial data is realized through the self-encoder, and the reconstruction error information corresponding to the initial data is determined through the output of the self-encoder. The self-encoder is obtained by training all initial data, and reconstruction error information corresponding to the initial data is determined based on a sampling result obtained by performing feature sampling on the initial data by the self-encoder obtained by training.
Step 605, combining the first cross entropy loss and the reconstruction error information to obtain first value information of the initial data.
The first value information is used to measure the training value of the initial data relative to the first classification sub-network. Illustratively, the first cross entropy loss and the reconstruction error information are subjected to weighted addition according to a preset weight relationship to obtain first value information.
Step 606, combining the second cross entropy loss and the reconstruction error information to obtain second value information of the initial data.
The second value information is used to measure the training value of the initial data relative to the second classification sub-network. Illustratively, the second cross entropy loss and the reconstruction error information are subjected to weighted addition according to a preset weight relationship to obtain second value information.
Illustratively, the initial data is sorted based on the first value information and the second value information. In some embodiments, the initial data may be ranked according to the first value information and the second value information together, or the initial data may be ranked according to the first value information and the second value information respectively.
Step 607, sorting the initial data based on the first value information to obtain a first data queue.
Illustratively, the first value information is sorted in reverse order; that is, the smaller the value corresponding to the first value information, the higher the training value of the initial data relative to the first classification sub-network and the higher the corresponding ranking.
Step 608, sorting the initial data based on the second value information to obtain a second data queue.
Illustratively, the second value information is sorted in reverse order; that is, the smaller the value corresponding to the second value information, the higher the training value of the initial data relative to the second classification sub-network and the higher the corresponding ranking.
Step 609, obtaining first noise reduction data from the first data queue according to the first preset proportion.
The proportion of noise data in the initial data set is negatively correlated with the preset proportion; that is, the more noise data the initial data contains, the lower the preset proportion is set and the more noise data is screened out each time.
Step 610, acquiring second noise reduction data from the second data queue according to a second preset proportion.
In some embodiments, the first preset proportion for screening the first data queue and the second preset proportion for screening the second data queue may be the same or different. Illustratively, when they differ, the first and second preset proportions are determined according to the weights of the first and second classification sub-networks in the classification task. For example, when the classification task indicates that the output of the first network carries a higher weight than the output of the second network when performing the classification task, the first preset proportion is set higher than the second preset proportion, where the first network is the network trained from the first classification sub-network and the second network is the network trained from the second classification sub-network.
Illustratively, the noise reduction data is generated from the first noise reduction data and the second noise reduction data. The first and second noise reduction data may jointly train both sub-networks iteratively, i.e., both are input into the first classification sub-network and into the second classification sub-network for iterative training; alternatively, they may be cross-input, i.e., the first noise reduction data is input into the second classification sub-network for iterative training while the second noise reduction data is input into the first classification sub-network for iterative training.
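Combining steps 605 to 610 with the cross-input variant just described, a compact sketch (the equal weights, fractions, and names are assumptions):

```python
import numpy as np

def dual_screen(l_ce1, l_ce2, l_rec, frac1=0.25, frac2=0.25):
    """Rank each sub-network's value information (its cross entropy loss plus
    the shared reconstruction error) in ascending order and keep the front
    fraction of each queue as that network's noise reduction data."""
    q1 = np.argsort(l_ce1 + l_rec)   # first data queue
    q2 = np.argsort(l_ce2 + l_rec)   # second data queue
    d1 = q1[: int(len(q1) * frac1)]  # first noise reduction data
    d2 = q2[: int(len(q2) * frac2)]  # second noise reduction data
    return d1, d2

# Cross input (steps 611-612): d1 then trains the second updating network and
# d2 trains the first updating network, so each network's selection bias is
# not reinforced by its own picks.
```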
Step 611, determining a third cross entropy loss corresponding to the first denoising data based on a prediction result obtained by performing classification prediction on the first denoising data by the second updating network.
The second updating network is obtained by updating the parameters of the second classification sub-network according to the first cross entropy loss. In this embodiment of the present application, the first noise reduction data is input into the second updating network to determine the third cross entropy loss corresponding to the first noise reduction data.
Step 612, determining a fourth cross entropy loss corresponding to the second noise reduction data based on a prediction result obtained by performing classification prediction on the second noise reduction data by the first updating network.
The first updating network is obtained by updating the parameters of the first classification sub-network according to the second cross entropy loss. In this embodiment of the present application, the second noise reduction data is input into the first updating network to determine the fourth cross entropy loss corresponding to the second noise reduction data.
In this embodiment of the present application, the first noise reduction data and the second noise reduction data are cross-input for subsequent training of the network parameters, that is, the first noise reduction data is used to train the network parameters of the second updating network, and the second noise reduction data is used to train the network parameters of the first updating network.
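A hedged sketch of steps 611 and 612, assuming the sub-networks are PyTorch classifiers that output logits; the variable names (`second_update_net`, `first_nr_x`, and so on) are illustrative, not from the original text.

```python
import torch
import torch.nn.functional as F

def per_sample_cross_entropy(model, data, labels):
    # Classification prediction followed by per-sample cross entropy
    # (reduction="none" keeps one loss value per sample).
    logits = model(data)
    return F.cross_entropy(logits, labels, reduction="none")

# Cross input: each sub-network's screened data supervises its peer.
# third_loss  = per_sample_cross_entropy(second_update_net, first_nr_x,  first_nr_y)
# fourth_loss = per_sample_cross_entropy(first_update_net,  second_nr_x, second_nr_y)
```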
Step 613, performing feature sampling on the first noise reduction data, and determining first reconstruction error information corresponding to the first noise reduction data.
The first noise reduction data is input into the self-encoder, which outputs a sampling result corresponding to the first noise reduction data; the first reconstruction error information is then determined from the relation between the sampling result and the input first noise reduction data, and is calculated according to formula II.
Step 614, performing feature sampling on the second noise reduction data, and determining second reconstruction error information corresponding to the second noise reduction data.
The second noise reduction data is input into the self-encoder, which outputs a sampling result corresponding to the second noise reduction data; the second reconstruction error information is then determined from the relation between the sampling result and the input second noise reduction data, and is calculated according to formula II.
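Formula II is defined earlier in the document and is not reproduced here; a common instantiation of such a reconstruction error is the per-sample squared residual between the input and the auto-encoder's output, which the sketch below assumes purely for illustration.

```python
import torch

def reconstruction_error(autoencoder, x):
    # Feature sampling: the auto-encoder reconstructs the input, and the
    # per-sample squared residual serves as the reconstruction error
    # (an assumed stand-in for the document's formula II).
    with torch.no_grad():
        x_hat = autoencoder(x)
    return ((x - x_hat) ** 2).flatten(1).mean(dim=1)
```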
Step 615, screening the first noise reduction data based on the third cross entropy loss and the first reconstruction error information to obtain third noise reduction data.
The third noise reduction data is used for iterative training of the first updating network. In this embodiment of the present application, the method for screening the third noise reduction data from the first noise reduction data is the same as the method for screening the first noise reduction data from the initial data, and is not described herein again.
Step 616, screening the second noise reduction data based on the fourth cross entropy loss and the second reconstruction error information to obtain fourth noise reduction data.
The fourth noise reduction data is used for iterative training of the second updating network. In this embodiment of the present application, the method for screening the fourth noise reduction data from the second noise reduction data is the same as the method for screening the second noise reduction data from the initial data, and is not described herein again.
Schematically, fig. 7 illustrates a network training framework provided in an embodiment of the present application. Initial data 701 is input into an auto-encoder 710, a first classification sub-network 720, and a second classification sub-network 730. First noise reduction data is determined from the reconstruction error information produced by the auto-encoder 710 and the first cross entropy loss produced by the first classification sub-network 720, and is input into a second updating network 731, where the second updating network 731 is obtained by updating the parameters of the second classification sub-network 730 according to the first cross entropy loss. Second noise reduction data is determined from the reconstruction error information produced by the auto-encoder 710 and the second cross entropy loss produced by the second classification sub-network 730, and is input into a first updating network 721, where the first updating network 721 is obtained by updating the parameters of the first classification sub-network 720 according to the second cross entropy loss. The data screening and network training process is then repeated until the two networks converge, yielding a first network 722 and a second network 732 for processing classification tasks.
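A compact outline of the fig. 7 loop, reusing the helpers sketched above. The plain sum used as value information and the fixed round count standing in for a convergence test are simplifications, and `train_one_epoch` is a hypothetical helper; none of this is prescribed by the original text.

```python
def co_train(x, y, net_a, net_b, autoencoder, rounds=10, keep_ratio=0.8):
    for _ in range(rounds):
        rec_err = reconstruction_error(autoencoder, x)
        # Value information per sub-network (a plain sum is assumed here;
        # the document's own combination rule is given by its formulas).
        value_a = (per_sample_cross_entropy(net_a, x, y) + rec_err).detach()
        value_b = (per_sample_cross_entropy(net_b, x, y) + rec_err).detach()
        keep_a = torch.argsort(value_a)[: int(keep_ratio * len(value_a))]
        keep_b = torch.argsort(value_b)[: int(keep_ratio * len(value_b))]
        # Cross input: data screened via net_a updates net_b, and vice versa.
        train_one_epoch(net_b, x[keep_a], y[keep_a])  # hypothetical helper
        train_one_epoch(net_a, x[keep_b], y[keep_b])  # hypothetical helper
    return net_a, net_b
```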
In the embodiment of the present application, a first network and a second network are finally obtained by training according to the provided data screening method. The first network and the second network process the same classification task, and their network parameters may be the same or different. When the parameters of the two networks differ, a target network for a specific application can be selected according to the classes that each network's parameters handle best. Taking an image classification task as an example, if the trained first network classifies cats more accurately while the second network classifies dogs more accurately, the first network is selected as the target network when the target application is cat recognition.
Optionally, the first network and the second network may be used together as the target network, that is, the recognition result of the first network and the recognition result of the second network are combined to obtain a target recognition result. For example, the data to be recognized is input into the first network to obtain a first result and into the second network to obtain a second result, and the first result and the second result are combined with different weights to obtain the target result. The specific application of the first network and the second network is not limited here.
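One possible realization of the weighted combination described above is a weighted average of the two networks' softmax outputs; the 0.6/0.4 split below is purely illustrative.

```python
def ensemble_predict(x, net_a, net_b, w_a=0.6, w_b=0.4):
    # Weighted combination of the two recognition results; the weights
    # are illustrative and would be tuned for the target application.
    with torch.no_grad():
        probs = w_a * net_a(x).softmax(dim=1) + w_b * net_b(x).softmax(dim=1)
    return probs.argmax(dim=1)
```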
In summary, in the data screening method provided in this embodiment of the present application, during the training of a neural network for a classification task, the initial data is input into a first classification sub-network and a second classification sub-network, and a first cross entropy loss and a second cross entropy loss corresponding to the initial data are obtained from the output first prediction result and second prediction result. Feature sampling is performed on the initial data to determine reconstruction error information capable of indicating the degree of association between the initial data and the classification task. The initial data is screened based on the first cross entropy loss and the reconstruction error information to obtain first noise reduction data, and based on the second cross entropy loss and the reconstruction error information to obtain second noise reduction data; the first noise reduction data is used to train the network parameters of the second classification sub-network, and the second noise reduction data is used to train the network parameters of the first classification sub-network. The first noise reduction data is further screened to obtain third noise reduction data, and the second noise reduction data is further screened to obtain fourth noise reduction data, through which the two networks to be trained are cross-trained. The data screening and the cross training of the network parameters are repeated until the two networks converge, yielding two networks capable of processing the classification task. That is, by combining cross entropy loss with reconstruction error information, the initial data is screened jointly along the uncertainty dimension and the representativeness dimension to obtain noise reduction data for network training, which helps the networks resist the interference of noise data during training, while the cross training scheme improves the generalization of the networks.
The following are embodiments of the apparatus of the present application that may be used to perform embodiments of the method of the present application. For details which are not disclosed in the embodiments of the apparatus of the present application, reference is made to the embodiments of the method of the present application.
Fig. 8 is a block diagram illustrating a data screening apparatus according to an embodiment of the present application. The apparatus has functions for implementing the above method examples, and the functions may be implemented by hardware or by hardware executing corresponding software. The apparatus may include:
an obtaining module 810, configured to obtain initial data and a first classification network, where the first classification network is a network to be trained for processing a classification task;
a determining module 820, configured to determine, based on a prediction result obtained by performing classification prediction on the initial data by using the first classification network, a cross entropy loss corresponding to the initial data;
the determining module 820 is further configured to perform feature sampling on the initial data, and determine reconstruction error information corresponding to the initial data, where the reconstruction error information is used to indicate a degree of association between the initial data and the classification task;
a screening module 830, configured to screen the initial data based on the cross entropy loss and the reconstruction error information to obtain noise reduction data, where the noise reduction data is used to perform iterative training on the first classification network to obtain a second classification network for processing the classification task.
In some optional embodiments, as shown in fig. 9, the screening module 830 further includes:
a determining unit 831, configured to determine, based on the cross entropy loss and the reconstruction error information, value information corresponding to the initial data, where the value information is used to measure a training value of the initial data relative to the first classification network;
a sorting unit 832 for sorting the initial data according to the value information;
a screening unit 833, configured to screen the sorted initial data according to a preset ratio to obtain the noise reduction data.
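A structural sketch, illustrative only, of how the module split of figs. 8 and 9 might look in code, reusing the helpers from the earlier sketches; it mirrors the determining unit 831, the sorting unit 832, and the screening unit 833.

```python
class DataScreeningDevice:
    def __init__(self, classifier, autoencoder):
        self.classifier = classifier    # supplied via the obtaining module 810
        self.autoencoder = autoencoder

    def determine(self, x, y):          # determining module 820
        ce = per_sample_cross_entropy(self.classifier, x, y)
        rec = reconstruction_error(self.autoencoder, x)
        return ce, rec

    def screen(self, x, y, keep_ratio):  # screening module 830
        ce, rec = self.determine(x, y)
        value = (ce + rec).detach()                   # determining unit 831
        order = torch.argsort(value)                  # sorting unit 832
        keep = order[: int(keep_ratio * len(order))]  # screening unit 833
        return x[keep], y[keep]
```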
In some optional embodiments, the first classification network comprises a first classification sub-network and a second classification sub-network, the first classification sub-network and the second classification sub-network having different initialization parameters;
the determining module 820 is further configured to determine a first cross entropy loss corresponding to the initial data based on a first prediction result obtained by performing classification prediction on the initial data by the first classification sub-network;
the determining module 820 is further configured to determine a second cross entropy loss corresponding to the initial data based on a second prediction result obtained by performing classification prediction on the initial data by the second classification sub-network;
the determining module 820 is further configured to determine the cross entropy loss based on the first cross entropy loss and the second cross entropy loss.
In some optional embodiments, the determining unit 831 is further configured to combine the first cross entropy loss and the reconstruction error information to obtain first value information of the initial data, where the first value information is used to measure a training value of the initial data relative to the first classification subnetwork;
the determining unit 831 is further configured to combine the second cross entropy loss and the reconstruction error information to obtain second value information of the initial data, where the second value information is used to measure a training value of the initial data relative to the second classification sub-network;
the sorting unit 832 is further configured to sort the initial data based on the first value information and the second value information.
In some optional embodiments, the sorting unit 832 is further configured to sort the initial data based on the first value information, so as to obtain a first data queue;
the sorting unit 832 is further configured to sort the initial data based on the second value information, so as to obtain a second data queue;
the screening unit 833 is further configured to obtain first noise reduction data from the first data queue according to a first preset proportion;
the screening unit 833 is further configured to obtain second noise reduction data from the second data queue according to a second preset proportion;
the filtering unit 833 is further configured to generate the noise reduction data according to the first noise reduction data and the second noise reduction data.
In some optional embodiments, the determining module 820 is further configured to determine a third cross entropy loss corresponding to the first noise reduction data based on a prediction result obtained by performing classification prediction on the first noise reduction data by using a second updating network, where the second updating network is a network obtained by performing parameter updating on the second classification sub-network according to the first cross entropy loss;
the determining module 820 is further configured to determine a fourth cross entropy loss corresponding to the second noise reduction data based on a prediction result obtained by performing classification prediction on the second noise reduction data by using a first updating network, where the first updating network is a network obtained by performing parameter updating on the first classification sub-network according to the second cross entropy loss;
the determining module 820 is further configured to perform feature sampling on the first noise reduction data, and determine first reconstruction error information corresponding to the first noise reduction data;
the determining module 820 is further configured to perform feature sampling on the second noise reduction data, and determine second reconstruction error information corresponding to the second noise reduction data;
the screening module 830 is further configured to screen the first noise reduction data based on the third cross entropy loss and the first reconstruction error information to obtain third noise reduction data, where the third noise reduction data is used to perform iterative training on the first updating network;
the screening module 830 is further configured to screen the second noise reduction data based on the fourth cross entropy loss and the second reconstruction error information to obtain fourth noise reduction data, where the fourth noise reduction data is used to perform iterative training on the second updating network.
In some optional embodiments, the obtaining module 810 is further configured to obtain an initial self-encoder;
the device further comprises:
a training module 840, configured to input the initial data into the initial self-encoder to obtain an encoding result;
the training module 840 is further configured to perform supervised training on the initial self-encoder according to a reconstruction error between the initial data and the encoding result, so as to obtain a self-encoder;
the determining module 820 is further configured to determine the reconstruction error information corresponding to the initial data based on a sampling result obtained by performing feature sampling on the initial data by the self-encoder.
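A minimal sketch of the training module 840's supervised training of the initial self-encoder, assuming a mean-squared reconstruction error as the supervision signal; the optimizer choice and hyper-parameters are assumptions.

```python
import torch
from torch import nn, optim

def train_autoencoder(initial_x, autoencoder, epochs=20, lr=1e-3):
    opt = optim.Adam(autoencoder.parameters(), lr=lr)
    mse = nn.MSELoss()
    for _ in range(epochs):
        x_hat = autoencoder(initial_x)   # encoding (reconstruction) result
        loss = mse(x_hat, initial_x)     # reconstruction error as supervision
        opt.zero_grad()
        loss.backward()
        opt.step()
    return autoencoder
```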
To sum up, in the training process of the neural network for the classification task, the initial data is input into the network to be trained, and the cross entropy loss corresponding to the initial data is obtained from the output prediction result. Meanwhile, feature sampling is performed on the initial data to determine reconstruction error information capable of indicating the degree of association between the initial data and the classification task. The initial data is then screened based on the cross entropy loss and the reconstruction error information to obtain noise reduction data, which is used to iteratively train the first classification network into a second classification network that realizes the classification task. That is, by combining cross entropy loss with reconstruction error information, the initial data is screened jointly along the uncertainty dimension and the representativeness dimension to obtain noise reduction data for network training, helping the network resist the interference of noise data during training.
It should be noted that: the data screening apparatus provided in the above embodiment is only illustrated by dividing the functional modules, and in practical applications, the functions may be distributed by different functional modules according to needs, that is, the internal structure of the device is divided into different functional modules, so as to complete all or part of the functions described above. In addition, the data screening apparatus and the data screening method provided in the above embodiments belong to the same concept, and specific implementation processes thereof are described in the method embodiments and are not described herein again.
Fig. 10 shows a schematic structural diagram of a server provided in an exemplary embodiment of the present application. Specifically, the structure includes the following.
The server 1000 includes a Central Processing Unit (CPU) 1001, a system Memory 1004 including a Random Access Memory (RAM) 1002 and a Read Only Memory (ROM) 1003, and a system bus 1005 connecting the system Memory 1004 and the Central Processing Unit 1001. The server 1000 also includes a mass storage device 1006 for storing an operating system 1013, application programs 1014, and other program modules 1015.
The mass storage device 1006 is connected to the central processing unit 1001 through a mass storage controller (not shown) connected to the system bus 1005. The mass storage device 1006 and its associated computer-readable media provide non-volatile storage for the server 1000. That is, the mass storage device 1006 may include a computer-readable medium (not shown) such as a hard disk or Compact disk Read Only Memory (CD-ROM) drive.
Without loss of generality, computer readable media may comprise computer storage media and communication media. Computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data. Computer storage media includes RAM, ROM, Erasable Programmable Read-Only Memory (EPROM), Electrically Erasable Programmable Read-Only Memory (EEPROM), flash memory or other solid state memory technology, CD-ROM, Digital Versatile Disks (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices. Of course, those skilled in the art will appreciate that computer storage media is not limited to the foregoing. The system memory 1004 and mass storage device 1006 described above may be collectively referred to as memory.
According to various embodiments of the present application, the server 1000 may also operate by connecting to a remote computer over a network, such as the Internet. That is, the server 1000 may be connected to the network 1012 through a network interface unit 1011 connected to the system bus 1005, or the network interface unit 1011 may be used to connect to other types of networks or remote computer systems (not shown).
The memory further includes one or more programs, and the one or more programs are stored in the memory and configured to be executed by the CPU.
Embodiments of the present application further provide a computer device, which includes a processor and a memory, where at least one instruction, at least one program, a code set, or an instruction set is stored in the memory, and the at least one instruction, the at least one program, the code set, or the instruction set is loaded and executed by the processor to implement the data screening method provided by the above method embodiments. Optionally, the computer device may be a terminal or a server.
Embodiments of the present application further provide a computer-readable storage medium, on which at least one instruction, at least one program, a code set, or an instruction set is stored, and the at least one instruction, the at least one program, the code set, or the instruction set is loaded and executed by a processor to implement the data screening method provided by the above method embodiments.
Embodiments of the present application also provide a computer program product or computer program comprising computer instructions stored in a computer-readable storage medium. The processor of a computer device reads the computer instructions from the computer-readable storage medium and executes them, causing the computer device to perform the data screening method described in any of the above embodiments.
Optionally, the computer-readable storage medium may include: a Read Only Memory (ROM), a Random Access Memory (RAM), a Solid State Drive (SSD), or an optical disc. The Random Access Memory may include a resistive Random Access Memory (ReRAM) and a Dynamic Random Access Memory (DRAM). The above-mentioned serial numbers of the embodiments of the present application are merely for description and do not represent the merits of the embodiments.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, where the program may be stored in a computer-readable storage medium, and the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
The above description is only exemplary of the present application and should not be taken as limiting, as any modification, equivalent replacement, or improvement made within the spirit and principle of the present application should be included in the protection scope of the present application.

Claims (10)

1. A method of screening data, the method comprising:
acquiring initial data and a first classification network, wherein the first classification network is a network to be trained for processing classification tasks;
determining cross entropy loss corresponding to the initial data based on a prediction result obtained by performing classification prediction on the initial data by the first classification network;
performing feature sampling on the initial data, and determining reconstruction error information corresponding to the initial data, wherein the reconstruction error information is used for indicating the correlation degree between the initial data and the classification task;
and screening the initial data based on the cross entropy loss and the reconstruction error information to obtain noise reduction data, wherein the noise reduction data is used for performing iterative training on the first classification network to obtain a second classification network for processing the classification task.
2. The method of claim 1, wherein the screening the initial data based on the cross entropy loss and the reconstruction error information to obtain noise reduction data comprises:
determining value information corresponding to the initial data based on the cross entropy loss and the reconstruction error information, wherein the value information is used for measuring the training value of the initial data relative to the first classification network;
sorting the initial data according to the value information;
and screening the sorted initial data according to a preset proportion to obtain the noise reduction data.
3. The method of claim 2, wherein the first classification network comprises a first classification sub-network and a second classification sub-network, the first classification sub-network and the second classification sub-network having different initialization parameters;
the determining, based on a prediction result obtained by performing classification prediction on the initial data by the first classification network, a cross entropy loss corresponding to the initial data includes:
determining a first cross entropy loss corresponding to the initial data based on a first prediction result obtained by performing classification prediction on the initial data by the first classification sub-network;
determining a second cross entropy loss corresponding to the initial data based on a second prediction result obtained by performing classification prediction on the initial data by the second classification sub-network;
determining the cross-entropy loss based on the first cross-entropy loss and the second cross-entropy loss.
4. The method of claim 3, wherein the determining value information corresponding to the initial data based on the cross-entropy loss and the reconstruction error information comprises:
combining the first cross entropy loss and the reconstruction error information to obtain first value information of the initial data, wherein the first value information is used for measuring the training value of the initial data relative to the first classification sub-network;
combining the second cross entropy loss and the reconstruction error information to obtain second value information of the initial data, wherein the second value information is used for measuring the training value of the initial data relative to the second classification sub-network;
the sorting the initial data according to the value information includes:
ranking the initial data based on the first value information and the second value information.
5. The method of claim 4, wherein the ranking the initial data based on the first value information and the second value information comprises:
sorting the initial data based on the first value information to obtain a first data queue;
sorting the initial data based on the second value information to obtain a second data queue;
the screening the sorted initial data according to a preset proportion to obtain the noise reduction data includes:
acquiring first noise reduction data from the first data queue according to a first preset proportion;
acquiring second noise reduction data from the second data queue according to a second preset proportion;
and generating the noise reduction data according to the first noise reduction data and the second noise reduction data.
6. The method according to claim 5, wherein after the screening the initial data based on the cross entropy loss and the reconstruction error information to obtain noise reduction data, the method further comprises:
determining a third cross entropy loss corresponding to the first noise reduction data based on a prediction result obtained by classifying and predicting the first noise reduction data by a second updating network, wherein the second updating network is obtained by parameter updating of the second classifying sub-network according to the first cross entropy loss;
determining a fourth cross entropy loss corresponding to the second noise reduction data based on a prediction result obtained by classifying and predicting the second noise reduction data by a first updating network, wherein the first updating network is obtained by performing parameter updating on the first classifying sub-network according to the second cross entropy loss;
performing feature sampling on the first noise reduction data, and determining first reconstruction error information corresponding to the first noise reduction data;
performing feature sampling on the second noise reduction data, and determining second reconstruction error information corresponding to the second noise reduction data;
screening the first noise reduction data based on the third cross entropy loss and the first reconstruction error information to obtain third noise reduction data, wherein the third noise reduction data is used for performing iterative training on the first updating network;
and screening the second noise reduction data based on the fourth cross entropy loss and the second reconstruction error information to obtain fourth noise reduction data, wherein the fourth noise reduction data is used for performing iterative training on the second updating network.
7. The method according to any one of claims 1 to 6, wherein the performing feature sampling on the initial data to determine reconstruction error information corresponding to the initial data comprises:
acquiring an initial self-encoder;
inputting the initial data into the initial self-encoder to obtain an encoding result;
performing supervised training on the initial self-encoder according to a reconstruction error between the initial data and the encoding result, to obtain a self-encoder;
and determining the reconstruction error information corresponding to the initial data based on a sampling result obtained by performing characteristic sampling on the initial data by the self-encoder.
8. An apparatus for screening data, the apparatus comprising:
an obtaining module, configured to obtain initial data and a first classification network, where the first classification network is a network to be trained for processing a classification task;
a determining module, configured to determine a cross entropy loss corresponding to the initial data based on a prediction result obtained by performing classification prediction on the initial data by the first classification network;
the determining module is further configured to perform feature sampling on the initial data, and determine reconstruction error information corresponding to the initial data, where the reconstruction error information is used to indicate a degree of association between the initial data and the classification task;
and a screening module, configured to screen the initial data based on the cross entropy loss and the reconstruction error information to obtain noise reduction data, where the noise reduction data is used to perform iterative training on the first classification network to obtain a second classification network for processing the classification task.
9. A computer device comprising a processor and a memory, the memory having stored therein at least one instruction, at least one program, set of codes, or set of instructions, which is loaded and executed by the processor to implement a method of screening data as claimed in any one of claims 1 to 7.
10. A computer-readable storage medium, having at least one program code stored therein, the program code being loaded and executed by a processor to implement the method of screening data according to any one of claims 1 to 7.