CN114067183B - Neural network model training method, image processing method, device and equipment - Google Patents


Info

Publication number
CN114067183B
CN114067183B (application CN202111406372.0A)
Authority
CN
China
Prior art keywords
hyper
model
parameter
image
training
Prior art date
Legal status
Active
Application number
CN202111406372.0A
Other languages
Chinese (zh)
Other versions
CN114067183A (en)
Inventor
刘佳
杨叶辉
尚方信
王兆玮
王磊
Current Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd filed Critical Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202111406372.0A
Publication of CN114067183A
Application granted
Publication of CN114067183B
Legal status: Active

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00: Pattern recognition
    • G06F 18/20: Analysing
    • G06F 18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214: Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/08: Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)

Abstract

A neural network model training method, an image processing method, an apparatus, a device, a medium, and a computer program product relate to the field of artificial intelligence, and in particular to computer vision and machine learning. The neural network model training method comprises the following steps: acquiring a training data set from an image data set; inputting the training data set into a data processing model provided with a hyper-parameter set to obtain a first feature set; inputting the first feature set into a first prediction model to obtain a first performance index; feeding back an adjustment direction for the hyper-parameters to the data processing model based on the first performance index; adjusting the hyper-parameter set based on the adjustment direction; inputting the training data set into the data processing model provided with the adjusted hyper-parameter set to obtain a second feature set; inputting the second feature set into the first prediction model to obtain a second performance index; and in response to the second performance index meeting a preset condition, inputting the second feature set into a second prediction model to train the second prediction model and obtain the target model.

Description

Neural network model training method, image processing method, device and equipment
Technical Field
The present disclosure relates to the field of artificial intelligence, in particular to computer vision and machine learning, and more particularly to a neural network model training method, an image processing method, an apparatus, a device, a computer-readable storage medium, and a computer program product.
Background
Artificial intelligence is the discipline that studies how to make computers simulate certain human thought processes and intelligent behaviors (such as learning, reasoning, thinking, and planning), covering technologies at both the hardware and software levels. Artificial intelligence hardware technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, and big data processing; artificial intelligence software technologies mainly include computer vision, speech recognition, natural language processing, machine learning/deep learning, big data processing, and knowledge graph technologies.
In recent years, with the rapid development of artificial intelligence and big data technologies, valuable information can be mined from large amounts of data, and a great deal of research has applied machine learning methods in the field of radiomics (imaging omics) to obtain prediction models. Radiomics is becoming an increasingly popular computer-aided diagnostic tool in medical research: it provides imaging biomarkers that can facilitate cancer detection, diagnosis, prognostic assessment, and prediction of treatment response.
The approaches described in this section are not necessarily approaches that have been previously conceived or pursued. Unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section. Similarly, the problems mentioned in this section should not be considered as having been acknowledged in any prior art, unless otherwise indicated.
Disclosure of Invention
The present disclosure provides a neural network model training method and image processing method, apparatus, device, medium and computer program product.
According to an aspect of the present disclosure, there is provided a neural network model training method, including: acquiring a training data set from an image data set; inputting the training data set into a data processing model provided with a hyper-parameter set to obtain a first feature set; inputting the first feature set into a first prediction model to obtain a first performance index of the first prediction model; feeding back an adjustment direction for the hyper-parameters to the data processing model based on the first performance index; adjusting the hyper-parameter set based on the adjustment direction; inputting the training data set into the data processing model provided with the adjusted hyper-parameter set to obtain a second feature set; inputting the second feature set into the first prediction model to obtain a second performance index of the first prediction model; and in response to the second performance index meeting a preset condition, inputting the second feature set into a second prediction model to train the second prediction model and obtain a target model.
According to another aspect of the present disclosure, an image processing method is provided, which includes inputting an image to be detected into an image detection neural network model to obtain a detection result for the image, wherein the image detection neural network model is trained using the above training method.
According to another aspect of the present disclosure, there is provided a training apparatus for a neural network model, the apparatus including: a first unit configured to acquire a training data set from an image data set; a second unit configured to input the training data set into a data processing model provided with a hyper-parameter set to obtain a first feature set; a third unit configured to input the first feature set into a first prediction model to obtain a first performance index of the first prediction model; a fourth unit configured to feed back an adjustment direction for the hyper-parameters to the data processing model based on the first performance index; a fifth unit configured to adjust the hyper-parameter set based on the adjustment direction; a sixth unit configured to input the training data set into the data processing model provided with the adjusted hyper-parameter set to obtain a second feature set; a seventh unit configured to input the second feature set into the first prediction model to obtain a second performance index of the first prediction model; and an eighth unit configured to, in response to the second performance index meeting a preset condition, input the second feature set into a second prediction model to train the second prediction model and obtain a target model.
According to another aspect of the present disclosure, there is provided an image processing apparatus including a ninth unit configured to input an image to be detected into an image detection neural network model to obtain a detection result for the image, wherein the image detection neural network model is trained using the above training method.
According to another aspect of the present disclosure, there is provided an electronic device including: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the above neural network model training method or image processing method.
According to another aspect of the present disclosure, there is provided a non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform the above neural network model training method or image processing method.
According to another aspect of the present disclosure, a computer program product is provided, comprising a computer program, wherein the computer program, when being executed by a processor, implements the above-mentioned training method or image processing method of the neural network model.
According to one or more embodiments of the present disclosure, when training a neural network model, after each round of training the training result is evaluated, and an adjustment direction for the training parameters of the neural network model is proposed based on an analysis of that evaluation result. In this way, the parameters of the neural network model are continuously and autonomously adjusted, so that the model converges quickly and training completes. In practical applications, an image processing model trained with such autonomously determined training parameters achieves high accuracy and can accurately process images to be detected.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present disclosure, nor do they limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate exemplary embodiments of the embodiments and, together with the description, serve to explain the exemplary implementations of the embodiments. The illustrated embodiments are for purposes of illustration only and do not limit the scope of the claims. Throughout the drawings, identical reference numbers designate similar, but not necessarily identical, elements.
FIG. 1 illustrates a schematic diagram of an exemplary system in which various methods described herein may be implemented, according to an embodiment of the present disclosure;
FIG. 2 shows a flow diagram of a method of training a neural network model in accordance with an embodiment of the present disclosure;
FIG. 3 shows a flow diagram of another method of training a neural network model, in accordance with an embodiment of the present disclosure;
FIG. 4 shows a schematic diagram of a data processing model for training a neural network model, in accordance with an embodiment of the present disclosure;
FIG. 5 shows a block diagram of a training apparatus for a neural network model, according to an embodiment of the present disclosure;
fig. 6 shows a block diagram of the structure of an image processing apparatus according to an embodiment of the present disclosure; and
FIG. 7 illustrates a block diagram of an exemplary electronic device that can be used to implement embodiments of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings, in which various details of the embodiments of the disclosure are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, it will be recognized by those of ordinary skill in the art that various changes and modifications may be made to the embodiments described herein without departing from the scope of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
In the present disclosure, unless otherwise specified, the use of the terms "first", "second", and the like to describe various elements is not intended to limit the positional relationship, the temporal relationship, or the importance relationship of the elements, and such terms are used only to distinguish one element from another. In some examples, a first element and a second element may refer to the same instance of the element, and in some cases, based on the context, they may also refer to different instances.
The terminology used in the description of the various examples in this disclosure is for the purpose of describing particular examples only and is not intended to be limiting. Unless the context clearly indicates otherwise, if the number of elements is not specifically limited, the elements may be one or more. Furthermore, the term "and/or" as used in this disclosure is intended to encompass any and all possible combinations of the listed items.
Feature engineering is among the most valuable research directions in machine learning, yet current technical approaches are implemented almost entirely by hand: data scientists must search for the optimal feature combination manually, so feature engineering in the related art is severely limited in both effectiveness and efficiency.
To solve the above problems, the inventors propose an automated, general technical framework for image detection (including, but not limited to, radiomics research), so that feature engineering can mine as much valuable information as possible, obtain the optimal feature combination and the optimal model, and minimize human involvement. While improving the effectiveness and efficiency of feature engineering, this lowers the technical threshold for research in the field.
Embodiments of the present disclosure will be described in detail below with reference to the accompanying drawings.
Fig. 1 illustrates a schematic diagram of an example system 100 in which various methods and apparatus described herein may be implemented in accordance with embodiments of the present disclosure. Referring to fig. 1, the system 100 includes one or more client devices 101, 102, 103, 104, 105, and 106, a server 120, and one or more communication networks 110 coupling the one or more client devices to the server 120. Client devices 101, 102, 103, 104, 105, and 106 may be configured to execute one or more applications.
In embodiments of the present disclosure, the server 120 may run one or more services or software applications that enable training of neural network models and, in practical applications, processing of images with the trained models.
In some embodiments, the server 120 may also provide other services or software applications that may include non-virtual environments and virtual environments. In some embodiments, these services may be provided as web-based services or cloud services, for example, provided to users of client devices 101, 102, 103, 104, 105, and/or 106 under a software as a service (SaaS) model.
In the configuration shown in fig. 1, server 120 may include one or more components that implement the functions performed by server 120. These components may include software components, hardware components, or a combination thereof, which may be executed by one or more processors. A user operating a client device 101, 102, 103, 104, 105, and/or 106 may, in turn, utilize one or more client applications to interact with the server 120 to take advantage of the services provided by these components. It should be understood that a variety of different system configurations are possible, which may differ from system 100. Accordingly, fig. 1 is one example of a system for implementing the various methods described herein and is not intended to be limiting.
The user may use client devices 101, 102, 103, 104, 105, and/or 106 to input images to be processed. The client device may provide an interface that enables a user of the client device to interact with the client device. The client device may also output information to the user via the interface. Although fig. 1 depicts only six client devices, those skilled in the art will appreciate that any number of client devices may be supported by the present disclosure.
Client devices 101, 102, 103, 104, 105, and/or 106 may include various types of computer devices, such as portable handheld devices, general purpose computers (such as personal computers and laptop computers), workstation computers, wearable devices, smart screen devices, self-service terminal devices, service robots, gaming systems, thin clients, various messaging devices, sensors or other sensing devices, and so forth. These computer devices may run various types and versions of software applications and operating systems, such as MICROSOFT Windows, APPLE iOS, UNIX-like operating systems, Linux, or Linux-like operating systems (e.g., GOOGLE Chrome OS); or include various mobile operating systems, such as MICROSOFT Windows Mobile OS, iOS, Windows Phone, and Android. Portable handheld devices may include cellular telephones, smart phones, tablets, Personal Digital Assistants (PDAs), and the like. Wearable devices may include head-mounted displays (such as smart glasses) and other devices. The gaming system may include a variety of handheld gaming devices, Internet-enabled gaming devices, and the like. The client device is capable of executing a variety of different applications, such as various Internet-related applications, communication applications (e.g., email applications), and Short Message Service (SMS) applications, and may use a variety of communication protocols.
Network 110 may be any type of network known to those skilled in the art that may support data communications using any of a variety of available protocols, including but not limited to TCP/IP, SNA, IPX, etc. By way of example only, one or more networks 110 may be a Local Area Network (LAN), an Ethernet-based network, a token ring, a Wide Area Network (WAN), the Internet, a virtual network, a Virtual Private Network (VPN), an intranet, an extranet, a Public Switched Telephone Network (PSTN), an infrared network, a wireless network (e.g., Bluetooth, WiFi), and/or any combination of these and/or other networks.
The server 120 may include one or more general purpose computers, special purpose server computers (e.g., PC (personal computer) servers, UNIX servers, mid-end servers), blade servers, mainframe computers, server clusters, or any other suitable arrangement and/or combination. The server 120 may include one or more virtual machines running a virtual operating system, or other computing architecture involving virtualization (e.g., one or more flexible pools of logical storage that may be virtualized to maintain virtual storage for the server). In various embodiments, the server 120 may run one or more services or software applications that provide the functionality described below.
The computing units in server 120 may run one or more operating systems including any of the operating systems described above, as well as any commercially available server operating systems. The server 120 can also run any of a variety of additional server applications and/or mid-tier applications, including HTTP servers, FTP servers, CGI servers, JAVA servers, database servers, and the like.
In some implementations, the server 120 may include one or more applications to analyze and consolidate data feeds and/or event updates received from users of the client devices 101, 102, 103, 104, 105, and/or 106. Server 120 may also include one or more applications to display data feeds and/or real-time events via one or more display devices of client devices 101, 102, 103, 104, 105, and/or 106.
In some embodiments, the server 120 may be a server of a distributed system or a server incorporating a blockchain. The server 120 may also be a cloud server, or an intelligent cloud computing server or intelligent cloud host with artificial intelligence technology. A cloud server is a host product in a cloud computing service system that addresses the drawbacks of traditional physical hosts and Virtual Private Server (VPS) services, namely high management difficulty and weak service scalability.
The system 100 may also include one or more databases 130. In some embodiments, these databases may be used to store data and other information. For example, one or more of the databases 130 may be used to store information such as audio files and video files. The database 130 may reside in various locations. For example, the database used by the server 120 may be local to the server 120, or may be remote from the server 120 and may communicate with the server 120 via a network-based or dedicated connection. The database 130 may be of different types. In certain embodiments, the database used by the server 120 may be, for example, a relational database. One or more of these databases may store, update, and retrieve data to and from the database in response to the command.
In some embodiments, one or more of the databases 130 may also be used by applications to store application data. The databases used by the application may be different types of databases, such as key-value stores, object stores, or regular stores supported by a file system.
The system 100 of fig. 1 may be configured and operated in various ways to enable application of the various methods and apparatus described in accordance with the present disclosure.
According to an aspect of the present disclosure, a method 200 of training a neural network model is provided. As shown in fig. 2, the training method 200 may include:
in step 201, a training data set in an image data set is acquired.
According to some embodiments, when preparing a dataset for model training, the image dataset is divided into a training dataset and a validation dataset. The training dataset is used to train the model so that it converges after multiple training passes. The use of the validation dataset will be described in detail later.
In one example, the image dataset contains 100 samples, of which 80 may be set as the training dataset and the remaining 20 as the validation dataset.
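The 80/20 split in the example above can be sketched as follows; this is an illustrative helper, not code from the patent, and the function name and seed are assumptions:

```python
import random

def split_dataset(samples, train_fraction=0.8, seed=0):
    """Shuffle an image dataset and split it into training and validation subsets."""
    items = list(samples)
    random.Random(seed).shuffle(items)           # deterministic shuffle for reproducibility
    cut = int(len(items) * train_fraction)
    return items[:cut], items[cut:]

# 100 samples -> 80 for training, 20 for validation, as in the example above
train_set, val_set = split_dataset(range(100))
```

Shuffling before splitting avoids the validation set being biased by any ordering in the original dataset.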
In step 202, a training data set is input into a data processing model with a hyper-parameter set to obtain a first feature set.
According to some embodiments, the data processing model is used to extract features of images in the training dataset and pass these features to the neural network model to be trained.
In one example, different kinds of hyper-parameters may be set in the data processing model, and these hyper-parameters define the way in which the data processing model processes the training data set and obtains the corresponding feature set.
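To make the idea concrete, a hyper-parameter set that defines how features are extracted might look like the toy sketch below. The parameter names (`bin_count`, `glcm_distance`, `use_wavelet`) are hypothetical, chosen only to resemble common radiomics settings; the patent does not name specific hyper-parameters:

```python
# Hypothetical hyper-parameter set for a radiomics-style feature extractor.
hyper_params = {
    "bin_count": 32,        # number of intensity histogram bins
    "glcm_distance": 1,     # gray-level co-occurrence matrix offset (unused in this toy)
    "use_wavelet": False,   # whether to add wavelet-filtered features (unused in this toy)
}

def extract_features(image_pixels, params):
    """Toy extractor: an intensity histogram whose size is set by a hyper-parameter."""
    bins = params["bin_count"]
    hist = [0] * bins
    for p in image_pixels:                      # pixels assumed normalized to [0, 1)
        hist[min(int(p * bins), bins - 1)] += 1
    return hist

features = extract_features([0.1, 0.5, 0.51, 0.9], hyper_params)
```

Changing `bin_count` changes the dimensionality of the resulting feature set, which is exactly the kind of effect the hyper-parameter search is meant to tune.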
In step 203, the first feature set is input into the first prediction model to obtain a first performance index of the first prediction model.
According to some embodiments, the first prediction model adopts a conventional default parameter setting, and the model parameters set in the first prediction model do not change in the model training process.
In one example, after one round of training, the first prediction model may be evaluated with five-fold cross-validation, and the resulting AUC (Area Under the Curve) value is taken as the first performance index, which evaluates the performance of the first prediction model at this point; this performance can represent how accurately the prediction model predicts on the training dataset.
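The two ingredients of that evaluation, AUC and five-fold splitting, can be sketched in plain Python as below; this is a minimal illustration (the Mann-Whitney pairwise formulation of AUC), not the patent's implementation:

```python
def auc(labels, scores):
    """Area under the ROC curve via the pairwise-concordance (Mann-Whitney) formula."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    # count positive/negative score pairs where the positive ranks higher; ties count half
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def five_fold_indices(n):
    """Split range(n) into 5 roughly equal validation folds (round-robin assignment)."""
    idx = list(range(n))
    return [idx[i::5] for i in range(5)]
```

In practice a library routine (e.g. scikit-learn's `roc_auc_score` with `StratifiedKFold`) would be used instead of hand-rolled code.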
In step 204, the adjustment direction of the hyper-parameter is fed back to the data processing model based on the first performance indicator.
According to some embodiments, when the performance index of the first prediction model is the first performance index, the pipeline may not yet have converged; since the parameters of the first prediction model do not participate in the optimization, the hyper-parameters in the data processing model need to be adjusted instead.
In one example, a bayesian optimization algorithm can be used to search for hyper-parameters in the data processing model.
According to some embodiments, a relationship between at least one hyperparameter in the hyperparameter set and the first performance indicator is fitted based on at least one proxy function in a bayesian optimization algorithm to determine a hyperparameter to be adjusted in the at least one hyperparameter and an adjustment direction of the hyperparameter to be adjusted.
In one example, the Bayesian-optimized hyper-parameter search algorithm employs a Gaussian process, in which prior parameter information is taken into account and continuously updated. That is, Bayesian optimization can make the relationship between the hyper-parameters and the performance index satisfy convex optimization conditions, so that an optimal solution can be derived analytically or by convex optimization; this optimal solution determines which specific hyper-parameters need to be adjusted and in which direction.
In one example, the adjustment direction of the hyper-parameter may be an adjustment manner of increasing or decreasing the value.
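As a deliberately simplified stand-in for the Bayesian optimization step, the direction to move a single hyper-parameter can be read off the two most recent (value, performance) observations; a real implementation would fit a Gaussian-process surrogate (e.g. with scikit-optimize's `gp_minimize`) rather than this finite-difference sketch:

```python
def adjustment_direction(history):
    """Given [(hyper_value, auc), ...] observations, suggest 'increase' or 'decrease'.

    A crude finite-difference stand-in for Bayesian optimization: move the
    hyper-parameter in the direction that improved the performance index.
    """
    (v0, a0), (v1, a1) = history[-2:]
    if v1 == v0:
        return "keep"
    slope = (a1 - a0) / (v1 - v0)
    return "increase" if slope > 0 else "decrease"
```

For example, if raising the hyper-parameter from 1 to 2 lifted the AUC from 0.70 to 0.75, the suggested direction is "increase".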
In the prior art, which hyper-parameters to set in the data processing model, and the values to set them to, are determined by experimenters through repeated experiments, so a large number of training runs is needed to find an optimal hyper-parameter set. With Bayesian optimization, by contrast, the hyper-parameter set that gives the data processing model the best performance can be found quickly.
In step 205, the hyper-parameter set is adjusted based on the adjustment direction. The method for adjusting the hyper-parameter set based on the adjustment direction will be described in detail later with reference to fig. 4.
In step 206, the training data set is input into the data processing model with the adjusted hyper-parameter set to obtain a second feature set.
According to some embodiments, in the step of adjusting the hyper-parameter set based on the adjustment direction, the adjusted hyper-parameter set and the pre-adjusted hyper-parameter set have the same hyper-parameter type, and the values of at least one hyper-parameter are different.
In one example, after the hyper-parameters of the data processing model are adjusted in step 205, the adjusted hyper-parameter set contains the same kinds of hyper-parameters as the set before adjustment, but at least one hyper-parameter takes a different value.
In step 207, the second feature set is input into the first prediction model to obtain a second performance index of the first prediction model.
In one example, the second feature set obtained in step 206 is input into the first prediction model again, yielding the second performance index, i.e., the AUC value of the first prediction model when the second feature set is used as input.
In step 208, in response to that the second performance index meets the preset condition, the second feature set is input into the second prediction model for training the second prediction model to obtain the target model.
In one example, the second prediction model continuously optimizes the parameters of the model itself when the second feature set is used as input, so as to realize model convergence.
In one example, the preset condition may be that the second performance index and the first performance index (and possibly performance indexes obtained even earlier) fluctuate only within a certain range, i.e., have stabilized.
In another example, the preset condition may be that the second performance index reaches a certain standard set by the user, for example, that the prediction accuracy of the first prediction model reaches ninety percent.
Other suitable conditions may also be used as the preset conditions, and are not specifically limited herein.
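The two example preset conditions above (stabilization within a range, or reaching a user-set standard) can be sketched as one check; the tolerance value and the three-evaluation window are assumptions for illustration:

```python
def meets_preset_condition(indices, tolerance=0.01, threshold=None):
    """Check the two example stopping conditions described above.

    indices:   performance indexes (e.g. AUC values) observed so far, newest last.
    Condition A: the recent indexes fluctuate within `tolerance` of one another.
    Condition B (optional): the latest index reaches a user-set `threshold`.
    """
    if threshold is not None and indices[-1] >= threshold:
        return True
    recent = indices[-3:]                        # look only at the last few evaluations
    return len(recent) >= 2 and max(recent) - min(recent) <= tolerance
```

Either condition being satisfied triggers step 208/309, i.e. training the second prediction model on the current feature set.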
According to some embodiments, during the training process, parameters of the second predictive model are adjusted.
According to some embodiments, the second predictive model is taken as the target model in response to convergence of the second predictive model.
In one example, the second prediction model continuously optimizes the parameters of the model itself when the second feature set is used as input, so as to realize model convergence.
According to another aspect of the present disclosure, another method 300 of training a neural network model is provided.
As shown in fig. 3, the training method 300 may include:
in step 301, a training data set in an image data set is acquired.
In step 302, a training data set is input into a data processing model with a hyper-parameter set to obtain a first feature set.
In step 303, the first feature set is input into the first prediction model to obtain a first performance index of the first prediction model.
In step 304, the adjustment direction of the hyper-parameter is fed back to the data processing model based on the first performance indicator.
In step 305, the hyper-parameter set is adjusted based on the adjustment direction. The method for adjusting the hyper-parameter set based on the adjustment direction will be described in detail later with reference to fig. 4.
In step 306, the training data set is input into the data processing model with the adjusted hyper-parameter set to obtain a second feature set.
In step 307, the second feature set is input into the first prediction model to obtain a second performance index of the first prediction model.
Steps 301 to 307 in fig. 3 are similar to steps 201 to 207 in fig. 2, and are not repeated herein for brevity.
In step 308, it is determined whether the second performance index meets a preset condition. The preset condition in step 308 may be the same as that in step 208, and its detailed description is omitted here for brevity.
According to some embodiments, in response to the second performance indicator not meeting the preset condition, the adjustment direction of the hyper-parameter is fed back to the data processing model, and then the flow returns to step 305 of adjusting the hyper-parameter set based on the adjustment direction.
In one example, when the second performance index does not meet the preset condition, the adjustment direction of the hyper-parameter is fed back to the data processing model, and in step 305 the hyper-parameter of the data processing model is adjusted again based on the adjustment direction fed back in step 304.
In one example, as shown in fig. 3, when the second performance index does not meet the preset condition, step 305 is executed again to adjust the hyper-parameter of the data processing model based on the hyper-parameter adjustment direction determined by the bayesian optimization algorithm.
According to some embodiments, in response to the second performance indicator meeting the preset condition, in step 309, the second feature set is input into the second prediction model for training the second prediction model to obtain the target model. Step 309 is similar to step 208, and for brevity, will not be described again.
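The control flow of steps 301 through 309 can be sketched as a simple loop. This is a toy illustration only: the three stand-in functions, the scalar hyper-parameter `h1`, and the fixed 0.25 adjustment step are assumptions, and a simple threshold comparison replaces the Bayesian optimization that the disclosure uses to determine the adjustment direction.

```python
def extract_features(hp):            # stand-in for the data processing model
    return [hp["h1"]]

def evaluate_first_model(features):  # stand-in for the first prediction model
    return min(1.0, 0.5 + features[0] / 2)

def train_second_model(features):    # stand-in for the second prediction model
    return {"target_model": features}

def train_until_condition(max_rounds=10, threshold=0.9):
    hyper_params = {"h1": 0.5}
    feature_set = extract_features(hyper_params)        # steps 301-302
    index = evaluate_first_model(feature_set)           # step 303
    for _ in range(max_rounds):                         # step 308 loop
        if index >= threshold:                          # preset condition met
            return train_second_model(feature_set)      # step 309
        direction = 1 if index < threshold else -1      # step 304 feedback
        hyper_params["h1"] += 0.25 * direction          # step 305 adjust
        feature_set = extract_features(hyper_params)    # step 306
        index = evaluate_first_model(feature_set)       # step 307
    return train_second_model(feature_set)

print(train_until_condition())  # {'target_model': [1.0]}
```

The loop re-enters step 305 whenever the condition fails, matching the flow shown in fig. 3.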
According to some embodiments, the image dataset further comprises a validation dataset, the data in the validation dataset being annotated data. In one example, each data sample in the image dataset that is set aside as the validation dataset is annotated and labeled before step 301 is performed.
In the above-described embodiments, the method 200 or 300 may further include: the validation data set is input to the target model to validate the target model.
In this way, the performance of the trained neural network model is verified, so that the model can process images more accurately.
Referring now to FIG. 4, which schematically illustrates the composition of a data processing model.
According to some embodiments, the data processing model comprises an image acquisition sub-model 401 and an image pre-processing sub-model 402, the image acquisition sub-model 401 being configured to process the training data set to obtain the first data set, the image pre-processing sub-model 402 being configured to pre-process the first data set.
In one example, the image acquisition submodel 401 processes the training dataset to obtain a first dataset and passes the generated first dataset to the image pre-processing submodel 402 to perform the pre-processing operation.
According to some embodiments, a plurality of training data samples in the training data set are resampled such that the plurality of training data samples have pixels of the same size.
Because the training data samples have significant differences in size, contrast, and brightness due to differences in acquisition equipment and parameters, the image acquisition submodel 401 first resamples the training data samples to overcome this problem.
In one example, the training data samples are resampled into 3-dimensional data with a voxel size of 1 × 1 × 1 mm³.
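Resampling to an isotropic grid can be illustrated as follows. This is a minimal nearest-neighbour sketch, not the patent's implementation; the example spacing values are invented, and a real pipeline would read the source spacing from the image header and likely use a higher-order interpolator.

```python
import numpy as np

def resample_to_isotropic(volume, spacing_mm, target_mm=(1.0, 1.0, 1.0)):
    # Nearest-neighbour resampling: build the output index grid and map
    # each output voxel back to the closest input voxel.
    factors = [s / t for s, t in zip(spacing_mm, target_mm)]
    new_shape = [int(round(n * f)) for n, f in zip(volume.shape, factors)]
    idx = np.meshgrid(*[np.minimum((np.arange(m) / f).astype(int), n - 1)
                        for m, f, n in zip(new_shape, factors, volume.shape)],
                      indexing="ij")
    return volume[tuple(idx)]

vol = np.zeros((10, 20, 20))   # e.g. 10 slices acquired at 2.0 mm spacing
out = resample_to_isotropic(vol, spacing_mm=(2.0, 0.5, 0.5))
print(out.shape)  # (20, 10, 10)
```

After resampling, all training data samples share the same voxel size, so their pixels are directly comparable.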
According to some embodiments, regions of interest of the plurality of resampled training data samples are extracted based on an attention mechanism in the image pre-processing sub-model, resulting in a first data set.
As an example, in a radiomics application scenario, different radiomics tasks focus on different regions of interest in the image data, so the image acquisition sub-model 401 also performs region-of-interest extraction on the plurality of resampled training data samples.
In one example, taking a brain glioma study task as an example, the tumor usually appears in a certain region of the brain, so the image acquisition submodel 401 needs to extract features from the contrast-enhanced region as well as from nearby regions where edema and tumor infiltration occur.
In one example, the image acquisition submodel 401 results in a first data set after performing a region of interest extraction operation on the image data set.
According to some embodiments, a first hyper-parameter of the hyper-parameters to be adjusted is adjusted, the first hyper-parameter being provided in an image pre-processing sub-model for controlling intensity normalization of the first data set.
In one example, upon receiving the first dataset, the image pre-processing submodel 402 decides, via the first hyper-parameter, whether to apply intensity normalization to the regions of different modalities, because the contrast of the region of interest may be inconsistent across modalities.
Corresponding to step 205 of the method 200, when the Bayesian optimization algorithm sets an adjustment direction for the hyperparameter, the image pre-processing sub-model 402 adjusts the first hyperparameter for intensity normalization of the first data set based on the adjustment direction.
According to some embodiments, a second hyper-parameter of the hyper-parameters to be adjusted is adjusted, the second hyper-parameter being provided in the image pre-processing sub-model 402 for controlling the intensity-normalized first data set to be discretized.
In one example, when acquiring an image data set, there is an information bias due to camera movement, and the intensity-normalized first data set needs to be discretized by a second hyper-parameter.
Corresponding to step 205 of the method 200, when the bayesian optimization algorithm sets the adjustment direction of the hyper-parameters, the image pre-processing sub-model 402 adjusts the second hyper-parameters for discretizing the intensity-normalized first data set based on the adjustment direction.
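The roles of the first and second hyper-parameters can be sketched together. The interpretation below (a boolean switch for intensity normalization, and a bin count for discretization) is an assumption based on the text, not the patent's stated parameterization.

```python
# Hedged sketch: first hyper-parameter as a normalization switch,
# second hyper-parameter as the number of discretization bins.
import numpy as np

def preprocess(volume, normalize: bool, n_bins: int):
    if normalize:                         # first hyper-parameter
        mu, sigma = volume.mean(), volume.std()
        volume = (volume - mu) / (sigma + 1e-8)
    # Second hyper-parameter: discretize intensities into n_bins gray levels.
    edges = np.linspace(volume.min(), volume.max(), n_bins + 1)
    return np.clip(np.digitize(volume, edges[1:-1]), 0, n_bins - 1)

vol = np.random.default_rng(0).normal(size=(4, 4, 4))
binned = preprocess(vol, normalize=True, n_bins=8)
print(binned.min(), binned.max())  # 0 7
```

The Bayesian optimization step would then nudge `normalize` and `n_bins` in the fed-back adjustment direction rather than set them by hand.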
According to some embodiments, the image pre-processing submodel 402 is used to pre-process the first data set to obtain a second data set. The data processing model further comprises a feature extraction submodel for extracting features from the second dataset to obtain a third dataset, and a feature screening submodel for selecting features from the third dataset to obtain the first feature set.
In one example, referring to FIG. 4, the image pre-processing sub-model 402 processes the first data set to obtain a second data set and passes the generated second data set to the feature extraction sub-model 403. The feature extraction submodel 403 then processes the second data set to generate a third data set, which is passed to the feature screening submodel 404. Finally, the third data set is processed by the feature screening submodel 404 to obtain the first feature set mentioned in step 202 of the method 200.
According to some embodiments, at least one image category feature is extracted from the second data set.
The second dataset contains many kinds of features. In one example, the features may be texture, shape, or pixel features of the image.
In one example, different kinds of features may be extracted from each image, such as texture, shape, and pixel intensity features. The texture features of an image generally include gray-level co-occurrence matrices (GLCM), local binary patterns (LBP), histograms of oriented gradients (HOG), and Haralick texture features. Shape features generally include compactness, surface area, and volume. Pixel intensity features generally include the mean, median, variance, entropy, and quartile values of each sequence.
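The pixel-intensity features listed above are straightforward to compute. This sketch covers only those statistics; texture features such as GLCM, LBP, or HOG require dedicated libraries and are omitted, and the bin count used for the entropy estimate is an assumption.

```python
# Hedged sketch of extracting the pixel-intensity features named above.
import numpy as np

def intensity_features(volume, n_bins=32):
    hist, _ = np.histogram(volume, bins=n_bins)
    p = hist / hist.sum()
    p = p[p > 0]                                  # drop empty bins
    return {
        "mean": float(volume.mean()),
        "median": float(np.median(volume)),
        "variance": float(volume.var()),
        "entropy": float(-(p * np.log2(p)).sum()),
        "q1": float(np.percentile(volume, 25)),   # first quartile
        "q3": float(np.percentile(volume, 75)),   # third quartile
    }

feats = intensity_features(np.random.default_rng(1).normal(size=(8, 8, 8)))
print(sorted(feats))  # ['entropy', 'mean', 'median', 'q1', 'q3', 'variance']
```

Each sequence (modality) of an image would yield one such feature dictionary, and the dictionaries are concatenated into the second data set's feature representation.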
According to some embodiments, the extracted at least one image class feature is normalized to obtain a third data set.
In one example, because all of the data has actual physical significance, the feature values are affected by their units. To eliminate the influence of differing feature magnitudes, the feature extraction submodel normalizes the second data set so that indexes of different units or magnitudes can be conveniently compared and weighted.
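A common way to remove unit effects is per-feature z-score normalization; the sketch below is an illustrative assumption, since the disclosure does not name a specific normalization scheme.

```python
# Hedged sketch of per-feature z-score normalization to remove the
# influence of different units and magnitudes.
import numpy as np

def normalize_features(feature_matrix):
    # rows = samples, columns = features with heterogeneous units
    mu = feature_matrix.mean(axis=0)
    sigma = feature_matrix.std(axis=0) + 1e-8   # avoid division by zero
    return (feature_matrix - mu) / sigma

X = np.array([[1000.0, 0.1], [2000.0, 0.2], [3000.0, 0.3]])
Z = normalize_features(X)
print(np.allclose(Z.mean(axis=0), 0.0))  # True
```

After normalization, a feature measured in millimetres and one measured in gray levels contribute on comparable scales.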
According to some embodiments, a third hyper-parameter of the hyper-parameters to be adjusted is adjusted, the third hyper-parameter being provided in the feature screening sub-model for controlling feature selection of the third data set.
In one example, because the feature extraction operation may extract a large number of features, with the count of each type ranging from several to several hundred, directly using all of the features for model training may leave more parameters to solve than there are samples. Therefore, the feature screening submodel 404 may use a gradient boosting decision tree (GBDT) algorithm for feature selection: the features are sorted by the importance assigned by the GBDT, the third hyper-parameter is set as a threshold, and the features whose importance is greater than or equal to the third hyper-parameter are retained.
Corresponding to step 205 of method 200, when the Bayesian optimization algorithm determines an adjustment direction for the hyper-parameters, the feature screening submodel 404 adjusts the third hyper-parameter, which controls feature selection on the third data set, based on that adjustment direction.
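The GBDT screening step can be sketched with scikit-learn, which the disclosure does not name; the estimator settings and the default threshold value here are illustrative assumptions.

```python
# Hedged sketch of GBDT-based feature screening: rank features by
# importance and keep those at or above the third hyper-parameter.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

def select_features(X, y, third_hyper_parameter=0.05):
    gbdt = GradientBoostingClassifier(n_estimators=50, random_state=0)
    gbdt.fit(X, y)
    keep = gbdt.feature_importances_ >= third_hyper_parameter
    return X[:, keep], np.flatnonzero(keep)

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
y = (X[:, 0] + X[:, 1] > 0).astype(int)   # only features 0 and 1 are informative
X_sel, kept = select_features(X, y)
print(kept)  # indices of the kept features; should include 0 and 1
```

Raising the third hyper-parameter keeps fewer, more important features; lowering it keeps more, which is exactly the knob the Bayesian optimization adjusts.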
According to some embodiments, a fourth hyper-parameter of the hyper-parameters to be adjusted is adjusted, the fourth hyper-parameter being set in the feature screening submodel 404 for controlling feature dimensionality reduction of features selected from the third dataset.
In one example, to reduce the probability of over-fitting, the feature screening submodel 404 may employ conventional principal component analysis (PCA). PCA explores the correlations between features by constructing mutually independent principal components, with the fourth hyper-parameter controlling how many components are retained, so as to preserve as much as possible of the information contained in the original data.
Corresponding to step 205 of method 200, when the Bayesian optimization algorithm sets an adjustment direction for the hyper-parameters, the feature screening submodel 404 adjusts, based on that direction, the fourth hyper-parameter that controls dimensionality reduction of the features selected from the third data set.
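The PCA step with the fourth hyper-parameter as the number of retained components can be sketched as follows. Reading the fourth hyper-parameter as a component count is an assumption (it could equally be an explained-variance ratio); a plain covariance eigendecomposition is used to keep the sketch self-contained.

```python
# Hedged sketch: fourth hyper-parameter = number of principal components.
import numpy as np

def pca_reduce(X, fourth_hyper_parameter):
    Xc = X - X.mean(axis=0)                    # center the features
    cov = np.cov(Xc, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]          # largest variance first
    components = eigvecs[:, order[:fourth_hyper_parameter]]
    return Xc @ components                     # project onto components

X = np.random.default_rng(0).normal(size=(100, 50))
X_low = pca_reduce(X, fourth_hyper_parameter=5)
print(X_low.shape)  # (100, 5)
```

Because the principal components are mutually orthogonal, the retained dimensions are uncorrelated, which is the independence property the description refers to.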
According to another aspect of the present disclosure, there is also provided an image processing method. The image processing method comprises: inputting an image to be detected into an image detection neural network model to obtain a detection result of the image, wherein the image detection neural network model is obtained by training with the method 200 or 300.
According to another aspect of the present disclosure, there is also provided a training apparatus 500. As shown in fig. 5, the neural network model training apparatus 500 includes: a first unit 501 configured for acquiring a training data set in an image data set; a second unit 502 configured to input the training data set into a data processing model provided with a hyper-parameter set, resulting in a first feature set; a third unit 503, configured to input the first feature set into the first prediction model, to obtain a first performance index of the first prediction model; a fourth unit 504 configured to feed back the adjustment direction of the hyper-parameter to the data processing model based on the first performance indicator; a fifth unit 505 configured to adjust the hyper-parameter set based on the adjustment direction; a sixth unit 506, configured to input the training data set into the data processing model provided with the adjusted hyper-parameter set, to obtain a second feature set; a seventh unit 507, configured to input the second feature set into the first prediction model, so as to obtain a second performance index of the first prediction model; and an eighth unit 508, configured to, in response to the second performance indicator meeting the preset condition, input the second feature set to the second prediction model for training the second prediction model to obtain the target model.
According to another aspect of the present disclosure, an image processing apparatus 600 is also provided. As shown in fig. 6, the apparatus 600 includes: a ninth unit 601, configured to input the image to be detected into an image detection neural network model to obtain a detection result of the image, where the image detection neural network model is obtained by training according to the training method 200 or 300 described above.
According to an embodiment of the present disclosure, there is also provided an electronic device, a readable storage medium, and a computer program product.
Referring to fig. 7, a block diagram of an electronic device 700, which may be a server or a client of the present disclosure and is an example of a hardware device that may be applied to aspects of the present disclosure, will now be described. The electronic device is intended to represent various forms of digital electronic computer devices, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other suitable computers. The electronic device may also represent various forms of mobile devices, such as personal digital processing devices, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be examples only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 7, the device 700 comprises a computing unit 701 which may perform various suitable actions and processes according to a computer program stored in a Read Only Memory (ROM) 702 or a computer program loaded from a storage unit 708 into a Random Access Memory (RAM) 703. In the RAM 703, various programs and data required for the operation of the device 700 can also be stored. The calculation unit 701, the ROM 702, and the RAM 703 are connected to each other by a bus 704. An input/output (I/O) interface 705 is also connected to bus 704.
Various components in the device 700 are connected to the I/O interface 705, including: an input unit 706, an output unit 707, a storage unit 708, and a communication unit 709. The input unit 706 may be any type of device capable of inputting information to the device 700; it may receive input numeric or character information and generate key signal inputs related to user settings and/or function controls of the electronic device, and may include, but is not limited to, a mouse, a keyboard, a touch screen, a track pad, a track ball, a joystick, a microphone, and/or a remote control. The output unit 707 may be any type of device capable of presenting information and may include, but is not limited to, a display, speakers, a video/audio output terminal, a vibrator, and/or a printer. The storage unit 708 may include, but is not limited to, magnetic or optical disks. The communication unit 709 allows the device 700 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunications networks, and may include, but is not limited to, modems, network cards, infrared communication devices, wireless communication transceivers, and/or chipsets, such as Bluetooth (TM) devices, 802.11 devices, WiFi devices, WiMax devices, cellular communication devices, and/or the like.
Computing unit 701 may be a variety of general purpose and/or special purpose processing components with processing and computing capabilities. Some examples of the computing unit 701 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various dedicated Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, and so forth. The computing unit 701 performs the various methods and processes described above, such as the method 200. For example, in some embodiments, the method 200 may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as the storage unit 708. In some embodiments, part or all of a computer program may be loaded onto and/or installed onto device 700 via ROM 702 and/or communications unit 709. When the computer program is loaded into RAM 703 and executed by the computing unit 701, one or more steps of the method 200 described above may be performed. Alternatively, in other embodiments, the computing unit 701 may be configured to perform the method 200 in any other suitable manner (e.g., by way of firmware).
Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuitry, field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), application specific standard products (ASSPs), systems on a chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. These program codes may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program codes, when executed by the processor or controller, cause the functions/operations specified in the flowchart and/or block diagram to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user may provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), wide Area Networks (WANs), and the Internet.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, a server of a distributed system, or a server combining a blockchain.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present disclosure may be performed in parallel, sequentially or in different orders, and are not limited herein as long as the desired results of the technical aspects of the present disclosure can be achieved.
Although embodiments or examples of the present disclosure have been described with reference to the accompanying drawings, it is to be understood that the above-described methods, systems and apparatus are merely exemplary embodiments or examples and that the scope of the present invention is not limited by these embodiments or examples, but only by the claims as issued and their equivalents. Various elements in the embodiments or examples may be omitted or may be replaced with equivalents thereof. Further, the steps may be performed in an order different from that described in the present disclosure. Further, various elements in the embodiments or examples may be combined in various ways. It is important that as technology evolves, many of the elements described herein may be replaced with equivalent elements that appear after the present disclosure.

Claims (14)

1. A method of training a neural network model, the method comprising the steps of:
acquiring a training data set in an image data set;
inputting the training data set into a data processing model provided with a hyper-parameter set to obtain a first feature set, wherein the data processing model comprises an image acquisition submodel and an image preprocessing submodel, the image acquisition submodel is used for processing the training data set to obtain a first data set, and the image preprocessing submodel is used for preprocessing the first data set;
inputting the first feature set into a first prediction model to obtain a first performance index of the first prediction model;
feeding back the adjustment direction of the hyper-parameter to the data processing model based on the first performance index;
adjusting the hyper-parameter set based on the adjustment direction, comprising:
adjusting a first hyper-parameter in the hyper-parameters to be adjusted, wherein the first hyper-parameter is arranged in the image preprocessing submodel and used for controlling intensity normalization of the first data set; and
adjusting a second hyper-parameter in the hyper-parameters to be adjusted, wherein the second hyper-parameter is arranged in the image preprocessing sub-model and is used for controlling the data discretization of the first data set after intensity normalization;
inputting the training data set into the data processing model provided with the adjusted hyper-parameter set to obtain a second feature set;
inputting the second feature set into the first prediction model to obtain a second performance index of the first prediction model; and
and responding to the second performance index meeting a preset condition, and inputting the second characteristic set into a second prediction model for training the second prediction model to obtain a target model.
2. The method of claim 1, further comprising:
feeding back the adjustment direction of the hyper-parameter to the data processing model in response to the second performance index not meeting the preset condition; and
returning to the step of adjusting the hyper-parameter set based on the adjustment direction.
3. The method of claim 1, wherein feeding back to the data processing model, based on the first performance metric, an adjustment direction of a hyper-parameter comprises:
and fitting the relation between at least one hyper-parameter in the hyper-parameter set and the first performance index based on at least one proxy function in a Bayesian optimization algorithm so as to determine a hyper-parameter to be adjusted in the at least one hyper-parameter and an adjustment direction of the hyper-parameter to be adjusted.
4. The method of claim 1, wherein the processing the training data set to obtain a first data set comprises:
resampling a plurality of training data samples in the training data set such that the plurality of training data samples have pixels of the same size; and
and extracting the regions of interest of the plurality of training data samples after resampling based on an attention mechanism in the image preprocessing submodel to obtain the first data set.
5. The method of claim 1, wherein the image pre-processing sub-model is to pre-process the first data set to obtain a second data set, wherein the data processing model further comprises a feature extraction sub-model to extract features from the second data set to obtain a third data set, and a feature screening sub-model to select features from the third data set to obtain the first feature set, and wherein adjusting the hyper-parameter set based on the adjustment direction further comprises:
adjusting a third hyper-parameter in the hyper-parameters to be adjusted, wherein the third hyper-parameter is arranged in the feature screening submodel and is used for controlling feature selection of the third data set; and
and adjusting a fourth hyper-parameter in the hyper-parameters to be adjusted, wherein the fourth hyper-parameter is arranged in the feature screening submodel and is used for controlling feature dimension reduction of the features selected from the third data set.
6. The method as recited in claim 5, wherein said extracting features from said second data set to obtain a third data set comprises:
extracting at least one image category feature from the second data set; and
and normalizing the extracted at least one image type characteristic to obtain the third data set.
7. The method of any of claims 1 to 6, wherein the training the second predictive model to arrive at an objective model comprises:
adjusting parameters of the second prediction model during a training process; and
in response to the second predictive model converging, treating the second predictive model as the target model.
8. The method of any of claims 1 to 6, wherein the image dataset further comprises a validation dataset, the data in the validation dataset being annotated data, and wherein the method further comprises:
inputting the validation data set to the target model to validate the target model.
9. The method according to any one of claims 1 to 6, wherein, in the step of adjusting the hyper-parameter set based on the adjustment direction, the adjusted hyper-parameter set and the pre-adjusted hyper-parameter set have the same hyper-parameter category, and at least one hyper-parameter has a different value.
10. An image processing method comprising:
inputting an image to be detected into an image detection neural network model to obtain a detection result of the image, wherein the image detection neural network model is obtained by training through the method of any one of claims 1 to 9.
11. An apparatus for training a neural network model, the apparatus comprising:
a first unit configured for acquiring a training data set in an image data set;
a second unit configured to input the training dataset into a data processing model provided with a hyper-parameter set, resulting in a first feature set, wherein the data processing model comprises an image acquisition sub-model and an image pre-processing sub-model, the image acquisition sub-model is configured to process the training dataset to result in a first dataset, and the image pre-processing sub-model is configured to pre-process the first dataset;
a third unit, configured to input the first feature set into a first prediction model, to obtain a first performance index of the first prediction model;
a fourth unit configured to feed back an adjustment direction of a hyper-parameter to the data processing model based on the first performance indicator;
a fifth unit configured to adjust the hyper-parameter set based on the adjustment direction, including:
adjusting a first hyper-parameter of hyper-parameters to be adjusted, wherein the first hyper-parameter is arranged in the image preprocessing sub-model and is used for controlling intensity normalization of the first data set; and
adjusting a second hyper-parameter in the hyper-parameters to be adjusted, wherein the second hyper-parameter is arranged in the image preprocessing sub-model and is used for controlling the data discretization of the first data set after intensity normalization;
a sixth unit configured to input the training data set into the data processing model provided with the adjusted hyper-parameter set to obtain a second feature set;
a seventh unit, configured to input the second feature set into the first prediction model, to obtain a second performance index of the first prediction model; and
the eighth unit is configured to, in response to the second performance index meeting a preset condition, input the second feature set to a second prediction model for training the second prediction model to obtain a target model.
12. An image prediction apparatus comprising:
a ninth unit, configured to input the image to be detected into an image prediction neural network model, and obtain a prediction result, wherein the image prediction neural network model is obtained by training through the training method of any one of claims 1-9.
13. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
The memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-10.
14. A non-transitory computer readable storage medium having stored thereon computer instructions for causing the computer to perform the method of any one of claims 1-10.
CN202111406372.0A 2021-11-24 2021-11-24 Neural network model training method, image processing method, device and equipment Active CN114067183B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111406372.0A CN114067183B (en) 2021-11-24 2021-11-24 Neural network model training method, image processing method, device and equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111406372.0A CN114067183B (en) 2021-11-24 2021-11-24 Neural network model training method, image processing method, device and equipment

Publications (2)

Publication Number Publication Date
CN114067183A CN114067183A (en) 2022-02-18
CN114067183B true CN114067183B (en) 2022-10-28

Family

ID=80276087

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111406372.0A Active CN114067183B (en) 2021-11-24 2021-11-24 Neural network model training method, image processing method, device and equipment

Country Status (1)

Country Link
CN (1) CN114067183B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112784985A (en) * 2021-01-29 2021-05-11 北京百度网讯科技有限公司 Training method and device of neural network model, and image recognition method and device
CN113111968A (en) * 2021-04-30 2021-07-13 北京大米科技有限公司 Image recognition model training method and device, electronic equipment and readable storage medium

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108764455A (en) * 2018-05-17 2018-11-06 南京中兴软件有限责任公司 Parameter adjustment method, device and storage medium
CN110555800A (en) * 2018-05-30 2019-12-10 北京三星通信技术研究有限公司 image processing apparatus and method
CN109993223B (en) * 2019-03-26 2023-05-16 南京道润交通科技有限公司 Pavement use performance prediction method, storage medium and electronic equipment
EP4030996A4 (en) * 2019-09-20 2023-10-25 Canon U.S.A. Inc. Artificial intelligence coregistration and marker detection, including machine learning and using results thereof
CN110717462B (en) * 2019-10-12 2023-02-28 上海市建筑科学研究院 Digital instrument reading identification method, device, equipment and medium
CN111242298A (en) * 2019-12-31 2020-06-05 国网北京市电力公司 Training method and device for random network, storage medium and processor
CN113283414A (en) * 2021-07-26 2021-08-20 深圳市安软科技股份有限公司 Pedestrian attribute identification method, related equipment and computer readable storage medium

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112784985A (en) * 2021-01-29 2021-05-11 北京百度网讯科技有限公司 Training method and device of neural network model, and image recognition method and device
CN113111968A (en) * 2021-04-30 2021-07-13 北京大米科技有限公司 Image recognition model training method and device, electronic equipment and readable storage medium

Also Published As

Publication number Publication date
CN114067183A (en) 2022-02-18

Similar Documents

Publication Publication Date Title
CN114511758A (en) Image recognition method and device, electronic device and medium
CN112857268B (en) Object area measuring method, device, electronic equipment and storage medium
US20230052389A1 (en) Human-object interaction detection
US20230051232A1 (en) Human-object interaction detection
CN112749758A (en) Image processing method, neural network training method, device, equipment and medium
US20230047628A1 (en) Human-object interaction detection
CN113256583A (en) Image quality detection method and apparatus, computer device, and medium
CN114445667A (en) Image detection method and method for training image detection model
CN113656668A (en) Retrieval method, management method, device, equipment and medium of multi-modal information base
CN114443989B (en) Ranking method, training method and device of ranking model, electronic equipment and medium
CN116228867B (en) Pose determination method, pose determination device, electronic equipment and medium
CN114067183B (en) Neural network model training method, image processing method, device and equipment
CN116152607A (en) Target detection method, method and device for training target detection model
CN115578501A (en) Image processing method, image processing device, electronic equipment and storage medium
CN114494797A (en) Method and apparatus for training image detection model
CN114842476A (en) Watermark detection method and device and model training method and device
CN114429678A (en) Model training method and device, electronic device and medium
CN114140852A (en) Image detection method and device
CN114120416A (en) Model training method and device, electronic equipment and medium
CN114120420B (en) Image detection method and device
CN114118379B (en) Neural network training method, image processing method, device, equipment and medium
CN114140851B (en) Image detection method and method for training image detection model
CN113793290B (en) Parallax determining method, device, equipment and medium
CN115512131B (en) Image detection method and training method of image detection model
CN114219079A (en) Feature selection method and device, model training method and device, equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant