CN113627611A - Model training method and device, electronic equipment and storage medium


Info

Publication number
CN113627611A
Authority
CN
China
Prior art keywords
model
training
stage
parameter
sample data
Prior art date
Legal status
Pending
Application number
CN202110903615.5A
Other languages
Chinese (zh)
Inventor
徐华鹏
李坤
Current Assignee
Suzhou Keyun Laser Technology Co Ltd
Original Assignee
Suzhou Keyun Laser Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Suzhou Keyun Laser Technology Co Ltd
Priority to CN202110903615.5A
Publication of CN113627611A

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/08: Learning methods

Abstract

The embodiment of the application discloses a model training method and device, an electronic device and a storage medium. The method comprises: performing first-stage training on an initial model according to a first sample data set, determining a model depth parameter and first-stage model layer parameters, and updating the parameters of the initial model according to the model depth parameter and the first-stage model layer parameters to obtain an intermediate model; and performing second-stage training on the intermediate model according to a second sample data set, determining second-stage model layer parameters, and updating the parameters of the intermediate model according to the second-stage model layer parameters to obtain a target model. By training the initial model in two stages, the technical scheme provided by the embodiment of the application makes the accuracy of the target model higher.

Description

Model training method and device, electronic equipment and storage medium
Technical Field
The embodiment of the application relates to the technical field of computers, in particular to a model training method and device, electronic equipment and a storage medium.
Background
With the development of internet technology, deep learning is applied ever more widely. Model training methods in the prior art are simplistic, and the target models they produce have poor accuracy. A model training method that yields a target model with highly accurate output is therefore needed.
Disclosure of Invention
The embodiment of the application provides a model training method and device, electronic equipment and a storage medium, so as to rapidly obtain a target model.
In a first aspect, an embodiment of the present application provides a model training method, where the method includes:
performing first-stage training on an initial model according to a first sample data set, determining a model depth parameter and first-stage model layer parameters, and updating the parameters of the initial model according to the model depth parameter and the first-stage model layer parameters to obtain an intermediate model;
performing second-stage training on the intermediate model according to a second sample data set, determining second-stage model layer parameters, and updating the parameters of the intermediate model according to the second-stage model layer parameters to obtain a target model; wherein the first sample data set contains fewer samples than the second sample data set.
In a second aspect, an embodiment of the present application provides a model training apparatus, including:
the first training module is used for performing first-stage training on an initial model according to a first sample data set, determining a model depth parameter and first-stage model layer parameters, and updating the parameters of the initial model according to the model depth parameter and the first-stage model layer parameters to obtain an intermediate model;
the second training module is used for performing second-stage training on the intermediate model according to a second sample data set, determining second-stage model layer parameters, and updating the parameters of the intermediate model according to the second-stage model layer parameters to obtain a target model; wherein the first sample data set contains fewer samples than the second sample data set.
In a third aspect, an embodiment of the present application provides an electronic device, including:
one or more processors;
storage means for storing one or more programs;
the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the model training method of any embodiment of the present application.
In a fourth aspect, the embodiments of the present application provide a computer-readable storage medium, on which a computer program is stored, where the program, when executed by a processor, implements the model training method according to any of the embodiments of the present application.
The embodiment of the application provides a model training method and device, an electronic device and a storage medium. First-stage training is performed on an initial model according to a first sample data set, a model depth parameter and first-stage model layer parameters are determined, and the parameters of the initial model are updated according to them to obtain an intermediate model; second-stage training is then performed on the intermediate model according to a second sample data set, second-stage model layer parameters are determined, and the parameters of the intermediate model are updated according to them to obtain a target model, wherein the first sample data set contains fewer samples than the second sample data set. By executing this scheme and training the initial model in two stages, the accuracy of the target model is made higher.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present application, nor do they limit the scope of the present application. Other features of the present application will become apparent from the following description.
Drawings
The drawings are included to provide a better understanding of the present solution and are not intended to limit the present application. Wherein:
fig. 1 is a first flowchart of a model training method according to the first embodiment of the present application;
fig. 2A is a second flowchart of a model training method according to the second embodiment of the present application;
fig. 2B is a schematic diagram of a first image of a model training method according to a second embodiment of the present application;
fig. 2C is a schematic diagram of a second image of a model training method according to a second embodiment of the present application;
fig. 2D is a schematic diagram of a third image of a model training method according to the second embodiment of the present application;
fig. 3 is a schematic structural diagram of a model training apparatus according to a third embodiment of the present application;
FIG. 4 is a block diagram of an electronic device for implementing a model training method according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be described clearly and completely below with reference to the drawings. The described embodiments are some, but not all, of the embodiments of the present application. All other embodiments obtained by a person skilled in the art from the embodiments given herein without creative effort shall fall within the protection scope of the present application.
Example one
Fig. 1 is a first flowchart of a model training method according to the first embodiment of the present application. The embodiment is applicable to the case where a model is trained using a sample data set to obtain a target model. The model training method provided by this embodiment may be performed by the model training apparatus provided by the embodiments of the present application, which may be implemented in software and/or hardware and integrated into the electronic device that executes the method.
Referring to fig. 1, the method of the present embodiment includes, but is not limited to, the following steps:
s110, according to the first sample data set, performing first-stage training on the initial model, determining a model depth parameter and a first-stage model layer parameter, and updating the initial model according to the model depth parameter and the first-stage model layer parameter to obtain an intermediate model.
The first sample data set is the data required for training the initial model. For example, if the initial model is to perform the task of calibrating circuit structures in images, the first sample data set of this embodiment may be a set of images of liquid crystal panel microcircuit structures. Optionally, the initial model may be any network model that needs to be trained, for example a semantic segmentation Unet model. The model depth parameter refers to the number of layers of the model. The model layer parameters include the scale of the parameters inside the model, the rate at which neural network units are discarded (i.e., Dropout), the batch size (i.e., Batch_size), and so on.
In this embodiment of the application, training data sets covering the different conditions required for training are collected according to the function the model to be trained is to implement. For example, if that function is calibrating circuit structures in images, the collected training data may be images of different liquid crystal panel microcircuit structures. The collected training data (such as images of different liquid crystal panel microcircuit structures) are preprocessed to obtain the first sample data set, and the initial model is trained according to the first sample data set to obtain an intermediate model.
Optionally, in this step, the specific process of performing first-stage training on the initial model according to the first sample data set, determining the model depth parameter and the first-stage model layer parameters, and updating the parameters of the initial model according to them to obtain the intermediate model can be implemented through the following two substeps:
s1101, according to the first sample data set, carrying out model depth training on the initial model, determining a model depth parameter, and updating the model depth of the initial model according to the model depth parameter to obtain an initial intermediate model.
In an embodiment of the present application, the first sample data set comprises a training data set and a test data set. When the training data set alone is used for iterative model depth training, the accuracy of the initial model approaches one hundred percent as the model is continuously deepened; the model becomes over-specialized and its predictions lose generality. Therefore, the first-stage training adopts a train-while-test mode: after each round of iterative model depth training on the training data set, the resulting model is tested on the test data.
In the embodiment of the present application, the specific process of performing model depth training on the initial model according to the first sample data set to determine the model depth parameter is as follows: perform iterative model depth training on the initial model using the first sample data set to obtain the model depth corresponding to each iterative training; test the model after each iterative training using the test data and determine its error rate; and take the model depth corresponding to the iterative training with the lowest error rate as the model depth parameter. The obtained model depth parameter is then used as the model depth of the initial model, yielding the initial intermediate model. Testing the model after each iterative training and selecting the depth with the lowest error rate prevents overfitting during model training.
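For illustration only, the following minimal sketch reproduces this depth search on synthetic data, assuming PyTorch and a plain stack of convolutional blocks as a stand-in for the depth-adjustable network; the data, the candidate depth range and the training budget are illustrative assumptions and do not come from this application.

```python
# A minimal sketch of the first-stage depth search, assuming PyTorch and a
# plain stack of convolutional blocks as a stand-in for the depth-adjustable
# network; the synthetic data and candidate depths are illustrative only.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Synthetic stand-in for the (small) first sample data set:
# single-channel 32x32 images with per-pixel binary labels.
X_train = torch.randn(64, 1, 32, 32)
y_train = (X_train > 0).float()
X_test = torch.randn(16, 1, 32, 32)
y_test = (X_test > 0).float()

def make_model(depth: int) -> nn.Module:
    """Build a network whose number of conv blocks is the depth parameter."""
    layers = []
    for _ in range(depth):
        layers += [nn.Conv2d(1, 1, 3, padding=1), nn.ReLU()]
    layers += [nn.Conv2d(1, 1, 3, padding=1), nn.Sigmoid()]
    return nn.Sequential(*layers)

def error_rate(model: nn.Module, X: torch.Tensor, y: torch.Tensor) -> float:
    """Fraction of mispredicted pixels on the test data."""
    with torch.no_grad():
        return ((model(X) > 0.5).float() != y).float().mean().item()

best_depth, best_err = None, float("inf")
for depth in range(1, 8):                       # candidate model depths
    model = make_model(depth)
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.BCELoss()
    for _ in range(20):                         # iterative depth training
        optimizer.zero_grad()
        loss = loss_fn(model(X_train), y_train)
        loss.backward()
        optimizer.step()
    err = error_rate(model, X_test, y_test)     # test after each training
    if err < best_err:                          # keep the lowest-error depth
        best_depth, best_err = depth, err

print(f"model depth parameter: {best_depth} (test error rate {best_err:.3f})")
```

Selecting the depth by test error rather than training error is what implements the train-while-test mode described above: the training error alone would keep decreasing as the model deepens.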
S1102: perform layer parameter training on the initial intermediate model according to the first sample data set, determine the first-stage model layer parameters, and update the model layer parameters of the initial intermediate model according to the first-stage model layer parameters to obtain the final intermediate model.
In the embodiment of the present application, step S1101 above performs model depth training on the initial model and determines its model depth, yielding the initial intermediate model. On this basis, layer parameter training is performed on the initial intermediate model according to the first sample data set to determine the first-stage model layer parameters.
In the embodiment of the present application, the specific process of performing layer parameter training on the initial intermediate model according to the first sample data set to determine the first-stage model layer parameters is as follows: determine a model layer parameter orthogonal table according to the types of the model layer parameters; perform layer parameter training on the initial intermediate model according to the first sample data set and the model layer parameter orthogonal table, record the error rate of each training run, and take the layer parameters corresponding to the training run with the lowest error rate as the first-stage model layer parameters.
For example, the model layer parameter orthogonal table may be determined as follows. Assume three types of model layer parameters: the scale of the model's internal parameters, the Dropout ratio, and the batch size. First, ten values are selected within the value range of each model layer parameter; the values may be chosen at equal intervals. The ten values selected for each parameter are then permuted and combined, finally yielding one thousand combinations, which constitute the model layer parameter orthogonal table.
In the embodiment of the application, the sample images in the first sample data set are input into the initial intermediate model in sequence according to the determined model layer parameter orthogonal table, the layer parameters of the initial intermediate model are trained, the error rate of each training run is recorded, and a three-dimensional plot of error rate against iteration count and model layer parameters is drawn. The layer parameters corresponding to the lowest error rate in this plot are taken as the first-stage model layer parameters.
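For illustration only, the sketch below builds the thousand-row layer parameter table from ten equally spaced values of each of the three parameter types and keeps the lowest-error row; evaluate() is a hypothetical placeholder for one training run of the initial intermediate model on the first sample data set, and the value ranges are illustrative assumptions, not values from this application.

```python
# A minimal sketch of building the model layer parameter table and selecting
# the lowest-error row; evaluate() is a hypothetical placeholder for one
# training run on the first sample data set.
import itertools
import random

parameter_scales = [round(0.1 * i, 1) for i in range(1, 11)]   # internal parameter scale
dropout_rates = [round(0.05 * i, 2) for i in range(1, 11)]     # Dropout ratio
batch_sizes = [8 * i for i in range(1, 11)]                    # Batch_size

# Permuting and combining the ten values of each of the three parameters
# yields the 10 x 10 x 10 = 1000 rows of the table.
table = list(itertools.product(parameter_scales, dropout_rates, batch_sizes))
assert len(table) == 1000

def evaluate(scale: float, dropout: float, batch_size: int) -> float:
    """Placeholder for one layer-parameter training run; returns its error rate."""
    random.seed(hash((scale, dropout, batch_size)))  # fake but reproducible
    return random.random()

# Record the error rate of every run and keep the lowest-error combination
# as the first-stage model layer parameters.
error_rates = {row: evaluate(*row) for row in table}
best_row = min(error_rates, key=error_rates.get)
print("first-stage model layer parameters (scale, Dropout, Batch_size):", best_row)
```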
S120: perform second-stage training on the intermediate model according to the second sample data set, determine the second-stage model layer parameters, and update the parameters of the intermediate model according to the second-stage model layer parameters to obtain the target model.
The second sample data set is the data required for the second-stage training of the intermediate model. For example, if the intermediate model is to perform the task of calibrating circuit structures in images, the second sample data set is of the same kind as the first sample data set. Optionally, if the function the trained model is to implement is identifying the circuit structure in images of liquid crystal panel microcircuit structures, the first and second sample data sets are both image sets of liquid crystal panel microcircuit structures. The first sample data set contains fewer samples than the second sample data set. Optionally, the first sample data set may or may not be a subset of the second sample data set; this embodiment does not limit this. Correspondingly, if the first and second sample data sets are image sets of liquid crystal panel microcircuit structures, the finally obtained target model is used to identify the circuit structure in images of liquid crystal panel microcircuit structures.
In the embodiment of the present application, step S110 above performs the first-stage training on the initial model according to the first sample data set and determines the model depth parameter and the first-stage model layer parameters used to update the initial model. The first-stage training uses only a small number of samples so that the model depth value and rough model layer parameters can be obtained quickly; updating them in the initial model yields the intermediate model. In this step, the second-stage training is performed on the intermediate model with the larger second sample data set to obtain a more accurate target model. The specific process of the second-stage training is as follows: iteratively train the intermediate model according to the second sample data set and judge whether the output of the intermediate model meets the expected effect; if it does, stop the iterative training to obtain the target model; if it does not, fine-tune the model layer parameters until the output of the intermediate model meets the expected effect, thereby obtaining the target model. Optionally, the model layer parameters may be fine-tuned with an orthogonal table, in the same way the first-stage model layer parameters were determined.
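For illustration only, the loop below sketches this second-stage procedure; train_one_round(), meets_expectation() and fine_tune_layer_parameters() are hypothetical stand-ins (not from this application) for one round of iterative training on the second sample data set, the expected-effect check, and the orthogonal-table fine-tuning described above.

```python
# A minimal sketch of the second-stage loop; the three helpers are
# hypothetical stand-ins whose stub bodies only make the sketch runnable.
from typing import Any

def train_one_round(model: Any, dataset: Any) -> None:
    """Stub: one round of iterative training on the second sample data set."""

def meets_expectation(model: Any, dataset: Any) -> bool:
    """Stub: whether the model output meets the expected effect."""
    return True

def fine_tune_layer_parameters(model: Any) -> None:
    """Stub: re-run the orthogonal-table search around the current values."""

def second_stage_training(model: Any, dataset: Any, max_rounds: int = 10) -> Any:
    for _ in range(max_rounds):
        train_one_round(model, dataset)           # iterative training
        if meets_expectation(model, dataset):     # expected effect reached?
            break                                 # stop: this is the target model
        fine_tune_layer_parameters(model)         # otherwise fine-tune and retry
    return model

target_model = second_stage_training(model=None, dataset=None)
```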
According to the technical scheme provided by this embodiment, first-stage training is performed on the initial model according to the first sample data set, the model depth parameter and the first-stage model layer parameters are determined, and the parameters of the initial model are updated according to them to obtain an intermediate model; second-stage training is then performed on the intermediate model according to the second sample data set, the second-stage model layer parameters are determined, and the parameters of the intermediate model are updated according to them to obtain the target model. The two-stage model training yields a deep learning model that meets the expected requirements and can accurately identify the microcircuit structure of a liquid crystal panel. By executing this scheme and training the initial model in two stages, the accuracy of the target model is made higher.
Example two
Fig. 2A is a second flowchart of a model training method according to the second embodiment of the present application; figs. 2B, 2C and 2D are schematic diagrams of a first, second and third image, respectively, of the model training method according to the second embodiment. This embodiment is optimized on the basis of the above embodiment and explains in detail how the sample data set is determined and how, when a target model for calibrating circuit structures in images is trained and executed, the target image labeled with the circuit structure is obtained.
Referring to fig. 2A, the method of the present embodiment includes, but is not limited to, the following steps:
s210, extracting at least two image areas to be calibrated containing the liquid crystal panel microcircuit structure from the original circuit image.
In the embodiment of the application, the image areas to be calibrated are obtained by randomly extracting areas containing different features of the liquid crystal panel microcircuit structure from original circuit images of different colors and different illumination intensities. The final image areas to be calibrated can therefore cover different colors, illumination intensities, features and orientations (positions and angles). This makes the sample images more varied, so that the target model trained on them achieves good recognition accuracy on images under different illumination intensities and with small color differences.
S220: preprocess the at least two image areas to be calibrated, calibrate the circuit structure in the preprocessed images to obtain sample images, and add the sample images to the first sample data set.
The at least two image areas to be calibrated correspond to areas of the original circuit image with different features and/or different orientations.
In the embodiment of the application, because the images to be calibrated have different orientations and different sizes, the image areas to be calibrated are preprocessed; the preprocessing includes adjusting the size and rotation angle of each image and normalizing it. The circuit structures in the preprocessed images are then calibrated manually, and the calibrated images are added to the first sample data set as sample images. In fig. 2B, the two left diagrams S1 and S3 are image areas to be calibrated, containing the liquid crystal panel microcircuit structure, that were extracted from the original circuit image, and the two right diagrams S2 and S4 are the sample images obtained by preprocessing S1 and S3 and calibrating their circuit structures.
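For illustration only, a minimal sketch of this preprocessing with Pillow and NumPy follows; the 256x256 target size, the rotation angle and the zero-mean/unit-variance normalization are illustrative assumptions, not values from this application.

```python
# A minimal sketch of the preprocessing step (size, rotation angle,
# normalization), assuming Pillow and NumPy; all concrete values here are
# illustrative choices.
import numpy as np
from PIL import Image

def preprocess(path: str, size=(256, 256), angle: float = 0.0) -> np.ndarray:
    img = Image.open(path).convert("L")             # load as grayscale
    img = img.rotate(angle, expand=True)            # adjust the rotation angle
    img = img.resize(size)                          # adjust the size
    arr = np.asarray(img, dtype=np.float32)
    return (arr - arr.mean()) / (arr.std() + 1e-8)  # normalize

# Example (hypothetical file name):
# sample = preprocess("region_to_calibrate.png", angle=15.0)
```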
S230: perform first-stage training on the initial model according to the first sample data set, determine the model depth parameter and the first-stage model layer parameters, and update the parameters of the initial model according to the model depth parameter and the first-stage model layer parameters to obtain an intermediate model.
S240: perform second-stage training on the intermediate model according to the second sample data set, determine the second-stage model layer parameters, and update the parameters of the intermediate model according to the second-stage model layer parameters to obtain the target model; wherein the first sample data set contains fewer samples than the second sample data set. Optionally, the first sample data set and the second sample data set are image sets of the liquid crystal panel microcircuit structure.
S250: input the image to be processed containing the liquid crystal panel microcircuit structure into the target model to obtain a target image labeled with the circuit structure.
In the embodiment of the application, the initial model is trained in two stages through the above steps to obtain the target model. An image to be processed containing a liquid crystal panel microcircuit structure is input into the target model, which identifies the microcircuit structure in the image and outputs a target image labeled with the microcircuit structure. In figs. 2C and 2D, S5 and S8 are output images of the model, S6 and S9 are its input images, and S7 and S10 are the desired output images. The model in fig. 2C is the untrained initial model; its output image is very blurred and the effect is very poor. The model in fig. 2D is the target model obtained through the above steps; compared with fig. 2C, its output image is very good and close to the desired output image.
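For illustration only, the following minimal sketch runs such a target model on an image to be processed, under the same PyTorch assumptions as the earlier sketches; the file names and the 0.5 threshold are hypothetical.

```python
# A minimal inference sketch, assuming the target model was trained and
# saved with PyTorch; "target_model.pt", "panel_circuit.png" and the 0.5
# threshold are hypothetical.
import numpy as np
import torch
from PIL import Image

# weights_only=False because a full pickled model (not a state dict) is assumed
model = torch.load("target_model.pt", weights_only=False)
model.eval()

img = Image.open("panel_circuit.png").convert("L").resize((256, 256))
x = torch.from_numpy(np.asarray(img, dtype=np.float32) / 255.0)[None, None]

with torch.no_grad():
    mask = (model(x) > 0.5).squeeze().numpy()   # pixels labeled as circuit

Image.fromarray((mask * 255).astype(np.uint8)).save("labeled_circuit.png")
```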
According to the technical scheme provided by this embodiment, image areas to be calibrated are extracted from the original circuit image and preprocessed; the circuit structures in the preprocessed images are calibrated to obtain sample images, which are added to the first sample data set; first-stage training is performed on the initial model according to the first sample data set, the model depth parameter and the first-stage model layer parameters are determined, and the parameters of the initial model are updated to obtain an intermediate model; second-stage training is performed on the intermediate model according to the second sample data set, the second-stage model layer parameters are determined, and the parameters of the intermediate model are updated to obtain the target model; finally, the image to be processed is input into the target model to obtain a target image labeled with the circuit structure. Because the image areas to be calibrated are extracted randomly from the original circuit image, the sample images are more varied, and the target model trained on them achieves good recognition accuracy on images with small color differences under different illumination intensities. By training the initial model in two stages to obtain the target model, the scheme avoids the cost of the extensive manual calibration and circuit structure division required in the prior art, and by executing it the microcircuit structure in the liquid crystal panel can be identified more accurately.
Example three
Fig. 3 is a schematic structural diagram of a model training apparatus according to the third embodiment of the present application. As shown in fig. 3, the apparatus 300 may include:
the first training module 310 is configured to perform a first-stage training on an initial model according to a first sample data set, determine a model depth parameter and a first-stage model layer parameter, and perform parameter updating on the initial model according to the model depth parameter and the first-stage model layer parameter to obtain an intermediate model.
The second training module 320 is configured to perform second-stage training on the intermediate model according to a second sample data set, determine second-stage model layer parameters, and update the parameters of the intermediate model according to the second-stage model layer parameters to obtain a target model; wherein the first sample data set contains fewer samples than the second sample data set.
Further, the first training module 310 may be specifically configured to: according to the first sample data set, carrying out model depth training on an initial model, determining a model depth parameter, and updating the model depth of the initial model according to the model depth parameter to obtain an initial intermediate model; and according to the first sample data set, carrying out layer parameter training on the initial intermediate model, determining the model layer parameters of the first stage, and updating the model layer parameters of the initial intermediate model according to the model layer parameters of the first stage to obtain the final intermediate model.
Further, the first training module 310 may be specifically configured to: performing model depth iterative training on the initial model by adopting a first sample data set to obtain model depth corresponding to each iterative training; testing the initial model after each iterative training by adopting test data, and determining the error rate of the model after each iterative training; and taking the model depth corresponding to the iterative training with the lowest error rate as the model depth parameter.
Further, the first training module 310 may be specifically configured to: determining a model layer parameter orthogonal table according to the type of the model layer parameter; and performing layer parameter training on the initial intermediate model according to the orthogonality of the first sample data set and the model layer parameters, and determining the model layer parameters of the first stage.
Optionally, at least two image areas to be calibrated containing the liquid crystal panel microcircuit structure are extracted from the original circuit image; the at least two image areas to be calibrated are preprocessed, the circuit structures in the preprocessed images are calibrated to obtain sample images, and the sample images are added to the first sample data set.
Optionally, the first sample data set and the second sample data set are image sets of the liquid crystal panel microcircuit structure, and the target model is used to identify the circuit structure in images of the liquid crystal panel microcircuit structure.
Optionally, the at least two image areas to be calibrated correspond to areas of the original circuit image with different features and/or different orientations.
Optionally, the image to be processed containing the liquid crystal panel microcircuit structure is input into the target model to obtain a target image labeled with the circuit structure.
The model training apparatus provided by this embodiment can perform the model training method provided by any embodiment of the present application and has the corresponding functions and beneficial effects.
Example four
Fig. 4 is a schematic structural diagram of an electronic device according to the fourth embodiment of the present application; it shows a block diagram of an exemplary electronic device suitable for implementing embodiments of the present application. The electronic device shown in fig. 4 is only an example and should not impose any limitation on the functions and scope of use of the embodiments. The electronic device can be a smart phone, a tablet computer, a notebook computer, a vehicle-mounted terminal, a wearable device, and the like.
As shown in fig. 4, electronic device 400 is embodied in the form of a general purpose computing device. The components of electronic device 400 may include, but are not limited to: one or more processors or processing units 416, a memory 428, and a bus 418 that couples the various system components including the memory 428 and the processing unit 416.
Bus 418 represents one or more of any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, such architectures include, but are not limited to, the Industry Standard Architecture (ISA) bus, the Micro Channel Architecture (MCA) bus, the Enhanced ISA (EISA) bus, the Video Electronics Standards Association (VESA) local bus, and the Peripheral Component Interconnect (PCI) bus.
Electronic device 400 typically includes a variety of computer system readable media. Such media may be any available media that is accessible by electronic device 400 and includes both volatile and nonvolatile media, removable and non-removable media.
Memory 428 can include computer system readable media in the form of volatile memory, such as Random Access Memory (RAM) 430 and/or cache memory 432. The electronic device 400 may further include other removable/non-removable, volatile/nonvolatile computer system storage media. By way of example only, storage system 434 may be used to read from and write to non-removable, nonvolatile magnetic media (not shown in fig. 4, commonly referred to as a "hard drive"). Although not shown in fig. 4, a magnetic disk drive for reading from and writing to a removable, nonvolatile magnetic disk (e.g., a "floppy disk") and an optical disk drive for reading from or writing to a removable, nonvolatile optical disk (e.g., a CD-ROM, DVD-ROM, or other optical media) may be provided. In these cases, each drive may be connected to bus 418 by one or more data media interfaces. Memory 428 may include at least one program product having a set (e.g., at least one) of program modules that are configured to carry out the functions of the embodiments of the present application.
A program/utility 440 having a set (at least one) of program modules 442 may be stored, for instance, in memory 428. Such program modules 442 include, but are not limited to, an operating system, one or more application programs, other program modules, and program data; each of these examples, or some combination thereof, may comprise an implementation of a network environment. Program modules 442 generally perform the functions and/or methodologies of the embodiments described herein.
The electronic device 400 may also communicate with one or more external devices 414 (e.g., keyboard, pointing device, display 424, etc.), with one or more devices that enable a user to interact with the electronic device 400, and/or with any devices (e.g., network card, modem, etc.) that enable the electronic device 400 to communicate with one or more other computing devices. Such communication may occur via input/output (I/O) interfaces 422. Also, electronic device 400 may communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network such as the Internet) through network adapter 420. As shown in FIG. 4, network adapter 420 communicates with the other modules of electronic device 400 over bus 418. It should be appreciated that although not shown in FIG. 4, other hardware and/or software modules may be used in conjunction with electronic device 400, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, among others.
The processing unit 416 executes programs stored in the memory 428 to perform various functional applications and data processing, for example implementing the model training method provided by any embodiment of the present application.
Example five
The fifth embodiment of the present application further provides a computer-readable storage medium on which a computer program (also referred to as computer-executable instructions) is stored; when executed by a processor, the program can be used to perform the model training method provided by any of the above embodiments of the present application.
Computer storage media for embodiments of the invention may employ any combination of one or more computer-readable media. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for embodiments of the present application may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the embodiments of the present invention have been described in more detail through the above embodiments, the embodiments of the present invention are not limited to the above embodiments, and many other equivalent embodiments may be included without departing from the spirit of the present invention, and the scope of the present invention is determined by the scope of the appended claims.

Claims (11)

1. A method of model training, the method comprising:
performing first-stage training on an initial model according to a first sample data set, determining a model depth parameter and first-stage model layer parameters, and updating the parameters of the initial model according to the model depth parameter and the first-stage model layer parameters to obtain an intermediate model;
performing second-stage training on the intermediate model according to a second sample data set, determining second-stage model layer parameters, and updating the parameters of the intermediate model according to the second-stage model layer parameters to obtain a target model; wherein the first sample data set contains fewer samples than the second sample data set.
2. The method of claim 1, wherein performing the first-stage training on the initial model according to the first sample data set, determining the model depth parameter and the first-stage model layer parameters, and updating the parameters of the initial model according to the model depth parameter and the first-stage model layer parameters to obtain the intermediate model comprises:
performing model depth training on the initial model according to the first sample data set, determining the model depth parameter, and updating the model depth of the initial model according to the model depth parameter to obtain an initial intermediate model;
and performing layer parameter training on the initial intermediate model according to the first sample data set, determining the first-stage model layer parameters, and updating the model layer parameters of the initial intermediate model according to the first-stage model layer parameters to obtain the final intermediate model.
3. The method of claim 2, wherein performing the model depth training on the initial model according to the first sample data set to determine the model depth parameter comprises:
performing iterative model depth training on the initial model using the first sample data set to obtain the model depth corresponding to each iterative training;
testing the model after each iterative training using test data, and determining the error rate of the model after each iterative training;
and taking the model depth corresponding to the iterative training with the lowest error rate as the model depth parameter.
4. The method of claim 2, wherein performing the layer parameter training on the initial intermediate model according to the first sample data set to determine the first-stage model layer parameters comprises:
determining a model layer parameter orthogonal table according to the types of the model layer parameters;
and performing layer parameter training on the initial intermediate model according to the first sample data set and the model layer parameter orthogonal table, and determining the first-stage model layer parameters.
5. The method of claim 1, further comprising:
extracting at least two image areas to be calibrated containing a liquid crystal panel microcircuit structure from an original circuit image;
preprocessing the at least two image areas to be calibrated, calibrating the circuit structure in the preprocessed images to obtain sample images, and adding the sample images to the first sample data set.
6. The method according to any of claims 1-5, wherein the first sample data set and the second sample data set are image sets of liquid crystal panel microcircuit structures; the target model is used to identify circuit structures in images of the liquid crystal panel microcircuit structure.
7. The method according to claim 5, wherein the at least two image areas to be calibrated correspond to areas of the original circuit image with different features and/or different orientations.
8. The method of claim 5, further comprising:
and inputting the image to be processed containing the liquid crystal panel microcircuit structure into the target model to obtain a target image labeled with the circuit structure.
9. A model training apparatus, the apparatus comprising:
the first training module is used for performing first-stage training on an initial model according to a first sample data set, determining a model depth parameter and first-stage model layer parameters, and updating the parameters of the initial model according to the model depth parameter and the first-stage model layer parameters to obtain an intermediate model;
the second training module is used for performing second-stage training on the intermediate model according to a second sample data set, determining second-stage model layer parameters, and updating the parameters of the intermediate model according to the second-stage model layer parameters to obtain a target model; wherein the first sample data set contains fewer samples than the second sample data set.
10. An electronic device, characterized in that the electronic device comprises:
one or more processors;
storage means for storing one or more programs;
the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the model training method of any one of claims 1-8.
11. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the model training method according to any one of claims 1 to 8.
CN202110903615.5A, filed 2021-08-06 (priority date 2021-08-06): Model training method and device, electronic equipment and storage medium. Status: Pending. Published as CN113627611A.

Priority Applications (1)

Application Number: CN202110903615.5A; Priority Date: 2021-08-06; Filing Date: 2021-08-06; Title: Model training method and device, electronic equipment and storage medium

Publications (1)

Publication Number: CN113627611A; Publication Date: 2021-11-09

Family ID: 78383326

Patent Citations (4)

* Cited by examiner, † Cited by third party

CN108009525A (en) *, priority 2017-12-25, published 2018-05-08, 北京航空航天大学: A kind of specific objective recognition methods over the ground of the unmanned plane based on convolutional neural networks
US20200151514A1 *, priority 2018-11-09, published 2020-05-14, Canon Kabushiki Kaisha: Training and application method of neural network model, apparatus, system and storage medium
US20200160212A1 *, priority 2018-11-21, published 2020-05-21, Korea Advanced Institute Of Science And Technology: Method and system for transfer learning to random target dataset and model structure based on meta learning
CN112906865A (en) *, priority 2021-02-19, published 2021-06-04, 深圳大学: Neural network architecture searching method and device, electronic equipment and storage medium

Cited By (2)

CN113836566A (en) *, priority 2021-11-26, published 2021-12-24, 腾讯科技(深圳)有限公司: Model processing method, device, equipment and medium based on block chain system
CN113836566B (en) *, priority 2021-11-26, published 2022-03-29, 腾讯科技(深圳)有限公司: Model processing method, device, equipment and medium based on block chain system


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination