WO2022042123A1 - Image recognition model generation method and apparatus, computer device and storage medium

Image recognition model generation method and apparatus, computer device and storage medium

Info

Publication number: WO2022042123A1
Authority: WO (WIPO (PCT))
Prior art keywords: image, sample, loss value, sample image, recognition model
Application number: PCT/CN2021/106635
Priority date: 2020-08-25
Filing date: 2021-07-16
Other languages: English (en), Chinese (zh)
Inventors: 崔洁全, 刘枢, 田倬韬, 贾佳亚
Original Assignees: 深圳思谋信息科技有限公司, 上海思谋科技有限公司
Application filed by: 深圳思谋信息科技有限公司, 上海思谋科技有限公司
Priority to: JP2022564577A (JP7376731B2)
Publication of: WO2022042123A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent

Definitions

  • the present application relates to an image recognition model generation method, apparatus, computer equipment and storage medium.
  • a first aspect of the present application provides a method for generating an image recognition model, the method comprising:
  • the sample image set includes a plurality of sample image subsets of which the number of images decreases sequentially, and the multiple sample image subsets all contain the same number of image categories;
  • the image recognition model to be trained is trained to obtain the loss value of the image recognition model to be trained;
  • the image recognition model to be trained includes a plurality of branch neural networks, and each branch neural network is used to identify the corresponding images;
  • the loss value includes a target classification loss value and a classification loss value corresponding to each of the branch neural networks; the target classification loss value is the loss value of the image recognition model to be trained for the sample image set, and the classification loss value is the loss value of the corresponding branch neural network for the sample image subset corresponding to that branch neural network;
  • the model parameters of the image recognition model to be trained are adjusted according to the loss value until the loss value is lower than a preset threshold, and the image recognition model to be trained is then taken as the trained image recognition model.
  • a second aspect of the present application provides an apparatus for generating an image recognition model, the apparatus comprising:
  • an acquisition module configured to acquire a sample image set;
  • the sample image set includes a plurality of sample image subsets whose number of images decreases sequentially, and the multiple sample image subsets all contain the same number of image categories;
  • the training module is used to train the image recognition model to be trained according to the sample image set, and obtain the loss value of the image recognition model to be trained;
  • the image recognition model to be trained includes a plurality of branch neural networks, and each branch neural network is used to identify the corresponding images;
  • the loss value includes a target classification loss value and a classification loss value corresponding to each of the branch neural networks; the target classification loss value is the loss value of the image recognition model to be trained for the sample image set, and the classification loss value is the loss value of the corresponding branch neural network for the sample image subset corresponding to that branch neural network;
  • an adjustment module configured to adjust the model parameters of the image recognition model to be trained according to the loss value until the loss value is lower than a preset threshold, and then use the image recognition model to be trained as the trained image recognition model.
  • a third aspect of the present application provides a computer device, comprising a memory and a processor, the memory stores a computer program, and the processor implements the following steps when executing the computer program:
  • the sample image set includes a plurality of sample image subsets of which the number of images decreases sequentially, and the multiple sample image subsets all contain the same number of image categories;
  • the image recognition model to be trained is trained to obtain the loss value of the image recognition model to be trained;
  • the image recognition model to be trained includes a plurality of branch neural networks, and each branch neural network is used to identify the corresponding images;
  • the loss value includes a target classification loss value and a classification loss value corresponding to each of the branch neural networks; the target classification loss value is the loss value of the image recognition model to be trained for the sample image set, and the classification loss value is the loss value of the corresponding branch neural network for the sample image subset corresponding to that branch neural network;
  • the model parameters of the image recognition model to be trained are adjusted according to the loss value until the loss value is lower than a preset threshold, and the image recognition model to be trained is then taken as the trained image recognition model.
  • a fourth aspect of the present application provides a computer-readable storage medium on which a computer program is stored, and when the computer program is executed by a processor, the following steps are implemented:
  • the sample image set includes a plurality of sample image subsets of which the number of images decreases sequentially, and the multiple sample image subsets all contain the same number of image categories;
  • the image recognition model to be trained is trained to obtain the loss value of the image recognition model to be trained;
  • the image recognition model to be trained includes a plurality of branch neural networks, and each branch neural network is used to identify the corresponding images;
  • the loss value includes a target classification loss value and a classification loss value corresponding to each of the branch neural networks; the target classification loss value is the loss value of the image recognition model to be trained for the sample image set, and the classification loss value is the loss value of the corresponding branch neural network for the sample image subset corresponding to that branch neural network;
  • the model parameters of the image recognition model to be trained are adjusted according to the loss value until the loss value is lower than a preset threshold, and the image recognition model to be trained is then taken as the trained image recognition model.
  • FIG. 1 is an application environment diagram of a method for generating an image recognition model in one embodiment.
  • FIG. 2 is a schematic flowchart of a method for generating an image recognition model in one embodiment.
  • FIG. 3 is a schematic structural diagram of a branched neural network in one embodiment.
  • FIG. 4 is a schematic flowchart of a step of training an image recognition model to be trained to obtain a loss value in one embodiment.
  • FIG. 5 is a schematic flowchart of a step of determining a loss value of an image recognition model to be trained in one embodiment.
  • FIG. 6 is a schematic flowchart of a method for obtaining a sample image subset and a sample image set in one embodiment.
  • FIG. 7 is a structural block diagram of an apparatus for generating an image recognition model in an embodiment.
  • FIG. 8 is a diagram of the internal structure of a computer device in one embodiment.
  • the result is usually that the neural network recognizes well only the small number of categories that have more image data, while for the majority of categories that have less image data the accuracy of the image recognition model is poor; it can be seen that if the long-tailed distribution characteristics are ignored when generating the image recognition model, the performance of the image recognition model will be greatly reduced in actual use. Therefore, with existing image recognition model generation methods, the recognition effect of the obtained image recognition model is still poor.
  • the image recognition model generation method provided in this application can be applied to the application environment shown in FIG. 1 .
  • the terminal 11 communicates with the server 12 through the network.
  • the server 12 obtains the sample image set sent by the terminal 11 through the network; the sample image set includes a plurality of sample image subsets whose number of images decreases in turn, and the multiple sample image subsets all contain the same number of image categories; according to the sample image set, the server 12 trains the image recognition model to be trained and obtains the loss value of the image recognition model to be trained;
  • the image recognition model to be trained includes a plurality of branch neural networks, and each branch neural network is used to identify the corresponding images;
  • the loss value includes the target classification loss value and the classification loss value corresponding to each branch neural network; the target classification loss value is the loss value of the image recognition model to be trained for the sample image set, and the classification loss value is the loss value of the corresponding branch neural network for the sample image subset corresponding to that branch neural network; the server 12 then adjusts the model parameters of the image recognition model to be trained according to the loss value until the loss value is lower than a preset threshold, and takes the image recognition model to be trained as the trained image recognition model.
  • the terminal 11 can send the image to be recognized to the server 12 and obtain the recognition result returned by the server 12 .
  • the terminal 11 can be, but is not limited to, various personal computers, notebook computers, smart phones, tablet computers and portable wearable devices, and the server 12 can be implemented by an independent server or a server cluster composed of multiple servers.
  • a method for generating an image recognition model is provided, and the method is applied to the server 12 in FIG. 1 as an example for description, including the following steps:
  • Step 21 Obtain a sample image set; the sample image set includes a plurality of sample image subsets of which the number of images decreases sequentially, and the multiple sample image subsets all contain the same number of image categories.
  • the sample image set is a data set containing all sample images and consists of multiple sample image subsets; each sample image subset contains sample images of one or more image categories, and the image categories contained in different sample image subsets are different; in addition, the total number of images contained in each sample image subset is not the same, and the numbers decrease in turn.
  • For example, suppose there are six image categories A to F whose numbers of images decrease in turn, the smallest of which contains 10 images.
  • image categories A and B can form a sample image subset containing 180 sample images;
  • image categories C and D can form a sample image subset containing 100 sample images;
  • image categories E and F can form a sample image subset containing 30 sample images. It can be seen that the numbers of images in the three sample image subsets decrease sequentially, and the subsets contain the same number of image categories.
  • the server can directly obtain a sample image set including a plurality of sample image subsets with decreasing numbers of images from the terminal; it can also obtain a large number of sample images from the terminal and classify the sample images according to the image categories corresponding to the sample images, to obtain a sample image set including a plurality of sample image subsets whose number of images decreases sequentially.
  • the sample image set can be composed of sample images that conform to the characteristics of long-tailed distribution (that is, the number of images in a small number of image categories is large, while the number of images in most image categories is small), or it can be composed of sample images that conform to the characteristics of normal distribution.
  • the class distribution characteristics of the sample images in the sample image set are not limited here.
  • the preprocessing of the sample images is realized by acquiring a sample image set including a plurality of sample image subsets whose number of images decreases in turn, so that the sample images are sorted by image category into different sample image subsets, which is convenient for the subsequent branch neural networks to perform feature learning; in this way, image categories with a small number of images can be fully trained during the training process, the neglect of long-tail data in the traditional neural network training process is avoided, and the effect of image recognition model generation is improved.
  • Step 22 according to the sample image set, train the image recognition model to be trained, and obtain the loss value of the image recognition model to be trained;
  • the image recognition model to be trained includes a plurality of branch neural networks, and each branch neural network is used to identify the corresponding images;
  • the loss value includes the target classification loss value and the classification loss value corresponding to each branch neural network; the target classification loss value is the loss value of the image recognition model to be trained for the sample image set, and the classification loss value is the loss value of the corresponding branch neural network for the sample image subset corresponding to that branch neural network.
  • the branch neural networks can be constructed through 1×1 convolutions, so the process of constructing a branch neural network can be completed with only a few additional parameters. Since multiple branch neural networks are constructed in the image recognition model to be trained, the parameters of the image recognition model to be trained are divided into two parts: one part is the shared parameters used to extract the common features of the sample images, and the other part is the exclusive parameters which, on the basis of the shared parameters, are further used to extract features of the sample images in the sample image subset corresponding to the branch neural network; the exclusive parameters are the parameters in the corresponding branch neural network.
  • the corresponding relationship between the branch neural networks and the sample image subsets is determined, and this corresponding relationship can be set according to the numbers of branch neural networks and sample image subsets. A commonly used configuration has three branch neural networks and three sample image subsets, where the first branch neural network corresponds to all three sample image subsets, the second branch neural network corresponds to the last two of the three sample image subsets, and the third branch neural network corresponds to the last sample image subset (the one with the smallest number of images) among the three.
  • Illustratively, a sample image set includes three sample image subsets, namely the head classes (head data, abbreviated h), the medium classes (middle data, abbreviated m) and the tail classes (tail data, abbreviated t); the head classes include the first 1/3 of image categories with the largest numbers of images, the medium classes include the middle 1/3 of image categories, and the tail classes include the last 1/3 of image categories with the smallest numbers of images.
  • the three branch neural networks are denoted N_{h+m+t}, N_{m+t} and N_t. N_{h+m+t} corresponds to all the sample image subsets and is used to classify the image categories in all sample image subsets;
  • N_{m+t} corresponds to two sample image subsets and is used to classify the image categories in the medium classes and tail classes sample image subsets, which contain relatively few images;
  • N_t corresponds to one sample image subset and is used to classify the image categories in the tail classes sample image subset, which has the smallest number of images;
  • in this way, each of the three branch neural networks N_{h+m+t}, N_{m+t} and N_t lets its own exclusive parameters dominate the learning of the image categories in its corresponding sample image subsets; moreover, the tail classes with a small number of images have a corresponding relationship with all three branch neural networks, while the head classes with a large number of images correspond to only one branch neural network, which utilizes the long-tail data to a certain extent, so that image categories with a small number of images can be fully trained.
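  • As an illustrative sketch (not part of the application text), the correspondence between the three branch neural networks and the three sample image subsets can be expressed as a simple mapping; the branch indices and the helper function below are hypothetical and only make concrete how tail-class samples are shared across branches.

```python
# Hypothetical sketch of the branch-to-subset correspondence described above:
# subset 0 = head classes, 1 = medium classes, 2 = tail classes.
BRANCH_SUBSETS = {
    0: {0, 1, 2},  # N_{h+m+t}: trained on head + medium + tail
    1: {1, 2},     # N_{m+t}:   trained on medium + tail
    2: {2},        # N_t:       trained on tail only
}

def branches_for_sample(subset_index: int) -> list:
    """Return the branches whose classification loss includes a sample from this subset."""
    return [b for b, subsets in BRANCH_SUBSETS.items() if subset_index in subsets]

# A tail-class sample contributes to all three branch losses,
# while a head-class sample contributes only to the first branch.
assert branches_for_sample(2) == [0, 1, 2]
assert branches_for_sample(0) == [0]
```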
  • the loss value of the image recognition model to be trained includes the classification loss values and the target classification loss value, wherein a classification loss value is the loss value of a branch neural network for the sample image subset corresponding to that branch neural network, and the target classification loss value is the loss value of the image recognition model to be trained for the sample image set. From the multiple classification loss values and the target classification loss value, the loss value used to train the image recognition model to be trained can be obtained, and the training degree of the entire image recognition model can be judged.
  • each classification loss value is the loss value for the sample image subsets corresponding to the respective branch neural network; that is, the branch neural network N_{h+m+t} corresponds to the loss value over the head classes, medium classes and tail classes sample image subsets, while N_t corresponds to the loss value over the tail classes sample image subset only.
  • the target classification loss value is the loss value obtained from the overall image category output of the image recognition model to be trained over the entire sample image set.
  • the difference between the classification loss value and the target classification loss value lies in the objects considered when calculating the loss value.
  • a classification loss value is the loss value obtained by comparing the image categories output by each branch neural network with the actual image categories of the sample images in the corresponding sample image subset; the target classification loss value is the loss value obtained by comparing the image categories output by the entire image recognition model to be trained (that is, the fusion result of the image categories output by the multiple branch neural networks) with the actual categories of the sample images in the sample image set.
  • in this way, the corresponding images are identified through the branch neural networks, the target classification loss value and the classification loss value corresponding to each branch neural network are obtained, and the image recognition model is trained to obtain the loss value of the image recognition model to be trained, so that during the training process image categories with a small number of images can also be fully trained, the neglect of long-tail data in the traditional neural network training process is avoided, and the effect of image recognition model generation is improved.
  • Step 23 Adjust the model parameters of the image recognition model to be trained according to the loss value until the loss value is lower than the preset threshold, and then use the image recognition model to be trained as the trained image recognition model.
  • specifically, according to the calculated loss value, the server reversely adjusts the parameters of the image recognition model to be trained, such as the weights and biases of layers including but not limited to the convolution layers, pooling layers and normalization layers; in this case, after multiple training iterations, each loss value will gradually decrease and approach a fixed value.
  • the preset threshold can be set near the fixed value, and when the loss value is lower than the preset threshold, it can be determined that the training of the image recognition model is completed.
  • the parameters in the image recognition model are continuously adjusted according to the loss value, and the training degree of the image recognition model is judged from the difference between the loss value and the preset threshold; when the loss value calculated by the image recognition model is lower than the preset threshold, it can be judged that the training of the image recognition model is complete, which improves the effect of image recognition model generation.
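  • A minimal training-loop sketch of this adjustment process is given below; it assumes a PyTorch-style model and a `combined_loss` helper (such as the one sketched after the loss formulas further below), and the optimizer choice, learning rate and threshold value are illustrative assumptions rather than values from the application.

```python
import torch

def train_until_threshold(model, combined_loss, loader, threshold=0.05, max_epochs=100):
    """Back-propagate the loss and adjust the model parameters (e.g. the weights and biases
    of the convolution and normalization layers) until the loss falls below a preset
    threshold; the stopping criterion mirrors step 23 described above."""
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
    for _ in range(max_epochs):
        # the loader is assumed to yield (images, labels, subset index) triples
        for images, labels, subset_idx in loader:
            fused_logits, branch_logits = model(images)
            loss = combined_loss(fused_logits, branch_logits, labels, subset_idx)
            optimizer.zero_grad()
            loss.backward()      # reverse adjustment of the model parameters
            optimizer.step()
            if loss.item() < threshold:
                return model     # training of the image recognition model is complete
    return model
```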
  • the above method for generating an image recognition model includes: acquiring a sample image set, where the sample image set includes a plurality of sample image subsets whose number of images decreases sequentially and the multiple sample image subsets all contain the same number of image categories; training the image recognition model to be trained according to the sample image set to obtain the loss value of the image recognition model to be trained;
  • the image recognition model to be trained includes a plurality of branch neural networks, and each branch neural network is used to identify the corresponding images;
  • the loss value includes the target classification loss value and the classification loss value corresponding to each branch neural network; the target classification loss value is the loss value of the image recognition model to be trained for the sample image set, and the classification loss value is the loss value of the corresponding branch neural network for the sample image subset corresponding to that branch neural network; and
  • adjusting the model parameters of the image recognition model to be trained according to the loss value until the loss value is lower than the preset threshold, at which point the image recognition model to be trained is regarded as the trained image recognition model.
  • the image recognition model to be trained is trained, and the loss value of the image recognition model to be trained is obtained, including:
  • Step 41 performing uniform sampling on a plurality of sample image subsets in the sample image set to obtain a sample image input sequence
  • Step 42 input the sample image into the image recognition model to be trained to obtain the image category of the sample image;
  • Step 43 Determine the loss value of the image recognition model to be trained according to the image category of the sample image and the corresponding actual image category.
  • the server can uniformly sample the multiple sample image subsets in the sample image set to obtain mini-batch data, input the mini-batch data into the image recognition model to be trained as the sample image input sequence for training, and obtain the image categories of the sample images output by the image recognition model; it then obtains the actual image categories of the sample images, inputs the image categories of the sample images and the actual image categories into a preset loss function, and calculates the loss value of the image recognition model.
  • This embodiment makes the sample images of each image category in the sample image input sequence more balanced by uniform sampling; further makes the determined loss value of the image recognition model to be trained more accurate, and improves the effect of image recognition model generation.
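  • One plausible reading of this uniform sampling step is sketched below (it is not taken from the application): categories are drawn with equal probability regardless of how many images they contain, so tail categories appear in the mini-batch as often as head categories; the data layout, with `subsets` as dicts mapping each category to its image paths, is an assumption made for illustration.

```python
import random

def uniform_sample_batch(subsets, batch_size):
    """Draw a mini-batch by sampling image categories uniformly across the sample image
    subsets, then picking one image from the chosen category; `subsets` is a list of
    dicts mapping category name -> list of image paths."""
    categories = [(cat, imgs) for subset in subsets for cat, imgs in subset.items()]
    batch = []
    for _ in range(batch_size):
        cat, imgs = random.choice(categories)      # uniform over categories, not over images
        batch.append((random.choice(imgs), cat))
    return batch
```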
  • the image recognition model to be trained further includes a basic neural network, and the basic neural network is connected with the branch neural network;
  • the above step 42 of inputting the sample image into the image recognition model to be trained to obtain the image category of the sample image includes: inputting the sample image into the image recognition model to be trained, so that the basic neural network obtains the first image feature of the sample image, the branch neural network obtains the second image feature of the sample image according to the first image feature, and the image category of the sample image in the sample image set is determined according to the second image feature.
  • the basic neural network is used to extract the feature information of the sample images in the sample image set, that is, the common features of all image categories in the sample image set are extracted as the first image features; the branch neural network obtains the first image features extracted by the basic neural network, And perform extraction again to obtain and output the second image feature.
  • the second image feature output by the branch neural network is passed through the classifier and fused to obtain the image category of the sample image.
  • the parameters of the basic neural network are shared parameters and can be used by each branch neural network; the type and structure of the basic neural network are not limited here.
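  • The structure described above, a shared basic network producing the first image feature, branch networks built from 1×1 convolutions producing the second image features, and a fused classification output, can be sketched as follows; the backbone layers, feature size and the use of an average of branch logits as the fusion are illustrative assumptions rather than details taken from the application.

```python
import torch
import torch.nn as nn

class BranchedRecognizer(nn.Module):
    """Sketch of an image recognition model with a shared basic network and
    several branch neural networks built from 1x1 convolutions."""
    def __init__(self, num_classes, feat_channels=256, num_branches=3):
        super().__init__()
        self.backbone = nn.Sequential(        # basic neural network (shared parameters)
            nn.Conv2d(3, feat_channels, kernel_size=3, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1),
        )
        self.branches = nn.ModuleList([        # exclusive parameters: one 1x1 conv per branch
            nn.Conv2d(feat_channels, feat_channels, kernel_size=1)
            for _ in range(num_branches)
        ])
        self.classifiers = nn.ModuleList([
            nn.Linear(feat_channels, num_classes) for _ in range(num_branches)
        ])

    def forward(self, x):
        first_feat = self.backbone(x)                          # first image feature
        branch_logits = []
        for branch, clf in zip(self.branches, self.classifiers):
            second_feat = branch(first_feat).flatten(1)        # second image feature
            branch_logits.append(clf(second_feat))
        fused = torch.stack(branch_logits).mean(dim=0)         # fused image category scores
        return fused, branch_logits
```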
  • the loss value of the image recognition model to be trained is determined, including:
  • Step 51 according to the image category of the sample image in the sample image set and the corresponding actual image category, determine the loss value of the sample image in the sample image set;
  • Step 52 Obtain the loss value corresponding to the sample image set according to the loss values of the sample images in the sample image set determined by the multiple branch neural networks, and use it as the target classification loss value;
  • Step 53 obtaining the loss values of all sample images in the sample image subsets corresponding to the multiple branch neural networks, and taking the sum of the loss values of all the sample images in the sample image subsets as the classification loss value corresponding to the multiple branch neural networks;
  • Step 54 Calculate the loss value of the image recognition model to be trained according to the target classification loss value and the classification loss values corresponding to the multiple branched neural networks.
  • the sample image set includes three sample image subsets head classes, medium classes, and tail classes as an example for illustration;
  • the target classification loss value is the loss value obtained by comparing the image categories output by the entire image recognition model to be trained (that is, the fusion result of the image categories output by the multiple branch neural networks) with the actual categories of the sample images in the sample image set; therefore, to calculate the target classification loss value, the outputs of the three branch neural networks for the sample images in the sample image set are fused, and the resulting image categories together with the actual image categories are input into the loss function; the obtained loss value is the target classification loss value, as shown in the following formula:

  L_f = J(F_net(X), Y)

  • L_f is the target classification loss value;
  • J is the cross-entropy loss function;
  • F_net is the image recognition model to be trained, whose output is the fused result of the branch neural network outputs;
  • X is the sample images in the sample image input sequence;
  • Y is the actual image categories of the sample images;
  • h, m and t denote the first, second and third sample image subsets, whose numbers of images decrease in turn;
  • the corner marks (subscripts) indicate the sample image subsets corresponding to the branch neural networks.
  • the classification loss value is the loss value obtained by each branch neural network for its corresponding sample image subset, not for the entire sample image set.
  • since the branch neural network N_{h+m+t} corresponds to the first, second and third sample image subsets, calculating its classification loss value is equivalent to calculating its loss value over the entire sample image set; N_{m+t} and N_t only have a corresponding relationship with the last two sample image subsets and the third sample image subset respectively, so their classification loss values are calculated according to the actual image categories of the corresponding sample images in those subsets. After the classification loss values calculated by all branch neural networks are obtained, a summation is performed, as shown in the following formula:

  L_i = J(N_{h+m+t}(X), Y) + J(N_{m+t}(S_{m+t}), Y_{m+t}) + J(N_t(S_t), Y_t)

  • L_i is the sum of the classification loss values corresponding to the multiple branch neural networks; S_{m+t} is a subset of X containing the sample images in the sample image input sequence that belong to the second and third sample image subsets, and Y_{m+t} is their actual image categories; S_t is another subset of X containing the sample images in the sample image input sequence that belong to the third sample image subset, and Y_t is their actual image categories.
  • the loss value of the image recognition model to be trained is calculated from the classification loss values and the target classification loss value, and the specific formula is as follows:

  L_all = (1 - α) · L_f / n_1 + α · L_i / n_2

  • L_all is the loss value of the image recognition model to be trained; α is a hyperparameter; n_1 is the number of sample images in X; n_2 is the sum of the numbers of sample images in X, S_{m+t} and S_t.
  • the relative weights of the target classification loss value and the classification loss values can be adjusted through the hyperparameter α in the L_all function; in addition, when the data set follows a normal distribution (that is, the number of images in each image category is relatively even), the hyperparameter α can be set to 0 so that the model operates normally.
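  • A sketch of how L_f, L_i and L_all could be computed from the branch outputs is given below; the use of boolean masks to select S_{m+t} and S_t, and the default value of α, are assumptions made for illustration and follow the formulas above rather than any code in the application.

```python
import torch
import torch.nn.functional as F

def combined_loss(fused_logits, branch_logits, labels, subset_idx, alpha=0.5):
    """L_all = (1 - alpha) * L_f / n_1 + alpha * L_i / n_2, where subset_idx holds
    0/1/2 for samples from the head/medium/tail sample image subsets."""
    # Target classification loss L_f: fused output compared with the labels of the whole batch X.
    l_f = F.cross_entropy(fused_logits, labels, reduction="sum")
    n1 = labels.numel()

    # Classification losses: N_{h+m+t} over all of X, N_{m+t} over S_{m+t}, N_t over S_t.
    mask_mt = subset_idx >= 1                     # samples belonging to the medium and tail subsets
    mask_t = subset_idx == 2                      # samples belonging to the tail subset only
    l_i = F.cross_entropy(branch_logits[0], labels, reduction="sum")
    if mask_mt.any():
        l_i = l_i + F.cross_entropy(branch_logits[1][mask_mt], labels[mask_mt], reduction="sum")
    if mask_t.any():
        l_i = l_i + F.cross_entropy(branch_logits[2][mask_t], labels[mask_t], reduction="sum")
    n2 = n1 + int(mask_mt.sum()) + int(mask_t.sum())

    return (1 - alpha) * l_f / n1 + alpha * l_i / n2
```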
  • the above embodiment calculates the target classification loss value and the classification loss values from the difference between the image categories of the sample images and the corresponding actual image categories, and further obtains the loss value of the image recognition model to be trained, so that the parameters in the image recognition model to be trained can be adjusted; in this way, image categories with a small number of images can be fully trained during the training process, the neglect of long-tail data in the traditional neural network training process is avoided, and the effect of image recognition model generation is improved.
  • before the above step 21 of acquiring the sample image set, the method further includes:
  • Step 61 obtaining a sample image, and determining the number of images of the image category according to the image category of the sample image;
  • Step 62 Obtain the arrangement order of the image categories according to the number of images of the image categories, and divide the image categories into multiple category combinations according to the arrangement order; the multiple category combinations include the same number of image categories;
  • Step 63 According to the multiple category combinations and the sample images corresponding to the image categories in the multiple category combinations, obtain the sample image subsets corresponding to the multiple category combinations; the combination of the multiple sample image subsets is used as the sample image set.
  • the server obtains sample images from the terminal, identifies the image categories of the sample images, classifies the sample images according to the image categories, and counts the number of sample images corresponding to each image category. According to the number of sample images corresponding to the image categories, sort the image categories from high to low to obtain the sorting order.
  • Image categories are evenly distributed to multiple category combinations based on the number of branch neural networks and the number of image categories. For example, if there are 3 branch neural networks and 6 image categories, every two image categories can be grouped into one group to obtain three category combinations.
  • a sample image subset corresponding to the category combination is obtained according to the category combination and the sample images corresponding to the category combination, and multiple sample image subsets are combined into a sample image set.
  • each branch neural network is made to correspond to sample image subsets according to the characteristics of the long-tail data distribution, so that image categories with a small number of images can be fully trained during the training process; the neglect of long-tail data in the traditional neural network training process is avoided, and the effect of image recognition model generation is improved.
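  • Steps 61 to 63 can be sketched as the following helper (an illustrative assumption about the data layout, with `samples` given as (image_path, category) pairs); it counts images per category, sorts the categories by count, splits them into equally sized category combinations and gathers the corresponding images into sample image subsets.

```python
from collections import defaultdict

def build_sample_image_subsets(samples, num_subsets=3):
    """Group sample images into subsets whose image counts decrease in turn while
    each subset contains the same number of image categories."""
    per_category = defaultdict(list)
    for path, category in samples:
        per_category[category].append(path)

    # Arrange categories from the largest to the smallest number of images (step 62).
    ordered = sorted(per_category, key=lambda c: len(per_category[c]), reverse=True)
    group_size = len(ordered) // num_subsets          # assumes the categories divide evenly
    combos = [ordered[i * group_size:(i + 1) * group_size] for i in range(num_subsets)]

    # Pair each category combination with its images to form a sample image subset (step 63).
    return [{cat: per_category[cat] for cat in combo} for combo in combos]
```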
  • It should be understood that although the steps in the flowcharts of FIGS. 2 and 4-6 are displayed sequentially according to the arrows, these steps are not necessarily executed in the order indicated by the arrows. Unless explicitly stated herein, the execution order of these steps is not strictly limited, and they may be performed in other orders. Moreover, at least some of the steps in FIGS. 2 and 4-6 may include multiple sub-steps or multiple stages, which are not necessarily executed at the same time but may be executed at different moments; the execution order of these sub-steps or stages is also not necessarily sequential, and they may be performed in turn or alternately with other steps or with at least a part of the sub-steps or stages of other steps.
  • an apparatus for generating an image recognition model is provided, including:
  • an acquisition module 71 configured to acquire a sample image set;
  • the sample image set includes a plurality of sample image subsets whose number of images decreases sequentially, and the multiple sample image subsets all contain the same number of image categories;
  • the training module 72 is used to train the image recognition model to be trained according to the sample image set, and obtain the loss value of the image recognition model to be trained;
  • the image recognition model to be trained includes a plurality of branch neural networks, and each branch neural network is used to identify the corresponding images;
  • the loss value includes the target classification loss value and the classification loss value corresponding to each branch neural network, the target classification loss value is the loss value of the image recognition model to be trained for the sample image set, and the classification loss value is the loss value of the corresponding branch neural network for the subset of sample images corresponding to the branch neural network;
  • the adjustment module 73 is configured to adjust the model parameters of the image recognition model to be trained according to the loss value, and use the image recognition model to be trained as the trained image recognition model until the loss value is lower than the preset threshold.
  • the training module 72 is further configured to uniformly sample the multiple sample image subsets in the sample image set to obtain a sample image input sequence; input the sample images into the image recognition model to be trained according to the sample image input sequence to obtain the image categories of the sample images; and determine the loss value of the image recognition model to be trained according to the image categories of the sample images and the corresponding actual image categories.
  • the training module 72 is further configured to input the sample image into the image recognition model to be trained, so that the basic neural network obtains the first image feature of the sample image, the branch neural network obtains the second image feature of the sample image according to the first image feature, and the image category of the sample image in the sample image set is determined according to the second image feature.
  • the training module 72 is further configured to determine the loss values of the sample images in the sample image set according to the image categories of the sample images in the sample image set and the corresponding actual image categories; obtain the loss value corresponding to the sample image set according to the loss values of the sample images in the sample image set determined by the multiple branch neural networks, and use it as the target classification loss value; obtain the loss values of all sample images in the sample image subsets corresponding to the multiple branch neural networks, and use the sum of the loss values of all sample images in the sample image subsets as the classification loss values corresponding to the multiple branch neural networks; and calculate the loss value of the image recognition model to be trained according to the target classification loss value and the classification loss values corresponding to the multiple branch neural networks.
  • the obtaining module 71 is further configured to obtain sample images and determine the number of images of each image category according to the image categories of the sample images; obtain the arrangement order of the image categories according to the numbers of images of the image categories, and divide the image categories into multiple category combinations according to the arrangement order, where the multiple category combinations contain the same number of image categories; and according to the multiple category combinations and the sample images corresponding to the image categories in the multiple category combinations, obtain the sample image subsets corresponding to the multiple category combinations, and use the combination of the multiple sample image subsets as the sample image set.
  • each module in the image recognition model generating apparatus can be implemented by software, hardware and combinations thereof.
  • the above modules can be embedded in or independent of the processor in the computer device in the form of hardware, or stored in the memory in the computer device in the form of software, so that the processor can call and execute the operations corresponding to the above modules.
  • a computer device is provided, and the computer device may be a server, and its internal structure diagram may be as shown in FIG. 8 .
  • the computer device includes a processor, memory, and a network interface connected by a system bus. Among them, the processor of the computer device is used to provide computing and control capabilities.
  • the memory of the computer device includes a non-volatile storage medium, an internal memory.
  • the nonvolatile storage medium stores an operating system, a computer program, and a database.
  • the internal memory provides an environment for the execution of the operating system and computer programs in the non-volatile storage medium.
  • a database of the computer device is used to store image recognition model generation data.
  • the network interface of the computer device is used to communicate with an external terminal through a network connection.
  • the computer program implements an image recognition model generation method when executed by the processor.
  • FIG. 8 is only a block diagram of part of the structure related to the solution of the present application and does not constitute a limitation on the computer device to which the solution of the present application is applied; a specific computer device may include more or fewer components than shown in the figure, or combine certain components, or have a different arrangement of components.
  • a computer device including a memory and a processor, where a computer program is stored in the memory, and the processor implements the steps in the foregoing method embodiments when the processor executes the computer program.
  • a computer-readable storage medium on which a computer program is stored, and when the computer program is executed by a processor, the steps in the foregoing method embodiments are implemented.
  • any reference to memory, storage, database or other media used in the various embodiments provided in this application may include at least one of non-volatile and volatile memory.
  • Non-volatile memory may include read-only memory (Read-Only Memory, ROM), magnetic tape, floppy disk, flash memory, or optical memory, and the like.
  • Volatile memory may include random access memory (RAM) or external cache memory.
  • the RAM may be in various forms, such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM).

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)

Abstract

Disclosed are an image recognition model generation method and apparatus, a computer device and a storage medium. The method comprises the steps of: acquiring a sample image set, the sample image set comprising a plurality of sample image subsets whose number of images decreases successively, and the plurality of sample image subsets all containing the same number of image categories; training, according to the sample image set, an image recognition model to be trained to obtain loss values of the image recognition model to be trained, the image recognition model to be trained comprising a plurality of branch neural networks, the loss values comprising a target classification loss value and classification loss values, the target classification loss value being a loss value of said model with respect to the sample image set, and the classification loss values being loss values of the branch neural networks with respect to the corresponding sample image subsets; and adjusting parameters of said model according to the loss value until the loss values are lower than a preset threshold.
PCT/CN2021/106635 2020-08-25 2021-07-16 Procédé et appareil générateurs de modèles de reconnaissance d'images, dispositif informatique et support de stockage WO2022042123A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2022564577A JP7376731B2 (ja) 2020-08-25 2021-07-16 画像認識モデル生成方法、装置、コンピュータ機器及び記憶媒体

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010862911.0 2020-08-25
CN202010862911.0A CN111950656B (zh) 2020-08-25 2020-08-25 图像识别模型生成方法、装置、计算机设备和存储介质

Publications (1)

Publication Number Publication Date
WO2022042123A1 (fr)

Family

ID=73366432

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/106635 WO2022042123A1 (fr) 2020-08-25 2021-07-16 Procédé et appareil générateurs de modèles de reconnaissance d'images, dispositif informatique et support de stockage

Country Status (3)

Country Link
JP (1) JP7376731B2 (fr)
CN (1) CN111950656B (fr)
WO (1) WO2022042123A1 (fr)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115294644A (zh) * 2022-06-24 2022-11-04 北京昭衍新药研究中心股份有限公司 一种基于3d卷积参数重构的快速猴子行为识别方法
CN117036869A (zh) * 2023-10-08 2023-11-10 之江实验室 一种基于多样性和随机策略的模型训练方法及装置

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111950656B (zh) * 2020-08-25 2021-06-25 深圳思谋信息科技有限公司 图像识别模型生成方法、装置、计算机设备和存储介质
CN112966767B (zh) * 2021-03-19 2022-03-22 焦点科技股份有限公司 一种特征提取和分类任务分离的数据不均衡处理方法
CN113034368A (zh) * 2021-04-01 2021-06-25 深圳思谋信息科技有限公司 图像超分辨率模型训练方法、装置、计算机设备和介质
CN113240032B (zh) * 2021-05-25 2024-01-30 北京有竹居网络技术有限公司 一种图像分类方法、装置、设备及存储介质
CN114155388B (zh) * 2022-02-10 2022-05-13 深圳思谋信息科技有限公司 一种图像识别方法、装置、计算机设备和存储介质
CN114581751B (zh) * 2022-03-08 2024-05-10 北京百度网讯科技有限公司 图像识别模型的训练方法和图像识别方法、装置
CN117457101B (zh) * 2023-12-22 2024-03-26 中国农业科学院烟草研究所(中国烟草总公司青州烟草研究所) 一种烘烤烟叶含水量预测方法、介质及系统

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108875934A (zh) * 2018-05-28 2018-11-23 北京旷视科技有限公司 一种神经网络的训练方法、装置、系统及存储介质
CN110097130A (zh) * 2019-05-07 2019-08-06 深圳市腾讯计算机系统有限公司 分类任务模型的训练方法、装置、设备及存储介质
WO2019233341A1 (fr) * 2018-06-08 2019-12-12 Oppo广东移动通信有限公司 Procédé et appareil de traitement d'images, support d'informations lisible par ordinateur, et dispositif électronique
CN111242158A (zh) * 2019-12-05 2020-06-05 北京迈格威科技有限公司 神经网络训练方法、图像处理方法及装置
CN111401307A (zh) * 2020-04-08 2020-07-10 中国人民解放军海军航空大学 基于深度度量学习的卫星遥感图像目标关联方法和装置
CN111950656A (zh) * 2020-08-25 2020-11-17 深圳思谋信息科技有限公司 图像识别模型生成方法、装置、计算机设备和存储介质

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9940729B1 (en) * 2016-11-18 2018-04-10 Here Global B.V. Detection of invariant features for localization
US11138724B2 (en) 2017-06-01 2021-10-05 International Business Machines Corporation Neural network classification
CN110162556A (zh) * 2018-02-11 2019-08-23 陕西爱尚物联科技有限公司 一种有效发挥数据价值的方法
US11494687B2 (en) 2018-03-05 2022-11-08 Yodlee, Inc. Generating samples of transaction data sets
CN108921013B (zh) * 2018-05-16 2020-08-18 浙江零跑科技有限公司 一种基于深度神经网络的视觉场景识别系统及方法
US11372893B2 (en) 2018-06-01 2022-06-28 Ntt Security Holdings Corporation Ensemble-based data curation pipeline for efficient label propagation
CN111125460B (zh) * 2019-12-24 2022-02-25 腾讯科技(深圳)有限公司 信息推荐方法及装置
CN111291841B (zh) * 2020-05-13 2020-08-21 腾讯科技(深圳)有限公司 图像识别模型训练方法、装置、计算机设备和存储介质

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108875934A (zh) * 2018-05-28 2018-11-23 北京旷视科技有限公司 一种神经网络的训练方法、装置、系统及存储介质
WO2019233341A1 (fr) * 2018-06-08 2019-12-12 Oppo广东移动通信有限公司 Procédé et appareil de traitement d'images, support d'informations lisible par ordinateur, et dispositif électronique
CN110097130A (zh) * 2019-05-07 2019-08-06 深圳市腾讯计算机系统有限公司 分类任务模型的训练方法、装置、设备及存储介质
CN111242158A (zh) * 2019-12-05 2020-06-05 北京迈格威科技有限公司 神经网络训练方法、图像处理方法及装置
CN111401307A (zh) * 2020-04-08 2020-07-10 中国人民解放军海军航空大学 基于深度度量学习的卫星遥感图像目标关联方法和装置
CN111950656A (zh) * 2020-08-25 2020-11-17 深圳思谋信息科技有限公司 图像识别模型生成方法、装置、计算机设备和存储介质

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115294644A (zh) * 2022-06-24 2022-11-04 北京昭衍新药研究中心股份有限公司 一种基于3d卷积参数重构的快速猴子行为识别方法
CN117036869A (zh) * 2023-10-08 2023-11-10 之江实验室 一种基于多样性和随机策略的模型训练方法及装置
CN117036869B (zh) * 2023-10-08 2024-01-09 之江实验室 一种基于多样性和随机策略的模型训练方法及装置

Also Published As

Publication number Publication date
JP7376731B2 (ja) 2023-11-08
CN111950656A (zh) 2020-11-17
JP2023523029A (ja) 2023-06-01
CN111950656B (zh) 2021-06-25

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application
    Ref document number: 21859963
    Country of ref document: EP
    Kind code of ref document: A1
ENP Entry into the national phase
    Ref document number: 2022564577
    Country of ref document: JP
    Kind code of ref document: A
NENP Non-entry into the national phase
    Ref country code: DE
122 Ep: pct application non-entry in european phase
    Ref document number: 21859963
    Country of ref document: EP
    Kind code of ref document: A1