WO2021248932A1 - Image data processing method and apparatus, device and readable storage medium - Google Patents

Info

Publication number
WO2021248932A1
Authority
WO
WIPO (PCT)
Prior art keywords
sample
sample set
image
sampling
data processing
Application number
PCT/CN2021/076826
Other languages
French (fr)
Chinese (zh)
Inventor
张润泽
郭振华
赵雅倩
Original Assignee
广东浪潮智慧计算技术有限公司
Application filed by 广东浪潮智慧计算技术有限公司
Publication of WO2021248932A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172 Classification, e.g. identification
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Definitions

  • the present invention relates to the field of image processing technology, and in particular to an image data processing method, device, equipment and readable storage medium.
  • Pedestrian re-recognition, face recognition, image-based target detection and other image recognition processing technologies often need to collect a large number of labeled image samples to train models with learning capabilities, and finally obtain a learning model that can effectively recognize unknown images.
  • the uneven distribution of image samples often leads to poor learning effects, and ultimately makes the trained model unable to achieve the expected recognition accuracy.
  • the number of pictures corresponding to a pedestrian ranges from 1 to hundreds or even thousands.
  • the pictures corresponding to different pedestrians range from 1 to more than 1,000.
  • the median number of pictures per pedestrian is only 2: nearly half of the pedestrians have only one picture, while a small fraction of pedestrians have more than 100 pictures.
  • this type of data distribution is called long-tailed data.
  • the purpose of the present invention is to provide an image data processing method, device, equipment, and readable storage medium that partition the image sample set into categories and, taking into account the contribution of the samples to model training, resample the image sample set using the sampling weight determined for the sample subset of each category, so as to balance the data and further improve the accuracy of model training.
  • the present invention provides the following technical solutions:
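  • An image data processing method including:
  • sorting the tags according to the number of pictures corresponding to each tag in the image sample set;
  • obtaining the fitting index corresponding to the sorted image sample set;
  • segmenting the image sample set by using the fitting index to obtain multiple sample subsets;
  • training the target model with each of the sample subsets to obtain the model training accuracy corresponding to each of the sample subsets;
  • sampling each of the sample subsets by using sampling weights matching the model training accuracy to obtain a target image sample set.
  • segmenting the image sample set by using the fitting index to obtain multiple sample subsets includes:
  • segmenting the image sample set by using the integral of the fitting index to obtain multiple sample subsets with an equal total number of pictures.
  • sampling each of the sample subsets by using the sampling weight matching the model training accuracy to obtain a target image sample set includes:
  • obtaining the relative position of each of the sample subsets in the fitting index;
  • combining the relative position and the sampling weight to sample each of the sample subsets to obtain the target image sample set.
  • combining the relative position and the sampling weight to sample each of the sample subsets includes, if the relative position is the head:
  • determining whether the sampling weight is greater than 1; if yes, oversampling by using the number of pictures corresponding to each of the tags in the sample subset and the sampling weight; if not, taking the original pictures of the sample subset.
  • combining the relative position and the sampling weight to sample each of the sample subsets includes, if the relative position is the middle:
  • determining whether the sampling weight is greater than 1; if yes, oversampling by using the number of pictures corresponding to each of the tags in the sample subset, the sampling weight, and a preset weighting multiple; if not, sampling by using the number of pictures corresponding to each of the tags in the sample subset and the sampling weight.
  • combining the relative position and the sampling weight to sample each of the sample subsets includes, if the relative position is the tail:
  • determining whether the sampling weight is greater than 1; if yes, oversampling by using the number of pictures corresponding to each of the tags in the sample subset and the sampling weight; if not, obtaining the median number of pictures corresponding to the tags in the sample subset and, for each tag, randomly selecting that median number of pictures.
  • the method also includes:
  • training the target model with the target image sample set to obtain a trained classification and recognition model;
  • using the classification and recognition model to recognize the target image to be recognized and obtain the recognition result.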
  • An image data processing device including:
  • the image sample set sorting module is used to sort the tags according to the number of pictures corresponding to each tag in the image sample set;
  • the fitting module is used to obtain the fitting index corresponding to the image sample set after sorting
  • An image sample set segmentation module configured to use the fitting index to segment the image sample set to obtain multiple sample subsets
  • the training module is configured to train the target model by using each of the sample subsets to obtain the model training accuracy corresponding to each of the sample subsets;
  • the re-sampling module is used to sample each of the sample subsets by using the sampling weight matching the model training accuracy to obtain a target image sample set.
  • An image data processing device including:
  • Memory used to store computer programs
  • the processor is used to implement the steps of the image data processing method when the computer program is executed.
  • the tags are sorted according to the number of pictures corresponding to each tag in the image sample set; the fitting index corresponding to the sorted image sample set is obtained; the image sample set is divided by the fitting index to obtain multiple sample subsets; each sample subset is used to train the target model to obtain the model training accuracy corresponding to each sample subset; and the sampling weight matching the model training accuracy is used to sample each sample subset to obtain the target image sample set.
  • the image sample set is first re-sorted by the number of pictures per label, and then the fitting index of the image sample set is determined. Based on the fitting index, the image sample set can be divided into multiple sample subsets according to the number of pictures per label, so that the labels in the same sample subset all have a similar number of pictures. Then, each sample subset is used to train the target model to obtain the model training accuracy, that is, to determine the contribution of each sample subset to the target model training. The sampling weight is determined based on the model training accuracy, and each sample subset is sampled to obtain the target image sample set.
  • the sample distribution in the target image sample set thus follows the samples' ability to contribute to training, which can further improve the model training effect and the accuracy of the image recognition results.
  • the embodiments of the present invention also provide image data processing apparatuses, equipment, and readable storage media corresponding to the above-mentioned image data processing methods, which have the above technical effects, and will not be repeated here.
  • FIG. 1 is an implementation flowchart of an image data processing method in an embodiment of the present invention.
  • FIG. 2 is a schematic diagram of image sample set segmentation in an embodiment of the present invention.
  • FIG. 3 is a schematic structural diagram of an image data processing device in an embodiment of the present invention.
  • FIG. 4 is a schematic structural diagram of an image data processing device in an embodiment of the present invention.
  • FIG. 5 is a schematic diagram of a specific structure of an image data processing device in an embodiment of the present invention.
  • FIG. 1 is a flowchart of an image data processing method in an embodiment of the present invention. The method includes the following steps:
  • S101 Sort the tags according to the number of pictures corresponding to each tag in the image sample set.
  • the image sample set may specifically be a set of image samples used for training face recognition, object recognition, or pedestrian re-recognition.
  • Each picture in the image sample set has a label that identifies the picture.
  • for pedestrian re-recognition, the label is the pedestrian, and each pedestrian has one or more pictures.
  • the description below uses an image sample set for pedestrian re-recognition as the example.
  • image sample sets for other types of image recognition processing can be handled in the same way, which will not be repeated here.
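
As a minimal sketch of step S101, the snippet below counts the pictures per label and sorts the labels in ascending order of that count. The function name `sort_labels_by_count` and the toy pedestrian IDs are illustrative assumptions, not names from the patent.

```python
from collections import Counter

def sort_labels_by_count(labels):
    """Return (label, picture_count) pairs in ascending order of picture count."""
    counts = Counter(labels)  # one entry per label, value = number of pictures
    # Ties are broken by the label itself so the order is deterministic.
    return sorted(counts.items(), key=lambda item: (item[1], item[0]))

# Toy data: each element is the pedestrian ID attached to one picture.
picture_labels = ["p3", "p1", "p3", "p2", "p3", "p1"]
sorted_counts = sort_labels_by_count(picture_labels)
print(sorted_counts)
```

The sorted counts, rather than the raw pictures, are what the later fitting and segmentation steps operate on.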
  • the fitting index (the fitted exponential curve) corresponding to the sorted image sample set can be determined.
  • an exponential fit with the natural base e can be performed on the re-sorted image sample set; the result is referred to as formula (1).
  • formula (1) can then be used in place of the ascending ordering of the original image sample set in the subsequent steps.
  • the obtained fitting index is shown in FIG. 2, a schematic diagram of image sample set segmentation in an embodiment of the present invention.
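
Formula (1) itself is not reproduced in this text. The sketch below therefore assumes a fit of the form y = a·e^(b·x), where x is a label's rank in the ascending order and y is its picture count, and estimates a and b by least squares on log(y). Both the functional form and the name `fit_exponential` are assumptions made for illustration.

```python
import math

def fit_exponential(counts):
    """Least-squares fit of counts[i] ~ a * exp(b * i) via a linear fit of log(counts)."""
    n = len(counts)
    xs = list(range(n))
    ys = [math.log(c) for c in counts]  # counts must be positive
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                       # slope of the log-linear fit
    a = math.exp(mean_y - b * mean_x)   # intercept mapped back from log space
    return a, b

# Picture counts per label, already sorted in ascending order.
counts = [1, 1, 2, 3, 5, 9, 16, 30]
a, b = fit_exponential(counts)
fitted = [a * math.exp(b * i) for i in range(len(counts))]
```

Fitting in log space is a common shortcut; it weights small counts more heavily than a direct nonlinear fit would, which may or may not match the fit used in the patent.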
  • the number of sample subsets is at least 3. There is no intersection between sample subsets.
  • the image sample set can be segmented using the integral of the fitting index to obtain multiple sample subsets with an equal total number of pictures.
  • the number of sample subsets is configurable.
  • because the labels have different numbers of pictures and the total number of pictures is not necessarily an integer multiple of the number of sample subsets, the sample subsets may differ in their number of labels, and their picture totals are not exactly equal either.
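
One way to picture the integral-based segmentation is as a running sum over the per-label values (fitted or actual counts) that is cut into equal shares. The sketch below assigns each label to a subset by the midpoint of its slice of the cumulative total; this assignment rule and the name `split_by_cumulative_total` are assumptions, since the text does not spell out how the cut points are chosen.

```python
def split_by_cumulative_total(values, num_subsets):
    """Split rank indices into contiguous groups whose sums of `values` are roughly equal."""
    total = sum(values)
    subsets = [[] for _ in range(num_subsets)]
    running = 0.0
    for index, value in enumerate(values):
        # Assign the label by the midpoint of its slice of the cumulative total.
        midpoint = running + value / 2
        k = min(int(midpoint * num_subsets / total), num_subsets - 1)
        subsets[k].append(index)
        running += value
    return subsets

fitted_counts = [1, 1, 1, 1, 2, 2, 4, 6, 6]  # ascending, one value per label
subsets = split_by_cumulative_total(fitted_counts, 3)
totals = [sum(fitted_counts[i] for i in group) for group in subsets]
print(subsets, totals)
```

The subsets are disjoint and contiguous in rank, and their picture totals are close to but not exactly one third of the whole, which matches the remark that the totals need not be identical.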
  • the target model is a model that needs to be trained after image data processing.
  • the model may be any model with learning capability, such as a deep learning model or a machine learning model.
  • the sampling situation is determined mainly based on the contribution of the sample subset to the target model training. Therefore, in this embodiment, after obtaining multiple sample subsets, each sample subset can be used to train the target model, and then the model training accuracy corresponding to each sample subset can be obtained.
  • the model training accuracy can specifically be the recognition accuracy measured on a validation set.
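
The per-subset training loop can be sketched as below. Real training is replaced by a hypothetical callback, `fake_train_and_validate`, that returns a made-up accuracy so the example runs without any learning framework; in practice the callback would train the target model on the subset and score it on a validation set.

```python
def accuracy_per_subset(subsets, train_and_validate):
    """Train on each subset in turn and collect the resulting validation accuracy."""
    return [train_and_validate(subset) for subset in subsets]

# Hypothetical stand-in for real training: returns a fake accuracy that
# grows with subset size, only so that the sketch is runnable.
def fake_train_and_validate(subset):
    return min(1.0, 0.5 + 0.05 * len(subset))

subsets = [["p1", "p2", "p3", "p4"], ["p5", "p6"], ["p7"]]
accuracies = accuracy_per_subset(subsets, fake_train_and_validate)
print(accuracies)
```

Passing the training routine in as a callable keeps the bookkeeping independent of whichever model is actually used.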
  • a sampling weight corresponds to a sample subset. That is, each sample subset is sampled according to the sampling weight matching its model training accuracy, and the sampled pictures are added to the target image sample set. This avoids the difficulty that purely oversampling or purely undersampling has in retaining the samples that contribute to model training.
  • the sampling implementation process includes:
  • Step 1: Use the model training accuracy to determine the sampling weight.
  • Step 2: Sample each sample subset according to the sampling weight to obtain the target image sample set.
  • each sample subset can be sampled directly based on the sampling weight.
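
The text does not state how accuracy is mapped to a sampling weight. As one plausible sketch, the code below divides each subset's accuracy by the mean accuracy, so that above-average subsets get a weight greater than 1, and then draws `round(weight * n)` pictures from a subset of n pictures. The mapping, the rounding, and the names `sampling_weights` and `sample_subset` are all assumptions.

```python
import random

def sampling_weights(accuracies):
    """Hypothetical mapping: each subset's accuracy divided by the mean accuracy."""
    mean_accuracy = sum(accuracies) / len(accuracies)
    return [acc / mean_accuracy for acc in accuracies]

def sample_subset(pictures, weight, rng):
    """Draw round(weight * len(pictures)) pictures from one subset."""
    target = max(1, round(weight * len(pictures)))
    if target <= len(pictures):
        return rng.sample(pictures, target)  # undersample without repeats
    extra = rng.choices(pictures, k=target - len(pictures))
    return list(pictures) + extra            # oversample: keep all, add repeats

rng = random.Random(0)
weights = sampling_weights([0.9, 0.6, 0.3])  # roughly [1.5, 1.0, 0.5]
subset = [f"img_{i}.jpg" for i in range(10)]
resampled = [sample_subset(subset, w, rng) for w in weights]
print([len(r) for r in resampled])
```

When oversampling, every original picture is kept and only the surplus is drawn with replacement, so no picture is lost from a subset that is judged valuable.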
  • if a label has too few samples, the target model may fail to learn from it effectively; if a label has too many samples, the target model may overfit. That is, data that contributes a lot to the model needs to be oversampled, and redundant data needs to be undersampled. Therefore, in this embodiment, differentiated sampling can also be performed according to the number of samples corresponding to each label, so that labels with a moderate number of samples are retained and reasonably sampled.
  • the specific implementation process includes:
  • Step 1: Obtain the relative position of each sample subset in the fitting index.
  • Step 2: Combine the relative position and the sampling weight to sample each sample subset to obtain the target image sample set.
  • if the relative position is the head:
  • Step 1: Determine whether the sampling weight is greater than 1.
  • Step 2: If yes, use the number of pictures corresponding to each label in the sample subset and the sampling weight to perform oversampling.
  • Step 3: If not, take the original pictures of the sample subset.
  • if the relative position is the middle:
  • Step 1: Determine whether the sampling weight is greater than 1.
  • Step 2: If yes, use the number of pictures corresponding to each label in the sample subset, the sampling weight, and a preset weighting multiple to perform oversampling.
  • Step 3: If not, use the number of pictures corresponding to each label in the sample subset and the sampling weight to perform sampling.
  • if the relative position is the tail:
  • Step 1: Determine whether the sampling weight is greater than 1.
  • Step 2: If yes, use the number of pictures corresponding to each label in the sample subset and the sampling weight to perform oversampling.
  • Step 3: If not, obtain the median number of pictures corresponding to the labels in the sample subset, and randomly select that median number of pictures for each label.
  • the target image sample set is obtained.
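
The head, middle, and tail rules can be sketched as a function that turns a label's picture count into a target count. The text only says the count, the weight, and (for the middle) a preset multiple are "used"; multiplying them together, the default multiple of 2.0, and the names `target_count` and `resample_label` are assumptions for illustration. For the tail, labels that already have fewer pictures than the median simply keep what they have.

```python
import random
from statistics import median

def target_count(n, weight, position, subset_counts, multiple=2.0):
    """Number of pictures to keep for a label that currently has n pictures."""
    if position == "head":
        return round(n * weight) if weight > 1 else n  # else: keep originals
    if position == "middle":
        factor = weight * multiple if weight > 1 else weight
        return max(1, round(n * factor))
    if position == "tail":
        if weight > 1:
            return round(n * weight)
        return min(n, max(1, int(median(subset_counts))))  # cap at the median
    raise ValueError(f"unknown position: {position}")

def resample_label(pictures, count, rng):
    """Draw `count` pictures; repeats appear only when oversampling."""
    if count <= len(pictures):
        return rng.sample(pictures, count)
    return list(pictures) + rng.choices(pictures, k=count - len(pictures))

rng = random.Random(0)
tail_subset = {
    "p7": [f"p7_{i}.jpg" for i in range(40)],
    "p8": [f"p8_{i}.jpg" for i in range(100)],
    "p9": [f"p9_{i}.jpg" for i in range(300)],
}
counts = [len(pics) for pics in tail_subset.values()]
resampled = {
    label: resample_label(pics, target_count(len(pics), 0.8, "tail", counts), rng)
    for label, pics in tail_subset.items()
}
print({label: len(pics) for label, pics in resampled.items()})
```

In this toy tail subset the weight is below 1, so the 300-picture label is cut down to the median of 100 while the smaller labels are left untouched.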
  • after obtaining the target image sample set, use the target image sample set to train the target model to obtain a trained classification recognition model; then use the classification recognition model to recognize the target image to be recognized to obtain the recognition result.
  • the embodiment of the present invention also provides an image data processing device.
  • the image data processing device described below and the image data processing method described above can be cross-referenced.
  • the device includes the following modules:
  • the image sample set sorting module 101 is used to sort the tags according to the number of pictures corresponding to each tag in the image sample set;
  • the fitting module 102 is configured to obtain the fitting index corresponding to the sorted image sample set
  • the image sample set segmentation module 103 is used to segment the image sample set by using the fitting index to obtain multiple sample subsets;
  • the training module 104 is configured to use each sample subset to train the target model to obtain the model training accuracy corresponding to each sample subset;
  • the re-sampling module 105 is used for sampling each sample subset by using the sampling weight matching the model training accuracy to obtain the target image sample set.
  • the image sample set segmentation module 103 is specifically configured to segment the image sample set using the integral of the fitting index to obtain multiple sample subsets with an equal total number of pictures.
  • the resampling module 105 specifically includes:
  • the relative position obtaining unit is used to obtain the relative position of each sample subset in the fitting index
  • the re-sampling unit is used to combine the relative position and the sampling weight to sample each sample subset to obtain the target image sample set.
  • the resampling unit is specifically used to determine, if the relative position is the head, whether the sampling weight is greater than 1; if so, perform oversampling using the number of pictures corresponding to each label in the sample subset and the sampling weight; if not, take the original pictures of the sample subset.
  • the resampling unit is specifically used to determine, if the relative position is the middle, whether the sampling weight is greater than 1; if so, perform oversampling using the number of pictures corresponding to each label in the sample subset, the sampling weight, and a preset weighting multiple; if not, perform sampling using the number of pictures corresponding to each label in the sample subset and the sampling weight.
  • the resampling unit is specifically configured to determine, if the relative position is the tail, whether the sampling weight is greater than 1; if so, perform oversampling using the number of pictures corresponding to each label in the sample subset and the sampling weight; if not, obtain the median number of pictures corresponding to the labels in the sample subset, and randomly select that median number of pictures for each label.
  • the model training module is used to train the target model by using the target image sample set to obtain a trained classification and recognition model
  • the recognition module is used to recognize the target image to be recognized by using the classification recognition model to obtain the recognition result.
  • the embodiment of the present invention also provides an image data processing device.
  • the image data processing device described below and the image data processing method described above can be referenced correspondingly.
  • the image data processing equipment includes:
  • the memory 332 is used to store computer programs
  • the processor 322 is configured to implement the steps of the image data processing method in the foregoing method embodiment when the computer program is executed.
  • FIG. 5 is a schematic diagram of a specific structure of an image data processing device provided by this embodiment.
  • the image data processing device may differ considerably depending on configuration or performance, and may include one or more central processing units (CPU) 322 (for example, one or more processors).
  • the memory 332 stores one or more computer application programs 342 or data 344.
  • the memory 332 may be short-term storage or persistent storage.
  • the program stored in the memory 332 may include one or more modules (not shown in the figure), and each module may include a series of instruction operations on the data processing device.
  • the central processing unit 322 may be configured to communicate with the memory 332, and execute a series of instruction operations in the memory 332 on the image data processing device 301.
  • the image data processing device 301 may also include one or more power supplies 326, one or more wired or wireless network interfaces 350, one or more input and output interfaces 358, and/or one or more operating systems 341.
  • the steps in the image data processing method described above can be implemented by the structure of the image data processing device.
  • the embodiment of the present invention also provides a readable storage medium, and a readable storage medium described below and an image data processing method described above can be referenced correspondingly.
  • a readable storage medium in which a computer program is stored, and when the computer program is executed by a processor, the steps of the image data processing method in the foregoing method embodiment are implemented.
  • the readable storage medium can specifically be a U disk, a mobile hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or any other readable storage medium that can store program code.

Abstract

An image data processing method and apparatus, a device and a readable storage medium. In the present method, an image sample set is first re-sorted on the basis of the number of labeled pictures, a fitting index of the image sample set is then determined, and on the basis of the fitting index, the image sample set may be divided into a plurality of sample subsets according to the number of labeled pictures. Then, each sample subset is used to train a target model to obtain the model training accuracy, i.e. the contribution of each sample subset to the training of the target model is determined. A sampling weight is determined on the basis of the model training accuracy, and each sample subset is sampled to obtain a target image sample set. The sample distribution in the target image sample set may be distributed according to the sample contribution ability, which may further improve the model training effect and improve the accuracy of the result of picture recognition processing.

Description

Image data processing method, device, equipment and readable storage medium
This application claims priority to Chinese patent application No. 202010530581.5, filed with the Chinese Patent Office on June 11, 2020 and titled "Image data processing method, device, equipment and readable storage medium", the entire content of which is incorporated into this application by reference.
Technical field
The present invention relates to the field of image processing technology, and in particular to an image data processing method, device, equipment and readable storage medium.
Background
Image recognition processing technologies such as pedestrian re-recognition, face recognition, and image-based target detection often need to collect a large number of labeled image samples to train a model with learning capability, and finally obtain a learning model that can effectively recognize unknown images.
However, during model training, an uneven distribution of image samples often leads to poor learning and ultimately prevents the trained model from reaching the expected recognition accuracy. Taking pedestrian re-recognition as an example, in an image sample set for pedestrian re-recognition, the number of pictures corresponding to one pedestrian (label) ranges from 1 to hundreds or even thousands. In some image sample sets in particular, the number of pictures per pedestrian ranges from 1 to more than 1,000, yet the median number of pictures per pedestrian is only 2: nearly half of the pedestrians have only one picture, while a small fraction have more than 100 pictures. This type of data distribution is usually called long-tailed data.
In summary, how to effectively solve problems such as image sample imbalance is a technical problem that those skilled in the art urgently need to solve.
Summary of the invention
The purpose of the present invention is to provide an image data processing method, device, equipment, and readable storage medium that partition the image sample set into categories and, taking into account the contribution of the samples to model training, resample the image sample set using the sampling weight determined for the sample subset of each category, so as to balance the data and further improve the accuracy of model training.
In order to solve the above technical problems, the present invention provides the following technical solutions:
An image data processing method, including:
sorting the tags according to the number of pictures corresponding to each tag in the image sample set;
obtaining the fitting index corresponding to the sorted image sample set;
segmenting the image sample set by using the fitting index to obtain multiple sample subsets;
training the target model with each of the sample subsets to obtain the model training accuracy corresponding to each of the sample subsets;
sampling each of the sample subsets by using sampling weights matching the model training accuracy to obtain a target image sample set.
优选地,所述利用所述拟合指数分割所述图像样本集,得到多个样本子集,包括:Preferably, said segmenting said image sample set by said fitting index to obtain a plurality of sample subsets includes:
利用所述拟合指数的积分分割所述图像样本集,得到图片总量均等的多个所述样本子集。The image sample set is divided by the integral of the fitting index to obtain a plurality of the sample subsets with an equal total number of pictures.
优选地,所述利用与所述模型训练精度匹配的所述采样权重,对各个所述样本子集进行采样,得到目标图像样本集,包括:Preferably, the sampling of each of the sample subsets by using the sampling weight matching the model training accuracy to obtain a target image sample set includes:
获取各个所述样本子集在所述拟合指数中相对位置;Acquiring the relative position of each of the sample subsets in the fitting index;
结合所述相对位置和所述采样权重,对各个所述样本子集进行采样,得到所述目标图像样本集。Combining the relative position and the sampling weight, sampling each of the sample subsets to obtain the target image sample set.
优选地,所述结合所述相对位置和所述采样权重,对各个所述样本子集进行采样,得到所述目标图像样本集,包括:Preferably, the combining the relative position and the sampling weight to sample each of the sample subsets to obtain the target image sample set includes:
若所述相对位置为首部,则判断所述采样权重是否大于1;If the relative position is the head, it is determined whether the sampling weight is greater than 1;
如果是,则利用所述样本子集中各个所述标签对应的图片数量以及所述采样权重,进行过采样;If yes, perform oversampling by using the number of pictures corresponding to each of the tags in the sample subset and the sampling weight;
如果否,则取所述样本子集的原图片。If not, then take the original picture of the sample subset.
优选地,所述结合所述相对位置和所述采样权重,对各个所述样本子集进行采样,得到所述目标图像样本集,包括:Preferably, the combining the relative position and the sampling weight to sample each of the sample subsets to obtain the target image sample set includes:
若所述相对位置为中部,则判断所述采样权重是否大于1;If the relative position is in the middle, it is determined whether the sampling weight is greater than 1;
如果是,则利用所述样本子集中各个所述标签对应的图片数量、所述采样权重和预设加权倍数,进行过采样;If so, perform oversampling by using the number of pictures corresponding to each of the tags in the sample subset, the sampling weight, and the preset weighting multiple;
如果否,则利用所述样本子集中各个所述标签对应的图片数量、所述采样权重,进行采样。If not, sampling is performed using the number of pictures corresponding to each of the tags in the sample subset and the sampling weight.
Preferably, sampling each of the sample subsets by combining the relative position and the sampling weight to obtain the target image sample set includes:
if the relative position is the tail, determining whether the sampling weight is greater than 1;
if so, oversampling using the number of pictures corresponding to each label in the sample subset and the sampling weight;
if not, obtaining the median of the numbers of pictures corresponding to the labels in the sample subset, and, for each label, randomly drawing that median number of pictures.
Preferably, the method further includes:
training the target model using the target image sample set to obtain a trained classification and recognition model;
recognizing a target picture to be recognized using the classification and recognition model to obtain a recognition result.
An image data processing apparatus, including:
an image sample set sorting module, configured to sort the labels according to the number of pictures corresponding to each label in an image sample set;
a fitting module, configured to obtain the fitted exponential corresponding to the sorted image sample set;
an image sample set segmentation module, configured to segment the image sample set using the fitted exponential to obtain multiple sample subsets;
a training module, configured to train a target model with each of the sample subsets separately to obtain the model training accuracy corresponding to each of the sample subsets;
a resampling module, configured to sample each of the sample subsets using a sampling weight matching the model training accuracy, to obtain a target image sample set.
An image data processing device, including:
a memory, configured to store a computer program;
a processor, configured to implement the steps of the above image data processing method when executing the computer program.
A readable storage medium storing a computer program which, when executed by a processor, implements the steps of the above image data processing method.
By applying the method provided by the embodiments of the present invention, the labels are sorted according to the number of pictures corresponding to each label in the image sample set; the fitted exponential corresponding to the sorted image sample set is obtained; the image sample set is segmented using the fitted exponential to obtain multiple sample subsets; the target model is trained with each sample subset separately to obtain the model training accuracy corresponding to each sample subset; and each sample subset is sampled using a sampling weight matching the model training accuracy, to obtain the target image sample set.
In this method, the image sample set is first reordered by the number of pictures per label, and the fitted exponential of the image sample set is then determined. Based on the fitted exponential, the image sample set can be split into multiple sample subsets according to the number of pictures per label, so that the labels within one sample subset have similar picture counts. Each sample subset is then used separately to train the target model, yielding a model training accuracy that indicates how much each sample subset contributes to training the target model. Sampling weights are determined from the model training accuracy, and each sample subset is sampled accordingly to obtain the target image sample set. The sample distribution of the target image sample set thus follows the distribution of the samples' contribution, which can further improve model training and the accuracy of picture recognition results.
Correspondingly, the embodiments of the present invention also provide an image data processing apparatus, device and readable storage medium corresponding to the above image data processing method, which have the above technical effects and are not described again here.
Description of the drawings
To explain the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention, and a person of ordinary skill in the art can obtain other drawings from them without creative effort.
FIG. 1 is an implementation flowchart of an image data processing method in an embodiment of the present invention;
FIG. 2 is a schematic diagram of image sample set segmentation in an embodiment of the present invention;
FIG. 3 is a schematic structural diagram of an image data processing apparatus in an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of an image data processing device in an embodiment of the present invention;
FIG. 5 is a schematic diagram of a specific structure of an image data processing device in an embodiment of the present invention.
Detailed description
To help those skilled in the art better understand the solution of the present invention, the present invention is described in further detail below with reference to the drawings and specific embodiments. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
Referring to FIG. 1, which is a flowchart of an image data processing method in an embodiment of the present invention, the method includes the following steps:
S101: Sort the labels according to the number of pictures corresponding to each label in the image sample set.
The image sample set may specifically be a set of picture samples used to train face recognition, object recognition or person re-identification.
Every picture in the image sample set has a label, which identifies the subject of the picture. For example, when the image sample set is for person re-identification, each label is a pedestrian, and each pedestrian has one or more pictures. For ease of description, the following uses an image sample set for person re-identification as the example; image sample sets for other kinds of picture recognition can be processed in the same way, which is not repeated here.
The labels are sorted according to the number of pictures corresponding to each label. That is, the pictures of each label are counted, and the image sample set is then reordered according to the counts. The order may be ascending or descending. For ease of description, this document assumes ascending order; the descending case is analogous and is not described again.
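As an illustration only, step S101 might be sketched in Python as follows. The function name `sort_labels_by_count`, the input format (one label per picture) and the tie-breaking by label name are assumptions made for this sketch and are not taken from the source.

```python
from collections import Counter

def sort_labels_by_count(image_labels):
    """Return (label, count) pairs in ascending order of picture count.

    `image_labels` holds one label per picture (hypothetical input format).
    """
    counts = Counter(image_labels)
    # Ties are broken by label so that the ordering is deterministic;
    # the source does not specify a tie-breaking rule.
    return sorted(counts.items(), key=lambda item: (item[1], item[0]))

# Example: three pedestrians with 1, 2 and 3 pictures respectively.
labels = ["p3", "p1", "p3", "p2", "p3", "p2"]
ranked = sort_labels_by_count(labels)
```

The position of a label in `ranked` then plays the role of x in the exponential fit of the next step, and its picture count the role of f(x).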
S102: Obtain the fitted exponential corresponding to the sorted image sample set.
After sorting, the fitted exponential corresponding to the sorted image sample set can be determined.
Specifically, an exponential fit with the natural base e can be performed on the newly arranged image sample set, namely:
$f(x) = a \cdot e^{b \cdot x} + c$   (1)
Once a, b and c in equation (1) have been obtained, equation (1) can be used in place of the ascending arrangement of the original image sample set in subsequent steps. The resulting fitted exponential is shown in FIG. 2, which is a schematic diagram of image sample set segmentation in an embodiment of the present invention.
S103: Segment the image sample set using the fitted exponential to obtain multiple sample subsets.
There are at least 3 sample subsets, and the sample subsets do not intersect.
Specifically, so that the sample subsets obtained contain equal numbers of pictures, which simplifies subsequent sampling, the image sample set can be segmented using the integral of the fitted exponential to obtain multiple sample subsets with equal total numbers of pictures. The number of sample subsets is configurable.
It should be noted that, because labels have different numbers of pictures and the total number of pictures is not necessarily an integer multiple of the number of sample subsets, the sample subsets need not contain exactly the same number of labels or exactly the same number of pictures.
For example, referring to FIG. 2, to split the image sample set into 3 sample subsets, the integral of the fitted exponential can be computed to obtain $g_1 = [0, p)$, $g_2 = [p, q)$ and $g_3 = [q, n-1]$, where the integral over each part is one third of S (the integral of the fitted exponential); the parts correspond in turn to S1, S2 and S3 in FIG. 2.
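The equal-area split can be sketched as follows, assuming the coefficients a, b and c of $f(x) = a \cdot e^{b \cdot x} + c$ are already known and b is non-zero. The closed-form integral from 0 to x is $\frac{a}{b}(e^{b x} - 1) + c\,x$. The names `integral` and `split_points`, the bisection search and the example coefficients are choices made for this sketch; the source does not say how p and q are computed.

```python
import math

def integral(a, b, c, x):
    """Integral of a*exp(b*t) + c for t from 0 to x (b must be non-zero)."""
    return a / b * (math.exp(b * x) - 1.0) + c * x

def split_points(a, b, c, n_labels, parts=3, tol=1e-9):
    """Find x-positions that cut [0, n_labels - 1] into equal-area parts."""
    upper = n_labels - 1
    total = integral(a, b, c, upper)
    cuts = []
    for k in range(1, parts):
        target = total * k / parts
        lo, hi = 0.0, float(upper)
        # Bisection is valid because the cumulative integral is increasing
        # whenever the fitted curve is positive on the interval.
        while hi - lo > tol:
            mid = (lo + hi) / 2.0
            if integral(a, b, c, mid) < target:
                lo = mid
            else:
                hi = mid
        cuts.append((lo + hi) / 2.0)
    return cuts

# Hypothetical coefficients for 100 labels sorted in ascending order.
p, q = split_points(a=1.0, b=0.05, c=2.0, n_labels=100)
```

Labels whose sorted index falls below p would form the head subset, those in [p, q) the middle subset, and the rest the tail subset. Since label indices are integers, the cut points would be rounded in practice, which is why the subsets end up only approximately equal in picture count.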
S104: Train the target model with each sample subset separately to obtain the model training accuracy corresponding to each sample subset.
The target model is the model to be trained after the image data has been processed. It may be a deep learning model, a machine learning model, or any other model capable of learning.
So that the final target image sample set can train the target model better, this embodiment determines how to sample mainly from how much each sample subset contributes to training the target model. Therefore, after the multiple sample subsets have been obtained, the target model can be trained with each sample subset separately to obtain the model training accuracy corresponding to each sample subset. The model training accuracy may specifically be the recognition accuracy, which can be obtained on a validation set.
S105: Sample each sample subset using a sampling weight matching the model training accuracy, to obtain the target image sample set.
Each sampling weight corresponds to one sample subset. That is, each sample subset is sampled according to the sampling weight matching its model training accuracy, and the sampled samples are added to the target image sample set. This avoids the problem that purely oversampling or purely undersampling fails to retain the samples that contribute to model training.
Specifically, the sampling process includes:
Step 1: Determine the sampling weights from the model training accuracy;
Step 2: Sample each sample subset according to its sampling weight to obtain the target image sample set.
For example, if there are 3 sample subsets, the target model can be trained separately on each sample subset with all other variables held fixed, and a validation set used to obtain the accuracies $a_1$, $a_2$, $a_3$ that the target model reaches. The weights $w_1$, $w_2$, $w_3$ corresponding to $g_1$, $g_2$, $g_3$ are then:
Figure PCTCN2021076826-appb-000001
Once the sampling weights have been obtained, each sample subset can be sampled directly according to its sampling weight.
Preferably, in practical applications, too few samples for a label may prevent the target model from learning effectively, and too many samples for a label may cause the target model to overfit. That is, data that contributes much to the model should be oversampled, and redundant data should be undersampled. Therefore, in this embodiment, sampling can also be differentiated according to the number of samples per label, so that labels whose sample counts lie around the median are better retained and reasonably sampled. The specific implementation includes:
Step 1: Obtain the relative position of each sample subset in the fitted exponential;
Step 2: Sample each sample subset by combining the relative position and the sampling weight, to obtain the target image sample set.
As FIG. 2 shows, at the head of the fitted exponential, such as $g_1$, the number of pictures per label is low; in the middle, such as $g_2$, the number of pictures per label is moderate, so more can be sampled when the contribution is large; at the tail, such as $g_3$, the number of pictures per label is high, so sampling can be reduced when the contribution is small.
The specific sampling process therefore covers the following cases:
Case 1: If the relative position is the head, the sampling process includes:
Step 1: Determine whether the sampling weight is greater than 1;
Step 2: If so, oversample using the number of pictures corresponding to each label in the sample subset and the sampling weight;
Step 3: If not, take the original pictures of the sample subset.
For example, for $g_1$: if $w_1 > 1$, each label in $g_1$ is sampled to $w_1$ times the number of pictures it has, rounded up if not an integer; to increase data diversity, repeated pictures may be randomly transformed (e.g. rotated by an angle or flipped left-right), cropped and erased. If $w_1 \le 1$, the original image sample set of $g_1$ is taken.
Case 2: If the relative position is the middle, the sampling process includes:
Step 1: Determine whether the sampling weight is greater than 1;
Step 2: If so, oversample using the number of pictures corresponding to each label in the sample subset, the sampling weight and a preset weighting multiple;
Step 3: If not, sample using the number of pictures corresponding to each label in the sample subset and the sampling weight.
For example, for $g_2$: if $w_2 > 1$, each label in $g_2$ is sampled to $m \cdot w_2$ times the number of pictures it has (m is the preset weighting multiple and can be any number greater than 1 chosen to suit the situation, such as 2), rounded up if not an integer, and repeated pictures are randomly flipped, cropped and erased. If $w_2 \le 1$, $w_2$ times the original image sample set of $g_2$ is taken.
Case 3: If the relative position is the tail, the sampling process includes:
Step 1: Determine whether the sampling weight is greater than 1;
Step 2: If so, oversample using the number of pictures corresponding to each label in the sample subset and the sampling weight;
Step 3: If not, obtain the median of the numbers of pictures corresponding to the labels in the sample subset, and, for each label, randomly draw that median number of pictures.
For example, for $g_3$: if $w_3 > 1$, each label in $g_3$ is sampled to $w_3$ times the number of pictures it has, rounded up if not an integer, and repeated pictures are randomly flipped, cropped and erased. If $w_3 \le 1$, the median of the entire ascending-ordered image sample set is taken, and for each label in $g_3$ that median number of pictures is randomly sampled.
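The three cases can be summarized in a short sketch. The function names, the string values "head", "middle" and "tail", rounding up when $w \le 1$ in the middle case, and capping the tail draw at the number of pictures available are all assumptions made here; the source fixes only the rules described in the cases above. The median is passed in as a parameter because the source describes it both as the median within the subset and as the median of the whole sorted set.

```python
import math
import random

def target_count(position, weight, n_images, m=2, median=None):
    """Number of pictures to draw for one label under the three cases."""
    if weight > 1:
        # Oversampling: the middle subset is additionally scaled by m.
        factor = m * weight if position == "middle" else weight
        return math.ceil(factor * n_images)
    if position == "head":
        return n_images                      # keep the original pictures
    if position == "middle":
        return math.ceil(weight * n_images)  # rounding up is an assumption
    if median is None:
        raise ValueError("the tail case with weight <= 1 needs a median")
    return min(median, n_images)             # cap is an assumption

def resample_label(images, position, weight, m=2, median=None, rng=random):
    """Draw the pictures of one label; repeats are meant to be augmented later."""
    k = target_count(position, weight, len(images), m, median)
    if k <= len(images):
        return rng.sample(images, k)
    # Keep every original and add random repeats up to the target count.
    extra = [rng.choice(images) for _ in range(k - len(images))]
    return list(images) + extra

head_draw = resample_label(["a", "b", "c"], "head", 1.5)
```

The augmentation of repeated pictures (flipping, cropping, erasing) is left out of the sketch; it would be applied to the entries in `extra`.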
After each sample subset has been sampled, the target image sample set is obtained.
Preferably, after the target image sample set has been obtained, the target model is trained using the target image sample set to obtain a trained classification and recognition model, and a target picture to be recognized is recognized using the classification and recognition model to obtain a recognition result.
By applying the method provided by the embodiments of the present invention, the labels are sorted according to the number of pictures corresponding to each label in the image sample set; the fitted exponential corresponding to the sorted image sample set is obtained; the image sample set is segmented using the fitted exponential to obtain multiple sample subsets; the target model is trained with each sample subset separately to obtain the model training accuracy corresponding to each sample subset; and each sample subset is sampled using a sampling weight matching the model training accuracy, to obtain the target image sample set.
In this method, the image sample set is first reordered by the number of pictures per label, and the fitted exponential of the image sample set is then determined. Based on the fitted exponential, the image sample set can be split into multiple sample subsets according to the number of pictures per label, so that the labels within one sample subset have similar picture counts. Each sample subset is then used separately to train the target model, yielding a model training accuracy that indicates how much each sample subset contributes to training the target model. Sampling weights are determined from the model training accuracy, and each sample subset is sampled accordingly to obtain the target image sample set. The sample distribution of the target image sample set thus follows the distribution of the samples' contribution, which can further improve model training and the accuracy of picture recognition results.
Corresponding to the above method embodiment, the embodiments of the present invention also provide an image data processing apparatus. The image data processing apparatus described below and the image data processing method described above may be read with reference to each other.
As shown in FIG. 3, the apparatus includes the following modules:
an image sample set sorting module 101, configured to sort the labels according to the number of pictures corresponding to each label in the image sample set;
a fitting module 102, configured to obtain the fitted exponential corresponding to the sorted image sample set;
an image sample set segmentation module 103, configured to segment the image sample set using the fitted exponential to obtain multiple sample subsets;
a training module 104, configured to train the target model with each sample subset separately to obtain the model training accuracy corresponding to each sample subset;
a resampling module 105, configured to sample each sample subset using a sampling weight matching the model training accuracy, to obtain the target image sample set.
By applying the apparatus provided by the embodiments of the present invention, the labels are sorted according to the number of pictures corresponding to each label in the image sample set; the fitted exponential corresponding to the sorted image sample set is obtained; the image sample set is segmented using the fitted exponential to obtain multiple sample subsets; the target model is trained with each sample subset separately to obtain the model training accuracy corresponding to each sample subset; and each sample subset is sampled using a sampling weight matching the model training accuracy, to obtain the target image sample set.
In this apparatus, the image sample set is first reordered by the number of pictures per label, and the fitted exponential of the image sample set is then determined. Based on the fitted exponential, the image sample set can be split into multiple sample subsets according to the number of pictures per label, so that the labels within one sample subset have similar picture counts. Each sample subset is then used separately to train the target model, yielding a model training accuracy that indicates how much each sample subset contributes to training the target model. Sampling weights are determined from the model training accuracy, and each sample subset is sampled accordingly to obtain the target image sample set. The sample distribution of the target image sample set thus follows the distribution of the samples' contribution, which can further improve model training and the accuracy of picture recognition results.
In a specific embodiment of the present invention, the image sample set segmentation module 103 is specifically configured to segment the image sample set using the integral of the fitted exponential to obtain multiple sample subsets with equal total numbers of pictures.
In a specific embodiment of the present invention, the resampling module 105 specifically includes:
a relative position obtaining unit, configured to obtain the relative position of each sample subset in the fitted exponential;
a resampling unit, configured to sample each sample subset by combining the relative position and the sampling weight, to obtain the target image sample set.
In a specific embodiment of the present invention, the resampling unit is specifically configured to: if the relative position is the head, determine whether the sampling weight is greater than 1; if so, oversample using the number of pictures corresponding to each label in the sample subset and the sampling weight; if not, take the original pictures of the sample subset.
In a specific embodiment of the present invention, the resampling unit is specifically configured to: if the relative position is the middle, determine whether the sampling weight is greater than 1; if so, oversample using the number of pictures corresponding to each label in the sample subset, the sampling weight and a preset weighting multiple; if not, sample using the number of pictures corresponding to each label in the sample subset and the sampling weight.
In a specific embodiment of the present invention, the resampling unit is specifically configured to: if the relative position is the tail, determine whether the sampling weight is greater than 1; if so, oversample using the number of pictures corresponding to each label in the sample subset and the sampling weight; if not, obtain the median of the numbers of pictures corresponding to the labels in the sample subset and, for each label, randomly draw that median number of pictures.
In a specific embodiment of the present invention, the apparatus further includes:
a model training module, configured to train the target model using the target image sample set to obtain a trained classification and recognition model;
a recognition module, configured to recognize a target picture to be recognized using the classification and recognition model to obtain a recognition result.
Corresponding to the above method embodiment, the embodiments of the present invention also provide an image data processing device. The image data processing device described below and the image data processing method described above may be read with reference to each other.
As shown in FIG. 4, the image data processing device includes:
a memory 332, configured to store a computer program;
a processor 322, configured to implement the steps of the image data processing method of the above method embodiment when executing the computer program.
Specifically, refer to FIG. 5, which is a schematic diagram of a specific structure of an image data processing device provided by this embodiment. The image data processing device may vary considerably with configuration or performance, and may include one or more central processing units (CPU) 322 and a memory 332, the memory 332 storing one or more computer application programs 342 or data 344. The memory 332 may be transient or persistent storage. The program stored in the memory 332 may include one or more modules (not shown), each of which may include a series of instruction operations for the data processing device. Furthermore, the central processing unit 322 may be configured to communicate with the memory 332 and to execute, on the image data processing device 301, the series of instruction operations in the memory 332.
The image data processing device 301 may also include one or more power supplies 326, one or more wired or wireless network interfaces 350, one or more input/output interfaces 358, and/or one or more operating systems 341.
The steps of the image data processing method described above can be implemented by the structure of the image data processing device.
Corresponding to the above method embodiment, the embodiments of the present invention also provide a readable storage medium. The readable storage medium described below and the image data processing method described above may be read with reference to each other.
A readable storage medium stores a computer program which, when executed by a processor, implements the steps of the image data processing method of the above method embodiment.
The readable storage medium may specifically be a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disc, or any other readable storage medium capable of storing program code.
Those skilled in the art will further appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To illustrate the interchangeability of hardware and software clearly, the composition and steps of each example have been described above in general terms of function. Whether these functions are performed in hardware or software depends on the specific application and the design constraints of the technical solution. Those skilled in the art may implement the described functions in different ways for each specific application, but such implementations should not be considered to go beyond the scope of the present invention.

Claims (10)

  1. An image data processing method, characterized in that it comprises:
    sorting the labels of an image sample set according to the number of pictures corresponding to each label in the image sample set;
    acquiring a fitting index corresponding to the sorted image sample set;
    segmenting the image sample set by using the fitting index to obtain a plurality of sample subsets;
    training a target model with each of the sample subsets separately to obtain a model training accuracy corresponding to each of the sample subsets;
    sampling each of the sample subsets by using a sampling weight matching the model training accuracy to obtain a target image sample set.
  2. The image data processing method according to claim 1, characterized in that segmenting the image sample set by using the fitting index to obtain a plurality of sample subsets comprises:
    segmenting the image sample set by using the integral of the fitting index to obtain a plurality of the sample subsets having equal total numbers of pictures.
  3. The image data processing method according to claim 1, characterized in that sampling each of the sample subsets by using a sampling weight matching the model training accuracy to obtain a target image sample set comprises:
    acquiring the relative position of each of the sample subsets in the fitting index;
    sampling each of the sample subsets by combining the relative position and the sampling weight to obtain the target image sample set.
  4. The image data processing method according to claim 3, characterized in that sampling each of the sample subsets by combining the relative position and the sampling weight to obtain the target image sample set comprises:
    if the relative position is the head, determining whether the sampling weight is greater than 1;
    if so, performing oversampling by using the number of pictures corresponding to each label in the sample subset and the sampling weight;
    if not, taking the original pictures of the sample subset.
  5. The image data processing method according to claim 3, characterized in that sampling each of the sample subsets by combining the relative position and the sampling weight to obtain the target image sample set comprises:
    if the relative position is the middle, determining whether the sampling weight is greater than 1;
    if so, performing oversampling by using the number of pictures corresponding to each label in the sample subset, the sampling weight and a preset weighting multiple;
    if not, performing sampling by using the number of pictures corresponding to each label in the sample subset and the sampling weight.
  6. The image data processing method according to claim 3, characterized in that sampling each of the sample subsets by combining the relative position and the sampling weight to obtain the target image sample set comprises:
    if the relative position is the tail, determining whether the sampling weight is greater than 1;
    if so, performing oversampling by using the number of pictures corresponding to each label in the sample subset and the sampling weight;
    if not, obtaining the median of the numbers of pictures corresponding to the labels in the sample subset, and randomly drawing, for each label, a number of pictures equal to the median.
  7. The image data processing method according to any one of claims 1 to 6, characterized in that it further comprises:
    training the target model with the target image sample set to obtain a trained classification and recognition model;
    recognizing a target picture to be recognized by using the classification and recognition model to obtain a recognition result.
  8. An image data processing apparatus, characterized in that it comprises:
    an image sample set sorting module, configured to sort the labels of an image sample set according to the number of pictures corresponding to each label in the image sample set;
    a fitting module, configured to acquire a fitting index corresponding to the sorted image sample set;
    an image sample set segmentation module, configured to segment the image sample set by using the fitting index to obtain a plurality of sample subsets;
    a training module, configured to train a target model with each of the sample subsets separately to obtain a model training accuracy corresponding to each of the sample subsets;
    a resampling module, configured to sample each of the sample subsets by using a sampling weight matching the model training accuracy to obtain a target image sample set.
  9. An image data processing device, characterized in that it comprises:
    a memory, configured to store a computer program;
    a processor, configured to implement the steps of the image data processing method according to any one of claims 1 to 7 when executing the computer program.
  10. A readable storage medium, characterized in that a computer program is stored on the readable storage medium, and when the computer program is executed by a processor, the steps of the image data processing method according to any one of claims 1 to 7 are implemented.
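
The position-dependent sampling rules of claims 4 to 6 can be illustrated with the following Python sketch. The claims state which quantities each rule uses but not how they are combined, so the sketch assumes that the number of pictures drawn for a label is the per-label picture count multiplied by the sampling weight (and, for the middle position with a weight greater than 1, additionally by the preset weighting multiple). The function names, the default multiple of 2.0, the truncation of a fractional median and the use of drawing with replacement when more pictures are needed than exist are all assumptions of this sketch.

```python
import random
import statistics


def target_count(position, count, weight, median_count, multiple=2.0):
    """Number of pictures to draw for one label under the head/middle/tail rules."""
    if position == "head":
        # Oversample if weight > 1, otherwise keep the original pictures.
        return round(count * weight) if weight > 1 else count
    if position == "middle":
        # Oversampling additionally applies the preset weighting multiple.
        if weight > 1:
            return round(count * weight * multiple)
        return round(count * weight)
    if position == "tail":
        # Otherwise every label is drawn at the subset's median count.
        return round(count * weight) if weight > 1 else median_count
    raise ValueError(f"unknown position: {position}")


def resample_subset(subset, position, weight, multiple=2.0, rng=None):
    """Resample one sample subset.

    subset maps each label to a non-empty list of pictures (assumption).
    """
    rng = rng or random.Random()
    median_count = int(statistics.median(len(p) for p in subset.values()))
    result = {}
    for label, pictures in subset.items():
        n = target_count(position, len(pictures), weight, median_count, multiple)
        if n <= len(pictures):
            # Draw without replacement when enough pictures exist.
            result[label] = rng.sample(pictures, n)
        else:
            # Keep all originals and add random duplicates to reach n.
            extra = rng.choices(pictures, k=n - len(pictures))
            result[label] = list(pictures) + extra
    return result
```

The target image sample set would then be the union of the resampled subsets. How the sampling weight is derived from the model training accuracy is left outside this sketch.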
PCT/CN2021/076826 2020-06-11 2021-02-19 Image data processing method and apparatus, device and readable storage medium WO2021248932A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010530581.5A CN111723856B (en) 2020-06-11 2020-06-11 Image data processing method, device, equipment and readable storage medium
CN202010530581.5 2020-06-11

Publications (1)

Publication Number Publication Date
WO2021248932A1 (en) 2021-12-16

Family

ID=72568019

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/076826 WO2021248932A1 (en) 2020-06-11 2021-02-19 Image data processing method and apparatus, device and readable storage medium

Country Status (2)

Country Link
CN (1) CN111723856B (en)
WO (1) WO2021248932A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114612725A (en) * 2022-03-18 2022-06-10 北京百度网讯科技有限公司 Image processing method, device, equipment and storage medium

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111723856B (en) * 2020-06-11 2023-06-09 广东浪潮大数据研究有限公司 Image data processing method, device, equipment and readable storage medium
CN112138394B (en) * 2020-10-16 2022-05-03 腾讯科技(深圳)有限公司 Image processing method, image processing device, electronic equipment and computer readable storage medium
CN113223017A (en) * 2021-05-18 2021-08-06 北京达佳互联信息技术有限公司 Training method of target segmentation model, target segmentation method and device

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160162757A1 (en) * 2014-12-03 2016-06-09 Institute For Information Industry Multi-class object classifying method and system
CN109241903A (en) * 2018-08-30 2019-01-18 平安科技(深圳)有限公司 Sample data cleaning method, device, computer equipment and storage medium
CN110163865A (en) * 2019-05-28 2019-08-23 闽江学院 A kind of method of sampling for unbalanced data in models fitting
CN110889457A (en) * 2019-12-03 2020-03-17 深圳奇迹智慧网络有限公司 Sample image classification training method and device, computer equipment and storage medium
CN110969260A (en) * 2019-10-22 2020-04-07 成都信息工程大学 Unbalanced data oversampling method and device and storage medium
CN111723856A (en) * 2020-06-11 2020-09-29 广东浪潮大数据研究有限公司 Image data processing method, device and equipment and readable storage medium

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7184929B2 (en) * 2004-01-28 2007-02-27 Microsoft Corporation Exponential priors for maximum entropy models
CN108304936B (en) * 2017-07-12 2021-11-16 腾讯科技(深圳)有限公司 Machine learning model training method and device, and expression image classification method and device
CN110929026B (en) * 2018-09-19 2023-04-25 阿里巴巴集团控股有限公司 Abnormal text recognition method, device, computing equipment and medium
CN110222757A (en) * 2019-05-31 2019-09-10 华北电力大学(保定) Based on insulator image pattern extending method, the system for generating confrontation network
CN110674756B (en) * 2019-09-25 2022-07-05 普联技术有限公司 Human body attribute recognition model training method, human body attribute recognition method and device
CN110674881B (en) * 2019-09-27 2022-02-11 长城计算机软件与系统有限公司 Trademark image retrieval model training method, system, storage medium and computer equipment
CN110610061B (en) * 2019-09-30 2022-11-08 湖南大学 Concrete slump high-precision prediction method fusing multi-source information
CN110852396A (en) * 2019-11-15 2020-02-28 苏州中科华影健康科技有限公司 Sample data processing method for cervical image
CN111104572A (en) * 2019-12-12 2020-05-05 北京金山云网络技术有限公司 Feature selection method and device for model training and electronic equipment

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160162757A1 (en) * 2014-12-03 2016-06-09 Institute For Information Industry Multi-class object classifying method and system
CN109241903A (en) * 2018-08-30 2019-01-18 平安科技(深圳)有限公司 Sample data cleaning method, device, computer equipment and storage medium
CN110163865A (en) * 2019-05-28 2019-08-23 闽江学院 A kind of method of sampling for unbalanced data in models fitting
CN110969260A (en) * 2019-10-22 2020-04-07 成都信息工程大学 Unbalanced data oversampling method and device and storage medium
CN110889457A (en) * 2019-12-03 2020-03-17 深圳奇迹智慧网络有限公司 Sample image classification training method and device, computer equipment and storage medium
CN111723856A (en) * 2020-06-11 2020-09-29 广东浪潮大数据研究有限公司 Image data processing method, device and equipment and readable storage medium

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114612725A (en) * 2022-03-18 2022-06-10 北京百度网讯科技有限公司 Image processing method, device, equipment and storage medium
CN116229175A (en) * 2022-03-18 2023-06-06 北京百度网讯科技有限公司 Image processing method, device, equipment and storage medium
CN116229175B (en) * 2022-03-18 2023-12-26 北京百度网讯科技有限公司 Image processing method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN111723856A (en) 2020-09-29
CN111723856B (en) 2023-06-09

Similar Documents

Publication Publication Date Title
WO2021248932A1 (en) Image data processing method and apparatus, device and readable storage medium
US11657602B2 (en) Font identification from imagery
WO2020063314A1 (en) Character segmentation identification method and apparatus, electronic device, and storage medium
US20160026848A1 (en) Global-scale object detection using satellite imagery
JP2014232533A (en) System and method for ocr output verification
WO2020223859A1 (en) Slanted text detection method, apparatus and device
CN113255694A (en) Training image feature extraction model and method and device for extracting image features
WO2017088537A1 (en) Component classification method and apparatus
CN111353491B (en) Text direction determining method, device, equipment and storage medium
WO2020164278A1 (en) Image processing method and device, electronic equipment and readable storage medium
CN110717492B (en) Method for correcting direction of character string in drawing based on joint features
WO2023138188A1 (en) Feature fusion model training method and apparatus, sample retrieval method and apparatus, and computer device
JP6997369B2 (en) Programs, ranging methods, and ranging devices
CN111191649A (en) Method and equipment for identifying bent multi-line text image
CN115937655B (en) Multi-order feature interaction target detection model, construction method, device and application thereof
WO2023206944A1 (en) Semantic segmentation method and apparatus, computer device, and storage medium
CN112036520A (en) Panda age identification method and device based on deep learning and storage medium
CN115223166A (en) Picture pre-labeling method, picture labeling method and device, and electronic equipment
JP2012048624A (en) Learning device, method and program
WO2023241385A1 (en) Model transferring method and apparatus, and electronic device
WO2024074042A1 (en) Data storage method and apparatus, data reading method and apparatus, and device
CN111611388A (en) Account classification method, device and equipment
CN115984853A (en) Character recognition method and device
CN113569940B (en) Knowledge migration and probability correction-based few-sample target detection method
WO2022127333A1 (en) Training method and apparatus for image segmentation model, image segmentation method and apparatus, and device

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application
    Ref document number: 21822968; Country of ref document: EP; Kind code of ref document: A1
NENP Non-entry into the national phase
    Ref country code: DE
122 Ep: pct application non-entry in european phase
    Ref document number: 21822968; Country of ref document: EP; Kind code of ref document: A1