CN114399512A - Example segmentation model training method and device based on artificial intelligence and storage medium - Google Patents

Example segmentation model training method and device based on artificial intelligence and storage medium

Info

Publication number
CN114399512A
CN114399512A (Application CN202210074092.2A)
Authority
CN
China
Prior art keywords
size
sample
image
target
tail
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210074092.2A
Other languages
Chinese (zh)
Inventor
郑喜民
陈振宏
舒畅
陈又新
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN202210074092.2A priority Critical patent/CN114399512A/en
Publication of CN114399512A publication Critical patent/CN114399512A/en
Priority to PCT/CN2022/090748 priority patent/WO2023137921A1/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/70Determining position or orientation of objects or cameras

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an example segmentation model training method, device and storage medium based on artificial intelligence, wherein the model training method comprises the following steps: acquiring a long-tail distribution image data set; acquiring a first sample and a second sample from the long-tail distribution image dataset; cutting the first sample according to the first position information to obtain a target tail category image; acquiring a first size and a second size; determining target application position information of the target tail category image in the second sample according to the first position information, the first size and the second size; applying the target tail category image to a second sample according to the target application position information to obtain training data; and acquiring a preset example segmentation model, and training the example segmentation model according to the training data and a preset loss function to obtain a target example segmentation model. According to the technical scheme, the data category distribution of the long-tail distribution image data can be effectively balanced, and the accuracy of the example segmentation model is improved.

Description

Example segmentation model training method and device based on artificial intelligence and storage medium
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to an example segmentation model training method, device and storage medium based on artificial intelligence.
Background
In an unmanned vehicle system, an example segmentation model identifies information about the surrounding roads, vehicles and obstacles from street view image data, and decisions are made based on this information to control the direction and speed of the vehicle. The performance of the example segmentation model is therefore directly related to the safety, stability and comfort of the unmanned vehicle.
In practical applications, the acquired street view image data set, i.e., the training data set used to train the example segmentation model, often follows a long-tailed distribution: a small number of categories account for most of the occurrences, while most categories appear only rarely. This leads to an unbalanced distribution of training data categories and, in turn, to low accuracy of the example segmentation model.
Disclosure of Invention
The following is a summary of the subject matter described in detail herein. This summary is not intended to limit the scope of the claims.
The embodiment of the invention provides an example segmentation model training method, device and storage medium based on artificial intelligence, and the accuracy of an example segmentation model can be improved.
In a first aspect, an embodiment of the present invention provides an example segmentation model training method based on artificial intelligence, including:
acquiring a long-tail distribution image data set;
obtaining a first sample and a second sample from the long-tail distribution image dataset, wherein the first sample comprises a tail category image, and the second sample is different from the first sample;
determining first position information of a tail type image in the first sample, and cutting the first sample according to the first position information to obtain a target tail type image;
acquiring a first size and a second size, wherein the first size is the image size of the target tail category image, and the second size is the image size of the second sample;
determining target application position information of the target tail category image in the second sample according to the first position information, the first size and the second size;
applying the target tail type image to the second sample according to the target application position information to obtain training data;
and acquiring a preset example segmentation model, and training the example segmentation model according to the training data and a preset loss function to obtain a target example segmentation model.
In an embodiment, after the cropping the first sample according to the first position information to obtain a target tail category image, the method further includes:
and performing data enhancement on the target tail category image to obtain a new target tail category image.
In an embodiment, before the applying the target tail class image to the second sample according to the target application position information, the method further includes:
obtaining a scaling factor according to the first size and the second size;
and carrying out size adjustment on the target tail type image according to the scaling factor.
In an embodiment, the determining the target application position information of the target tail category image at the second sample according to the first position information, the first size and the second size includes:
acquiring second abscissa information, wherein the second abscissa information is obtained by multiplying the first abscissa information by the ratio of the second height to the first height;
acquiring second vertical coordinate information, wherein the second vertical coordinate information is obtained by multiplying the first vertical coordinate information by the ratio of the second width to the first width;
and determining the second abscissa information and the second ordinate information as the target application position information.
In an embodiment, the deriving the scaling factor according to the first size and the second size includes:
obtaining a first intermediate value, wherein the first intermediate value is obtained by multiplying the first width by the first height;
obtaining a second intermediate value, wherein the second intermediate value is obtained by multiplying the second width by the second height;
and determining a value obtained by dividing the second intermediate value by the first intermediate value as the scaling factor.
In one embodiment, the loss function includes a classification loss function, and the specific formula of the classification loss function is as follows:
[The formula for the classification loss is shown as an image in the original publication and is not reproduced here.]
where L_cls(z) is the classification loss value, z is the preset activation function, y_i is the true value of the sample, i is a first class label, N_i is the number of samples of the different classes, j is a second class label, and σ_i is determined according to the following formula:
[Formula shown as an image in the original publication and not reproduced here.]
S_ij is determined according to the following formula:
[Formula shown as an image in the original publication and not reproduced here.]
in one embodiment, the loss function includes a segmentation loss function, and the specific formula of the segmentation loss function is as follows:
[The formula for the segmentation loss is shown as an image in the original publication and is not reproduced here.]
where the weight term (rendered as an image in the original publication) is determined according to the following formula:
[Formula shown as an image in the original publication and not reproduced here.]
where p_m is the segmentation prediction result for class m, S_bbox is the area of the predicted bounding box, and S_mask is the area of the segmentation mask.
In a second aspect, an embodiment of the present invention provides an example segmentation model training apparatus, including:
the first acquisition module is used for acquiring a long-tail distribution image dataset;
a second obtaining module, configured to obtain a first sample and a second sample from the long-tail distribution image dataset, where the first sample includes a tail category image, and the second sample is different from the first sample;
the target tail category image determining module is used for determining first position information of a tail category image in the first sample and cutting the first sample according to the first position information to obtain a target tail category image;
a third obtaining module, configured to obtain a first size and a second size, where the first size is an image size of the target tail category image, and the second size is an image size of the second sample;
a target application position information determining module, configured to determine target application position information of the target tail category image in the second sample according to the first position information, the first size, and the second size;
the training data determining module is used for applying the target tail type image to the second sample according to the target application position information to obtain training data;
and the model training module is used for acquiring a preset example segmentation model, and training the example segmentation model according to the training data and a preset loss function to obtain a target example segmentation model.
In a third aspect, an embodiment of the present invention provides an example segmentation model training apparatus, including: memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor when executing the computer program implements the method for training an artificial intelligence based instance segmentation model as described in any of the embodiments of the first aspect.
In a fourth aspect, an embodiment of the present invention further provides a computer-readable storage medium, which stores computer-executable instructions for performing the artificial intelligence based example segmentation model training method according to any one of the embodiments of the first aspect.
The embodiment of the invention comprises an example segmentation model training method, an example segmentation model training device and a storage medium based on artificial intelligence, wherein the example segmentation model training method based on artificial intelligence comprises the following steps: acquiring a long-tail distribution image data set; obtaining a first sample and a second sample from the long-tail distribution image dataset, wherein the first sample comprises a tail category image, and the second sample is different from the first sample; determining first position information of a tail type image in the first sample, and cutting the first sample according to the first position information to obtain a target tail type image; acquiring a first size and a second size, wherein the first size is the image size of the target tail category image, and the second size is the image size of the second sample; determining target application position information of the target tail category image in the second sample according to the first position information, the first size and the second size; applying the target tail type image to the second sample according to the target application position information to obtain training data; and acquiring a preset example segmentation model, and training the example segmentation model according to the training data and a preset loss function to obtain a target example segmentation model. According to the technical scheme, the data category distribution of the long-tail distribution image data can be effectively balanced, and the accuracy of the example segmentation model is improved.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
Drawings
The accompanying drawings are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification; they illustrate embodiments of the invention and, together with the description, serve to explain the principles of the invention without limiting it.
FIG. 1 is a flowchart illustrating steps of an example segmentation model training method based on artificial intelligence, according to an embodiment of the present invention;
FIG. 2 is a flowchart of the steps provided by another embodiment of the present invention for enhancing data of a target tail class image;
FIG. 3 is a flowchart of the steps provided by another embodiment of the present invention to resize a target tail category image;
FIG. 4 is a flowchart of steps provided by another embodiment of the present invention to obtain location information for a target application;
FIG. 5 is a flowchart of the steps provided by another embodiment of the present invention to obtain a scaling factor;
FIG. 6 is a block diagram of an example segmentation model training apparatus according to another embodiment of the present invention;
FIG. 7 is a block diagram of an example segmentation model training apparatus according to another embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
It should be noted that although functional blocks are partitioned in a schematic diagram of an apparatus and a logical order is shown in a flowchart, in some cases, the steps shown or described may be performed in a different order than the partitioning of blocks in the apparatus or the order in the flowchart. The terms "first," "second," and the like in the description, in the claims, or in the drawings described above, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order.
The invention provides an example segmentation model training method, device and storage medium based on artificial intelligence, wherein the example segmentation model training method based on artificial intelligence comprises the following steps: acquiring a long-tail distribution image data set; obtaining a first sample and a second sample from the long-tail distribution image dataset, wherein the first sample comprises a tail category image, and the second sample is different from the first sample; determining first position information of a tail type image in the first sample, and cutting the first sample according to the first position information to obtain a target tail type image; acquiring a first size and a second size, wherein the first size is the image size of the target tail category image, and the second size is the image size of the second sample; determining target application position information of the target tail category image in the second sample according to the first position information, the first size and the second size; applying the target tail type image to the second sample according to the target application position information to obtain training data; and acquiring a preset example segmentation model, and training the example segmentation model according to the training data and a preset loss function to obtain a target example segmentation model. According to the technical scheme, the data category distribution of the long-tail distribution image data can be effectively balanced, and the accuracy of the example segmentation model is improved.
The embodiments of the present application can acquire and process related data based on artificial intelligence technology. Artificial Intelligence (AI) is the theory, method, technology and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge and use that knowledge to obtain the best results.
The artificial intelligence infrastructure generally includes technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction devices, mechatronics, and the like. The artificial intelligence software technology mainly comprises a computer vision technology, a robot technology, a biological recognition technology, a voice processing technology, a natural language processing technology, machine learning/deep learning and the like.
The terminal mentioned in the embodiment of the present invention may be a smart phone, a tablet computer, a notebook computer, a desktop computer, a vehicle-mounted computer, a smart home, a wearable electronic device, a VR (Virtual Reality)/AR (Augmented Reality) device, and the like; the server may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, a cloud server providing basic cloud computing services such as cloud service, a cloud database, cloud computing, a cloud function, cloud storage, Network service, cloud communication, middleware service, domain name service, security service, Content Delivery Network (CDN), big data and an artificial intelligence platform, and the like.
It should be noted that the data in the embodiment of the present invention may be stored in a server, and the server may be an independent server, or may be a cloud server that provides basic cloud computing services such as cloud service, a cloud database, cloud computing, a cloud function, cloud storage, Network service, cloud communication, middleware service, domain name service, security service, Content Delivery Network (CDN), big data, and an artificial intelligence platform.
The embodiments of the present invention will be further explained with reference to the drawings.
As shown in fig. 1, fig. 1 is a flowchart illustrating steps of an example segmentation model training method based on artificial intelligence according to an embodiment of the present invention, where the example segmentation model training method includes, but is not limited to, the following steps:
step S110, acquiring a long-tail distribution image data set;
step S120, a first sample and a second sample are obtained from the long-tail distribution image data set, the first sample comprises a tail type image, and the second sample is different from the first sample;
step S130, determining first position information of the tail type image in the first sample, and cutting the first sample according to the first position information to obtain a target tail type image;
step S140, acquiring a first size and a second size, wherein the first size is the image size of the target tail category image, and the second size is the image size of the second sample;
step S150, determining target application position information of the target tail category image in the second sample according to the first position information, the first size and the second size;
step S160, applying the target tail type image to a second sample according to the target application position information to obtain training data;
step S170, obtaining a preset example segmentation model, and training the example segmentation model according to the training data and a preset loss function to obtain a target example segmentation model.
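For illustration only, the flow of steps S110 to S170 described above can be sketched in Python as follows. The data-set structure, the field names ("image", "has_tail", "tail_bbox") and the use of the sample image sizes as the first and second sizes are assumptions made for this sketch and are not prescribed by the embodiment; model training (step S170) is omitted.

import random

def build_training_sample(dataset):
    # Step S120: a first sample containing a tail-class instance and a
    # different second sample without one (hypothetical annotation fields).
    first = random.choice([s for s in dataset if s["has_tail"]])
    second = random.choice([s for s in dataset if not s["has_tail"]])

    # Step S130: first position information (x1, y1) and crop of the tail instance.
    x1, y1, w1, h1 = first["tail_bbox"]
    patch = first["image"][y1:y1 + h1, x1:x1 + w1].copy()

    # Step S140: first size (cropped patch) and second size (second sample).
    h2, w2 = second["image"].shape[:2]

    # Step S150: target application position, following the ratios described
    # later in the detailed description (x2 = x1 * h2 / h1, y2 = y1 * w2 / w1).
    x2 = int(x1 * h2 / h1)
    y2 = int(y1 * w2 / w1)

    # Step S160: paste the patch onto the second sample to obtain training data.
    image = second["image"].copy()
    ph, pw = patch.shape[:2]
    y_end, x_end = min(y2 + ph, h2), min(x2 + pw, w2)
    image[y2:y_end, x2:x_end] = patch[:y_end - y2, :x_end - x2]
    return image

The resulting image, together with the correspondingly updated annotation, would then be fed to the example segmentation model in step S170.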
It should be noted that the embodiment of the present application does not limit the scene type of the long-tailed distribution image data set or the way it is obtained. It can be understood that the application scenario of the example segmentation model training method of this embodiment may be an unmanned driving scenario, and the long-tailed distribution image data set may be street view image data obtained from an existing public data set, such as the Cityscapes data set; alternatively, a first image set may be acquired from a camera of an unmanned vehicle and its tail data or head data labeled, so as to provide a data basis for training the example segmentation model.
It is understood that the first sample is an image containing the tail category image, while the second sample is different from the first sample, i.e., the second sample does not contain the tail category image. Obtaining the first sample and the second sample provides an effective data basis for obtaining the training data.
It should be noted that the embodiment of the present application does not limit the specific manner of obtaining the first sample and the second sample. When each image in the long-tail distribution image data set carries a tail category tag or a head category tag, the first sample may be obtained from the data set according to the tail category tag, and the second sample may be obtained according to the head category tag. How the tail category tag or head category tag is obtained is well known to those skilled in the art and can be derived from the annotation information of the long-tail distribution image data set.
It can be understood that the first position information of the tail category image in the first sample can be determined from the tail category tag of the first sample, and the first sample is cropped according to the first position information to obtain the target tail category image, which provides an effective data basis for acquiring the training data.
It can be understood that obtaining the image size of the target tail category image, that is, the first size, and obtaining the image size of the second sample, that is, the second size, can provide an effective data basis for the target tail category image to be applied to the second sample, and it should be noted that specific ways of obtaining the first size and the second size are well known to those skilled in the art, and details of the embodiments of the present application are not repeated herein.
It can be understood that determining the target application position information of the target tail category image in the second sample according to the first position information, the first size and the second size effectively ensures that the target tail category image does not exceed the image boundary of the second sample when it is applied to the second sample. Compared with cut-and-paste augmentation methods that use a model to mine semantic information from an image in order to determine the paste position, the technical solution of the embodiment of the present application requires little computation, takes less time and does not need additional supervision information.
It should be noted that the embodiment of the present application does not involve an improvement of the example segmentation model itself; those skilled in the art may select a specific example segmentation model according to the actual situation, for example a Mask-RCNN model, a Fast-RCNN model, or the like.
It can be understood that, in practical applications, most data sets used for model training present a long-tail distribution, which is a special asymmetric distribution: some categories contain a very large amount of data and are called head categories, while the corresponding remaining categories contain very little data and are called tail categories. According to the technical scheme of the present application, a first sample and a second sample are obtained from the long-tail distribution image data set, where the first sample includes a tail category image and the second sample is different from the first sample; first position information of the tail category image in the first sample is determined, and the first sample is cropped according to the first position information to obtain a target tail category image; a first size and a second size are acquired, where the first size is the image size of the target tail category image and the second size is the image size of the second sample; target application position information of the target tail category image in the second sample is determined according to the first position information, the first size and the second size; and the target tail category image is applied to the second sample according to the target application position information to obtain training data. In this way, the tail category data and head category data in the training data become evenly distributed, which improves the accuracy of the example segmentation model.
In addition, referring to fig. 2, in an embodiment, after step S130 in the embodiment shown in fig. 1, the following steps are included, but not limited to:
step S210, performing data enhancement on the target tail category image to obtain a new target tail category image.
It can be understood that, in order to improve the robustness and generalization capability of the example segmentation model, the model often needs to be trained with large-scale data. In practice, however, training data is frequently insufficient, which makes data enhancement techniques particularly important. Performing data enhancement on the target tail category image therefore increases the amount of training data and improves its diversity.
It should be noted that the embodiment of the present application does not limit the specific manner of data enhancement, and may be implemented by a geometric transformation method, such as flipping, rotating, cropping, scaling, translating, or dithering the image data, or by a pixel transformation method, by adjusting the brightness of the image, adjusting the white balance, and so on.
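As a rough illustration of the geometric and pixel transformations mentioned above, the following sketch applies a random flip, rotation and brightness jitter to the cropped tail category image. The use of OpenCV and NumPy, the parameter ranges and the function name are choices made for this example and are not specified by the embodiment.

import random
import cv2
import numpy as np

def augment_tail_patch(patch: np.ndarray) -> np.ndarray:
    # Randomly flip, rotate and brightness-jitter a cropped tail-class image.
    out = patch.copy()
    if random.random() < 0.5:                 # horizontal flip
        out = cv2.flip(out, 1)
    k = random.choice([0, 1, 2, 3])           # rotate by a multiple of 90 degrees
    if k:
        out = np.rot90(out, k).copy()
    gain = random.uniform(0.8, 1.2)           # brightness jitter
    out = np.clip(out.astype(np.float32) * gain, 0, 255).astype(np.uint8)
    return out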
In addition, referring to fig. 3, in an embodiment, before step S160 in the embodiment shown in fig. 1, the following steps are further included, but not limited to:
step S310, obtaining a scaling factor according to the first size and the second size;
and step S320, adjusting the size of the target tail category image according to the scaling factor.
It can be understood that obtaining the scaling factor according to the first size and the second size, and resizing the target tail category image according to that scaling factor, ensures that the target tail category image does not extend beyond the boundary of the second sample when it is applied to the second sample.
In addition, referring to fig. 4, in an embodiment, the first size includes a first width and a first height, the second size includes a second width and a second height, the first position information includes a first abscissa information and a first ordinate information, and the step S150 in the embodiment shown in fig. 1 includes, but is not limited to, the following steps:
step S410, second abscissa information is obtained, and the second abscissa information is obtained by multiplying the first abscissa information by the ratio of the second height to the first height;
step S420, second vertical coordinate information is obtained, and the second vertical coordinate information is obtained by multiplying the first vertical coordinate information by the ratio of the second width to the first width;
step S430, determining the second abscissa information and the second ordinate information as the target application location information.
It should be noted that the embodiment of the present application may be applied to an unmanned driving scenario, where the long-tail distribution image data set may be street view image data containing vehicle instances of tail categories, such as trains, vans, buses and motorcycles. Since all of these vehicles are in contact with the road surface, the scheme takes this into account when determining the target application position information of the target tail category image in the second sample. The width of the minimum bounding rectangle of the road label in the target tail category image, i.e., the first width, is denoted w1, and its height, i.e., the first height, is denoted h1. For the image to be pasted that does not contain the long-tail category, the width of the minimum bounding rectangle of the road label in the second sample, i.e., the second width, is denoted w2, and its height, i.e., the second height, is denoted h2. The position information of the target tail category image in the first sample is (x1, y1), where x1 is the first abscissa information and y1 is the first ordinate information. The target application position information of the target tail category image in the second sample, i.e., the second abscissa information and the second ordinate information (x2, y2), is obtained according to the following formulas:
x2 = x1 × (h2 / h1)
y2 = y1 × (w2 / w1)
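A direct transcription of the two formulas above into code could look as follows; the rounding to integer pixel coordinates is an added assumption of this sketch.

def target_position(x1, y1, w1, h1, w2, h2):
    # Second abscissa = first abscissa * (second height / first height);
    # second ordinate = first ordinate * (second width / first width).
    x2 = int(round(x1 * h2 / h1))
    y2 = int(round(y1 * w2 / w1))
    return x2, y2

For example, target_position(120, 300, w1=400, h1=250, w2=380, h2=260) returns (125, 285).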
it can be understood that the target tail class image is applied to the second sample according to the target application position information to obtain training data, a preset example segmentation model is obtained, the example segmentation model is trained according to the training data and a preset loss function to obtain the target example segmentation model, and therefore data class distribution of long tail distribution image data can be effectively balanced, and accuracy of the example segmentation model is improved.
Additionally, referring to fig. 5, in an embodiment, step S310 in the embodiment shown in fig. 3 includes, but is not limited to, the following steps:
step S510, obtaining a first intermediate value, where the first intermediate value is obtained by multiplying a first width by a first height;
step S520, a second intermediate value is obtained, wherein the second intermediate value is obtained by multiplying the second width by the second height;
in step S530, a value obtained by dividing the second intermediate value by the first intermediate value is determined as a scaling factor.
It should be noted that the scaling factor can be obtained according to the following formula:
s = (w2 × h2) / (w1 × h1)
where s is the scaling factor, and w1, h1, w2, h2, x1, x2 and y2 are explained in detail in the above embodiments and are not described again here.
It should be noted that, the embodiment of the present application further provides a constraint method for the scaling factor s, which can effectively enhance the diversity of data, so as to enhance the robustness of the model, and a specific constraint formula is as follows:
[The specific constraint formula for s is shown as an image in the original publication and is not reproduced here.]
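Putting the scaling factor together with the resize of steps S310 to S320, one possible sketch is shown below. Scaling both sides of the patch by the square root of s (so that the patch area changes by a factor of s), the use of cv2.resize, and the omission of the constraint on s (whose formula is not reproduced above) are all assumptions of this illustration rather than requirements of the embodiment.

import math
import cv2
import numpy as np

def rescale_patch(patch: np.ndarray, w1: float, h1: float,
                  w2: float, h2: float) -> np.ndarray:
    # Scaling factor s = (w2 * h2) / (w1 * h1); here each side of the patch
    # is scaled by sqrt(s) so that its area scales by s.
    s = (w2 * h2) / (w1 * h1)
    side = math.sqrt(s)
    new_w = max(1, int(round(patch.shape[1] * side)))
    new_h = max(1, int(round(patch.shape[0] * side)))
    return cv2.resize(patch, (new_w, new_h), interpolation=cv2.INTER_LINEAR)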
in one embodiment of the present application, the application scenario of the example segmentation model is the field of unmanned driving technology, the training data is street view image data, since the size of the instances in the street view image data is not unique, it can be very different, such as trains and traffic lights, if the example partition model uses a large border (bounding box) and a small mask (mask) more, it will not be beneficial to the mining of the features by the example partition model, in order to guide the example segmentation model to predict borders and masks whose areas do not differ too much, meanwhile, in order to avoid a data set distributed at the long tail, the classifier tends to give a high prediction score to more appeared categories, and according to the technical scheme of the application, under the condition that the example segmentation model is trained according to street view image data, the example segmentation model is trained through a classification loss function and a segmentation loss function.
The specific formula of the classification loss function is as follows:
[The formula for the classification loss L_cls is shown as an image in the original publication and is not reproduced here.]
where L_cls(z) is the classification loss value, z is the preset activation function, y_i is the true value of the sample, i is a first class label, N_i is the number of samples of the different classes, j is a second class label, and σ_i is determined according to the following formula:
[Formula shown as an image in the original publication and not reproduced here.]
S_ij is determined according to the following formula:
[Formula shown as an image in the original publication and not reproduced here.]
it should be noted that, in the embodiment of the present application, the selection of the activation function z is not limited, and a person skilled in the art may select the activation function according to actual situations, and in the embodiment of the present application, z may be obtained according to the following formula:
[The formula for z is shown as an image in the original publication and is not reproduced here.]
it will be appreciated that the classification loss function varies dynamically with the relative proportion of samples of different classes, in particular by the relative proportion of samples of classes, embodiments of the present application rely on a constraint SijThe weight of each class penalty is adjusted. While different penalties are taken for more and less classes of samples, the loss function provided by the embodiment does not explicitly distinguish between head and tail classes, and the whole loss calculation process keeps fluency. In addition, the loss function can automatically learn the relative proportion of different classes of samples so as to adjust the weight punished to each class, and the class distribution does not need to be calculated in advance or a specially designed data sampling mode is not needed.
The specific formula of the segmentation loss function is as follows:
[The formula for the segmentation loss is shown as an image in the original publication and is not reproduced here.]
where the weight term (rendered as an image in the original publication) is determined according to the following formula:
[Formula shown as an image in the original publication and not reproduced here.]
where p_m is the segmentation prediction result for class m, S_bbox is the area of the predicted bounding box, and S_mask is the area of the segmentation mask.
It can be understood that the cross-entropy weight is set according to the area ratio of the bounding box to the segmentation mask: a larger penalty is given to predictions in which the areas of the bounding box and the segmentation mask differ greatly. This reduces the loss of feature information and improves the generalization capability of the example segmentation model.
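The concrete form of the segmentation loss is likewise only available as an image in the original publication. The sketch below illustrates only the idea described above: a binary cross-entropy mask loss whose weight grows as the predicted bounding-box area S_bbox and the mask area S_mask diverge. The specific weight max(r, 1/r) is an assumption chosen for this illustration, not the patent's formula.

import numpy as np

def area_weighted_mask_loss(pred_mask: np.ndarray, gt_mask: np.ndarray,
                            bbox_area: float) -> float:
    # Plain BCE on the mask probabilities, scaled by how far the predicted
    # bounding-box area and the predicted mask area differ.
    eps = 1e-6
    mask_area = float(pred_mask.sum()) + eps
    r = bbox_area / mask_area
    weight = max(r, 1.0 / r)                  # 1 when the areas match, larger otherwise

    p = np.clip(pred_mask, eps, 1.0 - eps)
    bce = -(gt_mask * np.log(p) + (1.0 - gt_mask) * np.log(1.0 - p)).mean()
    return float(weight * bce)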
In addition, referring to fig. 6, fig. 6 is a schematic block diagram of an example segmentation model training apparatus 600 according to another embodiment of the present invention, and an embodiment of the present invention further provides an example segmentation model training apparatus 600, where the example segmentation model training apparatus 600 includes a first obtaining module 610, a second obtaining module 620, a target tail class image determining module 630, a third obtaining module 640, a target application position information determining module 650, a training data determining module 660, and a model training module 670, where the first obtaining module 610 is configured to obtain a long-tail distribution image dataset; the second obtaining module 620 is configured to obtain a first sample and a second sample from the long-tail distribution image dataset, where the first sample includes a tail category image, and the second sample is different from the first sample; the target tail category image determining module 630 is configured to determine first position information of a tail category image in the first sample, and cut the first sample according to the first position information to obtain a target tail category image; the third obtaining module 640 is configured to obtain a first size and a second size, where the first size is an image size of the target tail category image, and the second size is an image size of the second sample; the target application position information determining module 650 is configured to determine target application position information of the target tail category image in the second sample according to the first position information, the first size, and the second size; the training data determining module 660 is configured to apply the target tail category image to the second sample according to the target application position information to obtain training data; the model training module 670 is configured to obtain a preset instance segmentation model, and train the instance segmentation model according to training data and a preset loss function to obtain a target instance segmentation model.
In addition, referring to fig. 7, fig. 7 is a block diagram of an example segmentation model training apparatus 700 according to another embodiment of the present invention, and an embodiment of the present invention further provides an example segmentation model training apparatus 700, where the example segmentation model training apparatus 700 includes: memory 710, processor 720, and computer programs stored on memory 710 and executable on processor 720.
The processor 720 and the memory 710 may be connected by a bus or other means.
Non-transitory software programs and instructions required to implement the artificial intelligence based example segmentation model training method of the above-described embodiments are stored in the memory 710, and when executed by the processor 720, perform the artificial intelligence based example segmentation model training method of the above-described embodiments, for example, performing the above-described method steps S110 to S170 in fig. 1, S210 in fig. 2, S310 to S320 in fig. 3, S410 to S430 in fig. 4, and S510 to S530 in fig. 5.
The above-described embodiments of the apparatus are merely illustrative, wherein the units illustrated as separate components may or may not be physically separate, i.e. may be located in one place, or may also be distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment.
Furthermore, an embodiment of the present invention provides a computer-readable storage medium that stores computer-executable instructions. When executed by a processor 720 or a controller, for example by the processor 720 in the embodiment of the example segmentation model training apparatus 700, these instructions cause the processor 720 to execute the artificial intelligence based example segmentation model training method of the above embodiments, for example the method steps S110 to S170 in fig. 1, step S210 in fig. 2, steps S310 to S320 in fig. 3, steps S410 to S430 in fig. 4, and steps S510 to S530 in fig. 5.
One of ordinary skill in the art will appreciate that all or some of the steps, systems, and methods disclosed above may be implemented as software, firmware, hardware, or suitable combinations thereof. Some or all of the physical components may be implemented as software executed by a processor, such as a central processing unit, digital signal processor, or microprocessor, or as hardware, or as an integrated circuit, such as an application specific integrated circuit. Such software may be distributed on computer-readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). As is well known to those of ordinary skill in the art, the term computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer-readable instructions, data structures, program modules or other data. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, Digital Versatile Disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by a computer. In addition, communication media typically embodies computer-readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism, and includes any information delivery media known to those skilled in the art.
While the preferred embodiments of the present invention have been described in detail, it will be understood by those skilled in the art that the foregoing and various other changes, omissions and deviations in the form and detail thereof may be made without departing from the scope of this invention.

Claims (10)

1. An example segmentation model training method based on artificial intelligence is characterized by comprising the following steps:
acquiring a long-tail distribution image data set;
obtaining a first sample and a second sample from the long-tail distribution image dataset, wherein the first sample comprises a tail category image, and the second sample is different from the first sample;
determining first position information of a tail type image in the first sample, and cutting the first sample according to the first position information to obtain a target tail type image;
acquiring a first size and a second size, wherein the first size is the image size of the target tail category image, and the second size is the image size of the second sample;
determining target application position information of the target tail category image in the second sample according to the first position information, the first size and the second size;
applying the target tail type image to the second sample according to the target application position information to obtain training data;
and acquiring a preset example segmentation model, and training the example segmentation model according to the training data and a preset loss function to obtain a target example segmentation model.
2. The method of claim 1, further comprising, after cropping the first sample according to the first position information to obtain a target tail class image:
and performing data enhancement on the target tail category image to obtain a new target tail category image.
3. The method of claim 1, further comprising, prior to said applying the target tail class image to the second sample according to the target application location information:
obtaining a scaling factor according to the first size and the second size;
and carrying out size adjustment on the target tail type image according to the scaling factor.
4. The method of claim 3, wherein the first size comprises a first width and a first height, the second size comprises a second width and a second height, the first position information comprises first abscissa information and first ordinate information, and the determining the target application position information of the target tail category image on the second sample according to the first position information, the first size, and the second size comprises:
acquiring second abscissa information, wherein the second abscissa information is obtained by multiplying the first abscissa information by the ratio of the second height to the first height;
acquiring second vertical coordinate information, wherein the second vertical coordinate information is obtained by multiplying the first vertical coordinate information by the ratio of the second width to the first width;
and determining the second abscissa information and the second ordinate information as the target application position information.
5. The method of claim 4, wherein deriving the scaling factor according to the first size and the second size comprises:
obtaining a first intermediate value, wherein the first intermediate value is obtained by multiplying the first width by the first height;
obtaining a second intermediate value, wherein the second intermediate value is obtained by multiplying the second width by the second height;
and determining a value obtained by dividing the second intermediate value by the first intermediate value as the scaling factor.
6. The method of claim 1, wherein the loss function comprises a classification loss function, and wherein the classification loss function is formulated as follows:
[The formula for the classification loss is shown as an image in the original publication and is not reproduced here.]
wherein L_cls(z) is the classification loss value, z is the preset activation function, y_i is the true value of the sample, i is a first class label, N_i is the number of samples of the different classes, j is a second class label, and σ_i is determined according to the following formula:
[Formula shown as an image in the original publication and not reproduced here.]
S_ij is determined according to the following formula:
[Formula shown as an image in the original publication and not reproduced here.]
7. The method of claim 1, wherein the loss function comprises a segmentation loss function, and wherein the segmentation loss function is specifically formulated as follows:
[The formula for the segmentation loss is shown as an image in the original publication and is not reproduced here.]
wherein the weight term (rendered as an image in the original publication) is determined according to the following formula:
[Formula shown as an image in the original publication and not reproduced here.]
wherein p_m is the segmentation prediction result for class m, S_bbox is the area of the predicted bounding box, and S_mask is the area of the segmentation mask.
8. An example segmentation model training apparatus, comprising:
the first acquisition module is used for acquiring a long-tail distribution image dataset;
a second obtaining module, configured to obtain a first sample and a second sample from the long-tail distribution image dataset, where the first sample includes a tail category image, and the second sample is different from the first sample;
the target tail category image determining module is used for determining first position information of a tail category image in the first sample and cutting the first sample according to the first position information to obtain a target tail category image;
a third obtaining module, configured to obtain a first size and a second size, where the first size is an image size of the target tail category image, and the second size is an image size of the second sample;
a target application position information determining module, configured to determine target application position information of the target tail category image in the second sample according to the first position information, the first size, and the second size;
the training data determining module is used for applying the target tail type image to the second sample according to the target application position information to obtain training data;
and the model training module is used for acquiring a preset example segmentation model, and training the example segmentation model according to the training data and a preset loss function to obtain a target example segmentation model.
9. An example segmentation model training apparatus, comprising: memory, processor and computer program stored on the memory and executable on the processor, characterized in that the processor implements the example segmentation model training method according to any one of claims 1 to 7 when executing the computer program.
10. A computer-readable storage medium storing computer-executable instructions for performing the example segmentation model training method of any one of claims 1 to 7.
CN202210074092.2A 2022-01-21 2022-01-21 Example segmentation model training method and device based on artificial intelligence and storage medium Pending CN114399512A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202210074092.2A CN114399512A (en) 2022-01-21 2022-01-21 Example segmentation model training method and device based on artificial intelligence and storage medium
PCT/CN2022/090748 WO2023137921A1 (en) 2022-01-21 2022-04-29 Artificial intelligence-based instance segmentation model training method and apparatus, and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210074092.2A CN114399512A (en) 2022-01-21 2022-01-21 Example segmentation model training method and device based on artificial intelligence and storage medium

Publications (1)

Publication Number Publication Date
CN114399512A true CN114399512A (en) 2022-04-26

Family

ID=81233124

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210074092.2A Pending CN114399512A (en) 2022-01-21 2022-01-21 Example segmentation model training method and device based on artificial intelligence and storage medium

Country Status (2)

Country Link
CN (1) CN114399512A (en)
WO (1) WO2023137921A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115170800A (en) * 2022-07-15 2022-10-11 浙江大学 Urban waterlogging deep recognition method based on social media and deep learning
WO2023137921A1 (en) * 2022-01-21 2023-07-27 平安科技(深圳)有限公司 Artificial intelligence-based instance segmentation model training method and apparatus, and storage medium

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117058564B (en) * 2023-10-11 2023-12-22 光轮智能(北京)科技有限公司 Virtual perception data acquisition method and long tail scene data mining method

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9396546B2 (en) * 2014-01-21 2016-07-19 Adobe Systems Incorporated Labeling objects in image scenes
CN111832406B (en) * 2020-06-05 2022-12-06 中国科学院计算技术研究所 Long-tail target detection method and system
CN112101544A (en) * 2020-08-21 2020-12-18 清华大学 Training method and device of neural network suitable for long-tail distributed data set
CN113689436B (en) * 2021-09-29 2024-02-02 平安科技(深圳)有限公司 Image semantic segmentation method, device, equipment and storage medium
CN114399512A (en) * 2022-01-21 2022-04-26 平安科技(深圳)有限公司 Example segmentation model training method and device based on artificial intelligence and storage medium

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023137921A1 (en) * 2022-01-21 2023-07-27 平安科技(深圳)有限公司 Artificial intelligence-based instance segmentation model training method and apparatus, and storage medium
CN115170800A (en) * 2022-07-15 2022-10-11 浙江大学 Urban waterlogging deep recognition method based on social media and deep learning

Also Published As

Publication number Publication date
WO2023137921A1 (en) 2023-07-27

Similar Documents

Publication Publication Date Title
CN111626208B (en) Method and device for detecting small objects
CN114399512A (en) Example segmentation model training method and device based on artificial intelligence and storage medium
US20220319046A1 (en) Systems and methods for visual positioning
CN111160205B (en) Method for uniformly detecting multiple embedded types of targets in traffic scene end-to-end
CN111091023A (en) Vehicle detection method and device and electronic equipment
US20240005642A1 (en) Data Augmentation for Vehicle Control
US20240005641A1 (en) Data Augmentation for Detour Path Configuring
CN111753592A (en) Traffic sign recognition method, traffic sign recognition device, computer equipment and storage medium
CN112613434A (en) Road target detection method, device and storage medium
CN115273032A (en) Traffic sign recognition method, apparatus, device and medium
CN113887481A (en) Image processing method and device, electronic equipment and medium
CN115946722A (en) Vehicle control method and system based on traffic sign and traffic sign identification platform
CN109635719A (en) A kind of image-recognizing method, device and computer readable storage medium
CN112434601B (en) Vehicle illegal detection method, device, equipment and medium based on driving video
CN114463460A (en) Scene graph generation method and device for visual traffic scene
Alam et al. Faster RCNN based robust vehicle detection algorithm for identifying and classifying vehicles
CN115565152B (en) Traffic sign extraction method integrating vehicle-mounted laser point cloud and panoramic image
CN114495044A (en) Label identification method, label identification device, computer equipment and storage medium
KR20220022749A (en) Device, method, system and computer readable storage medium to provide vehicle moved on a road with target advertisement
CN112215189A (en) Accurate detecting system for illegal building
CN116958915B (en) Target detection method, target detection device, electronic equipment and storage medium
US11702011B1 (en) Data augmentation for driver monitoring
Titarev et al. Intelligent image labeling system for recognizing traffic violations
Jayan et al. Improved Traffic Sign Detection in Autonomous Driving Using A Simulation-Based Deep Learning Approach Under Adverse Conditions
CN117011823A (en) Image recognition method, apparatus, device, storage medium, and computer program product

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination