CN110516707A - Image labeling method and device, and storage medium - Google Patents
Image labeling method and device, and storage medium
- Publication number
- CN110516707A (application number CN201910655710.0A)
- Authority
- CN
- China
- Prior art keywords
- image
- data set
- style
- model
- information
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- General Physics & Mathematics (AREA)
- Physics & Mathematics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- General Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Image Analysis (AREA)
Abstract
An image labeling method and device, and a storage medium. The image labeling method includes: obtaining an image of a target object in a field environment; extracting feature information from the image of the target object according to a pre-established machine vision model, the machine vision model being a model trained by machine learning on a second data set formed by applying style conversion to a preset first data set; and marking out the target object in the image using the extracted feature information, and outputting the annotation information of the target object. Because the already-annotated data set is style-transferred to the field data set through a GAN model when the machine vision model is established, the annotated data set acquires the style information of the field data set while keeping its annotation information, thereby simulating the field environment to the greatest extent and enhancing the transfer performance of the machine vision model.
Description
Technical field
The present invention relates to the technical field of image processing, and in particular to an image labeling method and device, and a storage medium.
Background technique
Person re-identification (Person Re-identification, abbreviated ReID) has been a research focus of computer vision in recent years: given a surveillance image of a pedestrian, retrieve images of that pedestrian across devices. Because imaging devices differ, and because a pedestrian's appearance is easily affected by clothing, scale, occlusion, posture, viewing angle, and so on, person re-identification is a project that both has research value and is rich in challenges.
The goal of ReID is to match and return a queried person's images from the large-scale galleries collected by a camera network. Owing to its important applications in security and surveillance, ReID has drawn wide attention from academia and industry, and the development of deep learning together with the availability of many data sets has brought significant improvements in ReID performance.
Although performance on current ReID data sets is satisfactory, some unsolved problems still hinder the application of person ReID. First, existing public data sets differ from the data collected in real scenes in illumination, resolution, ethnicity, sharpness, background, and so on. For example, current data sets contain a limited number of identities or are captured under constrained environments; the limited number of persons and the simple illumination conditions simplify the ReID task and help achieve high recognition accuracy. In real scenes, however, ReID is usually performed in camera networks deployed across indoor and outdoor scenes, processing video shot over long periods, so a real application must cope with challenges such as a large number of identities and complex illumination and scene changes, which current algorithms may be unable to handle.
In addition, when a deep neural network is used to train a computer vision model such as a ReID model, a model trained on one data set (a data set is usually a collection of object images obtained by manual annotation or by an image annotation algorithm) suffers a large performance drop on another data set; that is, the model transfers poorly. Therefore, when computer vision techniques are applied, the field data must be annotated in large quantities and the model retrained on the annotated field data, which takes considerable time and expense.
Summary of the invention
The technical problem solved by the present invention is how to enhance the transfer performance of a machine vision model, so as to improve the accuracy of image labeling.
According to a first aspect, an embodiment provides an image labeling method, comprising: obtaining an image of a target object in a field environment; extracting feature information from the image of the target object according to a pre-established machine vision model, the machine vision model being a model trained by machine learning on a second data set formed by applying style conversion to a preset first data set; and marking out the target object in the image of the target object using the extracted feature information, and outputting annotation information of the target object.
Marking out the target object in the image of the target object using the extracted feature information comprises: matching each of several pieces of feature information extracted from the target object against preset features of the target object, and labeling the feature information that matches successfully; and forming the annotation information of the target object from the labeled feature information.
Where the machine vision model is a model trained by machine learning on a second data set formed by applying style conversion to a preset first data set, the machine vision model is established as follows.
Acquisition step: collect one group of images of at least one moving object in the field environment to form a field data set, and obtain style information of the field data set, the style information including one or more of brightness, color, chromatic aberration, sharpness, contrast, and resolution. Conversion step: apply style conversion to the preset first data set according to the style information of the field data set to obtain the second data set; the first data set includes one group of already-annotated images of at least one moving object in an arbitrary environment, and the group of images corresponding to each moving object carries a unified label. Training step: train the machine vision model on the second data set by machine learning.
In the conversion step, applying style conversion to the preset first data set according to the style information of the field data set to obtain the second data set comprises: style-transferring the first data set to the field data set through a GAN model, so that each group of images in the first data set undergoes style conversion according to the style information of the field data set, yielding a corresponding group of new images; and integrating the group of new images corresponding to each group of images in the first data set to form the second data set.
Style-transferring the first data set to the field data set through the GAN model, so as to style-convert each group of images in the first data set according to the style information of the field data set and obtain a corresponding group of new images, comprises: establishing a total loss function, formulated as
Loss = L_Style + λ1·L_ID
where L_Style denotes the style loss function corresponding to the style information of the field data set, L_ID denotes the label loss function corresponding to the label information of each group of images in the first data set, and λ1 is a weighting factor;
adjusting the parameters of the GAN model using the style loss function and the label loss function so that the Loss value of the total loss function reaches a minimum; and inputting each group of images in the first data set into the GAN model whose Loss value has been adjusted to the minimum, so as to apply style conversion to each group of images in the first data set and output the corresponding group of new images.
In the total loss function, the style loss function is expressed as
L_Style = L_GAN(G, D_B, A, B) + L_GAN(F, D_A, B, A) + λ2·L_cyc(G, F)
where A and B are the field data set and the first data set respectively, L_GAN is the standard adversarial loss function, L_cyc is the cycle consistency loss function, G denotes the style mapping function from A to B, F denotes the style mapping function from B to A, D_A and D_B are the style discriminators of A and B respectively, and λ2 is a weighting factor. The label loss function is expressed as
L_ID = E_{a~p_data(a)}[ Var((G(a) − a) ⊙ M(a)) ] + E_{b~p_data(b)}[ Var((F(b) − b) ⊙ M(b)) ]
where the data distribution of A is a ~ p_data(a), the data distribution of B is b ~ p_data(b), Var is the variance calculation function over data, G(a) is the style-transferred target image of image a in A, M(a) is the foreground mask of image a, F(b) is the style-transferred target image of image b in B, and M(b) is the foreground mask of image b.
After the training step there is also a testing step, which includes: testing the machine vision model with the field data set, and adjusting the hyperparameters of the GAN model by an iterative algorithm or a gradient descent algorithm; after each adjustment of the hyperparameters of the GAN model, re-forming the second data set through the conversion step, retraining the machine vision model through the training step, and continuing to test the retrained machine vision model with the field data set, until the adjustment of the hyperparameters of the GAN model is complete.
According to a second aspect, an embodiment provides an image labeling device, comprising:
an acquiring unit for obtaining an image of a target object in a field environment;
an extraction unit for extracting feature information from the image of the target object according to a pre-established machine vision model, the machine vision model being a model trained by machine learning on a second data set formed by applying style conversion to a preset first data set; and
a labeling unit for marking out the target object in the image of the target object using the extracted feature information, and outputting annotation information of the target object.
The image labeling device further includes a model building unit, connected to the extraction unit, for establishing the machine vision model. The model building unit includes: an acquisition module for collecting one group of images of at least one moving object in the field environment to form a field data set and obtaining the style information of the field data set, the style information including one or more of brightness, color, chromatic aberration, sharpness, contrast, and resolution; a conversion module for applying style conversion to the preset first data set according to the style information of the field data set to obtain the second data set, the first data set including one group of already-annotated images of at least one moving object in an arbitrary environment, with the group of images corresponding to each moving object carrying a unified label; and a training module for training the machine vision model on the second data set by machine learning.
According to a third aspect, an embodiment provides a computer-readable storage medium including a program executable by a processor to implement the image labeling method described in the first aspect.
The beneficial effects of the application are as follows:
According to the image labeling method, device, and storage medium of the above embodiments, the image labeling method includes: obtaining an image of a target object in a field environment; extracting feature information from the image of the target object according to a pre-established machine vision model, the machine vision model being a model trained by machine learning on a second data set formed by applying style conversion to a preset first data set; and marking out the target object in the image using the extracted feature information and outputting annotation information of the target object. In a first aspect, because the already-annotated data set is style-transferred to the field data set through a GAN model when the machine vision model is established, the annotated data set acquires the style information of the field data set while keeping its label information, thereby simulating the field environment to the greatest extent and enhancing the transfer performance of the machine vision model. In a second aspect, the established machine vision model not only overcomes the problem of poor transfer performance well, but also, when applied to the field environment, extracts feature information from images well, which helps identify the target object in the field environment quickly during image labeling, reducing the manual annotation work required for modeling a new scene and effectively saving the time and cost of new-scene modeling.
Detailed description of the invention
Fig. 1 is a flowchart of the image labeling method in the present application;
Fig. 2 is a flowchart of labeling the target object;
Fig. 3 is a flowchart of establishing the machine vision model in the present application;
Fig. 4 is a flowchart of the testing step when establishing the machine vision model;
Fig. 5 is a schematic diagram of the principle of establishing the machine vision model;
Fig. 6 is a structural schematic diagram of the image labeling device in the present application;
Fig. 7 is a structural schematic diagram of the model building unit in the image labeling device;
Fig. 8 is a schematic diagram of the principle of GAN-model style transfer.
Specific embodiment
The invention is described in further detail below through specific embodiments in conjunction with the accompanying drawings. Similar components in different embodiments use associated similar element numbers. In the following embodiments, many details are described so that the application can be better understood. However, those skilled in the art will readily recognize that some of the features may be omitted in different cases, or may be replaced by other elements, materials, or methods. In some cases, some operations related to the application are not shown or described in the specification, in order to avoid the core of the application being obscured by excessive description; for those skilled in the art, a detailed description of these related operations is not necessary, as they can be fully understood from the description in the specification and the general technical knowledge of the field.
In addition, the features, operations, or characteristics described in this specification can be combined in any suitable manner to form various embodiments. Meanwhile, the steps or actions in the method descriptions can also be reordered or adjusted in ways apparent to those skilled in the art. Therefore, the various orders in the description and drawings are only for clearly describing a certain embodiment and are not meant to be required orders, unless it is otherwise stated that a certain order must be followed.
The numbering of components herein, such as "first" and "second", is only used to distinguish the described objects and carries no ordinal or technical meaning. "Connection" and "coupling" in this application, unless otherwise specified, include both direct and indirect connection (coupling).
Embodiment one
Referring to Fig. 1, the application discloses an image labeling method comprising steps S110-S130, which are described separately below.
Step S110: obtain an image of a target object in a field environment.
In this embodiment, the field environment may be a public place such as a street, square, highway, station, market, or hotel, and the target object may be a movable object such as a pedestrian, vehicle, or pet, which is not specifically limited here. In addition, the image of the target object in the relevant field environment may be captured by one or more video capture devices (such as cameras) installed in the public place, and transmitted to a control center for acquisition.
Step S120: extract feature information from the image of the target object according to the pre-established machine vision model. The machine vision model here is a model trained by machine learning on a second data set formed by applying style conversion to a preset first data set.
In an embodiment, when several pieces of feature information are extracted from the image of the target object through the pre-established machine vision model, what are mainly extracted are feature vectors in the image. For example, if multiple pedestrians are moving on a square, the image contains not only the feature information of the pedestrians but also the feature information of other objects on the square; in that case, the feature information of both the pedestrians and the other objects is extracted from the image.
It should be noted that the feature information here is usually a feature vector, which is equivalent to the representation of the picture in the target task and can be regarded as a general representation in the computer vision field; that is, the target object is characterized by a vector, so as to support tasks such as face recognition and pedestrian recognition in practical applications. For example, face recognition retrieves, in a face vector library, the feature with the highest similarity to the target vector, and when the similarity is higher than a certain threshold the two are considered the same person.
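The threshold-based retrieval just described can be sketched as follows. The use of cosine similarity and the 0.7 threshold are illustrative assumptions; the embodiment only specifies "similarity above a certain threshold".

```python
import numpy as np

def retrieve(query: np.ndarray, gallery: np.ndarray, threshold: float = 0.7):
    """Return (index, similarity) of the gallery vector most similar to the
    query, or None if even the best match falls below the threshold."""
    # Cosine similarity of the query against every gallery row at once
    sims = gallery @ query / (np.linalg.norm(gallery, axis=1) * np.linalg.norm(query))
    best = int(np.argmax(sims))
    return (best, float(sims[best])) if sims[best] > threshold else None
```

In use, `gallery` would hold the feature vectors extracted by the machine vision model, one row per stored identity.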
Step S130: mark out the target object in the image of the target object using the extracted feature information, and output the annotation information of the target object. Specifically, the annotation information of the target object can be classified, stored, and displayed, so that administrators can conveniently find the target object through the annotation information.
In an embodiment, step S130 may include steps S131-S132 in Fig. 2, described as follows.
Step S131: match each of the several pieces of feature information extracted from the image of the target object against the preset features of the target object, and label the feature vectors that match successfully. For example, if the target object is a certain pedestrian, the preset features of that pedestrian (such as height, body outline, face contour, and clothing) can be determined from previously captured images; then, through the machine vision model, the feature vectors matching the preset features of that pedestrian can be conveniently identified in other images, and the matching feature information is determined to be related to that pedestrian, so that this feature information is marked out in the other images by means of rectangular boxes, that is, the pedestrian is identified.
Step S132: form the annotation information of the target object from the labeled feature information. Specifically, if some labeled feature information is associated with a certain pedestrian, the pedestrian can not only be marked out with a rectangular box but also be given an exclusive number in the form of label information, thereby forming the annotation information of that pedestrian.
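Steps S131-S132 can be sketched as follows. The `Annotation` record, the `match_fn` callback, and the box format are hypothetical names introduced for illustration, not taken from the embodiment.

```python
from dataclasses import dataclass, field

@dataclass
class Annotation:
    """Annotation record for one identified target object (hypothetical schema)."""
    person_id: int                                # exclusive number from step S132
    boxes: list = field(default_factory=list)     # matched rectangle boxes (x, y, w, h)

def annotate(extracted, preset_features, match_fn, person_id):
    """S131: keep only the extracted features that match a preset feature.
    S132: collect their rectangle boxes under one exclusive number."""
    ann = Annotation(person_id=person_id)
    for feat, box in extracted:                   # each item: (feature, box)
        if any(match_fn(feat, p) for p in preset_features):
            ann.boxes.append(box)                 # successful match -> labeled
    return ann
```

In practice `match_fn` would be a similarity test between feature vectors rather than the simple comparison used here.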
In this embodiment, for the accuracy of image recognition, the image of the target object is processed by the pre-established machine vision model to extract the feature information in the image; the machine vision model is a model trained by machine learning on a second data set formed by applying style conversion to a preset first data set. The establishment process of the machine vision model can therefore be illustrated by step S200; referring to Fig. 3, step S200 may include steps S210-S230, described separately below.
Step S210 can be regarded as the acquisition step: collect one group of images of at least one moving object in the field environment to form a field data set, and obtain the style information of the field data set, the style information including one or more of brightness, color, chromatic aberration, sharpness, contrast, and resolution.
For example, to perform image labeling on the pedestrians in a certain square, it is first necessary to establish a machine vision model related to the field environment of that square; to obtain such a model, the field environment of the square must be simulated and a corresponding field data set formed. Accordingly, a capture device such as a camera can be used to collect one group of images of one or more pedestrians in the square; this group of images may include multiple frames of digital images that are continuous in time. The field data set formed by the collected group of images often contains some specific style information, such as the brightness, color, and sharpness present in the current square environment.
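As a minimal sketch of obtaining style information from the collected group of images: the simple statistics below stand in for the brightness, contrast, and color items named above. Treating the mean gray level as brightness and its standard deviation as contrast is an illustrative simplification, since the embodiment does not define how each item is computed.

```python
import numpy as np

def style_stats(images):
    """Summarize simple style statistics of a field data set.
    `images` is an iterable of H x W x 3 uint8 arrays."""
    gray = [img.mean(axis=2) for img in images]           # per-image grayscale
    return {
        "brightness": float(np.mean([g.mean() for g in gray])),
        "contrast": float(np.mean([g.std() for g in gray])),
        "mean_color": [float(c) for c in
                       np.mean([img.mean(axis=(0, 1)) for img in images], axis=0)],
    }
```

Statistics of this kind could then parameterize or evaluate the style conversion of step S220.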
Step S220 can be regarded as the conversion step: apply style conversion to the preset first data set according to the style information of the field data set to obtain the second data set. In this embodiment, the first data set includes one group of already-annotated images of at least one moving object in an arbitrary environment, and the group of images corresponding to each moving object carries a unified label.
For example, the first data set can be the open-source DukeMTMC-reID data set collected in the ReID scenario, which includes 1404 pedestrians captured by cameras and 36411 detected pedestrian rectangle boxes, and includes, for each pedestrian, several images in motion shot at different time points or from different angles while traveling on a street.
It should be noted that, between the field data set in step S210 and the first data set in step S220, differences in illumination, angle, camera, and background may cause the brightness, color, sharpness, contrast, and so on of the pictures collected in the different data sets to differ as a whole, and this difference leads to poor model transfer.
In an embodiment, see Fig. 4; step S220 may include steps S221-S222, described respectively below.
Step S221: style-transfer the first data set to the field data set through a GAN model, so as to apply style conversion to each group of images in the first data set according to the style information of the field data set and obtain a corresponding group of new images.
As shown in Fig. 8, the first data set is formed from the open-source DukeMTMC-reID data set, including images of a pedestrian annotated at different time points while walking on a street; the field data set is formed from a group of images collected in the field environment, including images of a pedestrian at different time points while walking on a square (annotated or not). The first data set is style-transferred to the field data set through the GAN model; the style information of the field data set is obtained and used for style conversion, thereby obtaining the second data set corresponding to the first data set. In the second data set, the style of every image is changed and is closer to the style of the field data set, while the pedestrians still keep a certain degree of distinguishability and their IDs remain unchanged.
Step S222: integrate the group of new images corresponding to each group of images in the first data set to form the second data set.
It should be noted that the GAN model involved in this embodiment is a generative adversarial network (Generative Adversarial Networks, GAN), a kind of deep learning model. A GAN model produces fairly good output through the mutual game learning of two modules in its framework: a generation module (generative model) and a discrimination module (discriminative model). In practical applications, deep neural networks are often used as G and D, and a good training method should be set for the GAN model, otherwise the freedom of the neural network model may lead to unsatisfactory output. The generation module is mainly used to learn the real image distribution so that the images it generates become more realistic, in order to fool the discrimination module; the discrimination module judges whether a received image is real or fake. The whole process makes the images produced by the generation module more and more realistic, and the discrimination module more and more accurate in judging real images; over time the two modules reach a balance. Since GAN models are commonly used for style transfer between two image domains and belong to the prior art, they are not described in detail here.
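The generator-discriminator game described above can be illustrated with a deliberately tiny one-dimensional GAN. The linear generator and logistic discriminator are toy assumptions meant only to show the alternating adversarial updates, not the image-to-image GAN of the embodiment.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D GAN: a linear generator maps noise to samples, and a logistic
# discriminator scores how likely a sample is to be real.
w_g, w_d, b_d = rng.normal(size=3)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gan_step(real, lr=0.01):
    """One round of the game: D is pushed toward separating real from
    generated samples, then G is pushed toward fooling the updated D.
    Returns the discriminator loss before the update."""
    global w_g, w_d, b_d
    z = rng.normal(size=real.shape)
    fake = w_g * z                                   # generator output
    d_real = sigmoid(w_d * real + b_d)
    d_fake = sigmoid(w_d * fake + b_d)
    d_loss = -float(np.mean(np.log(d_real) + np.log(1.0 - d_fake)))
    # Discriminator: gradient ascent on log D(real) + log(1 - D(fake))
    w_d += lr * np.mean((1.0 - d_real) * real - d_fake * fake)
    b_d += lr * np.mean((1.0 - d_real) - d_fake)
    # Generator: gradient ascent on log D(fake) against the updated D
    w_g += lr * np.mean((1.0 - sigmoid(w_d * fake + b_d)) * w_d * z)
    return d_loss
```

Alternating these two updates over many rounds is the "game learning" by which the two modules reach a balance.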
In this embodiment, in the process of style-transferring the first data set to the field data set through the GAN model, style-converting each group of images in the first data set according to the style information of the field data set, and obtaining the corresponding group of new images, the style transfer is controlled through the following three steps in order to guarantee the effect of the GAN model's style transfer:
(1) Establish a total loss function, formulated as
Loss = L_Style + λ1·L_ID
where L_Style denotes the style loss function corresponding to the style information of the field data set, L_ID denotes the label loss function corresponding to the label information of each group of images in the first data set, and λ1 is a weighting factor.
In the total loss function Loss, the style loss function is expressed as
L_Style = L_GAN(G, D_B, A, B) + L_GAN(F, D_A, B, A) + λ2·L_cyc(G, F)
where A and B are the field data set and the first data set respectively, L_GAN is the standard adversarial loss function, L_cyc is the cycle consistency loss function, G denotes the style mapping function from A to B, F denotes the style mapping function from B to A, D_A and D_B are the style discriminators of A and B respectively, and λ2 is a weighting factor.
In the total loss function Loss, the label loss function is expressed as
L_ID = E_{a~p_data(a)}[ Var((G(a) − a) ⊙ M(a)) ] + E_{b~p_data(b)}[ Var((F(b) − b) ⊙ M(b)) ]
where the data distribution of A is a ~ p_data(a), the data distribution of B is b ~ p_data(b), Var is the variance calculation function over data, G(a) is the style-transferred target image of image a in A, M(a) is the foreground mask of image a, F(b) is the style-transferred target image of image b in B, and M(b) is the foreground mask of image b.
(2) Adjust the parameters of the GAN model using the style loss function L_Style and the label loss function L_ID, so that the Loss value of the total loss function reaches a minimum.
(3) Input each group of images in the first data set into the GAN model whose Loss value has been adjusted to the minimum, so as to apply style conversion to each group of images in the first data set and output the corresponding group of new images.
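Step (1) can be sketched numerically as follows. The λ1 value and the reading of Var(...) as the variance of the masked residual are assumptions, since the patent's formula images are not reproduced in this text.

```python
import numpy as np

def total_loss(l_style: float, l_id: float, lambda1: float = 0.5) -> float:
    """Total loss Loss = L_Style + λ1·L_ID; λ1 = 0.5 is an assumed value,
    as the text only calls it a weighting factor."""
    return l_style + lambda1 * l_id

def label_loss(originals, transferred, masks):
    """Label (identity-preserving) loss: penalize variation of the residual
    between each image and its style-transferred version inside the
    foreground mask M, averaged over the batch."""
    terms = [float(np.var((g - a) * m))
             for a, g, m in zip(originals, transferred, masks)]
    return float(np.mean(terms))
```

Keeping the masked foreground residual small is what lets the transferred images change style while their labels (pedestrian IDs) stay valid.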
Step S230 can be regarded as the training step: train the machine vision model on the second data set by machine learning.
For example, the second data set can be used to obtain the machine vision model through ReID model training. The ReID model here is a person re-identification model (Person re-identification, abbreviated Re-ID), a technique that uses computer vision to judge whether a specific pedestrian is present in an image or a video sequence. There are two key techniques in a ReID model: one is feature extraction, learning features that cope with the changes a pedestrian undergoes under different cameras; the other is metric learning, mapping the learned features into a new space where the same person is closer and different people are farther apart. Since the ReID model belongs to the prior art, it is not described in detail here.
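The metric-learning idea above (same person closer, different people farther) is commonly realized with a margin-based objective such as the one below. The triplet form and the 0.3 margin are illustrative assumptions; the embodiment does not name a specific loss.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.3):
    """Pull an anchor feature toward a sample of the same person (positive)
    and push it away from a different person (negative); zero loss once the
    negative is farther than the positive by at least the margin."""
    d_pos = np.linalg.norm(anchor - positive)
    d_neg = np.linalg.norm(anchor - negative)
    return float(max(0.0, d_pos - d_neg + margin))
```

Minimizing this over many (anchor, positive, negative) triplets sampled from the second data set shapes the feature space the ReID model retrieves in.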
In another embodiment, see Fig. 4: after the training step S230 there is also a testing step S240, which can be summarized as follows. (a) Test the machine vision model with the field data set, and adjust the hyperparameters of the GAN model (such as the parameters λ1 and λ2) by an iterative algorithm or a gradient descent algorithm. (b) After each adjustment of the hyperparameters of the GAN model, re-form the second data set through the conversion step S220 (i.e., S221-S222), retrain the machine vision model through the training step S230, and continue to test the retrained machine vision model with the field data set, until the adjustment of the hyperparameters of the GAN model is complete.
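The testing step's hyperparameter loop can be sketched as a simple search. Here `convert`, `train`, and `evaluate` are placeholder callables standing in for the conversion step S220, the training step S230, and testing on the field data set; grid search is an assumed strategy, one of the "iterative algorithms" the text allows.

```python
def tune(candidates, convert, train, evaluate):
    """For each hyperparameter candidate: rebuild the second data set,
    retrain the model, score it on the field data set, and keep the best
    (candidate, score, model) triple."""
    best = None
    for lam in candidates:
        dataset2 = convert(lam)        # conversion step with the current setting
        model = train(dataset2)        # training step
        score = evaluate(model)        # test on the field data set
        if best is None or score > best[1]:
            best = (lam, score, model)
    return best
```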
To clearly explain the principle of establishing the machine vision model, it is described here with reference to Fig. 5. Referring to Fig. 5, the first data set includes a group of annotated images of at least one moving object in an arbitrary environment, and the field data set includes a group of images of at least one moving object in the site environment. The first data set is style-transferred to the field data set by a GAN model, so that each group of images in the first data set undergoes style conversion according to the style information of the field data set, yielding a corresponding group of new images; the new images are integrated to form the second data set. The second data set is then used to train the Re-ID model, obtaining the machine vision model claimed in this application. Afterwards, the machine vision model is tested with the field data set, and the hyper-parameters in the GAN model are adjusted by an iterative algorithm or gradient descent; when the set number of iterations is reached or the gradient-descent requirement is met, the hyper-parameter adjustment is considered finished, the machine vision model is optimized, and image annotation of target objects in the site environment can be carried out.
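The pipeline of Fig. 5 can be reduced to a short hypothetical sketch. The names are illustrative, and the GAN translator and Re-ID trainer are replaced by plain callables; the key point it shows is that style conversion preserves each image's existing label, so the second data set needs no new manual annotation:

```python
def build_annotation_model(first_set, translate, train_reid):
    """first_set: list of (image, label) pairs from the annotated source set.
    translate: style-converts one image toward the field data set's style.
    train_reid: trains a Re-ID model on a list of (image, label) pairs."""
    # Labels ride along unchanged through the style transfer.
    second_set = [(translate(img), label) for img, label in first_set]
    return train_reid(second_set)

# Dummy usage: strings stand in for images, identity function for training.
first = [("imgA", 1), ("imgB", 2)]
model = build_annotation_model(first, lambda im: im + "_styled", lambda ds: ds)
print(model)
```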
Embodiment two
Referring to Fig. 6, on the basis of the image annotation method disclosed in embodiment one, this application correspondingly discloses an image annotation device 3. The image annotation device 3 mainly includes an acquiring unit 31, an extraction unit 32 and an annotation unit 33, which are described separately below.
The acquiring unit 31 is used to obtain an image of a target object in a site environment.
In this embodiment, the site environment may be a public place such as a street, a square, a highway, a station, a shopping mall or a hotel, and the target object may be a movable object such as a pedestrian, a vehicle or a pet; no specific limitation is made here. Furthermore, one or more video capture devices (e.g. cameras) installed in the public place may capture images of the target object in the relevant site environment and transmit them to a control center for acquisition.
The extraction unit 32 is connected to the acquiring unit 31 and is used to extract feature information from the image of the target object according to a pre-established machine vision model. In this application, the machine vision model is a model obtained by machine-learning training on a second data set, which is formed by performing style conversion on a preset first data set. For the specific function of the extraction unit 32, reference may be made to step S120 in embodiment one, which is not repeated here.
The annotation unit 33 is connected to the extraction unit 32 and is used to mark the target object in its image using the extracted feature information, and to output the annotation information of the target object. Specifically, if some marked feature information (feature vectors) is associated with a certain pedestrian, the pedestrian can not only be marked with a rectangular frame, but can also be given an exclusive number in the form of label information, thereby forming the pedestrian's annotation information. In addition, the annotation unit 33 can classify, store and display the annotation information of target objects, so that administrators can conveniently find target objects through this information.
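A hypothetical sketch of the annotation unit's matching step follows. Comparing each extracted feature vector against stored reference features by cosine similarity, and the record format, identifier names and threshold, are illustrative assumptions, not the patent's specification:

```python
import numpy as np

def annotate(detections, known_features, threshold=0.9):
    """detections: list of (bounding_box, feature_vector) pairs.
    known_features: {person_id: reference_feature_vector}.
    Emits one annotation record (box + exclusive id) per match."""
    records = []
    for box, feat in detections:
        for pid, ref in known_features.items():
            # Cosine similarity between extracted and reference features.
            sim = float(np.dot(feat, ref) /
                        (np.linalg.norm(feat) * np.linalg.norm(ref)))
            if sim >= threshold:
                records.append({"id": pid, "box": box,
                                "similarity": round(sim, 3)})
    return records

# Dummy usage: one detection matched against one stored pedestrian.
feat = np.array([1.0, 0.0])
dets = [((0, 0, 10, 20), feat)]
refs = {"pedestrian_007": np.array([1.0, 0.05])}
print(annotate(dets, refs))
```

The emitted records could then be stored and displayed grouped by `id`, as the paragraph above describes.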
Further, referring to Fig. 6 and Fig. 7, the image annotation device 3 also includes a model establishment unit 34 for establishing the machine vision model, connected to the extraction unit 32. The model establishment unit 34 includes an acquisition module 341, a conversion module 342 and a training module 343.
The acquisition module 341 is used to collect a group of images of at least one moving object in the site environment to form the field data set, and to obtain the style information of the field data set; the style information may include one or more of brightness, color, chromatic aberration, sharpness, contrast and resolution. For the specific function of the acquisition module 341, reference may be made to step S210 in embodiment one, which is not repeated here.
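Two of the listed style attributes can be summarized with simple global statistics. The sketch below, an illustrative assumption rather than the patent's method, treats mean intensity as brightness and the standard deviation of intensities as contrast over a whole image set:

```python
import numpy as np

def style_info(images):
    """Summarise the 'style' of an image set by global statistics:
    mean intensity (brightness) and intensity std (contrast)."""
    stack = np.stack([img.astype(np.float64) for img in images])
    return {"brightness": float(stack.mean()),
            "contrast": float(stack.std())}

# Dummy usage: three identical flat dark images.
dark = [np.full((4, 4), 10, dtype=np.uint8)] * 3
print(style_info(dark))
```

Richer style descriptors (color histograms, frequency content for sharpness) would extend the same per-set aggregation pattern.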
The conversion module 342 is used to perform style conversion on the preset first data set according to the style information of the field data set, obtaining the second data set. Here, the first data set includes a group of annotated images of at least one moving object in an arbitrary environment, and the group of images corresponding to each moving object has unified label information. For the specific function of the conversion module 342, reference may be made to step S220 in embodiment one, which is not repeated here.
The training module 343 is used to train with the second data set through machine learning (e.g. a Re-ID model) to obtain the machine vision model. For the specific function of the training module 343, reference may be made to step S230 in embodiment one, which is not repeated here.
To clearly demonstrate the beneficial effects of the technical method of this application, a comparative test was carried out. In the first test, a machine vision model was trained directly on the open-source DukeMTMC-reID data set, and was tested in the site environment, obtaining the first group of test indicators mAP and Rank1. In the second test, the open-source DukeMTMC-reID data set was style-transferred to the field data set, forming a second data set after style transfer, DukeMTMC-ReID*M; another machine vision model was trained on this second data set and tested in the site environment, obtaining the second group of test indicators mAP and Rank1.
Table 1: Test indicator results of the comparative test
As can be seen from Table 1, the test indicators obtained in the second test are considerably improved compared with the first test, which shows that the migration effect of the machine vision model is good: the required manual annotation work can be reduced and the accuracy of image annotation improved.
It should be noted that mAP (mean average precision) and Rank1 are indicators measuring an algorithm's search ability, used as benchmarks for the algorithm's accuracy; they belong to the prior art and are not described in detail here.
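For concreteness, the two indicators can be computed as follows for a single query; this is a standard textbook formulation, not code from the patent. Rank1 asks whether the top-ranked gallery item has the correct identity, and average precision (averaged over queries to give mAP) rewards placing all correct matches early in the ranking:

```python
def rank1(ranked_ids, true_id):
    """1.0 if the top-ranked gallery identity is correct, else 0.0."""
    return 1.0 if ranked_ids[0] == true_id else 0.0

def average_precision(ranked_ids, true_id):
    """Mean of precision@k taken at each rank k where a true match appears."""
    hits, precisions = 0, []
    for k, pid in enumerate(ranked_ids, start=1):
        if pid == true_id:
            hits += 1
            precisions.append(hits / k)
    return sum(precisions) / hits if hits else 0.0

# One query whose true matches sit at ranks 1 and 3 of the gallery.
ranking = ["p1", "p2", "p1", "p3"]
print(rank1(ranking, "p1"))              # top hit is correct
print(average_precision(ranking, "p1"))  # (1/1 + 2/3) / 2
```

mAP is then the mean of `average_precision` over all query images.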
Those skilled in the art will understand that all or part of the functions of the methods in the above embodiments can be realized by hardware or by a computer program. When all or part of the functions in the above embodiments are realized by a computer program, the program can be stored in a computer-readable storage medium, which may include a read-only memory, a random access memory, a magnetic disk, an optical disc, a hard disk, etc.; the program is executed by a computer to realize the above functions. For example, the program is stored in a memory of a device, and when a processor executes the program in the memory, all or part of the above functions can be realized. In addition, when all or part of the functions in the above embodiments are realized by a computer program, the program can also be stored in a storage medium such as a server, another computer, a magnetic disk, an optical disc, a flash disk or a mobile hard disk, and be downloaded or copied into the memory of a local device, or be used to update the system of the local device; when a processor of the local device executes the program in the memory, all or part of the functions in the above embodiments can be realized.
The above specific examples illustrate the present invention and are merely intended to help understand it, not to limit it. For those skilled in the art, several simple deductions, variations or substitutions can also be made according to the concept of the present invention.
Claims (10)
1. An image annotation method, characterized by comprising:
obtaining an image of a target object in a site environment;
extracting feature information from the image of the target object according to a pre-established machine vision model, wherein the machine vision model is a model obtained by machine-learning training on a second data set formed by performing style conversion on a preset first data set;
marking the target object in the image of the target object using the extracted feature information, and outputting annotation information of the target object.
2. The image annotation method according to claim 1, wherein marking the target object in the image of the target object using the extracted feature information comprises:
matching several pieces of feature information extracted from the image of the target object against preset features of the target object, and marking the feature information that matches successfully;
forming the annotation information of the target object according to the marked feature information.
3. The image annotation method according to claim 1 or 2, wherein the machine vision model is a model obtained by machine-learning training on a second data set formed by performing style conversion on a preset first data set, and the establishment process of the machine vision model comprises:
an acquisition step: collecting a group of images of at least one moving object in the site environment to form a field data set, and obtaining style information of the field data set, the style information including one or more of brightness, color, chromatic aberration, sharpness, contrast and resolution;
a conversion step: performing style conversion on the preset first data set according to the style information of the field data set to obtain the second data set, wherein the first data set includes a group of annotated images of at least one moving object in an arbitrary environment, and the group of images corresponding to each moving object has unified label information;
a training step: training with the second data set through machine learning to obtain the machine vision model.
4. The image annotation method according to claim 3, wherein in the conversion step, performing style conversion on the preset first data set according to the style information of the field data set to obtain the second data set comprises:
style-transferring the first data set to the field data set by a GAN model, so as to perform style conversion on each group of images in the first data set according to the style information of the field data set, obtaining a corresponding group of new images;
integrating the group of new images corresponding to each group of images in the first data set to form the second data set.
5. The image annotation method according to claim 4, wherein style-transferring the first data set to the field data set by the GAN model, so as to perform style conversion on each group of images in the first data set according to the style information of the field data set and obtain a corresponding group of new images, comprises:
establishing a total loss function, formulated as
Loss = L_Style + λ1 · L_ID
wherein L_Style denotes the style loss function corresponding to the style information of the field data set, L_ID denotes the label loss function corresponding to the label information of each group of images in the first data set, and λ1 is a weighting factor;
adjusting the parameters of the GAN model using the style loss function and the label loss function, so that the Loss value of the total loss function reaches a minimum;
inputting each group of images in the first data set into the GAN model whose Loss value has been adjusted to the minimum, so as to perform style conversion on each group of images in the first data set, and outputting the corresponding group of new images.
6. The image annotation method according to claim 5, wherein in the total loss function, the style loss function is expressed as
L_Style = L_GAN(G, D_B, A, B) + L_GAN(Ḡ, D_A, B, A) + λ2 · L_cyc(G, Ḡ)
wherein A and B are the field data set and the first data set respectively, L_GAN is the standard adversarial loss function, L_cyc is the cycle consistency loss function, G denotes the style mapping function from A to B, Ḡ denotes the style mapping function from B to A, D_A and D_B are the style discriminators of A and B respectively, and λ2 is a weighting factor;
the label loss function is expressed as
L_ID = E_{a~p_data(a)}[Var((G(a) − a) ⊙ M(a))] + E_{b~p_data(b)}[Var((Ḡ(b) − b) ⊙ M(b))]
wherein the data distribution of A is a~p_data(a), the data distribution of B is b~p_data(b), Var is the variance calculation function of the data, G(a) is the style-transferred target image of image a in A, M(a) is the foreground mask of image a, Ḡ(b) is the style-transferred target image of image b in B, and M(b) is the foreground mask of image b.
7. The image annotation method according to claim 5, wherein a testing step is further included after the training step, the testing step comprising:
testing the machine vision model with the field data set, and adjusting the hyper-parameters in the GAN model by an iterative algorithm or a gradient descent algorithm;
after each adjustment of the hyper-parameters in the GAN model, re-forming the second data set through the conversion step, re-training the machine vision model through the training step, and continuing to test the re-trained machine vision model with the field data set, until the hyper-parameters in the GAN model are fully adjusted.
8. An image annotation device, characterized by comprising:
an acquiring unit, for obtaining an image of a target object in a site environment;
an extraction unit, for extracting feature information from the image of the target object according to a pre-established machine vision model, wherein the machine vision model is a model obtained by machine-learning training on a second data set formed by performing style conversion on a preset first data set;
an annotation unit, for marking the target object in the image of the target object using the extracted feature information, and outputting annotation information of the target object.
9. The image annotation device according to claim 8, further comprising a model establishment unit for establishing the machine vision model, connected to the extraction unit, the model establishment unit comprising:
an acquisition module, for collecting a group of images of at least one moving object in the site environment to form a field data set, and obtaining style information of the field data set, the style information including one or more of brightness, color, chromatic aberration, sharpness, contrast and resolution;
a conversion module, for performing style conversion on the preset first data set according to the style information of the field data set to obtain the second data set, wherein the first data set includes a group of annotated images of at least one moving object in an arbitrary environment, and the group of images corresponding to each moving object has unified label information;
a training module, for training with the second data set through machine learning to obtain the machine vision model.
10. A computer-readable storage medium, characterized by comprising a program executable by a processor to implement the image annotation method according to any one of claims 1-7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910655710.0A CN110516707B (en) | 2019-07-19 | 2019-07-19 | Image labeling method and device and storage medium thereof |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110516707A true CN110516707A (en) | 2019-11-29 |
CN110516707B CN110516707B (en) | 2023-06-02 |
Family
ID=68622921
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910655710.0A Active CN110516707B (en) | 2019-07-19 | 2019-07-19 | Image labeling method and device and storage medium thereof |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110516707B (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111598152A (en) * | 2020-05-12 | 2020-08-28 | 北京阿丘机器人科技有限公司 | Visual system reproduction method, apparatus and computer-readable storage medium |
CN111882038A (en) * | 2020-07-24 | 2020-11-03 | 深圳力维智联技术有限公司 | Model conversion method and device |
CN112396923A (en) * | 2020-11-25 | 2021-02-23 | 贵州轻工职业技术学院 | Marketing teaching simulation system |
CN114511510A (en) * | 2022-01-13 | 2022-05-17 | 中山大学孙逸仙纪念医院 | Method and device for automatically extracting ascending aorta image |
Patent Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2013098925A (en) * | 2011-11-04 | 2013-05-20 | Casio Comput Co Ltd | Image processing apparatus, image processing method and program |
US20180357800A1 (en) * | 2017-06-09 | 2018-12-13 | Adobe Systems Incorporated | Multimodal style-transfer network for applying style features from multi-resolution style exemplars to input images |
CN109697389A (en) * | 2017-10-23 | 2019-04-30 | 北京京东尚科信息技术有限公司 | Personal identification method and device |
CN107808149A (en) * | 2017-11-17 | 2018-03-16 | 腾讯数码(天津)有限公司 | A kind of face information mask method, device and storage medium |
CN108256439A (en) * | 2017-12-26 | 2018-07-06 | 北京大学 | A kind of pedestrian image generation method and system based on cycle production confrontation network |
CN108564127A (en) * | 2018-04-19 | 2018-09-21 | 腾讯科技(深圳)有限公司 | Image conversion method, device, computer equipment and storage medium |
CN109671018A (en) * | 2018-12-12 | 2019-04-23 | 华东交通大学 | A kind of image conversion method and system based on production confrontation network and ResNets technology |
CN109829849A (en) * | 2019-01-29 | 2019-05-31 | 深圳前海达闼云端智能科技有限公司 | A kind of generation method of training data, device and terminal |
CN109919251A (en) * | 2019-03-21 | 2019-06-21 | 腾讯科技(深圳)有限公司 | A kind of method and device of object detection method based on image, model training |
Non-Patent Citations (2)
Title |
---|
He Jianhua et al.: "Unpaired image-to-image translation based on an improved CycleGAN model", Journal of Yulin Normal University (《玉林师范学院学报》) * |
Zeng Bi et al.: "An illumination normalization method for unpaired face images based on CycleGAN", Journal of Guangdong University of Technology (《广东工业大学学报》) * |
Also Published As
Publication number | Publication date |
---|---|
CN110516707B (en) | 2023-06-02 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||