CN110516707A - Image labeling method and device, and storage medium - Google Patents
Image labeling method and device, and storage medium
- Publication number
- CN110516707A (application number CN201910655710.0A)
- Authority
- CN
- China
- Prior art keywords
- image
- data set
- style
- model
- information
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- General Physics & Mathematics (AREA)
- Physics & Mathematics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- General Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Image Analysis (AREA)
Abstract
An image labeling method and device, and a storage medium. The image labeling method includes: obtaining an image of a target object in a field environment; extracting feature information from the image of the target object according to a pre-established machine vision model, the machine vision model being a model trained by machine learning on a second data set formed by applying style conversion to a preset first data set; and marking out the target object in the image using the extracted feature information, and outputting the annotation information of the target object. Because the already-annotated data set is style-transferred to the field data set through a GAN model when the machine vision model is established, the annotated data set acquires the style information of the field data set while keeping its annotation information, thereby simulating the field environment to the greatest extent and enhancing the transfer performance of the machine vision model.
Description
Technical field
The present invention relates to the technical field of image processing, and in particular to an image labeling method and device, and a storage medium.
Background technique
Person re-identification (Person Re-identification, abbreviated ReID) has been a research focus of computer vision in recent years: given a surveillance image of a pedestrian, retrieve images of that pedestrian across devices. Because imaging devices differ, and because a pedestrian's appearance is easily affected by clothing, scale, occlusion, posture, viewing angle, and so on, person re-identification is a project that both has research value and is rich in challenges.
The goal of ReID is to match and return a queried person's images from the large-scale galleries collected by a camera network. Owing to its important applications in security and surveillance, ReID has drawn wide attention from academia and industry, and the development of deep learning together with the availability of many data sets has brought significant improvements in ReID performance.
Although performance on current ReID data sets is satisfactory, some unsolved problems still hinder the application of person ReID. First, existing public data sets differ from the data collected in real scenes in illumination, resolution, ethnicity, sharpness, background, and so on. For example, current data sets contain a limited number of identities or are captured under constrained environments; the limited number of persons and the simple illumination conditions simplify the ReID task and help achieve high recognition accuracy. In real scenes, however, ReID is usually performed in camera networks deployed across indoor and outdoor scenes, processing video shot over long periods, so a real application must cope with challenges such as a large number of identities and complex illumination and scene changes, which current algorithms may be unable to handle.
In addition, when a deep neural network is used to train a computer vision model such as a ReID model, a model trained on one data set (a data set is usually a collection of object images obtained by manual annotation or by an image annotation algorithm) suffers a large performance drop on another data set; that is, the model transfers poorly. Therefore, when computer vision techniques are applied, the field data must be annotated in large quantities and the model retrained on the annotated field data, which takes considerable time and expense.
Summary of the invention
The technical problem solved by the present invention is how to enhance the transfer performance of a machine vision model, so as to improve the accuracy of image labeling.
According to a first aspect, an embodiment provides an image labeling method, comprising: obtaining an image of a target object in a field environment; extracting feature information from the image of the target object according to a pre-established machine vision model, the machine vision model being a model trained by machine learning on a second data set formed by applying style conversion to a preset first data set; and marking out the target object in the image of the target object using the extracted feature information, and outputting annotation information of the target object.
Marking out the target object in the image of the target object using the extracted feature information comprises: matching each of several pieces of feature information extracted from the target object against preset features of the target object, and labeling the feature information that matches successfully; and forming the annotation information of the target object from the labeled feature information.
Where the machine vision model is a model trained by machine learning on a second data set formed by applying style conversion to a preset first data set, the machine vision model is established as follows.
Acquisition step: collect one group of images of at least one moving object in the field environment to form a field data set, and obtain style information of the field data set, the style information including one or more of brightness, color, chromatic aberration, sharpness, contrast, and resolution. Conversion step: apply style conversion to the preset first data set according to the style information of the field data set to obtain the second data set; the first data set includes one group of already-annotated images of at least one moving object in an arbitrary environment, and the group of images corresponding to each moving object carries a unified label. Training step: train the machine vision model on the second data set by machine learning.
In the conversion step, applying style conversion to the preset first data set according to the style information of the field data set to obtain the second data set comprises: style-transferring the first data set to the field data set through a GAN model, so that each group of images in the first data set undergoes style conversion according to the style information of the field data set, yielding a corresponding group of new images; and integrating the group of new images corresponding to each group of images in the first data set to form the second data set.
Style-transferring the first data set to the field data set through the GAN model, so as to style-convert each group of images in the first data set according to the style information of the field data set and obtain a corresponding group of new images, comprises: establishing a total loss function, formulated as
Loss = L_Style + λ1·L_ID
where L_Style denotes the style loss function corresponding to the style information of the field data set, L_ID denotes the label loss function corresponding to the label information of each group of images in the first data set, and λ1 is a weighting factor;
adjusting the parameters of the GAN model using the style loss function and the label loss function so that the Loss value of the total loss function reaches a minimum; and inputting each group of images in the first data set into the GAN model whose Loss value has been adjusted to the minimum, so as to apply style conversion to each group of images in the first data set and output the corresponding group of new images.
In the total loss function, the style loss function is expressed as
L_Style = L_GAN(G, D_B, A, B) + L_GAN(F, D_A, B, A) + λ2·L_cyc(G, F)
where A and B are the field data set and the first data set respectively, L_GAN is the standard adversarial loss function, L_cyc is the cycle consistency loss function, G denotes the style mapping function from A to B, F denotes the style mapping function from B to A, D_A and D_B are the style discriminators of A and B respectively, and λ2 is a weighting factor. The label loss function is expressed as
L_ID = E_{a~p_data(a)}[ Var((G(a) − a) ⊙ M(a)) ] + E_{b~p_data(b)}[ Var((F(b) − b) ⊙ M(b)) ]
where the data distribution of A is a ~ p_data(a), the data distribution of B is b ~ p_data(b), Var is the variance calculation function over data, G(a) is the style-transferred target image of image a in A, M(a) is the foreground mask of image a, F(b) is the style-transferred target image of image b in B, and M(b) is the foreground mask of image b.
After the training step there is also a testing step, which includes: testing the machine vision model with the field data set, and adjusting the hyperparameters of the GAN model by an iterative algorithm or a gradient descent algorithm; after each adjustment of the hyperparameters of the GAN model, re-forming the second data set through the conversion step, retraining the machine vision model through the training step, and continuing to test the retrained machine vision model with the field data set, until the adjustment of the hyperparameters of the GAN model is complete.
According to a second aspect, an embodiment provides an image labeling device, comprising:
an acquiring unit for obtaining an image of a target object in a field environment;
an extraction unit for extracting feature information from the image of the target object according to a pre-established machine vision model, the machine vision model being a model trained by machine learning on a second data set formed by applying style conversion to a preset first data set; and
a labeling unit for marking out the target object in the image of the target object using the extracted feature information, and outputting annotation information of the target object.
The image labeling device further includes a model building unit, connected to the extraction unit, for establishing the machine vision model. The model building unit includes: an acquisition module for collecting one group of images of at least one moving object in the field environment to form a field data set and obtaining the style information of the field data set, the style information including one or more of brightness, color, chromatic aberration, sharpness, contrast, and resolution; a conversion module for applying style conversion to the preset first data set according to the style information of the field data set to obtain the second data set, the first data set including one group of already-annotated images of at least one moving object in an arbitrary environment, with the group of images corresponding to each moving object carrying a unified label; and a training module for training the machine vision model on the second data set by machine learning.
According to a third aspect, an embodiment provides a computer-readable storage medium including a program executable by a processor to implement the image labeling method described in the first aspect.
The beneficial effects of the application are as follows:
According to the image labeling method, device, and storage medium of the above embodiments, the image labeling method includes: obtaining an image of a target object in a field environment; extracting feature information from the image of the target object according to a pre-established machine vision model, the machine vision model being a model trained by machine learning on a second data set formed by applying style conversion to a preset first data set; and marking out the target object in the image using the extracted feature information and outputting annotation information of the target object. In a first aspect, because the already-annotated data set is style-transferred to the field data set through a GAN model when the machine vision model is established, the annotated data set acquires the style information of the field data set while keeping its label information, thereby simulating the field environment to the greatest extent and enhancing the transfer performance of the machine vision model. In a second aspect, the established machine vision model not only overcomes the problem of poor transfer performance well, but also, when applied to the field environment, extracts feature information from images well, which helps identify the target object in the field environment quickly during image labeling, reducing the manual annotation work required for modeling a new scene and effectively saving the time and cost of new-scene modeling.
Detailed description of the invention
Fig. 1 is a flowchart of the image labeling method in the present application;
Fig. 2 is a flowchart of labeling the target object;
Fig. 3 is a flowchart of establishing the machine vision model in the present application;
Fig. 4 is a flowchart of the testing step when establishing the machine vision model;
Fig. 5 is a schematic diagram of the principle of establishing the machine vision model;
Fig. 6 is a structural schematic diagram of the image labeling device in the present application;
Fig. 7 is a structural schematic diagram of the model building unit in the image labeling device;
Fig. 8 is a schematic diagram of the principle of GAN-model style transfer.
Specific embodiment
The invention is described in further detail below through specific embodiments in conjunction with the accompanying drawings. Similar components in different embodiments use associated similar element numbers. In the following embodiments, many details are described so that the application can be better understood. However, those skilled in the art will readily recognize that some of the features may be omitted in different cases, or may be replaced by other elements, materials, or methods. In some cases, some operations related to the application are not shown or described in the specification, in order to avoid the core of the application being obscured by excessive description; for those skilled in the art, a detailed description of these related operations is not necessary, as they can be fully understood from the description in the specification and the general technical knowledge of the field.
In addition, the features, operations, or characteristics described in this specification can be combined in any suitable manner to form various embodiments. Meanwhile, the steps or actions in the method descriptions can also be reordered or adjusted in ways apparent to those skilled in the art. Therefore, the various orders in the description and drawings are only for clearly describing a certain embodiment and are not meant to be required orders, unless it is otherwise stated that a certain order must be followed.
The numbering of components herein, such as "first" and "second", is only used to distinguish the described objects and carries no ordinal or technical meaning. "Connection" and "coupling" in this application, unless otherwise specified, include both direct and indirect connection (coupling).
Embodiment one
Referring to Fig. 1, the application discloses an image labeling method comprising steps S110-S130, which are described separately below.
Step S110: obtain an image of a target object in a field environment.
In this embodiment, the field environment may be a public place such as a street, square, highway, station, market, or hotel, and the target object may be a movable object such as a pedestrian, vehicle, or pet, which is not specifically limited here. In addition, the image of the target object in the relevant field environment may be captured by one or more video capture devices (such as cameras) installed in the public place, and transmitted to a control center for acquisition.
Step S120: extract feature information from the image of the target object according to the pre-established machine vision model. The machine vision model here is a model trained by machine learning on a second data set formed by applying style conversion to a preset first data set.
In an embodiment, when several pieces of feature information are extracted from the image of the target object through the pre-established machine vision model, what are mainly extracted are feature vectors in the image. For example, if multiple pedestrians are moving on a square, the image contains not only the feature information of the pedestrians but also the feature information of other objects on the square; in that case, the feature information of both the pedestrians and the other objects is extracted from the image.
It should be noted that the feature information here is usually a feature vector, which is equivalent to the representation of the picture in the target task and can be regarded as a general representation in the computer vision field; that is, the target object is characterized by a vector, so as to support tasks such as face recognition and pedestrian recognition in practical applications. For example, face recognition retrieves, in a face vector library, the feature with the highest similarity to the target vector, and when the similarity is higher than a certain threshold the two are considered the same person.
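The threshold-based retrieval just described can be sketched as follows. The use of cosine similarity and the 0.7 threshold are illustrative assumptions; the embodiment only specifies "similarity above a certain threshold".

```python
import numpy as np

def retrieve(query: np.ndarray, gallery: np.ndarray, threshold: float = 0.7):
    """Return (index, similarity) of the gallery vector most similar to the
    query, or None if even the best match falls below the threshold."""
    # Cosine similarity of the query against every gallery row at once
    sims = gallery @ query / (np.linalg.norm(gallery, axis=1) * np.linalg.norm(query))
    best = int(np.argmax(sims))
    return (best, float(sims[best])) if sims[best] > threshold else None
```

In use, `gallery` would hold the feature vectors extracted by the machine vision model, one row per stored identity.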
Step S130: mark out the target object in the image of the target object using the extracted feature information, and output the annotation information of the target object. Specifically, the annotation information of the target object can be classified, stored, and displayed, so that administrators can conveniently find the target object through the annotation information.
In an embodiment, step S130 may include steps S131-S132 in Fig. 2, described as follows.
Step S131: match each of the several pieces of feature information extracted from the image of the target object against the preset features of the target object, and label the feature vectors that match successfully. For example, if the target object is a certain pedestrian, the preset features of that pedestrian (such as height, body outline, face contour, and clothing) can be determined from previously captured images; then, through the machine vision model, the feature vectors matching the preset features of that pedestrian can be conveniently identified in other images, and the matching feature information is determined to be related to that pedestrian, so that this feature information is marked out in the other images by means of rectangular boxes, that is, the pedestrian is identified.
Step S132: form the annotation information of the target object from the labeled feature information. Specifically, if some labeled feature information is associated with a certain pedestrian, the pedestrian can not only be marked out with a rectangular box but also be given an exclusive number in the form of label information, thereby forming the annotation information of that pedestrian.
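Steps S131-S132 can be sketched as follows. The `Annotation` record, the `match_fn` callback, and the box format are hypothetical names introduced for illustration, not taken from the embodiment.

```python
from dataclasses import dataclass, field

@dataclass
class Annotation:
    """Annotation record for one identified target object (hypothetical schema)."""
    person_id: int                                # exclusive number from step S132
    boxes: list = field(default_factory=list)     # matched rectangle boxes (x, y, w, h)

def annotate(extracted, preset_features, match_fn, person_id):
    """S131: keep only the extracted features that match a preset feature.
    S132: collect their rectangle boxes under one exclusive number."""
    ann = Annotation(person_id=person_id)
    for feat, box in extracted:                   # each item: (feature, box)
        if any(match_fn(feat, p) for p in preset_features):
            ann.boxes.append(box)                 # successful match -> labeled
    return ann
```

In practice `match_fn` would be a similarity test between feature vectors rather than the simple comparison used here.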
In this embodiment, for the accuracy of image recognition, the image of the target object is processed by the pre-established machine vision model to extract the feature information in the image; the machine vision model is a model trained by machine learning on a second data set formed by applying style conversion to a preset first data set. The establishment process of the machine vision model can therefore be illustrated by step S200; referring to Fig. 3, step S200 may include steps S210-S230, described separately below.
Step S210 can be regarded as the acquisition step: collect one group of images of at least one moving object in the field environment to form a field data set, and obtain the style information of the field data set, the style information including one or more of brightness, color, chromatic aberration, sharpness, contrast, and resolution.
For example, to perform image labeling on the pedestrians in a certain square, it is first necessary to establish a machine vision model related to the field environment of that square; to obtain such a model, the field environment of the square must be simulated and a corresponding field data set formed. Accordingly, a capture device such as a camera can be used to collect one group of images of one or more pedestrians in the square; this group of images may include multiple frames of digital images that are continuous in time. The field data set formed by the collected group of images often contains some specific style information, such as the brightness, color, and sharpness present in the current square environment.
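As a minimal sketch of obtaining style information from the collected group of images: the simple statistics below stand in for the brightness, contrast, and color items named above. Treating the mean gray level as brightness and its standard deviation as contrast is an illustrative simplification, since the embodiment does not define how each item is computed.

```python
import numpy as np

def style_stats(images):
    """Summarize simple style statistics of a field data set.
    `images` is an iterable of H x W x 3 uint8 arrays."""
    gray = [img.mean(axis=2) for img in images]           # per-image grayscale
    return {
        "brightness": float(np.mean([g.mean() for g in gray])),
        "contrast": float(np.mean([g.std() for g in gray])),
        "mean_color": [float(c) for c in
                       np.mean([img.mean(axis=(0, 1)) for img in images], axis=0)],
    }
```

Statistics of this kind could then parameterize or evaluate the style conversion of step S220.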
Step S220 can be regarded as the conversion step: apply style conversion to the preset first data set according to the style information of the field data set to obtain the second data set. In this embodiment, the first data set includes one group of already-annotated images of at least one moving object in an arbitrary environment, and the group of images corresponding to each moving object carries a unified label.
For example, the first data set can be the open-source DukeMTMC-reID data set collected in the ReID scenario, which includes 1404 pedestrians captured by cameras and 36411 detected pedestrian rectangle boxes, and includes, for each pedestrian, several images in motion shot at different time points or from different angles while traveling on a street.
It should be noted that, between the field data set in step S210 and the first data set in step S220, differences in illumination, angle, camera, and background may cause the brightness, color, sharpness, contrast, and so on of the pictures collected in the different data sets to differ as a whole, and this difference leads to poor model transfer.
In an embodiment, see Fig. 4; step S220 may include steps S221-S222, described respectively below.
Step S221: style-transfer the first data set to the field data set through a GAN model, so as to apply style conversion to each group of images in the first data set according to the style information of the field data set and obtain a corresponding group of new images.
As shown in Fig. 8, the first data set is formed from the open-source DukeMTMC-reID data set, including images of a pedestrian annotated at different time points while walking on a street; the field data set is formed from a group of images collected in the field environment, including images of a pedestrian at different time points while walking on a square (annotated or not). The first data set is style-transferred to the field data set through the GAN model; the style information of the field data set is obtained and used for style conversion, thereby obtaining the second data set corresponding to the first data set. In the second data set, the style of every image is changed and is closer to the style of the field data set, while the pedestrians still keep a certain degree of distinguishability and their IDs remain unchanged.
Step S222: integrate the group of new images corresponding to each group of images in the first data set to form the second data set.
It should be noted that the GAN model involved in this embodiment is a generative adversarial network (Generative Adversarial Networks, GAN), a kind of deep learning model. A GAN model produces fairly good output through the mutual game learning of two modules in its framework: a generation module (generative model) and a discrimination module (discriminative model). In practical applications, deep neural networks are often used as G and D, and a good training method should be set for the GAN model, otherwise the freedom of the neural network model may lead to unsatisfactory output. The generation module is mainly used to learn the real image distribution so that the images it generates become more realistic, in order to fool the discrimination module; the discrimination module judges whether a received image is real or fake. The whole process makes the images produced by the generation module more and more realistic, and the discrimination module more and more accurate in judging real images; over time the two modules reach a balance. Since GAN models are commonly used for style transfer between two image domains and belong to the prior art, they are not described in detail here.
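The generator-discriminator game described above can be illustrated with a deliberately tiny one-dimensional GAN. The linear generator and logistic discriminator are toy assumptions meant only to show the alternating adversarial updates, not the image-to-image GAN of the embodiment.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D GAN: a linear generator maps noise to samples, and a logistic
# discriminator scores how likely a sample is to be real.
w_g, w_d, b_d = rng.normal(size=3)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gan_step(real, lr=0.01):
    """One round of the game: D is pushed toward separating real from
    generated samples, then G is pushed toward fooling the updated D.
    Returns the discriminator loss before the update."""
    global w_g, w_d, b_d
    z = rng.normal(size=real.shape)
    fake = w_g * z                                   # generator output
    d_real = sigmoid(w_d * real + b_d)
    d_fake = sigmoid(w_d * fake + b_d)
    d_loss = -float(np.mean(np.log(d_real) + np.log(1.0 - d_fake)))
    # Discriminator: gradient ascent on log D(real) + log(1 - D(fake))
    w_d += lr * np.mean((1.0 - d_real) * real - d_fake * fake)
    b_d += lr * np.mean((1.0 - d_real) - d_fake)
    # Generator: gradient ascent on log D(fake) against the updated D
    w_g += lr * np.mean((1.0 - sigmoid(w_d * fake + b_d)) * w_d * z)
    return d_loss
```

Alternating these two updates over many rounds is the "game learning" by which the two modules reach a balance.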
In this embodiment, in the process of style-transferring the first data set to the field data set through the GAN model, style-converting each group of images in the first data set according to the style information of the field data set, and obtaining the corresponding group of new images, the style transfer is controlled through the following three steps in order to guarantee the effect of the GAN model's style transfer:
(1) Establish a total loss function, formulated as
Loss = L_Style + λ1·L_ID
where L_Style denotes the style loss function corresponding to the style information of the field data set, L_ID denotes the label loss function corresponding to the label information of each group of images in the first data set, and λ1 is a weighting factor.
In the total loss function Loss, the style loss function is expressed as
L_Style = L_GAN(G, D_B, A, B) + L_GAN(F, D_A, B, A) + λ2·L_cyc(G, F)
where A and B are the field data set and the first data set respectively, L_GAN is the standard adversarial loss function, L_cyc is the cycle consistency loss function, G denotes the style mapping function from A to B, F denotes the style mapping function from B to A, D_A and D_B are the style discriminators of A and B respectively, and λ2 is a weighting factor.
In the total loss function Loss, the label loss function is expressed as
L_ID = E_{a~p_data(a)}[ Var((G(a) − a) ⊙ M(a)) ] + E_{b~p_data(b)}[ Var((F(b) − b) ⊙ M(b)) ]
where the data distribution of A is a ~ p_data(a), the data distribution of B is b ~ p_data(b), Var is the variance calculation function over data, G(a) is the style-transferred target image of image a in A, M(a) is the foreground mask of image a, F(b) is the style-transferred target image of image b in B, and M(b) is the foreground mask of image b.
(2) Adjust the parameters of the GAN model using the style loss function L_Style and the label loss function L_ID, so that the Loss value of the total loss function reaches a minimum.
(3) Input each group of images in the first data set into the GAN model whose Loss value has been adjusted to the minimum, so as to apply style conversion to each group of images in the first data set and output the corresponding group of new images.
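Step (1) can be sketched numerically as follows. The λ1 value and the reading of Var(...) as the variance of the masked residual are assumptions, since the patent's formula images are not reproduced in this text.

```python
import numpy as np

def total_loss(l_style: float, l_id: float, lambda1: float = 0.5) -> float:
    """Total loss Loss = L_Style + λ1·L_ID; λ1 = 0.5 is an assumed value,
    as the text only calls it a weighting factor."""
    return l_style + lambda1 * l_id

def label_loss(originals, transferred, masks):
    """Label (identity-preserving) loss: penalize variation of the residual
    between each image and its style-transferred version inside the
    foreground mask M, averaged over the batch."""
    terms = [float(np.var((g - a) * m))
             for a, g, m in zip(originals, transferred, masks)]
    return float(np.mean(terms))
```

Keeping the masked foreground residual small is what lets the transferred images change style while their labels (pedestrian IDs) stay valid.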
Step S230 can be regarded as the training step: train the machine vision model on the second data set by machine learning.
For example, the second data set can be used to obtain the machine vision model through ReID model training. The ReID model here is a person re-identification model (Person re-identification, abbreviated Re-ID), a technique that uses computer vision to judge whether a specific pedestrian is present in an image or a video sequence. There are two key techniques in a ReID model: one is feature extraction, learning features that cope with the changes a pedestrian undergoes under different cameras; the other is metric learning, mapping the learned features into a new space where the same person is closer and different people are farther apart. Since the ReID model belongs to the prior art, it is not described in detail here.
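The metric-learning idea above (same person closer, different people farther) is commonly realized with a margin-based objective such as the one below. The triplet form and the 0.3 margin are illustrative assumptions; the embodiment does not name a specific loss.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.3):
    """Pull an anchor feature toward a sample of the same person (positive)
    and push it away from a different person (negative); zero loss once the
    negative is farther than the positive by at least the margin."""
    d_pos = np.linalg.norm(anchor - positive)
    d_neg = np.linalg.norm(anchor - negative)
    return float(max(0.0, d_pos - d_neg + margin))
```

Minimizing this over many (anchor, positive, negative) triplets sampled from the second data set shapes the feature space the ReID model retrieves in.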
In another embodiment, see Fig. 4: after the training step S230 there is also a testing step S240, which can be summarized as follows. (a) Test the machine vision model with the field data set, and adjust the hyperparameters of the GAN model (such as the parameters λ1 and λ2) by an iterative algorithm or a gradient descent algorithm. (b) After each adjustment of the hyperparameters of the GAN model, re-form the second data set through the conversion step S220 (i.e., S221-S222), retrain the machine vision model through the training step S230, and continue to test the retrained machine vision model with the field data set, until the adjustment of the hyperparameters of the GAN model is complete.
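The testing step's hyperparameter loop can be sketched as a simple search. Here `convert`, `train`, and `evaluate` are placeholder callables standing in for the conversion step S220, the training step S230, and testing on the field data set; grid search is an assumed strategy, one of the "iterative algorithms" the text allows.

```python
def tune(candidates, convert, train, evaluate):
    """For each hyperparameter candidate: rebuild the second data set,
    retrain the model, score it on the field data set, and keep the best
    (candidate, score, model) triple."""
    best = None
    for lam in candidates:
        dataset2 = convert(lam)        # conversion step with the current setting
        model = train(dataset2)        # training step
        score = evaluate(model)        # test on the field data set
        if best is None or score > best[1]:
            best = (lam, score, model)
    return best
```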
To clearly explain the principle of establishing the machine vision model, it is described here with reference to Fig. 5. Referring to Fig. 5, the first data set includes a group of annotated images of at least one moving object in an arbitrary environment, and the field data set includes a group of images of at least one moving object in the site environment. The first data set is style-transferred to the field data set by a GAN model, so that each group of images in the first data set undergoes style conversion according to the style information of the field data set, yielding a corresponding group of new images; the new images are integrated to form the second data set. The second data set is then used to train the Re-ID model, obtaining the machine vision model claimed in this application. Afterwards, the machine vision model is tested with the field data set, and the hyper-parameters in the GAN model are adjusted by an iterative algorithm or gradient descent; when the set number of iterations is reached or the gradient-descent requirement is met, the hyper-parameter adjustment is considered finished, the machine vision model is optimized, and image annotation of target objects in the site environment can be carried out.
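The pipeline of Fig. 5 can be reduced to a short hypothetical sketch. The names are illustrative, and the GAN translator and Re-ID trainer are replaced by plain callables; the key point it shows is that style conversion preserves each image's existing label, so the second data set needs no new manual annotation:

```python
def build_annotation_model(first_set, translate, train_reid):
    """first_set: list of (image, label) pairs from the annotated source set.
    translate: style-converts one image toward the field data set's style.
    train_reid: trains a Re-ID model on a list of (image, label) pairs."""
    # Labels ride along unchanged through the style transfer.
    second_set = [(translate(img), label) for img, label in first_set]
    return train_reid(second_set)

# Dummy usage: strings stand in for images, identity function for training.
first = [("imgA", 1), ("imgB", 2)]
model = build_annotation_model(first, lambda im: im + "_styled", lambda ds: ds)
print(model)
```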
Embodiment two
Referring to Fig. 6, on the basis of the image annotation method disclosed in embodiment one, this application correspondingly discloses an image annotation device 3. The image annotation device 3 mainly includes an acquiring unit 31, an extraction unit 32 and an annotation unit 33, which are described separately below.
The acquiring unit 31 is used to obtain an image of a target object in a site environment.
In this embodiment, the site environment may be a public place such as a street, a square, a highway, a station, a shopping mall or a hotel, and the target object may be a movable object such as a pedestrian, a vehicle or a pet; no specific limitation is made here. Furthermore, one or more video capture devices (e.g. cameras) installed in the public place may capture images of the target object in the relevant site environment and transmit them to a control center for acquisition.
The extraction unit 32 is connected to the acquiring unit 31 and is used to extract feature information from the image of the target object according to a pre-established machine vision model. In this application, the machine vision model is a model obtained by machine-learning training on a second data set, which is formed by performing style conversion on a preset first data set. For the specific function of the extraction unit 32, reference may be made to step S120 in embodiment one, which is not repeated here.
The annotation unit 33 is connected to the extraction unit 32 and is used to mark the target object in its image using the extracted feature information, and to output the annotation information of the target object. Specifically, if some marked feature information (feature vectors) is associated with a certain pedestrian, the pedestrian can not only be marked with a rectangular frame, but can also be given an exclusive number in the form of label information, thereby forming the pedestrian's annotation information. In addition, the annotation unit 33 can classify, store and display the annotation information of target objects, so that administrators can conveniently find target objects through this information.
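A hypothetical sketch of the annotation unit's matching step follows. Comparing each extracted feature vector against stored reference features by cosine similarity, and the record format, identifier names and threshold, are illustrative assumptions, not the patent's specification:

```python
import numpy as np

def annotate(detections, known_features, threshold=0.9):
    """detections: list of (bounding_box, feature_vector) pairs.
    known_features: {person_id: reference_feature_vector}.
    Emits one annotation record (box + exclusive id) per match."""
    records = []
    for box, feat in detections:
        for pid, ref in known_features.items():
            # Cosine similarity between extracted and reference features.
            sim = float(np.dot(feat, ref) /
                        (np.linalg.norm(feat) * np.linalg.norm(ref)))
            if sim >= threshold:
                records.append({"id": pid, "box": box,
                                "similarity": round(sim, 3)})
    return records

# Dummy usage: one detection matched against one stored pedestrian.
feat = np.array([1.0, 0.0])
dets = [((0, 0, 10, 20), feat)]
refs = {"pedestrian_007": np.array([1.0, 0.05])}
print(annotate(dets, refs))
```

The emitted records could then be stored and displayed grouped by `id`, as the paragraph above describes.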
Further, referring to Fig. 6 and Fig. 7, the image annotation device 3 also includes a model establishment unit 34 for establishing the machine vision model, connected to the extraction unit 32. The model establishment unit 34 includes an acquisition module 341, a conversion module 342 and a training module 343.
The acquisition module 341 is used to collect a group of images of at least one moving object in the site environment to form the field data set, and to obtain the style information of the field data set; the style information may include one or more of brightness, color, chromatic aberration, sharpness, contrast and resolution. For the specific function of the acquisition module 341, reference may be made to step S210 in embodiment one, which is not repeated here.
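Two of the listed style attributes can be summarized with simple global statistics. The sketch below, an illustrative assumption rather than the patent's method, treats mean intensity as brightness and the standard deviation of intensities as contrast over a whole image set:

```python
import numpy as np

def style_info(images):
    """Summarise the 'style' of an image set by global statistics:
    mean intensity (brightness) and intensity std (contrast)."""
    stack = np.stack([img.astype(np.float64) for img in images])
    return {"brightness": float(stack.mean()),
            "contrast": float(stack.std())}

# Dummy usage: three identical flat dark images.
dark = [np.full((4, 4), 10, dtype=np.uint8)] * 3
print(style_info(dark))
```

Richer style descriptors (color histograms, frequency content for sharpness) would extend the same per-set aggregation pattern.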
The conversion module 342 is used to perform style conversion on the preset first data set according to the style information of the field data set, obtaining the second data set. Here, the first data set includes a group of annotated images of at least one moving object in an arbitrary environment, and the group of images corresponding to each moving object has unified label information. For the specific function of the conversion module 342, reference may be made to step S220 in embodiment one, which is not repeated here.
The training module 343 is used to train with the second data set through machine learning (e.g. a Re-ID model) to obtain the machine vision model. For the specific function of the training module 343, reference may be made to step S230 in embodiment one, which is not repeated here.
To clearly demonstrate the beneficial effects of the technical method of this application, a comparative test was carried out. In the first test, a machine vision model was trained directly on the open-source DukeMTMC-reID data set, and was tested in the site environment, obtaining the first group of test indicators mAP and Rank1. In the second test, the open-source DukeMTMC-reID data set was style-transferred to the field data set, forming a second data set after style transfer, DukeMTMC-ReID*M; another machine vision model was trained on this second data set and tested in the site environment, obtaining the second group of test indicators mAP and Rank1.
Table 1: Test indicator results of the comparative test
As can be seen from Table 1, the test indicators obtained in the second test are considerably improved compared with the first test, which shows that the migration effect of the machine vision model is good: the required manual annotation work can be reduced and the accuracy of image annotation improved.
It should be noted that mAP (mean average precision) and Rank1 are indicators measuring an algorithm's search ability, used as benchmarks for the algorithm's accuracy; they belong to the prior art and are not described in detail here.
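For concreteness, the two indicators can be computed as follows for a single query; this is a standard textbook formulation, not code from the patent. Rank1 asks whether the top-ranked gallery item has the correct identity, and average precision (averaged over queries to give mAP) rewards placing all correct matches early in the ranking:

```python
def rank1(ranked_ids, true_id):
    """1.0 if the top-ranked gallery identity is correct, else 0.0."""
    return 1.0 if ranked_ids[0] == true_id else 0.0

def average_precision(ranked_ids, true_id):
    """Mean of precision@k taken at each rank k where a true match appears."""
    hits, precisions = 0, []
    for k, pid in enumerate(ranked_ids, start=1):
        if pid == true_id:
            hits += 1
            precisions.append(hits / k)
    return sum(precisions) / hits if hits else 0.0

# One query whose true matches sit at ranks 1 and 3 of the gallery.
ranking = ["p1", "p2", "p1", "p3"]
print(rank1(ranking, "p1"))              # top hit is correct
print(average_precision(ranking, "p1"))  # (1/1 + 2/3) / 2
```

mAP is then the mean of `average_precision` over all query images.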
Those skilled in the art will understand that all or part of the functions of the methods in the above embodiments can be realized by hardware or by a computer program. When all or part of the functions in the above embodiments are realized by a computer program, the program can be stored in a computer-readable storage medium, which may include a read-only memory, a random access memory, a magnetic disk, an optical disc, a hard disk, etc.; the program is executed by a computer to realize the above functions. For example, the program is stored in a memory of a device, and when a processor executes the program in the memory, all or part of the above functions can be realized. In addition, when all or part of the functions in the above embodiments are realized by a computer program, the program can also be stored in a storage medium such as a server, another computer, a magnetic disk, an optical disc, a flash disk or a mobile hard disk, and be downloaded or copied into the memory of a local device, or be used to update the system of the local device; when a processor of the local device executes the program in the memory, all or part of the functions in the above embodiments can be realized.
The above specific examples illustrate the present invention and are merely intended to help understand it, not to limit it. For those skilled in the art, several simple deductions, variations or substitutions can also be made according to the concept of the present invention.
Claims (10)
1. An image annotation method, characterized by comprising:
obtaining an image of a target object in a site environment;
extracting feature information from the image of the target object according to a pre-established machine vision model, wherein the machine vision model is a model obtained by machine-learning training on a second data set formed by performing style conversion on a preset first data set;
marking the target object in the image of the target object using the extracted feature information, and outputting annotation information of the target object.
2. The image annotation method according to claim 1, wherein marking the target object in the image of the target object using the extracted feature information comprises:
matching several pieces of feature information extracted from the image of the target object against preset features of the target object, and marking the feature information that matches successfully;
forming the annotation information of the target object according to the marked feature information.
3. The image annotation method according to claim 1 or 2, wherein the machine vision model is a model obtained by machine-learning training on a second data set formed by performing style conversion on a preset first data set, and the establishment process of the machine vision model comprises:
an acquisition step: collecting a group of images of at least one moving object in the site environment to form a field data set, and obtaining style information of the field data set, the style information including one or more of brightness, color, chromatic aberration, sharpness, contrast and resolution;
a conversion step: performing style conversion on the preset first data set according to the style information of the field data set to obtain the second data set, wherein the first data set includes a group of annotated images of at least one moving object in an arbitrary environment, and the group of images corresponding to each moving object has unified label information;
a training step: training with the second data set through machine learning to obtain the machine vision model.
4. The image annotation method according to claim 3, wherein in the conversion step, performing style conversion on the preset first data set according to the style information of the field data set to obtain the second data set comprises:
style-transferring the first data set to the field data set by a GAN model, so as to perform style conversion on each group of images in the first data set according to the style information of the field data set, obtaining a corresponding group of new images;
integrating the group of new images corresponding to each group of images in the first data set to form the second data set.
5. The image annotation method according to claim 4, wherein style-transferring the first data set to the field data set by the GAN model, so as to perform style conversion on each group of images in the first data set according to the style information of the field data set and obtain a corresponding group of new images, comprises:
establishing a total loss function, formulated as
Loss = L_Style + λ1 · L_ID
wherein L_Style denotes the style loss function corresponding to the style information of the field data set, L_ID denotes the label loss function corresponding to the label information of each group of images in the first data set, and λ1 is a weighting factor;
adjusting the parameters of the GAN model using the style loss function and the label loss function, so that the Loss value of the total loss function reaches a minimum;
inputting each group of images in the first data set into the GAN model whose Loss value has been adjusted to the minimum, so as to perform style conversion on each group of images in the first data set, and outputting the corresponding group of new images.
6. The image annotation method according to claim 5, wherein in the total loss function, the style loss function is expressed as
L_Style = L_GAN(G, D_B, A, B) + L_GAN(Ḡ, D_A, B, A) + λ2 · L_cyc(G, Ḡ)
wherein A and B are the field data set and the first data set respectively, L_GAN is the standard adversarial loss function, L_cyc is the cycle consistency loss function, G denotes the style mapping function from A to B, Ḡ denotes the style mapping function from B to A, D_A and D_B are the style discriminators of A and B respectively, and λ2 is a weighting factor;
the label loss function is expressed as
L_ID = E_{a~p_data(a)}[Var((G(a) − a) ⊙ M(a))] + E_{b~p_data(b)}[Var((Ḡ(b) − b) ⊙ M(b))]
wherein the data distribution of A is a~p_data(a), the data distribution of B is b~p_data(b), Var is the variance calculation function of the data, G(a) is the style-transferred target image of image a in A, M(a) is the foreground mask of image a, Ḡ(b) is the style-transferred target image of image b in B, and M(b) is the foreground mask of image b.
7. The image annotation method according to claim 5, wherein a testing step is further included after the training step, the testing step comprising:
testing the machine vision model with the field data set, and adjusting the hyper-parameters in the GAN model by an iterative algorithm or a gradient descent algorithm;
after each adjustment of the hyper-parameters in the GAN model, re-forming the second data set through the conversion step, re-training the machine vision model through the training step, and continuing to test the re-trained machine vision model with the field data set, until the hyper-parameters in the GAN model are fully adjusted.
8. An image annotation device, characterized by comprising:
an acquiring unit, for obtaining an image of a target object in a site environment;
an extraction unit, for extracting feature information from the image of the target object according to a pre-established machine vision model, wherein the machine vision model is a model obtained by machine-learning training on a second data set formed by performing style conversion on a preset first data set;
an annotation unit, for marking the target object in the image of the target object using the extracted feature information, and outputting annotation information of the target object.
9. The image annotation device according to claim 8, further comprising a model establishment unit for establishing the machine vision model, connected to the extraction unit, the model establishment unit comprising:
an acquisition module, for collecting a group of images of at least one moving object in the site environment to form a field data set, and obtaining style information of the field data set, the style information including one or more of brightness, color, chromatic aberration, sharpness, contrast and resolution;
a conversion module, for performing style conversion on the preset first data set according to the style information of the field data set to obtain the second data set, wherein the first data set includes a group of annotated images of at least one moving object in an arbitrary environment, and the group of images corresponding to each moving object has unified label information;
a training module, for training with the second data set through machine learning to obtain the machine vision model.
10. A computer-readable storage medium, characterized by comprising a program executable by a processor to implement the image annotation method according to any one of claims 1-7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910655710.0A CN110516707B (en) | 2019-07-19 | 2019-07-19 | Image labeling method and device and storage medium thereof |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110516707A true CN110516707A (en) | 2019-11-29 |
CN110516707B CN110516707B (en) | 2023-06-02 |
Family
ID=68622921
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910655710.0A Active CN110516707B (en) | 2019-07-19 | 2019-07-19 | Image labeling method and device and storage medium thereof |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110516707B (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111598152A (en) * | 2020-05-12 | 2020-08-28 | 北京阿丘机器人科技有限公司 | Visual system reproduction method, apparatus and computer-readable storage medium |
CN111882038A (en) * | 2020-07-24 | 2020-11-03 | 深圳力维智联技术有限公司 | Model conversion method and device |
CN112396923A (en) * | 2020-11-25 | 2021-02-23 | 贵州轻工职业技术学院 | Marketing teaching simulation system |
CN114511510A (en) * | 2022-01-13 | 2022-05-17 | 中山大学孙逸仙纪念医院 | Method and device for automatically extracting ascending aorta image |
Patent Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2013098925A (en) * | 2011-11-04 | 2013-05-20 | Casio Comput Co Ltd | Image processing apparatus, image processing method and program |
US20180357800A1 (en) * | 2017-06-09 | 2018-12-13 | Adobe Systems Incorporated | Multimodal style-transfer network for applying style features from multi-resolution style exemplars to input images |
CN109697389A (en) * | 2017-10-23 | 2019-04-30 | 北京京东尚科信息技术有限公司 | Personal identification method and device |
CN107808149A (en) * | 2017-11-17 | 2018-03-16 | 腾讯数码(天津)有限公司 | A kind of face information mask method, device and storage medium |
CN108256439A (en) * | 2017-12-26 | 2018-07-06 | 北京大学 | A kind of pedestrian image generation method and system based on cycle production confrontation network |
CN108564127A (en) * | 2018-04-19 | 2018-09-21 | 腾讯科技(深圳)有限公司 | Image conversion method, device, computer equipment and storage medium |
CN109671018A (en) * | 2018-12-12 | 2019-04-23 | 华东交通大学 | A kind of image conversion method and system based on production confrontation network and ResNets technology |
CN109829849A (en) * | 2019-01-29 | 2019-05-31 | 深圳前海达闼云端智能科技有限公司 | A kind of generation method of training data, device and terminal |
CN109919251A (en) * | 2019-03-21 | 2019-06-21 | 腾讯科技(深圳)有限公司 | A kind of method and device of object detection method based on image, model training |
Non-Patent Citations (2)
Title |
---|
He Jianhua et al.: "Unpaired image-to-image translation based on an improved CycleGAN model", Journal of Yulin Normal University (《玉林师范学院学报》) * |
Zeng Bi et al.: "An illumination normalization method for unpaired face images based on CycleGAN", Journal of Guangdong University of Technology (《广东工业大学学报》) * |
Also Published As
Publication number | Publication date |
---|---|
CN110516707B (en) | 2023-06-02 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||