CN108959304A

CN108959304A - Tag prediction method and device

Info

Publication number: CN108959304A
Application number: CN201710363676.0A
Authority: CN
Inventors: 魏溪含
Original assignee: Alibaba Group Holding Ltd
Current assignee: Alibaba Group Holding Ltd
Priority date: 2017-05-22
Filing date: 2017-05-22
Publication date: 2018-12-07
Anticipated expiration: 2037-05-22
Also published as: CN108959304B

Abstract

The embodiments of the present application disclose a tag prediction method and device. The method includes: obtaining at least one image data set, where the images in the image data set belong to the same category; performing tag prediction on each image in the image data set within a preset range of image tags, generating at least one predicted tag for each image; and counting the number of occurrences of each predicted tag, obtaining the images corresponding to a predicted tag whose count satisfies a preset condition, and setting the tag of the image data set to which those images belong to that predicted tag. The embodiments of the present application can improve both the accuracy of tag prediction and the efficiency of merging.

Description

Tag prediction method and device

Technical field

The present application relates to the technical field of information processing, and in particular to a tag prediction method and device.

Background technique

In recent years, with the rapid development of science and technology, people have growing demands for intelligent living. "Image-to-image search" and "image-to-text search" are probably no longer unfamiliar to users: many online shopping platforms and search platforms can take a picture supplied by the user and retrieve pictures of the same or a similar category, or even match the content shown in the picture. For example, given a picture of a cat supplied by the user, a platform may retrieve pictures similar to it, or return information such as the breed of the cat in the picture.

To ensure that users can find pictures or text related to an input picture, a platform that offers services such as "image-to-image search" and "image-to-text search" generally needs a massive image data resource. When building that resource, pictures usually have to be tagged. A tag indicates the category a picture belongs to within the data resource and makes the pictures easier to manage; examples of tags are "British Shorthair", "gardenia" and "keyboard". A service platform naturally wants as many pictures as possible under each tag, so it collects pictures from other image data resources and adds them to its own. The pictures in those other resources also carry tag information, but the rules for setting picture tags differ from platform to platform. For example, the annotation language used on a foreign data platform may differ from the target language, and when translation software is used to translate the annotation language into the target language, the result may be ambiguous or unclear in meaning. As a result, some picture tags cannot be merged into the platform's existing image data resource. For example, Google Open Images contains multiple pictures tagged "comics". If the target language is Chinese, "comics" can be translated into several different expressions, such as "caricature", "comic books" and "case of caricatures of persons". If the existing image data platform contains all three of these tags, the prior art cannot determine which of them "comics" should be merged into.

To solve the above problem, the prior art usually relies on manual inspection to judge whether a picture tag can be merged with an existing tag. For example, a person opens the pictures tagged "comics" in Google Open Images and checks whether "comics" belongs to "caricature", "comic books" or "case of caricatures of persons". Such manual inspection involves a heavy workload and is inefficient.

Therefore, a more accurate and more intelligent way of merging image tags is needed.

Summary of the invention

The purpose of the embodiments of the present application is to provide a tag prediction method and device that improve the accuracy of tag prediction and the efficiency of merging.

The tag prediction method and device provided by the embodiments of the present application are implemented as follows:

A tag prediction method, comprising:

obtaining at least one image data set, where the images in the image data set belong to the same category;

performing tag prediction on each image in the image data set within a preset range of image tags, generating at least one predicted tag for each image;

counting the number of occurrences of each predicted tag, obtaining the images corresponding to a predicted tag whose count satisfies a preset condition, and setting the tag of the image data set to which those images belong to that predicted tag.

A tag prediction method, comprising:

counting, for each predicted tag, the number of images corresponding to it, and, when the number satisfies a preset condition, setting the tag of the image data set to which those images belong to that predicted tag.

A tag prediction method, comprising:

obtaining multiple images belonging to the same category;

performing tag prediction on the multiple images using a prediction model, generating at least one predicted tag for each image;

counting the number of occurrences of each individual predicted tag, and taking a predicted tag whose number of occurrences satisfies a preset condition as the recommended tag for the multiple images.

A tag prediction device, comprising a processor and a memory for storing processor-executable instructions, where the processor, when executing the instructions, implements:

obtaining multiple images belonging to the same category;

A computer-readable storage medium storing computer instructions which, when executed, implement the following steps:

The tag prediction method and device provided by the present application can merge image data sets of the same category into an original image data source. During merging, tag prediction is first performed on the images in an image data set, and the predicted tags fall within the tag range of the original image data source. Whether to merge the corresponding image data set is then decided according to the count of each predicted tag. Whereas the prior art judges whether data sets should be merged from the literal meaning of their image tags, this embodiment takes the opposite approach and starts from the images behind a tag: it first judges whether the images belong to the same category and only then merges the tag of the image data set. This improves the accuracy of image data merging. In addition, merging data by setting tags allows large-scale image data to be migrated quickly, which improves the efficiency of merging large-scale data.

Brief description of the drawings

To explain the technical solutions in the embodiments of the present application or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are merely some of the embodiments recorded in the present application; a person of ordinary skill in the art can derive other drawings from them without creative effort.

Fig. 1 is a schematic diagram of an application scenario provided by the present application;

Fig. 2 is a schematic diagram of an application scenario provided by the present application;

Fig. 3 is a schematic flowchart of one embodiment of the tag prediction method provided by the present application;

Fig. 4 is a schematic example, provided by the present application, of merging a new image data source with the original image data source;

Fig. 5 is a schematic diagram of the BP network topology provided by the present application;

Fig. 6 is a schematic diagram of an application scenario provided by the present application;

Fig. 7 is a schematic diagram of an application scenario provided by the present application;

Fig. 8 is a schematic flowchart of another embodiment of the tag prediction method provided by the present application;

Fig. 9 is a schematic flowchart of another embodiment of the tag prediction method provided by the present application;

Fig. 10 is a schematic diagram of the module structure of one embodiment of the tag prediction device provided by the present application.

Detailed description of the embodiments

To help those skilled in the art better understand the technical solutions in the present application, the technical solutions in the embodiments of the present application are described clearly and completely below with reference to the drawings of those embodiments. The described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art on the basis of the embodiments in the present application without creative effort shall fall within the scope of protection of the present application.

To help those skilled in the art understand the technical solutions provided by the embodiments of the present application, the technical environment in which the solutions are implemented is described first.

Currently, many search engines accept not only text and sound as input but also pictures. The development of search engines is closely tied to user demand. At first, when users saw or heard an unfamiliar word, they wanted to look it up through a search engine. Later, when users heard a pleasant piece of music, they wanted to obtain information about that music through a search engine. Both of these demands have been met. Nowadays, users want to search for information about any picture they see. In a typical scenario, a user is attracted by the snow-mountain scenery on a poster but does not know where it is, so the user photographs the scenery shown on the poster and submits the photo to a search engine that supports picture search, hoping to find the location. Suppose the scenery is in fact in a small town in Switzerland, but the background database of the search engine contains no image resources of that town's snow mountain. The search engine will then probably fail to find the location, and may even wrongly report that the scenery is at a popular scenic spot such as Tianshan in Xinjiang. There are many similar situations: when the image resources behind a search engine are insufficient, the engine may fail to identify information about the user's input picture, or may identify it wrongly. Wrong information misleads users, harms the service quality of the search engine as a whole, draws negative evaluations from users, and may even cause a crisis of user trust.

To solve the above problem, the image resources in the background database of an image search engine need to be expanded continuously. As described above, inconsistent wording of image tags arises during this expansion, so image data resources on other platforms cannot be merged accurately into the image data resources of the search engine. The prior art relies on manual inspection to judge whether differently worded image tags have the same meaning, but this approach is costly and inefficient and cannot keep up with the rate at which image resources are updated.

To meet practical technical needs like those described above, the tag prediction approach provided by the present application can perform deep learning on the existing image resources of a search engine and train a model of the relationship between tags and images. Image data from other platforms is then fed into the model, which predicts the tags corresponding to that image data. The number of images corresponding to each predicted tag determines whether the two image tags are merged, that is, whether the image resources on the two platforms are merged.
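The merge decision described above can be sketched as follows. This is a minimal illustration rather than the claimed implementation: the function name `recommend_tag`, the `min_share` parameter and the "share of images" form of the preset condition are all assumptions, and the predicted tags are supplied directly instead of coming from a trained model.

```python
from collections import Counter


def recommend_tag(predicted_tags_per_image, min_share=0.5):
    """Return the predicted tag that covers enough images, or None.

    predicted_tags_per_image: one collection of predicted tags per image.
    min_share: hypothetical preset condition, the minimum fraction of
    images that must carry the tag for the data set to be merged.
    """
    counts = Counter()
    for tags in predicted_tags_per_image:
        counts.update(set(tags))  # count each tag at most once per image
    if not counts:
        return None
    tag, count = counts.most_common(1)[0]
    if count / len(predicted_tags_per_image) >= min_share:
        return tag
    return None


# Predicted tags for four images from a foreign data set tagged "comics".
predictions = [
    ["comic books"],
    ["comic books", "caricature"],
    ["comic books"],
    ["caricature"],
]
merged_tag = recommend_tag(predictions)
```

Here three of the four images receive the same predicted tag, so under the assumed 50% condition the foreign data set would be filed under that tag; with no dominant tag the function returns `None` and the data set is left unmerged.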

The technical solution provided by the embodiments of the present application can be applied to many kinds of platforms. Several application platforms are briefly introduced below.

One simple application platform is an image search service platform that provides picture search. On such a platform, the user can submit any image to the search engine provided by the platform, and the platform can match the input image to similar images or to information related to the input image. In the example from the technical environment above, the user submits the photo of the snow-mountain scenery on the poster to the search engine, and the search engine immediately matches information related to the photo. As shown in Fig. 1, the related information can be presented as tags; for the snow-scene photo shown in Fig. 1, the search engine can output the following tags: "Switzerland", "snow mountain", "Alps", "Mount Titlis", "winter", "sky" and so on. The related information can also be presented in other forms, such as sentences; the present application places no restriction here. With the technical solution of the present application, massive image resources can be added quickly and accurately to the background database of the search engine, so that a user searching with an input picture obtains correct information about it.

Another application scenario is user picture management. With the rapid development of cloud computing, the amount of data generated in the cloud grows daily, and users' personal photos are an important part of it. Both client devices and the cloud already offer ways to manage photo albums, but these are fairly simple, for example classifying by time, person or place, which are broad and coarse criteria. With the technical solution provided by the present application, deep learning can be performed on existing scene images to generate a relational model between scene tags and the images under each tag. Based on this relational model, the user's pictures and photos can be classified by scene. Fig. 2 is a schematic diagram of a user interface, provided by the present application, for managing a user's album by scene category. As shown in Fig. 2, the technical solution of the present application can divide the user's photos into several major categories such as indoor, outdoor and cartoon, and each major category can be divided into several subcategories. For example, indoor scenes can be divided into residence, office, coffee shop, shopping mall and other scenes; similarly, outdoor scenes can be divided into mountain climbing, seaside, gardens and other scenes. Scene-based classification makes searching for photos much more convenient. For example, a user wants to find photos taken at the seaside a couple of years ago but has forgotten the exact date. With existing album management, the user has to open albums from long ago and browse for a long time to find the photos, which is very inconvenient. If the album is managed by scene as shown in Fig. 2, the user only needs to remember the rough scene, such as "seaside", to find the photos quickly, and search efficiency improves greatly.

It should be noted that the scene classification is not limited to the examples above. Each subcategory can also be used directly as a major category, and categories can be combined freely; the present application places no restriction here.

The technical solution of the present application can also be applied to scenarios such as recreational image recognition and security; the present application places no restriction on the choice of application scenario. In summary, all of the above application scenarios rest on the technical solution provided by the present application, namely merging massive image data resources quickly and accurately into existing data resources, so that the existing image data resources become richer and provide a solid data foundation for various image services.

The tag prediction method described in the present application is explained below with reference to Fig. 3. Fig. 3 is a schematic flowchart of one embodiment of the tag prediction method provided by the present application. Although the present application provides the method operation steps shown in the following embodiments or drawings, the method may include more or fewer operation steps on a routine basis or without creative effort. For steps that have no necessary logical causal relationship, the execution order is not limited to the order given in the embodiments of the present application. In practice, the method can be executed in the order shown in the embodiments or drawings, or in parallel (for example in a parallel-processor or multi-threaded environment).

A specific embodiment of the tag prediction method provided by the present application is shown in Fig. 3. The method may include:

S31: obtaining at least one image data set, where the images in the image data set belong to the same category.

In this embodiment, the at least one image data set can serve as an image data source, and the images in the image data set belong to the same category. The images may include not only still images such as photos and pictures but also dynamic images such as GIFs. In some embodiments, the image data set can be taken from image databases such as the Google Open Images database, MIT scene data and ImageNet data, in which every image has an image tag. An image tag describes the key features of the corresponding image and may take the form of at least one phrase, word or the like; the corresponding image in the image data can be accessed through its image tag.

In this embodiment, because the images in the image data set belong to the same category, they share at least one identical image tag, for example "British Shorthair", "gardenia" or "keyboard". In other embodiments, the images in the image data set may have no image tags, as long as it can be determined that they belong to the same category. For example, if the source information of a group of photos shows that they were taken of the same subject (such as a model or a sunrise) at the same place within a continuous period of time, the group can be treated as images of the same category. In this embodiment, therefore, whether the images in an image data set belong to the same category can be determined not only from image tag information but also from other identifying information; the present application places no restriction here.

It should be noted that the image data set is not limited to the large public databases mentioned above; it may also include image data resources built by individual users. For example, with the user's consent, the user's photo album can be obtained, including the categorized photo collections that the user has created in the album. The present application places no restriction on the source of the image data set.

S32: performing tag prediction on each image in the image data set within a preset range of image tags, generating at least one predicted tag for each image.

In the embodiments of the present application, the at least one image data set can be merged into an existing original image data source. The original image data source can have a preset range of image tags, that is, it can include a number of fixed tags. Merging the image data set, as a new image data source, with the original image data source means selecting, from the preset range of image tags, the image tag that matches the images in the image data set. Fig. 4 is a schematic example of merging a new image data source (that is, the at least one image data set) with the original image data source. As shown in Fig. 4, the original image data source may include image data sets under tags such as "sky", "sea", "shopping mall" and "bar", while the new image data source may include image data sets under tags such as "アニメ", "shopping", "seabeach" and "Pubs". The image tags in the new image data source may thus be expressed in several languages. Judging from the literal meaning, "shopping" and "shopping mall", "sea" and "seabeach", and "bar" and "Pubs" probably belong to the same categories, so the tags of each pair of image data sets could be merged, but there is considerable uncertainty. For example, the image data set under "shopping" may show supermarket shopping scenes, while the image data set under "shopping mall" may show mall scenes of shopping for clothes, shoes, hats and jewellery, so merging the two image data sets would clearly be inappropriate. Merging tags purely on their literal meaning is therefore error-prone.

In this embodiment, the original image data source is built in advance. During construction, the required image tags can be determined beforehand and chosen according to actual business needs. For example, for the business need of classifying a user's album by scene, scene tags such as "office", "residence", "shopping mall", "cinema", "coffee shop", "wedding" and "pleasure-boat" can be set. Once the image tags have been set, images are added under each tag: images related to a preset image tag can be found through a search engine and then cleaned and screened so that they match the tag. Finally, the matched images are marked with the corresponding image tag, which produces the original image data source.

In this embodiment, once the original image data source has been built, deep learning can be performed on it to obtain a relational model between image tags and images. Specifically, the learning may include:

SS1: obtaining multiple image samples with known image tags.

SS2: performing deep learning on the image samples with known image tags to obtain a relational model between image tags and images.

In this embodiment, image samples with known image tags can be obtained from the original image data source, and deep learning can be performed on these samples to obtain the relational model between image tags and images. In some embodiments, a convolutional neural network algorithm can be used to learn from the image samples. Specifically, an initial relational model between image tags and images can be set up, taking an image sample as input data and the image tag of that sample as output data. The relational model contains training parameters, and deep learning is the process of optimizing them. A large number of image samples with known image tags can be obtained from the existing image data source; by repeatedly feeding the image samples and their image tags into the relational model, the training parameters are optimized step by step and the output accuracy of the relational model improves, until the relational model meets a preset requirement. The preset requirement may include, for example, maximizing a preset objective function or reaching a model accuracy no lower than a certain threshold. In other embodiments, an autoencoder algorithm, a sparse autoencoder algorithm, a restricted Boltzmann machine algorithm or a deep belief network algorithm can be used to learn from the image samples; the present application places no restriction on the deep learning method.

The following is a non-limiting illustration of deep learning on image samples with a convolutional neural network algorithm. The main purpose of deep learning is to train the parameter θ. The objective function of deep learning is set to L(θ, D); the parameter θ that maximizes L(θ, D) is the optimal parameter, and with θ at its optimum the relational model between image tags and images is obtained. The expression of the objective function L(θ, D) may include:
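The expression itself is not reproduced in this text. A form consistent with the symbol definitions that follow, assuming the usual log-likelihood summed over all images in D, would be:

\[
L(\theta, D) = \sum_{i=1}^{|D|} \log P\left(Y = y^{(i)} \mid x^{(i)}, \theta\right)
\]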

where L(θ, D) is the likelihood function of the original image data source D under the model with parameter θ; θ is the training parameter the neural network needs to learn; D is the original image data source; i is the index of the i-th image in the original image data source; x⁽ⁱ⁾ is the sample representation of the i-th image, such as its matrix of pixel grey values; y⁽ⁱ⁾ is the image tag of the i-th image; Y is the entire tag set of the original image data source; and P(Y = y⁽ⁱ⁾ | x⁽ⁱ⁾, θ) is a conditional probability, namely the probability that, given the current parameter θ and the input image sample x⁽ⁱ⁾ from the original image data source, the image tag predicted for x⁽ⁱ⁾ is y⁽ⁱ⁾.
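As a rough illustration of how such an objective could be evaluated, the sketch below sums log P(Y = y⁽ⁱ⁾ | x⁽ⁱ⁾, θ) over a tiny data set. It assumes the objective is a log-likelihood and that a model has already produced a probability for each tag; the function name `log_likelihood` and the probability values are made up for the example.

```python
import math


def log_likelihood(predicted_probs, true_tags):
    """Sum of log P(Y = y_i | x_i, theta) over all samples.

    predicted_probs[i] maps each tag to the model's probability for
    sample i; true_tags[i] is the known image tag y_i of sample i.
    """
    return sum(
        math.log(probs[tag]) for probs, tag in zip(predicted_probs, true_tags)
    )


# Two samples with known tags and hypothetical model outputs.
tags = ["sea", "sky"]
good_model = [{"sea": 0.9, "sky": 0.1}, {"sea": 0.2, "sky": 0.8}]
poor_model = [{"sea": 0.5, "sky": 0.5}, {"sea": 0.6, "sky": 0.4}]

good_score = log_likelihood(good_model, tags)
poor_score = log_likelihood(poor_model, tags)
```

The parameters that assign higher probability to the known tags yield the larger value, which is why maximizing the objective drives θ toward a model that reproduces the known image tags.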

In this embodiment, the error back propagation (BP) algorithm can be used to learn the training parameter θ. Based on the Delta learning rule, the BP algorithm uses gradient search to minimize the mean square error between the actual output and the desired output of the network. Neural network learning can be understood as a process of correcting the weights while the error propagates backward; the BP network topology is shown in Fig. 5. The BP algorithm is in essence a problem of minimizing an error function, which can be solved with the steepest descent method from nonlinear programming, modifying the weight coefficients along the negative gradient direction of the error function.

To illustrate the BP algorithm, an error function E is defined first. Taking the sum of squared differences between the desired output and the actual output as the error function gives:
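Assuming the conventional factor of 1/2, which only simplifies the derivative and is not stated in the text, the sum of squared differences described above can be written as:

\[
E = \frac{1}{2} \sum_{k} \left(t_k - o_k\right)^2
\]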

where t_k is the target value of unit k for training sample x⁽ⁱ⁾, which in this embodiment represents the image tag y⁽ⁱ⁾ corresponding to training sample x⁽ⁱ⁾, and o_k is the output value of unit k given training sample x⁽ⁱ⁾.

Gradient descent or a similar method can then be applied to the error function E, modifying the weight coefficients of the network topology shown in Fig. 5 along the negative gradient direction of the error function.
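The weight update can be sketched for the simplest possible case. The code below is an assumption-laden illustration, not the network of Fig. 5: it uses a single layer of linear output units, takes E as half the sum of squared differences (the factor 1/2 is a convention, not from the text), and uses hypothetical names such as `gradient_step` and `learning_rate`.

```python
def forward(weights, x):
    """Compute the output o_k = sum_j w_kj * x_j of each unit k."""
    return [sum(w * xj for w, xj in zip(row, x)) for row in weights]


def squared_error(targets, outputs):
    """E = 1/2 * sum_k (t_k - o_k)^2."""
    return 0.5 * sum((t - o) ** 2 for t, o in zip(targets, outputs))


def gradient_step(weights, x, targets, learning_rate=0.1):
    """Move every weight along the negative gradient of E.

    For a linear unit, dE/dw_kj = -(t_k - o_k) * x_j, so the update is
    w_kj <- w_kj + learning_rate * (t_k - o_k) * x_j (the Delta rule).
    """
    outputs = forward(weights, x)
    return [
        [w + learning_rate * (t - o) * xj for w, xj in zip(row, x)]
        for row, t, o in zip(weights, targets, outputs)
    ]


weights = [[0.0, 0.0], [0.0, 0.0]]  # two output units, two inputs
x = [1.0, 2.0]                      # one training sample
targets = [1.0, 0.0]                # one-hot encoding of its image tag

errors = []
for _ in range(20):
    errors.append(squared_error(targets, forward(weights, x)))
    weights = gradient_step(weights, x, targets)
```

The recorded values in `errors` shrink at every iteration, which is the behaviour the negative-gradient update is meant to produce. A multi-layer network applies the same rule layer by layer, propagating the error terms backward.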

Once the relational model between image tags and images (that is, the training parameter θ) has been obtained by deep learning similar to the method above, the relational model can be used to perform tag prediction on the images in the image data set. After any image is fed into the relational model, the image tag corresponding to that image can be calculated. Therefore, in an embodiment of the present application, performing tag prediction on the multiple images within the preset tag range may include:

performing tag prediction on each image in the image data set using the relational model, generating at least one predicted tag for each image, where the predicted tag is among the image tags of the image samples.

In this embodiment, the relational model can be used to perform tag prediction on each image, generating the predicted tags of the multiple images. An image can have one or more predicted tags. For an image of fireworks at night, for example, the predicted tags may include the two image tags "night" and "fireworks", both of which are among the image tags of the image samples used in deep learning.

In another embodiment of the present application, if the image data set contains a large amount of image data, the image data set can be sampled before tag prediction is performed. On this basis, performing tag prediction on the multiple images within the preset range of image tags may include:

SS-1: sampling the multiple images according to a preset rule.

SS-2: performing tag prediction on each of the sampled images within the preset range of image tags.

In this embodiment, the multiple images can be sampled according to a preset rule, which may apply, for example, when the number of images exceeds a certain threshold. Suppose 2000 images are known to belong to the tag "rock climbing". Performing tag prediction on all 2000 images would require a large amount of computation. Because all 2000 images are known to belong to the image tag "rock climbing", they can be sampled at random, for example by selecting 80 of them for tag prediction, which greatly reduces the tag prediction workload and improves its efficiency.
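The sampling step can be sketched with the standard library. The figures 2000 and 80 come from the example above; the function name `sample_images`, the `threshold` value of 1000 and the image identifiers are assumptions made for illustration.

```python
import random


def sample_images(image_ids, sample_size=80, threshold=1000, seed=None):
    """Return a random subset when the set is larger than `threshold`.

    Smaller sets are returned unchanged, so every image is predicted.
    """
    if len(image_ids) <= threshold:
        return list(image_ids)
    rng = random.Random(seed)  # a fixed seed makes the subset reproducible
    return rng.sample(list(image_ids), sample_size)


# 2000 images known to belong to the tag "rock climbing".
rock_climbing = [f"img_{i:04d}" for i in range(2000)]
subset = sample_images(rock_climbing, seed=0)
```

`random.sample` draws without replacement, so no image is predicted twice; only the 80 sampled images are then passed to the relational model.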

In another embodiment of the present application, the confidence corresponding to each predicted tag can also be calculated, and the confidence can be used to decide whether the image is counted. On this basis, performing tag prediction on the multiple images within the preset range of image tags may include:

SSS1: Tag Estimation is carried out to described multiple images respectively within the scope of preset image tag, generates each figure At least one prediction label of picture；

SSS2: the confidence level of the prediction label is calculated separately.

In this embodiment, label prediction is performed on each image and, in addition, the confidence corresponding to each prediction label can be calculated. The confidence characterizes the degree of match between the prediction label and the image: the better the prediction label matches the image, the higher the confidence value. As shown in Fig. 6, the input image is a night view of fireworks at the Sydney Opera House. Using the relational model provided by this application, multiple prediction labels such as "fireworks", "night scene", "Sydney Opera House", "lake water", and "pleasure boat" can be predicted, but the confidence of each prediction label differs. For example, the confidence of "night scene" is 97, the highest, while the confidence of "pleasure boat" is 63, the lowest. The confidence of each prediction label can then be used to decide whether the image is included when the number of images corresponding to that prediction label is subsequently counted.

S33: counting the number of times each prediction label occurs, obtaining the images corresponding to a prediction label whose count meets a preset condition, and setting the label of the image data set to which those images belong to that prediction label.

In this embodiment, the occurrences of each prediction label can be counted. When a count meets the preset condition, the images corresponding to that prediction label can be obtained, and the label of the image data set to which those images belong can be set to the prediction label. In one embodiment of the application, the preset condition may include, for example, at least one of the following:

The count is greater than a first threshold;

The count, as a proportion of the total number of occurrences of all prediction labels, is greater than a second threshold;

The count ranks within the top third-threshold positions when the prediction labels are ordered by number of occurrences from most to least.

In this embodiment, when the count of a prediction label meets the above condition, the image data set can be merged into the image data set corresponding to that prediction label in the initial image data source.

The first preset condition may be, for example, that the count of the prediction label is greater than the first threshold. Suppose label prediction is performed on the 2000 images belonging to the label "rock climbing", and 1800 of them receive the prediction label "climbing", so that the count of the prediction label "climbing" is 1800. If the first threshold is set to 1750, then because 1800 > 1750, the image data set labelled "rock climbing" can be merged with the image data set labelled "climbing" in the initial image data source.

Under the second preset condition, the count of the prediction label as a proportion of the total number of prediction labels is greater than the second threshold. Suppose again that 1800 of the 2000 images under the label "rock climbing" receive the prediction label "climbing". Assuming each image has only one label, the prediction label "climbing" accounts for 1800/2000 = 90% of all prediction labels. If the second threshold is set to 85%, the image data set corresponding to the label "climbing" can be merged with the image data set corresponding to "rock climbing".

Under the third preset condition, the count ranks within the top third-threshold positions when the prediction labels are ordered by count from most to least. If label prediction is performed on image sets of multiple categories and multiple prediction labels are obtained, the prediction labels can then be sorted by count, for example from most to least. If the third threshold is set to 6, the image data sets corresponding to the six highest-ranked prediction labels can be merged.
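The three preset conditions can be sketched as a single check over the label counts. This is a minimal sketch under stated assumptions: the function name, the choice to require every supplied condition at once, and the counts for "hiking" and "mountain" are hypothetical; the values 1800, 2000, 1750, and 85% come from the example above.

```python
from collections import Counter


def labels_meeting_condition(counts, first_threshold=None,
                             second_threshold=None, third_threshold=None):
    """Return the prediction labels whose counts satisfy the enabled conditions.

    A condition is enabled by passing its threshold. Requiring all enabled
    conditions together is an assumption of this sketch.
    """
    total = sum(counts.values())
    # Labels ordered by number of occurrences, from most to least.
    ranked = [label for label, _ in counts.most_common()]
    selected = []
    for label, n in counts.items():
        if first_threshold is not None and not n > first_threshold:
            continue
        if second_threshold is not None and not (total and n / total > second_threshold):
            continue
        if third_threshold is not None and label not in ranked[:third_threshold]:
            continue
        selected.append(label)
    return selected


# One prediction label per image, 2000 images under "rock climbing".
counts = Counter({"climbing": 1800, "hiking": 150, "mountain": 50})

by_count = labels_meeting_condition(counts, first_threshold=1750)
by_share = labels_meeting_condition(counts, second_threshold=0.85)
by_rank = labels_meeting_condition(counts, third_threshold=2)
print(by_count, by_share, by_rank)
```

With these inputs, the first two conditions select only "climbing", while the rank condition with a third threshold of 2 also admits the second most frequent label.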

In one embodiment of the application, obtaining the images corresponding to the prediction label whose count meets the preset condition, and setting the label of the image data set to which those images belong to the prediction label, may include:

SSS_1: obtaining the images corresponding to the prediction label whose count meets the preset condition;

SSS_2: selecting, from those images, the images that belong to the same image data set and whose number is no less than a fourth threshold;

SSS_3: setting the label of the image data set to which the selected images belong to the prediction label.

In this embodiment, if label prediction is performed on images from image data sets of different categories, and each image may have multiple prediction labels, images from different image data sets may well end up with the same prediction label. For example, image data set A belongs to the category pleasure boat, and the prediction labels of its image 1 include "pleasure boat", "fireworks", and "night scene"; image data set B belongs to the category lake water, and the prediction labels of its image 2 include "lake water", "swan", and "pleasure boat". When the prediction label "pleasure boat" is counted, the overwhelming majority of the corresponding images are found to come from image data set A, although image 2 from image data set B is also included. If the count of the prediction label "pleasure boat" meets the preset condition at this point, the label of image data set A can be set to "pleasure boat", and image 2 from image data set B needs to be excluded. To handle such cases, in which a small number of images from image data sets of other categories fall under a prediction label, the images that belong to the same image data set and number no fewer than the fourth threshold can be selected from the obtained images, and the label of the image data set to which the selected images belong is set to the prediction label, making the label setting more accurate and reliable.
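The screening in SSS_2 and SSS_3 can be illustrated with a per-data-set count. The function name, the image identifiers, and the fourth threshold of 3 are assumptions chosen for this sketch.

```python
from collections import Counter


def datasets_to_label(image_to_dataset, matching_images, fourth_threshold):
    """Return the data sets that contribute at least `fourth_threshold`
    of the images carrying the prediction label."""
    per_dataset = Counter(image_to_dataset[image] for image in matching_images)
    return sorted(ds for ds, n in per_dataset.items() if n >= fourth_threshold)


# Hypothetical ownership map: five images in data set A, one in data set B.
image_to_dataset = {f"A/image{i}": "A" for i in range(1, 6)}
image_to_dataset["B/image2"] = "B"

# All six images received the prediction label "pleasure boat".
matching = list(image_to_dataset)

selected = datasets_to_label(image_to_dataset, matching, fourth_threshold=3)
print(selected)
```

Data set B contributes a single image, which is below the threshold, so only data set A would have its label set to "pleasure boat".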

For the above prediction labels with confidence, in one embodiment of the application, counting the number of images corresponding to each prediction label may include:

SS-A: judging whether the confidence of the prediction label is greater than a preset threshold;

SS-B: if the result of the judgment is yes, determining that the prediction label takes part in the count.

In this embodiment, it can be judged whether the confidence of the prediction label is greater than a preset threshold, and the prediction label is counted only when its confidence is greater than the threshold. Specifically, it can be set, for example, that the prediction label of an image is counted only when the confidence of that prediction label is greater than 80%. For the image shown in Fig. 6, only the three prediction labels "fireworks", "night scene", and "lake water" are counted, while the two prediction labels "Sydney Opera House" and "pleasure boat" are not.
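Steps SS-A and SS-B amount to a filter applied before counting. In the sketch below, the confidences 97 and 63 come from the Fig. 6 description; the other three values are made-up numbers chosen only to be consistent with the stated outcome, and the function name is hypothetical.

```python
from collections import Counter


def count_confident_labels(predictions, min_confidence=80):
    """Count a prediction label only when its confidence exceeds the threshold.

    `predictions` holds one dict per image, mapping label -> confidence.
    """
    counts = Counter()
    for labels in predictions:
        for label, confidence in labels.items():
            if confidence > min_confidence:
                counts[label] += 1
    return counts


fig6_image = {
    "night scene": 97,         # from the text
    "pleasure boat": 63,       # from the text
    "fireworks": 92,           # assumed value
    "lake water": 85,          # assumed value
    "Sydney Opera House": 74,  # assumed value
}

counts = count_confident_labels([fig6_image], min_confidence=80)
print(sorted(counts))
```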

It should be noted that one way of merging two image data sets is to update the label of the image data set to be merged to the prediction label, the prediction label being a label already stored in the library. Fig. 7 is a schematic diagram of merging the image data set whose original label is "Basset Hound" in the new data source Goole image library_01 into the image data set whose prediction label is "BASSET HOUND". Merging image data sets by updating labels allows large-scale image data to be migrated quickly and improves the efficiency of data merging.
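Merging by label update can be sketched as a change to a label index, with no image data being copied. The index layout and the data set identifiers below are assumptions made for illustration; only the two label strings and the data source name come from the description of Fig. 7.

```python
from collections import defaultdict


def relabel(dataset_labels, dataset_id, prediction_label):
    """Return a copy of the label index with one data set's label updated."""
    updated = dict(dataset_labels)
    updated[dataset_id] = prediction_label
    return updated


def group_by_label(dataset_labels):
    """Group data set identifiers under their current label."""
    groups = defaultdict(list)
    for dataset_id, label in dataset_labels.items():
        groups[label].append(dataset_id)
    return dict(groups)


# Hypothetical index: data set identifier -> label.
library = {
    "initial_source/0001": "BASSET HOUND",
    "Goole image library_01/0042": "Basset Hound",
}

merged = relabel(library, "Goole image library_01/0042", "BASSET HOUND")
print(group_by_label(merged))
```

After the update, both data sets are retrieved under the single label "BASSET HOUND", which is why this style of merging stays cheap even for large data sets.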

The label prediction method provided by this application can merge image data sets of the same category into the initial image data source. During data merging, label prediction is first performed on the images in an image data set, and the predicted labels belong to the label range of the initial image data source. Whether to merge the corresponding image data sets is then decided according to the count of each prediction label. The prior art judges whether data sets should be merged according to the literal meaning of the image labels. This embodiment takes the opposite direction and starts from the images corresponding to the labels, which is equivalent to first judging whether the images belong to the same category and then merging the labels of the image data sets. This approach improves the accuracy of image data merging. In addition, merging data by setting labels allows large-scale image data to be migrated quickly and improves the efficiency of merging large-scale data.

The application also proposes another embodiment of the label prediction method. As shown in Fig. 8, the method may include:

S81: obtaining at least one image data set, the images in the image data set belonging to the same category;

S82: performing label prediction on each image in the image data set within the preset image label range, generating at least one prediction label for each image;

S83: counting the number of images corresponding to each prediction label and, when the number meets a preset condition, setting the label of the image data set to which the images belong to the prediction label.

In this embodiment, the specific implementation of S81 and S82 can refer to S31 and S32 and is not repeated here. In this embodiment, the number of images corresponding to each prediction label can also be counted. In this way, once the images have been counted, the label of the image data set to which the images belong can be set directly.

The application also proposes another embodiment of the label prediction method. As shown in Fig. 9, the method may include:

S91: obtaining multiple images belonging to the same category;

S92: performing label prediction on the multiple images using a prediction model, generating at least one prediction label for each image;

S93: counting the number of occurrences of each single prediction label, and taking the prediction labels whose number of occurrences meets a preset condition as the recommended labels of the multiple images.
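Steps S91 to S93 can be sketched end to end with a stand-in prediction model. Everything here is illustrative: `recommend_labels`, the lookup table that plays the role of the prediction model, the image names, and the choice of "share of images greater than 85%" as the preset condition are assumptions.

```python
from collections import Counter


def recommend_labels(images, predict, min_share=0.85):
    """Predict labels per image, count occurrences, and return the labels
    that appear on more than `min_share` of the images."""
    images = list(images)
    counts = Counter()
    for image in images:
        # A label is counted at most once per image.
        counts.update(set(predict(image)))
    return [label for label, n in counts.most_common()
            if images and n / len(images) > min_share]


# Stand-in for a trained prediction model: a fixed lookup table.
fake_predictions = {f"img{i}": ["climbing"] for i in range(9)}
for i in range(4):
    fake_predictions[f"img{i}"] = ["climbing", "rope"]
fake_predictions["img9"] = ["hiking"]

recommended = recommend_labels(fake_predictions, fake_predictions.__getitem__)
print(recommended)
```

In practice the `predict` callable would wrap the trained model; the counting and thresholding logic does not depend on how the labels are produced.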

Fig. 10 is a schematic diagram of the module structure of one embodiment of the label prediction device provided by this application. As shown in Fig. 10, the device includes a processor and a memory for storing processor-executable instructions, and the processor, when executing the instructions, may implement:

The label prediction device provided by this application can merge image data sets of the same category into the initial image data source. During data merging, label prediction is first performed on the images in an image data set, and the predicted labels belong to the label range of the initial image data source. Whether to merge the corresponding image data sets is then decided according to the count of each prediction label. The prior art judges whether data sets should be merged according to the literal meaning of the image labels. This embodiment takes the opposite direction and starts from the images corresponding to the labels, which is equivalent to first judging whether the images belong to the same category and then merging the labels of the image data sets. This approach improves the accuracy of image data merging. In addition, merging data by setting labels allows large-scale image data to be migrated quickly and improves the efficiency of merging large-scale data.

Optionally, in one embodiment of the application, before implementing the step of performing label prediction on each image in the image data set within the preset image label range, the processor may also implement:

Obtaining image samples of multiple known image labels;

Performing deep learning processing on the image samples of the multiple known image labels to obtain a relational model between image labels and images.

Optionally, in one embodiment of the application, when implementing the step of performing deep learning processing on the image samples of the multiple known image labels, the processor may implement:

Setting up a relational model of images and image labels, the relational model being provided with training parameters;

Taking the image samples as the input data of the relational model and the image labels of the image samples as the output data of the relational model, and adjusting the training parameters until the relational model meets a preset requirement.
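The loop "adjust the training parameters until the relational model meets a preset requirement" can be illustrated with a deliberately small stand-in. The sketch below uses a single-layer perceptron on two hand-made features instead of a deep network on image pixels; the feature values, the label encoding, and the stopping rule (training accuracy reaching a target) are all assumptions and are not taken from the application.

```python
def train_until(samples, labels, target_accuracy=1.0, max_epochs=100, lr=0.1):
    """Adjust weights and bias until training accuracy reaches the target."""
    weights = [0.0] * len(samples[0])
    bias = 0.0

    def predict(x):
        score = sum(w * xi for w, xi in zip(weights, x)) + bias
        return 1 if score > 0 else 0

    accuracy = 0.0
    for _ in range(max_epochs):
        for x, y in zip(samples, labels):
            error = y - predict(x)
            if error:
                # Perceptron update: move the boundary toward the sample.
                weights[:] = [w + lr * error * xi for w, xi in zip(weights, x)]
                bias += lr * error
        correct = sum(predict(x) == y for x, y in zip(samples, labels))
        accuracy = correct / len(samples)
        if accuracy >= target_accuracy:
            break  # The "preset requirement" of this sketch is met.
    return weights, bias, accuracy


# Toy inputs: (brightness, fireworks-like texture); label 1 = "night", 0 = other.
samples = [(0.1, 0.9), (0.2, 0.8), (0.9, 0.1), (0.8, 0.3)]
labels = [1, 1, 0, 0]

weights, bias, accuracy = train_until(samples, labels)
print(accuracy)
```

A real implementation would replace the perceptron with a deep model and the accuracy check with whatever preset requirement is chosen, but the structure of the loop (inputs, known labels, parameter updates, stopping test) is the same.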

Optionally, in one embodiment of the application, when implementing the step of performing label prediction on each image in the image data set within the preset image label range, the processor may implement:

Optionally, in one embodiment of the application, the images in the image data set may share at least one identical image label.

Sampling the images in the image data set according to a preset rule;

Performing label prediction on each sampled image within the preset image label range.

Calculating the confidence of each prediction label.

Optionally, in one embodiment of the application, when implementing the step of counting the number of images corresponding to each prediction label, the processor may implement:

Judging whether the confidence of the prediction label is greater than a preset threshold;

If the result of the judgment is yes, determining that the prediction label takes part in the count.

Optionally, in one embodiment of the application, the preset condition may include at least one of the following:

The count is greater than a first threshold;

Optionally, in one embodiment of the application, when implementing the step of obtaining the images corresponding to the prediction label whose count meets the preset condition and setting the label of the image data set to which those images belong to the prediction label, the processor may implement:

Obtaining the images corresponding to the prediction label whose count meets the preset condition;

Selecting, from those images, the images that belong to the same image data set and whose number is no less than a fourth threshold;

Setting the label of the image data set to which the selected images belong to the prediction label.

In another aspect, the application also provides another embodiment of the label prediction device. The device may include a processor and a memory for storing processor-executable instructions, and the processor, when executing the instructions, may implement:

Obtaining multiple images belonging to the same category;

In another aspect, the application also proposes a computer-readable storage medium on which computer instructions are stored, and the instructions, when executed, may implement the following steps:

Counting the occurrences of each prediction label, obtaining the images corresponding to a prediction label whose count meets a preset condition, and setting the label of the image data set to which those images belong to the prediction label.

The computer-readable storage medium may include a physical device for storing information, in which the information is usually digitized and then stored on a medium by electrical, magnetic, optical, or similar means. The computer-readable storage medium described in this embodiment may include: devices that store information using electrical energy, such as various memories, for example RAM and ROM; devices that store information using magnetic energy, such as hard disks, floppy disks, magnetic tape, magnetic core memory, magnetic bubble memory, and USB flash drives; and devices that store information optically, such as CDs or DVDs. Of course, there are also other kinds of readable storage media, such as quantum memory and graphene memory.

Although this application refers to descriptions of data learning and processing such as the deep learning method, label prediction, and data statistics in the embodiments, the application is not limited to cases that fully comply with industry programming language design standards or with the data presentation and processing described in the embodiments. Implementations slightly modified on the basis of certain page design languages or of the embodiment descriptions can also achieve effects that are identical, equivalent, or similar to those of the above embodiments, or effects that are foreseeable after such modification. Of course, even if the above data processing approach is not used, the same application can still be realized as long as it conforms to the data learning and processing described in the above embodiments, which is not repeated here.

Although this application provides method operation steps as described in the embodiments or flowcharts, more or fewer operation steps may be included based on conventional or non-inventive means. The order of steps listed in the embodiments is only one of many possible execution orders and does not represent the only execution order. When an actual device or client product executes the method, the steps may be executed sequentially or in parallel according to the embodiments or the order shown in the drawings (for example, in a parallel-processor or multithreaded environment).

Those skilled in the art also know that, in addition to implementing a controller purely as computer-readable program code, it is entirely possible to logically program the method steps so that the controller achieves the same functions in the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded microcontrollers, and the like. Such a controller can therefore be regarded as a hardware component, and the devices included in it for realizing various functions can also be regarded as structures within the hardware component. Indeed, the devices for realizing various functions can be regarded both as software modules implementing the method and as structures within the hardware component.

The application can be described in the general context of computer-executable instructions executed by a computer, such as program modules. Generally, program modules include routines, programs, objects, components, data structures, classes, and the like that perform specific tasks or implement specific abstract data types. The application can also be practiced in distributed computing environments, in which tasks are performed by remote processing devices connected through a communication network. In a distributed computing environment, program modules may be located in both local and remote computer storage media, including storage devices.

As can be seen from the above description of the embodiments, those skilled in the art can clearly understand that the application can be implemented by means of software plus a necessary general-purpose hardware platform. Based on this understanding, the technical solution of the application, in essence or in the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product can be stored in a storage medium, such as ROM/RAM, a magnetic disk, or an optical disc, and includes a number of instructions that cause a computer device (which may be a personal computer, a mobile terminal, a server, a network device, or the like) to execute the methods described in the embodiments of the application or in certain parts of the embodiments.

The embodiments in this specification are described in a progressive manner; for the same or similar parts, the embodiments may refer to one another, and each embodiment focuses on its differences from the others. The application can be used in numerous general-purpose or special-purpose computing system environments or configurations, for example: personal computers, server computers, handheld or portable devices, tablet devices, multiprocessor systems, microprocessor-based systems, set-top boxes, programmable electronic devices, network PCs, minicomputers, mainframe computers, and distributed computing environments including any of the above systems or devices.

Although the application has been described by way of embodiments, those of ordinary skill in the art will appreciate that the application has many variations and changes that do not depart from its spirit, and it is intended that the appended claims cover these variations and changes without departing from the spirit of the application.

Claims

1. A label prediction method, characterized by comprising:

Performing label prediction on each image in the image data set within a preset image label range, generating at least one prediction label for each image;

Counting the number of times each prediction label occurs, obtaining the images corresponding to a prediction label whose count meets a preset condition, and setting the label of the image data set to which those images belong to the prediction label.

2. The method according to claim 1, characterized in that, before label prediction is performed on each image in the image data set within the preset image label range, the method further comprises:

Obtaining image samples of multiple known image labels;

Performing deep learning processing on the image samples of the multiple known image labels to obtain a relational model between image labels and images.

3. The method according to claim 2, characterized in that performing deep learning processing on the image samples of the multiple known image labels comprises:

Taking the image samples as the input data of the relational model and the image labels of the image samples as the output data of the relational model, and adjusting the training parameters until the relational model meets a preset requirement.

4. The method according to claim 2 or 3, characterized in that performing label prediction on each image in the image data set within the preset image label range comprises:

Performing label prediction on each image in the image data set using the relational model, generating at least one prediction label for each image, wherein the prediction label is contained in the image labels of the image samples.

5. The method according to claim 1, characterized in that the images in the image data set share at least one identical image label.

6. The method according to claim 1, characterized in that performing label prediction on each image in the image data set within the preset image label range comprises:

Sampling the images in the image data set according to a preset rule;

7. The method according to claim 1, characterized in that performing label prediction on each image in the image data set within the preset image label range comprises:

Calculating the confidence of each prediction label.

8. The method according to claim 7, characterized in that counting the number of times each prediction label occurs comprises:

9. The method according to claim 1, characterized in that the preset condition includes at least one of the following:

The count is greater than a first threshold;

10. The method according to claim 1, characterized in that obtaining the images corresponding to the prediction label whose count meets the preset condition, and setting the label of the image data set to which those images belong to the prediction label, comprises:

11. A label prediction method, characterized by comprising:

Counting the number of images corresponding to each prediction label and, when the number meets a preset condition, setting the label of the image data set to which the images belong to the prediction label.

12. A label prediction method, characterized by comprising:

Obtaining multiple images belonging to the same category;

Performing label prediction on the multiple images using a prediction model, generating at least one prediction label for each image;

Counting the number of occurrences of each single prediction label, and taking the prediction labels whose number of occurrences meets a preset condition as the recommended labels of the multiple images.

13. A label prediction device, characterized by comprising a processor and a memory for storing processor-executable instructions, the processor implementing, when executing the instructions:

14. The device according to claim 13, characterized in that, before implementing the step of performing label prediction on each image in the image data set within the preset image label range, the processor also implements:

Obtaining image samples of multiple known image labels;

15. The device according to claim 14, characterized in that, when implementing the step of performing deep learning processing on the image samples of the multiple known image labels, the processor implements:

16. The device according to claim 14 or 15, characterized in that, when implementing the step of performing label prediction on each image in the image data set within the preset image label range, the processor implements:

17. The device according to claim 13, characterized in that the images in the image data set share at least one identical image label.

18. The device according to claim 13, characterized in that, when implementing the step of performing label prediction on each image in the image data set within the preset image label range, the processor implements:

Sampling the images in the image data set according to a preset rule;

19. The device according to claim 13, characterized in that, when implementing the step of performing label prediction on each image in the image data set within the preset image label range, the processor implements:

Calculating the confidence of each prediction label.

20. The device according to claim 19, characterized in that, when implementing the step of counting the number of times each prediction label occurs, the processor implements:

21. The device according to claim 13, characterized in that the preset condition includes at least one of the following:

The count is greater than a first threshold;

22. The device according to claim 13, characterized in that, when implementing the step of obtaining the images corresponding to the prediction label whose count meets the preset condition and setting the label of the image data set to which those images belong to the prediction label, the processor implements:

23. A label prediction device, characterized by comprising a processor and a memory for storing processor-executable instructions, the processor implementing, when executing the instructions:

24. A label prediction device, characterized by comprising a processor and a memory for storing processor-executable instructions, the processor implementing, when executing the instructions:

Obtaining multiple images belonging to the same category;

25. A computer-readable storage medium on which computer instructions are stored, characterized in that the instructions, when executed, implement the following steps: