CN114461986B - Method for training a mark recognition model, and method and device for image recognition

Info

Publication number: CN114461986B
Application number: CN202210051062.XA
Authority: CN
Inventors: 阚野; 胡瑞华; 崔国珍
Original assignee: Beijing Shareit Information Technology Co Ltd
Current assignee: Beijing Shareit Information Technology Co Ltd
Priority date: 2022-01-17
Filing date: 2022-01-17
Publication date: 2023-04-07
Anticipated expiration: 2042-01-17
Also published as: CN114461986A

Abstract

The disclosure relates to a method for training a mark recognition model, and to a method and a device for image recognition. The training method comprises the following steps: acquiring a background picture and a foreground picture, wherein the background picture is an initial sample picture and the foreground picture is a picture of the mark to be recognized; synthesizing the background picture and the foreground picture to generate a calibration sample; adding a corresponding label to the calibration sample, wherein the label represents the category of the mark to be recognized and its position in the picture; and inputting the labeled calibration sample and the background picture into a recognition model to be trained to obtain a target recognition model, wherein the target recognition model is used for recognizing whether an input picture contains the mark to be recognized. The disclosure can thereby improve LOGO detection accuracy.

Description

Method for training a mark recognition model, and method and device for image recognition

Technical Field

The present disclosure relates to, but is not limited to, the field of internet technologies, and in particular to a method for training a mark recognition model, and a method and an apparatus for image recognition.

Background

At present, copyright requirements for image content on the internet are becoming stricter. Content uploaded by users of a User Generated Content (UGC) platform may infringe copyright by containing a third-party identifier (LOGO). To avoid this problem, an Artificial Intelligence (AI) model or a human auditor is generally used to review image and video content and to delete content that contains a third-party LOGO.

At present, LOGO detection in pictures and videos mainly relies on methods such as manual identification, pattern recognition, and machine learning.

However, LOGO detection in the related art is constrained by its requirements on samples and on their quantity, which results in low LOGO detection accuracy, and no effective solution has been proposed.

Disclosure of Invention

The disclosure provides a method for training a mark recognition model, and a method and a device for image recognition, to solve the technical problem in the related art that LOGO detection accuracy is low because detection is constrained by requirements on samples and on their quantity.

According to a first aspect of the embodiments of the present disclosure, there is provided a method for training a mark recognition model, including: acquiring a background picture and a foreground picture, wherein the background picture is an initial sample picture and the foreground picture is a picture of the mark to be recognized; synthesizing the background picture and the foreground picture to generate a calibration sample; adding a corresponding label to the calibration sample, wherein the label represents the category of the mark to be recognized and its position in the picture; and inputting the labeled calibration sample and the background picture into a recognition model to be trained to obtain a target recognition model, wherein the target recognition model is used for recognizing whether an input picture contains the mark to be recognized.

In the above scheme, acquiring the background picture includes: acquiring a plurality of pictures through a preset channel as initial sample pictures, wherein the plurality of pictures include at least one mark to be recognized; generating a source sample set from the initial sample pictures; and processing all pictures in the source sample set through image data enhancement to generate the background picture.

In the above scheme, the background picture includes an original picture and a negative sample picture, wherein the negative sample picture is a picture with no category and no coordinates.

In the above scheme, acquiring the foreground picture includes: acquiring a picture set containing the mark to be recognized, and taking the picture set as an atlas of marks to be recognized; acquiring, from the atlas, pictures in which the mark to be recognized has a preset transparency; and taking any one of those pictures as the foreground picture.

In the above scheme, taking the picture set as the atlas of marks to be recognized includes: classifying the picture set according to a preset condition to obtain at least one atlas of marks to be recognized, wherein the preset condition is the brand corresponding to the mark to be recognized.

In the above scheme, the atlas of marks to be recognized includes a static mark picture and/or a dynamic mark picture.

In the above scheme, taking any one of the pictures with the preset transparency as the foreground picture includes: in a case where the atlas includes a dynamic mark picture, acquiring a picture of the form of the mark to be recognized in each frame; and acquiring, from the picture of each frame's form, a picture in which the mark to be recognized has the preset transparency, to generate the foreground picture.

In the above scheme, adding the corresponding label to the calibration sample includes: obtaining the category of the mark to be identified in the foreground image; obtaining the position of a mark to be identified in a calibration sample; and adding corresponding labels to the calibration samples according to the categories and the positions to obtain the calibration samples added with the category labels and the position labels.

In the above scheme, inputting the labeled calibration sample and the background picture into the recognition model to be trained to obtain the target recognition model includes: training the recognition model to be trained with the labeled calibration sample and the background picture, and identifying background pictures that contain the mark to be recognized; deleting the background pictures that contain the mark to be recognized; and inputting the labeled calibration sample and the remaining background pictures into the recognition model to be trained for training until the recognition model to be trained converges, to obtain the target recognition model.

In the above scheme, training the recognition model to be trained with the labeled calibration sample and the background picture includes: inputting the labeled calibration sample and the background picture into the recognition model to be trained for training to obtain initial model parameters; and recognizing the background picture with the recognition model to be trained and the initial model parameters to obtain a recognition result, wherein the recognition result indicates whether the background picture contains the mark to be recognized.

According to a second aspect of the embodiments of the present disclosure, there is provided an image recognition method applied to the above method for training a mark recognition model, including: acquiring a picture to be recognized; analyzing the picture to be recognized through the target recognition model, and judging whether the picture to be recognized contains a preset label; if the judgment result is yes, determining that the picture to be recognized includes the target mark; and if the judgment result is no, outputting a detection result indicating recognition failure.

In the above scheme, analyzing the picture to be recognized through the target recognition model and judging whether the picture to be recognized contains the preset label includes: in a case where the picture to be recognized contains a dynamic mark, analyzing, through the target recognition model, whether the preset label contained in the picture to be recognized covers the form of the mark to be recognized in each frame.

In the above scheme, after outputting the detection result of the identification failure, the method further includes: and adding a label to the picture to be recognized, and correspondingly adjusting the target recognition model to obtain the optimized target recognition model.

According to a third aspect of the embodiments of the present disclosure, there is provided an apparatus for training a mark recognition model, including: an acquisition module, configured to acquire a background picture and a foreground picture, wherein the background picture is an initial sample picture and the foreground picture is a picture of the mark to be recognized; a sample generation module, configured to synthesize the background picture and the foreground picture to generate a calibration sample; a label adding module, configured to add a corresponding label to the calibration sample, wherein the label represents the category of the mark to be recognized and its position in the picture; and a model training module, configured to input the labeled calibration sample and the background picture into a recognition model to be trained to obtain a target recognition model, wherein the target recognition model is used for recognizing whether an input picture contains the mark to be recognized.

According to a fourth aspect of the embodiments of the present disclosure, there is provided an image recognition apparatus including: the acquisition module is used for acquiring a picture to be identified; the identification module is used for analyzing the picture to be identified through the target identification model and judging whether the picture to be identified contains a preset label or not; the first judgment module is used for identifying that the picture to be identified comprises the target identification under the condition that the judgment result is yes; and the second judging module is used for outputting the detection result of the identification failure under the condition that the judgment result is negative.

The technical scheme provided by the disclosure can comprise the following beneficial effects:

In the disclosure, a background picture and a foreground picture are acquired, wherein the background picture is an initial sample picture and the foreground picture is a picture of the mark to be recognized; the background picture and the foreground picture are synthesized to generate a calibration sample; a corresponding label is added to the calibration sample, wherein the label represents the category of the mark to be recognized and its position in the picture; and the labeled calibration sample and the background picture are input into a recognition model to be trained to obtain a target recognition model, wherein the target recognition model is used for recognizing whether an input picture contains the mark to be recognized. LOGO detection accuracy is thereby improved.

It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the disclosure.

FIG. 1 is a flowchart of a method for training a mark recognition model according to an exemplary embodiment;

FIG. 2a is a schematic diagram of an original picture in a method for training a mark recognition model according to an exemplary embodiment;

FIG. 2b is a schematic diagram of a negative sample picture in a method for training a mark recognition model according to an exemplary embodiment;

FIG. 2c is a schematic diagram of a negative sample picture in another method for training a mark recognition model according to an exemplary embodiment;

FIG. 3 is a schematic diagram of a picture with a preset transparency of a mark to be recognized in a method for training a mark recognition model according to an exemplary embodiment;

FIG. 4a is a schematic diagram of a calibration sample in a method for training a mark recognition model according to an exemplary embodiment;

FIG. 4b is a schematic diagram of a calibration sample in another method for training a mark recognition model according to an exemplary embodiment;

FIG. 5a is a schematic diagram of a machine learning model training process in a method for training a mark recognition model according to an exemplary embodiment;

FIG. 5b is a schematic diagram of a machine learning model detection process in another method for training a mark recognition model according to an exemplary embodiment;

FIG. 6 is a flowchart of another method for training a mark recognition model according to an exemplary embodiment;

FIG. 7 is a flowchart of an image recognition method according to an exemplary embodiment;

FIG. 8 is a schematic diagram of an apparatus for training a mark recognition model according to an exemplary embodiment;

FIG. 9 is a schematic diagram of an image recognition apparatus according to an exemplary embodiment.

Detailed Description

The embodiments of the present application are described below with reference to the drawings. In the following description, reference is made to the accompanying drawings which form a part hereof and which show by way of illustration specific aspects of embodiments of the application or which may be used in the practice of the embodiments of the application. It should be understood that embodiments of the present application may be used in other ways and may involve structural or logical changes not depicted in the drawings. The following detailed description is, therefore, not to be taken in a limiting sense, and the scope of the present application is defined by the appended claims. For example, it should be understood that the disclosure in connection with the described methods may equally apply to the corresponding apparatus or system for performing the described methods, and vice versa. For example, if one or more particular method steps are described, the corresponding apparatus may comprise one or more units, such as functional units, to perform the described one or more method steps (e.g., a unit performs one or more steps, or multiple units, each of which performs one or more of the multiple steps), even if such one or more units are not explicitly described or illustrated in the figures. On the other hand, for example, if a particular apparatus is described based on one or more units, such as functional units, the corresponding method may comprise one step to perform the functionality of the one or more units (e.g., one step performs the functionality of the one or more units, or multiple steps, each of which performs the functionality of one or more of the plurality of units), even if such one or more steps are not explicitly described or illustrated in the figures. Further, it is to be understood that features of the various exemplary embodiments and/or aspects described herein may be combined with each other, unless explicitly stated otherwise.

Example 1

According to a first aspect of the embodiments of the present disclosure, there is provided a method for training a mark recognition model. Fig. 1 shows a flowchart of the method according to an exemplary embodiment. As shown in Fig. 1, the method includes:

Step S102, acquiring a background picture and a foreground picture, wherein the background picture is an initial sample picture and the foreground picture is a picture of the mark to be recognized;

The background picture and the foreground picture in the embodiment of the present disclosure are used to generate the calibration sample in step S104; the calibration sample and the background picture are then input together into the recognition model to be trained for training.

The background picture and the foreground picture are obtained in the embodiment of the present disclosure as follows:

In the above scheme, acquiring the background picture in step S102 includes: acquiring a plurality of pictures through a preset channel as initial sample pictures, wherein the plurality of pictures include at least one mark to be recognized; generating a source sample set from the initial sample pictures; and processing all pictures in the source sample set through image data enhancement to generate the background picture.

In the above scheme, the background picture includes an original picture and a negative sample picture, wherein the negative sample picture is a picture with no category and no coordinates.

Specifically, in the embodiment of the present disclosure, the preset channel may include: a content platform, the internet, or a gallery purchased from a third party.

Acquiring a plurality of pictures through the preset channel as initial sample pictures may proceed as follows: a plurality of pictures are obtained from a content platform, the internet, or a gallery purchased from a third party and used as initial sample pictures. The set formed by these initial sample pictures is taken as the original sample set (namely, the source sample set in the embodiment of the disclosure) and denoted as So. None of the pictures in So is manually screened, so So may contain the mark to be recognized. In the embodiment of the disclosure, the mark to be recognized may be referred to as the LOGO to be recognized.

After So is obtained, in order to obtain more background pictures, all pictures in So may be processed with an image data enhancement technique to generate background pictures. In the embodiment of the present disclosure, the image data enhancement modes may include, but are not limited to, the following:

Mode 1: random scaling: changing the resolution and the image scale;

Mode 2: random position cropping: cutting a part out of the original picture;

Mode 3: random horizontal/vertical flipping: flipping the original picture horizontally or vertically;

Mode 4: random angle rotation: rotating the original picture by a random angle;

Mode 5: variation of chroma, brightness, saturation, and contrast: modifying the chroma, brightness, saturation, and contrast of the original picture;

Mode 6: random graying: converting the image into a grayscale image with a certain probability;

Mode 7: edge filling: randomly padding the edges of the image with pure black;

Mode 8: random masking: randomly selecting an area to mask, where pure black, a frosted-glass effect, a mosaic, or the like may be used.

By randomly selecting any one or a combination of Modes 1 to 8 above, an essentially unlimited number of background pictures can be generated, yielding the image set Sa (i.e., the set of background pictures).
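As a rough illustration of how a few of these enhancement modes could be combined, the following minimal sketch uses only the Python standard library and represents a picture as a list of rows of (r, g, b) tuples. The function names, the probabilities, and the in-memory picture format are assumptions made for this example and are not specified in the disclosure; only flipping, cropping, graying, and masking are shown.

```python
import random

# A picture is a list of rows; each row is a list of (r, g, b) tuples.

def flip_horizontal(img):
    """Horizontal flip: mirror each row left to right."""
    return [row[::-1] for row in img]

def random_crop(img, rng):
    """Random position cropping: cut a random sub-rectangle out of the picture."""
    h, w = len(img), len(img[0])
    ch, cw = rng.randint(1, h), rng.randint(1, w)
    top, left = rng.randint(0, h - ch), rng.randint(0, w - cw)
    return [row[left:left + cw] for row in img[top:top + ch]]

def to_gray(img):
    """Graying: replace each pixel with its average intensity."""
    return [[(sum(p) // 3,) * 3 for p in row] for row in img]

def random_mask(img, rng):
    """Random masking: cover a random rectangle with pure black."""
    h, w = len(img), len(img[0])
    mh, mw = rng.randint(1, max(1, h // 2)), rng.randint(1, max(1, w // 2))
    top, left = rng.randint(0, h - mh), rng.randint(0, w - mw)
    out = [list(row) for row in img]
    for y in range(top, top + mh):
        for x in range(left, left + mw):
            out[y][x] = (0, 0, 0)
    return out

def augment(img, rng, gray_prob=0.2):
    """Apply a random combination of the modes above to one picture from So."""
    if rng.random() < 0.5:
        img = flip_horizontal(img)
    if rng.random() < 0.5:
        img = random_crop(img, rng)
    if rng.random() < gray_prob:
        img = to_gray(img)
    if rng.random() < 0.5:
        img = random_mask(img, rng)
    return img

def build_background_set(source_set, n, seed=0):
    """Generate n background pictures (the set Sa) from the source set So."""
    rng = random.Random(seed)
    return [augment(rng.choice(source_set), rng) for _ in range(n)]
```

Because each call draws a fresh random combination, the size of Sa is limited only by the value of `n`, which is the sense in which the number of background pictures is essentially unlimited.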

It should be noted that, in the embodiment of the present disclosure, the background picture may include an original picture and negative sample pictures, as shown in Figs. 2a to 2c: Fig. 2a shows a schematic diagram of an original picture in a method for training a mark recognition model according to an exemplary embodiment, Fig. 2b shows a schematic diagram of a negative sample picture in a method for training a mark recognition model according to an exemplary embodiment, and Fig. 2c shows a schematic diagram of a negative sample picture in another method for training a mark recognition model according to an exemplary embodiment.

In the above scheme, acquiring the foreground picture in step S102 includes: acquiring a picture set containing the mark to be recognized, and taking the picture set as an atlas of marks to be recognized; acquiring, from the atlas, pictures in which the mark to be recognized has a preset transparency; and taking any one of those pictures as the foreground picture.

Specifically, Fig. 3 is a schematic diagram of pictures with a preset transparency of the mark to be recognized in a method for training a mark recognition model according to an exemplary embodiment. As shown in Fig. 3, the mark to be recognized may be any one of the LOGOs in Fig. 3, and the preset transparency may be the semi-transparency shown in Fig. 3. The atlas of marks to be recognized (denoted as L) may thus be a set of pictures containing the mark to be recognized. Semi-transparent pictures carrying the mark to be recognized are obtained from this set and classified according to the brand corresponding to the mark, giving at least one atlas of marks to be recognized, denoted as Li, where i is the brand name.
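The classification of the atlas L into per-brand atlases Li can be pictured as a simple grouping step. The sketch below is only an illustration: the function name and the file names are hypothetical, and a picture is represented by a placeholder string.

```python
from collections import defaultdict

def group_by_brand(logo_pictures):
    """Split the atlas L into per-brand atlases Li.

    logo_pictures: iterable of (brand, picture) pairs, where picture is any
    placeholder for a semi-transparent LOGO picture.
    """
    atlases = defaultdict(list)
    for brand, picture in logo_pictures:
        atlases[brand].append(picture)
    return dict(atlases)

# Hypothetical file names, used only to show the resulting structure.
L = [
    ("cocofun", "cocofun_full.png"),
    ("tiktok", "tiktok_full.png"),
    ("cocofun", "cocofun_small.png"),
]
Li = group_by_brand(L)
```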

Specifically, to speed up model execution, videos are usually checked by frame extraction (for example, extracting a few dozen frames from a video lasting tens of minutes). However, the LOGO in many videos is an animated LOGO (i.e., a dynamic mark picture in the embodiment of the present disclosure): the LOGO unfolds only at a certain moment, is shown in complete form for only about 1 second, and then disappears as the animation ends. Frames containing the complete LOGO account for less than 1% of the total number of frames in the video, so they are easily missed by frame extraction. To solve this problem, the incomplete forms of each LOGO animation at different moments are added to the set Li, so that the model can learn the different states of a LOGO animation, which can greatly improve the detection rate during frame-extraction recognition.
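A back-of-the-envelope calculation shows why learning the partial forms helps. The sketch below assumes that frames are sampled independently and uniformly, which simplifies real frame extraction; the figure of 30 sampled frames and the 5% share of frames showing any LOGO form are hypothetical values chosen for illustration, while the 1% share comes from the text above.

```python
def hit_probability(matching_fraction, sampled_frames):
    """Chance that at least one sampled frame shows a form the model knows.

    Assumes independent, uniform sampling of frames (a simplification).
    """
    return 1.0 - (1.0 - matching_fraction) ** sampled_frames

# Model trained on the complete LOGO only: about 1% of frames match.
full_only = hit_probability(0.01, 30)

# Hypothetical: partial forms extend the recognizable span to 5% of frames.
with_partial = hit_probability(0.05, 30)

print(f"complete LOGO only: {full_only:.2f}")
print(f"with partial forms: {with_partial:.2f}")
```

Under these assumptions the chance of sampling a recognizable frame rises from roughly one in four to roughly three in four.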

The embodiment of the disclosure generates samples according to different forms of the animation LOGO, improves the detection capability of the model, and optimizes the video detection effect.

It should be noted that the embodiment of the present disclosure is described using the above example only for illustration, and the method for training a mark recognition model provided by the embodiment of the present disclosure is not specifically limited thereto.

Step S104, synthesizing the background picture and the foreground picture to generate a calibration sample;

Specifically, a set Li is randomly selected from the atlas L of marks to be recognized in step S102, and a semi-transparent LOGO picture is randomly selected from Li as the foreground picture. A picture is randomly selected from the image set Sa (i.e., the set of background pictures) in step S102 as the background picture. The foreground picture is randomly scaled, a position on the background picture is randomly selected, and the foreground and background pictures are synthesized, that is, the semi-transparent foreground picture is pasted onto the background picture to obtain a calibration sample.
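The synthesis step can be sketched as alpha blending of a randomly scaled foreground onto a random position of the background. The sketch below is a minimal standard-library illustration rather than the disclosed implementation: the function names, the scale range, nearest-neighbour resizing, and the list-of-tuples picture format (RGB for the background, RGBA with alpha in 0 to 255 for the foreground) are all assumptions.

```python
import random

def scale_nearest(img, new_w, new_h):
    """Resize by nearest-neighbour sampling (a simple stand-in for real resizing)."""
    h, w = len(img), len(img[0])
    return [[img[y * h // new_h][x * w // new_w] for x in range(new_w)]
            for y in range(new_h)]

def composite(background, foreground, rng, min_scale=0.5, max_scale=1.0):
    """Paste a semi-transparent RGBA foreground onto an RGB background.

    Returns (calibration_sample, [x, y, width, height]), where the box is the
    paste position of the LOGO in the synthesized picture.
    """
    bh, bw = len(background), len(background[0])
    fh, fw = len(foreground), len(foreground[0])
    s = rng.uniform(min_scale, max_scale)
    # Keep the scaled LOGO inside the background.
    new_w = max(1, min(bw, int(fw * s)))
    new_h = max(1, min(bh, int(fh * s)))
    logo = scale_nearest(foreground, new_w, new_h)
    x0 = rng.randint(0, bw - new_w)
    y0 = rng.randint(0, bh - new_h)
    out = [list(row) for row in background]  # do not modify the input
    for dy in range(new_h):
        for dx in range(new_w):
            r, g, b, a = logo[dy][dx]
            alpha = a / 255.0
            br, bg, bb = out[y0 + dy][x0 + dx]
            out[y0 + dy][x0 + dx] = (
                round(alpha * r + (1 - alpha) * br),
                round(alpha * g + (1 - alpha) * bg),
                round(alpha * b + (1 - alpha) * bb),
            )
    return out, [x0, y0, new_w, new_h]
```

Returning the paste box together with the picture means the position label is known at synthesis time, so no manual annotation is needed.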

Step S106, adding corresponding labels to the calibration samples, wherein the labels are used for representing the types of the marks to be identified and the positions of the marks to be identified in the pictures;

In the above scheme, adding a corresponding label to the calibration sample in step S106 includes: acquiring the category of the mark to be recognized in the foreground picture; acquiring the position of the mark to be recognized in the calibration sample; and adding corresponding labels to the calibration sample according to the category and the position, to obtain a calibration sample with a category label and a position label.

The format of the tag in the embodiment of the present disclosure may be:

[ Category, LOGO coordinate position [ LOGO top left abscissa, LOGO top left ordinate, LOGO width, LOGO height ] ]

The labeled calibration samples are shown in Figs. 4a and 4b. Fig. 4a is a schematic diagram of a calibration sample in a method for training a mark recognition model according to an exemplary embodiment; as shown in Fig. 4a, the label of calibration sample 1 can be expressed as: category cocofun, coordinate position [288,210,118,34];

Fig. 4b is a schematic diagram of a calibration sample in another method for training a mark recognition model according to an exemplary embodiment; as shown in Fig. 4b, the label of calibration sample 2 can be expressed as: category tiktok, coordinate position [118,78,70,20].
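The label format above maps naturally onto a nested list. The following sketch builds the two example labels from Figs. 4a and 4b; the helper names `make_label` and `box_corners` are hypothetical, and the corner conversion is included only because many detection tools expect corner coordinates rather than width and height.

```python
def make_label(category, x, y, width, height):
    """Build a label in the form [category, [x, y, width, height]].

    (x, y) is the top-left corner of the LOGO in the calibration sample.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    return [category, [x, y, width, height]]

def box_corners(label):
    """Convert the [x, y, width, height] box to (left, top, right, bottom)."""
    _, (x, y, w, h) = label
    return (x, y, x + w, y + h)

label_1 = make_label("cocofun", 288, 210, 118, 34)  # calibration sample 1, Fig. 4a
label_2 = make_label("tiktok", 118, 78, 70, 20)     # calibration sample 2, Fig. 4b
```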

In the embodiment of the present disclosure, the synthesized picture is a calibration sample of the LOGO picture, the name of the LOGO is the category label of the sample, and the paste position selected during synthesis is the position label of the sample. The generated sample data is put into the set Sp (i.e., the set of calibration samples).

The method for training a mark recognition model provided by the embodiment of the present disclosure automatically generates calibration samples by fusing semi-transparent brand LOGO pictures with ordinary pictures.

Step S108, inputting the labeled calibration sample and the background picture into a recognition model to be trained for training to obtain a target recognition model, wherein the target recognition model is used for recognizing whether an input picture contains the mark to be recognized.

The recognition model to be trained may include any machine learning model supporting image classification or image detection, including but not limited to KNN, SVM, DNN, etc.

Based on the background pictures from step S102 and the labeled calibration samples from step S106, that is, based on the set Sp and the set Sa (used as negative samples), the machine learning model is trained and the model parameters of the target recognition model are saved. The target recognition model is then tested to check whether its metrics meet the requirements.
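One way to picture the training input is a single list of records in which calibration samples from Sp carry their label and background pictures from Sa carry none, matching the description of negative samples as pictures with no category and no coordinates. The record layout and function name below are assumptions for illustration; the disclosure does not prescribe a data format.

```python
def build_training_set(calibration_samples, background_pictures):
    """Merge Sp (positive samples) and Sa (negative samples) into one list.

    calibration_samples: list of (picture, [category, [x, y, w, h]]) pairs.
    background_pictures: list of pictures with no category and no coordinates.
    """
    records = [{"picture": pic, "labels": [label]}
               for pic, label in calibration_samples]
    # Negative samples carry an empty label list.
    records += [{"picture": pic, "labels": []} for pic in background_pictures]
    return records

# Placeholder strings stand in for picture data.
sp = [("sample_1", ["cocofun", [288, 210, 118, 34]])]
sa = ["background_1", "background_2"]
training_set = build_training_set(sp, sa)
```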

In the embodiment of the disclosure, the labeled calibration sample and the background picture are input into the recognition model to be trained for training, and the target recognition model is obtained as follows:

In the above scheme, inputting the labeled calibration sample and the background picture into the recognition model to be trained to obtain the target recognition model in step S108 includes: training the recognition model to be trained with the labeled calibration sample and the background picture, and identifying background pictures that contain the mark to be recognized; deleting the background pictures that contain the mark to be recognized; and inputting the labeled calibration sample and the remaining background pictures into the recognition model to be trained for training until the recognition model to be trained converges, to obtain the target recognition model.

Specifically, because the pictures in the set So are not manually screened, the set Sa may contain the LOGO to be detected, and training on such pictures as negative samples degrades the model. In this case, the recognition model to be trained can first be trained and then used to detect the pictures in Sa; the pictures found to contain the LOGO are deleted from Sa. Steps S102 to S108 are repeated until the recognition model to be trained converges, giving the target recognition model.
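The alternation between training and purifying Sa can be sketched as a loop. In the sketch below, `train_fn` and `detect_fn` are hypothetical callables standing in for the training routine and the trained model's detection; stopping when a round removes nothing is used as a simple stand-in for the convergence test, which the text does not define in detail. The sketch also only retrains on the purified Sa, whereas the text repeats steps S102 to S108 in full.

```python
def purify_and_train(train_fn, detect_fn, sp, sa, max_rounds=10):
    """Alternate training and purification of the negative set Sa.

    train_fn(sp, sa) -> model            (hypothetical training routine)
    detect_fn(model, picture) -> bool    (True if a LOGO is detected)
    Returns the last trained model and the purified copy of Sa.
    """
    sa = list(sa)  # work on a copy so the caller's list is unchanged
    model = None
    for _ in range(max_rounds):
        model = train_fn(sp, sa)
        kept = [pic for pic in sa if not detect_fn(model, pic)]
        if len(kept) == len(sa):
            break  # nothing removed: treat as converged
        sa = kept
    return model, sa
```

A usage example with stub callables: `purify_and_train(lambda sp, sa: "model", lambda m, p: "logo" in p, ["sample"], ["cat", "logo_a"])` returns the stub model together with `["cat"]`.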

Specifically, Fig. 5a is a schematic diagram of a machine learning model training process in a method for training a mark recognition model according to an exemplary embodiment, and Fig. 5b is a schematic diagram of a machine learning model detection process in another method for training a mark recognition model according to an exemplary embodiment. As shown in Fig. 5a, in the training process of the machine learning model, the labeled calibration sample and the background picture are input into the recognition model to be trained for training, so as to obtain initial model parameters. As shown in Fig. 5b, to verify whether the model can recognize the mark to be recognized, the background picture is recognized with the recognition model to be trained and the initial model parameters, and a recognition result is obtained.

In summary, fig. 6 shows a flowchart of another method for training a recognition identifier model according to an exemplary embodiment, and as shown in fig. 6, the method for training a recognition identifier model according to the embodiment of the present disclosure is implemented as follows:

Step 1: collecting original pictures;

Step 2: collecting semi-transparent LOGO pictures, namely acquiring the foreground picture in the embodiment of the disclosure;

Step 3: performing image enhancement based on the original pictures from Step 1, namely acquiring the background picture in the embodiment of the disclosure;

Step 4: generating calibration samples;

Step 5: training a classification or detection model, namely the recognition model to be trained in the embodiment of the disclosure;

Step 6: purifying the original pictures, and repeating Steps 3 to 6 until the recognition model to be trained converges, to obtain the target recognition model.

In the method for training a mark recognition model provided by the embodiment of the present disclosure, the background is enhanced first and the LOGO is fused afterwards, which greatly increases the number of samples that can be generated; by contrast, enhancing pictures that already contain the LOGO would deform the LOGO and affect the training effect. In addition, detecting the original pictures with the trained model allows original pictures containing the LOGO to be removed, which purifies the training samples. Iterating this process continuously improves the model training effect.

Example 2

According to a second aspect of the embodiments of the present disclosure, an image recognition method is provided, which is applied to the method for training the recognition marker model in embodiment 1, and fig. 7 shows a flowchart of the image recognition method provided in an exemplary embodiment, and as shown in fig. 7, the image recognition method provided in the embodiments of the present disclosure includes:

Step S702, acquiring a picture to be recognized;

Step S704, analyzing the picture to be recognized through the target recognition model, and judging whether the picture to be recognized contains a preset label;

In the foregoing solution, analyzing the picture to be recognized through the target recognition model in step S704 and judging whether the picture to be recognized contains the preset label includes: in the case that the picture to be recognized contains a dynamic identifier, analyzing, through the target recognition model, whether the preset label contained in the picture to be recognized contains the form of the identifier to be recognized in each frame.

Specifically, based on the target identification model obtained in embodiment 1, the picture to be identified in step S702 is identified, and it is determined whether the picture to be identified includes a preset tag. If the determination result is yes, step S706 is executed, and if the determination result is no, step S708 is executed.

In addition, in the case that the picture to be recognized contains a dynamic identifier, the picture to be recognized is analyzed through the target recognition model to judge whether the preset label contained in the picture to be recognized contains the form of the identifier to be recognized in each frame.

Step S706, in the case that the judgment result is yes, recognizing that the picture to be recognized contains the target identifier;

Step S708, in the case that the judgment result is no, outputting a detection result of recognition failure.

In the foregoing solution, after the detection result of recognition failure is output in step S708, the image recognition method provided in the embodiment of the present disclosure further includes: adding a label to the picture to be recognized, and adjusting the target recognition model accordingly to obtain an optimized target recognition model.

In order to improve the recognition accuracy of the target recognition model, the target recognition model is optimized based on the picture to be recognized when recognition fails.

In the disclosure, a picture to be recognized is acquired; the picture to be recognized is analyzed through the target recognition model to judge whether it contains a preset label; if the judgment result is yes, the picture to be recognized is recognized as containing the target identifier; and if the judgment result is no, a detection result of recognition failure is output. LOGO detection accuracy is thereby improved.
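The flow of steps S702 to S708, together with the follow-up optimization, can be sketched as below. This is an illustrative sketch under stated assumptions: the class `Recognizer`, its `feedback` list and the result dictionary are hypothetical names, and the target recognition model is represented by any callable that returns either a `(category, box)` pair or `None`. How a failed picture is later labeled and how the model is adjusted are not specified here and are left out.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

# Hypothetical detection result: (category, (x, y, width, height)), or None.
Detection = Optional[Tuple[str, Tuple[int, int, int, int]]]


@dataclass
class Recognizer:
    # Stand-in for the target recognition model obtained from training.
    model: Callable[[object], Detection]
    # Pictures whose recognition failed, kept for later labeling and tuning.
    feedback: List[object] = field(default_factory=list)

    def recognize(self, picture: object) -> dict:
        detection = self.model(picture)  # step S704: analyze the picture
        if detection is not None:        # step S706: judgment result is yes
            category, box = detection
            return {"recognized": True, "category": category, "box": box}
        # Step S708: judgment result is no; report the failure and keep the
        # picture so that a label can be added and the model adjusted later.
        self.feedback.append(picture)
        return {"recognized": False}


# Usage with a toy model that only "detects" one specific input.
recognizer = Recognizer(
    model=lambda p: ("brand_a", (1, 2, 3, 4)) if p == "with_logo" else None
)
hit = recognizer.recognize("with_logo")
miss = recognizer.recognize("plain")
```

Collecting failed pictures in a queue keeps the recognition path separate from the optimization step, which may run offline.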

Example 3

According to a third aspect of the embodiments of the present disclosure, an apparatus for training a recognition identification model is provided. FIG. 8 shows a schematic diagram of an apparatus for training a recognition identification model according to an exemplary embodiment. As shown in FIG. 8, the apparatus for training a recognition identification model in the embodiments of the present disclosure includes: an obtaining module 82, configured to obtain a background picture and a foreground picture, where the background picture is an initial sample picture and the foreground picture is a picture of the identifier to be identified; a sample generation module 84, configured to synthesize the background picture and the foreground picture to generate a calibration sample; a label adding module 86, configured to add a corresponding label to the calibration sample, where the label is used to indicate the category of the identifier to be identified and the position of the identifier to be identified in the picture; and a model training module 88, configured to input the labeled calibration sample and the background picture into the recognition model to be trained for training, so as to obtain a target recognition model, where the target recognition model is configured to recognize whether the calibration sample of the input picture contains the identifier to be recognized.

In the foregoing solution, the obtaining module 82 includes: a picture acquisition unit, configured to acquire a plurality of pictures through a preset channel and use the plurality of pictures as initial sample pictures, where the plurality of pictures contain at least one identifier to be identified; a sample set generating unit, configured to generate a source sample set according to the initial sample pictures; and a picture processing unit, configured to process all pictures in the source sample set through image data enhancement to generate the background picture.
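As a rough illustration of what the picture processing unit might do, the sketch below enhances every picture in a source sample set. The specific operations (horizontal flip and brightness scaling), the function names and the representation of a picture as nested lists of RGB tuples are all assumptions made for this example; the text above only states that image data enhancement is applied.

```python
import random
from typing import List, Tuple

RGB = Tuple[int, int, int]
Picture = List[List[RGB]]  # rows of pixels; a hypothetical minimal format


def flip_horizontal(picture: Picture) -> Picture:
    """Mirror each row of pixels."""
    return [list(reversed(row)) for row in picture]


def adjust_brightness(picture: Picture, factor: float) -> Picture:
    """Scale every channel by `factor`, clamped to the 0..255 range."""
    return [
        [tuple(min(255, max(0, round(c * factor))) for c in px) for px in row]
        for row in picture
    ]


def enhance_source_set(source_set: List[Picture], rng: random.Random) -> List[Picture]:
    """Return the originals plus one randomly enhanced variant of each."""
    backgrounds = list(source_set)
    for picture in source_set:
        variant = flip_horizontal(picture) if rng.random() < 0.5 else picture
        variant = adjust_brightness(variant, rng.uniform(0.7, 1.3))
        backgrounds.append(variant)
    return backgrounds


pic = [[(10, 20, 30), (200, 210, 220)]]
backgrounds = enhance_source_set([pic], random.Random(1))
```

Because the enhancement is applied before any LOGO is pasted in, the foreground is never distorted by these operations.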

In the foregoing solution, the obtaining module 82 includes: a picture set acquisition unit, configured to acquire a picture set containing the identifier to be identified and use the picture set as the to-be-identified identifier atlas; a screening unit, configured to acquire, from the to-be-identified identifier atlas, pictures in which the identifier to be identified has a preset transparency; and a picture determining unit, configured to take any one of the pictures in which the identifier to be identified has the preset transparency as the foreground picture.

In the above solution, the atlas handling unit includes: an atlas acquisition subunit, configured to classify the picture set according to a preset condition to obtain at least one to-be-identified identifier atlas, where the preset condition is the brand corresponding to the identifier to be identified.

In the foregoing solution, the picture determining unit includes: a picture acquisition subunit, configured to acquire each frame of the form of the identifier to be identified in the case that the to-be-identified identifier atlas includes a dynamic identifier picture; and a picture determining subunit, configured to acquire, according to each frame of the form of the identifier to be identified, a picture in which the identifier to be identified has the preset transparency, and generate the foreground picture.

In the above solution, the label adding module 86 includes: a category acquisition unit, configured to acquire the category of the identifier to be identified in the foreground picture; a position acquisition unit, configured to acquire the position of the identifier to be identified in the calibration sample; and a label adding unit, configured to add a corresponding label to the calibration sample according to the category and the position, so as to obtain the calibration sample with the category label and the position label added.
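Because the calibration sample is synthesized, both parts of the label are known at generation time: the category comes from the foreground picture and the position is wherever it was pasted. The sketch below illustrates this with alpha blending of a semitransparent foreground at a random position. It is a minimal example under stated assumptions: the function name `composite_with_label`, the nested-list picture format and the `(x, y, width, height)` box layout are hypothetical, and random scaling of the foreground is omitted for brevity.

```python
import random
from typing import Dict, List, Tuple

RGB = Tuple[int, int, int]
RGBA = Tuple[int, int, int, int]


def composite_with_label(
    background: List[List[RGB]],
    foreground: List[List[RGBA]],
    category: str,
    rng: random.Random,
) -> Tuple[List[List[RGB]], Dict[str, object]]:
    """Paste `foreground` onto a copy of `background`; return sample and label."""
    bg_h, bg_w = len(background), len(background[0])
    fg_h, fg_w = len(foreground), len(foreground[0])
    if fg_h > bg_h or fg_w > bg_w:
        raise ValueError("foreground must fit inside the background")
    # Random top-left corner chosen so the foreground stays fully inside.
    x0 = rng.randint(0, bg_w - fg_w)
    y0 = rng.randint(0, bg_h - fg_h)
    sample = [row[:] for row in background]  # leave the background untouched
    for dy in range(fg_h):
        for dx in range(fg_w):
            r, g, b, a = foreground[dy][dx]
            alpha = a / 255.0  # 0 = fully transparent, 1 = fully opaque
            br, bg_green, bb = sample[y0 + dy][x0 + dx]
            sample[y0 + dy][x0 + dx] = (
                round(alpha * r + (1 - alpha) * br),
                round(alpha * g + (1 - alpha) * bg_green),
                round(alpha * b + (1 - alpha) * bb),
            )
    # Category label plus position label, both known without manual annotation.
    label = {"category": category, "box": (x0, y0, fg_w, fg_h)}
    return sample, label


bg = [[(0, 0, 0)] * 4 for _ in range(3)]
fg = [[(255, 255, 255, 255)] * 2]
sample, label = composite_with_label(bg, fg, "brand_a", random.Random(0))
```

A background picture that receives no foreground would carry no category and no coordinates, which corresponds to a negative sample.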

In the above solution, the model training module 88 includes: a training unit, configured to train the recognition model to be trained according to the labeled calibration sample and the background picture, so as to obtain the background pictures that contain the identifier picture to be identified; a deleting unit, configured to delete the background pictures that contain the identifier picture to be identified; and a training convergence unit, configured to input the labeled calibration sample and the background pictures remaining after the deletion into the recognition model to be trained for training, until the recognition model to be trained converges, so as to obtain the target recognition model.

In the above solution, the training unit includes: a training subunit, configured to input the labeled calibration sample and the background picture into the recognition model to be trained for training to obtain initial model parameters, and to recognize the background picture according to the initial model parameters and the recognition model to be trained to obtain a recognition result, where the recognition result is used to indicate whether the background picture is a picture of the identifier to be identified.

Example 4

According to a fourth aspect of the embodiments of the present disclosure, an image recognition apparatus is provided. FIG. 9 shows a schematic diagram of an image recognition apparatus provided by an exemplary embodiment. As shown in FIG. 9, the image recognition apparatus in the embodiments of the present disclosure includes: an obtaining module 92, configured to obtain a picture to be recognized; a recognition module 94, configured to analyze the picture to be recognized through the target recognition model and judge whether the picture to be recognized contains a preset label; a first judging module 96, configured to recognize that the picture to be recognized contains the target identifier if the judgment result is yes; and a second judging module 98, configured to output a detection result of recognition failure if the judgment result is no.

In the above solution, the recognition module 94 includes: a recognition unit, configured to analyze, through the target recognition model, whether the preset label contained in the picture to be recognized contains the form of the identifier to be recognized in each frame, in the case that the picture to be recognized contains a dynamic identifier.

In the foregoing solution, the image recognition apparatus in the embodiment of the present disclosure further includes: an optimization module, configured to add a label to the picture to be recognized after the detection result of recognition failure is output, and to adjust the target recognition model accordingly to obtain an optimized target recognition model.

Those of skill in the art will appreciate that the functions described in connection with the various illustrative logical blocks, modules, and algorithm steps described in the disclosure herein may be implemented as hardware, software, firmware, or any combination thereof. If implemented in software, the functions described in the various illustrative logical blocks, modules, and steps may be stored on or transmitted over as one or more instructions or code on a computer-readable medium and executed by a hardware-based processing unit. The computer-readable medium may include a computer-readable storage medium, which corresponds to a tangible medium, such as a data storage medium, or any communication medium including a medium that facilitates transfer of a computer program from one place to another (e.g., according to a communication protocol). In this manner, a computer-readable medium may generally correspond to (1) a non-transitory tangible computer-readable storage medium, or (2) a communication medium, such as a signal or carrier wave. A data storage medium may be any available medium that can be accessed by one or more computers or one or more processors to retrieve instructions, code and/or data structures for implementing the techniques described herein. The computer program product may include a computer-readable medium.

By way of example, and not limitation, such computer-readable storage media can comprise RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, flash memory, or any other medium that can be used to store desired program code in the form of instructions or data structures and that can be accessed by a computer. Also, any connection is properly termed a computer-readable medium. For example, if the instructions are transmitted from a website, server, or other remote source using a coaxial cable, fiber optic cable, twisted pair, Digital Subscriber Line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of medium. It should be understood, however, that computer-readable storage media and data storage media do not include connections, carrier waves, signals, or other transitory media, but are instead directed to non-transitory tangible storage media. Disk and disc, as used herein, includes Compact Disc (CD), laser disc, optical disc, Digital Versatile Disc (DVD), and Blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.

The instructions may be executed by one or more processors, such as one or more Digital Signal Processors (DSPs), general purpose microprocessors, Application Specific Integrated Circuits (ASICs), field programmable logic arrays (FPGAs), or other equivalent integrated or discrete logic circuitry. Thus, the term "processor," as used herein may refer to any of the foregoing structure or any other structure suitable for implementation of the techniques described herein. Additionally, in some aspects, the functions described by the various illustrative logical blocks, modules, and steps described herein may be provided within dedicated hardware and/or software modules configured for encoding and decoding, or incorporated in a combined codec. Also, the techniques may be fully implemented in one or more circuits or logic elements.

The techniques of this application may be implemented in a variety of described devices or apparatuses, including a wireless handset, an Integrated Circuit (IC), or a set of ICs (e.g., a chipset). Various components, modules, or units are described in this application to emphasize functional aspects of means for performing the disclosed techniques, but do not necessarily require realization by different hardware units. Indeed, as described above, the various units may be combined in a codec hardware unit, in conjunction with suitable software and/or firmware, or provided by an interoperating hardware unit (including one or more processors as described above).

In the foregoing embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.

The above description is only an exemplary embodiment of the present application, but the scope of the present application is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present application should be covered within the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims

1. A method of training a recognition identification model, comprising:

acquiring a background picture and a foreground picture, wherein the background picture is generated by enhancing an initial sample picture through an image data enhancement technology; the foreground picture is a to-be-identified identification picture;

randomly scaling the size of the foreground picture, randomly selecting the position of the background picture, and synthesizing the foreground picture with the background picture to generate a calibration sample;

adding a corresponding label to the calibration sample, wherein the label is used for representing the category of the mark to be identified and the position of the mark to be identified in the picture; wherein the adding of the corresponding label to the calibration sample comprises: acquiring the category of the identifier to be identified in the foreground picture; obtaining the position of the mark to be identified in the calibration sample; adding a corresponding label to the calibration sample according to the category and the position to obtain the calibration sample added with the category label and the position label;

inputting the calibration sample added with the label and the background picture into a recognition model to be trained for training to obtain a target recognition model, wherein the target recognition model is used for recognizing whether the input picture contains the identification to be recognized;

the method further comprises the following steps:

analyzing a picture to be recognized through the target recognition model, and judging whether the picture to be recognized contains a preset label or not;

if the judgment result is yes, identifying that the picture to be identified contains a target identifier;

and under the condition that the judgment result is negative, outputting a detection result of failed recognition, adding a label to the picture to be recognized, and correspondingly adjusting the target recognition model to obtain the optimized target recognition model.

2. The method of claim 1, wherein the obtaining the background picture comprises:

acquiring a plurality of pictures through a preset channel and taking the plurality of pictures as the initial sample pictures, wherein the plurality of pictures comprise at least one identifier to be identified;

generating a source sample set according to the initial sample picture;

and processing all pictures in the source sample set through image data enhancement to generate the background picture.

3. The method of claim 2, wherein the background picture comprises: an original picture and a negative sample picture, wherein the negative sample picture is a picture without a category and without coordinates.

4. The method of claim 1, wherein the obtaining the foreground picture comprises:

acquiring a picture set containing a to-be-identified identifier, and taking the picture set as an image set of the to-be-identified identifier;

acquiring pictures with preset transparency of the to-be-identified marks from the to-be-identified mark picture set;

and acquiring any one of the pictures with the preset transparency of the identifier to be identified as the foreground picture.

5. The method of claim 4, wherein the using the set of pictures as the identification atlas to be identified comprises:

classifying the picture set according to a preset condition to obtain at least one to-be-identified identification picture set, wherein the preset condition is a brand corresponding to the to-be-identified identification.

6. The method of claim 4, wherein the atlas of identifiers to be identified comprises: a static identification picture and/or a dynamic identification picture.

7. The method according to claim 6, wherein the obtaining any one of the pictures with the preset transparency of the to-be-identified identifier as the foreground picture comprises:

under the condition that the to-be-identified identification atlas comprises the dynamic identification picture, obtaining each frame of picture in the to-be-identified identification form;

and acquiring the picture with the preset transparency of the identifier to be identified according to each frame of picture with the shape of the identifier to be identified, and generating the foreground picture.

8. The method according to claim 1, wherein the inputting the calibration sample added with the label and the background picture into a recognition model to be trained for training to obtain a target recognition model comprises:

training a recognition model to be trained according to the calibration sample added with the label and the background picture to obtain the background picture containing the identification picture to be recognized;

deleting the background picture containing the identification picture to be identified;

and inputting the calibration sample added with the label and the background picture deleted with the identification picture to be recognized into the recognition model to be trained for training until the recognition model to be trained is converged to obtain the target recognition model.

9. The method according to claim 8, wherein the training a recognition model to be trained according to the calibration sample added with the label and the background picture comprises:

inputting the calibration sample added with the label and the background picture into a recognition model to be trained for training to obtain initial model parameters;

and recognizing the background picture according to the initial model parameters and the recognition model to be trained to obtain a recognition result, wherein the recognition result is used for indicating whether the background picture is the identification picture to be recognized or not.

10. An image recognition method applied to the method for training the recognition marker model of claim 1 comprises the following steps:

acquiring a picture to be identified;

analyzing the picture to be recognized through a target recognition model, and judging whether the picture to be recognized contains a preset label or not;

if the judgment result is yes, identifying that the picture to be identified comprises a target identification;

and outputting a detection result of the identification failure if the judgment result is negative.

11. The method of claim 10, wherein the parsing the picture to be recognized through the target recognition model to determine whether the picture to be recognized includes a preset tag comprises:

and under the condition that the picture to be recognized contains the dynamic identification, analyzing whether a preset label contained in the picture to be recognized contains each frame of identification form to be recognized or not through the target recognition model.

12. The method of claim 10, wherein after said outputting the detection result of the identification failure, the method further comprises:

and adding a label to the picture to be recognized, and correspondingly adjusting the target recognition model to obtain the optimized target recognition model.

13. An apparatus for training a recognition identification model, comprising:

the acquisition module is used for acquiring a background picture and a foreground picture, wherein the background picture is an initial sample picture; the foreground picture is a to-be-identified identification picture;

the sample generating module is used for synthesizing the background picture and the foreground picture to generate a calibration sample;

the label adding module is used for adding a corresponding label to the calibration sample, wherein the label is used for representing the category of the mark to be identified and the position of the mark to be identified in the picture;

the model training module is used for inputting the calibration sample added with the label and the background picture into a recognition model to be trained for training to obtain a target recognition model, and the target recognition model is used for recognizing whether the calibration sample of the input picture contains the identification to be recognized or not;

wherein the tag adding module comprises: the category acquisition unit is used for acquiring the category of the identifier to be identified in the foreground picture; the position acquisition unit is used for acquiring the position of the mark to be identified in the calibration sample; the label adding unit is used for adding a corresponding label to the calibration sample according to the category and the position to obtain the calibration sample added with the category label and the position label;

the device further comprises:

the acquisition module is used for acquiring a picture to be identified;

the identification module is used for analyzing the picture to be identified through the target identification model and judging whether the picture to be identified contains a preset label or not;

the first judgment module is used for identifying that the picture to be identified comprises a target identifier under the condition that the judgment result is yes;

the second judgment module is used for outputting a detection result of identification failure under the condition that the judgment result is negative;

and the optimization module is used for adding a label to the picture to be recognized after the detection result of the recognition failure is output, and correspondingly adjusting the target recognition model to obtain the optimized target recognition model.

14. An image recognition apparatus, applied to the apparatus for training the recognition mark model of claim 13, comprising:

the acquisition module is used for acquiring a picture to be identified;

the identification module is used for analyzing the picture to be identified through a target identification model and judging whether the picture to be identified contains a preset label or not;

and the second judging module is used for outputting the detection result of the identification failure under the condition that the judgment result is negative.