CN113761249A

CN113761249A - Method and device for determining picture type

Info

Publication number: CN113761249A
Application number: CN202010768739.2A
Authority: CN
Inventors: 雷超
Original assignee: Beijing Wodong Tianjun Information Technology Co Ltd
Current assignee: Beijing Wodong Tianjun Information Technology Co Ltd
Priority date: 2020-08-03
Filing date: 2020-08-03
Publication date: 2021-12-07

Abstract

The invention discloses a method and a device for determining picture types, and relates to the technical field of computers. One embodiment of the method comprises: acquiring a target picture, and processing pixels in the target picture to obtain a plurality of first pictures; respectively extracting feature vectors in the plurality of first pictures based on a neural network model, inputting the feature vectors into a classification model for classification processing to obtain a plurality of classification results, wherein the plurality of classification results correspond to the plurality of first pictures; and determining the type of the target picture according to the plurality of classification results. The method and the device reduce the cost of determining the picture type, improve the defense capability and stability of the classification model to attack noise, and improve the accuracy of the determined picture type.

Description

Method and device for determining picture type

Technical Field

The present invention relates to the field of computer technologies, and in particular, to a method and an apparatus for determining a picture type.

Background

In a picture content examination system, as shown in FIG. 1, the left picture is an original picture (its content is a panda, and pictures containing pandas are artificially designated as illegal-type pictures), and the middle picture is a noise picture, i.e. attack noise. Without changing the information seen by human eyes, the noise picture is added to the original picture by a gradient ascent method to obtain the attack picture on the right. The content of the attack picture is actually a panda, but the classification model in the picture content examination system identifies it as a gibbon, so the attack picture is determined not to be an illegal-type picture, with a confidence probability as high as 99.3%. To address this problem, the prior art uses a plurality of classification models with different structures simultaneously to identify the picture type, and sets different weight coefficients for the models, so as to ensure the accuracy of the determined picture type.

In the process of implementing the invention, the inventor finds that at least the following problems exist in the prior art:

In the prior art, a plurality of classification models with different structures need to be trained, so the cost of determining the picture type is high. In addition, the ways of setting the weight coefficients of the different classification models are difficult to unify, the defense capability and stability of the classification models against attack noise are poor, and the accuracy of the determined picture type is low.

Disclosure of Invention

In view of this, embodiments of the present invention provide a method and an apparatus for determining a picture type, where a plurality of first pictures are obtained by processing pixels in a picture to be classified, and a same classification model is used to identify the picture types of the plurality of first pictures, so that a cost of determining the picture types is reduced, a defense capability and stability of the classification model against attack noise are improved, and an accuracy of the determined picture types is improved.

To achieve the above object, according to a first aspect of the embodiments of the present invention, there is provided a method for determining a picture type, including:

acquiring a target picture, and processing pixels in the target picture to obtain a plurality of first pictures;

respectively extracting feature vectors in the plurality of first pictures based on a neural network model, inputting the feature vectors into a classification model for classification processing to obtain a plurality of classification results, wherein the plurality of classification results correspond to the plurality of first pictures;

and determining the type of the target picture according to the plurality of classification results.

In one embodiment, the step of obtaining a plurality of first pictures based on processing the pixels in the target picture comprises:

performing data enhancement processing on the target picture so that the positions of pixel points in the target picture are shifted, thereby obtaining the first picture, wherein the data enhancement processing comprises at least one of the following processing modes: rotation processing, reduction processing, enlargement processing and translation processing.

In one embodiment, when the data enhancement processing is at least one of rotation processing, reduction processing, and translation processing, the obtaining of the first picture based on the data enhancement processing on the target picture includes:

processing the target picture according to the offset amplitude and the offset direction indicated by the data enhancement processing mode to obtain a second picture, wherein the offset amplitude is determined according to the size of the target picture;

determining a non-overlapping area corresponding to the target picture and the second picture;

adjusting the pixel value corresponding to each pixel point in the non-overlapping area to be the pixel mean value of the target picture to obtain an adjusted non-overlapping area;

and combining the adjusted non-overlapping area and the overlapping area of the second picture and the target picture to obtain a first picture.

In one embodiment, when the data enhancement processing is the enlargement processing, the step of obtaining the first picture based on the data enhancement processing on the target picture includes:

enlarging the target picture according to the enlargement ratio to obtain a second picture;

and determining the overlapping area of the second picture and the target picture as the first picture.

In one embodiment, the step of obtaining a plurality of first pictures based on processing the pixels in the target picture further includes:

performing filtering processing on the pixels in the target picture so that the pixel values corresponding to the pixel points in the target picture are changed, thereby obtaining a first picture; wherein the filtering processing comprises at least one of the following processing modes: block filtering processing, mean filtering processing, Gaussian filtering processing, median filtering processing and bilateral filtering processing.

In one embodiment, the step of determining the type of the target picture according to the plurality of classification results comprises:

and determining the type of the target picture according to the plurality of classification results and the voting rule.

In one embodiment, the step of determining the type of the target picture according to the plurality of classification results further comprises:

and performing weighting processing on the plurality of classification results, and determining the type of the target picture according to the weighting result and the result quantity threshold.

To achieve the above object, according to a second aspect of the embodiments of the present invention, there is provided an apparatus for determining a picture type, including:

the target picture acquisition module is used for acquiring a target picture and processing pixels in the target picture to obtain a plurality of first pictures;

the classification processing module is used for respectively extracting the feature vectors in the plurality of first pictures based on the neural network model, inputting the feature vectors into the classification model for classification processing to obtain a plurality of classification results, and the plurality of classification results correspond to the plurality of first pictures;

and the type determining module is used for determining the type of the target picture according to the plurality of classification results.

To achieve the above object, according to a third aspect of embodiments of the present invention, there is provided an electronic apparatus including:

one or more processors;

a storage device for storing one or more programs,

wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement any of the above methods for determining a picture type.

To achieve the above object, according to a fourth aspect of the embodiments of the present invention, there is provided a computer readable medium having a computer program stored thereon, the program, when executed by a processor, implementing any one of the above-mentioned methods for determining a picture type.

One embodiment of the above invention has the following advantages or benefits. A target picture is acquired, and pixels in the target picture are processed to obtain a plurality of first pictures; feature vectors are respectively extracted from the plurality of first pictures based on a neural network model and input into a classification model for classification processing to obtain a plurality of classification results corresponding to the plurality of first pictures; and the type of the target picture is determined according to the plurality of classification results. These technical means overcome the technical problems of the prior art, in which a plurality of classification models with different structures must be trained, so that the cost of determining the picture type is high, the ways of setting the weight coefficients of the different classification models are difficult to unify, the defense capability and stability of the classification models against attack noise are poor, and the accuracy of the determined picture type is low. By processing the pixels of the picture to be classified to obtain a plurality of first pictures and using the same classification model to identify the picture types of the plurality of first pictures, the cost of determining the picture type is reduced, the defense capability and stability of the classification model against attack noise are improved, and the accuracy of the determined picture type is improved.

Further effects of the above optional implementations will be described below in connection with the embodiments.

Drawings

The drawings are included to provide a better understanding of the invention and are not to be construed as unduly limiting the invention. Wherein:

FIG. 1 is a diagram of an original picture, a noise picture, and a final picture in the prior art;

FIG. 2 is a schematic diagram of a main flow of a method for determining a picture type according to a first embodiment of the present invention;

FIG. 3a is a schematic diagram of a main flow of a method for determining a picture type according to a second embodiment of the present invention;

FIG. 3b is a schematic diagram illustrating a rotation process performed on a target picture in the method of FIG. 3a;

FIG. 3c is a schematic diagram illustrating an enlargement process and a reduction process performed on a target picture in the method of FIG. 3a;

FIG. 3d is a schematic diagram illustrating a translation process performed on a target picture in the method of FIG. 3a;

FIG. 4 is a diagram illustrating major blocks of an apparatus for determining a picture type according to an embodiment of the present invention;

FIG. 5 is an exemplary system architecture diagram in which embodiments of the present invention may be employed;

FIG. 6 is a schematic block diagram of a computer system suitable for use in implementing a terminal device or server of an embodiment of the invention.

Detailed Description

Exemplary embodiments of the present invention are described below with reference to the accompanying drawings, in which various details of embodiments of the invention are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the invention. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.

It should be noted that the embodiments and features of the embodiments may be combined with each other without conflict.

In order to solve the problems in the prior art, a first embodiment of the present invention provides a method for determining a picture type. As shown in FIG. 2, the method mainly includes:

Step S201, a target picture is acquired, and pixels in the target picture are processed to obtain a plurality of first pictures.

Specifically, according to the embodiment of the present invention, pixels in a target picture (a picture whose content is to be examined and/or whose type is to be determined) are processed so that pixel points are shifted in position and/or their pixel values are changed, thereby obtaining a plurality of first pictures. During this processing, the pixel points corresponding to the attack noise are likewise shifted in position and/or changed in value, so that the attack noise superimposed on the target picture becomes invalid. The attack noise is obtained by training against a specific picture using the gradient ascent method of a neural network. Therefore, once the neurons of the input-layer vector of the neural network (the picture flattened into one dimension) are shifted, i.e. the attack noise is shifted in position, the specific attack vector (the feature vector corresponding to the attack noise, likewise flattened into one dimension) fails, which improves the accuracy of subsequently using the classification model to identify the types of the plurality of first pictures.
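The misalignment effect described above can be illustrated with a deliberately small sketch. Everything in it is an assumption made for illustration and is not taken from the embodiment: the "picture" is a one-dimensional list of eight pixels, the model is a hypothetical linear scorer with alternating weights, and the attack noise is a single gradient ascent step of size `eps` along the sign of the weights. Real neural networks are not linear, so the sketch only shows why noise tuned to exact pixel positions may lose its effect after a one-pixel shift.

```python
def score(weights, pixels):
    """Score of a hypothetical linear model: the dot product of weights and pixels."""
    return sum(w * p for w, p in zip(weights, pixels))


def sign(value):
    """Return -1, 0 or 1 according to the sign of value."""
    return (value > 0) - (value < 0)


def shift_right(pixels, fill):
    """Shift a 1-D picture right by one pixel; the vacated position takes `fill`."""
    return [fill] + pixels[:-1]


# Hypothetical model and picture (assumptions, not values from the embodiment).
weights = [1.0, -1.0] * 4
clean = [10.0] * 8
eps = 0.5

# One gradient ascent step on the score: for a linear model the gradient
# with respect to the input is the weight vector itself.
noise = [eps * sign(w) for w in weights]
attacked = [p + n for p, n in zip(clean, noise)]

# Defense: shift by one pixel and fill the vacated position with the pixel mean.
pixel_mean = sum(attacked) / len(attacked)
shifted = shift_right(attacked, pixel_mean)

clean_score = score(weights, clean)
attacked_score = score(weights, attacked)
shifted_score = score(weights, shifted)
print(clean_score, attacked_score, shifted_score)
```

In this toy setting the attack raises the score, and the shifted copy no longer benefits from the noise because each noise value now meets a weight of the opposite sign.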

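Further, according to an embodiment of the present invention, the step of obtaining a plurality of first pictures based on processing the pixels in the target picture includes:

performing data enhancement processing on the target picture so that the positions of pixel points in the target picture are shifted, thereby obtaining the first picture, wherein the data enhancement processing includes at least one of the following processing modes: rotation processing, reduction processing, enlargement processing and translation processing.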

It should be noted that, according to the embodiment of the present invention, the data enhancement processing is performed on the target picture, so that the pixel points in the target picture only need to generate displacement of several pixels (for example, a displacement of one pixel point or a displacement of two pixel points is generated), and the attack noise superimposed on the target picture can be displaced by a small offset amplitude, so that the attack noise is invalid. If the offset amplitude is too large, the recognition accuracy of the classification model is reduced.

Through the arrangement, the data enhancement processing is carried out on the target picture, so that the position of the pixel point in the target picture is deviated, the pixel in the target picture is processed, and a plurality of first pictures are obtained. Meanwhile, the target picture is subjected to data enhancement to obtain a plurality of first pictures, the plurality of first pictures can be classified by using the same classification model subsequently to obtain a plurality of classification results, and the type of the target picture is determined according to the plurality of classification results. The situation that a plurality of classification models are trained respectively to generate higher cost in the prior art is avoided.

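Preferably, according to an embodiment of the present invention, when the data enhancement processing is at least one of rotation processing, reduction processing, and translation processing, the step of obtaining the first picture based on the data enhancement processing performed on the target picture includes:

processing the target picture according to the offset amplitude and the offset direction indicated by the data enhancement processing mode to obtain a second picture, wherein the offset amplitude is determined according to the size of the target picture;

determining a non-overlapping area corresponding to the target picture and the second picture;

adjusting the pixel value corresponding to each pixel point in the non-overlapping area to the pixel mean value of the target picture to obtain an adjusted non-overlapping area;

and combining the adjusted non-overlapping area and the overlapping area of the second picture and the target picture to obtain a first picture.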

Through the arrangement, the pixel points are shifted by adopting the enhancement treatment, so that the attack noise obtained through the targeted training is invalid, and the defense capability and the stability of the classification model to the attack noise are improved.

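Preferably, according to an embodiment of the present invention, when the data enhancement processing is the enlargement processing, the step of obtaining the first picture based on the data enhancement processing performed on the target picture includes:

enlarging the target picture according to the enlargement ratio to obtain a second picture;

and determining the overlapping area of the second picture and the target picture as the first picture.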

Through the arrangement, the influence of attack noise on the target picture is obviously reduced in the obtained first picture, and the accuracy of the classification result obtained by subsequently utilizing the classification model on the determination of the picture type is improved.

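Alternatively, according to an embodiment of the present invention, the step of obtaining a plurality of first pictures based on processing the pixels in the target picture further includes:

performing filtering processing on the pixels in the target picture so that the pixel values corresponding to the pixel points in the target picture are changed, thereby obtaining a first picture; wherein the filtering processing includes at least one of the following processing modes: block filtering processing, mean filtering processing, Gaussian filtering processing, median filtering processing and bilateral filtering processing.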

It should be noted that, according to the embodiment of the present invention, the filtering processing performed on the target picture only needs to change the pixel values of the pixel points in the target picture by a small difference (for example, a difference of one or two pixel values). Such a small pixel difference is enough to disturb the attack noise superimposed on the target picture, so that the attack noise becomes invalid. If the pixel difference is too large, the recognition accuracy of the classification model is reduced.

The filtering process is also called smoothing, and removes noise by changing pixel values. Mean filtering is a typical linear filter: a template is applied to a target pixel of the image, the template consisting of the neighboring pixels around the target pixel (the 8 surrounding pixels centered on the target pixel, excluding the target pixel itself), and the original pixel value is replaced by the average of all pixels in the template. Block (box) filtering uses essentially the same kernel as mean filtering, the difference being that normalization is not required. Median filtering is a nonlinear smoothing technique that sets the gray value of each pixel point to the median (not the average) of the gray values of all pixel points in a neighborhood window of that point. By selecting the median, it avoids the influence of isolated noise points in the image, filters impulse noise well, and in particular protects signal edges from being blurred while filtering the noise. Gaussian filtering is a linear smoothing filter suited to eliminating Gaussian noise; it performs a weighted average over the whole image, the value of each pixel point being obtained as a weighted average of its own value and the other pixel values in its neighborhood. Bilateral filtering is a nonlinear filtering method that constructs a weighted average from each pixel and its neighborhood. The weight has two parts: the first is the same as in Gaussian smoothing, and the second is also a Gaussian weight, but it is based on the brightness difference between the other pixels and the central pixel rather than on the spatial distance between the central pixel point and the other pixel points.
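As a minimal sketch of the mean and median filtering described above, the following assumes a grayscale picture stored as a list of rows, a fixed 3x3 window, and border pixels that are left unchanged; none of these choices is specified by the embodiment. The hypothetical `include_center` flag switches between the common convention of averaging all 9 pixels of the window and the 8-neighbor template described in the text.

```python
from statistics import mean, median


def window(picture, row, col, include_center=True):
    """Collect the values of the 3x3 neighborhood around (row, col)."""
    values = []
    for d_row in (-1, 0, 1):
        for d_col in (-1, 0, 1):
            if not include_center and d_row == 0 and d_col == 0:
                continue  # 8-neighbor template: skip the target pixel itself
            values.append(picture[row + d_row][col + d_col])
    return values


def filter_picture(picture, reducer, include_center=True):
    """Replace every interior pixel by reducer(window); borders are copied as-is."""
    height, width = len(picture), len(picture[0])
    result = [row[:] for row in picture]
    for row in range(1, height - 1):
        for col in range(1, width - 1):
            result[row][col] = reducer(window(picture, row, col, include_center))
    return result


# A flat picture with one isolated bright pixel standing in for noise.
picture = [
    [10, 10, 10, 10],
    [10, 90, 10, 10],
    [10, 10, 10, 10],
]
mean_filtered = filter_picture(picture, mean)
median_filtered = filter_picture(picture, median)
neighbor_mean_filtered = filter_picture(picture, mean, include_center=False)
```

The median filter removes the isolated value entirely, while the 9-pixel mean only spreads it out, which matches the remark above that median filtering handles isolated noise points well.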

According to the embodiment of the invention, the target picture may be processed at least once by the data enhancement processing and/or the filtering processing. If the target picture is processed multiple times in the same processing mode, the pixel points are changed differently each time; for example, if rotation processing is applied multiple times, the rotation angle or direction differs each time.

Step S202, respectively extracting feature vectors in the plurality of first pictures based on the neural network model, inputting the feature vectors into the classification model for classification processing, and obtaining a plurality of classification results, wherein the plurality of classification results correspond to the plurality of first pictures.

Specifically, the plurality of first pictures are respectively input into a neural network model, and the feature vector corresponding to each first picture is output from the output layer of the neural network model. Because the first pictures are obtained by processing the pixels of the target picture, the feature vectors corresponding to the first pictures differ from the feature vector corresponding to the target picture, which disturbs the attack noise. The classification model is then used to classify the feature vector corresponding to each first picture, so that the classification result corresponding to each first picture is obtained. This arrangement avoids using a plurality of classification models at the same time and reduces the cost of training classification models with different structures. The classification model may have any of the following structures: Logistic Regression (a classical classification model that can handle binary and multi-class classification); SVM (Support Vector Machine, which shows many specific advantages in small-sample, nonlinear and high-dimensional pattern recognition and can be extended to other machine learning problems such as function fitting); XGBoost (eXtreme Gradient Boosting, a boosted tree model that integrates many tree models into a strong classifier); and the like. The classification model is a trained model, trained with an existing training method.

Step S203, determining the type of the target picture according to a plurality of classification results.

The picture types can be set manually. For example, when examining pictures, illegal-type pictures and pornography-related type pictures can be defined according to the relevant requirements, and both can also be set as transmission-prohibited type pictures.

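Specifically, according to an embodiment of the present invention, the step of determining the type of the target picture according to the plurality of classification results includes:

determining the type of the target picture according to the plurality of classification results and the voting rule.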

The voting rule is a majority voting rule: a result is output only when some classification result receives a number of votes greater than a certain threshold; otherwise no result is output. When no result is output, the picture can be evaluated by manual intervention. The threshold is generally an absolute majority, i.e. greater than half of the total number of votes, and can also be set according to the actual situation. With this arrangement, introducing the majority voting rule into the process of determining the type of the target picture improves the accuracy of the determined picture type.
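The majority voting rule can be sketched as follows. The function name `majority_vote`, the use of string labels for classification results, and the convention of returning `None` when no result is output (so that the picture can be handed to manual review) are assumptions for illustration.

```python
from collections import Counter


def majority_vote(results, threshold=None):
    """Return the winning label if its vote count exceeds the threshold, else None.

    By default the threshold is half the number of results, i.e. an
    absolute majority is required.
    """
    if not results:
        return None
    if threshold is None:
        threshold = len(results) / 2
    label, votes = Counter(results).most_common(1)[0]
    return label if votes > threshold else None


# Classification results of three first pictures derived from one target picture.
decided = majority_vote(["panda", "panda", "gibbon"])
undecided = majority_vote(["panda", "gibbon"])
print(decided, undecided)
```

A stricter threshold can be passed explicitly, for example `majority_vote(results, threshold=4)` to require at least five agreeing results.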

Alternatively, according to an embodiment of the present invention, the step of determining the type of the target picture according to the plurality of classification results further includes:

and performing weighting processing on the plurality of classification results, and determining the type of the target picture according to the weighting processing result and the result quantity threshold.

Because the same classification model is used to classify each first picture and obtain the classification results, the weighting here only needs to set each weight to 1/N, where N is the number of classification results.
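One possible reading of the weighting with equal weights 1/N is sketched below. It assumes, beyond what the text states, that each classification result is a mapping from picture type to a confidence value, and that the threshold is applied to the weighted confidence of the best type; the function name `weighted_type` is hypothetical.

```python
def weighted_type(results, threshold=0.5):
    """Average per-type confidences with equal weights 1/N and apply a threshold.

    `results` is a list of dicts mapping picture type to confidence.
    Returns the best type if its weighted confidence exceeds the threshold,
    otherwise None.
    """
    if not results:
        return None
    weight = 1.0 / len(results)
    totals = {}
    for result in results:
        for picture_type, confidence in result.items():
            totals[picture_type] = totals.get(picture_type, 0.0) + weight * confidence
    best = max(totals, key=totals.get)
    return best if totals[best] > threshold else None


# Hypothetical confidences for three first pictures.
results = [
    {"panda": 0.9, "gibbon": 0.1},
    {"panda": 0.6, "gibbon": 0.4},
    {"panda": 0.2, "gibbon": 0.8},
]
chosen = weighted_type(results)
strict = weighted_type(results, threshold=0.6)
```

With equal weights the weighted confidence is simply the mean confidence per type, so no per-model weight coefficients have to be tuned.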

According to the technical solution of the embodiment of the present invention, a target picture is acquired, and pixels in the target picture are processed to obtain a plurality of first pictures; feature vectors are respectively extracted from the plurality of first pictures based on a neural network model and input into a classification model for classification processing to obtain a plurality of classification results corresponding to the plurality of first pictures; and the type of the target picture is determined according to the plurality of classification results. These technical means overcome the technical problems of the prior art, in which a plurality of classification models with different structures must be trained, so that the cost of determining the picture type is high, the ways of setting the weight coefficients of the different classification models are difficult to unify, the defense capability and stability of the classification models against attack noise are poor, and the accuracy of the determined picture type is low. By processing the pixels of the picture to be classified to obtain a plurality of first pictures and using the same classification model to identify the picture types of the plurality of first pictures, the cost of determining the picture type is reduced, the defense capability and stability of the classification model against attack noise are improved, and the accuracy of the determined picture type is improved.

FIG. 3a is a schematic diagram of a main flow of a method for determining a picture type according to a second embodiment of the present invention. As shown in FIG. 3a, the method for determining a picture type according to the embodiment of the present invention mainly includes:

Step S301, a target picture is acquired.

The target picture refers to a picture whose content is to be examined and/or whose type is to be determined; attack noise that interferes with the classification model is usually superimposed on such a picture. The target picture may be any picture, for example a picture containing flowers or a picture containing jeans.

Step S302, when the data enhancement processing is at least one of rotation processing, reduction processing and translation processing, processing the target picture according to the offset magnitude and the offset direction indicated by the data enhancement processing mode to obtain a second picture.

With the above arrangement, the data enhancement processing is performed on the target picture so that the positions of the pixel points in the target picture are shifted. FIGS. 3b, 3c and 3d are schematic diagrams of the rotation processing, the enlargement and reduction processing, and the translation processing performed on the target picture, respectively, but the invention is not limited thereto. If the target picture is processed multiple times (twice or more) in the same processing mode, the offset amplitude and the offset direction (such as the rotation angle, the rotation direction, the reduction ratio (reduction amplitude), the enlargement direction, the enlargement ratio (enlargement amplitude), and the like) need to be adjusted each time so that the pixel points are changed differently in each processing.

According to the embodiment of the invention, the offset amplitude is determined according to the size of the target picture.

In a specific implementation of this embodiment, if the number of pixel points of the target picture is greater than a preset number, an offset of at least two pixel points is used; if the number of pixel points of the target picture is less than or equal to the preset number, an offset of one pixel point is used. Each target picture can thus be processed flexibly. In addition, the offset (offset amplitude) should not be too large, so as not to affect the accuracy of picture recognition. The preset number can be set as required, and in general the effect of the embodiment of the invention can be achieved by setting the offset amplitude to one or two pixel points.
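The size-dependent choice of offset can be sketched as below. The default `preset_pixel_count` of 250,000 and the choice of exactly two pixels for large pictures are placeholder assumptions; the text only requires "at least two" above the preset number and one otherwise.

```python
def offset_amplitude(width, height, preset_pixel_count=250_000, large_offset=2):
    """Choose the offset in pixels from the number of pixel points in the picture.

    Pictures with more pixel points than the preset number get `large_offset`
    (at least two); all others get an offset of one pixel.
    """
    if large_offset < 2:
        raise ValueError("large_offset must be at least 2")
    return large_offset if width * height > preset_pixel_count else 1


print(offset_amplitude(1000, 1000))  # large picture
print(offset_amplitude(100, 100))    # small picture
```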

Step S303, determining a non-overlapping region corresponding to the target picture and the second picture.

As shown in FIG. 3d, taking parallel translation of the target picture as an example, the dotted-line box is the target picture. When the target picture is translated to the right, the area shown by the solid line is obtained as the second picture, and the area between the left dotted line and the left solid line is the non-overlapping area corresponding to the target picture and the second picture.

Step S304, adjusting the pixel value corresponding to each pixel point in the non-overlapping area to the pixel mean value of the target picture to obtain the adjusted non-overlapping area.

Specifically, pixel values corresponding to all pixel points in a target picture are obtained, and the pixel values of all the pixel points in the target picture are added to obtain a total pixel value; and dividing the total pixel value by the number of pixel points of the target picture to obtain a pixel mean value of the target picture, and taking the pixel mean value as a pixel value corresponding to each pixel point in the non-overlapping area.

Step S305, combine the adjusted non-overlapping region and the overlapping region of the second picture and the target picture to obtain a first picture.

Steps S302 to S305 above yield the first picture obtained by processing the target picture through at least one of the rotation processing, the reduction processing, and the translation processing. Because the pixel positions in the first picture change only slightly, the neural network model can still extract the features of a normal picture as usual, whereas for a picture on which attack noise is superimposed the attack noise becomes invalid, which further improves the identification accuracy of the subsequent classification model.
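Steps S302 to S305 can be sketched for the rightward translation of FIG. 3d. The sketch assumes a grayscale picture stored as a list of rows and a frame that keeps the size of the target picture; the function names are hypothetical.

```python
def pixel_mean(picture):
    """Sum all pixel values and divide by the number of pixel points (step S304)."""
    total = sum(sum(row) for row in picture)
    count = len(picture) * len(picture[0])
    return total / count


def translate_right(picture, offset):
    """Shift the picture right by `offset` pixels within the original frame.

    The columns vacated on the left form the non-overlapping area and are
    filled with the pixel mean of the target picture; the remaining columns
    are the overlapping area taken from the shifted picture.
    """
    width = len(picture[0])
    if not 0 <= offset <= width:
        raise ValueError("offset must be between 0 and the picture width")
    fill = pixel_mean(picture)
    return [[fill] * offset + row[:width - offset] for row in picture]


target = [
    [1, 2, 3],
    [4, 5, 6],
]
first_picture = translate_right(target, 1)
print(first_picture)
```

Translation in other directions follows the same pattern, with the fill applied to whichever rows or columns are vacated.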

Step S306, when the data enhancement processing is enlargement processing, the target picture is enlarged according to the enlargement ratio to obtain a second picture.

Step S307, determining the overlapping area of the second picture and the target picture as the first picture.

Steps S306 to S307 above yield the first picture obtained by processing the target picture through the enlargement processing. The attack noise is likewise invalidated by the adjustment of pixel positions.
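Steps S306 and S307 can be sketched as an enlargement about the picture center followed by keeping only the part that overlaps the original frame. Nearest-neighbor sampling, the center as the fixed point, and the function name `enlarge_and_crop` are assumptions; the embodiment does not specify an interpolation method.

```python
def enlarge_and_crop(picture, ratio):
    """Enlarge about the center by `ratio` and keep the original frame.

    Each output pixel is looked up in the target picture by inverse mapping
    with nearest-neighbor sampling, so the result has the same size as the
    target picture and contains only the overlapping area.
    """
    if ratio < 1:
        raise ValueError("ratio must be at least 1 for enlargement")
    height, width = len(picture), len(picture[0])
    center_row, center_col = (height - 1) / 2, (width - 1) / 2
    result = []
    for row in range(height):
        src_row = round(center_row + (row - center_row) / ratio)
        src_row = min(max(src_row, 0), height - 1)
        new_row = []
        for col in range(width):
            src_col = round(center_col + (col - center_col) / ratio)
            src_col = min(max(src_col, 0), width - 1)
            new_row.append(picture[src_row][src_col])
        result.append(new_row)
    return result


target = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 10, 11],
    [12, 13, 14, 15],
]
first_picture = enlarge_and_crop(target, 2)
print(first_picture)
```

In practice a ratio only slightly above 1 would be used, in line with the requirement that the offset amplitude stay small.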

Step S308, filtering processing is performed on the pixels in the target picture to change the pixel values corresponding to the pixel points in the target picture, thereby obtaining a first picture.

With this arrangement, the filtering processing, i.e. smoothing processing, changes the pixel values and thereby the activation state of each layer of the neural network model, so that the feature vector of the first picture differs from the feature vector of the target picture. The noise is thus eliminated and the noise picture is invalidated, which improves the defense capability and stability of the classification model against attack noise.

Step S309, respectively extracting feature vectors in the plurality of first pictures based on the neural network model, inputting the feature vectors into the classification model for classification processing, and obtaining a plurality of classification results, wherein the plurality of classification results correspond to the plurality of first pictures.

Specifically, the plurality of first pictures are respectively input into the neural network model, and the feature vector corresponding to each first picture is output from the output layer of the neural network model. Because the first pictures are obtained by processing the pixels of the target picture, the feature vectors corresponding to the first pictures differ from the feature vector corresponding to the target picture, which disrupts the attack noise. The classification model is then used to classify the feature vector corresponding to each first picture, yielding the classification result corresponding to each first picture. Through the above arrangement, the simultaneous use of multiple classification models is avoided, and the cost of training classification models with different structures is reduced. The classification model may be any of the following: Logistic Regression (a classical classification model that can handle binary and multi-class classification); SVM (Support Vector Machine, which shows particular advantages in small-sample, nonlinear, and high-dimensional pattern recognition and can be extended to other machine learning problems such as function fitting); or XGBoost (eXtreme Gradient Boosting, a boosted tree model that integrates many tree models into a strong classifier).
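The flow of step S309 may be sketched as follows: a single feature extractor and a single classification model are applied to every first picture, producing one classification result per first picture. The function `classify_first_pictures` is hypothetical, and `toy_extract` and `toy_classify` are deliberately trivial stand-ins for a trained neural network model and a trained classification model, which the description does not specify.

```python
import math
from typing import Callable, List, Sequence

Picture = List[List[float]]
FeatureVector = List[float]


def classify_first_pictures(
    first_pictures: Sequence[Picture],
    extract_features: Callable[[Picture], FeatureVector],
    classify: Callable[[FeatureVector], float],
) -> List[float]:
    """Return one classification result per first picture, in input order."""
    results = []
    for picture in first_pictures:
        features = extract_features(picture)  # stands in for the neural network model
        results.append(classify(features))    # stands in for the classification model
    return results


def toy_extract(picture: Picture) -> FeatureVector:
    """Toy feature vector: the mean pixel value (illustration only)."""
    total = sum(sum(row) for row in picture)
    count = sum(len(row) for row in picture)
    return [total / count]


def toy_classify(features: FeatureVector) -> float:
    """Toy logistic model with a hand-picked bias of 0.5 (illustration only)."""
    return 1.0 / (1.0 + math.exp(-(features[0] - 0.5)))
```

Because the same two callables are reused for every first picture, only one classification model needs to be trained.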

Step S310, determining the type of the target picture according to the plurality of classification results and the voting rule.

The voting rule refers to a majority voting rule: an evaluation result of a certain type is output only when it obtains a number of votes greater than a certain threshold; otherwise, no evaluation result is output. When no evaluation result is output, the evaluation may be carried out by manual intervention. The threshold is generally an absolute majority, that is, more than half of the total number of votes, and may also be set according to the actual situation. Through the above arrangement, introducing the majority voting rule into the process of determining the target picture type improves the accuracy of the determined picture type.
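The majority voting rule described above may be sketched as follows. The function name `majority_vote` and the use of `None` to signal that no evaluation result is output are assumptions of this example.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence


def majority_vote(
    results: Sequence[Hashable], threshold: Optional[float] = None
) -> Optional[Hashable]:
    """Return the winning type, or None when no type exceeds the threshold."""
    if not results:
        return None
    if threshold is None:
        # Default: absolute majority, i.e. more than half of all votes.
        threshold = len(results) / 2
    label, votes = Counter(results).most_common(1)[0]
    # None means no result is output; the case may then be evaluated manually.
    return label if votes > threshold else None
```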

According to the embodiment of the present invention, the classification result may be the probability that a picture is a propagation-prohibited picture. For the probability corresponding to each first picture, if the probability is greater than a first probability, the first picture corresponding to that probability is regarded as a propagation-prohibited picture. The number of first pictures regarded as propagation-prohibited pictures is then counted. If this number is greater than or equal to a first number (the majority voting principle), the type of the target picture is determined to be a propagation-prohibited picture; if this number is less than or equal to a second number, the type of the target picture is determined not to be a propagation-prohibited picture, the first number being greater than the second number; and if this number is greater than the second number and less than the first number, the type of the target picture is determined to be an unrecognizable type.

In a specific implementation, the first number and the second number may be set manually. The range between the second number and the first number corresponds to the unrecognizable type, so the smaller the absolute value of the difference between the first number and the second number, the smaller the amount of manual intervention and the lower the accuracy of the determined type of the target picture; the larger the absolute value of the difference, the larger the amount of manual intervention and the higher the accuracy of the determined type of the target picture. For example, the first number is 15 and the second number is 5. In addition, the first number is smaller than the number of first pictures, and the second number is greater than zero. The first probability may also be set, for example, to 0.6. The higher the accuracy of the classification model (that is, the better its performance), the lower the first probability and the smaller the absolute value may be set (thereby reducing the manual workload); the lower the accuracy of the classification model, the higher the first probability and the larger the absolute value should be set. First information and second information may also be set, the first information being different from the second information; for example, the first information may be 1 and the second information may be 0. Specifically, for a target picture whose type the classification model cannot identify, the type of the target picture may be judged manually. If the target picture is manually judged to be a propagation-prohibited picture, 1 is input manually, and when the embodiment of the invention receives 1, the type of the target picture is determined to be a propagation-prohibited picture; if the target picture is manually judged not to be a propagation-prohibited picture, 0 is input manually, and when 0 is received, the type of the target picture is determined not to be a propagation-prohibited picture.

In this embodiment, whether the type of the target picture is a propagation-prohibited picture is judged by comparing the number of first pictures regarded as propagation-prohibited pictures with the range formed by the first number and the second number, so that a reliable target picture type is output directly and the accuracy of the determined target picture type is improved.
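The decision based on the first number and the second number may be sketched as follows, using the example values given above (first probability 0.6, first number 15, second number 5). The function name `decide_type` and the returned labels are hypothetical.

```python
from typing import Sequence


def decide_type(
    probabilities: Sequence[float],
    first_probability: float = 0.6,
    first_number: int = 15,
    second_number: int = 5,
) -> str:
    """Map per-picture probabilities to a type for the target picture."""
    # Constraints from the description: 0 < second number < first number,
    # and the first number is smaller than the number of first pictures.
    if not 0 < second_number < first_number < len(probabilities):
        raise ValueError("need 0 < second_number < first_number < picture count")
    # Count the first pictures regarded as propagation-prohibited pictures.
    prohibited = sum(1 for p in probabilities if p > first_probability)
    if prohibited >= first_number:
        return "prohibited"
    if prohibited <= second_number:
        return "not_prohibited"
    # Between the two numbers: left to manual judgment.
    return "unrecognizable"
```

With 20 first pictures, 16 probabilities above 0.6 yield a propagation-prohibited result, 3 yield the opposite result, and 10 fall into the unrecognizable range that is passed to manual judgment.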

Step S311, performing weighting processing on the multiple classification results, and determining the type of the target picture according to the weighted classification result and the result quantity threshold.

FIG. 4 is a schematic diagram of the main modules of an apparatus for determining a picture type according to an embodiment of the present invention. As shown in FIG. 4, an apparatus 400 for determining a picture type according to an embodiment of the present invention mainly includes:

The target picture acquiring module 401 is configured to acquire a target picture and to obtain a plurality of first pictures by processing pixels in the target picture.

Specifically, according to the embodiment of the present invention, pixels in a target picture (a picture whose content is to be examined and/or whose type is to be determined) are processed so that the positions of pixel points are shifted and/or the pixel values corresponding to the pixel points are changed, thereby obtaining a plurality of first pictures. During this processing, the pixel points corresponding to the attack noise are shifted in position and/or their pixel values are changed, so that the attack noise superimposed on the target picture is invalidated, which improves the accuracy of the subsequent type identification performed on the plurality of first pictures by the classification model.

Further, according to the embodiment of the present invention, the target picture acquiring module 401 is further configured to:

Preferably, according to an embodiment of the present invention, in a case where the data enhancement processing is at least one of rotation processing, reduction processing, and translation processing, the target picture acquiring module 401 is further configured to:

Preferably, according to an embodiment of the present invention, in a case that the data enhancement processing is an enlargement processing, the target picture acquiring module 401 is further configured to:

Alternatively, according to an embodiment of the present invention, the target picture acquiring module 401 is further configured to:

The filtering processing is also called smoothing processing; noise is removed by changing the magnitude of the pixel values.

The classification processing module 402 is configured to extract feature vectors in the plurality of first pictures based on the neural network model, and input the feature vectors into the classification model for classification processing to obtain a plurality of classification results, where the plurality of classification results correspond to the plurality of first pictures.

Specifically, the plurality of first pictures are respectively input into the neural network model, and the feature vector corresponding to each first picture is output from the output layer of the neural network model. Because the first pictures are obtained by processing the pixels of the target picture, the feature vectors corresponding to the first pictures differ from the feature vector corresponding to the target picture, which disrupts the attack noise. The classification model is then used to classify the feature vector corresponding to each first picture, yielding the classification result corresponding to each first picture. Through the above arrangement, the simultaneous use of multiple classification models is avoided, and the cost of training classification models with different structures is reduced.

A type determining module 403, configured to determine a type of the target picture according to the plurality of classification results.

Specifically, according to the embodiment of the present invention, the type determining module 403 is further configured to:

Under the majority voting rule, an evaluation result of a certain type is output only when it obtains a number of votes greater than a certain threshold; otherwise, no evaluation result is output. When no evaluation result is output, the evaluation may be carried out by manual intervention. The threshold is generally an absolute majority, that is, more than half of the total number of votes, and may also be set according to the actual situation. Through the above arrangement, introducing the majority voting rule into the process of determining the target picture type improves the accuracy of the determined picture type.

Alternatively, according to an embodiment of the present invention, the type determining module 403 is further configured to:

Fig. 5 illustrates an exemplary system architecture 500 to which a method for determining a picture type or an apparatus for determining a picture type according to an embodiment of the present invention may be applied.

As shown in fig. 5, the system architecture 500 may include terminal devices 501, 502, 503, a network 504, and a server 505. The network 504 serves to provide a medium for communication links between the terminal devices 501, 502, 503 and the server 505. Network 504 may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few.

The user may use the terminal devices 501, 502, 503 to interact with a server 505 over a network 504 to receive or send messages or the like. The terminal devices 501, 502, 503 may have installed thereon various communication client applications, such as shopping-like applications, web browser applications, search-like applications, instant messaging tools, mailbox clients, social platform software, etc. (by way of example only).

The terminal devices 501, 502, 503 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smart phones, tablet computers, laptop portable computers, desktop computers, and the like.

The server 505 may be a server providing various services, such as a background management server (for example only) providing support for shopping websites browsed by users using the terminal devices 501, 502, 503. The background management server may analyze and otherwise process received data such as the target picture, and feed back the processing result (for example, the first pictures, the classification results, and the type of the target picture; by way of example only) to the terminal device.

It should be noted that the method for determining a picture type provided by the embodiment of the present invention is executed by the server 505 or a terminal device, and accordingly, the apparatus for determining a picture type is disposed in the server 505 or the terminal device.

It should be understood that the number of terminal devices, networks, and servers in fig. 5 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.

Referring now to FIG. 6, a block diagram of a computer system 600 suitable for use with a terminal device implementing an embodiment of the invention is shown. The terminal device shown in fig. 6 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present invention.

As shown in fig. 6, the computer system 600 includes a Central Processing Unit (CPU) 601 that can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 602 or a program loaded from a storage section 608 into a Random Access Memory (RAM) 603. In the RAM 603, various programs and data necessary for the operation of the system 600 are also stored. The CPU 601, ROM 602, and RAM 603 are connected to each other via a bus 604. An input/output (I/O) interface 605 is also connected to bus 604.

The following components are connected to the I/O interface 605: an input section 606 including a keyboard, a mouse, and the like; an output section 607 including a display such as a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), and a speaker; a storage section 608 including a hard disk and the like; and a communication section 609 including a network interface card such as a LAN card, a modem, or the like. The communication section 609 performs communication processing via a network such as the internet. A drive 610 is also connected to the I/O interface 605 as needed. A removable medium 611 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 610 as necessary, so that a computer program read out therefrom is installed into the storage section 608 as necessary.

In particular, according to the embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 609, and/or installed from the removable medium 611. The computer program performs the above-described functions defined in the system of the present invention when executed by the Central Processing Unit (CPU) 601.

It should be noted that the computer readable medium shown in the present invention can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present invention, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present invention, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.

The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a unit, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

The modules described in the embodiments of the present invention may be implemented by software or hardware. The described modules may also be provided in a processor, which may be described as: a processor includes a target picture acquisition module, a classification processing module, and a type determination module. The names of these modules do not limit the modules themselves in some cases, for example, the target picture acquiring module may also be described as "a module for acquiring a target picture and obtaining a plurality of first pictures based on processing pixels in the target picture".

As another aspect, the present invention also provides a computer-readable medium, which may be contained in the apparatus described in the above embodiments, or may exist separately without being incorporated into the apparatus. The computer-readable medium carries one or more programs which, when executed by a device, cause the device to perform: acquiring a target picture, and processing pixels in the target picture to obtain a plurality of first pictures; respectively extracting feature vectors in the plurality of first pictures based on a neural network model, inputting the feature vectors into a classification model for classification processing to obtain a plurality of classification results, wherein the plurality of classification results correspond to the plurality of first pictures; and determining the type of the target picture according to the plurality of classification results.

The above-described embodiments should not be construed as limiting the scope of the invention. Those skilled in the art will appreciate that various modifications, combinations, sub-combinations, and substitutions can occur, depending on design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims

1. A method for determining picture type, comprising:

respectively extracting feature vectors in the plurality of first pictures based on a neural network model, inputting the feature vectors into a classification model for classification processing to obtain a plurality of classification results, wherein the classification results correspond to the plurality of first pictures;

2. The method of claim 1, wherein the step of obtaining the plurality of first pictures by processing the pixels in the target picture comprises:

performing data enhancement processing on the target picture so that the positions of pixel points in the target picture are shifted, thereby obtaining the first picture, wherein the data enhancement processing comprises at least one of the following processing modes: rotation processing, reduction processing, enlargement processing, and translation processing.

3. The method according to claim 2, wherein when the data enhancement processing is at least one of rotation processing, reduction processing and translation processing, the step of obtaining the first picture based on the data enhancement processing on the target picture comprises:

determining a non-overlapping region corresponding to the target picture and the second picture;

and combining the adjusted non-overlapping area and the overlapping area of the second picture and the target picture to obtain the first picture.

4. The method according to claim 2, wherein when the data enhancement processing is an enlargement processing, the step of obtaining the first picture based on the data enhancement processing on the target picture comprises:

5. The method of claim 1, wherein the step of obtaining the plurality of first pictures by processing the pixels in the target picture further comprises:

changing, by filtering the pixels in the target picture, the pixel values corresponding to the pixel points in the target picture, thereby obtaining the first picture; wherein the filtering processing includes at least one of the following processing modes: box filtering processing, mean filtering processing, Gaussian filtering processing, median filtering processing, and bilateral filtering processing.

6. The method of claim 1, wherein the step of determining the type of the target picture according to the plurality of classification results comprises:

and determining the type of the target picture according to the classification results and the voting rule.

7. The method of claim 1, wherein the step of determining the type of the target picture according to the plurality of classification results further comprises:

and performing weighting processing on the plurality of classification results, and determining the type of the target picture according to the weighting result and a result quantity threshold.

8. An apparatus for determining picture type, comprising:

the classification processing module is used for respectively extracting the feature vectors in the plurality of first pictures based on a neural network model, inputting the feature vectors into a classification model for classification processing to obtain a plurality of classification results, and the classification results correspond to the plurality of first pictures;

9. An electronic device, comprising:

one or more processors;

a storage device for storing one or more programs,

wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-7.

10. A computer-readable medium, on which a computer program is stored, which, when being executed by a processor, carries out the method according to any one of claims 1-7.