CN114565800A - Method for detecting illegal picture and picture detection engine

Info

Publication number: CN114565800A
Application number: CN202210452325.8A
Authority: CN
Inventors: 邓小明
Original assignee: Shenzhen Shang Mi Network Technology Co ltd
Current assignee: Shenzhen Shang Mi Network Technology Co ltd
Priority date: 2022-04-24
Filing date: 2022-04-24
Publication date: 2022-05-31
Anticipated expiration: 2042-04-24
Also published as: CN114565800B

Abstract

The invention provides a method for detecting illegal pictures and a picture detection engine. The picture detection engine comprises a data interface module, a black and white list filtering module, a picture preprocessing module, a text recognition detection module, a theme detection module, a violation decision module, a user detection module, an engine database and an engine management module. The illegal picture detection method identifies and detects illegal pictures by means of the modules of the picture detection engine.

Description

Method for detecting illegal picture and picture detection engine

Technical Field

The invention relates to the technical field of computers, in particular to a method for detecting illegal pictures and a picture detection engine.

Background

As the number of network users grows, auditing and governing the content published by users of internet platforms becomes increasingly difficult. Information or content that violates the law or the regulations of the internet platform must be discovered and handled in time, so that user-published information does not cause adverse social influence or disrupt the normal operation of the platform. Internet platforms therefore need an efficient and accurate method for reviewing user-generated content.

In recent years, pictures have become one of the main forms in which users of internet platforms publish information, and the demand for picture violation detection keeps growing. Current methods for inspecting user-generated picture content include manual auditing, deep learning, picture clustering, picture character recognition and the like.

However, pictures cover rich themes and varied content, so picture violations take many forms. Traditional illegal picture detection methods work well for a single picture theme or violation type, but often miss or misjudge pictures with complex themes and content. Manual detection consumes a large amount of labor, and delayed manual detection often leads to adverse social influence.

Disclosure of Invention

In view of the above technical limitations, the invention provides a method for detecting illegal pictures and a picture detection engine.

To achieve this purpose, the invention adopts the following technical scheme:

An embodiment of the invention provides a method for detecting illegal pictures and a picture detection engine.

The picture detection engine comprises a data interface module, a black and white list filtering module, a picture preprocessing module, a text recognition detection module, a theme detection module, a violation decision module, a user detection module, an engine database and an engine management module.

The data interface module is used for acquiring the picture request data published by a user, acquiring user information data from an external database, and outputting the picture compliance inspection result. The black and white list filtering module is used for filtering users, ips and pictures against black and white lists. The picture preprocessing module is used for reading the picture data published by the user, converting the picture format, cropping and rotating the picture, and classifying the picture according to its content. The text recognition detection module is used for extracting the text content of pictures that contain text and performing text violation detection. The theme detection module is used for performing violation detection on pictures according to the associated theme types in the picture request data published by the user. The user detection module is used for calculating a user risk probability according to user behavior data.

The engine database is used for storing the data on which the picture detection engine depends, and comprises an illegal text database, an associated subject picture database and a black and white list database. The illegal text database stores illegal text keywords, the associated subject picture database stores illegal pictures and the subject labels of their associated subjects, and the black and white list database stores a user id black and white list, an ip black and white list and a picture black and white list.

The violation decision module is used for judging whether a picture violates the rules according to the results of the black and white list filtering module, the text recognition detection module, the theme detection module and the user detection module. The engine management module is used for optimizing the key parameters of the picture detection engine and the engine database.

The illegal picture detection method comprises the following steps:

step S1, the data interface module obtains the picture data published by the user, including user data, picture data and associated subject data;

step S2, the black and white list filtering module performs black and white list filtering on the picture data published by the user, and inputs the corresponding result into the violation decision module, which executes a first violation judgment operation to obtain a first violation judgment result; if the first violation judgment result indicates that the black and white list is hit, the first violation judgment result is output through the data interface module;

step S3, if the first violation judgment result indicates that the black and white list is not hit, the picture data published by the user is input into the picture preprocessing module for a picture preprocessing operation to obtain a picture preprocessing result; meanwhile, the user data in the picture data published by the user is input into the user detection module for a user detection operation to obtain a user detection result;

the picture preprocessing result comprises processed picture data and a picture classification result; the user detection result comprises a user risk probability value;

step S4, the following operation is performed according to the picture classification result in the picture preprocessing result: if the picture classification result is a picture containing text, the picture preprocessing result is input into the text recognition detection module for text detection to obtain a text violation detection result; if the picture classification result is a non-text picture, the picture preprocessing result is input into the theme detection module for theme violation detection to obtain a theme violation detection result;

step S5, the violation decision module makes a violation decision according to the user detection result, the text violation detection result and the theme violation detection result to obtain a violation judgment result, and the violation judgment result is output by the data interface module.
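The control flow of steps S2 to S5 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `detect` and all callable parameters are hypothetical stand-ins for the engine modules, and the classification label "T" (picture contains text) is the one defined later in the description.

```python
def detect(request, blackwhite, preprocess, user_risk,
           text_detect, topic_detect, decide):
    """Route one picture request through steps S2 to S5.

    Every callable argument is a hypothetical stand-in for an engine module.
    """
    # S2: black and white list filtering; "0" or "1" means a list was hit.
    first = blackwhite(request)
    if first in ("0", "1"):
        return first

    # S3: picture preprocessing and user detection (may run concurrently).
    processed, classes = preprocess(request["image"])
    risk = user_risk(request["user"])

    # S4: text pictures go to text detection, all others to theme detection.
    if "T" in classes:
        text_result, topic_risk = text_detect(processed), None
    else:
        text_result, topic_risk = None, topic_detect(processed, classes)

    # S5: final violation decision.
    return decide(risk, text_result, topic_risk)
```

Passing the modules in as callables keeps the sketch independent of how each module is actually built.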

Compared with the prior art, the invention has obvious advantages and beneficial effects. By means of the technical scheme, the method for detecting the illegal picture and the picture detection engine provided by the invention achieve considerable technical progress and practicability, have wide industrial utilization value and at least have the following advantages:

Picture filtering and picture enhancement during picture preprocessing improve the signal-to-noise ratio of the picture, which improves the efficiency and accuracy of picture violation detection. Picture classification detection routes each picture to a detection model suited to its subject content, which further improves the accuracy of picture violation detection.

The foregoing is only an overview of the technical solutions of the present invention. So that the technical means of the present invention can be understood more clearly and implemented in accordance with the description, and so that the above and other objects, features and advantages of the present invention become more apparent, preferred embodiments are described in detail below with reference to the accompanying drawings.

Drawings

Fig. 1 is a diagram of a picture detection engine for detecting illegal pictures according to an embodiment of the present invention.

Detailed Description

To further illustrate the technical means and effects of the present invention adopted to achieve the predetermined objects, the following detailed description will be made on a method for detecting an illegal picture and a picture detection engine according to the present invention with reference to the accompanying drawings and preferred embodiments.

Before discussing exemplary embodiments in more detail, it should be noted that some exemplary embodiments are described as processes or methods depicted as flowcharts. Although a flowchart may describe the steps as a sequential process, many of the steps can be performed in parallel, concurrently or simultaneously. In addition, the order of the steps may be rearranged. A process may be terminated when its operations are completed, but may have additional steps not included in the figure. A process may correspond to a method, a function, a procedure, a subroutine, a subprogram, etc.

The terms used in the embodiments of the present invention are explained as follows:

Picture filtering: suppressing picture noise while preserving as much picture detail as possible.

Picture enhancement: enhancing the useful information in an image so as to improve picture interpretation and recognition.

OCR: optical character recognition, a process in which an electronic device (e.g., a scanner or digital camera) examines characters printed on paper, determines their shapes by detecting dark and light patterns, and then translates the shapes into computer text by a character recognition method.

Implementations of the present invention are described below with reference to the foregoing terms:

Referring to fig. 1, the picture detection engine includes a data interface module, a black and white list filtering module, a picture preprocessing module, a text recognition detection module, a theme detection module, a violation decision module, a user detection module, an engine database, and an engine management module.

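The illegal picture detection method comprises the following steps:

step S1, the data interface module obtains the picture data published by the user, including user data, picture data and associated subject data;

step S2, the black and white list filtering module performs black and white list filtering on the picture data published by the user, and inputs the corresponding result into the violation decision module, which executes a first violation judgment operation to obtain a first violation judgment result; if the first violation judgment result indicates that the black and white list is hit, the first violation judgment result is output through the data interface module;

step S3, if the first violation judgment result indicates that the black and white list is not hit, the picture data published by the user is input into the picture preprocessing module for a picture preprocessing operation to obtain a picture preprocessing result; meanwhile, the user data in the picture data published by the user is input into the user detection module for a user detection operation to obtain a user detection result;

step S4, the following operation is performed according to the picture classification result in the picture preprocessing result: if the picture classification result is a picture containing text, the picture preprocessing result is input into the text recognition detection module for text detection to obtain a text violation detection result; if the picture classification result is a non-text picture, the picture preprocessing result is input into the theme detection module for theme violation detection to obtain a theme violation detection result;

step S5, the violation decision module makes a violation decision according to the user detection result, the text violation detection result and the theme violation detection result to obtain a violation judgment result, and the violation judgment result is output by the data interface module.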

As an embodiment, the picture data published by the user in step S1 is JSON of the form {"user": {"id": "", "ip": ""}, "image": {"data": "", "type": "", "size": ""}, "tag": []}. "user" refers to the user data, in the form of a dictionary, in which "id" is the user id and "ip" is the user login ip; "image" refers to the picture data, in the form of a dictionary, in which "data" is the picture coding string, "type" is the picture format, and "size" is the picture resolution information; "tag" refers to the subject labels associated with the picture, in the form of a list of strings. The picture formats comprise jpg, jpeg, png, bmp and svg; the picture coding string is the encoded string in which the picture data is stored.
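A request of this form could be parsed and checked as in the following sketch. The field names and the list of picture formats come from the description above; the example values, the "640x480" size notation and the function name `parse_request` are assumptions made for illustration only.

```python
import json

# Hypothetical example payload; every value is a placeholder.
RAW = """{
  "user":  {"id": "u-1001", "ip": "203.0.113.7"},
  "image": {"data": "aGVsbG8=", "type": "png", "size": "640x480"},
  "tag":   ["people"]
}"""

# Picture formats named in the description.
ALLOWED_TYPES = {"jpg", "jpeg", "png", "bmp", "svg"}


def parse_request(text):
    """Parse a picture-publishing request and check its basic shape."""
    req = json.loads(text)
    user, image, tags = req["user"], req["image"], req["tag"]
    if not {"id", "ip"} <= set(user):
        raise ValueError("user must carry id and ip")
    if not {"data", "type", "size"} <= set(image):
        raise ValueError("image must carry data, type and size")
    if image["type"].lower() not in ALLOWED_TYPES:
        raise ValueError("unsupported picture format")
    if not (isinstance(tags, list) and all(isinstance(t, str) for t in tags)):
        raise ValueError("tag must be a list of strings")
    return req


request = parse_request(RAW)
```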

As an embodiment, the black and white list filtering in step S2 includes user id black and white list filtering, user ip black and white list filtering, and picture black and white list filtering.

The user id black and white list filtering is realized as follows: the user id is used as a keyword to query the user id black and white list in the black and white list database, and the corresponding query result is output. The result is one of "0", "1" and "2", where "0" indicates that the user id is a white list id, "1" indicates that the user id is a black list id, and "2" indicates that there is no query result.

The user ip black and white list filtering is realized as follows: the user ip is used as a keyword to query the ip black and white list in the black and white list database, and the corresponding query result is output. The result is one of "0", "1" and "2", where "0" indicates that the user ip is a white list ip, "1" indicates that the user ip is a black list ip, and "2" indicates that there is no query result.

The picture black and white list filtering is realized as follows: the picture is converted into a gray-scale image and a hash operation is performed on it to obtain a picture key code; the picture key code is used as a keyword to query the picture black and white list in the black and white list database, and the corresponding query result is output. The result is one of "0", "1" and "2", where "0" indicates that the picture is a white list picture, "1" indicates that the picture is a black list picture, and "2" indicates that there is no query result. The hash operation adopts the MD5 algorithm; the picture black and white list stores black and white list picture key codes and black and white list identifications, and the stored picture key codes are likewise obtained by performing the hash operation after gray-scale conversion.
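The picture key code lookup can be sketched with the Python standard library as follows. The description only specifies gray-scale conversion followed by MD5; the gray-scale weights (ITU-R BT.601 luma), the in-memory dictionary standing in for the black and white list database, and all function names are assumptions.

```python
import hashlib


def to_gray(rgb_pixels):
    """Convert (r, g, b) tuples to 8-bit gray values (BT.601 weights, assumed)."""
    return bytes((299 * r + 587 * g + 114 * b) // 1000 for r, g, b in rgb_pixels)


def picture_key(rgb_pixels):
    """MD5 hex digest of the gray-scale pixel bytes, used as the lookup key."""
    return hashlib.md5(to_gray(rgb_pixels)).hexdigest()


def lookup(key, table):
    """Return "0" (white list), "1" (black list) or "2" (no query result)."""
    return table.get(key, "2")


# Hypothetical stand-in for the picture black and white list.
banned = [(255, 0, 0), (0, 255, 0), (0, 0, 255), (10, 10, 10)]
picture_list = {picture_key(banned): "1"}

hit = lookup(picture_key(banned), picture_list)        # black-listed picture
miss = lookup(picture_key([(1, 2, 3)]), picture_list)  # unknown picture
```

Because MD5 changes completely when any input byte changes, a key of this kind only matches pictures whose gray-scale pixel data is identical to a listed picture.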

As an example, the first violation judgment result in step S2 is a character string that characterizes whether the black and white list is hit. It is one of "0", "1" and "2", where "0" indicates that the black and white list is hit and the judgment result is the white list, "1" indicates that the black and white list is hit and the judgment result is the black list, and "2" indicates that the black and white list is not hit.

The first violation judgment operation is performed according to the following rules:

if, among the user id, user ip and picture black and white list filtering results, at least one is "0" and none is "1", the first violation judgment result is "0"; if any of the user id, user ip and picture black and white list filtering results is "1", the first violation judgment result is "1"; and if the user id, user ip and picture black and white list filtering results are all "2", the first violation judgment result is "2".
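The rule can be written as a small function. This sketch assumes the reading in which any black list hit takes precedence over a white list hit, which is the only reading under which the three rules cover every combination of results; the function name is hypothetical.

```python
def first_violation_judgment(id_result, ip_result, picture_result):
    """Combine the three filtering results ("0", "1" or "2") into one result."""
    results = (id_result, ip_result, picture_result)
    if "1" in results:   # any black list hit
        return "1"
    if "0" in results:   # a white list hit and no black list hit
        return "0"
    return "2"           # nothing was found in any list
```

For example, a white-listed user id combined with a black-listed picture yields "1", so a trusted account cannot be used to republish a known illegal picture.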

As an embodiment, the picture preprocessing operation in step S3 includes picture filtering, picture enhancement and picture classification detection, and specifically comprises the following steps: reading the input picture according to the picture coding mode and converting its color space into RGB space to obtain first picture data; performing picture filtering and picture enhancement processing on the first picture data to obtain second picture data; and performing picture classification detection on the second picture data to obtain picture classification data.

As an embodiment, the picture filtering is implemented by the following algorithm:

(1) the input picture is converted to grayscale, and a three-dimensional matrix is obtained according to the following mapping: [formula missing in source]

(2) from the three-dimensional matrix, a dimension-raised matrix and a weight matrix are obtained as follows: [formula missing in source]

(3) the filtered image is obtained as follows: [formula missing in source]

wherein interp() is an interpolation function, and the linearized spatial proximity factor and the gray similarity factor are calculated as follows: [formula missing in source]

wherein p = (i, j) is the central pixel point and q is a neighborhood pixel point of p; the remaining symbols denote, respectively, the gray values of pixel points p and q, the spatial distance between p and q, the gray distance between p and q, and the spatial distance standard deviation and gray distance standard deviation of the Gaussian function.
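The quantities named above (a spatial proximity factor and a gray similarity factor, each Gaussian with its own standard deviation) match the weights of a bilateral filter. The following sketch shows the direct, brute-force form of such a filter using the standard Gaussian weights. It is not the patent's exact formulation: the steps involving a dimension-raised matrix and interp() suggest an accelerated variant that is not reproduced here, and the parameter names `radius`, `sigma_s` and `sigma_r` are assumptions.

```python
import math


def bilateral_filter(gray, radius=1, sigma_s=1.0, sigma_r=25.0):
    """Brute-force bilateral filter on a 2-D list of gray values.

    sigma_s: standard deviation of the spatial distance (assumed name).
    sigma_r: standard deviation of the gray distance (assumed name).
    """
    h, w = len(gray), len(gray[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            num = den = 0.0
            for di in range(-radius, radius + 1):
                for dj in range(-radius, radius + 1):
                    y, x = i + di, j + dj
                    if not (0 <= y < h and 0 <= x < w):
                        continue  # ignore neighbours outside the picture
                    # Spatial proximity factor: near pixels weigh more.
                    spatial = math.exp(-(di * di + dj * dj) / (2 * sigma_s ** 2))
                    # Gray similarity factor: similar gray values weigh more.
                    diff = gray[i][j] - gray[y][x]
                    similar = math.exp(-(diff * diff) / (2 * sigma_r ** 2))
                    weight = spatial * similar
                    num += weight * gray[y][x]
                    den += weight
            out[i][j] = num / den  # den >= 1 because the centre has weight 1
    return out
```

With a small `sigma_r`, pixels across a strong edge receive almost no weight, so noise in flat regions is smoothed while edges are preserved.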

As an embodiment, the picture enhancement is implemented by the following algorithm:

the pixel at position (i, j) of the picture is transformed as follows to obtain the pixel at the corresponding position of the processed picture: [formula missing in source]

wherein depth represents the enhancement intensity of the picture; generally, depth = 2 is taken for medium enhancement and depth = 2.5 for strong enhancement.

As an example, the picture classification detection is performed by:

(1) performing picture feature extraction on the second picture data to obtain first picture feature data, and inputting the trained first picture classification model to obtain a first picture classification result; the first picture classification model is used for distinguishing whether the picture contains text or not; the first picture classification result is 'T' or 'N-T', the 'T' represents that the picture contains the text, and the 'N-T' represents that the picture does not contain the text;

(2) when the first picture classification result is "T", picture classification detection ends and a list containing the first picture classification result is output; when the first picture classification result is "N-T", the first picture feature data is input into a second picture classification model to obtain a second picture classification result, picture classification detection ends, and the first picture classification result and the second picture classification result are merged and output; the second picture classification model is used for identifying the theme labels related to the detected picture, and the second picture classification result is a list containing picture theme label character strings.
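The two-stage classification above can be sketched as follows. The two model arguments are hypothetical callables standing in for the trained first and second picture classification models; only the labels "T" and "N-T" and the list-shaped output come from the description.

```python
def classify_picture(features, contains_text, predict_topics):
    """Two-stage picture classification.

    contains_text:  hypothetical first model, returns True for text pictures.
    predict_topics: hypothetical second model, returns a list of theme labels.
    """
    first = "T" if contains_text(features) else "N-T"
    if first == "T":
        # Text pictures skip the second model entirely.
        return [first]
    # Non-text pictures: merge the first result with the theme labels.
    return [first] + list(predict_topics(features))
```

Running the second model only on non-text pictures avoids one model evaluation for every picture that is routed to text detection.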

The image feature extraction adopts an HOG algorithm, namely a histogram of oriented gradients algorithm, which is a mature technical means and is not described herein any more.

The first picture classification model is obtained by the following method: acquiring first model original data including picture data and a label of whether the picture contains a text or not in a manual screening mode; splitting the first model original data into a first model training set and a first model testing set; and training a first picture classification model through a first model training set by adopting a Logistic Regression algorithm (Logistic Regression), evaluating and optimizing by depending on a first model testing set, and outputting the first picture classification model meeting the requirements of recall rate and accuracy.

The Logistic Regression algorithm (Logistic Regression) is a mature technical means, and those skilled in the art can smoothly implement the above description, which is not described herein again.

The second image classification model is a convolutional neural network classifier, and the implementation method is a mature technical means, so that the detailed description is omitted; the classification result and the output identification of the second image classification model are shown in table 1:

TABLE 1 output identification corresponding to each classification result of the second picture classification model

As an example, the user detection operation in step S3 is performed by:

the user detection module performs feature extraction on the input user information and the equipment environment information to obtain user feature data, and inputs the user feature data to the trained user analysis model to obtain a user risk probability value.

The user detection result is a user risk probability value, which indicates whether the user who sent the picture publishing request poses a malicious publishing risk: 0 represents no violation, 1 represents violation, and values in between represent the likelihood of violation.

The user analysis model is obtained by the following method: carrying out data cleaning and feature extraction on an original user operation data set to obtain a user analysis model data set; splitting the user analysis model data set into a user analysis model training set and a user analysis model testing set; training a user analysis model by using a machine learning algorithm depending on a user analysis model training set, and evaluating the user analysis model by using a user analysis model test set; and adjusting parameters to continuously train the model until the recall rate and the accuracy rate meet preset threshold values, and outputting the user analysis model.
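The stopping condition of this training loop, checking recall and accuracy on the test set against preset thresholds, can be sketched as follows. The cutoff of 0.5 and the threshold values of 0.9 are hypothetical, as are the function names; the description does not state the values it uses.

```python
def recall_and_accuracy(labels, predictions):
    """Compute recall and accuracy for binary labels (1 = violation)."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    correct = sum(1 for y, p in pairs if y == p)
    recall = tp / (tp + fn) if tp + fn else 0.0
    accuracy = correct / len(pairs) if pairs else 0.0
    return recall, accuracy


def meets_thresholds(labels, probabilities, cutoff=0.5,
                     min_recall=0.9, min_accuracy=0.9):
    """Return True when the model output satisfies both preset thresholds."""
    predictions = [1 if p >= cutoff else 0 for p in probabilities]
    recall, accuracy = recall_and_accuracy(labels, predictions)
    return recall >= min_recall and accuracy >= min_accuracy
```

Training would continue with adjusted parameters for as long as `meets_thresholds` returns False on the test set.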

The original user operation data set is obtained directly from a user database external to the picture detection engine and includes, but is not limited to, the following data fields: operation object, operation type, operation time, login ip address at the time of operation, violation identification, violation type label and violation time.

It is understood that the machine learning algorithms that may be adopted in training the user analysis model include the logistic regression algorithm, decision trees, genetic algorithms, support vector machines (SVM), the K-means algorithm, random forests and the naive Bayes algorithm. The program design differs depending on the algorithm adopted, but all of these are mature technical means that a person skilled in the art can implement from the description of the above embodiments, so details are not repeated herein.

As an embodiment, the text recognition detection module performs text detection by:

(1) performing text recognition and extraction on an input picture to obtain text content to be detected;

(2) detecting the text to be detected against preset violation text rules by regular expression matching, wherein if the matching succeeds, the text detection result is "violation"; if nothing matches, performing word segmentation on the text to be detected and removing safe words to obtain a keyword list;

(3) querying the illegal text database by taking character strings in the keyword list as keywords, and if the keywords hit the illegal text database, determining that the text detection result is illegal; and if the keyword does not hit the illegal text database, the text detection result is safe.

The text violation detection result includes "0", "1", "0" indicates "safe", and "1" indicates "violation".
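Steps (2) and (3) can be sketched as follows, taking the recognized text as input. The regular expression rule, the safe word list and the keyword set are hypothetical placeholders for the preset rules and the illegal text database, and the simple `\w+` tokenizer stands in for a real word segmentation step (Chinese text would need a dedicated segmenter).

```python
import re

# Hypothetical preset rule, safe words and database stand-in.
VIOLATION_PATTERNS = [re.compile(r"\bfree\s+casino\b", re.IGNORECASE)]
SAFE_WORDS = {"the", "a", "of", "and"}
VIOLATION_KEYWORDS = {"contraband"}


def detect_text(text):
    """Return "1" (violation) or "0" (safe) for recognized picture text."""
    # Step (2a): regular expression matching against the preset rules.
    if any(p.search(text) for p in VIOLATION_PATTERNS):
        return "1"
    # Step (2b): word segmentation, then removal of safe words.
    tokens = re.findall(r"\w+", text.lower())
    keywords = [t for t in tokens if t not in SAFE_WORDS]
    # Step (3): keyword lookup in the illegal text database.
    return "1" if any(k in VIOLATION_KEYWORDS for k in keywords) else "0"
```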

The text recognition and extraction can be realized by using an OCR algorithm adopting a "CNN + BLSTM + CTC" architecture, which is a mature technical means, and a person skilled in the art can completely and smoothly realize the algorithm according to the description of the above embodiment, and details are not described herein.

As an example, the topic detection module implements topic violation detection by:

(1) matching different detection models according to the picture classification results obtained by picture preprocessing, inputting the picture data into the corresponding detection models for detection to obtain the violation detection results of the corresponding models, and assembling the violation detection results into a violation detection result list; wherein each element in the violation detection result list is a two-element tuple consisting of the corresponding absolute risk factor and the violation detection result;

(2) calculating a topic violation risk probability value from the violation detection result list as follows: [formula missing in source]

wherein M is the set of detection models corresponding to the classification result of the input picture, and the remaining two symbols denote the absolute risk factor of a detection model and the detection result of that detection model; the detection models and risk factors corresponding to each picture classification result are shown in table 2.

Table 2 detection model, output result, and absolute risk factor corresponding to each classification result

Wherein the output result comprises "0", "1"; a "0" is used to characterize the picture as not violating the rule, and a "1" is used to characterize the picture as violating the rule.

As an embodiment, the first topic detection model is obtained by:

obtaining first subject original data related to subjects of people and human body parts in a manual screening mode, wherein the first subject original data comprise picture data and a label for judging whether the picture violates rules or not; splitting the first subject original data into a first subject model training set and a first subject model testing set; and training a first topic detection model through a first topic model training set by adopting a Logistic Regression algorithm (Logistic Regression), evaluating and optimizing by means of a first topic model test set, and outputting the first topic detection model meeting the requirements of recall rate and accuracy.

The training methods of the second theme detection model, the third theme detection model, the fourth theme detection model and the fifth theme detection model are similar to the training method of the first theme detection model, and can be realized only by replacing the original training picture set with the corresponding theme picture, which is not repeated herein.

As an example, the flag filter is implemented by:

(1) cutting the picture data into a plurality of image blocks through a Normalized cutting algorithm (Normalized-Cut);

(2) respectively carrying out picture similarity calculation on each image block and a picture with a subject label of 'Logo' in an associated subject picture database to obtain similarity calculation results, and assembling all the similarity calculation results into a similarity numerical value list; the picture similarity calculation adopts a Hamming distance similarity calculation method;

(3) if the maximum value in the similarity numerical value list is larger than the similarity threshold, the flag filter outputs "1"; otherwise it outputs "0".

Preferably, the accuracy of the detection result is better when the similarity threshold is 70%.
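Steps (2) and (3) can be sketched as follows. The description names only Hamming distance as the similarity measure; representing each image block and each "Logo" picture as a 64-bit fingerprint is an assumption, as are the function names. The Normalized-Cut segmentation of step (1) is not shown, so the block fingerprints are taken as given.

```python
def hamming_similarity(a, b, bits=64):
    """Similarity in [0, 1]: share of fingerprint bits that agree."""
    return 1.0 - bin(a ^ b).count("1") / bits


def flag_filter(block_hashes, logo_hashes, threshold=0.70):
    """Return "1" if any image block resembles a stored "Logo" picture."""
    scores = [hamming_similarity(block, logo)
              for block in block_hashes
              for logo in logo_hashes]
    # Only the best match matters; it must exceed the similarity threshold.
    return "1" if scores and max(scores) > threshold else "0"
```

The default threshold of 0.70 mirrors the preferred 70% value stated above.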

As an example, the violation decision module in step S5 makes the violation decision by the following rule:

if the first violation judgment result output by the black and white list filtering module indicates a black list hit, the violation decision result is "1"; if the first violation judgment result output by the black and white list filtering module indicates a white list hit, the violation decision result is "0";

if the first violation judgment result represents that the black-and-white list is not hit, performing violation decision according to the detection result of the text recognition detection module or the theme detection module: when the text violation detection result of the text recognition detection module is '1', the violation decision result is '1'; when the text violation detection result of the text recognition detection module is '0', if the topic violation risk probability value of the topic detection module is greater than the preset topic risk probability threshold value, the violation decision result is '1', otherwise, the violation decision result is '0'; and when no text violation detection result exists and the topic violation risk probability value of the topic detection module is greater than the preset topic risk probability threshold, the violation decision result is '1', otherwise, the violation decision result is '0'.
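The decision rules above can be sketched as a single function. The default threshold of 0.5 is hypothetical, since the description only refers to a preset topic risk probability threshold; `None` is used here to mark a detection result that does not exist for the picture.

```python
def violation_decision(first_result, text_result=None, topic_risk=None,
                       topic_threshold=0.5):
    """Return "1" (violation) or "0" (no violation).

    first_result: "0" white list hit, "1" black list hit, "2" no hit.
    text_result:  "0", "1", or None when no text detection was run.
    topic_risk:   topic violation risk probability, or None when absent.
    """
    if first_result == "1":
        return "1"
    if first_result == "0":
        return "0"
    # No list was hit: fall back to the text and theme detection results.
    if text_result == "1":
        return "1"
    if topic_risk is not None and topic_risk > topic_threshold:
        return "1"
    return "0"
```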

The picture detection engine also comprises an engine management module, which is used for supporting the key parameter optimization of the picture detection engine.

The key parameter optimization means that operation and maintenance personnel of the picture detection engine add, modify and delete data in the illegal text database, the associated subject picture database and the black and white list database according to business needs, through a database operation interface provided by the engine management module.

The present invention can take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media having computer-usable program code embodied therein, the storage media being non-transitory (including, but not limited to, disk storage, CD-ROM, optical storage, and the like).

Finally, it is noted that, in this document, relational terms such as first and second are used solely to distinguish one entity or action from another entity or action, without necessarily requiring or implying any actual relationship or order between such entities or actions. Also, the terms "include", "including" or any other variations thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or terminal device including a series of elements includes not only those elements but also other elements not explicitly listed or inherent to such process, method, article, or terminal device. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in a process, method, article, or terminal device that comprises the element.

Although the present invention has been described with reference to a preferred embodiment, it should be understood that various changes, substitutions and alterations can be made herein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims

1. A method for illegal picture detection,

the illegal picture detection method comprises the following steps:

step S5, the violation decision module makes violation decisions according to the user detection result, the text violation detection result and the subject violation detection result to obtain violation judgment results, and the violation judgment results are output by the data interface module;

in step S2, the black and white list filtering includes user id black and white list filtering, user ip black and white list filtering, and picture black and white list filtering.

2. The method of claim 1,

the text detection is realized by depending on an illegal text database, and the illegal text database stores illegal text keywords;

the topic violation detection is realized by depending on an associated topic picture database, and the associated topic picture database stores violation pictures and topic labels of associated topics;

the black and white list filtering depends on a black and white list database, and the black and white list database is used for storing a user id black and white list, an ip black and white list and a picture black and white list.

3. The method of claim 1,

the black and white list filtering of the user id is realized as follows: the user id black and white list in the black and white list database is queried with the user id as the keyword, and the corresponding query result is output, the result being one of '0', '1' and '2', where '0' indicates that the user id is a white list id, '1' indicates that the user id is a black list id, and '2' indicates that there is no query result;

the black and white list filtering of the user ip is realized as follows: the ip black and white list in the black and white list database is queried with the user ip as the keyword, and the corresponding query result is output, the result being one of '0', '1' and '2', where '0' indicates that the user ip is a white list ip, '1' indicates that the user ip is a black list ip, and '2' indicates that there is no query result;

the black and white list filtering of the picture is realized as follows: the picture is converted into a gray-scale image and a hash operation is performed to obtain a picture key code; the picture black and white list in the black and white list database is queried with the picture key code as the keyword, and the corresponding query result is output, the result being one of '0', '1' and '2', where '0' indicates that the picture is a white list picture, '1' indicates that the picture is a black list picture, and '2' indicates that there is no query result; wherein the hash operation adopts the MD5 algorithm; the picture black and white list stores black and white list picture key codes and black and white list identifications, and the black and white list picture key codes are obtained by performing the hash operation after gray-scale conversion.
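The three lookups described in claim 3 can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the claimed implementation: the in-memory dictionaries stand in for the black and white list database, the names `query_list`, `picture_key_code` and `filter_request` are hypothetical, the picture is assumed to arrive as a list of RGB tuples, and the gray-scale weights (ITU-R BT.601 luma) are an assumption because the claim only requires a gray-scale conversion before the MD5 hash.

```python
import hashlib

WHITE, BLACK, NOT_FOUND = "0", "1", "2"


def query_list(table, key):
    """Return '0' (white list), '1' (black list) or '2' (no query result)."""
    return table.get(key, NOT_FOUND)


def picture_key_code(rgb_pixels):
    """Convert RGB pixels to gray scale, then hash the gray bytes with MD5."""
    # Assumption: BT.601 luma weights; the claim does not fix the conversion.
    gray = bytes((299 * r + 587 * g + 114 * b) // 1000 for r, g, b in rgb_pixels)
    return hashlib.md5(gray).hexdigest()


# Hypothetical in-memory stand-ins for the black and white list database.
user_id_list = {"u-1001": WHITE, "u-6666": BLACK}
ip_list = {"10.0.0.8": BLACK}
listed_pixels = [(255, 0, 0), (0, 255, 0), (0, 0, 255), (10, 10, 10)]
picture_list = {picture_key_code(listed_pixels): BLACK}


def filter_request(user_id, user_ip, rgb_pixels):
    """Run the user id, user ip and picture lookups for one request."""
    return {
        "user_id": query_list(user_id_list, user_id),
        "user_ip": query_list(ip_list, user_ip),
        "picture": query_list(picture_list, picture_key_code(rgb_pixels)),
    }
```

Because MD5 is an exact hash, this kind of lookup matches only pictures whose gray-scale bytes are identical to a listed picture; any change to the pixels produces a different key code.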

4. The method of claim 1,

the picture preprocessing operation in step S3 includes picture filtering, picture enhancement and picture classification detection, and specifically comprises the following steps: reading an input picture according to its picture coding mode and converting its color space into the RGB space to obtain first picture data; performing picture filtering and picture enhancement processing on the first picture data to obtain second picture data; and performing picture classification detection on the second picture data to obtain picture classification data.

5. The method of claim 4,

the picture filtering is realized by the following algorithm: converting an input picture into a gray-scale image and raising its dimension according to a preset mapping mode to obtain a three-dimensional matrix; obtaining a dimension-raised matrix IX and a weight matrix EX from the three-dimensional matrix according to a preset transformation mode; and obtaining a filtered image through spatial interpolation;

the picture enhancement is realized by an algorithm:

for the pixel point at position (i, j) of the picture:

where depth represents the picture enhancement strength, depth = 2 for mid-range enhancement, and depth = 2.5 for enhancement.

6. The method of claim 4,

the picture classification detection is carried out in the following way:

performing picture feature extraction on the second picture data to obtain first picture feature data, and inputting the trained first picture classification model to obtain a first picture classification result; the first picture classification model is used for distinguishing whether the picture contains text or not; the first picture classification result is 'T' or 'N-T', the 'T' represents that the picture contains the text, and the 'N-T' represents that the picture does not contain the text;

when the first picture classification result is 'T', finishing the picture classification detection and outputting a list containing the first picture classification result; when the first picture classification result is 'N-T', inputting the first picture feature data into a second picture classification model to obtain a second picture classification result, finishing the picture classification detection, and merging and outputting the first picture classification result and the second picture classification result; the second picture classification model is used for identifying the topic labels related to the detected picture, and the second picture classification result is a list containing picture topic label character strings.
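The two-stage flow of claim 6 can be sketched as follows. This is a minimal illustration, assuming that both models are supplied as plain callables and that "merging" the two results means concatenating them into one list; the function name `classify_picture` and the stand-in models are hypothetical.

```python
from typing import Callable, List, Sequence

Features = Sequence[float]


def classify_picture(
    features: Features,
    has_text_model: Callable[[Features], bool],
    topic_model: Callable[[Features], List[str]],
) -> List[str]:
    """Return ['T'] for text pictures, or ['N-T', <topic labels>...] otherwise."""
    # Stage 1: the first model decides whether the picture contains text.
    first = "T" if has_text_model(features) else "N-T"
    if first == "T":
        return [first]
    # Stage 2: only pictures without text go to the topic model.
    # Assumption: merging is done by list concatenation.
    return [first] + topic_model(features)


# Hypothetical stand-in models, for illustration only.
def demo_has_text(features: Features) -> bool:
    return features[0] > 0.5


def demo_topics(features: Features) -> List[str]:
    return ["topic_a", "topic_b"]


text_result = classify_picture([0.9], demo_has_text, demo_topics)
no_text_result = classify_picture([0.1], demo_has_text, demo_topics)
```

Both stages read the same first picture feature data, so the features are extracted once and the second model is skipped entirely for pictures that contain text.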

7. The method of claim 6,

the first picture classification model is obtained by the following method: acquiring, by manual screening, first model original data comprising picture data and labels indicating whether each picture contains text; splitting the first model original data into a first model training set and a first model testing set; training the first picture classification model on the first model training set using a logistic regression algorithm, evaluating and optimizing it with the first model testing set, and outputting the first picture classification model that meets the recall rate and accuracy requirements;

the second picture classification model is a convolutional neural network classifier.
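The training procedure for the first picture classification model (split, logistic regression, evaluation by recall rate and accuracy) can be sketched in plain Python. Everything concrete here is an assumption made for illustration: the two-dimensional synthetic features stand in for manually labelled picture features, the 80/20 split, learning rate and epoch count are arbitrary, and the function names are hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic_regression(X, y, lr=0.5, epochs=500):
    """Fit weights and bias by batch gradient descent on the log loss."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, xi):
    """Return 1 (contains text) or 0 (no text) at a 0.5 threshold."""
    return 1 if sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) >= 0.5 else 0


def recall_and_accuracy(w, b, X, y):
    preds = [predict(w, b, xi) for xi in X]
    tp = sum(1 for p, t in zip(preds, y) if p == 1 and t == 1)
    fn = sum(1 for p, t in zip(preds, y) if p == 0 and t == 1)
    recall = tp / (tp + fn) if tp + fn else 0.0
    accuracy = sum(1 for p, t in zip(preds, y) if p == t) / len(y)
    return recall, accuracy


# Synthetic stand-in for the manually screened "first model original data".
rng = random.Random(0)
data = []
for _ in range(200):
    label = rng.randint(0, 1)  # 1 = picture contains text
    center = 1.0 if label else -1.0
    data.append(([rng.gauss(center, 0.5), rng.gauss(center, 0.5)], label))

# Split into the first model training set and the first model testing set.
split = int(0.8 * len(data))
train, test = data[:split], data[split:]

w, b = train_logistic_regression([x for x, _ in train], [t for _, t in train])
recall, accuracy = recall_and_accuracy(w, b, [x for x, _ in test], [t for _, t in test])
```

In practice the model would be released only when `recall` and `accuracy` on the testing set reach the required thresholds; the claim does not state those thresholds, so none are fixed here.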

8. The method of claim 1,

the user detection operation in step S3 is performed by:

9. The method of claim 1,

the theme detection module realizes theme violation detection in the following way:

matching different detection models according to the picture classification result obtained by picture preprocessing, inputting picture data into the corresponding detection model for detection to obtain violation detection results of the corresponding model, and splicing the violation detection results into a violation detection result list;

and calculating a topic violation risk probability value according to the violation detection result list, wherein the method comprises the following steps:

where one term is the absolute risk factor of the detection model, and

the other term is the detection result of the corresponding detection model.
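The first part of claim 9, matching detection models to the picture classification result and splicing their outputs into a list, can be sketched as below. The sketch covers only that matching and splicing step; the topic violation risk probability formula does not survive in the text and is not implemented. The registry `detectors`, the label names and the stand-in detector functions are hypothetical.

```python
from typing import Callable, Dict, List

# A detection model is assumed to map picture bytes to a numeric detection result.
Detector = Callable[[bytes], float]


def run_matched_detectors(
    picture: bytes,
    labels: List[str],
    detectors: Dict[str, Detector],
) -> List[float]:
    """Run the detector registered for each label and splice the results."""
    results = []
    for label in labels:
        detector = detectors.get(label)
        if detector is not None:  # labels without a registered model are skipped
            results.append(detector(picture))
    return results


# Hypothetical stand-in detection models keyed by topic label.
detectors: Dict[str, Detector] = {
    "topic_a": lambda picture: 0.2,
    "topic_b": lambda picture: 0.7,
}

violation_results = run_matched_detectors(
    b"\x00\x01", ["N-T", "topic_a", "topic_b"], detectors
)
```

Keeping the models in a label-keyed registry means that a new topic can be supported by registering one more detector, without changing the dispatch code.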

10. A picture detection engine for violation picture detection,

the picture detection engine comprises a data interface module, a black and white list filtering module, a picture preprocessing module, a text recognition detection module, a theme detection module, a violation decision module, a user detection module, an engine database and an engine management module;

the data interface module is used for acquiring the picture request data issued by a user, acquiring user information data from an external database, and outputting a picture compliance inspection result;

the black and white list filtering module is used for performing black and white list filtering on users, ips and pictures;

the picture preprocessing module is used for reading the picture data issued by the user, performing picture format conversion, performing picture cropping and rotation, and classifying the pictures according to their content;

the text recognition detection module is used for extracting the text content of pictures that contain text and performing text violation detection;

the theme detection module is used for performing violation detection on the pictures according to the associated theme types in the picture request data issued by the user;

the user detection module is used for calculating a user risk probability according to user behavior data;

the engine database is used for storing the data on which the picture detection engine depends, and comprises a violation text database, an associated subject picture database and a black and white list database;

the violation decision module is used for judging whether a picture violates the rules according to the results of the black and white list filtering module, the text recognition detection module, the theme detection module and the user detection module;

the engine management module is used for supporting the optimization of key parameters of the picture detection engine.