CN111611828A - Abnormal image recognition method and device, electronic equipment and storage medium - Google Patents

Abnormal image recognition method and device, electronic equipment and storage medium

Info

Publication number
CN111611828A
CN111611828A (application CN201910140017.XA)
Authority
CN
China
Prior art keywords
image
character recognition
recognized
character
recognition result
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910140017.XA
Other languages
Chinese (zh)
Inventor
王闾威
赵元
沈海峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Didi Infinity Technology and Development Co Ltd
Original Assignee
Beijing Didi Infinity Technology and Development Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Didi Infinity Technology and Development Co Ltd filed Critical Beijing Didi Infinity Technology and Development Co Ltd
Priority to CN201910140017.XA priority Critical patent/CN111611828A/en
Publication of CN111611828A publication Critical patent/CN111611828A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/60Type of objects
    • G06V20/62Text, e.g. of license plates, overlay texts or captions on TV images
    • G06V20/63Scene text, e.g. street names
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Multimedia (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Human Computer Interaction (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The application provides an abnormal image recognition method, an abnormal image recognition apparatus, an electronic device, and a storage medium. The method comprises: acquiring an image to be recognized; performing person recognition on the image to be recognized to obtain a person recognition result; performing text recognition on the image to be recognized to obtain a text recognition result; and determining whether the image to be recognized is an abnormal image based on the person recognition result and the text recognition result. By recognizing whether an image is abnormal according to the persons, text, and other content it contains, the method and apparatus improve the accuracy of recognizing abnormal images.

Description

Abnormal image recognition method and device, electronic equipment and storage medium
Technical Field
The present application relates to the field of security technologies, and in particular, to an abnormal image recognition method, an abnormal image recognition apparatus, an electronic device, and a storage medium.
Background
Today, information is easier than ever to disseminate and exchange, and at the same time various kinds of abnormal information are increasingly widespread. Since images are an important medium for conveying information, how to recognize images carrying abnormal information, that is, abnormal images, has drawn growing attention.
In the prior art, a large number of abnormal images are usually collected in advance; when an image is to be recognized, it is compared against the stored abnormal images, and whether it is an abnormal image is judged from the similarity between them. However, this method considers only the similarity between images, and its recognition accuracy is low.
Disclosure of Invention
In view of the above, an object of the embodiments of the present application is to provide an abnormal image recognition method, an abnormal image recognition apparatus, an electronic device, and a storage medium, which can recognize whether an image is an abnormal image according to the persons, text, and other content included in the image, thereby improving the accuracy of recognizing abnormal images.
According to an aspect of the present application, there is provided an abnormal image recognition method including:
acquiring an image to be identified;
performing person recognition on the image to be recognized to obtain a person recognition result, and performing text recognition on the image to be recognized to obtain a text recognition result;
and determining whether the image to be recognized is an abnormal image based on the person recognition result and the text recognition result.
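The three claimed steps can be sketched as a short pipeline. This is an illustrative sketch only: the function names, the stubbed branch outputs, and the averaging fusion rule are assumptions, since the claims leave the recognition models and the combination method open.

```python
import numpy as np

def recognize_persons(image):
    # Hypothetical stub standing in for the person recognition branch;
    # a real system would run a person target positioning network here.
    return {"normal": 0.3, "abnormal": 0.7}

def recognize_text(image):
    # Hypothetical stub standing in for the text recognition branch.
    return {"normal": 0.6, "abnormal": 0.4}

def is_abnormal_image(image, threshold=0.5):
    # The three claimed steps: acquire the image, run both recognitions,
    # then decide from the two results (here: a simple average, an assumption).
    person_result = recognize_persons(image)
    text_result = recognize_text(image)
    p_abnormal = (person_result["abnormal"] + text_result["abnormal"]) / 2
    return p_abnormal >= threshold

image = np.zeros((64, 64, 3))  # placeholder for the acquired image
print(is_abnormal_image(image))  # True: (0.7 + 0.4) / 2 = 0.55 >= 0.5
```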
Optionally, the performing person recognition on the image to be recognized to obtain a person recognition result, and performing text recognition on the image to be recognized to obtain a text recognition result, includes:
extracting image features of the image to be recognized to obtain a feature map corresponding to the image to be recognized;
and performing person recognition on the feature map to obtain the person recognition result, and performing text recognition on the feature map to obtain the text recognition result.
Optionally, the extracting the image feature of the image to be recognized to obtain a feature map corresponding to the image to be recognized includes:
extracting the feature map from the image to be recognized through a shared convolutional neural network.
Optionally, the performing person recognition on the feature map to obtain the person recognition result includes:
performing person recognition on the feature map through a person target positioning network to obtain the person recognition result.
Optionally, the person target positioning network includes a first convolution kernel. A convolution kernel is a two-dimensional matrix used to extract features from an image; the aspect ratio (rows to columns) of the first convolution kernel is greater than or equal to 1/2 and less than or equal to 2.
Optionally, the performing text recognition on the feature map to obtain the text recognition result includes:
locating a text region by recognizing the feature map through a text target positioning network;
and recognizing the text region to obtain the text recognition result.
Optionally, the text target positioning network includes a second convolution kernel. A convolution kernel is a two-dimensional matrix used to extract features from an image; the aspect ratio (rows to columns) of the second convolution kernel is less than 1/2 or greater than 2.
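The aspect-ratio constraints in the two kernel claims can be checked in a few lines. A hedged sketch: the function names are illustrative, and reading "row-column ratio" as rows divided by columns is an assumption.

```python
def kernel_suits_person_branch(rows, cols):
    # Claimed constraint on the first kernel: 1/2 <= rows/cols <= 2,
    # i.e. roughly square, matching compact person regions.
    ratio = rows / cols
    return 0.5 <= ratio <= 2

def kernel_suits_text_branch(rows, cols):
    # Claimed constraint on the second kernel: rows/cols < 1/2 or > 2,
    # i.e. elongated, matching horizontal or vertical lines of text.
    ratio = rows / cols
    return ratio < 0.5 or ratio > 2

print(kernel_suits_person_branch(3, 3))  # True: square kernel
print(kernel_suits_text_branch(1, 7))    # True: 1/7 < 1/2, elongated kernel
print(kernel_suits_text_branch(3, 3))    # False: not elongated
```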
Optionally, the recognizing the text region includes:
recognizing the text region through a text recognition neural network.
Optionally, the determining whether the image to be recognized is an abnormal image based on the person recognition result and the text recognition result includes:
inputting the person recognition result and the text recognition result into a probabilistic neural network;
and determining, through the probabilistic neural network, the probability that the image to be recognized is an abnormal image.
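"Probabilistic neural network" commonly refers to a Parzen-window classifier in which each training sample contributes a Gaussian kernel and the class with the highest average density wins. The sketch below, under that assumption, classifies a hypothetical two-dimensional input built from the two branches' abnormal probabilities; the training data and the smoothing parameter sigma are invented for illustration.

```python
import numpy as np

def pnn_predict(x, train_x, train_y, sigma=0.5):
    # Minimal probabilistic neural network: for each class, average the
    # Gaussian kernels centered at that class's training samples.
    classes = sorted(set(train_y))
    scores = []
    for c in classes:
        members = train_x[train_y == c]
        d2 = ((members - x) ** 2).sum(axis=1)
        scores.append(np.exp(-d2 / (2 * sigma ** 2)).mean())
    scores = np.array(scores)
    probs = scores / scores.sum()  # normalized class probabilities
    return classes[int(np.argmax(scores))], probs

# Hypothetical inputs: [P(person branch abnormal), P(text branch abnormal)].
train_x = np.array([[0.9, 0.8], [0.8, 0.9], [0.1, 0.2], [0.2, 0.1]])
train_y = np.array([1, 1, 0, 0])  # 1 = abnormal image, 0 = normal image

label, probs = pnn_predict(np.array([0.85, 0.75]), train_x, train_y)
print(label)  # 1: both branches report high abnormal probability
```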
According to another aspect of the present application, there is provided an abnormal image recognition apparatus including:
the acquisition module is used for acquiring an image to be recognized;
the recognition module is used for performing person recognition on the image to be recognized to obtain a person recognition result and performing text recognition on the image to be recognized to obtain a text recognition result;
and the determining module is used for determining whether the image to be recognized is an abnormal image based on the person recognition result and the text recognition result.
Optionally, the identification module is specifically configured to:
extracting image features of the image to be recognized to obtain a feature map corresponding to the image to be recognized;
and performing person recognition on the feature map to obtain the person recognition result, and performing text recognition on the feature map to obtain the text recognition result.
Optionally, the identification module is specifically configured to:
extracting the feature map from the image to be recognized through a shared convolutional neural network.
Optionally, the identification module is specifically configured to:
performing person recognition on the feature map through a person target positioning network to obtain the person recognition result.
Optionally, the person target positioning network includes a first convolution kernel. A convolution kernel is a two-dimensional matrix used to extract features from an image; the aspect ratio (rows to columns) of the first convolution kernel is greater than or equal to 1/2 and less than or equal to 2.
Optionally, the identification module is specifically configured to:
locating a text region by recognizing the feature map through a text target positioning network;
and recognizing the text region to obtain the text recognition result.
Optionally, the text target positioning network includes a second convolution kernel. A convolution kernel is a two-dimensional matrix used to extract features from an image; the aspect ratio (rows to columns) of the second convolution kernel is less than 1/2 or greater than 2.
Optionally, the identification module is specifically configured to:
recognizing the text region through a text recognition neural network.
Optionally, the determining module is specifically configured to:
inputting the person recognition result and the text recognition result into a probabilistic neural network;
and determining, through the probabilistic neural network, the probability that the image to be recognized is an abnormal image.
According to another aspect of the present application, there is provided an electronic device including a processor, a storage medium, and a bus. The storage medium stores machine-readable instructions executable by the processor; when the electronic device operates, the processor and the storage medium communicate via the bus, and the processor executes the machine-readable instructions to perform the steps of the abnormal image recognition method described above.
According to another aspect of the present application, there is provided a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the abnormal image recognition method set forth above.
In the embodiments of the present application, whether an image is abnormal depends on the information it carries. Therefore, an image to be recognized is acquired, person recognition and text recognition are performed on it to obtain a person recognition result and a text recognition result, and whether the image is an abnormal image is determined based on these two results. That is, the image to be recognized is recognized based on the persons, text, and other information it contains, which improves recognition accuracy.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings required by the embodiments are briefly described below. It should be understood that the following drawings show only some embodiments of the present application and therefore should not be regarded as limiting its scope; those of ordinary skill in the art can derive other related drawings from them without inventive effort.
Fig. 1 illustrates a block diagram of an electronic device provided by an embodiment of the present application;
FIG. 2 is a flow chart illustrating a method for recognizing abnormal images according to an embodiment of the present disclosure;
FIG. 3 is a flow chart of another abnormal image recognition method provided by the embodiment of the application;
FIG. 4 is a block diagram illustrating abnormal image recognition provided by an embodiment of the present application;
fig. 5 shows a schematic structural diagram of an electronic device provided in an embodiment of the present application.
Detailed Description
In order to make the purpose, technical solutions, and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments are described below with reference to the accompanying drawings. It should be understood that the drawings in the present application are for illustrative and descriptive purposes only and are not used to limit its scope of protection, and that the schematic drawings are not necessarily drawn to scale. The flowcharts used in this application illustrate operations implemented according to some embodiments; the operations of a flowchart may be performed out of order, and steps with no logical dependency may be performed in reverse order or simultaneously. Under the guidance of this application, one skilled in the art may add one or more other operations to a flowchart, or remove one or more operations from it.
In addition, the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. The components of the embodiments of the present application, generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present application, presented in the accompanying drawings, is not intended to limit the scope of the claimed application, but is merely representative of selected embodiments of the application. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present application without making any creative effort, shall fall within the protection scope of the present application.
One aspect of the present application relates to an abnormal image recognition method. The method recognizes the persons and text included in an image to be recognized, and determines whether the image is an abnormal image according to the person and text recognition results.
It should be noted that, before the present application, whether an image to be recognized is normal was generally determined from its similarity to a large number of abnormal images collected in advance; because only the similarity between images is considered, the recognition accuracy is low. The abnormal image recognition method provided by the present application, in contrast, judges whether the image to be recognized is a normal image according to the persons, text, and other content it includes, and achieves high recognition accuracy, especially for abnormal images related to human behavior, such as images containing pornographic, violent, or terrorist content.
Fig. 1 illustrates a schematic diagram of exemplary hardware and software components of an electronic device 100, according to some embodiments of the present application.
The electronic device 100 may be a general-purpose computer or a special-purpose computer, both of which may be used to implement the abnormal image recognition method of the present application. Although only a single computer is shown, for convenience, the functions described herein may be implemented in a distributed fashion across multiple similar platforms to balance processing loads.
For example, the electronic device 100 may include a network port 110 connected to a network, one or more processors 120 for executing program instructions, a communication bus 130, and different forms of storage media 140, such as a disk, read-only memory (ROM), or random access memory (RAM), or any combination thereof. Illustratively, the electronic device may also include program instructions stored in ROM, RAM, or other types of non-transitory storage media, or any combination thereof; the method of the present application may be implemented in accordance with these program instructions. The electronic device 100 also includes an Input/Output (I/O) interface 150 between the computer and other input/output devices (e.g., keyboard, display screen).
For ease of illustration, only one processor is depicted in the electronic device 100. However, it should be noted that the electronic device 100 in the present application may also comprise a plurality of processors, and thus steps described herein as performed by one processor may also be performed by a plurality of processors jointly or individually. For example, if the processor of the electronic device 100 executes steps A and B, steps A and B may also be executed by two different processors, or both by a single processor: a first processor may perform step A while a second processor performs step B, or the two processors may perform steps A and B together.
Fig. 2 illustrates a flow chart of a method of abnormal image recognition of some embodiments of the present application. It should be noted that the abnormal image identification method described in the present application is not limited by the specific sequence described in fig. 2 and below, and it should be understood that, in other embodiments, the sequence of some steps in the abnormal image identification method described in the present application may be interchanged according to actual needs, or some steps may be omitted or deleted. The flow shown in fig. 2 will be explained in detail below.
Step 201, acquiring an image to be identified.
To allow an abnormal image to be handled promptly once identified, the image to be recognized may be obtained, for example, by filtering or tracking image sources.
The image to be recognized is the image on which recognition is to be performed.
An abnormal image is an image that includes abnormal information, such as illegal or violating content involving pornography, violence, advertising, distribution, or terrorism.
An image input interface may be provided to a user so that an image input by the user is received through the image input interface as an image to be recognized.
Of course, in practical applications, the electronic device may also obtain the image to be recognized in other manners, such as receiving an image sent by another device, or capturing an image through its own camera.
Step 202, performing person recognition on the image to be recognized to obtain a person recognition result, and performing text recognition on the image to be recognized to obtain a text recognition result.
Whether an image is abnormal depends on the information it carries, and because images are displayed to people, they include information such as persons and text; this is especially true of abnormal images related to human behavior, such as images containing pornographic, violent, or terrorist content. Therefore, to improve the accuracy of recognizing the image to be recognized, person recognition and text recognition may each be performed on it.
The person recognition result is the result of recognizing the persons in the image to be recognized.
The text recognition result is the result of performing text recognition on the image to be recognized.
The person recognition result or the text recognition result may be any one of a plurality of candidate results, such as normal or abnormal. Alternatively, it may further distinguish sub-categories of abnormality, for example normal, pornographic, violent, or terrorist. When images for certain specific scenarios are recognized, the result may be still more fine-grained; for example, when recognizing pornographic images, the result may be normal, pornography level 1, pornography level 2, or pornography level 3, where a higher level indicates a higher likelihood that the image contains pornographic content.
It should be noted that the person recognition result and the text recognition result may differ in granularity. For example, when recognizing pornographic images, the person recognition result may be normal, pornography level 1, pornography level 2, or pornography level 3, while the text recognition result may simply be normal or pornographic.
It should further be noted that the person recognition result or the text recognition result may also include a probability value for each candidate result; for example, the person recognition result may be: normal 30%, abnormal 70%.
The persons and the text included in the image to be recognized can each be recognized, yielding a person recognition result and a text recognition result. For person recognition, the region where a person is located may be determined first, and the person's clothing, posture, expression, and the like recognized within that region to obtain the person recognition result. For text recognition, the region where text is located may be determined first, the textual information in that region recognized, and then semantic and grammatical analysis performed on that information (characters, words, or sentences) to obtain the text recognition result.
It should be noted that person recognition and/or text recognition may be performed on the image to be recognized through a neural network model or a machine learning model.
It should further be noted that person recognition and text recognition may be performed separately, in either order (the present application does not limit their sequence), or simultaneously on the image to be recognized.
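Because the two recognitions are independent, the "either order or simultaneously" remark can be illustrated with two stub branches submitted to a thread pool; the branch functions and their fixed outputs are placeholders, not the patent's actual models.

```python
from concurrent.futures import ThreadPoolExecutor

def person_branch(feature_map):
    # Placeholder for the person recognition branch.
    return {"normal": 0.3, "abnormal": 0.7}

def text_branch(feature_map):
    # Placeholder for the text localization + recognition branch.
    return {"normal": 0.6, "abnormal": 0.4}

def recognize_both(feature_map):
    # The two branches share no state, so they may run in either order
    # or, as here, simultaneously.
    with ThreadPoolExecutor(max_workers=2) as pool:
        person_future = pool.submit(person_branch, feature_map)
        text_future = pool.submit(text_branch, feature_map)
        return person_future.result(), text_future.result()

person_result, text_result = recognize_both(feature_map=None)
print(person_result["abnormal"], text_result["abnormal"])  # 0.7 0.4
```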
Step 203, determining whether the image to be recognized is an abnormal image based on the person recognition result and the text recognition result.
Since the person recognition result and the text recognition result indicate whether the persons, text, and other information included in the image to be recognized are normal, these two results can be used to judge whether the image's content is abnormal, and thereby to determine whether the recognized image is an abnormal image.
The recognition result of the image to be recognized can be determined from the person recognition result and the text recognition result. When the two results are the same, either may be taken as the recognition result of the image. When they differ, since (as noted above) each result can be expressed as probability values, an arithmetic mean or a weighted mean of the person recognition result and the text recognition result may be taken as the recognition result of the image. Of course, in practical applications, the recognition result may also be determined from the two results through a neural network model or a machine learning model.
It should be noted that, if the recognition result of the image is either normal or abnormal, the image is determined to be an abnormal image when the result is abnormal; if the recognition result comprises a probability value for each candidate result, the image may be determined to be an abnormal image when the abnormal probability value is the highest.
Of course, in practical application, the recognition result of the image to be recognized may also be directly output.
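The decision logic of the preceding paragraphs can be sketched as follows, assuming results expressed as per-class probabilities; the weight value is illustrative (an arithmetic mean corresponds to equal weights).

```python
def fuse_results(person_result, text_result, person_weight=0.5):
    # Weighted mean of the two branches' per-class probabilities;
    # person_weight=0.5 reduces to the arithmetic mean.
    w = person_weight
    return {cls: w * person_result[cls] + (1 - w) * text_result[cls]
            for cls in person_result}

def decide(result):
    # The image is judged abnormal when "abnormal" has the highest probability.
    return max(result, key=result.get) == "abnormal"

person_result = {"normal": 0.3, "abnormal": 0.7}
text_result = {"normal": 0.6, "abnormal": 0.4}

fused = fuse_results(person_result, text_result)  # arithmetic mean
print(round(fused["abnormal"], 2))  # 0.55
print(decide(fused))                # True
# Down-weighting the person branch flips the decision for this example.
print(decide(fuse_results(person_result, text_result, person_weight=0.2)))
```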
In the embodiments of the present application, whether an image is abnormal depends on the information it carries. Therefore, an image to be recognized is acquired, person recognition and text recognition are performed on it to obtain a person recognition result and a text recognition result, and whether the image is an abnormal image is determined based on these two results. That is, the image to be recognized is recognized based on the persons, text, and other information it contains, which improves recognition accuracy.
Fig. 3 illustrates a flow chart of a method of abnormal image recognition of some embodiments of the present application. It should be noted that the abnormal image identification method described in the present application is not limited by the specific sequence described in fig. 3 and below, and it should be understood that, in other embodiments, the sequence of some steps in the abnormal image identification method described in the present application may be interchanged according to actual needs, or some steps may be omitted or deleted. The flow shown in fig. 3 will be explained in detail below.
Step 301, acquiring an image to be recognized.
For the manner of obtaining the image to be recognized, reference may be made to the related description in step 201; details are not repeated here.
Step 302, extracting image features of the image to be recognized to obtain a feature map corresponding to the image to be recognized.
An image is an ordered arrangement of many pixels, and the colors and arrangement of pixels differ from image to image, so different images have different features, including color, texture, and other image features. To make it easier to subsequently judge from these image features whether the image to be recognized carries abnormal information, and to improve recognition efficiency and accuracy, the image features of the image to be recognized may be extracted to obtain a feature map corresponding to the image.
The feature map is an image that describes the image features of the original image.
Optionally, the feature map may be extracted from the image to be recognized through a shared convolutional neural network.
Images are usually perceived by humans, so image features should accord with the human visual system. A convolutional neural network convolves directly over an image's pixels to extract image features, a processing manner closer to that of the human visual system, which can improve recognition accuracy; it also has relatively few parameters and a simple training process, which can improve recognition efficiency. The feature map can therefore be extracted from the image to be recognized through a shared convolutional neural network.
The shared convolutional neural network is a convolutional neural network that extracts the feature map from the image to be recognized, its output being shared by the subsequent recognition steps.
It should be noted that the shared convolutional neural network may be obtained in advance. For example, a plurality of images may be obtained in advance as first training samples, and the shared convolutional neural network trained to extract feature maps from images using these samples.
The shared convolutional neural network receives an input image to be recognized and performs convolution operations on its pixels to obtain the feature map.
Optionally, to reduce the resolution of the feature map, facilitate subsequent recognition of the information it contains, improve recognition accuracy, reduce the amount of subsequent computation, and improve recognition efficiency, the feature map may be downsampled within the shared convolutional neural network, so that the downsampled feature map is used in the subsequent recognition process.
Downsampling can exploit the local stationarity of images to compute aggregate statistics over adjacent pixels.
In addition, in practical applications, the feature map may be extracted from the image to be recognized in other manners, for example through other types of neural network models or machine learning models.
And step 303, performing person recognition on the feature map to obtain a person recognition result, and performing character recognition on the feature map to obtain a character recognition result.
Since the feature map represents the image features of the image to be recognized, both person recognition and character recognition can be performed on the feature map.
Optionally, the person recognition result may be obtained by performing person recognition on the feature map through a person target positioning network.
The human target positioning network is a convolutional neural network which determines the position and the size of a human from an image and identifies and classifies the characteristics of the human. As can be seen from the advantages of the convolutional neural network, in order to improve the efficiency and accuracy of the recognition, the person can be recognized through the person target positioning network.
The person target positioning network may be trained in advance. For example, a plurality of images may be obtained in advance as a second training sample, the second training sample may be labeled according to whether each image is a normal image or an abnormal image, and the person target positioning network may be trained to perform person recognition using the labeled second training sample.
The person target positioning network can receive the input feature map and slide a sliding window of a preset size across the feature map. Each time the window slides, the network identifies whether the region of the feature map currently inside the sliding window includes a person, and identifies at least one of the person's posture, proportion of exposed skin, and expression, so as to obtain the person recognition result.
The sliding window, also called a filter or a convolution kernel, may be a two-dimensional matrix for extracting features in an image.
It should be noted that, in different application scenes, the recognition may be performed in a preset manner corresponding to the application scene. For example, in a scene for recognizing pornographic images, recognition may be performed on the posture and the proportion of exposed skin of the person in the feature map, so as to determine that the person recognition result is a normal image or a pornographic image, or to determine that the person recognition result includes, respectively, the probability that the image to be recognized is a normal image and the probability that it is an abnormal image.
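The sliding-window scan above can be sketched as follows; `contains_person` is a hypothetical stand-in for the classifier that the person target positioning network would apply to each window:

```python
# Slide a fixed-size window across the feature map one step at a time and
# apply a scoring function to each window, collecting the positions it flags.

def sliding_window_scan(feature_map, win_h, win_w, score_fn, threshold=0.5):
    """Return the top-left coordinates of every window whose score exceeds threshold."""
    hits = []
    for r in range(len(feature_map) - win_h + 1):
        for c in range(len(feature_map[0]) - win_w + 1):
            window = [row[c:c + win_w] for row in feature_map[r:r + win_h]]
            if score_fn(window) > threshold:
                hits.append((r, c))
    return hits

def contains_person(window):
    # Hypothetical score: mean activation of the window (a real network would
    # classify posture, exposed skin, expression, etc.).
    values = [v for row in window for v in row]
    return sum(values) / len(values)

fmap = [
    [0.0, 0.1, 0.9, 0.8],
    [0.1, 0.0, 0.9, 0.9],
    [0.0, 0.1, 0.2, 0.1],
]
hits = sliding_window_scan(fmap, 2, 2, contains_person)  # [(0, 2), (1, 2)]
```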
Optionally, the person target positioning network includes a first convolution kernel, where a convolution kernel is a two-dimensional matrix used for extracting features in the image, and the row-to-column ratio of the first convolution kernel is greater than or equal to 1/2 and less than or equal to 2.
To perform person recognition with a sliding window whose length and width are close, the person target positioning network may include a first convolution kernel with a row-to-column ratio greater than or equal to 1/2 and less than or equal to 2.
For example, the row-to-column ratio of the first convolution kernel may be 1:1, 1:2, or 2:1.
In addition, in practical applications, the feature map may be identified in other manners, for example, the feature map may be identified by other types of neural network models or machine learning models.
Optionally, the feature map may be recognized through a character target positioning network to determine a character area, and the character area may be recognized to obtain the character recognition result.
The character target positioning network is a convolution neural network for determining the area (including the position and the size of the character) where the character is located from the image. As can be seen from the advantages of the convolutional neural network, in order to improve the efficiency and accuracy of recognition, the character recognition can be performed on the feature map through the character target positioning network.
It should be noted that the character target positioning network may be trained in advance. For example, a plurality of images may be obtained in advance as a third training sample, the third training sample may be labeled according to the characters included in the images, and the character target positioning network may be trained to determine character areas from images using the labeled third training sample.
The character area is an area of the image that includes characters.
The character target positioning network can receive the input feature map and slide a sliding window of a preset size across the feature map. Each time the window slides, the network identifies whether the region of the feature map currently inside the sliding window includes characters, so as to obtain the character area.
Optionally, the character target positioning network includes a second convolution kernel, where a convolution kernel is a two-dimensional matrix used for extracting features in the image, and the row-to-column ratio of the second convolution kernel is less than 1/2 or greater than 2.
Since text is formed by arranging a plurality of characters in a specific direction, for example in rows or columns, with a new row or column started when the current one is full, the distribution of text in an image is strip-shaped. To make the sliding window used to identify character areas better match this distribution of characters, and thus improve the accuracy of the identified character areas, the character target positioning network may include a second convolution kernel with a row-to-column ratio less than 1/2 or greater than 2.
For example, the row-to-column ratio of the second convolution kernel may be 1:3, 1:4, 3:1, or 4:1.
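The two kernel-shape constraints (near-square for persons, strip-shaped for characters) can be summarized in a small helper; this only illustrates the stated ratio ranges and is not code from the patent:

```python
# Classify a kernel shape against the ratio ranges stated above:
# person kernels are roughly square (row:column between 1/2 and 2), while
# character kernels are elongated strips (ratio below 1/2 or above 2) to
# match the strip-like layout of text in an image.

def kernel_role(rows, cols):
    """Return which target positioning network a kernel shape suits."""
    ratio = rows / cols
    if 0.5 <= ratio <= 2:
        return "person"     # first convolution kernel: near-square
    return "character"      # second convolution kernel: strip-shaped

assert kernel_role(1, 1) == "person"      # 1:1
assert kernel_role(2, 1) == "person"      # 2:1 (boundary, still allowed)
assert kernel_role(1, 3) == "character"   # 1:3
assert kernel_role(4, 1) == "character"   # 4:1
```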
After the character area is determined, the character information included in the character area can be recognized, and semantic and/or grammatical recognition can be performed on the character information, so as to obtain the character recognition result.
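As a toy stand-in for the semantic check on the recognized character information (the patent uses a trained character recognition neural network; the keyword lexicon and scoring below are invented purely for illustration):

```python
# Hypothetical semantic check: score recognized text by the fraction of words
# that hit a small abnormal-content lexicon. A trained character recognition
# neural network would replace this entirely.

ABNORMAL_KEYWORDS = {"violence", "gamble", "drugs"}  # invented lexicon

def text_recognition_result(character_info):
    """Return an abnormality probability for a string of recognized text."""
    words = character_info.lower().split()
    if not words:
        return {"abnormal_prob": 0.0}
    hits = sum(1 for w in words if w in ABNORMAL_KEYWORDS)
    return {"abnormal_prob": hits / len(words)}

result = text_recognition_result("join the gamble tonight")  # 1 of 4 words hit
```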
Alternatively, the text region may be identified by a text recognition neural network.
The character recognition neural network is a convolutional neural network that recognizes character information in a character area and performs semantic and/or grammatical recognition on the character information. As can be seen from the foregoing advantages of the convolutional neural network, in order to improve the efficiency and accuracy of recognition, character recognition can be performed on the character area through the character recognition neural network.
It should be noted that the character recognition neural network may be trained in advance. For example, a plurality of images including characters may be obtained in advance as a fourth training sample, the fourth training sample may be labeled according to whether the characters included in the images are normal, and the character recognition neural network may be trained to recognize character areas using the labeled fourth training sample.
In addition, in practical applications, the feature map may be subjected to character recognition in other manners, for example, the feature map may be subjected to character recognition by other types of neural network models or machine learning models.
In addition, in another optional embodiment of the application, person recognition may be performed directly on the image to be recognized through the person target positioning network, and/or character recognition may be performed directly on the image to be recognized through the character target positioning network.
And step 304, determining whether the image to be recognized is an abnormal image or not based on the character recognition result and the character recognition result.
Optionally, the person recognition result and the character recognition result may be input into a probabilistic neural network, and the probability that the image to be recognized is an abnormal image may be determined through the probabilistic neural network.
The probabilistic neural network is a convolutional neural network that computes the recognition result of the image to be recognized from the person recognition result and the character recognition result. As can be seen from the advantages of the convolutional neural network, in order to improve the efficiency and accuracy of recognition, the recognition result may be determined through the probabilistic neural network.
The probabilistic neural network may be a neural network with a single fully-connected layer.
It should be noted that the probabilistic neural network may be trained in advance. For example, the person recognition results and character recognition results of a plurality of images may be obtained in advance as a fifth training sample, the fifth training sample may be labeled according to the recognition result of each image, and the probabilistic neural network may be trained to determine whether an image is an abnormal image using the labeled fifth training sample.
For example, suppose the person recognition result includes a normal probability of 30%, a pornographic probability of 40%, and a terroristic probability of 30%, and the character recognition result includes a normal probability of 50%, a pornographic probability of 40%, and a terroristic probability of 10%. The person recognition result and the character recognition result are input into the probabilistic neural network, which determines that the probability that the image to be recognized is a normal image is 40% and the probability that it is an abnormal image is 60%.
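The fusion step can be sketched as a single fully-connected layer followed by softmax over the concatenated person and character recognition probabilities; the weights below are invented for illustration, whereas in the patent they would be learned from the labeled fifth training sample:

```python
import math

def softmax(xs):
    """Normalize raw scores into probabilities that sum to 1."""
    exps = [math.exp(x) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def fuse(person_probs, text_probs, weights, biases):
    """One fully-connected layer over the concatenated recognition results."""
    features = person_probs + text_probs
    logits = [sum(w * f for w, f in zip(row, features)) + b
              for row, b in zip(weights, biases)]
    return softmax(logits)  # [P(normal), P(abnormal)]

person = [0.3, 0.4, 0.3]   # normal, pornographic, terroristic (illustrative)
text = [0.5, 0.4, 0.1]
weights = [[2.0, -1.0, -1.0, 2.0, -1.0, -1.0],   # evidence for "normal"
           [-2.0, 1.0, 1.0, -2.0, 1.0, 1.0]]     # evidence for "abnormal"
biases = [0.0, 0.0]
p_normal, p_abnormal = fuse(person, text, weights, biases)
```

With these made-up weights the normal evidence outweighs the abnormal evidence, so `p_normal` exceeds `p_abnormal`; a trained layer would instead reflect the labeled training data.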
In addition, in practical application, the probability that the image to be recognized is an abnormal image may also be determined by other manners according to the person recognition result and the character recognition result, for example, the probability that the image to be recognized is an abnormal image may also be determined by other types of neural network models or machine learning models.
In addition, in the above steps, the image to be recognized is recognized through a plurality of convolutional neural networks to obtain the probability that it is an abnormal image. In another optional embodiment of the application, however, the image to be recognized may be recognized through more or fewer convolutional neural networks. For example, the image to be recognized may be input into a preset convolutional neural network, which recognizes the image and outputs the recognition result (for example, the probability that the image to be recognized is an abnormal image).
In the embodiment of the application, firstly, since whether an image is an abnormal image depends on the information content it carries, the image to be recognized is obtained, and person recognition and character recognition are performed on it to obtain a person recognition result and a character recognition result. Whether the image to be recognized is an abnormal image is then determined based on the person recognition result and the character recognition result. That is, the image to be recognized is recognized based on the information content it includes, such as persons and characters, which improves recognition accuracy.
Secondly, because images are usually perceived by humans, their image features conform to the human visual system, and a convolutional neural network convolves image pixels directly to extract image features, a processing approach close to that of the human visual system. Moreover, a convolutional neural network contains relatively few parameters and its training process is simple. Recognizing the image to be recognized through one or more convolutional neural networks therefore improves both the accuracy and the efficiency of recognition.
Fig. 4 is a block diagram illustrating an abnormal image recognition apparatus according to some embodiments of the present application, which implements functions corresponding to the steps performed by the above-described method. The apparatus may be understood as the above-mentioned server, or a processor of the server, or may be understood as a component that is independent from the above-mentioned server or processor and implements the functions of the present application under the control of the server, as shown in the figure, the abnormal image recognition apparatus 400 may include:
an obtaining module 401, configured to obtain an image to be identified;
the recognition module 402 is configured to perform person recognition on the image to be recognized to obtain a person recognition result, and perform character recognition on the image to be recognized to obtain a character recognition result;
a determining module 403, configured to determine whether the image to be recognized is an abnormal image based on the person recognition result and the character recognition result.
Optionally, the identifying module 402 is specifically configured to:
extracting the image characteristics of the image to be recognized to obtain a characteristic diagram corresponding to the image to be recognized;
and performing character recognition on the feature map to obtain a character recognition result, and performing character recognition on the feature map to obtain a character recognition result.
Optionally, the identifying module 402 is specifically configured to:
and extracting the feature map from the image to be identified through a shared convolutional neural network.
Optionally, the identification module is specifically configured to:
and carrying out character recognition on the characteristic diagram through a character target positioning network to obtain a character recognition result.
Optionally, the person target positioning network includes a first convolution kernel, where a convolution kernel is a two-dimensional matrix used for extracting features in the image, and the row-to-column ratio of the first convolution kernel is greater than or equal to 1/2 and less than or equal to 2.
Optionally, the identifying module 402 is specifically configured to:
identifying the feature map through a character target positioning network to determine a character area;
and identifying the character area to obtain the character identification result.
Optionally, the character target positioning network includes a second convolution kernel, where a convolution kernel is a two-dimensional matrix used for extracting features in the image, and the row-to-column ratio of the second convolution kernel is less than 1/2 or greater than 2.
Optionally, the identifying module 402 is specifically configured to:
and identifying the character area through a character identification neural network.
Optionally, the determining module 403 is specifically configured to:
inputting the person recognition result and the character recognition result into a probabilistic neural network;
and determining the probability that the image to be identified is an abnormal image through the probabilistic neural network.
The above-mentioned apparatus is used for executing the method provided by the foregoing embodiment, and the implementation principle and technical effect are similar, which are not described herein again.
The modules may be connected or in communication with each other via a wired or wireless connection. The wired connection may include a metal cable, an optical cable, a hybrid cable, etc., or any combination thereof. The wireless connection may include a connection via a Local Area Network (LAN), a Wide Area Network (WAN), Bluetooth, ZigBee, Near Field Communication (NFC), or the like, or any combination thereof. Two or more modules may be combined into a single module, and any one module may be divided into two or more units.
Fig. 5 is a schematic diagram of a functional module of an electronic device provided in the present application. The electronic device may include a computer-readable storage medium 501 storing a computer program and a processor 502, and the processor 502 may call the computer program stored in the computer-readable storage medium 501. When the computer program is read and executed by the processor 502, the abnormal image recognition method provided by the present application may be implemented, including:
acquiring an image to be identified;
carrying out person recognition on the image to be recognized to obtain a person recognition result, and carrying out character recognition on the image to be recognized to obtain a character recognition result;
and determining whether the image to be recognized is an abnormal image based on the person recognition result and the character recognition result.
Optionally, performing person recognition on the image to be recognized to obtain a person recognition result, and performing character recognition on the image to be recognized to obtain a character recognition result, including:
extracting the image characteristics of the image to be recognized to obtain a characteristic diagram corresponding to the image to be recognized;
and performing person recognition on the feature map to obtain the person recognition result, and performing character recognition on the feature map to obtain the character recognition result.
Optionally, extracting image features of the image to be recognized to obtain a feature map corresponding to the image to be recognized, including:
and extracting the feature map from the image to be identified through a shared convolutional neural network.
Optionally, performing person recognition on the feature map to obtain the person recognition result includes:
and carrying out person recognition on the feature map through a person target positioning network to obtain the person recognition result.
Optionally, the person target positioning network includes a first convolution kernel, where a convolution kernel is a two-dimensional matrix used for extracting features in the image, and the row-to-column ratio of the first convolution kernel is greater than or equal to 1/2 and less than or equal to 2.
Optionally, performing text recognition on the feature map to obtain the text recognition result, including:
identifying the feature map through a character target positioning network to determine a character area;
and identifying the character area to obtain the character identification result.
Optionally, the character target positioning network includes a second convolution kernel, where a convolution kernel is a two-dimensional matrix used for extracting features in the image, and the row-to-column ratio of the second convolution kernel is less than 1/2 or greater than 2.
Optionally, identifying the text area includes:
and identifying the character area through a character identification neural network.
Optionally, determining whether the image to be recognized is an abnormal image based on the person recognition result and the character recognition result includes:
inputting the person recognition result and the character recognition result into a probabilistic neural network;
and determining the probability that the image to be identified is an abnormal image through the probabilistic neural network.
Optionally, the present application further provides a computer-readable storage medium, on which a computer program is stored, and when the computer program is read and executed by a processor, the above method embodiments can be implemented.
It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes of the system and the apparatus described above may refer to corresponding processes in the method embodiments, and are not described in detail in this application. In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. The above-described apparatus embodiments are merely illustrative, and for example, the division of the modules is merely a logical division, and there may be other divisions in actual implementation, and for example, a plurality of modules or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection of devices or modules through some communication interfaces, and may be in an electrical, mechanical or other form.
The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a non-volatile computer-readable storage medium executable by a processor. Based on such understanding, the technical solution of the present application or portions thereof that substantially contribute to the prior art may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: various media capable of storing program codes, such as a U disk, a removable hard disk, a ROM, a RAM, a magnetic disk, or an optical disk.
The above description is only for the specific embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present application, and shall be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (20)

1. An abnormal image recognition method, comprising:
acquiring an image to be identified;
carrying out person recognition on the image to be recognized to obtain a person recognition result, and carrying out character recognition on the image to be recognized to obtain a character recognition result;
and determining whether the image to be recognized is an abnormal image based on the person recognition result and the character recognition result.
2. The method of claim 1, wherein the carrying out person recognition on the image to be recognized to obtain a person recognition result and carrying out character recognition on the image to be recognized to obtain a character recognition result comprises:
extracting image features of the image to be recognized to obtain a feature map corresponding to the image to be recognized;
and performing person recognition on the feature map to obtain the person recognition result, and performing character recognition on the feature map to obtain the character recognition result.
3. The method according to claim 2, wherein the extracting the image features of the image to be recognized to obtain the feature map corresponding to the image to be recognized comprises:
and extracting the feature map from the image to be identified through a shared convolutional neural network.
4. The method of claim 2, wherein the performing the person recognition on the feature map to obtain the person recognition result comprises:
and carrying out person recognition on the feature map through a person target positioning network to obtain the person recognition result.
5. The method of claim 4, wherein the person target positioning network comprises a first convolution kernel, wherein a convolution kernel is a two-dimensional matrix for extracting features from the image, and wherein the row-to-column ratio of the first convolution kernel is greater than or equal to 1/2 and less than or equal to 2.
6. The method of claim 2, wherein the performing the character recognition on the feature map to obtain the character recognition result comprises:
identifying the feature map through a character target positioning network to determine a character area;
and identifying the character area to obtain the character identification result.
7. The method of claim 6, wherein the character target positioning network comprises a second convolution kernel, wherein a convolution kernel is a two-dimensional matrix for extracting features in the image, and the row-to-column ratio of the second convolution kernel is less than 1/2 or greater than 2.
8. The method of claim 6, wherein the identifying the text region comprises:
and identifying the character area through a character identification neural network.
9. The method according to any one of claims 1 to 8, wherein the determining whether the image to be recognized is an abnormal image based on the person recognition result and the character recognition result includes:
inputting the person recognition result and the character recognition result into a probabilistic neural network;
and determining the probability that the image to be identified is an abnormal image through the probabilistic neural network.
10. An abnormal image recognition apparatus, comprising:
the acquisition module is used for acquiring an image to be identified;
the recognition module is used for carrying out person recognition on the image to be recognized to obtain a person recognition result, and carrying out character recognition on the image to be recognized to obtain a character recognition result;
and the determining module is used for determining whether the image to be recognized is an abnormal image based on the person recognition result and the character recognition result.
11. The apparatus of claim 10, wherein the identification module is specifically configured to:
extracting image features of the image to be recognized to obtain a feature map corresponding to the image to be recognized;
and performing person recognition on the feature map to obtain the person recognition result, and performing character recognition on the feature map to obtain the character recognition result.
12. The apparatus of claim 11, wherein the identification module is specifically configured to:
and extracting the feature map from the image to be identified through a shared convolutional neural network.
13. The apparatus of claim 11, wherein the identification module is specifically configured to:
and carrying out person recognition on the feature map through a person target positioning network to obtain the person recognition result.
14. The apparatus of claim 13, wherein the person target positioning network comprises a first convolution kernel, wherein the first convolution kernel is a two-dimensional matrix for extracting features from the image, and wherein the row-to-column ratio of the first convolution kernel is greater than or equal to 1/2 and less than or equal to 2.
15. The apparatus of claim 11, wherein the identification module is specifically configured to:
identifying the feature map through a character target positioning network to determine a character area;
and identifying the character area to obtain the character identification result.
16. The apparatus of claim 15, wherein the character target positioning network comprises a second convolution kernel, wherein a convolution kernel is a two-dimensional matrix for extracting features in the image, and the row-to-column ratio of the second convolution kernel is less than 1/2 or greater than 2.
17. The apparatus of claim 15, wherein the identification module is specifically configured to:
and identifying the character area through a character identification neural network.
18. The apparatus according to any of claims 10-17, wherein the determining module is specifically configured to:
inputting the person recognition result and the character recognition result into a probabilistic neural network;
and determining the probability that the image to be identified is an abnormal image through the probabilistic neural network.
19. An electronic device, comprising: a processor, a storage medium and a bus, the storage medium storing machine-readable instructions executable by the processor, the processor and the storage medium communicating via the bus when the electronic device is operating, the processor executing the machine-readable instructions to perform the steps of the abnormal image recognition method according to any one of claims 1 to 9.
20. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, which computer program, when being executed by a processor, performs the steps of the abnormal image recognition method according to any one of claims 1 to 9.
CN201910140017.XA 2019-02-26 2019-02-26 Abnormal image recognition method and device, electronic equipment and storage medium Pending CN111611828A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910140017.XA CN111611828A (en) 2019-02-26 2019-02-26 Abnormal image recognition method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910140017.XA CN111611828A (en) 2019-02-26 2019-02-26 Abnormal image recognition method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN111611828A true CN111611828A (en) 2020-09-01

Family

ID=72205136

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910140017.XA Pending CN111611828A (en) 2019-02-26 2019-02-26 Abnormal image recognition method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111611828A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106339719A (en) * 2016-08-22 2017-01-18 微梦创科网络科技(中国)有限公司 Image identification method and image identification device
CN108124191A (en) * 2017-12-22 2018-06-05 北京百度网讯科技有限公司 A kind of video reviewing method, device and server
US20180204562A1 (en) * 2015-09-08 2018-07-19 Beijing Baidu Netcom Science And Technology Co., Ltd. Method and device for image recognition
CN108665457A (en) * 2018-05-16 2018-10-16 腾讯科技(深圳)有限公司 Image-recognizing method, device, storage medium and computer equipment

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Huang Hongwei et al.: "Rapid Detection and Engineering Practice of Defects in Urban Metro Shield Tunnels", Shanghai Scientific & Technical Publishers, pages: 1 *

Similar Documents

Publication Publication Date Title
US10936919B2 (en) Method and apparatus for detecting human face
CN109816441B (en) Policy pushing method, system and related device
CN111126258A (en) Image recognition method and related device
CN111597884A (en) Facial action unit identification method and device, electronic equipment and storage medium
CN108399386A (en) Information extracting method in pie chart and device
CN107871314B (en) Sensitive image identification method and device
CN111178355B (en) Seal identification method, device and storage medium
CN112883902B (en) Video detection method and device, electronic equipment and storage medium
WO2021238548A1 (en) Region recognition method, apparatus and device, and readable storage medium
US20220406090A1 (en) Face parsing method and related devices
WO2019062081A1 (en) Salesman profile formation method, electronic device and computer readable storage medium
CN116311214B (en) License plate recognition method and device
CN111696080A (en) Face fraud detection method, system and storage medium based on static texture
CN114022748B (en) Target identification method, device, equipment and storage medium
CN108764248B (en) Image feature point extraction method and device
CN111488887B (en) Image processing method and device based on artificial intelligence
CN111626313B (en) Feature extraction model training method, image processing method and device
CN116798041A (en) Image recognition method and device and electronic equipment
CN111611828A (en) Abnormal image recognition method and device, electronic equipment and storage medium
CN112651351B (en) Data processing method and device
CN115457581A (en) Table extraction method and device and computer equipment
CN110942056A (en) Clothing key point positioning method and device, electronic equipment and medium
CN112801238A (en) Image classification method and device, electronic equipment and storage medium
Cong et al. Salient man-made object detection based on saliency potential energy for unmanned aerial vehicles remote sensing image
CN116580396B (en) Cell level identification method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination