CN117407562A - Image recognition method, system and electronic equipment
- Publication number: CN117407562A (application CN202311714687.0A)
- Authority: CN (China)
- Prior art keywords: image, attribute, tampered, region, retrieval
- Legal status: Granted
Classifications
- G06F16/783—Retrieval of video data characterised by using metadata automatically derived from the content
- G06F16/738—Querying video data; presentation of query results
- G06F16/75—Clustering; classification of video data
- G06V10/36—Image preprocessing by applying a local operator; non-linear local filtering operations, e.g. median filtering
- G06V10/764—Image or video recognition using pattern recognition or machine learning, e.g. classification of video objects
- G06V20/41—Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
- G06V20/46—Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
- G06V40/10—Recognition of human or animal bodies, e.g. vehicle occupants or pedestrians; body parts, e.g. hands
Abstract
The embodiments of the present application provide an image recognition method, an image recognition system, and an electronic device. The method comprises the following steps: determining a non-tampered region in a first image that has not been tampered with; displaying the first image marked with the non-tampered region; in response to a region selection operation for the first image, determining the region to be identified selected by the region selection operation in the first image; determining, among preset retrieval attributes, candidate retrieval attributes of the objects present in the region to be identified; displaying an attribute selection interface containing the candidate retrieval attributes; in response to an attribute selection operation for the attribute selection interface, identifying the candidate retrieval attribute selected by the attribute selection operation as the target retrieval attribute; acquiring the attribute value of the target retrieval attribute of the region to be identified as the target attribute value; and searching a database for an object whose attribute value of the target retrieval attribute matches the target attribute value, and taking that object as the recognition result. These embodiments can improve the accuracy of image recognition results.
Description
Technical Field
The present disclosure relates to the field of intelligent recognition technologies, and in particular, to an image recognition method, system, and electronic device.
Background
Currently, a user can obtain information related or similar to an image by recognizing its content; this manner of obtaining image information is hereinafter referred to as image recognition. However, AI (Artificial Intelligence) forgery techniques keep improving, and image content is frequently forged or tampered with by AI, so that the content of an image no longer matches reality. If a tampered image is identified directly, information related to the tampered content, i.e., information unrelated to the original content, may be returned, reducing the accuracy of image recognition. Moreover, the information obtained by identifying an entire image directly is broad, so the user may be unable to pick out the information of interest, which further reduces the accuracy of image recognition.
Disclosure of Invention
An object of the embodiments of the present application is to provide an image recognition method, system, and electronic device, so as to improve the accuracy of image recognition results. The specific technical solutions are as follows:
the embodiment of the application provides an image identification method, which comprises the following steps:
determining a non-tampered region in the first image that has not been tampered with;
displaying the first image marked with the non-tampered region;
in response to a region selection operation for the first image, determining the region to be identified selected by the region selection operation in the first image;
determining, among preset retrieval attributes, candidate retrieval attributes of the objects present in the region to be identified;
displaying an attribute selection interface containing the candidate retrieval attributes;
in response to an attribute selection operation for the attribute selection interface, identifying the candidate retrieval attribute selected by the attribute selection operation as a target retrieval attribute;
acquiring an attribute value of the target retrieval attribute of the region to be identified as a target attribute value;
searching a database, in which each object and the attribute values of its preset retrieval attributes are stored in advance, for an object whose attribute value of the target retrieval attribute matches the target attribute value, and taking that object as the recognition result.
In a possible embodiment, the first image is a video frame in a video stream, and the target retrieval attribute is the walking posture;
the acquiring an attribute value of the target retrieval attribute of the region to be identified as a target attribute value includes:
determining the attribute value of the walking posture of the region to be identified as the target attribute value according to the first image and a plurality of video frames adjacent to the first image in the video stream.
In one possible embodiment, the method further comprises:
displaying thumbnail information of each recognition result in descending order of the matching degree between the attribute value of the target retrieval attribute of the recognition result and the target attribute value;
in response to a view operation on the thumbnail information, identifying the recognition result corresponding to the thumbnail information selected by the view operation;
and displaying the detailed information of that recognition result.
In a possible embodiment, the determining a non-tampered region that is not tampered with in the first image comprises:
acquiring a second image obtained by carrying out image enhancement on the first image;
inputting the first image and the second image into a deep learning dual-stream detection model to obtain an output result of the dual-stream detection model, wherein the output result comprises the confidence of a candidate tampered region in the first image;
and determining a candidate tampered region whose confidence is not smaller than a preset confidence threshold as a tampered region in the first image.
In a possible embodiment, the first image is a single image, and the method further includes:
displaying two first images spliced with each other, wherein one first image is marked with the non-tampered area, and the other first image is not marked with the non-tampered area;
or,
the first image is each video frame in a video clip, the method further comprising:
and displaying two mutually spliced video playing windows, wherein the two video playing windows synchronously play the video clips, the non-tampered area is marked in a video frame played by one video playing window, and the non-tampered area is not marked in a video frame played by the other video playing window.
In one possible embodiment, the presenting the first image labeled with the non-tampered region comprises:
performing image processing on the tampered region in the first image to obtain a third image; wherein the image processing includes: frame-selecting the tampered region, mosaic processing, color-coating processing, and binarization processing;
the third image is shown.
The embodiment of the application also provides an image recognition system, which comprises:
A deep learning image authenticity processing unit, configured to determine a non-tampered area that is not tampered in the first image;
a business processing unit, configured to display the first image marked with the non-tampered region; in response to a region selection operation for the first image, determine the region to be identified selected by the region selection operation in the first image; determine, among preset retrieval attributes, candidate retrieval attributes of the objects present in the region to be identified; display an attribute selection interface containing the candidate retrieval attributes; in response to an attribute selection operation for the attribute selection interface, identify the candidate retrieval attribute selected by the attribute selection operation as a target retrieval attribute; acquire an attribute value of the target retrieval attribute of the region to be identified as a target attribute value; and search a database, in which each object and the attribute values of its preset retrieval attributes are stored in advance, for an object whose attribute value of the target retrieval attribute matches the target attribute value, as the recognition result.
In one possible embodiment, the system further comprises:
a data receiving unit configured to receive the first image;
The image processing unit is used for carrying out image enhancement on the first image to obtain a second image;
the deep learning image authenticity processing unit is specifically configured to acquire the second image, input the first image and the second image into a deep learning dual-stream detection model to obtain an output result of the dual-stream detection model, wherein the output result comprises the confidence of a candidate tampered region in the first image, and send the output result to an alarm unit;
the alarm unit is configured to, in response to the output result, determine the candidate tampered region whose confidence is not smaller than a preset confidence threshold as a tampered region in the first image, and raise an alarm for the tampered region;
a storage management unit for storing data;
and the configuration management unit is used for configuring the image processing unit, the deep learning image authenticity processing unit, the alarm unit, the service processing unit and the storage management unit.
In a possible embodiment, the service processing unit is further configured to present the first image marked with the non-tampered region in response to an identification operation for the alarm unit.
The embodiment of the application also provides an image recognition device, which comprises:
a non-tampered region determining module configured to determine a non-tampered region that is not tampered in the first image;
a first image display module, configured to display the first image marked with the non-tampered region;
a region-to-be-identified determining module, configured to, in response to a region selection operation for the first image, determine the region to be identified selected by the region selection operation in the first image;
a candidate retrieval attribute determining module, configured to determine, among preset retrieval attributes, candidate retrieval attributes of the objects present in the region to be identified;
an attribute selection interface display module, configured to display an attribute selection interface containing the candidate retrieval attributes;
a target retrieval attribute selection module, configured to, in response to an attribute selection operation for the attribute selection interface, identify the candidate retrieval attribute selected by the attribute selection operation as a target retrieval attribute;
a target attribute value acquisition module, configured to acquire an attribute value of the target retrieval attribute of the region to be identified as a target attribute value;
and a recognition result searching module, configured to search a database, in which each object and the attribute values of its preset retrieval attributes are stored in advance, for an object whose attribute value of the target retrieval attribute matches the target attribute value, as the recognition result.
In a possible embodiment, the first image is a video frame in a video stream, and the target retrieval attribute is the walking posture;
the target attribute value acquisition module acquiring an attribute value of the target retrieval attribute of the region to be identified as a target attribute value includes:
determining the attribute value of the walking posture of the region to be identified as the target attribute value according to the first image and a plurality of video frames adjacent to the first image in the video stream.
In one possible embodiment, the apparatus further comprises:
a thumbnail information display module, configured to display thumbnail information of each recognition result in descending order of the matching degree between the attribute value of the target retrieval attribute of the recognition result and the target attribute value;
a thumbnail information identification module, configured to, in response to a view operation on the thumbnail information, identify the recognition result corresponding to the thumbnail information selected by the view operation;
and a detailed information display module, configured to display the detailed information of that recognition result.
In one possible embodiment, the non-tampered region determining module determining a non-tampered region that is not tampered with in the first image includes:
acquiring a second image obtained by carrying out image enhancement on the first image;
inputting the first image and the second image into a deep learning dual-stream detection model to obtain an output result of the dual-stream detection model, wherein the output result comprises the confidence of a candidate tampered region in the first image;
and determining a candidate tampered region whose confidence is not smaller than a preset confidence threshold as a tampered region in the first image.
In a possible embodiment, the first image is a single image, and the apparatus further includes:
the first splicing display module is used for displaying two first images spliced with each other, wherein one first image is marked with the non-tampered area, and the other first image is not marked with the non-tampered area;
or,
the first image is each video frame in a video clip, the apparatus further comprising:
the second spliced display module is used for displaying two video playing windows spliced with each other, the two video playing windows synchronously play the video clips, the non-tampered area is marked in a video frame played by one video playing window, and the non-tampered area is not marked in a video frame played by the other video playing window.
In one possible embodiment, the first image displaying module displays the first image labeled with the non-tampered region, including:
performing image processing on the tampered region in the first image to obtain a third image; wherein the image processing includes: frame-selecting the tampered region, mosaic processing, color-coating processing, and binarization processing;
the third image is shown.
The embodiment of the application also provides electronic equipment, which comprises:
a memory for storing a computer program;
and a processor, configured to implement any of the above image recognition methods when executing the program stored in the memory.
Embodiments of the present application also provide a computer readable storage medium having a computer program stored therein, which when executed by a processor, implements any of the above-described image recognition methods.
Embodiments of the present application also provide a computer program product comprising instructions which, when run on a computer, cause the computer to perform any of the above-described image recognition methods.
The embodiments of the present application have the following beneficial effects:
According to the image recognition method, system, and electronic device provided by the embodiments of the present application, a non-tampered region that has not been tampered with can be determined in the first image, and the first image marked with the non-tampered region is displayed. In response to a region selection operation for the first image, the region to be identified selected by the region selection operation is determined in the first image; candidate retrieval attributes of the objects present in the region to be identified are determined among the preset retrieval attributes; an attribute selection interface containing the candidate retrieval attributes is displayed; in response to an attribute selection operation for the attribute selection interface, the candidate retrieval attribute selected by the attribute selection operation is identified as the target retrieval attribute; the attribute value of the target retrieval attribute of the region to be identified is acquired as the target attribute value; and a database, in which each object and the attribute values of its preset retrieval attributes are stored in advance, is searched for an object whose attribute value of the target retrieval attribute matches the target attribute value, as the recognition result. When the target retrieval attribute is determined, the candidate retrieval attributes of the objects in the region to be identified are first determined among the preset retrieval attributes; since the candidate retrieval attributes can be regarded as the preset retrieval attributes the user is likely to select, recommending them for the user to choose from allows the target retrieval attribute describing the content of interest to be determined quickly among many preset retrieval attributes. Because the database is built on the attribute values of the preset retrieval attributes of each object, and the target retrieval attribute of the region to be identified is chosen from those same preset retrieval attributes, the target attribute value is directly comparable with the stored attribute values. By matching the attribute value of the target retrieval attribute of each object in the database against the target attribute value, screening the objects accordingly, and taking the matching objects as the recognition result, the related information of the content the user is interested in is obtained more accurately. Therefore, the embodiments of the present application characterize the content of interest by determining the target retrieval attribute, and determine the related information of that content more accurately by matching attribute values, thereby improving the accuracy of image recognition.
Of course, not all of the above-described advantages need be achieved simultaneously in practicing any one of the products or methods of the present application.
Drawings
In order to describe the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings required in the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present application, and those skilled in the art may obtain other drawings from them.
Fig. 1 is a schematic flow chart of an image recognition method provided in the present application;
FIG. 2a is a schematic illustration of a first image provided herein labeled with a non-tampered region;
FIG. 2b is a schematic view of a first image for determining a region to be identified provided in the present application;
FIG. 2c is a schematic diagram illustrating a property selection interface display process provided in the present application;
FIG. 3 is a schematic flow chart of another image recognition method provided in the present application;
FIG. 4 is a schematic diagram of a tiled display of a tampered first image provided herein;
FIG. 5 is a schematic flow chart of another image recognition method provided in the present application;
FIG. 6 is a schematic structural diagram of an image recognition system provided in the present application;
FIG. 7 is a schematic flow chart of another image recognition method provided in the present application;
FIG. 8 is a schematic structural diagram of an image recognition device provided in the present application;
fig. 9 is a schematic structural diagram of an electronic device provided in the present application.
Detailed Description
The technical solutions in the embodiments of the present application are described below clearly and completely with reference to the accompanying drawings. Obviously, the described embodiments are only some, rather than all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present application fall within the scope of protection of the present application.
In order to improve accuracy of image recognition results, the embodiment of the application provides an image recognition method, an image recognition system and electronic equipment. An exemplary description will be first made of an image recognition method provided in an embodiment of the present application.
The image recognition method provided in the embodiment of the present application, as shown in fig. 1, may specifically include the following steps:
s101, determining a non-tampered region that is not tampered with in the first image.
S102, displaying the first image marked with the non-tampered region.
S103, in response to the region selection operation for the first image, determining a region to be identified selected by the region selection operation in the first image.
S104, determining, among preset retrieval attributes, candidate retrieval attributes of the objects present in the region to be identified.
S105, displaying an attribute selection interface containing the candidate retrieval attributes.
S106, in response to an attribute selection operation for the attribute selection interface, identifying the candidate retrieval attribute selected by the attribute selection operation as a target retrieval attribute.
S107, acquiring an attribute value of a target retrieval attribute of the area to be identified as a target attribute value.
S108, searching a database, in which each object and the attribute values of the preset retrieval attributes of each object are stored in advance, for an object whose attribute value of the target retrieval attribute matches the target attribute value, and taking that object as the recognition result.
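Before each step is discussed in detail, the following minimal sketch shows how S101 to S108 could compose. It is an illustrative outline only; every step is passed in as a hypothetical placeholder callable, not the claimed implementation.

```python
# A hedged, high-level sketch of how S101-S108 fit together. All callables
# are hypothetical placeholders supplied by the caller; none of these names
# come from this application.
def image_recognition_flow(first_image, database,
                           detect_non_tampered,    # S101
                           show_marked_image,      # S102
                           get_selected_region,    # S103
                           get_candidate_attrs,    # S104
                           show_attr_interface,    # S105
                           get_target_attr,        # S106
                           extract_value,          # S107
                           match):                 # S108
    non_tampered = detect_non_tampered(first_image)
    show_marked_image(first_image, non_tampered)
    region = get_selected_region(first_image)
    candidates = get_candidate_attrs(region)
    show_attr_interface(candidates)
    target_attr = get_target_attr(candidates)
    target_value = extract_value(region, target_attr)
    # Keep only database objects whose stored attribute value matches.
    return [obj for obj in database
            if match(obj[target_attr], target_value)]
```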
By adopting this embodiment, the non-tampered region that has not been tampered with can be determined in the first image, and the first image marked with the non-tampered region is displayed. In response to a region selection operation for the first image, the region to be identified selected by the region selection operation is determined in the first image; candidate retrieval attributes of the objects present in the region to be identified are determined among the preset retrieval attributes; an attribute selection interface containing the candidate retrieval attributes is displayed; in response to an attribute selection operation for the attribute selection interface, the candidate retrieval attribute selected by the attribute selection operation is identified as the target retrieval attribute; the attribute value of the target retrieval attribute of the region to be identified is acquired as the target attribute value; and the database, in which each object and the attribute values of its preset retrieval attributes are stored in advance, is searched for an object whose attribute value of the target retrieval attribute matches the target attribute value, as the recognition result. Since the candidate retrieval attributes can be regarded as the preset retrieval attributes the user is likely to select, recommending them for the user to choose from allows the target retrieval attribute describing the content of interest to be determined quickly among many preset retrieval attributes. Because the database is built on the attribute values of the preset retrieval attributes of each object and the target retrieval attribute is chosen from those same attributes, matching the stored attribute values against the target attribute value screens the database down to the objects related to the content of interest, so the related information is obtained more accurately. Therefore, this embodiment characterizes the content of interest by determining the target retrieval attribute and determines its related information more accurately by matching attribute values, thereby improving the accuracy of image recognition.
S101 to S108 are described below in turn.
In S101, assume the first image includes object A, object B, and object C, and that some features of object A have been tampered into features of an object D. When the user is interested in object A, the user needs to identify object A in the first image to obtain its related information. Because some features of object A have been tampered into features of object D, directly identifying object A in the first image means identifying the tampered features, i.e., the features of object D, so the obtained result contains information about object D and the recognition result is not accurate enough.
For example, while watching a video, a user becomes interested in a certain function of a brand-A commodity shown in the video and wants to find that commodity through recognition, but the trademark of the commodity in the video has been tampered into the trademark of brand B, and brand B's commodity does not have the function shown in the video. If the user directly identifies the commodity image in the video, the commodity information of brand B is obtained; since brand B's commodity does not have the function shown in the video, the user cannot obtain the information of a commodity that has that function, i.e., cannot obtain the desired recognition result.
Therefore, to make the recognition result more accurate, the non-tampered region needs to be determined in the first image and displayed, so that the user can distinguish the tampered region from the non-tampered region. This avoids the situation where the user obtains irrelevant information because it is uncertain whether the image has been tampered with, and thereby improves the accuracy of the recognition result.
As to the manner of determining the non-tampered region in the first image, in one possible embodiment, a second image obtained by image enhancement of the first image may be acquired; the first image and the second image are input into a deep learning dual-stream detection model to obtain an output result of the dual-stream detection model, where the output result includes the confidence of each candidate tampered region in the first image; and a candidate tampered region whose confidence is not smaller than a preset confidence threshold is determined as a tampered region in the first image.
Specifically, image enhancement may include image decoding, image scaling, spatial-domain high-pass filtering, and/or frequency-domain high-pass filtering. If the acquired first image is encoded and packaged image data, it must first be decoded. Processing the first image with spatial-domain and/or frequency-domain high-pass filtering highlights the detail features of the first image and sharpens blurred edges, achieving image enhancement.
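As a concrete illustration of this step, the following is a minimal sketch of spatial-domain high-pass enhancement, assuming OpenCV and NumPy are available; the input path, target size, and kernel values are illustrative assumptions, not values prescribed by this application.

```python
import cv2
import numpy as np

# Image decoding: the path is a hypothetical example.
first_image = cv2.imread("first_image.jpg")
# Image scaling: the target size is an example, not prescribed.
first_image = cv2.resize(first_image, (512, 512))

# Spatial-domain high-pass filtering: a Laplacian-style kernel suppresses
# smooth regions and emphasizes edges and fine detail, yielding the
# enhanced "second image".
kernel = np.array([[-1, -1, -1],
                   [-1,  8, -1],
                   [-1, -1, -1]], dtype=np.float32)
second_image = cv2.filter2D(first_image, -1, kernel)
```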
If the image features in the first image are blurred, the accuracy of features extracted from the first image alone is low. Were only the first image input into the dual-stream detection model to detect tampered regions, the features the model extracts would be inaccurate, so its output would be inaccurate and the non-tampered region determined in the first image would be inaccurate.
Meanwhile, because the second image is obtained by enhancing the first image, some of its image features are changed and differ from those of the first image. Were only the second image input into the dual-stream detection model, the extracted features could deviate from those of the first image, again lowering the accuracy of the output and of the determined non-tampered region.
Therefore, both the second image and the first image are input into the deep learning dual-stream detection model, so that the features the model extracts combine the image features of the first image and the second image. The extracted features are thus more accurate, the output of the model is more accurate, the non-tampered region determined in the first image is more accurate, and the accuracy of image recognition is improved.
It can be understood that, for the first image, the tampered area and the non-tampered area are complementary, and the tampered area in the first image is determined, so that the other areas except the tampered area in the first image are non-tampered areas which are not tampered in the first image. Thus, determining a tampered region in the first image is equivalent to determining a non-tampered region in the first image that has not been tampered with.
If the confidence of a candidate tampered region is not smaller than the preset confidence threshold, the candidate region is determined as a tampered region in the first image, and the remaining regions of the first image are the non-tampered regions that have not been tampered with. The preset confidence threshold may be set according to experience or actual requirements, for example 0.5, 0.6, or 0.65.
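The thresholding just described can be sketched as follows; the assumption that the dual-stream detection model returns (bounding box, confidence) pairs, and the example boxes, are illustrative only.

```python
CONFIDENCE_THRESHOLD = 0.5  # one of the example values given above

def tampered_regions(candidates, threshold=CONFIDENCE_THRESHOLD):
    """Keep the candidate boxes judged tampered; every other part of the
    first image is then treated as the non-tampered region."""
    return [box for box, confidence in candidates if confidence >= threshold]

# e.g. candidates = dual_stream_model(first_image, second_image)  # hypothetical
boxes = tampered_regions([((10, 10, 80, 80), 0.91),
                          ((120, 40, 60, 60), 0.32)])
# -> [(10, 10, 80, 80)]
```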
With this embodiment, the deep learning dual-stream detection model (based on Faster R-CNN) detects faster and more accurately than other R-CNN detection models, so it improves both the speed and the accuracy of determining the tampered region in the first image. As described above, determining the tampered region is equivalent to determining the non-tampered region, so the dual-stream detection model improves the speed and accuracy of determining the non-tampered region in the first image, and thereby the accuracy of image recognition.
In another possible embodiment, the non-tampered region in the first image may also be determined by a detection model other than the deep learning dual-stream detection model.
In S102, when the first image marked with the non-tampered region is displayed, the non-tampered region must be clearly distinguishable from the tampered region, so that the user does not obtain irrelevant information through uncertainty about whether the image has been tampered with.
In order to enable the user to distinguish the non-tampered region from the tampered region more clearly, in a possible embodiment, the tampered region in the first image may be subjected to image processing to obtain a third image, where the image processing includes: frame-selecting the tampered region, mosaic processing, color-coating processing, and binarization processing; the third image is then displayed.
Specifically, once the tampered region in the first image has been determined in S101, image processing may be performed on the tampered region, and the resulting image is taken as the third image and displayed. Performing image processing on the tampered region amounts to marking the tampered region in the first image; since the tampered region and the non-tampered region are complementary, marking the tampered region is equivalent to marking the non-tampered region. Therefore, displaying the third image obtained by image processing of the tampered region is equivalent to displaying the first image marked with the non-tampered region.
The image processing employed should noticeably change the visual appearance of the region, and may therefore include: framing the tampered region with a frame of any color, line style, or shape; mosaic processing; color-coating processing; binarization processing; and so on. The specific manner of image processing is not limited in this application.
For example, as shown in fig. 2a, the displayed first image marked with the non-tampered area may be a tampered area in the first image selected by the dashed frame in fig. 2a, and the non-tampered area in the first image is not selected by the dashed frame.
By adopting this embodiment, the third image obtained by image processing of the tampered region in the first image can be displayed, i.e., the first image marked with the non-tampered region is displayed, so that the non-tampered region is clearly distinguishable from the tampered region. This prevents the user from obtaining irrelevant information through uncertainty about whether the image has been tampered with, and improves the accuracy of the recognition result.
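Two of the listed processing options, frame selection and mosaic, can be sketched as follows with OpenCV; the (x, y, width, height) box format and the parameter values are illustrative assumptions.

```python
import cv2

def mark_tampered(image, box, mode="frame"):
    """Return the 'third image' with the tampered region made visually
    distinct, either framed or mosaicked."""
    x, y, w, h = box
    out = image.copy()
    if mode == "frame":
        # Frame-select the tampered region with a red rectangle.
        cv2.rectangle(out, (x, y), (x + w, y + h), (0, 0, 255), 2)
    elif mode == "mosaic":
        # Mosaic: downscale the region, then upscale it with
        # nearest-neighbour interpolation so the blocks stay visible.
        roi = out[y:y + h, x:x + w]
        small = cv2.resize(roi, (max(1, w // 16), max(1, h // 16)))
        out[y:y + h, x:x + w] = cv2.resize(
            small, (w, h), interpolation=cv2.INTER_NEAREST)
    return out
```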
In S103, after the first image marked with the non-tampered region is displayed, the user can perform a region selection operation on the first image; the region to be identified selected by this operation is the region where the object the user is interested in is located. The region to be identified may contain only the non-tampered region, only the tampered region, or both the non-tampered region and the tampered region.
It can be understood that the tampered content in the first image may still depict something that actually exists, for example an image cut out from another real image, so the user may be interested in the content of the tampered region, and the region to be identified may therefore also include the tampered region.
For example, the first image on which the region to be identified is determined may be as shown in fig. 2b, where the region enclosed by the solid frame is the selected region to be identified.
In S104, since the region to be identified may contain several identifiable objects, and different objects have several identifiable features, directly identifying the region yields overly broad results that may not include the information of interest. For example, if the object of interest in the region to be identified is a person wearing red clothes, directly identifying the region identifies the person in general, so the results may include sitting persons, running persons, and so on, and may not include persons wearing red clothes, lowering the accuracy of image recognition.
Based on this, through S104 to S106, the user can determine, among the preset retrieval attributes and according to the content of interest, the target retrieval attribute for the region to be identified, so that the region is identified according to the target retrieval attribute and the recognition result of interest is obtained.
The preset retrieval attributes may cover a variety of features of commonly recognized objects. For example, they may include features of a person, such as the human body model, hair, and walking posture; features of a vehicle, such as the license plate number and vehicle appearance; and features of articles, such as clothing style, cup shape, and ornament shape. The specific setting of the preset retrieval attributes is not limited in this application.
When determining the candidate retrieval attributes, since several identifiable objects may exist in the region to be identified, if the categories of those objects are similar, the intersection of the preset retrieval attributes of the objects can be used as the candidate retrieval attributes. For example, if the region to be identified contains flowers of several different varieties, all the objects belong to the category of plants, so the intersection of their preset retrieval attributes, i.e., the attributes common to all flowers, such as petal shape and leaf shape, can be used as the candidate retrieval attributes.
If the categories of the objects in the region to be identified differ greatly, the union of their preset retrieval attributes can be used as the candidate retrieval attributes. For example, if the region contains both a person and a vehicle, the two plainly belong to different categories with almost no intersection between their preset retrieval attributes, so the union of the preset retrieval attributes of each object, i.e., the person attributes plus the vehicle attributes, such as human body model, hair, walking posture, license plate number, and vehicle appearance, is taken as the candidate retrieval attributes.
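The intersection/union rule above reduces to ordinary set operations. In the following minimal sketch, the attribute sets are illustrative examples echoing those in this description.

```python
# Preset retrieval attributes per object (illustrative examples).
person = {"human body model", "hair", "walking posture"}
vehicle = {"license plate number", "vehicle appearance"}
rose = {"petal shape", "leaf shape"}
tulip = {"petal shape", "leaf shape", "stem length"}

# Similar categories (both plants): take the intersection.
candidates_similar = rose & tulip        # {'petal shape', 'leaf shape'}

# Very different categories (person vs. vehicle): take the union.
candidates_different = person | vehicle  # all five attributes
```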
In S105, displaying an attribute selection interface containing the candidate retrieval attributes may mean either that the displayed interface contains only the candidate retrieval attributes, or that in the displayed interface the candidate retrieval attributes are placed before the other preset retrieval attributes that are not candidates.
For the display of the attribute selection interface, in one possible implementation, after the region to be identified selected by the user's region selection operation is determined, the candidate retrieval attributes are determined directly in the manner of S104, and an attribute selection interface containing them is displayed.
In another possible implementation, in response to an interface display operation on the region to be identified, an attribute selection interface containing the candidate retrieval attributes can be displayed. The display process may be as shown in fig. 2c, where the "retrieval type" in fig. 2c corresponds to the preset retrieval attributes herein, and the "retrieval result" corresponds to the recognition result herein. As shown in fig. 2c, after the region to be identified is determined, the user may click the retrieval type at the lower left corner of the first image, and the attribute selection interface pops up as a drop-down box over the interface displaying the first image marked with the non-tampered region. Alternatively, after the region to be identified is determined, clicking the retrieval type may open another interface in which the user selects the target retrieval attribute. In this example, the user's click on the retrieval type can be regarded as the interface display operation on the region to be identified. The display form of the attribute selection interface is not limited in this application.
As shown in fig. 2c, the candidate retrieval attributes presented by the attribute selection interface may include: human body model (i.e., "human body contrast" in fig. 2c), walking posture, and clothing color. When the candidate retrieval attribute is the human body model, the human body model obtained by modelling the region to be identified needs to be compared with the human body models stored in the database, and the stored model most similar to the one obtained from the region is determined as the recognition result. Since this process is essentially a comparison of human bodies, the candidate retrieval attribute "human body model" corresponds to "human body contrast" in fig. 2c.
In S106, the target retrieval attribute can be regarded as a feature of the object in the region to be identified, and one or more target retrieval attributes may be determined.
As described in S104, the intersection or union of the preset retrieval attributes of the objects in the region to be identified may be used as the candidate retrieval attributes. If the union is used, the number of candidate retrieval attributes may be large, making it difficult for the user to pick out directly the candidate attribute of the content of interest as the target retrieval attribute. Thus, in one possible implementation, the user may find the target retrieval attribute among the candidates via a text search in the attribute selection interface. In this example, the text search constitutes the user's attribute selection operation, and the candidate retrieval attribute found by the search is the one selected by that operation.
If the number of candidate retrieval attributes is small, the user can directly pick the candidate attribute of the content of interest as the target retrieval attribute. Thus, in another possible implementation, the user determines the target retrieval attribute directly among the candidates. As shown in fig. 2c, ticking the selection box in front of a retrieval attribute displayed in the attribute selection interface can be taken as the user's attribute selection operation, and in response, the candidate retrieval attribute thus selected is identified as the target retrieval attribute.
In S107, the attribute value can be regarded as the feature value of the target retrieval attribute, i.e., it represents the feature value of the target retrieval attribute of an object in the region to be identified. Attribute values may take various forms, such as feature vectors or numerical values.
For example, if the determined target retrieval attribute is the vehicle appearance, the feature value of the appearance of the vehicle in the region to be identified needs to be determined; if the determined target retrieval attribute is the animal shape, the feature value of the shape of the animal in the region to be identified needs to be determined.
In one possible embodiment, if the first image is a video frame in a video stream and the target retrieval attribute is the walking posture, the attribute value of the walking posture of the region to be identified is determined according to the first image and a plurality of video frames adjacent to the first image in the video stream.
In this embodiment, when the determined target retrieval attribute is a feature that a single image cannot reflect, the attribute value of the target retrieval attribute of the region to be identified can be determined from the first image together with a plurality of adjacent video frames in the video stream to which the first image belongs.
In S108, because the database is built on the attribute values of the preset retrieval attributes of the objects it stores, and the target retrieval attribute of the region to be identified is chosen from those preset retrieval attributes, the objects in the database can be screened during image recognition by searching for the object whose attribute value of the target retrieval attribute matches the target attribute value.
In one possible embodiment, for each object in the database, the target attribute value and the attribute value of the target retrieval attribute of that object can be input into a deep learning network model, which outputs whether the two match; the object whose attribute value of the target retrieval attribute matches the target attribute value is thereby determined in the database as the recognition result.
In another possible embodiment, the matching degree between the target attribute value and the attribute value of the target retrieval attribute of each object in the database can be determined as the matching degree of each object, and an object whose matching degree is higher than a preset matching degree threshold is determined as the recognition result.
The matching degree can be measured by similarity. Specifically, it can be measured by parameters inversely related to similarity, such as the Euclidean distance and the Mahalanobis distance, or by parameters positively related to similarity, such as the cosine similarity.
It can be understood that when objects whose matching degree is higher than the preset matching degree threshold are determined as the recognition result, what matters in essence is not the specific numerical value of the matching degree, but the comparison between the matching degrees of the image data in the region to be identified and the objects in the database, so that the objects that match the image data in the region better are obtained as the recognition result.
If the matching degree is measured by the Euclidean distance, which is inversely related to similarity, then a larger Euclidean distance means a smaller similarity and a lower matching degree between the object and the image data in the region to be identified. In that case, the objects whose Euclidean distance is smaller than a preset Euclidean distance threshold are determined as the recognition result, which is equivalent to determining the objects whose matching degree is higher than the preset matching degree threshold.
The preset matching degree threshold can be set according to experience or actual requirements. If the matching degree is measured by similarity, the threshold may be 0.5, 0.6, 0.65, and so on. If it is measured by the Euclidean distance, the threshold may be 10, 15, 20, and so on; as described above, in that case the objects whose Euclidean distance is smaller than the threshold are determined as the recognition result.
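The way the comparison direction flips with the chosen measure can be sketched as follows; representing attribute values as NumPy feature vectors, and the threshold values 0.6 and 15.0, are assumptions echoing the examples above.

```python
import numpy as np

def matches(target_value, candidate_value, metric="cosine"):
    if metric == "cosine":
        # Cosine similarity is positively related to the matching degree,
        # so a candidate matches when the similarity is high enough.
        sim = float(np.dot(target_value, candidate_value)) / (
            np.linalg.norm(target_value) * np.linalg.norm(candidate_value))
        return sim >= 0.6
    # Euclidean distance is inversely related to the matching degree,
    # so the comparison direction flips: smaller distance means a match.
    return float(np.linalg.norm(target_value - candidate_value)) < 15.0
```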
For example, if the determined target retrieval attribute is a human model, the region to be identified may be input to the deep learning network, so as to obtain a human model a of the region to be identified output by the deep learning network; and determining the matching degree between the human body model and each human body model in the database.
In this example, human body model A is a specific human body model obtained by modeling the region to be identified, and is the attribute value of the target retrieval attribute of the region to be identified; each human body model in the database is the attribute value of the target retrieval attribute of the corresponding object.
Screening objects with the preset matching degree threshold yields a more accurate recognition result, but in practice the threshold may be set too high or too low. A threshold that is too high may make it impossible to determine any recognition result, i.e., recognition fails and the recognition success rate drops; a threshold that is too low may let objects with a low matching degree and insufficient accuracy slip into the recognition result, reducing its accuracy.
Based on this, in one possible embodiment, the objects whose matching degree ranks within a preset number of top positions, when the matching degrees are sorted from high to low, may be determined as the recognition result.
Specifically, the top objects can be found by directly comparing the matching degrees of the objects, without a full ranking. Alternatively, the matching degrees can be sorted from high to low and the objects occupying the first preset number of positions taken as the recognition result; or they can be sorted from low to high and the objects occupying the last preset number of positions taken as the recognition result. The preset number may be set according to user requirements, for example 5, 10, or 15.
In this way, the objects with the highest matching degrees can be determined as the recognition result, so that every object in the recognition result matches better than any other object in the database; and since the preset number can be set according to user requirements, the number of objects in the recognition result meets those requirements, improving the accuracy of image recognition.
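A top-N variant of the same screening, reusing the hypothetical matching_degree function from the previous sketch, could look as follows; heapq.nlargest compares matching degrees directly, which corresponds to the "comparison without full ranking" option described above.

```python
import heapq

def top_n_results(target_value, database, n=5):
    """Return the ids of the n database objects with the highest matching degree."""
    scored = ((matching_degree(target_value, value), obj_id)
              for obj_id, value in database.items())
    return [obj_id for _, obj_id in heapq.nlargest(n, scored)]
```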
When a plurality of target retrieval attributes are determined, then for each target retrieval attribute, the matching degree between the attribute value of that target retrieval attribute of the region to be identified and the attribute value of the same attribute of each object in the database is determined as that object's matching degree under that attribute; each object's rank when the matching degrees under that attribute are sorted from high to low is determined; the ranks of each object under all target retrieval attributes are then combined into a composite rank, and the objects within the top preset number of positions are taken as the recognition result.
For example, suppose the determined target retrieval attributes are target retrieval attribute 1 and target retrieval attribute 2. If the matching degree between object 1's attribute value of target retrieval attribute 1 and the region to be identified's attribute value of target retrieval attribute 1 ranks 12th, and the corresponding matching degree for target retrieval attribute 2 ranks 10th, the composite rank of object 1 is (12 + 10) / 2 = 11. If object 2's matching degree ranks 3rd under target retrieval attribute 1 and 7th under target retrieval attribute 2, the composite rank of object 2 is (3 + 7) / 2 = 5.
The composite rank may be calculated by averaging, as in the example above, or by weighted summation according to the weight of each target retrieval attribute; the calculation method is not limited in any way here.
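A sketch of this composite ranking, assuming the per-attribute ranks have already been computed (1 = best match): with no weights it reproduces the averaging in the example above, and with weights it becomes the weighted summation variant.

```python
def composite_ranking(per_attribute_ranks, weights=None):
    """Order objects by their combined rank across all target retrieval attributes.

    per_attribute_ranks maps object id -> list of ranks, one rank per
    target retrieval attribute. A lower combined rank means a better
    overall match.
    """
    def combined(ranks):
        if weights is None:
            return sum(ranks) / len(ranks)                    # plain average
        return sum(w * r for w, r in zip(weights, ranks))     # weighted sum
    return sorted(per_attribute_ranks, key=lambda k: combined(per_attribute_ranks[k]))

# Example from the text: composite_ranking({"object 1": [12, 10], "object 2": [3, 7]})
# -> ["object 2", "object 1"], since (3 + 7) / 2 = 5 beats (12 + 10) / 2 = 11.
```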
For the display of recognition results, the thumbnail information of each recognition result can be displayed in order of the matching degree between the recognition result's attribute value of the target retrieval attribute and the target attribute value, from high to low; in response to a viewing operation on the thumbnail information, the recognition result corresponding to the thumbnail information selected by the viewing operation is identified, and its detailed information is displayed.
In this example, the recognition result is displayed as thumbnail information, and the user can view detailed information such as a picture, video, or text corresponding to a recognition result by clicking the displayed thumbnail information; the click can be regarded as a viewing operation on that thumbnail information.
For example, the recognition results may be displayed as the search results on the right side of fig. 2c, where NO1, NO2, NO3, NO4, and NO5 indicate the order of the objects in the recognition result from the highest matching degree to the lowest; that is, the object displayed at NO1 has the highest matching degree in the recognition result.
After the non-tampered region of the first image is determined in step S101, the first image marked with the non-tampered region and the first image without the marking may be stitched together and displayed, which is convenient for the user to view. Thus, in one possible embodiment, if the first image is a single image, two first images stitched to each other may be displayed, one marked with the non-tampered region and the other unmarked.
If the first image is each video frame in a video clip, two video playing windows stitched to each other can be displayed; the two windows play the video clip synchronously, with the non-tampered region marked in the frames of one window and unmarked in the frames of the other.
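One way such synchronized side-by-side playback could be realized is sketched below with OpenCV; reading one frame from each capture per loop iteration keeps the two windows on the same frame index, which for clips with identical frame rates amounts to synchronizing by timestamp. The file paths and the playback delay are illustrative.

```python
import cv2

def play_side_by_side(marked_path, original_path):
    """Play the marked and unmarked versions of the same clip in lockstep."""
    cap_marked = cv2.VideoCapture(marked_path)
    cap_original = cv2.VideoCapture(original_path)
    while True:
        ok_m, frame_m = cap_marked.read()
        ok_o, frame_o = cap_original.read()
        if not (ok_m and ok_o):
            break  # one clip ended
        cv2.imshow("marked", frame_m)
        cv2.imshow("original", frame_o)
        if cv2.waitKey(33) & 0xFF == ord("q"):  # ~30 fps; 'q' quits
            break
    cap_marked.release()
    cap_original.release()
    cv2.destroyAllWindows()
```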
It is to be understood that the foregoing S101-S102 can be regarded as retrieving the non-tampered region in the first image, and S103-S108 as retrieving the object in the region to be identified. The process of retrieving the region to be identified can therefore be regarded as retrieving the first image a second time; for convenience of description, it is referred to below as the second retrieval.
Fig. 3 is another flow chart of the image recognition method provided in the embodiment of the present application, where the tampered area in the first image may be determined by the method shown in fig. 3. As shown in fig. 3, the method includes:
S301, inputting a first image.
S302, performing image preprocessing on the first image.
S303, obtaining the original image (namely the first image) through image preprocessing, and obtaining a noise image (namely the second image) through spatial-domain high-pass filtering and frequency-domain high-pass filtering.
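A minimal sketch of how such a noise image might be derived, assuming a grayscale input; the Laplacian-style kernel and the frequency cutoff radius are illustrative choices, as the text does not specify the exact filters.

```python
import cv2
import numpy as np

def noise_image(gray):
    """Derive the 'second image' that emphasizes high-frequency tampering traces."""
    # Spatial-domain high pass: Laplacian-style kernel.
    kernel = np.array([[0, -1, 0], [-1, 4, -1], [0, -1, 0]], np.float32)
    spatial = cv2.filter2D(gray.astype(np.float32), -1, kernel)

    # Frequency-domain high pass: zero out a low-frequency disc of the spectrum.
    f = np.fft.fftshift(np.fft.fft2(gray))
    h, w = gray.shape
    y, x = np.ogrid[:h, :w]
    f[(y - h // 2) ** 2 + (x - w // 2) ** 2 <= 30 ** 2] = 0  # cutoff radius 30
    freq = np.abs(np.fft.ifft2(np.fft.ifftshift(f)))

    combined = spatial + freq.astype(np.float32)
    return cv2.normalize(combined, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
```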
S304, inputting the first image and the second image into a deep learning dual-stream detection model (Faster R-CNN).
S305, the deep learning dual-stream detection model outputs the tampered image area (i.e., the candidate tampered area) and the confidence of the candidate tampered area.
S306, judging, through the alarm unit, whether the confidence of the candidate tampered region is greater than a preset confidence threshold; if not, i.e., the confidence is not greater than the preset confidence threshold, executing S307; if so, i.e., the confidence is greater than the preset confidence threshold, executing S308.
S307, the flow ends.
S308, converting the normalized coordinates of the candidate tampered region output by the model into the actual position of the candidate tampered region in the first image, thereby realizing the coordinate conversion.
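The coordinate conversion itself is straightforward; a sketch under the assumption that the model emits (x1, y1, x2, y2) coordinates normalized to [0, 1]:

```python
def to_pixel_coords(norm_box, width, height):
    """Map a normalized box output by the model to its actual pixel
    position in the first image."""
    x1, y1, x2, y2 = norm_box
    return (int(x1 * width), int(y1 * height),
            int(x2 * width), int(y2 * height))

# e.g. to_pixel_coords((0.25, 0.10, 0.75, 0.60), 1920, 1080)
# -> (480, 108, 1440, 648)
```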
S309, drawing the tampered region according to its actual position in the first image, namely performing the image processing on the tampered region in the first image described in S102. This realizes the labeling of the tampered region in the first image, which is equivalent to labeling the non-tampered region.
S310, stitching and displaying the first image marked with the tampered region together with the original first image.
Illustratively, as shown in fig. 4, the left side of fig. 4 is the original first image, the right side is the first image marked with the tampered region, and the dashed box marks the tampered region in the first image. The stitched image shown in fig. 4 can thus be regarded as the original image displayed together with the tamper-marked image.
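The stitched display of fig. 4 could be produced along the following lines; a solid rectangle stands in for the dashed box for brevity, and the color and line thickness are arbitrary choices.

```python
import cv2
import numpy as np

def stitch_marked_and_original(first_image, box):
    """Draw the tampered region on a copy and stitch it beside the original,
    reproducing the left (original) / right (marked) layout of fig. 4."""
    x1, y1, x2, y2 = box
    marked = first_image.copy()
    cv2.rectangle(marked, (x1, y1), (x2, y2), (0, 0, 255), 2)  # red box
    return np.hstack([first_image, marked])  # original left, marked right
```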
S311, judging whether the input is a video formed by a plurality of associated first images arranged in time sequence; if not, executing S312; if so, executing S313 and S314.
S312, the flow ends.
S313, controlling the playing display by synchronizing the time stamps.
S314, ending the flow.
Fig. 5 is another schematic flow chart of the image recognition method provided in the embodiment of the present application. As shown in fig. 5, on the basis of the embodiment of fig. 3, after the tampered region is drawn according to the actual position of a candidate tampered region whose confidence is greater than the confidence threshold and the first image with the drawn tampered region is displayed in the secondary retrieval interface (that is, after the non-tampered region is marked in the first image and the first image marked with the non-tampered region is displayed in the secondary retrieval interface), step S501 is executed. The secondary retrieval interface is the user interface for implementing the second retrieval (i.e., S103-S108 described above).
S501, secondary retrieval configuration, in which the user selects a retrieval region (i.e., the aforementioned region to be identified) and a retrieval type (i.e., the aforementioned target retrieval attribute).
The user can input a region selection instruction in the secondary retrieval interface, so that the region to be identified indicated by the region selection instruction is determined. After determining the region to be identified, the user can input an attribute selection instruction in the displayed attribute selection interface, so that the target retrieval attribute indicated by the attribute selection instruction is determined.
S502, identifying or modeling the region to be identified selected by the user through the deep learning structuring/model processing module to obtain structured data or model data.
S503, retrieval through the data retrieval model, namely searching the database with the structured data or model data to obtain the retrieval result. The database may store the structured data or models of the objects.
S504, displaying the Top-N nearest results according to the retrieval result, namely displaying, among the retrieval results, those most similar to the structured data or model data obtained by identifying or modeling the region to be identified.
S505, the user clicks the item information, and the detailed information of the identification result is displayed through the detailed information display module.
When the identification result is displayed, the simplified information can be displayed, the user can check the detailed information of the identification result by clicking the item information, and the detailed information can be displayed through the detailed information display module.
S506, the flow ends.
The above steps, performed on the basis of the embodiment of fig. 3, correspond to the aforementioned steps S103-S108.
For example, consider a scene in which a user needs to identify a person in a first image, but the region above the person's body has been tampered with. The specific implementation steps of the image recognition method provided in the present application are as follows: a first image is input; spatial-domain high-pass filtering and frequency-domain high-pass filtering are performed on the first image to obtain a second image; the first image and the second image are input into the deep learning dual-stream detection model (Faster R-CNN) to obtain the region above the body and its confidence output by the model. The alarm unit judges that the confidence of the region above the body is greater than the preset confidence threshold, i.e., the region above the person's body in the first image has been tampered with. The normalized coordinates of the region above the body output by the model are then converted into its actual position in the first image, and the region above the body is marked in the first image with a dashed box according to that actual position, as shown in the dashed box in fig. 2a.
The user selects the body part of the person in the first image as the region to be identified and the human body model as the target retrieval attribute. The region to be identified is input into the deep learning model processing module for modeling to obtain a human body model of the region to be identified, which is compared with the human body model objects stored in the database; the human body model object in the database most similar to that of the region to be identified is determined as the recognition result and displayed. The presented information may contain only rough information about the recognition result, and the user may view the detailed information by clicking the item information.
The image recognition method provided by the application can be applied to electronic devices such as a DVR (Digital Video Recorder), an NVR (Network Video Recorder), and a central storage device.
Corresponding to the foregoing image recognition method, the embodiment of the present application further provides an image recognition system, where the system includes:
and the deep learning image authenticity processing unit is used for determining a non-tampered area which is not tampered in the first image.
Specifically, the manner in which the deep learning image authenticity processing unit determines the untampered area in the first image is the same as that in S101, and reference may be made to the description of S101, which is not repeated here.
The business processing unit is used for displaying the first image marked with the non-tampered area; in response to a region selection operation for the first image, determining a region to be identified selected by the region selection operation in the first image; determining alternative retrieval attributes of the objects existing in the area to be identified in the preset retrieval attributes; displaying an attribute selection interface containing alternative retrieval attributes; responding to the attribute selection operation aiming at the attribute selection interface, and identifying the alternative retrieval attribute selected by the attribute selection operation as a target retrieval attribute; acquiring an attribute value of a target retrieval attribute of a region to be identified, and taking the attribute value as a target attribute value; searching for an object, of which the attribute value of the target retrieval attribute is matched with the target attribute value, from a database pre-stored with each object and the attribute value of the preset retrieval attribute of each object, and taking the object as a recognition result.
Specifically, the service processing unit displays a first image marked with a non-tampered area; responding to an area selection instruction input for a non-tampered area, and determining an area to be identified indicated by the area selection instruction in a first image; the manner of identifying the object in the area to be identified and obtaining the identification result is the same as that in S102-S108, and reference may be made to the description of S102-S108, which is not repeated here.
With this embodiment, the non-tampered region of the first image is determined by the deep learning image authenticity processing unit, and the first image marked with the non-tampered region is displayed by the service processing unit. In response to a region selection operation for the first image, the region to be identified selected by that operation is determined in the first image; the alternative retrieval attributes possessed by the objects in the region to be identified are determined among the preset retrieval attributes; an attribute selection interface containing the alternative retrieval attributes is displayed; in response to an attribute selection operation on that interface, the alternative retrieval attribute selected by the operation is identified as the target retrieval attribute; the attribute value of the target retrieval attribute of the region to be identified is acquired as the target attribute value; and an object whose attribute value of the target retrieval attribute matches the target attribute value is searched for, in a database pre-storing each object and the attribute values of its preset retrieval attributes, as the recognition result. When the target retrieval attribute is determined, the alternative retrieval attributes of the objects in the region to be identified can be determined among the preset retrieval attributes; these can be regarded as the preset retrieval attributes the user is likely to select, so recommending them for the user to choose from allows the target retrieval attribute reflecting the content the user is interested in to be determined quickly among many preset retrieval attributes. Because the database is built from the attribute values of the preset retrieval attributes of each object, and the target retrieval attribute of the region to be identified is chosen from those preset retrieval attributes, the target attribute value is the attribute value of the target retrieval attribute of the region to be identified. Matching the attribute value of the target retrieval attribute of each object in the database against the target attribute value, screening the objects accordingly, and taking the objects whose attribute values match as the recognition result therefore yields the information relevant to the content of interest more accurately. In this way, the embodiment of the application determines the characteristics of the content the user is interested in by determining the target retrieval attribute, and determines the relevant information more accurately by matching attribute values, thereby improving the accuracy of image recognition.
In one possible embodiment, the system further comprises: a data receiving unit for receiving a first image;
the image processing unit is used for carrying out image enhancement on the first image to obtain a second image;
the deep learning image authenticity processing unit is specifically used for acquiring the second image; inputting the first image and the second image into the deep learning dual-stream detection model to obtain an output result of the model, wherein the output result comprises the confidence of the candidate tampered region in the first image; and sending the output result to the alarm unit;
the alarm unit is used for responding to the output result, determining a candidate tampered area with the confidence coefficient not smaller than a preset confidence coefficient threshold value in the output result as a tampered area in the first image, and alarming the tampered area;
a storage management unit for storing data;
and the configuration management unit is used for configuring the image processing unit, the deep learning image authenticity processing unit, the alarm unit, the service processing unit and the storage management unit.
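Read as code, the division of labor among these units might be wired together as below; every method name here is an assumption made for illustration, since the embodiment specifies only each unit's responsibility, not its interface.

```python
class ImageRecognitionPipeline:
    """Illustrative wiring of the units listed above (hypothetical API)."""

    def __init__(self, receiver, image_proc, authenticity, alarm, business, storage):
        self.receiver = receiver          # data receiving unit
        self.image_proc = image_proc      # image processing unit
        self.authenticity = authenticity  # deep learning image authenticity unit
        self.alarm = alarm                # alarm unit
        self.business = business          # service processing unit
        self.storage = storage            # storage management unit

    def run(self):
        first = self.receiver.receive()                   # first image
        second = self.image_proc.enhance(first)           # image enhancement
        output = self.authenticity.detect(first, second)  # candidate regions + confidence
        tampered = self.alarm.raise_alarm(output)         # threshold check + alarm
        self.storage.save(output)                         # persist results
        return self.business.recognize(first, tampered)   # second retrieval (S102-S108)
```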
In a possible embodiment, the service processing unit is further configured to present the first image marked with the non-tampered region in response to an identification operation for the alarm unit.
An exemplary schematic structural diagram of the image recognition system is shown in fig. 6. After receiving the first image, the data receiving unit 601 inputs it to the image processing unit (i.e., the picture/video processing unit 602), which preprocesses the first image and outputs the second image; the image processing unit 602 may send the second image to the storage management unit 606 for storage, and may also obtain required image data from the storage management unit 606. The first image and the second image are input to the deep learning image authenticity processing unit 603 to obtain the candidate tampered region in the first image and its authenticity confidence. The alarm unit 604 judges, according to the authenticity confidence, whether the candidate tampered region is a tampered region in the first image and raises an alarm for the tampered region to obtain an alarm result, thereby realizing the detection of the tampered region of the first image, namely the first recognition of the first image.
The service processing unit 605 may display the first image marked with the non-tampered region according to the alarm result, raise the alarm for the tampered region in the form of a screen pop-up window or voice broadcast, and, upon receiving a recognition instruction input by the user (that is, when the user wants to perform the second recognition of the first image), execute the foregoing S102-S108 to perform image recognition on the first image.
In addition, the service processing unit 605 may send the obtained retrieval result, i.e., the image recognition result, to the storage management unit 606 for storage, and the alarm unit 604 may likewise send the obtained alarm result to the storage management unit 606 for storage.
The configuration management unit 607 is used to configure the image processing unit 602, the deep learning image authenticity processing unit 603, the alarm unit 604, the service processing unit 605, and the storage management unit 606. For example, the deep learning image authenticity processing unit 603 may be configured to output the authenticity confidence of the candidate tampered region and to decide whether to display an alarm according to the confidence; the alarm unit 604 may be configured to decide whether to store alarms; and the service processing unit 605 may be configured with the region to be identified and the target retrieval attribute when performing steps S102-S104. The configuration management unit 607 corresponds to the configuration management 607 shown in fig. 6.
Fig. 7 is a schematic flow chart of another image recognition method provided in the present application. As shown in fig. 7, the configuration management unit configures the data receiving unit to determine the different preprocessing of the first images it receives. The video processing unit (i.e., the image processing unit) judges the type of the received first image: if it is an analog video, video acquisition is performed; if it is a digital video, video decoding is performed; and if it is not a video, the picture data of the first image is acquired. The second image obtained after processing is sent to the storage management unit for storage, and the first image and the second image are input to the deep learning authenticity processing unit (i.e., the deep learning image authenticity processing unit) to detect the tampered region in the first image, obtaining the candidate tampered region output by the unit and its confidence. The alarm unit judges, according to the authenticity confidence, whether the candidate tampered region is a tampered region in the first image, obtains the alarm result, and sends it to the storage management unit for storage. The service processing unit (i.e., service processing) can display the alarm according to the alarm result; it can also display the first image marked with the non-tampered region according to the alarm result, perform the second recognition on the non-tampered region against the objects in the database of the storage management unit to obtain the recognition result, namely realize the second image retrieval, and display the recognition result.
Corresponding to the foregoing image recognition method, the embodiment of the present application further provides an image recognition device, as shown in fig. 8, where the device includes:
a non-tampered region determining module 801 configured to determine a non-tampered region that has not been tampered in the first image;
a first image display module 802 for displaying a first image labeled with a non-tampered region;
a region to be identified determining module 803, configured to determine, in response to a region selection operation for the first image, a region to be identified selected by the region selection operation in the first image;
an alternative search attribute determining module 804, configured to determine an alternative search attribute that is possessed by an object existing in the area to be identified in the preset search attributes;
the attribute selection interface display module 805 is configured to display an attribute selection interface including an alternative search attribute;
a target search attribute selection module 806, configured to identify, in response to an attribute selection operation for the attribute selection interface, an alternative search attribute selected by the attribute selection operation as a target search attribute;
a target attribute value acquisition module 807 configured to acquire an attribute value of a target retrieval attribute of the area to be identified as a target attribute value;
the recognition result searching module 808 is configured to search, in a database that stores in advance each object and an attribute value of a preset search attribute of each object, for an object whose attribute value of a target search attribute matches the target attribute value, as a recognition result.
In one possible embodiment, the first image is a video frame in a video stream and the target retrieval attribute is a walking gesture;
the target attribute value acquisition module acquires an attribute value of a target retrieval attribute of an area to be identified as a target attribute value, and comprises:
and determining the attribute value of the walking gesture of the area to be identified as a target attribute value according to the first image and a plurality of video frames adjacent to the first image in the video stream.
In one possible embodiment, the apparatus further comprises:
the thumbnail information display module is used for sequentially displaying thumbnail information of each recognition result according to the sequence from high to low of the matching degree of the attribute value of the target retrieval attribute of the recognition result and the target attribute value;
the thumbnail information identification module is used for responding to the view operation of the thumbnail information and identifying an identification result corresponding to the thumbnail information selected by the view operation;
and the detailed information display module is used for displaying the detailed information of the identification result.
In one possible embodiment, the non-tampered region determining module determining a non-tampered region that is not tampered with in the first image comprises:
acquiring a second image obtained by carrying out image enhancement on the first image;
inputting the first image and the second image into the deep learning dual-stream detection model to obtain an output result of the model, wherein the output result comprises the confidence of the candidate tampered region in the first image;
and determining the candidate tampered region with the confidence level not smaller than a preset confidence level threshold as a tampered region in the first image.
In one possible embodiment, the first image is a single image, the apparatus further comprising:
the first spliced display module is used for displaying two first images spliced with each other, wherein one first image is marked with a non-tampered area, and the other first image is not marked with the non-tampered area;
or,
the first image is each video frame in a video clip, the apparatus further comprising:
the second spliced display module is used for displaying two video playing windows spliced with each other, the two video playing windows synchronously play video clips, a non-tampered area is marked in a video frame played by one video playing window, and a non-tampered area is not marked in a video frame played by the other video playing window.
In one possible embodiment, a first image presentation module presents a first image labeled with a non-tampered region, comprising:
Performing image processing on the tampered region in the first image to obtain a third image, wherein the image processing includes: box-selecting the tampered region, mosaic processing, color-coating (painting) processing, and binarization processing; and displaying the third image.
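As one concrete instance of the listed processing types, mosaic processing over the tampered region could be sketched as follows; the block size is an illustrative parameter.

```python
import cv2

def mosaic_region(image, box, block=16):
    """Mosaic the tampered region in place: shrink it, then enlarge it back
    with nearest-neighbour interpolation so each block becomes one flat cell."""
    x1, y1, x2, y2 = box
    w, h = x2 - x1, y2 - y1
    roi = image[y1:y2, x1:x2]
    small = cv2.resize(roi, (max(w // block, 1), max(h // block, 1)),
                       interpolation=cv2.INTER_LINEAR)
    image[y1:y2, x1:x2] = cv2.resize(small, (w, h),
                                     interpolation=cv2.INTER_NEAREST)
    return image  # the third image, with the tampered region pixelated
```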
The embodiment of the application also provides an electronic device, as shown in fig. 9, including:
a memory 901 for storing a computer program;
the processor 902 is configured to execute the program stored in the memory 901, thereby implementing the following steps:
determining a non-tampered region in the first image that has not been tampered with;
displaying a first image marked with a non-tampered region;
in response to a region selection operation for the first image, determining a region to be identified selected by the region selection operation in the first image;
determining alternative retrieval attributes of the objects existing in the area to be identified in the preset retrieval attributes;
displaying an attribute selection interface containing alternative retrieval attributes;
responding to the attribute selection operation aiming at the attribute selection interface, and identifying the alternative retrieval attribute selected by the attribute selection operation as a target retrieval attribute;
acquiring an attribute value of a target retrieval attribute of a region to be identified, and taking the attribute value as a target attribute value;
searching for an object, of which the attribute value of the target retrieval attribute is matched with the target attribute value, from a database pre-stored with each object and the attribute value of the preset retrieval attribute of each object, and taking the object as a recognition result.
The electronic device may further include a communication bus and/or a communication interface, through which the processor 902, the communication interface, and the memory 901 communicate with each other.
The communication bus mentioned above for the electronic device may be a Peripheral Component Interconnect (PCI) bus or an Extended Industry Standard Architecture (EISA) bus, etc., and may be classified as an address bus, a data bus, a control bus, and so on. For ease of illustration, only one thick line is shown in the figure, but this does not mean there is only one bus or one type of bus.
The communication interface is used for communication between the electronic device and other devices.
The memory may include random access memory (RAM) or non-volatile memory (NVM), such as at least one disk memory. Optionally, the memory may also be at least one storage device located remotely from the aforementioned processor.
The processor may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), etc.; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
In yet another embodiment provided herein, there is also provided a computer readable storage medium having stored therein a computer program which when executed by a processor implements the steps of any of the image recognition methods described above.
In yet another embodiment provided herein, there is also provided a computer program product containing instructions that, when run on a computer, cause the computer to perform any of the image recognition methods of the above embodiments.
The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented in whole or in part in the form of a computer program product, which comprises one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the flows or functions according to the embodiments of the present application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another, for example by wire (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wirelessly (e.g., infrared, radio, microwave). The computer-readable storage medium may be any available medium accessible to a computer, or a data storage device such as a server or data center integrating one or more available media. The available media may be magnetic media (e.g., floppy disk, hard disk, magnetic tape), optical media (e.g., DVD), solid-state disks (SSD), etc.
It is noted that relational terms such as first and second are used solely to distinguish one entity or action from another, without necessarily requiring or implying any actual such relationship or order between those entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
In this specification, the embodiments are described in a progressive manner; identical or similar parts of the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, for the system, apparatus, electronic device, computer-readable storage medium, and computer program product embodiments, the description is relatively brief since they are substantially similar to the method embodiments; for relevant points, refer to the partial description of the method embodiments.
The foregoing description is only of the preferred embodiments of the present application and is not intended to limit the scope of the present application. Any modifications, equivalent substitutions, improvements, etc. that are within the spirit and principles of the present application are intended to be included within the scope of the present application.
Claims (10)
1. An image recognition method, the method comprising:
determining a non-tampered region in the first image that has not been tampered with;
displaying the first image marked with the non-tampered region;
determining a region to be identified selected by the region selection operation in the first image in response to the region selection operation for the first image;
determining alternative retrieval attributes of the objects existing in the area to be identified in preset retrieval attributes;
displaying an attribute selection interface containing the alternative retrieval attribute;
responding to an attribute selection operation aiming at the attribute selection interface, and identifying an alternative retrieval attribute selected by the attribute selection operation as a target retrieval attribute;
acquiring an attribute value of the target retrieval attribute of the region to be identified as a target attribute value;
searching for an object, of which the attribute value of the target retrieval attribute is matched with the target attribute value, from a database in which each object and the attribute value of the preset retrieval attribute of each object are stored in advance, and taking the object as a recognition result.
2. The method of claim 1, wherein the first image is a video frame in a video stream and the target retrieval attribute is a walking gesture;
the obtaining the attribute value of the target retrieval attribute of the region to be identified as a target attribute value includes:
and determining the attribute value of the walking gesture of the area to be identified as a target attribute value according to the first image and a plurality of video frames adjacent to the first image in the video stream.
3. The method according to claim 1, wherein the method further comprises:
sequentially displaying the thumbnail information of each recognition result according to the sequence from high to low of the matching degree of the attribute value of the target retrieval attribute of the recognition result and the target attribute value;
responding to the view operation of the thumbnail information, and identifying an identification result corresponding to the thumbnail information selected by the view operation;
and displaying the detailed information of the identification result.
4. The method of claim 1, wherein the determining a non-tampered region in the first image that is not tampered with comprises:
acquiring a second image obtained by carrying out image enhancement on the first image;
Inputting the first image and the second image into a deep learning double-flow detection model to obtain an output result of the deep learning double-flow detection model, wherein the output result comprises the confidence of a candidate tampered region in the first image;
and determining the candidate tampered region with the confidence level not smaller than a preset confidence level threshold as a tampered region in the first image.
5. The method of claim 1, wherein the first image is a single image, the method further comprising:
displaying two first images spliced with each other, wherein one first image is marked with the non-tampered area, and the other first image is not marked with the non-tampered area;
or,
the first image is each video frame in a video clip, the method further comprising:
and displaying two mutually spliced video playing windows, wherein the two video playing windows synchronously play the video clips, the non-tampered area is marked in a video frame played by one video playing window, and the non-tampered area is not marked in a video frame played by the other video playing window.
6. The method of claim 1, wherein the presenting the first image labeled with the non-tampered region comprises:
Performing image processing on the tampered region in the first image to obtain a third image; wherein the image processing includes: selecting the tampered area, mosaic processing, painting processing and binarization processing;
the third image is shown.
7. An image recognition system, the system comprising:
a deep learning image authenticity processing unit, configured to determine a non-tampered area that is not tampered in the first image;
a business processing unit, configured to display the first image marked with the non-tampered area; determining a region to be identified selected by the region selection operation in the first image in response to the region selection operation for the first image; determining alternative retrieval attributes of the objects existing in the area to be identified in preset retrieval attributes; displaying an attribute selection interface containing the alternative retrieval attribute; responding to an attribute selection operation aiming at the attribute selection interface, and identifying an alternative retrieval attribute selected by the attribute selection operation as a target retrieval attribute; acquiring an attribute value of the target retrieval attribute of the region to be identified as a target attribute value; searching for an object, of which the attribute value of the target retrieval attribute is matched with the target attribute value, from a database in which each object and the attribute value of the preset retrieval attribute of each object are stored in advance, and taking the object as a recognition result.
8. The system of claim 7, wherein the system further comprises:
a data receiving unit configured to receive the first image;
the image processing unit is used for carrying out image enhancement on the first image to obtain a second image;
the deep learning image authenticity processing unit is specifically used for acquiring the second image; inputting the first image and the second image into a deep learning double-flow detection model to obtain an output result of the deep learning double-flow detection model, wherein the output result comprises the confidence level of a candidate tampered region in the first image, and the output result is sent to an alarm unit;
the alarm unit is used for responding to the output result, determining the candidate tampered area with the confidence coefficient not smaller than a preset confidence coefficient threshold value in the output result as a tampered area in the first image, and alarming the tampered area;
a storage management unit for storing data;
and the configuration management unit is used for configuring the image processing unit, the deep learning image authenticity processing unit, the alarm unit, the service processing unit and the storage management unit.
9. The system of claim 8, wherein the system further comprises a controller configured to control the controller,
the service processing unit is further configured to display the first image labeled with the non-tampered region in response to an identification operation for the alarm unit.
10. An electronic device, comprising:
a memory for storing a computer program;
a processor for implementing the method of any of claims 1-6 when executing a program stored on a memory.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311714687.0A CN117407562B (en) | 2023-12-13 | 2023-12-13 | Image recognition method, system and electronic equipment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN117407562A true CN117407562A (en) | 2024-01-16 |
CN117407562B CN117407562B (en) | 2024-04-05 |
Family
ID=89498320
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311714687.0A Active CN117407562B (en) | 2023-12-13 | 2023-12-13 | Image recognition method, system and electronic equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117407562B (en) |
Also Published As
Publication number | Publication date |
---|---|
CN117407562B (en) | 2024-04-05 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||