CN111144241B

CN111144241B - Target identification method and device based on image verification and computer equipment

Info

Publication number: CN111144241B
Application number: CN201911278754.2A
Authority: CN
Inventors: 岑俊毅; 傅东生
Original assignee: Shenzhen Miracle Intelligent Network Co Ltd
Current assignee: Shenzhen Miracle Intelligent Network Co Ltd
Priority date: 2019-12-13
Filing date: 2019-12-13
Publication date: 2023-06-20
Anticipated expiration: 2039-12-13
Also published as: CN111144241A

Abstract

The application relates to an image verification-based target identification method, an image verification-based target identification device and computer equipment. The method comprises the following steps: acquiring video data, and extracting multi-frame sample images from the video data; comparing the sample images to obtain a plurality of sample similarities; when the sample similarity is greater than a sample threshold, determining the sample image as a reference image; checking an image to be identified in the video data according to the reference image; and when the image similarity between the image to be identified and the reference image is smaller than an image threshold value, carrying out target identification on the image to be identified to obtain a target area corresponding to the image to be identified. With this method, repeated target recognition of highly similar images to be identified in the video data can be avoided, the number of images recognized unnecessarily is reduced, and the resources consumed for recognizing images are effectively saved.

Description

Target identification method and device based on image verification and computer equipment

Technical Field

The present disclosure relates to the field of computer technologies, and in particular, to a target identification method and apparatus based on image verification, a computer device, and a storage medium.

Background

With the development of computer technology, image recognition technology is widely applied to various application fields such as face recognition, automatic driving, safety monitoring and the like. The real-time image recognition can continuously and rapidly recognize the target object in the image from the acquired images such as pictures, videos and the like. For example, in the field of security monitoring, a plurality of frames of monitoring images may be continuously recognized, and a target object to be monitored may be recognized in each frame of monitoring image.

However, in real-time image recognition scenarios such as security monitoring, images often contain few target objects. For example, during off-peak periods or at night, some of the images contain few or no target objects. Repeatedly recognizing images that contain no target object is meaningless, and continuously recognizing every frame wastes resources such as computing resources.

Disclosure of Invention

In view of the above, it is necessary to provide a target identification method, apparatus, computer device, and storage medium based on image verification that can save resources, in order to solve the above-described problem of unnecessarily wasting resources such as computing resources.

An image verification-based target identification method, the method comprising:

acquiring video data, and extracting multi-frame sample images from the video data;

comparing the sample images to obtain a plurality of sample similarities;

when the sample similarity is greater than a sample threshold, determining the sample image as a reference image;

checking an image to be identified in the video data according to the reference image;

and when the image similarity between the image to be identified and the reference image is smaller than an image threshold value, carrying out target identification on the image to be identified to obtain a target area corresponding to the image to be identified.

In one embodiment, the comparing the sample images to obtain a plurality of sample similarities includes:

preprocessing the sample image;

acquiring gray values corresponding to a plurality of pixel points in the processed sample image;

determining characteristic information corresponding to the sample image according to the gray value;

and comparing the characteristic information corresponding to the sample images to obtain the sample similarity between the sample images.

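In one embodiment, the comparing the sample images to obtain a plurality of sample similarities includes:

acquiring a preset frame image in a plurality of frames of sample images;

comparing the sample image with the preset frame image to obtain a plurality of sample similarities;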

the determining the sample image as a reference image when the sample similarity is greater than a sample threshold comprises:

and when any one of the plurality of sample similarities is larger than the sample threshold, determining the preset frame image as a reference image.

In one embodiment, after the comparing the sample images to obtain a plurality of sample similarities, the method further includes:

when the sample similarity smaller than or equal to the sample threshold exists in the plurality of sample similarities, counting the sample image quantity corresponding to the sample similarity smaller than or equal to the sample threshold;

acquiring a corresponding preset time period according to the sample image quantity;

and carrying out target recognition on the images to be recognized, which belong to the preset time period, in the video data.

In one embodiment, when there is a sample similarity less than or equal to the sample threshold among the plurality of sample similarities, the method further comprises:

the sample image with the sample similarity smaller than or equal to a sample threshold value is recorded as an image to be identified;

and carrying out target recognition on the image to be recognized to obtain a target area corresponding to the image to be recognized.

In one embodiment, the verifying the image to be identified in the video data according to the reference image includes:

extracting an image after the sample image from the video data according to a preset sampling rate to serve as an image to be identified;

comparing the image to be identified with the reference image to obtain image similarity;

and when the image similarity is greater than or equal to the image threshold, repeating the step of extracting an image after the sample image from the video data according to the preset sampling rate to serve as an image to be identified.

In one embodiment, the method further comprises:

counting the number of images to be identified whose image similarity is greater than or equal to the image threshold;

adjusting the preset sampling rate according to the number of the images to be identified;

and extracting an image to be identified from the video data according to the adjusted sampling rate.

An image verification-based object recognition apparatus, the apparatus comprising:

the video acquisition module is used for acquiring video data and extracting multi-frame sample images from the video data;

the reference image determining module is used for comparing the sample images to obtain a plurality of sample similarities; when the sample similarity is greater than a sample threshold, determining the sample image as a reference image;

the image verification module is used for verifying the image to be identified in the video data according to the reference image;

and the image recognition module is used for carrying out target recognition on the image to be recognized when the image similarity between the image to be recognized and the reference image is smaller than an image threshold value, so as to obtain a target area corresponding to the image to be recognized.

A computer device comprising a memory storing a computer program and a processor implementing the steps of the above image verification based object recognition method when the processor executes the computer program.

A computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the above-described image verification-based target recognition method.

According to the target identification method, device, computer equipment and storage medium based on image verification, multi-frame sample images are extracted from the acquired video data, the multi-frame sample images are compared with each other to obtain a plurality of sample similarities, and the reference image is determined from the multi-frame sample images according to the sample similarities. The image to be identified in the video data is checked against the reference image, so that the images to be identified are screened. Target recognition is carried out on the images to be identified whose image similarity with the reference image is smaller than the image threshold. This avoids repeated target recognition of highly similar images to be identified in the video data, reduces unnecessary recognition of images to be identified, and effectively saves the resources consumed for recognizing images.

Drawings

FIG. 1 is an application environment diagram of an image verification-based target recognition method in one embodiment;

FIG. 2 is a flow chart of a target recognition method based on image verification in one embodiment;

FIG. 3 is a flow chart of a step of comparing sample images to obtain a plurality of sample similarities in an embodiment;

FIG. 4 is a block diagram of an object recognition device based on image verification in one embodiment;

FIG. 5 is an internal structural diagram of a computer device in one embodiment.

Detailed Description

In order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be further described in detail with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the present application.

The target identification method based on image verification can be applied to the application environment shown in FIG. 1. The application environment includes at least one monitoring device 102 and a server 104, and the monitoring device 102 may communicate with the server 104 over a network. The monitoring device 102 may be deployed in a variety of application environments to collect the corresponding video data. For example, the monitoring device 102 may be used to collect video including, but not limited to, road monitoring video, shop monitoring video, campus monitoring video, and indoor monitoring video. The server 104 may obtain the video data collected by the monitoring device 102. In one embodiment, the server 104 may also obtain video data from a video capture device, computer, server, or the like in other application scenarios. The server 104 extracts a plurality of sample images from the video data. The server 104 compares the sample images to obtain a plurality of sample similarities, and when the sample similarities are greater than a sample threshold, determines the sample image as a reference image. The server 104 checks the image to be identified in the video data according to the reference image, determines that the check fails when the image similarity between the image to be identified and the reference image is smaller than an image threshold, and performs target identification on the image to be identified to obtain a target area corresponding to the image to be identified. The monitoring device 102 may include, but is not limited to, various video capturing devices and image capturing devices, and the server 104 may be implemented as a stand-alone server or a server cluster composed of a plurality of servers.

In one embodiment, as shown in fig. 2, there is provided an image verification-based object recognition method, which is described by taking the application of the method to the server 104 in fig. 1 as an example, and includes the following steps:

Step 202, obtaining video data, and extracting multi-frame sample images from the video data.

The server can acquire video data. The video data comprises multi-frame image data, and the server can perform target identification on the multi-frame image data in the video data to obtain the target area where a target object to be identified is located in the image data. Specifically, the server may acquire complete video data sent by a computer or another server, may acquire pre-stored video data from a database, or may acquire video data captured by an acquisition device in real time. When the server acquires video data captured in real time, the video data can be transmitted as a video stream. The video data may be transmitted via a variety of transmission protocols. For example, the transmission protocol of the video data may include, but is not limited to, RTSP (Real Time Streaming Protocol), RTMP (Real Time Messaging Protocol), and the like. Video data refers to a sequence of consecutive images; it comprises consecutive frames of image data arranged in temporal order. A frame is the smallest visual unit of video data, and each frame corresponds to one image.

After the video data is acquired, the server may parse the video data to obtain multi-frame image data included in the video data. The server may extract the current multi-frame image data from the video data according to the order of the image data, and record the extracted multi-frame image data as the sample image. The number of sample images may be preset according to actual requirements.
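As a non-limiting illustration, the extraction of a preset number of sample images in frame order could be sketched as follows. The function name `take_sample_images` and the representation of the parsed video data as a Python iterator of frames are assumptions made for this sketch and do not come from the original text.

```python
from itertools import islice
from typing import Iterator, List, TypeVar

Frame = TypeVar("Frame")


def take_sample_images(frames: Iterator[Frame], sample_count: int) -> List[Frame]:
    """Take the next `sample_count` frames, in order, from a frame iterator.

    `frames` stands for the parsed video data (hypothetical representation).
    Frames that are not taken stay in the iterator, so they can later be
    treated as images to be identified.
    """
    if sample_count <= 0:
        raise ValueError("sample_count must be positive")
    return list(islice(frames, sample_count))


# Usage with placeholder frame labels instead of real image data.
stream = iter(["f0", "f1", "f2", "f3", "f4"])
samples = take_sample_images(stream, 3)   # the first three frames
remaining = list(stream)                  # frames located after the samples
```

Passing an iterator rather than a list keeps the position in the video, so the frames after the sample images remain available for the later verification step.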

Step 204, comparing the sample images to obtain a plurality of sample similarities.

The server can judge whether the image contents in the corresponding time periods of the multi-frame sample images are similar or not according to the extracted multi-frame sample images. Specifically, the server may compare the extracted multi-frame sample images with each other to obtain a sample similarity between the plurality of sample images. The server can compare the multi-frame sample images with each other in one of a plurality of combination modes according to actual demands. For example, the server may sort the multiple frames of sample images according to the time sequence of the sample images to obtain a sample image sequence, and the server may compare two adjacent sample images with each other according to the sample image sequence to obtain a plurality of sample similarities. The server can also determine a frame of sample image from the multi-frame sample images, and compare other multi-frame sample images with the determined frame of sample image respectively to obtain a plurality of sample similarities. Wherein, the one frame of sample image determined by the server may be the first frame of sample image in the multiple frames of sample images.
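The two combination modes described above (comparing adjacent sample images, or comparing every other sample image with one determined frame) could be sketched as follows. The helper names and the `similarity` callable are hypothetical; any function returning a similarity score for two frames could be plugged in.

```python
from typing import Callable, List, Sequence, TypeVar

Frame = TypeVar("Frame")


def adjacent_similarities(
    samples: Sequence[Frame], similarity: Callable[[Frame, Frame], float]
) -> List[float]:
    """Compare each sample image with the next one in time order."""
    return [similarity(a, b) for a, b in zip(samples, samples[1:])]


def first_frame_similarities(
    samples: Sequence[Frame], similarity: Callable[[Frame, Frame], float]
) -> List[float]:
    """Compare every other sample image with the first sample image."""
    if not samples:
        return []
    first = samples[0]
    return [similarity(first, other) for other in samples[1:]]


def toy_similarity(a: int, b: int) -> float:
    """Toy stand-in for a real image similarity (frames are plain integers)."""
    return 1.0 - abs(a - b) / 10


adjacent = adjacent_similarities([0, 1, 3], toy_similarity)
against_first = first_frame_similarities([0, 1, 3], toy_similarity)
```

For N sample images both modes produce N - 1 similarities. The adjacent mode is sensitive to sudden changes between consecutive frames, while the first-frame mode also captures slow drift away from the first frame.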

The sample similarity between sample images may be used to represent the degree of similarity between the image contents of two sample images. When the sample similarity between sample images is large, the image contents of the corresponding sample images are highly similar, the real scene changes little in the time period corresponding to the sample images, and image data with a high degree of similarity does not need to be recognized repeatedly. For example, in a road monitoring video, when no vehicles or pedestrians appear on the monitored road for a period of time, the corresponding image data is highly similar. When the sample similarity between sample images is small, the image contents of the corresponding sample images are less similar, the real scene changes considerably in the time period corresponding to the sample images, and the changed image data needs to be recognized promptly. In this way, image recognition remains fast and accurate while unnecessary consumption of resources, such as the computing resources of the server, is avoided, which saves the resource cost of image recognition.

Step 206, when the sample similarity is greater than the sample threshold, determining the sample image as the reference image.

The server may compare the sample similarity between the sample images with a sample threshold. The sample threshold may be preset by the user according to actual requirements, and may be a constant. For example, the sample threshold may be set to 95%. The server may determine the sample image as the reference image when the sample similarity between the sample images is greater than the sample threshold. Specifically, the server may extract multiple sample images, and multiple sample similarities may be obtained after the multiple sample images are compared with each other. The server may compare each of the plurality of sample similarities with the sample threshold. When every one of the plurality of sample similarities is greater than the sample threshold, the sample images are highly similar to one another and the environment content corresponding to the sample images has not changed significantly, so the server can record the sample image as the reference image. The reference image can be compared with the images to be identified to judge whether the environment content corresponding to an image to be identified has changed significantly, so as to screen the images to be identified.

In one embodiment, when any one of the plurality of sample similarities is less than or equal to the sample threshold, at least one frame of sample image is significantly different from the other sample images. The server may then perform target recognition on the extracted multi-frame sample images and, after doing so, re-extract sample images from the video data.

In one embodiment, the server may obtain a preset frame image from the extracted multi-frame sample images when comparing the sample images. The preset frame image may be the sample image at a preset position in an image sequence formed by arranging the multi-frame sample images in time order. The position of the preset frame image in the image sequence may be preset according to actual requirements. For example, the preset frame image may be the first frame sample image in the image sequence, or may be the last frame sample image in the image sequence. The server can call a similarity model and use it to compare each sample image other than the preset frame image with the preset frame image one by one, obtaining the sample similarities between the plurality of sample images and the preset frame image. The similarity model may be pre-trained and configured in the server. The server may compare the plurality of sample similarities with the sample threshold in turn. The server may also compare each sample similarity with the sample threshold as soon as it is obtained and, when the sample similarity is greater than the sample threshold, continue to compute the sample similarity between the next sample image and the preset frame image, until all sample images other than the preset frame image have been compared with the preset frame image. The server may determine the preset frame image as the reference image when every one of the plurality of sample similarities is greater than the sample threshold.
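The incremental comparison against a preset frame image could be sketched as follows. The sketch assumes that the preset frame becomes the reference image only when all sample similarities exceed the sample threshold; `choose_reference`, `preset_index`, and the `similarity` callable are illustrative names, and 0.95 mirrors the 95% example threshold mentioned earlier in the description.

```python
from typing import Callable, Optional, Sequence, TypeVar

Frame = TypeVar("Frame")


def choose_reference(
    samples: Sequence[Frame],
    similarity: Callable[[Frame, Frame], float],
    sample_threshold: float = 0.95,
    preset_index: int = 0,
) -> Optional[Frame]:
    """Return the preset frame as reference image, or None if the samples differ.

    Each similarity is checked as soon as it is computed, so the loop stops
    at the first sample whose similarity is <= sample_threshold.
    """
    preset = samples[preset_index]
    for index, sample in enumerate(samples):
        if index == preset_index:
            continue  # the preset frame is not compared with itself
        if similarity(sample, preset) <= sample_threshold:
            return None  # scene changed within the samples: no reference image
    return preset


def toy_similarity(a: int, b: int) -> float:
    """Toy stand-in for a real image similarity (frames are plain integers)."""
    return 1.0 - abs(a - b) / 100


stable = choose_reference([50, 51, 52], toy_similarity)     # similar samples
changed = choose_reference([50, 51, 90], toy_similarity)    # one sample differs
```

When no reference image is returned, the caller would fall back to recognizing the sample images themselves and extracting a new set of samples.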

Step 208, checking the image to be identified in the video data according to the reference image.

After the server determines the sample image as the reference image, the server may verify the image to be identified in the video data according to the reference image, thereby filtering the image to be identified in the video data. Wherein the image to be identified is image data extracted from the video data by the server and located after the sample image. It will be appreciated that the server may extract image data from video data sequentially in chronological order of the image data. The server may record the extracted image data as a sample image or an image to be recognized, respectively, according to the difference in the extraction time. For example, when the server acquires video data and needs to determine a reference image to verify the image data, the server may record the extracted image data as a sample image and determine the reference image from a plurality of sample images. After determining the reference image, the server may record the extracted image data as an image to be identified, and verify the image to be identified according to the reference image.

Specifically, the server may sequentially extract image data after the sample images from the video data as images to be identified, and may compare the images to be identified with the reference image one by one, so as to verify the images to be identified. When an image to be identified is successfully verified against the reference image, the server repeats the verification with the next frame of image to be identified, until an image to be identified fails verification against the reference image. By comparing the image to be identified with the reference image, the server obtains the image similarity between the image to be identified and the reference image. As with the comparison of sample images, the server may call a similarity model and use it to determine the image similarity between the image to be identified and the reference image.

The server may compare the image similarity between the image to be identified and the reference image with an image threshold. The image threshold may be preset according to actual requirements. In one embodiment, the image threshold may be the same as the sample threshold. When the image similarity is greater than or equal to the image threshold, the server may determine that the image to be identified is successfully compared with the reference image. When the image similarity is smaller than the image threshold, the server may determine that the image to be identified fails to be compared with the reference image.
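The frame-by-frame verification could be sketched as follows: images are compared with the reference image in order until the first one fails verification. The name `first_failed_image` and the `similarity` callable are assumptions of this sketch.

```python
from typing import Callable, Iterable, Optional, Tuple, TypeVar

Frame = TypeVar("Frame")


def first_failed_image(
    candidates: Iterable[Frame],
    reference: Frame,
    similarity: Callable[[Frame, Frame], float],
    image_threshold: float = 0.95,
) -> Optional[Tuple[int, Frame]]:
    """Return (index, image) of the first image that fails verification.

    An image passes when its similarity to the reference is greater than or
    equal to image_threshold, and fails when it is smaller. Returns None when
    every candidate passes.
    """
    for index, image in enumerate(candidates):
        if similarity(image, reference) < image_threshold:
            return index, image
    return None


def toy_similarity(a: int, b: int) -> float:
    """Toy stand-in for a real image similarity (frames are plain integers)."""
    return 1.0 - abs(a - b) / 100


failed = first_failed_image([50, 52, 49, 80, 50], 50, toy_similarity)
```

Note the boundary case: a similarity exactly equal to the image threshold counts as a successful verification, matching the "greater than or equal to" condition in the text.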

Step 210, performing target recognition on the image to be recognized when the image similarity between the image to be recognized and the reference image is smaller than the image threshold value, and obtaining a target area corresponding to the image to be recognized.

When the image similarity between the image to be identified and the reference image is smaller than the image threshold, the image content of the image to be identified has changed noticeably compared with the reference image, the image to be identified is not image data similar to the reference image, and the server can determine that the image to be identified fails verification against the reference image. When the image similarity between the image to be identified and the reference image is greater than or equal to the image threshold, the server determines that the image to be identified is successfully verified against the reference image. The server can perform target recognition on the images to be identified that fail verification, so that the image data in the video data is filtered and the target area corresponding to the target object in the image to be identified is obtained.

Specifically, the server may invoke the image recognition model to perform object recognition on the screened image to be recognized. The image recognition model may be pre-established and trained, and the image recognition model may include at least one of a plurality of image recognition algorithms. The server can input the image to be identified into the image identification model, and the image identification model is utilized to operate the filtered image to be identified, so that a target area corresponding to the image to be identified output by the image identification model is obtained.

In one embodiment, after the images to be identified with failed verification are screened, the server may clear the reference image, and repeat the step of extracting the multi-frame sample image from the video data, thereby redetermining the reference image, and effectively improving the accuracy of image verification.

In this embodiment, the server extracts multi-frame sample images from the acquired video data, compares the sample images, and determines the sample image as the reference image when the sample similarities between the multi-frame sample images are all greater than the sample threshold. The server may verify the images to be identified in the video data against the reference image. Because the image data in the video data is verified before image recognition, the images to be identified are screened, and repeated target recognition of highly similar images to be identified in the video data is avoided. When the image similarity between the image to be identified and the reference image is smaller than the image threshold, the server performs target recognition on the screened image to be identified to obtain the target area corresponding to the image to be identified, so that the number of images recognized unnecessarily is reduced and the resources consumed by image recognition are effectively saved.
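Steps 202 to 210 could be combined into a single filtering loop along the following lines. This is only a sketch under stated assumptions: the first sample image serves as the preset frame, the reference image is cleared after each failed verification, dissimilar sample images are all forwarded to recognition, and `frames_to_recognize` together with its parameters are hypothetical names. The generator yields only the frames that would be passed to the image recognition model.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, TypeVar

Frame = TypeVar("Frame")


def frames_to_recognize(
    frames: Iterable[Frame],
    similarity: Callable[[Frame, Frame], float],
    sample_count: int = 3,
    sample_threshold: float = 0.95,
    image_threshold: float = 0.95,
) -> Iterator[Frame]:
    """Yield only the frames that need target recognition."""
    it = iter(frames)
    while True:
        samples = list(islice(it, sample_count))       # step 202
        if not samples:
            return                                     # video data exhausted
        preset = samples[0]
        # Steps 204/206: compare the other samples with the preset frame.
        if any(similarity(s, preset) <= sample_threshold for s in samples[1:]):
            yield from samples                         # samples differ: recognize them
            continue                                   # then re-extract samples
        reference = preset
        for image in it:                               # step 208
            if similarity(image, reference) < image_threshold:
                yield image                            # step 210: verification failed
                break                                  # clear reference, resample


def toy_similarity(a: int, b: int) -> float:
    """Toy stand-in for a real image similarity (frames are plain integers)."""
    return 1.0 - abs(a - b) / 100


selected = list(frames_to_recognize([50, 50, 51, 50, 90, 20, 60, 61], toy_similarity))
```

In the toy run, frame 90 is selected because it fails verification against the reference 50, and the following sample set (20, 60, 61) is selected as a whole because its members differ from one another.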

In one embodiment, as shown in fig. 3, the step of comparing the sample images to obtain a plurality of sample similarities includes:

Step 302, preprocessing the sample image.

Step 304, gray values corresponding to a plurality of pixel points in the processed sample image are obtained.

Step 306, determining the characteristic information corresponding to the sample image according to the gray value.

Step 308, comparing the characteristic information corresponding to the sample images to obtain the sample similarity between the sample images.

The server can call a similarity model and use it to compare the extracted sample images, obtaining the sample similarity between the sample images output by the similarity model. The similarity model can be pre-established and trained, and can be pre-configured in the server for the server to call. The similarity model includes a similarity function and may include at least one of a plurality of similarity algorithms. For example, the server may employ a difference hash algorithm (abbreviated as DHA), a mean hash algorithm (average hash algorithm, abbreviated as AHA), a perceptual hash algorithm (abbreviated as PHA), or the SIFT (Scale-Invariant Feature Transform) algorithm. The server can process two sample images through the similarity function to calculate the sample similarity between the two sample images.

Specifically, the server may preprocess the two sample images being compared, where the preprocessing includes at least one of a plurality of processing modes. For example, the preprocessing may include, but is not limited to, scaling and graying. The server may scale the extracted sample image into image data of a preset size. The scaled size of the sample image may be preset according to actual requirements. For example, according to different actual requirements, the server can reduce the sample image into image data with a size of 32 pixels or 72 pixels, so as to avoid differences between sample images caused by different sizes or different proportions.
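A minimal preprocessing sketch, assuming an image is given as rows of (R, G, B) tuples: it scales the image to a preset size and converts each pixel to a gray value. Nearest-neighbour sampling and the ITU-R BT.601 luma weights (0.299, 0.587, 0.114) are choices made for this sketch; the original text does not specify the scaling or graying formulas.

```python
from typing import List, Sequence, Tuple

RGB = Tuple[int, int, int]


def to_gray(pixel: RGB) -> int:
    """Convert one RGB pixel to a gray value (BT.601 weights, integer math)."""
    r, g, b = pixel
    return (299 * r + 587 * g + 114 * b) // 1000


def preprocess(image: Sequence[Sequence[RGB]], width: int, height: int) -> List[List[int]]:
    """Scale `image` to width x height (nearest neighbour) and gray it."""
    src_h, src_w = len(image), len(image[0])
    return [
        [to_gray(image[y * src_h // height][x * src_w // width]) for x in range(width)]
        for y in range(height)
    ]


# Usage: a 4x4 toy image whose pixel at (row y, column x) has the value 10*y + x
# in all three channels, reduced to 2x2.
toy_image = [[(10 * y + x,) * 3 for x in range(4)] for y in range(4)]
small = preprocess(toy_image, width=2, height=2)
```

Scaling every image to the same preset size means the hash values computed afterwards have the same length and can be compared position by position.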

The server can perform graying on the scaled sample image to convert it into a grayscale image, which reduces the amount of computation on the sample image. The server may obtain the gray value corresponding to each pixel in the processed sample image; after graying, the values of the three color channels of each pixel are equal. The server can determine the characteristic information corresponding to the sample image according to the gray value of each pixel. Specifically, the pixels in the sample image are arranged in a rectangular grid, and the server can traverse the gray values of each row of pixels in turn. Following the arrangement order of the pixels, the server can compare the gray values of adjacent pixels in each row and determine whether the gray value of the previous pixel is greater than or equal to the gray value of the next pixel. The server may mark each comparison result with "0" or "1". When the gray value of the previous pixel is greater than or equal to the gray value of the next pixel, the server may record the comparison result as "1". When the gray value of the previous pixel is smaller than the gray value of the next pixel, the server may record the comparison result as "0". After traversing the pixels of every row, the server obtains a hash value consisting of 0s and 1s, and can record the hash value of the sample image as the characteristic information corresponding to the sample image.
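The row-wise comparison described above could be sketched as follows, assuming the grayed sample image is given as a list of rows of gray values. The function name `difference_hash` and the string representation of the hash value are illustrative choices.

```python
from typing import List, Sequence


def difference_hash(gray: Sequence[Sequence[int]]) -> str:
    """Build a hash string from adjacent-pixel comparisons in each row.

    For each pair of horizontally adjacent pixels, record "1" when the
    previous gray value is greater than or equal to the next one, else "0".
    """
    bits: List[str] = []
    for row in gray:
        for previous, following in zip(row, row[1:]):
            bits.append("1" if previous >= following else "0")
    return "".join(bits)


# Usage: two rows of three pixels give 2 * (3 - 1) = 4 bits.
example_hash = difference_hash([[10, 20, 20], [30, 5, 40]])
```

An image of width W and height H yields H * (W - 1) bits, so images scaled to the same preset size always produce hash values of the same length.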

The server can compare the characteristic information corresponding to two frames of sample images to obtain the sample similarity between them. The server can compare the hash values corresponding to the two frames of sample images bit by bit to determine whether they are the same. For example, the server may calculate the Hamming distance between the sample images from their hash values. The Hamming distance is the number of positions at which two strings of the same length differ. The server determines the sample similarity between the two frames of sample images according to the Hamming distance between them.
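The text does not state how the Hamming distance is converted into a similarity. One common choice, assumed here purely for illustration, is the fraction of matching bits, similarity = 1 - distance / length. The function names are hypothetical.

```python
def hamming_distance(hash_a: str, hash_b: str) -> int:
    """Count the positions at which two equal-length hash strings differ."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hash values must have the same length")
    return sum(bit_a != bit_b for bit_a, bit_b in zip(hash_a, hash_b))


def hash_similarity(hash_a: str, hash_b: str) -> float:
    """Map the Hamming distance to a similarity in [0, 1] (assumed mapping)."""
    if not hash_a:
        raise ValueError("hash values must not be empty")
    return 1.0 - hamming_distance(hash_a, hash_b) / len(hash_a)


# Usage: the two hash values differ in one of four positions.
distance = hamming_distance("0110", "0111")
score = hash_similarity("0110", "0111")
```

With this mapping, identical hash values give a similarity of 1.0, and a threshold such as 95% corresponds to allowing at most 5% of the bits to differ.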

In this embodiment, the server may perform preprocessing on the sample image, and determine feature information corresponding to the sample image according to the gray value corresponding to the pixel point in the processed sample image. And the server compares the characteristic information corresponding to the sample images to obtain the sample similarity between the sample images. The server can determine whether the extracted multi-frame sample images are similar according to the sample similarity, if so, the server can skip the identification of similar image data, so that the repeated target identification of the image data with larger image similarity in the video data is avoided, unnecessary image identification is reduced, and resources consumed for identifying the images are effectively saved.

In one embodiment, after the step of comparing the sample images to obtain a plurality of sample similarities, the image verification-based target identification method further includes: when the sample similarity smaller than or equal to the sample threshold exists in the plurality of sample similarities, counting the sample image quantity corresponding to the sample similarity smaller than or equal to the sample threshold; acquiring a corresponding preset time period according to the sample image quantity; and carrying out target recognition on the images to be recognized, which belong to the preset time period, in the video data.

After comparing the sample images to obtain the corresponding sample similarities, the server can compare each sample similarity with the sample threshold to judge whether it is greater than the sample threshold. The server can compare all the sample images with each other first and then compare each of the resulting sample similarities with the sample threshold. Alternatively, the server may compare each sample similarity with the sample threshold as soon as it is obtained, and continue to compare the next sample image only when the sample similarity is greater than the sample threshold, which further saves the computing resources of the server.

In one embodiment, when the sample similarity is less than or equal to the sample threshold, the sample images corresponding to that sample similarity differ considerably, which indicates that the target object has changed between them. Target recognition therefore needs to be performed on the sample images whose sample similarity is less than or equal to the sample threshold. The server can record the corresponding sample image as an image to be identified and perform target recognition on it to obtain the target area corresponding to the target object in the image. In this way, image data with dissimilar content is not missed, so server resources are saved while the accuracy and effectiveness of identifying target objects in images are maintained.

When a sample similarity less than or equal to the sample threshold exists among the plurality of sample similarities corresponding to the multi-frame sample images, the server can count those sample similarities to obtain the corresponding sample image amount. The sample image amount represents the number of sample images whose image content is dissimilar, that is, the number of sample images whose sample similarity is less than or equal to the sample threshold. The sample image amount may be reset when the server re-extracts sample images.

The server may acquire a corresponding preset time period according to the sample image amount. The preset time period is a time length preset by a user according to actual requirements. An association is preset between the time length and the sample image amount, and different sample image amounts can correspond to time periods of different lengths. The sample image amount may correspond to discrete preset time periods or to a continuous range of preset time periods. The preset time period may increase as the sample image amount increases. The server may acquire the preset time period associated with the sample image amount according to the preset association. When a sample similarity less than or equal to the sample threshold exists among the multi-frame sample images extracted from the video data, this indicates that the target object has changed in the scene corresponding to the video data, and the differing image data need to be recognized.
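A discrete association between the sample image amount and the preset time period could look like the following minimal sketch. The table values, the unit (seconds), and the names `PERIOD_TABLE`, `count_dissimilar`, and `preset_period` are assumptions made for illustration; the text only states that the association is preset by the user and that the period may grow with the amount.

```python
from bisect import bisect_right
from typing import Iterable

# Hypothetical association table: (minimum sample image amount, period in seconds),
# sorted by the minimum amount so that larger amounts map to longer periods.
PERIOD_TABLE = [(1, 5.0), (3, 15.0), (5, 30.0)]


def count_dissimilar(similarities: Iterable[float], sample_threshold: float) -> int:
    """Sample image amount: how many similarities are <= the sample threshold."""
    return sum(1 for s in similarities if s <= sample_threshold)


def preset_period(sample_image_amount: int) -> float:
    """Period of the last table row whose minimum is <= the amount; 0.0 if none applies."""
    minimums = [minimum for minimum, _ in PERIOD_TABLE]
    index = bisect_right(minimums, sample_image_amount)
    return PERIOD_TABLE[index - 1][1] if index > 0 else 0.0
```

A continuous association could instead be expressed as a function of the amount, for example a period proportional to it.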

According to the acquired preset time period, the server can record the image data in the video data that falls within the preset time period as images to be identified, and directly perform target recognition on them. Starting from the last extracted sample image, the server can record the image data within the corresponding time length as images to be identified, in the time order of the image data. The server performs target recognition on these images directly, without checking them against a reference image. Only after all of these images have been recognized does the server again extract multi-frame sample images in sequence for comparison. Image verification is therefore not performed while the image content is changing greatly, which further saves resource cost.
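Selecting the images that fall within the preset time period can be sketched as below. The representation of the video as `(timestamp, frame)` pairs, the function name `frames_in_period`, and the choice of an interval that excludes the last sample image itself and includes the end of the period are assumptions, not details given in the text.

```python
from typing import Iterable, List, Tuple, TypeVar

Frame = TypeVar("Frame")


def frames_in_period(
    timestamped_frames: Iterable[Tuple[float, Frame]],
    last_sample_time: float,
    period_seconds: float,
) -> List[Frame]:
    """Frames after the last sample image and within the preset period, in input order."""
    end = last_sample_time + period_seconds
    return [
        frame
        for timestamp, frame in timestamped_frames
        if last_sample_time < timestamp <= end
    ]
```

Every frame returned here would go straight to target recognition, with no similarity check against a reference image.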

For example, in a road monitoring video, the target object may be a vehicle, a pedestrian, or the like. At night, when there are few vehicles and pedestrians, multiple frames of the monitoring video may be identical or similar, and the server need not perform unnecessary image recognition on the repeated image data. When vehicles or pedestrians appear in the monitored area, the similarity between the corresponding image data is low, and the server can perform target recognition on the image data with low similarity, so that the target area where the target object is located can be accurately identified. When many of the multi-frame sample images have low similarity, this can indicate that moving vehicles or pedestrians are continuously present on the monitored road, for example when the road is in its peak period. In this case the images need to be recognized continuously, and the server may stop performing unnecessary similarity checks on the image data. The fewer highly similar images there are among the sample images, the more active the target objects on the monitored road are, so the server can suspend checking of the image data for a longer time, thereby further saving server resources.

In this embodiment, when a sample similarity smaller than or equal to a sample threshold exists in the plurality of sample similarities, the server may count a sample image amount corresponding to the sample similarity smaller than or equal to the sample threshold, obtain a corresponding preset time period according to the sample image amount, and directly perform object recognition on an image to be recognized in the preset time period, so as to avoid performing unnecessary image verification on image data with smaller similarity, thereby further saving resources of the server.

In one embodiment, the step of verifying the image to be identified in the video data from the reference image further comprises: extracting an image after a sample image from video data according to a preset sampling rate to serve as an image to be identified; comparing the image to be identified with a reference image to obtain image similarity; and when the image similarity is greater than or equal to the image threshold, repeating the step of extracting the image after the sample image is extracted from the video data according to the preset sampling rate as the image to be identified.

After determining the reference image, the server may sequentially extract image data after the sample image from the video data according to a preset sampling rate as the image to be recognized. The preset sampling rate is preset by a user according to actual requirements and is used for extracting image data. The sampling rate at which the server extracts the image to be identified may be the same as the sampling rate at which the sample image is extracted. In one embodiment, the sampling rate at which the server extracts the image data may be consistent with the frame rate corresponding to the video data.

The server can sequentially compare each extracted image to be identified with the reference image to obtain the image similarity between them. The manner in which the server compares the image to be identified with the reference image may be the same as the manner in which it compares the sample images in the above embodiment, so the description is not repeated here. The server can compare the image similarity with an image threshold to judge whether the image to be identified is similar to the reference image. When the image similarity is greater than or equal to the image threshold, the image verification is determined to be successful: the image to be identified is similar to the reference image, and similar image data need not be recognized again. The server can skip target recognition of that image, extract the next frame of image to be identified from the video data, and compare it with the reference image. In one embodiment, when the image similarity is less than the image threshold, the image verification is determined to have failed: the image to be identified is dissimilar to the reference image, a moving target object may have appeared in it, and the server needs to perform target recognition on the differing image. The server can perform target recognition on the image that failed verification, clear the reference image, and re-extract sample images from the video data.
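The verification loop described above can be sketched as follows. The similarity measure is passed in as a parameter because the text leaves it open; the function name `verify_frames`, the integer `step` used to model the sampling rate, and the return convention are assumptions made for this illustration.

```python
from typing import Callable, Iterable, Optional, Tuple, TypeVar

Frame = TypeVar("Frame")


def verify_frames(
    frames: Iterable[Frame],
    reference: Frame,
    image_threshold: float,
    similarity: Callable[[Frame, Frame], float],
    step: int = 1,
) -> Tuple[int, Optional[Frame]]:
    """Check every `step`-th frame against the reference image.

    Returns (number of frames skipped as similar, first dissimilar frame or None).
    """
    skipped = 0
    for index, frame in enumerate(frames):
        if index % step != 0:
            continue  # frame is not sampled at the current sampling rate
        if similarity(frame, reference) >= image_threshold:
            skipped += 1  # verification succeeded: skip target recognition
        else:
            return skipped, frame  # verification failed: this frame needs recognition
    return skipped, None
```

When a frame is returned, the caller would run target recognition on it, clear the reference image, and go back to extracting sample images.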

In this embodiment, after determining the reference image, the server may record the image data after the sample images in the video data as images to be identified, and verify their similarity by comparing each image to be identified with the reference image. When the image similarity is greater than or equal to the image threshold, the image to be identified is similar to the reference image, and similar image data need not be recognized again. The server can skip target recognition of that image and repeat the extraction of the next frame of image to be identified and its comparison with the reference image. This prevents the server from performing unnecessary repeated recognition on similar image data and effectively saves resource costs such as computing resources and power consumption. Resources are then spent only on the recognition that is actually necessary, which improves the efficiency of image recognition.

In one embodiment, the above image verification-based target identification method further includes: counting the number of images to be identified, the similarity of which is greater than or equal to an image threshold; adjusting a preset sampling rate according to the number of images to be identified; and extracting the image to be identified from the video data according to the adjusted sampling rate.

When the image similarity between an image to be identified and the reference image is greater than or equal to the image threshold, the server may count the number of such images. This number represents how many of the extracted images to be identified have been skipped. When the number is small, the image content has changed greatly within a short time. When the number is large, the image content has not changed greatly over a longer period of time.

The server can adjust the preset sampling rate according to the number of the images to be identified. Specifically, the server may store in advance a correspondence between the number of images to be identified and the sampling rate adjustment amplitude, and the correspondence between the number of images to be identified and the sampling rate adjustment amplitude may be preset by the user according to actual requirements. The adjustment amplitude of sampling rates with different sizes can be set according to different numbers of images to be identified. The sample rate adjustment amplitude may be discrete. For example, the sample rate adjustment amplitude may include decreasing to one-half, one-fifth, and one-tenth of the original sample rate. The server may extract the image to be identified from the video data according to the adjusted sampling rate. It will be appreciated that the number of images to be identified is used to represent the number of images to be identified that are similar to the reference image, and that the server clears the reference image when the image similarity is less than the image threshold. Correspondingly, the server may reset the number of images to be identified. After the reference image is redetermined, the server may re-count the number of images to be identified whose image similarity is greater than or equal to the image threshold.

When the number of the images to be identified is large, the image content in a longer period of time is not changed greatly, and the scene corresponding to the images to be identified is stable. Therefore, the server can reduce the preset sampling rate, reduce the frequency of comparing the extracted image to be identified with the reference image, reduce the comparison operation of the image to be identified and the reference image, and further save server resources.
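A discrete correspondence between the number of skipped images and the sampling rate adjustment might be sketched like this. The count boundaries in `ADJUSTMENTS` and the function name `adjusted_sampling_rate` are hypothetical; only the divisors (one-half, one-fifth, one-tenth) come from the example in the text.

```python
# Hypothetical correspondence: (minimum number of skipped images, rate divisor),
# ordered from the largest minimum down so the strongest reduction is tried first.
ADJUSTMENTS = [(100, 10), (50, 5), (20, 2)]


def adjusted_sampling_rate(base_rate: float, skipped_count: int) -> float:
    """Reduce the sampling rate once enough consecutive images matched the reference."""
    for minimum, divisor in ADJUSTMENTS:
        if skipped_count >= minimum:
            return base_rate / divisor
    return base_rate  # scene not yet stable enough: keep the preset rate
```

Because the count is reset whenever the reference image is cleared, the rate would return to its preset value as soon as the scene changes.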

In this embodiment, the server may count the number of images to be identified whose image similarity is greater than or equal to the image threshold, adjust the preset sampling rate according to the number of images to be identified, and extract the images to be identified from the video data according to the adjusted sampling rate, so as to reduce the number of times of comparing the images to be identified with the reference image when the scene corresponding to the images to be identified is relatively stable, and further save server resources.

It should be understood that, although the steps in the flowcharts of fig. 2-3 are shown in sequence as indicated by the arrows, these steps are not necessarily performed in that sequence. Unless explicitly stated herein, the steps are not strictly limited to that order of execution and may be executed in other orders. Moreover, at least some of the steps in fig. 2-3 may include multiple sub-steps or stages. These sub-steps or stages are not necessarily performed at the same time but may be performed at different times, and they are not necessarily performed sequentially but may be performed in turn or alternately with other steps, or with at least a portion of the sub-steps or stages of other steps.

In one embodiment, as shown in fig. 4, there is provided an image verification-based object recognition apparatus, including: a video acquisition module 402, a reference image determination module 404, an image verification module 406, and an image recognition module 408, wherein:

The video acquisition module 402 is configured to acquire video data and extract multi-frame sample images from the video data.

The reference image determining module 404 is configured to compare the sample images to obtain a plurality of sample similarities; when the sample similarity is greater than the sample threshold, the sample image is determined as the reference image.

The image verification module 406 is configured to verify an image to be identified in the video data according to the reference image.

The image recognition module 408 is configured to perform target recognition on the image to be recognized when the image similarity between the image to be recognized and the reference image is smaller than the image threshold value, so as to obtain a target area corresponding to the image to be recognized.

In one embodiment, the reference image determining module 404 is further configured to pre-process the sample image; acquiring gray values corresponding to a plurality of pixel points in the processed sample image; determining characteristic information corresponding to the sample image according to the gray value; and comparing the characteristic information corresponding to the sample images to obtain the sample similarity between the sample images.

In one embodiment, the reference image determining module 404 is further configured to obtain a preset frame image from the multiple frame sample images; comparing the sample image with a preset frame image to obtain a plurality of sample similarities; when any one of the plurality of sample similarities is greater than the sample threshold, a preset frame image is determined as a reference image.
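The preset-frame variant of reference image determination can be sketched as follows. Using the first sample as the preset frame by default, the function name `pick_reference`, and returning `None` when no reference is found are assumptions for illustration; the similarity measure is again supplied by the caller.

```python
from typing import Callable, Optional, Sequence, TypeVar

Frame = TypeVar("Frame")


def pick_reference(
    samples: Sequence[Frame],
    sample_threshold: float,
    similarity: Callable[[Frame, Frame], float],
    preset_index: int = 0,
) -> Optional[Frame]:
    """Return the preset frame as the reference image if any other sample is similar to it."""
    preset = samples[preset_index]
    for index, sample in enumerate(samples):
        if index == preset_index:
            continue  # do not compare the preset frame with itself
        if similarity(sample, preset) > sample_threshold:
            return preset  # one similarity above the threshold is enough
    return None  # no sample is similar enough: no reference image is determined
```

Returning on the first match means the remaining samples are never compared, which keeps the number of comparisons small.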

In one embodiment, the image recognition module 408 is further configured to, when there is a sample similarity less than or equal to the sample threshold value among the plurality of sample similarities, count a sample image amount corresponding to the sample similarity less than or equal to the sample threshold value; acquiring a corresponding preset time period according to the sample image quantity; and carrying out target recognition on the images to be recognized, which belong to the preset time period, in the video data.

In one embodiment, the image recognition module 408 is further configured to record, as the image to be recognized, a sample image having a sample similarity less than or equal to a sample threshold; and carrying out target recognition on the image to be recognized to obtain a target area corresponding to the image to be recognized.

In one embodiment, the image verification module 406 is further configured to extract, as the image to be identified, an image after the sample image is extracted from the video data according to a preset sampling rate; comparing the image to be identified with a reference image to obtain image similarity; and when the image similarity is greater than or equal to the image threshold, repeating the step of extracting the image after the sample image is extracted from the video data according to the preset sampling rate as the image to be identified.

In one embodiment, the image verification module 406 is further configured to count the number of images to be identified with an image similarity greater than or equal to the image threshold; adjusting a preset sampling rate according to the number of images to be identified; and extracting the image to be identified from the video data according to the adjusted sampling rate.

For specific limitations on the image verification-based object recognition apparatus, reference may be made to the above limitations on the image verification-based object recognition method, which are not repeated here. Each module in the above apparatus may be implemented in whole or in part by software, hardware, or a combination thereof. The modules may be embedded in, or independent of, a processor in the computer device in hardware form, or may be stored as software in a memory of the computer device, so that the processor can call them and execute the operations corresponding to each module.

In one embodiment, a computer device is provided, which may be a server, the internal structure of which may be as shown in fig. 5. The computer device includes a processor, a memory, a network interface, and a database connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, computer programs, and a database. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage media. The database of the computer device is used for storing target identification data based on image verification. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program, when executed by a processor, implements a target recognition method based on image verification.

It will be appreciated by those skilled in the art that the structure shown in fig. 5 is merely a block diagram of some of the structures associated with the present application and is not limiting of the computer device to which the present application may be applied, and that a particular computer device may include more or fewer components than shown, or may combine certain components, or have a different arrangement of components.

In one embodiment, a computer device is provided, comprising a memory and a processor, the memory having stored therein a computer program, the processor implementing the steps of the above-described image verification-based object recognition method embodiment when the computer program is executed.

In one embodiment, a computer readable storage medium is provided, on which a computer program is stored which, when executed by a processor, implements the steps of the above-described image verification-based object recognition method embodiment.

Those skilled in the art will appreciate that all or part of the above described methods may be implemented by a computer program stored on a non-transitory computer readable storage medium, and that the program, when executed, may comprise the steps of the method embodiments described above. Any reference to memory, storage, database, or other medium used in the various embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), and direct Rambus dynamic RAM (DRDRAM), among others.

The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, as long as a combination of technical features involves no contradiction, it should be considered to fall within the scope of this description.

The above examples represent only a few embodiments of the present application. Their description is relatively specific and detailed, but it is not to be construed as limiting the scope of the invention. It should be noted that those skilled in the art could make various modifications and improvements without departing from the spirit of the present application, and these would fall within the scope of protection of the present application. Accordingly, the scope of protection of the present application is to be determined by the appended claims.

Claims

1. An image verification-based target identification method, comprising the steps of:

comparing the sample images to obtain a plurality of sample similarities;

repeating the step of extracting the image after the sample image is extracted from the video data according to a preset sampling rate as an image to be identified when the image similarity is greater than or equal to an image threshold;

and when the image similarity between the image to be identified and the reference image is smaller than the image threshold, carrying out target identification on the image to be identified to obtain a target area corresponding to the image to be identified.

2. The method of claim 1, wherein comparing the sample images to obtain a plurality of sample similarities comprises:

preprocessing the sample image;

3. The method of claim 1, wherein comparing the sample images to obtain a plurality of sample similarities comprises:

acquiring a preset frame image in a plurality of frames of sample images;

4. The method of claim 1, wherein after said comparing the sample images to obtain a plurality of sample similarities, the method further comprises:

5. The method of claim 4, wherein when there is a sample similarity of the plurality of sample similarities that is less than a sample threshold, the method further comprises:

6. The method according to claim 1, wherein the method further comprises:

7. An image verification-based object recognition apparatus, the apparatus comprising:

the image verification module is used for verifying the image to be identified in the video data according to the reference image; and is further used for extracting the image after the sample image from the video data according to a preset sampling rate to serve as an image to be identified; comparing the image to be identified with the reference image to obtain image similarity; repeating the step of extracting the image after the sample image from the video data according to the preset sampling rate as an image to be identified when the image similarity is greater than or equal to an image threshold;

and the image recognition module is used for carrying out target recognition on the image to be recognized when the image similarity between the image to be recognized and the reference image is smaller than the image threshold value, so as to obtain a target area corresponding to the image to be recognized.

8. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method of any of claims 1 to 6 when the computer program is executed.

9. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the method of any of claims 1 to 6.