CN111160390B - Image recognition method and device

Info

Publication number: CN111160390B
Application number: CN201911212596.0A
Authority: CN
Inventors: 盖俸瑞
Original assignee: Unisound Intelligent Technology Co Ltd
Current assignee: Unisound Intelligent Technology Co Ltd
Priority date: 2019-12-02
Filing date: 2019-12-02
Publication date: 2023-06-20
Anticipated expiration: 2039-12-02
Also published as: CN111160390A

Abstract

The invention discloses an image recognition method and device. The method comprises: recognizing an image with a recognition engine to obtain image information; after recognition is finished, storing the image information and closing the recognition engine; acquiring image information captured by a camera in real time and comparing it with the stored image information; and determining whether to reactivate the recognition engine for secondary recognition. Because the image information is stored and the recognition engine is closed once recognition succeeds, and the engine is restarted only when the comparison between the real-time and stored image information calls for it, the invention solves the prior-art problem that the background keeps calling the engine after a successful recognition, which returns invalid data and increases the probability of false recognition. The user experience is improved accordingly.

Description

Image recognition method and device

Technical Field

The present invention relates to the field of data processing technologies, and in particular to an image recognition method and apparatus.

Background

Image scanning has become an indispensable part of daily life, for example when printing important files or scanning identity cards. Prior-art image scanning acquires image information in real time with a camera and transmits it to an engine for recognition, so as to obtain content related to the acquired image information. This approach has the following problem: after the camera has acquired the image information and the engine has recognized it successfully, the background keeps calling the engine to recognize again. Similar pages are recognized repeatedly and invalid data is returned, the probability of false recognition increases, and the user experience is very poor.

Disclosure of Invention

To solve the above problem, the method recognizes an image with a recognition engine to obtain image information, stores the image information after successful recognition and closes the recognition engine, and then determines whether to restart the recognition engine by comparing the image information acquired by the camera with the stored image information.

An image recognition method comprising the steps of:

recognizing the image by using a recognition engine to obtain image information;

after the recognition is finished, storing the image information and closing the recognition engine;

acquiring image information acquired by a camera in real time and comparing the acquired image information with stored image information;

it is determined whether to re-activate the recognition engine for secondary recognition.

Preferably, the image includes a plurality of images; recognizing the image by using the recognition engine to obtain image information comprises the following steps:

acquiring a plurality of images;

activating the recognition engine based on a plurality of the images;

and recognizing the images one by one by using the recognition engine so as to obtain a plurality of image information.

Preferably, the comparing the acquired image information acquired by the camera in real time with the stored image information includes:

storing the stored plurality of image information into a pre-established image information base;

acquiring image information acquired by the camera in real time and transmitting the image information to the recognition engine;

and comparing a plurality of image information in the image information base with the image information acquired in real time by using a perceptual hash algorithm to obtain a comparison result.

Preferably, the comparing the plurality of image information in the image information base with the image information acquired in real time by using a perceptual hash algorithm to obtain a comparison result includes:

preprocessing the image information acquired in real time;

calculating m generalized frequency components of the preprocessed image information;

calculating the average value of the m generalized frequency components, and then comparing each generalized frequency component in the m generalized frequency components with the average value, wherein generalized frequency components larger than or equal to the average value are marked as 1, and generalized frequency components smaller than the average value are marked as 0; counting a first registration result;

outputting the first registration result;

and counting a plurality of second registration results of the plurality of image information in advance, comparing the plurality of second registration results with the first registration result, and obtaining the comparison result.

Preferably, the determining whether to re-activate the recognition engine for secondary recognition includes:

obtaining, from the comparison result, the number of target images and the image information whose generalized frequency component identical rate is greater than or equal to a first preset probability;

determining, from the obtained result and the image information acquired in real time, whether the generalized frequency component identical rate is greater than or equal to a second preset probability;

if yes, outputting target image information with the maximum generalized frequency component identical rate and playing an audio file corresponding to the target image information;

otherwise, restarting the recognition engine to recognize the image information acquired by the camera in real time and storing the image information.

An image recognition apparatus, the apparatus comprising:

the identification module is used for identifying the image by utilizing the identification engine and obtaining image information;

the storage module is used for storing the image information after the recognition is finished and closing the recognition engine;

the comparison module is used for obtaining the image information acquired by the camera in real time and comparing the image information with the stored image information;

and the determining module is used for determining whether the recognition engine is re-activated for secondary recognition.

Preferably, the image includes a plurality of images; the identification module comprises:

the first acquisition submodule is used for acquiring a plurality of images;

an activation sub-module for activating the recognition engine based on a plurality of the images;

and the identification sub-module is used for identifying a plurality of images one by utilizing the identification engine so as to obtain a plurality of image information.

Preferably, the comparison module includes:

the storage sub-module is used for storing the stored plurality of image information into a pre-established image information base;

the second acquisition sub-module is used for acquiring the image information acquired by the camera in real time and transmitting the image information to the recognition engine;

and the comparison sub-module is used for comparing a plurality of image information in the image information base with the image information acquired in real time by using a perceptual hash algorithm to obtain a comparison result.

Preferably, the comparison sub-module includes:

the preprocessing unit is used for preprocessing the image information acquired in real time;

a first calculation unit for calculating m generalized frequency components of the image information after preprocessing;

a second calculation unit configured to calculate an average value of the m generalized frequency components, and then compare each generalized frequency component of the m generalized frequency components with the average value, where generalized frequency components greater than or equal to the average value are marked as 1, and generalized frequency components less than the average value are marked as 0; counting a first registration result;

an output unit configured to output the first registration result;

and the comparison unit is used for counting a plurality of second registration results of the plurality of image information in advance, comparing the plurality of second registration results with the first registration result and obtaining the comparison result.

Preferably, the determining module includes:

the third acquisition sub-module is used for acquiring the target image quantity and image information of which the generalized frequency component identical rate is more than or equal to a first preset probability in the comparison result;

the comparison sub-module is used for comparing whether the generalized frequency component identical rate is greater than or equal to a second preset probability according to the acquired result and the image information acquired in real time;

and the control sub-module is used for controlling the recognition engine to output target image information with the maximum generalized frequency component identical rate and play an audio file corresponding to the target image information when the comparison sub-module confirms that the generalized frequency component identical rate is greater than or equal to a second preset probability, otherwise, restarting the recognition engine to recognize the image information acquired by the camera in real time and storing the image information.

Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention may be realized and attained by the structure particularly pointed out in the written description and drawings.

The technical scheme of the invention is further described in detail through the drawings and the embodiments.

Drawings

FIG. 1 is a workflow diagram of an image recognition method provided by the present invention;

FIG. 2 is another workflow diagram of an image recognition method according to the present invention;

FIG. 3 is a block diagram of an image recognition device according to the present invention;

FIG. 4 is a diagram showing another construction of an image recognition apparatus according to the present invention;

Detailed Description

Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary examples are not representative of all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with some aspects of the present disclosure as detailed in the accompanying claims.

Image scanning has become an indispensable part of daily life, for example when printing important files or scanning identity cards. Prior-art image scanning acquires image information in real time with a camera and transmits it to an engine for recognition, so as to obtain content related to the acquired image information. This approach has the following problem: after the camera has acquired the image information and the engine has recognized it successfully, the background keeps calling the engine to recognize again. Similar pages are recognized repeatedly and invalid data is returned, the probability of false recognition increases, and the user experience is very poor. To solve the above problem, the present embodiment discloses an image recognition method in which a recognition engine recognizes an image to obtain image information, the image information is stored and the recognition engine is closed after successful recognition, and whether to restart the recognition engine is determined by comparing the image information acquired by a camera with the stored image information.

An image recognition method, as shown in FIG. 1, includes the following steps:

step S101, identifying an image by utilizing an identification engine to obtain image information;

step S102, after the recognition is finished, the image information is saved, and the recognition engine is closed;

step S103, comparing the image information acquired by the camera in real time with the stored image information;

step S104, determining whether to reactivate the recognition engine for secondary recognition.

The working principle of the technical scheme is as follows: the recognition engine recognizes the image to obtain image information; after successful recognition the image information is stored and the recognition engine is closed; the image information acquired by the camera in real time is then compared with the stored image information, and whether to reactivate the recognition engine for secondary recognition is determined according to the comparison result.

The beneficial effects of the technical scheme are as follows: after the recognition engine has recognized the image information, the image information is stored and the recognition engine is closed. Whether the recognition engine is restarted is determined by comparing the image information acquired by the camera in real time with the stored image information: if the comparison result is within a preset range, the recognition engine does not need to be restarted; otherwise, the recognition engine is restarted to recognize the image information acquired by the camera in real time. This solves the prior-art problem that the background keeps calling the engine after a successful recognition, which returns invalid data and increases the probability of false recognition, and it improves the user experience.
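The control flow of steps S101 to S104 can be sketched as a small gate in front of the engine. This is only an illustration under stated assumptions: `GatedRecognizer`, the `recognize` and `similarity` callables, and the 0.6 threshold default are hypothetical names and values chosen for the sketch, and frames are left as opaque objects.

```python
from typing import Any, Callable, List, Optional


class GatedRecognizer:
    """Calls the recognition engine only when a live frame looks new."""

    def __init__(self,
                 recognize: Callable[[Any], Any],
                 similarity: Callable[[Any, Any], float],
                 threshold: float = 0.6) -> None:
        self.recognize = recognize    # hypothetical engine call
        self.similarity = similarity  # hypothetical comparison, returns 0..1
        self.threshold = threshold    # assumed "preset range" boundary
        self.stored: List[Any] = []   # image information saved after recognition
        self.engine_calls = 0

    def on_frame(self, frame: Any) -> Optional[Any]:
        """Return a fresh recognition result, or None if the engine stayed closed."""
        # S103: compare the live frame with all stored image information.
        best = max((self.similarity(frame, s) for s in self.stored), default=0.0)
        # S104: similar enough to something already recognized, engine stays closed.
        if best >= self.threshold:
            return None
        # Otherwise activate the engine (S101), then store the frame; the engine
        # counts as closed again once this call returns (S102).
        self.engine_calls += 1
        result = self.recognize(frame)
        self.stored.append(frame)
        return result
```

With this shape, a stream of near-identical frames of the same page triggers a single engine call, and only a frame that differs from every stored one reaches the engine again.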

In one embodiment, the image includes a plurality of images; recognizing the image with the recognition engine to obtain image information includes:

acquiring a plurality of images;

activating an identification engine based on the plurality of images;

and utilizing the recognition engine to recognize the plurality of images one by one so as to obtain a plurality of image information.

The beneficial effects of the technical scheme are as follows: recognizing a plurality of images with the recognition engine makes it possible to cope with the different image information that the camera may acquire in real time, and increases the probability that an image information comparison succeeds.

In one embodiment, as shown in FIG. 2, acquiring image information acquired by the camera in real time and comparing the image information with stored image information includes:

step S201, storing the stored plurality of image information into a pre-established image information base;

step S202, acquiring image information acquired by a camera in real time and transmitting the image information to an identification engine;

step S203, comparing a plurality of image information in the image information base with image information acquired in real time by using a perceptual hash algorithm to obtain a comparison result.

The beneficial effects of the technical scheme are as follows: an image information base is established to hold the stored image information, and the perceptual hash algorithm allows the comparison result to be calculated more accurately, which further avoids false recognition.

In one embodiment, comparing a plurality of image information in an image information base with image information acquired in real time by using a perceptual hash algorithm to obtain a comparison result, including:

preprocessing image information acquired in real time;

calculating m generalized frequency components of the preprocessed image information;

calculating the average value of m generalized frequency components, and then comparing each generalized frequency component in the m generalized frequency components with the average value, wherein the generalized frequency component larger than or equal to the average value is marked as 1, and the generalized frequency component smaller than the average value is marked as 0; counting a first registration result;

outputting the first registration result;

counting a plurality of second registration results of a plurality of image information in advance, comparing the plurality of second registration results with the first registration result, and obtaining a comparison result;

in particular, the preprocessing described above comprises: reducing the size of the image information collected by the camera and simplifying its colors into a grayscale image. The calculated generalized frequency components form a 32x32 matrix, from which the 8x8 block in the upper left corner is taken for computing the average value and for the comparison, so that m may be 64.
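The preprocessing step (shrink the frame, drop color) can be sketched in plain Python as follows. The function names, the box-average downsampling and the BT.601 luma weights are assumptions made for this illustration; the source only states that the image is reduced in size and converted to grayscale.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each 0..255


def to_gray(rgb: List[List[Pixel]]) -> List[List[float]]:
    """Convert an RGB image to grayscale using ITU-R BT.601 luma weights."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in rgb]


def shrink(gray: List[List[float]], size: int = 32) -> List[List[float]]:
    """Downsample to size x size by averaging the source pixels of each cell."""
    h, w = len(gray), len(gray[0])
    out = []
    for y in range(size):
        y0 = y * h // size
        y1 = max((y + 1) * h // size, y0 + 1)  # at least one source row
        row = []
        for x in range(size):
            x0 = x * w // size
            x1 = max((x + 1) * w // size, x0 + 1)  # at least one source column
            cell = [gray[j][i] for j in range(y0, y1) for i in range(x0, x1)]
            row.append(sum(cell) / len(cell))
        out.append(row)
    return out
```

The 32x32 grayscale matrix produced here is the input from which the generalized frequency components are then calculated.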

The beneficial effects of the technical scheme are as follows: preprocessing the collected image information allows the generalized frequency components to be calculated more accurately. The generalized frequency components of the image information collected by the camera in real time are compared with their average value to obtain the first registration result, which is then compared with the second registration results computed in advance for the stored image information. The comparison result indicates whether the image information collected by the camera is the same as the stored image information, and the scanning recognition rate is more stable than in the prior art.

In one embodiment, determining whether to reactivate the recognition engine for secondary recognition includes:

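obtaining, from the comparison result, the number of target images and the image information whose generalized frequency component identical rate is greater than or equal to a first preset probability;

determining, from the obtained result and the image information acquired in real time, whether the generalized frequency component identical rate is greater than or equal to a second preset probability;

if yes, outputting the target image information with the maximum generalized frequency component identical rate and playing an audio file corresponding to the target image information;

otherwise, restarting the recognition engine to recognize the image information acquired by the camera in real time and storing the image information;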

in particular, the first preset probability may be 50% and the second preset probability may be 60%. The higher the identical rate, the more similar the stored image information is to the image information acquired by the camera in real time; the stored image information with the highest rate is finally output.
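One possible reading of the two-threshold decision is sketched below: the first threshold selects candidate stored images, and the best candidate is reused only if it also clears the second threshold. The function name `decide`, the string labels and the dictionary input are hypothetical; the 50% and 60% defaults are the example values given above.

```python
from typing import Dict, Optional, Tuple


def decide(rates: Dict[str, float],
           first: float = 0.5,
           second: float = 0.6) -> Tuple[str, Optional[str]]:
    """Map per-image identical rates (0..1) to an action.

    Returns ("reuse", image_id) when a stored image is similar enough,
    otherwise ("restart_engine", None).
    """
    # Keep only stored images whose identical rate reaches the first threshold.
    candidates = {k: r for k, r in rates.items() if r >= first}
    if candidates:
        # Among the candidates, pick the one with the highest identical rate.
        best = max(candidates, key=lambda k: candidates[k])
        if candidates[best] >= second:
            return ("reuse", best)
    # No candidate is similar enough: the engine must recognize the frame again.
    return ("restart_engine", None)
```

For example, `decide({"p1": 0.55, "p2": 0.8})` selects `p2` for reuse, while `decide({"p1": 0.55})` asks for the engine to be restarted because the only candidate stays below the second threshold.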

The beneficial effects of the technical scheme are as follows: outputting the target image information with the highest generalized frequency component identical rate brings the recognition result closer to the image information acquired by the camera in real time. Setting both a first preset probability and a second preset probability copes better with deviations among the identical rates of multiple candidates. Image information that is not recognized is re-recognized by the recognition engine and stored, which enriches the image information base, so that the same image information can be recognized directly next time, saving time and improving efficiency.

In one embodiment, the method comprises:

Step 1: when the recognition engine returns a state of successful recognition, the corresponding image information is saved and the recognition engine is closed.

Step 2: image information acquired by the camera in real time is obtained and compared with the stored image information using the perceptual hash algorithm pHash.

Step 3: when the comparison result of the two pieces of picture information is less than 60%, the recognition engine is started.
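Step 3 compares a percentage, whereas a perceptual hash comparison naturally yields a Hamming distance. A minimal sketch of the conversion, assuming 64-bit hashes held as Python integers and assuming that the "comparison result" means the fraction of identical bits (the function names are hypothetical):

```python
def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def identical_rate(a: int, b: int, bits: int = 64) -> float:
    """Fraction of hash bits that agree, in the range 0..1."""
    return 1.0 - hamming(a, b) / bits


def should_restart(live: int, stored: int, threshold: float = 0.6) -> bool:
    """True when the live frame differs enough to start the engine again."""
    return identical_rate(live, stored) < threshold
```

Under this reading, a 60% threshold on 64 bits means the engine is started once more than 25 bits differ.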

Perceptual hash algorithm pHash:

the perceptual hash algorithm gives more accurate results; it uses the DCT (discrete cosine transform) to isolate the low-frequency content of the picture.

(a) Reduced size

To simplify the calculation of the DCT, pHash starts from a small picture. A size larger than 8x8 is recommended, for example 32x32.

(b) Simplified color

As in aHash (average hash), the picture is converted into a grayscale image to further reduce the amount of calculation (for the specific algorithm, see the steps of the aHash algorithm).

(c) Computing DCT

The DCT decomposes the picture into frequency components that are concentrated in a ladder-like pattern. Here, a 32x32 picture is taken as an example.

DCT stands for discrete cosine transform. It is mainly used for compressing data or images: it transforms a spatial-domain signal into the frequency domain and has good decorrelation performance. The DCT itself is lossless, but it creates good conditions for the subsequent quantization, Huffman coding and similar steps in image coding. Because the DCT is symmetric, the original image information can be restored at the receiving end by applying the inverse DCT after quantization coding. After an image undergoes the discrete cosine transform, the energy of the DCT coefficients is concentrated mainly in the upper left corner and most of the remaining coefficients are close to zero, which is what makes the DCT suitable for image compression. Applying a threshold to the transformed DCT coefficients and zeroing those smaller than a certain value corresponds to the quantization step in image compression; an inverse DCT then yields the compressed image.
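The properties described in this paragraph (invertibility, energy concentration, thresholding followed by an inverse transform) can be checked with a short one-dimensional sketch. It assumes the standard orthonormal DCT-II and its inverse; the function names and the sample signal are illustrative only.

```python
import math
from typing import List


def _c(u: int, n: int) -> float:
    """Scaling factor that makes the DCT basis orthonormal."""
    return math.sqrt(1.0 / n) if u == 0 else math.sqrt(2.0 / n)


def dct(signal: List[float]) -> List[float]:
    """Orthonormal 1D DCT-II of the signal."""
    n = len(signal)
    return [_c(u, n) * sum(signal[i] * math.cos((2 * i + 1) * u * math.pi / (2 * n))
                           for i in range(n))
            for u in range(n)]


def idct(coeffs: List[float]) -> List[float]:
    """Inverse of dct(): rebuilds the signal from its coefficients."""
    n = len(coeffs)
    return [sum(_c(u, n) * coeffs[u] * math.cos((2 * i + 1) * u * math.pi / (2 * n))
                for u in range(n))
            for i in range(n)]


def threshold(coeffs: List[float], limit: float) -> List[float]:
    """Zero every coefficient whose magnitude is below limit."""
    return [c if abs(c) >= limit else 0.0 for c in coeffs]
```

For a smooth signal such as `[10, 11, 12, 13, 14, 15, 16, 17]`, nearly all of the energy ends up in the first coefficient, and `idct(threshold(dct(x), 1.0))` stays close to the original samples even though most coefficients were discarded.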

Principle of discrete cosine transform:

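one-dimensional DCT transformation:

\[ F(u) = c(u) \sum_{i=0}^{N-1} f(i) \cos\left[ \frac{(2i+1)\,u\,\pi}{2N} \right], \qquad c(u) = \begin{cases} \sqrt{1/N}, & u = 0 \\ \sqrt{2/N}, & u \neq 0 \end{cases} \]

where f(i) is the original signal, F(u) is the coefficient after DCT transformation, N is the number of points of the original signal, and c(u) can be regarded as a compensation coefficient that makes the DCT transformation matrix orthogonal. The forward transform formula of the two-dimensional discrete cosine transform is:

\[ F(u,v) = c(u)\,c(v) \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} f(i,j) \cos\left[ \frac{(2i+1)\,u\,\pi}{2N} \right] \cos\left[ \frac{(2j+1)\,v\,\pi}{2N} \right] \]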

(d) Shrinking DCT

The result of the DCT is a matrix of 32x32 size, but only the 8x8 matrix in the upper left corner needs to be retained, which represents the lowest frequency in the picture.

(e) Calculating the average value

As in average hashing, the mean of the DCT coefficients is calculated.

(f) Further reducing DCT

Each coefficient of the 8x8 DCT matrix is compared with the DCT mean: coefficients greater than or equal to the mean are set to '1' and coefficients smaller than the mean are set to '0'. As long as the overall structure of the picture remains unchanged, the resulting hash value does not change.

(g) Constructing a hash value

The 64 bits are combined to form the hash value. The order in which they are combined is arbitrary, provided it is kept consistent.

(h) Comparing fingerprints

The fingerprints of the two pictures are calculated, and then the Hamming distance between them is calculated.
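Steps (c) to (h) can be put together in a short sketch that starts from an already shrunk, grayscale square matrix (32x32 in the text). It assumes the orthonormal 2D DCT-II and a row-major bit order; the function names are hypothetical, and the direct summation is chosen for readability rather than speed.

```python
import math
from typing import List


def low_freq_dct(gray: List[List[float]], keep: int = 8) -> List[List[float]]:
    """Top-left keep x keep block of the orthonormal 2D DCT-II of a square image."""
    n = len(gray)
    # basis[u][i] = c(u) * cos((2i + 1) * u * pi / (2N)), only for the kept rows.
    basis = [[(math.sqrt(1.0 / n) if u == 0 else math.sqrt(2.0 / n))
              * math.cos((2 * i + 1) * u * math.pi / (2 * n))
              for i in range(n)]
             for u in range(keep)]
    # The 2D DCT is separable: transform along each row, then along each column.
    rows = [[sum(basis[v][j] * gray[i][j] for j in range(n)) for v in range(keep)]
            for i in range(n)]
    return [[sum(basis[u][i] * rows[i][v] for i in range(n)) for v in range(keep)]
            for u in range(keep)]


def phash(gray: List[List[float]], keep: int = 8) -> int:
    """Fingerprint of keep*keep bits: 1 where a coefficient is >= the mean."""
    flat = [c for row in low_freq_dct(gray, keep) for c in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for c in flat:  # fixed row-major order keeps hashes comparable
        bits = (bits << 1) | (1 if c >= mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")
```

Two frames of the same page should give a small Hamming distance, while a page turn should give a large one; the cut-off between the two is the preset threshold discussed earlier.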

The working principle and beneficial effects of the technical scheme are as follows: after the recognition engine has recognized the image information, the image information is stored and the recognition engine is closed. Whether the recognition engine is restarted is determined by comparing the image information acquired by the camera in real time with the stored image information: if the comparison result is within a preset range, the recognition engine does not need to be restarted; otherwise, the recognition engine is restarted to recognize the image information acquired by the camera in real time. Adding this page-turn detection reduces the number of calls to the recognition engine, which in turn reduces the return of invalid data and the probability of false recognition.

An image recognition apparatus, as shown in FIG. 3, comprising:

an identification module 301, configured to identify an image by using an identification engine, and obtain image information;

a saving module 302, configured to save the image information after the recognition is completed and turn off the recognition engine;

the comparison module 303 is configured to obtain image information collected by the camera in real time and compare the image information with stored image information;

a determining module 304 is configured to determine whether to reactivate the recognition engine for secondary recognition.

In one embodiment, the image includes a plurality of images; the identification module includes:

the first acquisition submodule is used for acquiring a plurality of images;

an activation sub-module for activating an identification engine based on a plurality of the images;

and the identification sub-module is used for identifying the plurality of images one by utilizing the identification engine so as to obtain a plurality of image information.

In one embodiment, as shown in FIG. 4, the comparison module includes:

a storage sub-module 401, configured to store a plurality of stored image information into a pre-established image information library;

the second obtaining sub-module 402 is configured to obtain image information collected by the camera in real time and transmit the image information to the recognition engine;

and the comparison sub-module 403 is configured to compare the plurality of image information in the image information base with the image information acquired in real time by using a perceptual hash algorithm, so as to obtain a comparison result.

In one embodiment, the comparison sub-module comprises:

the preprocessing unit is used for preprocessing the image information acquired in real time;

a first calculation unit for calculating m generalized frequency components of the preprocessed image information;

a second calculation unit for calculating an average value of the m generalized frequency components, and then comparing each generalized frequency component of the m generalized frequency components with the average value, wherein the generalized frequency component greater than or equal to the average value is recorded as 1, and the generalized frequency component less than the average value is recorded as 0; counting a first registration result;

an output unit configured to output a first registration result;

and the comparison unit is used for counting a plurality of second registration results of the plurality of image information in advance, comparing the plurality of second registration results with the first registration results and obtaining a comparison result.

In one embodiment, the determining module includes:

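the third acquisition sub-module is used for acquiring the target image quantity and image information of which the generalized frequency component identical rate is more than or equal to the first preset probability in the comparison result;

the comparison sub-module is used for comparing whether the generalized frequency component identical rate is greater than or equal to the second preset probability according to the acquired result and the image information acquired in real time;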

and the control sub-module is used for controlling the recognition engine to output target image information with the maximum generalized frequency component identical rate and play the audio file corresponding to the target image information when the comparison sub-module confirms that the generalized frequency component identical rate is greater than or equal to the second preset probability, otherwise, restarting the recognition engine to recognize the image information acquired by the camera in real time and storing the image information.

It will be appreciated by those skilled in the art that the first and second aspects of the present invention refer to different phases of application.

Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.

It is to be understood that the present disclosure is not limited to the precise arrangements and instrumentalities shown in the drawings, and that various modifications and changes may be effected without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims

1. An image recognition method, characterized by comprising the steps of:

determining whether to re-activate the recognition engine for secondary recognition;

the step of obtaining the image information acquired by the camera in real time and comparing the image information with the stored image information comprises the following steps:

comparing a plurality of image information in the image information base with the image information acquired in real time by using a perceptual hash algorithm to obtain a comparison result;

the comparing, using a perceptual hash algorithm, the plurality of image information in the image information library and the image information acquired in real time to obtain a comparison result, including:

preprocessing the image information acquired in real time;

outputting the first registration result;

2. The image recognition method according to claim 1, wherein the image includes a plurality of images; the step of recognizing the image by using the recognition engine to obtain image information comprises the following steps:

acquiring a plurality of images;

activating the recognition engine based on a plurality of the images;

3. The image recognition method of claim 1, wherein the determining whether to reactivate the recognition engine for secondary recognition comprises:

4. An image recognition apparatus, comprising:

the determining module is used for determining whether the recognition engine is re-activated for secondary recognition;

the comparison module comprises:

the comparison sub-module is used for comparing a plurality of image information in the image information base with the image information acquired in real time by using a perceptual hash algorithm to obtain a comparison result;

the comparison sub-module comprises:

an output unit configured to output the first registration result;

5. The image recognition device of claim 4, wherein the image comprises a plurality of images; the identification module comprises:

the first acquisition submodule is used for acquiring a plurality of images;

6. The image recognition device of claim 4, wherein the determination module comprises: