CN111160390A - Image identification method and device

Info

Publication number: CN111160390A
Application number: CN201911212596.0A
Authority: CN
Inventors: 盖俸瑞
Original assignee: Unisound Intelligent Technology Co Ltd
Current assignee: Unisound Intelligent Technology Co Ltd
Priority date: 2019-12-02
Filing date: 2019-12-02
Publication date: 2020-05-15
Anticipated expiration: 2039-12-02
Also published as: CN111160390B

Abstract

The invention discloses an image recognition method and device. The method comprises: recognizing an image by using a recognition engine to obtain image information; after the recognition is finished, saving the image information and closing the recognition engine; acquiring image information collected by a camera in real time and comparing it with the saved image information; and determining whether to reactivate the recognition engine for secondary recognition. Because the image information is saved and the recognition engine is closed once the engine has recognized the image information, and because the decision to restart the engine is made by comparing the image information collected by the camera in real time with the saved image information, the invention solves the prior-art problem that the background keeps calling the engine after a successful recognition, which returns invalid data and increases the probability of false recognition. The user experience is thereby improved.

Description

Image identification method and device

Technical Field

The present invention relates to the field of data processing technologies, and in particular, to an image recognition method and apparatus.

Background

At present, image scanning has become an essential part of daily life, for example in printing important documents and scanning identity cards. In the prior art, image scanning uses a camera to acquire image information in real time and transmits it to an engine for recognition, so as to obtain content related to the acquired image information. This approach has the following problem: after the image information acquired by the camera has been transmitted to the engine and successfully recognized, the background continues to call the engine, which recognizes similar pages and returns invalid data. This also increases the probability of false recognition and gives users a very poor experience.

Disclosure of Invention

In order to solve the above problem, the method recognizes an image by using a recognition engine to obtain image information, saves the image information after the recognition is successful, closes the recognition engine, and then determines whether to restart the recognition engine by comparing the image information collected by the camera with the saved image information.

An image recognition method comprising the steps of:

identifying the image by using an identification engine to obtain image information;

after recognition is finished, the image information is saved and the recognition engine is closed;

acquiring image information acquired by a camera in real time and comparing the image information with stored image information;

determining whether to reactivate the recognition engine for secondary recognition.

Preferably, there are a plurality of images; recognizing the image by using the recognition engine to obtain the image information comprises the following steps:

acquiring a plurality of said images;

activating the recognition engine based on a plurality of the images;

and identifying the plurality of images one by one by utilizing the identification engine so as to obtain a plurality of pieces of image information.

Preferably, the comparing the image information acquired by the camera in real time with the stored image information includes:

storing the stored image information into a pre-established image information base;

acquiring image information acquired by the camera in real time and transmitting the image information to the recognition engine;

and comparing the plurality of image information in the image information base with the image information acquired in real time by using a perceptual Hash algorithm to obtain a comparison result.

Preferably, the comparing the plurality of image information in the image information base and the image information acquired in real time by using a perceptual hashing algorithm to obtain a comparison result includes:

preprocessing the image information acquired in real time;

calculating m generalized frequency components of the image information after preprocessing;

calculating an average value of the m generalized frequency components, and then comparing each generalized frequency component in the m generalized frequency components with the average value, wherein the generalized frequency component which is greater than or equal to the average value is recorded as 1, and the generalized frequency component which is smaller than the average value is recorded as 0; counting a first registration result;

outputting the first registration result;

and counting a plurality of second registration results of the plurality of image information in advance, comparing the plurality of second registration results with the first registration result, and acquiring the comparison result.

Preferably, the determining whether to reactivate the recognition engine for secondary recognition includes:

acquiring the quantity of target images and image information of which the generalized frequency component identity rate is greater than or equal to a first preset probability in the comparison result;

comparing whether the generalized frequency component identity rate is greater than or equal to a second preset probability or not through the obtained result and the image information acquired in real time;

if yes, outputting the target image information with the maximum generalized frequency component identity rate and playing an audio file corresponding to the target image information;

and if not, restarting the recognition engine to recognize and store the image information acquired by the camera in real time.

An image recognition apparatus, the apparatus comprising:

the identification module is used for identifying the image by utilizing an identification engine to obtain image information;

the storage module is used for storing the image information and closing the recognition engine after the recognition is finished;

the comparison module is used for acquiring image information acquired by the camera in real time and comparing the image information with the stored image information;

and the determining module is used for determining whether to reactivate the recognition engine for secondary recognition.

Preferably, there are a plurality of images; the identification module comprises:

a first acquisition sub-module for acquiring a plurality of the images;

an activation sub-module for activating the recognition engine based on a plurality of the images;

and the identification submodule is used for identifying the images one by one by utilizing the identification engine so as to obtain a plurality of pieces of image information.

Preferably, the comparison module includes:

the storage submodule is used for storing the stored image information into a pre-established image information base;

the second acquisition submodule is used for acquiring image information acquired by the camera in real time and transmitting the image information to the recognition engine;

and the comparison submodule is used for comparing a plurality of image information in the image information base with the image information acquired in real time by using a perceptual Hash algorithm to obtain a comparison result.

Preferably, the comparison sub-module includes:

the preprocessing unit is used for preprocessing the image information acquired in real time;

a first calculation unit for calculating m generalized frequency components of the image information after the preprocessing;

a second calculating unit, configured to calculate an average value of the m generalized frequency components, and then compare each generalized frequency component of the m generalized frequency components with the average value, where a generalized frequency component greater than or equal to the average value is recorded as 1, and a generalized frequency component smaller than the average value is recorded as 0; counting a first registration result;

an output unit configured to output the first registration result;

and the comparison unit is used for counting a plurality of second registration results of the plurality of image information in advance, comparing the plurality of second registration results with the first registration result and acquiring the comparison result.

Preferably, the determining module includes:

the third obtaining submodule is used for obtaining the number of target images and image information of which the generalized frequency component identity rate is greater than or equal to a first preset probability in the comparison result;

the comparison submodule is used for comparing whether the generalized frequency component identity rate is greater than or equal to a second preset probability or not through the acquired result and the image information acquired in real time;

and the control submodule is used for controlling the recognition engine to output the target image information with the maximum generalized frequency component identity rate and play the audio file corresponding to the target image information when the comparison submodule confirms that the generalized frequency component identity rate is greater than or equal to a second preset probability, and otherwise, restarting the recognition engine to recognize and store the image information acquired by the camera in real time.

Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and drawings.

The technical solution of the present invention is further described in detail by the accompanying drawings and embodiments.

Drawings

FIG. 1 is a flowchart illustrating an image recognition method according to the present invention;

FIG. 2 is another flowchart of an image recognition method according to the present invention;

FIG. 3 is a block diagram of an image recognition apparatus according to the present invention;

FIG. 4 is another structural diagram of an image recognition apparatus according to the present invention;

Detailed Description

Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The implementations described in the exemplary embodiments below are not intended to represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims.

At present, image scanning has become an essential part of daily life, for example in printing important documents and scanning identity cards. In the prior art, image scanning uses a camera to acquire image information in real time and transmits it to an engine for recognition, so as to obtain content related to the acquired image information. This approach has the following problem: after the image information acquired by the camera has been transmitted to the engine and successfully recognized, the background continues to call the engine, which recognizes similar pages and returns invalid data. This also increases the probability of false recognition and gives users a very poor experience. In order to solve the above problem, the present embodiment discloses a method that recognizes an image by using a recognition engine to obtain image information, saves the image information after the recognition is successful, closes the recognition engine, and determines whether to restart the recognition engine by comparing the image information collected by a camera with the saved image information.

An image recognition method, as shown in fig. 1, includes the following steps:

step S101, identifying an image by using an identification engine to obtain image information;

step S102, storing the image information after the identification is finished and closing the identification engine;

step S103, acquiring image information acquired by a camera in real time and comparing the image information with the stored image information;

step S104, determining whether to reactivate the recognition engine for secondary recognition.

The working principle of the technical scheme is as follows: an image is recognized by using a recognition engine to obtain image information; after the recognition is successful, the image information is saved and the recognition engine is closed; the image information acquired by the camera in real time is then compared with the saved image information, and whether to reactivate the recognition engine for secondary recognition is determined according to the comparison result.

The beneficial effects of the above technical scheme are: the image information is saved and the recognition engine is closed after the recognition engine has recognized the image information. Whether the recognition engine is restarted is determined by comparing the image information collected by the camera in real time with the saved image information: if the comparison result is within a preset range, the recognition engine does not need to be restarted; otherwise, the recognition engine is restarted to recognize the image information collected by the camera in real time. This solves the prior-art problem that the background keeps calling the engine after a successful recognition, which returns invalid data and increases the probability of false recognition, and it improves the user experience.
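As an illustration only, the flow of steps S101 to S104 can be sketched in Python as follows. The class name `RecognitionSession`, the `recognize` and `similarity` callables, the toy similarity function and the 0.6 threshold are assumptions made for this sketch and are not defined by the present embodiment; frames are modeled as plain strings.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class RecognitionSession:
    """Hypothetical wrapper that opens and closes a recognition engine."""
    recognize: Callable[[str], str]          # assumed engine call: frame -> image information
    similarity: Callable[[str, str], float]  # assumed comparison, returns 0.0 .. 1.0
    threshold: float = 0.6                   # assumed restart threshold
    engine_on: bool = True
    saved_frame: Optional[str] = None
    saved_info: Optional[str] = None
    engine_calls: int = 0

    def on_frame(self, frame: str) -> Optional[str]:
        # Engine closed: compare the live frame with the saved one (S103).
        if not self.engine_on and self.saved_frame is not None:
            if self.similarity(frame, self.saved_frame) >= self.threshold:
                return self.saved_info   # same page, no engine call needed
            self.engine_on = True        # different page: restart the engine (S104)
        # Engine open: recognize (S101), save the result and close the engine (S102).
        self.saved_info = self.recognize(frame)
        self.saved_frame = frame
        self.engine_calls += 1
        self.engine_on = False
        return self.saved_info


def toy_similarity(a: str, b: str) -> float:
    """Toy stand-in for an image comparison: share of matching characters."""
    longest = max(len(a), len(b), 1)
    return sum(x == y for x, y in zip(a, b)) / longest


session = RecognitionSession(recognize=lambda f: "info:" + f, similarity=toy_similarity)
first = session.on_frame("page-one")   # engine is called
same = session.on_frame("page-onE")    # nearly identical frame, engine stays closed
turned = session.on_frame("XXXXXXXX")  # different frame, engine is restarted
```

In this sketch the engine is called only for the first and third frames; the second frame is answered from the saved image information.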

In one embodiment, there are a plurality of images; recognizing the images by using a recognition engine to obtain image information includes:

acquiring a plurality of images;

activating a recognition engine based on the plurality of images;

and identifying the plurality of images one by one by utilizing an identification engine to obtain a plurality of pieces of image information.

The beneficial effects of the above technical scheme are: recognizing a plurality of images with the recognition engine makes it possible to handle the different image information that the camera may collect in real time, and also increases the probability that a comparison of image information succeeds.

In one embodiment, as shown in fig. 2, the acquiring image information collected by the camera in real time and comparing the acquired image information with the stored image information includes:

step S201, storing a plurality of stored image information into a pre-established image information base;

step S202, acquiring image information acquired by a camera in real time and transmitting the image information to an identification engine;

step S203, comparing a plurality of image information in the image information base with the image information acquired in real time by using a perceptual hash algorithm to obtain a comparison result.

The beneficial effects of the above technical scheme are: establishing an image information base provides a place to keep the saved image information, and using the perceptual hash algorithm allows the comparison result to be calculated more accurately, which further avoids false recognition.
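A minimal sketch of steps S201 to S203 is given below, assuming that each saved item is kept together with a precomputed fingerprint written as a string of "0" and "1" characters. The names `ImageInfoBase`, `store`, `compare` and `identity_rate` are hypothetical and chosen only for this illustration.

```python
from typing import List, Tuple


def identity_rate(bits_a: str, bits_b: str) -> float:
    """Fraction of positions at which two equal-length bit strings agree."""
    if len(bits_a) != len(bits_b):
        raise ValueError("fingerprints must have the same length")
    same = sum(a == b for a, b in zip(bits_a, bits_b))
    return same / len(bits_a)


class ImageInfoBase:
    """Hypothetical in-memory image information base."""

    def __init__(self) -> None:
        # Each entry pairs a fingerprint with the saved image information.
        self._entries: List[Tuple[str, str]] = []

    def store(self, fingerprint: str, info: str) -> None:
        self._entries.append((fingerprint, info))

    def compare(self, live_fingerprint: str) -> List[Tuple[float, str]]:
        """Return (identity rate, image information) for every saved entry."""
        return [(identity_rate(fp, live_fingerprint), info)
                for fp, info in self._entries]


base = ImageInfoBase()
base.store("11110000", "page 1")
base.store("00001111", "page 2")
results = base.compare("11110001")  # fingerprint of the live frame (8 bits for brevity)
```

Here the live fingerprint agrees with "page 1" in 7 of 8 positions and with "page 2" in 1 of 8 positions.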

In one embodiment, comparing a plurality of image information in an image information base with image information acquired in real time by using a perceptual hashing algorithm to obtain a comparison result, includes:

preprocessing image information acquired in real time;

calculating the average value of m generalized frequency components, then comparing each generalized frequency component in the m generalized frequency components with the average value, and recording the generalized frequency component which is greater than or equal to the average value as 1 and the generalized frequency component which is smaller than the average value as 0; counting a first registration result;

outputting the first registration result;

counting a plurality of second registration results of a plurality of image information in advance, comparing the plurality of second registration results with the first registration result, and acquiring a comparison result;

Specifically, the preprocessing comprises reducing the size of the image information acquired by the camera and simplifying its colors to a grayscale image. The calculated generalized frequency components form a 32x32 matrix, of which the upper-left 8x8 block is compared with the average value, so m may be 64.

The beneficial effects of the above technical scheme are: preprocessing the acquired image information allows the values of the generalized frequency components to be calculated more accurately. The generalized frequency components of the image information acquired by the camera in real time are compared with their average value to determine the first registration result, which is then compared with the second registration results computed in advance for the saved image information. The comparison result indicates whether the image information acquired by the camera is the same as the saved image information, so the scanning recognition rate is more stable than in the prior art.
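The calculation of the m = 64 generalized frequency components and their comparison with the average value can be sketched as follows. This is a sketch under stated assumptions: the input is taken to be an already preprocessed 32x32 grayscale matrix, the generalized frequency components are taken to be orthonormal DCT-II coefficients, and the function names `dct_low_block` and `registration_bits` are hypothetical.

```python
import math
from typing import List

Matrix = List[List[float]]


def dct_low_block(gray: Matrix, keep: int = 8) -> Matrix:
    """Upper-left keep x keep block of the 2-D DCT-II of a square gray image."""
    n = len(gray)
    # basis[u][i] = c(u) * cos((2i + 1) * u * pi / (2n)), only for the kept rows
    basis = [
        [(math.sqrt(1 / n) if u == 0 else math.sqrt(2 / n))
         * math.cos((2 * i + 1) * u * math.pi / (2 * n))
         for i in range(n)]
        for u in range(keep)
    ]
    # The 2-D transform is separable: transform along one axis, then the other.
    tmp = [[sum(basis[u][i] * gray[i][j] for i in range(n)) for j in range(n)]
           for u in range(keep)]
    return [[sum(tmp[u][j] * basis[v][j] for j in range(n)) for v in range(keep)]
            for u in range(keep)]


def registration_bits(gray: Matrix, keep: int = 8) -> str:
    """Record 1 for components >= the average value and 0 otherwise."""
    components = [c for row in dct_low_block(gray, keep) for c in row]  # m = keep * keep
    mean = sum(components) / len(components)
    return "".join("1" if c >= mean else "0" for c in components)


flat = [[100.0] * 32 for _ in range(32)]                 # uniform gray picture
split = [[0.0] * 16 + [255.0] * 16 for _ in range(32)]   # dark left half, bright right half
bits_flat = registration_bits(flat)
bits_split = registration_bits(split)
```

For the uniform picture only the first component is non-zero, so only the first bit is set; the split picture yields a different bit pattern.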

In one embodiment, determining whether to reactivate the recognition engine for secondary recognition includes:

acquiring the quantity of target images and image information of which the generalized frequency component identity rate is greater than or equal to a first preset probability in a comparison result;

otherwise, restarting the recognition engine to recognize and store the image information collected by the camera in real time;

Specifically, the first preset probability may be 50% and the second preset probability may be 60%. The larger the identity rate, the more similar the saved image information is to the image information acquired by the camera in real time; finally, the saved image information with the largest identity rate is output.

The beneficial effects of the above technical scheme are: the target image information with the maximum generalized frequency component identity rate can be output, so that the recognition result is closer to the image information acquired by the camera in real time. Setting both the first preset probability and the second preset probability better handles cases in which the identity rate of the generalized frequency components is high. Image information that is not recognized is re-recognized by the recognition engine and saved, which enriches the image information base, so that the same image information can be recognized directly the next time it appears, saving time and improving efficiency.
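One possible reading of this two-threshold decision is sketched below. Treating the first preset probability as a filter that selects candidate targets, and the second as a check on the best candidate, is an interpretation made for this sketch; the function name `decide` and the result format are hypothetical, and the values 0.5 and 0.6 are the example values given above.

```python
from typing import List, Optional, Tuple

FIRST_PRESET = 0.5   # example value of the first preset probability
SECOND_PRESET = 0.6  # example value of the second preset probability


def decide(results: List[Tuple[float, str]]) -> Tuple[bool, Optional[str]]:
    """Return (restart_engine, image information to output).

    `results` holds (identity rate, saved image information) pairs.
    """
    # Stage 1: keep the targets whose identity rate reaches the first preset.
    candidates = [(rate, info) for rate, info in results if rate >= FIRST_PRESET]
    if not candidates:
        return True, None
    # Stage 2: the best candidate must also reach the second preset.
    best_rate, best_info = max(candidates, key=lambda item: item[0])
    if best_rate >= SECOND_PRESET:
        return False, best_info   # output the saved information, engine stays closed
    return True, None             # restart the engine and recognize the live frame


match = decide([(0.875, "page 1"), (0.125, "page 2")])
borderline = decide([(0.55, "page 3")])
empty = decide([])
```

In this sketch the caller would play the audio file associated with the returned image information, or restart the recognition engine when the first element of the result is true.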

In one embodiment, the method comprises the following steps:

Step 1: when the recognition engine returns a state of successful recognition, the corresponding image information is saved and the recognition engine is closed.

Step 2: the image information acquired by the camera in real time is compared with the saved image information by using the perceptual hash algorithm pHash.

Step 3: when the comparison result for the two pieces of picture information is less than 60 percent, the recognition engine is restarted.

Perceptual hashing algorithm pHash:

The perceptual hash algorithm uses the DCT (discrete cosine transform) to extract the low-frequency content of a picture, and can obtain more accurate results.

(a) Reduced size

To simplify the computation of the DCT, pHash starts from a small picture. The picture should be larger than 8x8; 32x32 is suggested.

(b) Simplified color

As in aHash, the picture is converted to a grayscale image to further reduce the amount of computation (for the specific procedure, see the corresponding step of the aHash algorithm).
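Steps (a) and (b) can be illustrated with the following sketch, which operates on a picture given as nested lists of RGB tuples. The box-averaging downscale and the ITU-R BT.601 luma weights are assumptions chosen for this example; the text does not prescribe a particular resampling method or grayscale formula, and the function names are hypothetical.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]


def to_gray(pixel: Pixel) -> float:
    """Weighted sum of R, G and B (BT.601 weights, an assumed choice)."""
    r, g, b = pixel
    return 0.299 * r + 0.587 * g + 0.114 * b


def shrink_to_gray(image: List[List[Pixel]], size: int = 32) -> List[List[float]]:
    """Box-average an RGB picture down to size x size gray values."""
    h, w = len(image), len(image[0])
    out = []
    for y in range(size):
        y0 = y * h // size
        y1 = max((y + 1) * h // size, y0 + 1)   # cover at least one source row
        row = []
        for x in range(size):
            x0 = x * w // size
            x1 = max((x + 1) * w // size, x0 + 1)   # cover at least one source column
            block = [to_gray(image[yy][xx])
                     for yy in range(y0, y1) for xx in range(x0, x1)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


white = [[(255, 255, 255)] * 64 for _ in range(64)]
small = shrink_to_gray(white)   # 64x64 RGB -> 32x32 gray
```

The resulting 32x32 matrix of gray values is the kind of input assumed by the DCT step that follows.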

(c) Computing DCT

The DCT decomposes a picture into a collection of frequencies and scalars. Here, a 32x32 picture is taken as an example.

The DCT (Discrete Cosine Transform) is mainly used for compressing data or images. It converts a signal from the spatial domain to the frequency domain and has good decorrelation performance. The DCT itself is lossless, but it creates good conditions for subsequent quantization, Huffman coding and the like in fields such as image coding. Moreover, because the DCT is symmetric, the inverse DCT can be used at the receiving end to restore the original image information after quantization and coding. When the discrete cosine transform is applied to an original image, the energy of the transformed DCT coefficients is concentrated mainly in the upper-left corner and most of the remaining coefficients are close to zero, which makes the DCT well suited to image compression. Applying a threshold to the transformed DCT coefficients, that is, zeroing the coefficients smaller than a certain value, is the quantization step of image compression; an inverse DCT operation then yields the compressed image.
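The properties described above can be checked on a small one-dimensional signal. The sketch below assumes the standard orthonormal DCT-II and its inverse; the function names, the sample signal and the threshold of 1.0 are illustrative choices and are not taken from the text.

```python
import math
from typing import List


def _c(u: int, n: int) -> float:
    """Normalization factor that makes the transform orthogonal."""
    return math.sqrt(1 / n) if u == 0 else math.sqrt(2 / n)


def dct_1d(signal: List[float]) -> List[float]:
    n = len(signal)
    return [
        _c(u, n) * sum(signal[i] * math.cos((2 * i + 1) * u * math.pi / (2 * n))
                       for i in range(n))
        for u in range(n)
    ]


def idct_1d(coeffs: List[float]) -> List[float]:
    n = len(coeffs)
    return [
        sum(_c(u, n) * coeffs[u] * math.cos((2 * i + 1) * u * math.pi / (2 * n))
            for u in range(n))
        for i in range(n)
    ]


signal = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0]
coeffs = dct_1d(signal)
restored = idct_1d(coeffs)   # the transform alone loses nothing (up to rounding)

# Threshold operation: zero the small coefficients, then invert.
quantized = [c if abs(c) >= 1.0 else 0.0 for c in coeffs]
approx = idct_1d(quantized)
kept = sum(1 for c in quantized if c != 0.0)
max_error = max(abs(a - b) for a, b in zip(approx, signal))
```

For this smooth signal only two of the eight coefficients survive the threshold, and the signal rebuilt from them stays close to the original.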

Principle of discrete cosine transform:

one-dimensional DCT (discrete cosine transform):
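Assuming the standard orthonormal DCT-II, with F(u) denoting the transformed coefficient, the one-dimensional transform can be written as:

```latex
F(u) = c(u) \sum_{i=0}^{N-1} f(i)\,\cos\!\left[\frac{(2i+1)\,u\,\pi}{2N}\right],
\qquad u = 0, 1, \dots, N-1,
\qquad
c(u) =
\begin{cases}
\sqrt{1/N}, & u = 0,\\[2pt]
\sqrt{2/N}, & u \neq 0.
\end{cases}
```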

where f(i) is the original signal, F(u) is the coefficient after the DCT, N is the number of points of the original signal, and c(u) can be regarded as a compensation coefficient that makes the DCT transform matrix an orthogonal matrix. The forward transform formula of the two-dimensional discrete cosine transform is as follows:
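Under the same assumption (the orthonormal DCT-II, applied to an N x N picture f(i, j) with the same compensation coefficient c), the two-dimensional forward transform can be written as:

```latex
F(u, v) = c(u)\,c(v) \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} f(i, j)\,
\cos\!\left[\frac{(2i+1)\,u\,\pi}{2N}\right]
\cos\!\left[\frac{(2j+1)\,v\,\pi}{2N}\right],
\qquad u, v = 0, 1, \dots, N-1.
```

With N = 32 this yields the 32x32 coefficient matrix used in the following steps.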

(d) Reducing the DCT

The result of the DCT is a matrix of size 32x32, but only the 8x8 matrix in the upper left corner needs to be retained, which part represents the lowest frequencies in the picture.

(e) Calculating the mean value

Compute the mean of the DCT coefficients, as in average hashing (aHash).

(f) Further reduction of DCT

Each coefficient of the 8x8 DCT matrix is compared with the mean: coefficients greater than or equal to the mean are set to "1", and coefficients smaller than the mean are set to "0". As long as the overall structure of the picture remains unchanged, the hash value remains unchanged.

(g) Construct hash value

Combine the 64 bits into a hash value. The order of the bits is arbitrary, provided the same order is used consistently.
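As a sketch of step (g), the 64 bits can be folded into a single integer. Reading the bits in list order with the first bit as the most significant is an arbitrary convention chosen here; any order works as long as every fingerprint uses the same one. The function names are hypothetical.

```python
from typing import List


def pack_bits(bits: List[int]) -> int:
    """Fold 0/1 values into one integer, first bit most significant."""
    value = 0
    for bit in bits:
        value = (value << 1) | (bit & 1)
    return value


def fingerprint_hex(bits: List[int]) -> str:
    """Hexadecimal form of the fingerprint, zero-padded (16 digits for 64 bits)."""
    width = (len(bits) + 3) // 4
    return format(pack_bits(bits), "0{}x".format(width))


bits = [1] + [0] * 63          # only the first bit set
value = pack_bits(bits)
text = fingerprint_hex(bits)
```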

(h) Comparing fingerprints: calculate the fingerprints of the two pictures and then the Hamming distance between them.
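Step (h) can be sketched as follows for fingerprints stored as integers. Converting the Hamming distance into an identity rate as the share of equal bits is an assumption made here so that the result can be compared with a percentage such as the 60 percent of step 3; the function names are hypothetical.

```python
def hamming_distance(fp_a: int, fp_b: int) -> int:
    """Number of bit positions in which two fingerprints differ."""
    return bin(fp_a ^ fp_b).count("1")


def identity_rate_from_distance(distance: int, bits: int = 64) -> float:
    """Share of equal bits, derived from the Hamming distance (assumed mapping)."""
    return 1.0 - distance / bits


fp_saved = 0xFFFF0000FFFF0000
fp_live = 0xFFFF0000FFFF00FF   # differs from fp_saved in the lowest 8 bits
distance = hamming_distance(fp_saved, fp_live)
rate = identity_rate_from_distance(distance)
```

With 8 differing bits out of 64, the identity rate in this example is 0.875, which is above the 60 percent threshold, so the engine would stay closed.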

The working principle and the beneficial effects of the technical scheme are as follows: the image information is saved and the recognition engine is closed after the recognition engine has recognized the image information. Whether the recognition engine is restarted is determined by comparing the image information collected by the camera in real time with the saved image information: if the comparison result is within a preset range, the recognition engine does not need to be restarted; otherwise, the recognition engine is restarted to recognize the image information collected by the camera in real time. Adding this page-turn detection reduces the number of calls to the recognition engine, which in turn reduces the return of invalid data and the probability of false recognition.

An image recognition apparatus, as shown in fig. 3, includes:

the identification module 301 is configured to identify an image by using an identification engine to obtain image information;

a saving module 302, configured to save the image information after the identification is completed and close the identification engine;

the comparison module 303 is configured to compare image information acquired by the camera in real time with stored image information;

a determining module 304, configured to determine whether to reactivate the recognition engine for secondary recognition.

In one embodiment, there are a plurality of images; the identification module includes:

a first acquisition sub-module for acquiring a plurality of images;

an activation sub-module for activating a recognition engine based on a plurality of said images;

and the identification submodule is used for identifying the plurality of images one by one by utilizing the identification engine so as to obtain a plurality of pieces of image information.

In one embodiment, as shown in fig. 4, the comparison module includes:

a storage sub-module 401, configured to store the stored multiple pieces of image information in a pre-established image information base;

the second obtaining submodule 402 is configured to obtain image information collected by the camera in real time and transmit the image information to the recognition engine;

the comparison sub-module 403 is configured to compare the multiple pieces of image information in the image information base with the image information acquired in real time by using a perceptual hashing algorithm to obtain a comparison result.

In one embodiment, the comparison sub-module includes:

the second calculating unit is used for calculating the average value of the m generalized frequency components, then comparing each generalized frequency component in the m generalized frequency components with the average value, and recording the generalized frequency component which is greater than or equal to the average value as 1 and the generalized frequency component which is smaller than the average value as 0; counting a first registration result;

an output unit configured to output a first registration result;

and the comparison unit is used for counting a plurality of second registration results of the plurality of image information in advance, comparing the plurality of second registration results with the first registration result and acquiring a comparison result.

In one embodiment, the determining module includes:

the third obtaining submodule is used for obtaining the number of target images and image information of which the generalized frequency component identity rate is greater than or equal to the first preset probability in the comparison result;

and the control submodule is used for controlling the recognition engine to output the target image information with the maximum generalized frequency component identity rate and play the audio file corresponding to the target image information when the comparison submodule confirms that the generalized frequency component identity rate is greater than or equal to the second preset probability, and otherwise, restarting the recognition engine to recognize the image information collected by the camera in real time and storing the image information.

It will be understood by those skilled in the art that the terms "first" and "second" in the present invention merely refer to different stages of application.

Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.

It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims

1. An image recognition method, comprising the steps of:

2. The image recognition method according to claim 1, wherein the image includes a plurality; the recognizing the image by using the recognition engine to obtain the image information comprises the following steps:

acquiring a plurality of said images;

activating the recognition engine based on a plurality of the images;

3. The image recognition method of claim 1, wherein comparing the image information acquired by the acquisition camera in real time with the stored image information comprises:

4. The image recognition method of claim 3, wherein comparing the image information collected in real time with the image information in the image information library by using a perceptual hashing algorithm to obtain a comparison result comprises:

preprocessing the image information acquired in real time;

outputting the first registration result;

5. The image recognition method of claim 4, wherein the determining whether to reactivate the recognition engine for secondary recognition comprises:

6. An image recognition apparatus, characterized in that the apparatus comprises:

7. The image recognition apparatus according to claim 6, wherein the image includes a plurality; the identification module comprises:

a first acquisition sub-module for acquiring a plurality of the images;

8. The image recognition device of claim 6, wherein the comparison module comprises:

9. The image recognition device of claim 8, wherein the comparison sub-module comprises:

an output unit configured to output the first registration result;

10. The image recognition device of claim 9, wherein the determining module comprises: