CN111401252B

CN111401252B - Book spine matching method and equipment of book checking system based on vision

Info

Publication number: CN111401252B
Application number: CN202010187361.7A
Authority: CN
Inventors: 蔡君; 张立安; 廖丽平; 谭志坚
Original assignee: Guangdong Xingxi Intelligent Technology Co ltd; Guangdong Polytechnic Normal University
Current assignee: Guangdong Xingxi Intelligent Technology Co ltd; Guangdong Polytechnic Normal University
Priority date: 2020-03-17
Filing date: 2020-03-17
Publication date: 2023-07-07
Anticipated expiration: 2040-03-17
Also published as: CN111401252A

Abstract

The invention discloses a spine matching method for a vision-based book checking system, together with an algorithm that improves spine matching precision in such a system. The system comprises a mobile robot and a computer; the mobile robot acquires images in real time through a camera and transmits them to the computer in real time for image processing. The method uses a XingxiPoint model, the fast nearest neighbor search algorithm FLANN for high-dimensional data, and a false detection rejection model XingxiJudge. Features are extracted from the spine image and sent to a FLANN feature matcher to be matched against a spine feature library. Because this matching can produce erroneous matches, after a plurality of suspicious targets have been matched, they are formed into 'picture pairs' one by one and sent into the false detection rejection model XingxiJudge, which judges whether each detection is false; false detections are rejected until a unique target is left. The algorithm improves the matching quality and offers better real-time performance and feature matching accuracy: the time of the matching process is reduced and the accuracy of spine matching is improved.

Description

Book spine matching method and equipment of book checking system based on vision

Technical Field

The invention relates to the technical field of automatic book checking, in particular to a spine matching method and equipment of a visual book checking system.

Background

In large and medium-sized libraries, book checking (inventory) is work that must be performed at regular intervals. At present, paper books are still an integral part of book resources. For the millions of books held by large and medium-sized libraries, routine inventory work consumes a great deal of labor and material. Book checking systems based on computer vision are therefore applied to the inventory work of such libraries to improve its efficiency. Using technologies such as image processing and deep learning, the system has a mobile robot photograph the book spines and then performs feature extraction and feature matching on the spine images to complete spine matching, which is one of the important components of a book checking system.

To perform spine matching, spine features must first be extracted from the spine images. Currently, the mainstream spine feature extraction algorithms include SIFT, SURF, and ORB. The features extracted by SIFT are robust, but SIFT is not as fast as SURF or ORB. The ORB feature extraction algorithm runs far faster than SIFT and SURF and can be used for real-time feature detection, but ORB is not robust to scale changes.

After spine features have been extracted from the image, spine feature matching is required. Because the library inventory system needs to process a large amount of data, FLANN is selected as the matching algorithm: FLANN can choose a suitable algorithm according to the data and is much faster than other nearest neighbor searches. In the spine matching process, however, the larger the volume of data to be matched, the more false detections are generated; common spine matching techniques can complete the matching, but this problem remains unsolved.

Disclosure of Invention

The present invention aims to solve at least one of the technical problems existing in the prior art. Therefore, the invention discloses a spine matching method of a book checking system based on vision, which comprises the following steps:

step 1, a mobile robot acquires spine image information in real time through a camera;

step 2, extracting features of the acquired spine image, and extracting feature points and descriptors through a preset algorithm;

step 3, training a network to extract corner points, taking virtual three-dimensional objects as the data set;

step 4, automatically labeling feature points, wherein real spine picture data are used and the network trained in step 3 is used to extract corner points;

step 5, auditing the real spine picture data to obtain a new picture and forming a picture pair with a known position relationship, inputting the picture pair into a network, and extracting feature points and descriptors;

the features extracted through the preset algorithm are sent to a FLANN feature matcher for rough matching, and a feature subset is obtained;

step 6, traversing the feature subset, calculating distances to obtain a result set, sending the elements of the result set one by one into the false detection rejection model XingxiJudge to judge whether a false detection exists, performing a second matching if a false detection exists so as to reject the false match, optimizing the parameters of the false detection rejection network model, and thereby matching the spine.

Further, the preset algorithm uses two networks: one is the BaseDetector network, which detects corner points of basic geometric images; the other is the XingxiPoint model network, which extracts feature points and descriptors.

Furthermore, step 5 further includes: the result of feature matching is a list of correspondences between the two feature sets; before the matching function is called, FLANN trains a matcher in order to improve the matching speed; the features of the query spine set are then matched against the trained matcher one by one, and the feature points of the query spine set may match a plurality of suspicious targets.

Still further, step 6 further includes: at a later stage, the false detection rejection model XingxiJudge verifies the correctness of the matching, and if a false detection exists, rejection continues until the optimal match is obtained.

Further, the XingxiPoint model is composed of a BACKBONE module, a PPN module, a ROIPOOL module and a KP module; the KP module is responsible for generating feature points and descriptors; the BACKBONE module is respectively connected with the PPN module and the ROIPOOL module, the PPN module is connected with the ROIPOOL module, and the ROIPOOL module is connected with the KP module.

Further, the method comprises: performing noise reduction on the spine data set A; automatically labeling feature points of data set A using affine transformations and the ORB algorithm to obtain a strongly supervised labeled spine data set G; initializing the XingxiPoint model; training XingxiPoint with data set G, the optimizer being the mini-batch gradient descent algorithm; applying XingxiPoint to the query spine Q to obtain a series of feature points and descriptors; and matching a plurality of suspicious targets for Q in the library's spine database using the BF matching algorithm.

The invention further discloses an electronic device, comprising: a processor; and a memory for storing executable instructions of the processor; wherein the processor is configured to perform the spine matching method of the vision-based book checking system described above via execution of the executable instructions.

The invention further discloses a computer readable storage medium, on which a computer program is stored, which when being executed by a processor, implements the spine matching method of the vision-based book checking system.

In summary, compared with the spine matching method in the existing visual book checking system, the spine matching method has the following advantages:

Firstly, more stable spine feature points can be generated, which improves the matching precision.

For feature point extraction, the ORB feature point extraction algorithm is generally adopted, but because actual photographs of spines are not ideal, the extracted feature points are unstable, which directly affects the subsequent matching precision. The deep-learning-based XingxiPoint model can fit the affine transformations present in actually photographed spines without manual labeling, is insensitive to the illumination and angle of the actual shot, and effectively improves the subsequent spine matching precision.

Secondly, an adaptive similarity measurement can be applied to the matching results, so that the final target can be selected accurately.

In spine matching, the similarity between the query spine and the suspicious results is generally measured by cosine distance, and the results are then ranked by similarity. When the suspicious results closely resemble one another, or the calculation precision is insufficient, the final target is wrong, which is unacceptable. The false detection rejection model XingxiJudge provided by the invention essentially replaces a fixed similarity measure with a non-fixed, adaptive similarity measure, thereby improving the accuracy of the final target.

Drawings

The invention will be further understood from the following description taken in conjunction with the accompanying drawings. The components in the figures are not necessarily to scale, emphasis instead being placed upon illustrating the principles of the embodiments. In the figures, like reference numerals designate corresponding parts throughout the different views.

FIG. 1 is a schematic diagram of the structure of the XingxiPoint model of the present invention;

FIG. 2 is a schematic diagram of the structure of the XingxiJudge model of the present invention.

Detailed Description

The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the detailed description is presented by way of example only and is not intended to limit the scope of the invention.

Example 1

As shown in FIGS. 1-2, this embodiment proposes a spine matching method for a vision-based book checking system, including the following steps:

S1, performing noise reduction on the spine data set A;

The spine images are preprocessed; in this embodiment a Gaussian filter is used for noise reduction. The Gaussian filter is a linear smoothing filter that suppresses normally distributed noise well. The Gaussian filter is a weighting matrix whose weight expression is as follows:

where W_ij is the weight, i and j are the pixel indices, K is the normalization constant, and σ is the standard deviation of the Gaussian distribution.
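The weight matrix described above can be sketched in a few lines. The form W_ij = (1/K)·exp(−(i² + j²)/(2σ²)), with i and j measured from the kernel center and K chosen so that the weights sum to 1, is a common convention that matches the symbol definitions; it is an assumption here, not a formula recovered from the source. The function names `gaussian_kernel` and `smooth_pixel` and the border clamping are likewise illustrative.

```python
import math

def gaussian_kernel(radius, sigma):
    """Build a (2*radius+1) x (2*radius+1) Gaussian weight matrix.

    i and j are pixel offsets from the kernel center; K is chosen so
    that all weights sum to 1 (an assumed normalization).
    """
    raw = [[math.exp(-(i * i + j * j) / (2.0 * sigma * sigma))
            for j in range(-radius, radius + 1)]
           for i in range(-radius, radius + 1)]
    K = sum(sum(row) for row in raw)  # normalization constant
    return [[w / K for w in row] for row in raw]

def smooth_pixel(image, y, x, kernel):
    """Weighted average around (y, x); border pixels are clamped (assumption)."""
    radius = len(kernel) // 2
    h, w = len(image), len(image[0])
    total = 0.0
    for i in range(-radius, radius + 1):
        for j in range(-radius, radius + 1):
            yy = min(max(y + i, 0), h - 1)
            xx = min(max(x + j, 0), w - 1)
            total += kernel[i + radius][j + radius] * image[yy][xx]
    return total
```

A larger σ spreads the weights further from the center and therefore smooths more strongly, at the cost of blurring fine spine texture.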

S2, automatically marking characteristic points of the data set A by using affine transformation and ORB algorithm to obtain a spine data set G with strong supervision marking;

An affine transformation is applied to each image in the spine image data set; this embodiment uses 2000 transformations, giving 2000 transformed images. ORB is used to extract feature points on each of these images, yielding 2000 feature point location maps. These maps are inverse affine transformed and accumulated to obtain the final feature point location map, which serves as the feature point labeling of the original spine image; no manual labeling is needed, which saves cost. Repeating these steps for every image yields the data set G.
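The "inverse transform and accumulate" part of this labeling step can be sketched as follows. This is a minimal illustration that assumes the keypoints have already been detected in each warped image (by ORB in the embodiment); the helper names, the parameter ranges of `random_affine`, and the integer vote map are all hypothetical choices.

```python
import math
import random

def random_affine(rng):
    """Random rotation + scale + translation as (a, b, c, d, tx, ty):
    x' = a*x + b*y + tx,  y' = c*x + d*y + ty."""
    angle = rng.uniform(-0.3, 0.3)
    scale = rng.uniform(0.8, 1.2)
    a, b = scale * math.cos(angle), -scale * math.sin(angle)
    c, d = scale * math.sin(angle), scale * math.cos(angle)
    return (a, b, c, d, rng.uniform(-5, 5), rng.uniform(-5, 5))

def apply_affine(m, pt):
    a, b, c, d, tx, ty = m
    x, y = pt
    return (a * x + b * y + tx, c * x + d * y + ty)

def invert_affine(m):
    """Inverse of x' = A x + t, namely x = A^-1 x' - A^-1 t."""
    a, b, c, d, tx, ty = m
    det = a * d - b * c
    ia, ib, ic, id_ = d / det, -b / det, -c / det, a / det
    return (ia, ib, ic, id_, -(ia * tx + ib * ty), -(ic * tx + id_ * ty))

def accumulate_labels(height, width, detections):
    """detections: list of (affine, keypoints found in the warped image).
    Each keypoint is mapped back with the inverse transform and counted."""
    heat = [[0] * width for _ in range(height)]
    for m, pts in detections:
        inv = invert_affine(m)
        for pt in pts:
            x, y = apply_affine(inv, pt)
            xi, yi = int(round(x)), int(round(y))
            if 0 <= xi < width and 0 <= yi < height:
                heat[yi][xi] += 1
    return heat
```

Pixels that collect votes from many of the transformed views are the stable feature points; a vote threshold (not specified in the source) would turn the accumulated map into the final labels.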

S3, initializing an XingxiPoint model;

The XingxiPoint model is essentially a variant of the object detection model FasterRCNN. It comprises a BACKBONE module, a PPN module, a ROIPOOL module and a KP module. The BACKBONE module is responsible for generating a feature map of the spine; any deep convolutional network can serve as the BACKBONE module, and the common VGG network is adopted in this embodiment.

The PPN module is a fully convolutional network responsible for proposing suspicious feature points to the ROIPOOL module; it corresponds to the region proposal module RPN in the FasterRCNN model. In the RPN module, an Anchor is characterized by two points; in this embodiment the Anchor is changed from a two-point to a single-point characterization, and the result is used as the PPN module. A single point here denotes a suspicious feature point.

The ROIPOOL module receives the feature points proposed by the PPN module, extracts the corresponding feature blocks from the BACKBONE module, and sends them to the KP module. Its structure in this embodiment is the same as that of ROIPOOL in the object detection model FasterRCNN.

The KP module is in effect a regressor: it regresses the true feature point coordinates according to the labeling information and outputs the feature block corresponding to each feature point as its descriptor.

The connection mode of each module is as follows: the BACKBONE module is respectively connected with the PPN module and the ROIPOOL module, the PPN module is connected with the ROIPOOL module, and the ROIPOOL module is connected with the KP module.
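The data flow between the four modules can be made concrete with a toy sketch. Every function body below is a deliberately simple stand-in (an intensity difference instead of VGG, a top-k pick instead of a learned proposal network, an identity regressor), so only the wiring reflects the description above; none of it is the patented network.

```python
def backbone(image):
    """Toy feature map: absolute horizontal intensity difference per pixel."""
    return [[abs(row[min(x + 1, len(row) - 1)] - row[x]) for x in range(len(row))]
            for row in image]

def ppn(feature_map, top_k):
    """Propose the top_k highest-scoring single points as (x, y)."""
    scored = [(v, x, y) for y, row in enumerate(feature_map) for x, v in enumerate(row)]
    scored.sort(key=lambda t: -t[0])
    return [(x, y) for _, x, y in scored[:top_k]]

def roipool(feature_map, points, half=1):
    """Cut a (2*half+1)^2 feature block around each proposed point (clamped at borders)."""
    h, w = len(feature_map), len(feature_map[0])
    blocks = []
    for x, y in points:
        blocks.append([feature_map[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                       for dy in range(-half, half + 1) for dx in range(-half, half + 1)])
    return blocks

def kp(points, blocks):
    """Toy regressor: keeps the proposed coordinates, returns each block as descriptor."""
    return [(float(x), float(y)) for x, y in points], blocks

def xingxipoint_forward(image, top_k=3):
    fmap = backbone(image)             # BACKBONE feeds both PPN and ROIPOOL
    proposals = ppn(fmap, top_k)       # PPN proposes points to ROIPOOL
    blocks = roipool(fmap, proposals)  # ROIPOOL passes feature blocks to KP
    return kp(proposals, blocks)       # KP emits feature points and descriptors
```

Calling `xingxipoint_forward(image)` on a small grayscale image given as a list of rows returns a list of coordinates and a matching list of descriptors, one per proposed point.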

S4, training XingxiPoint with data set G, the optimizer being the mini-batch gradient descent algorithm;

Mini-batch gradient descent is a compromise between batch gradient descent and stochastic gradient descent. The idea is to compute the average gradient of batch_size samples in each iteration and then update the parameters. In this embodiment, batch_size=10 and 1000 spine images are used for training; the pseudo-code of the optimization process is:

where θ is the XingxiPoint model parameter, α is the learning rate, and the term following α is the average gradient of the batch_size samples.
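The update described by these symbols is commonly written θ ← θ − α · (1/batch_size) · Σ ∇θ L(x_k; θ), summed over the samples of one batch. The sketch below applies that rule to a toy linear model; the loss function L, the function names, and the default hyperparameters other than batch_size=10 are assumptions made for illustration, since the source does not state them.

```python
import random

def minibatch_gd(samples, grad_fn, theta, alpha=0.01, batch_size=10, epochs=100, seed=0):
    """Repeat: theta <- theta - alpha * (mean gradient over one mini-batch)."""
    rng = random.Random(seed)
    samples = list(samples)
    for _ in range(epochs):
        rng.shuffle(samples)
        for start in range(0, len(samples), batch_size):
            batch = samples[start:start + batch_size]
            grads = [grad_fn(theta, s) for s in batch]
            mean_grad = [sum(g[k] for g in grads) / len(batch)
                         for k in range(len(theta))]
            theta = [t - alpha * g for t, g in zip(theta, mean_grad)]
    return theta

def squared_error_grad(theta, sample):
    """Gradient of 0.5 * (w*x + b - y)^2 for a toy linear model (stand-in loss)."""
    (w, b), (x, y) = theta, sample
    err = w * x + b - y
    return [err * x, err]
```

With 1000 samples and batch_size=10, each epoch performs 100 parameter updates, compared with 1 for batch gradient descent and 1000 for purely stochastic updates.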

S5, applying XingxiPoint to the query spine Q to obtain a series of feature points and descriptors;

S6, matching a plurality of suspicious targets for Q in the library's spine database using the BF matching algorithm;

Spine matching yields a list of correspondences between two feature sets. The first feature set is called the data set and the second the query set. BF matching is a similarity search between high-dimensional vectors using a distance function; a k-d tree is a high-dimensional spatial index structure for finding the neighbors of a query point quickly and accurately, including approximate queries. In a range query, a query point and a query distance threshold are given, and all data in the data set whose distance from the query point is less than the threshold are found. In a K-nearest-neighbor query, the K data closest to the query point are found in the data set; in this embodiment K=2. After the feature points and descriptors have been obtained, a 2-nearest-neighbor query is performed for each point in the query set against the feature point set of the target object, that is, its nearest neighbor and second-nearest neighbor are obtained.

Definition of nearest neighbor:

Given a multidimensional space $\mathbb{R}^n$, a vector $d = (d_1, \dots, d_n)$ in $\mathbb{R}^n$ is called a sample point. Given a sample set $E$ and a sample point $d$, the nearest neighbor of $d$ is any sample point $d' \in E$ such that no other sample point in $E$ is nearer to $d$ than $d'$. The distance measure is the Euclidean distance:

$\mathrm{Dist}(d, d') = \sqrt{\sum_{i=1}^{n} (d_i - d'_i)^2}$

where $d_i$ is the i-th component of vector $d$.

The BF matching algorithm trains a matcher. The matcher builds an index tree of the feature set, and each feature point of the query set is matched against the matcher to obtain a plurality of suspicious targets.
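A 2-nearest-neighbor query of the kind used in S6 can be sketched with an exhaustive search. This illustration assumes a Euclidean distance between descriptors and scans the whole data set instead of building an index tree, so it shows the result of the query rather than the matcher's internal structure; the function names are hypothetical.

```python
import math

def euclidean(d, e):
    """Euclidean distance between two descriptors of equal length."""
    return math.sqrt(sum((di - ei) ** 2 for di, ei in zip(d, e)))

def two_nearest(query, dataset):
    """Return [(index, distance), (index, distance)] for the nearest and
    second-nearest dataset descriptors (exhaustive search, k = 2)."""
    ranked = sorted((euclidean(query, d), i) for i, d in enumerate(dataset))
    return [(i, dist) for dist, i in ranked[:2]]

def knn_match(query_set, dataset):
    """Run the 2-nearest-neighbor query for every descriptor in the query set."""
    return [two_nearest(q, dataset) for q in query_set]
```

The exhaustive scan costs one distance computation per data set entry for every query point, which is why an index structure becomes attractive when millions of spine descriptors must be searched.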

S7, forming the matched suspicious targets into picture pairs one by one, sending the picture pairs into the false detection rejection model XingxiJudge, judging whether a false detection exists, and rejecting it if so, until finally a unique target is left.

The matched suspicious targets are formed into picture pairs one by one and sent into the false detection rejection model XingxiJudge. The false detection rejection model screens the 'picture pairs' to obtain the best-matching spine image. Key points of the spine are extracted, and for the plurality of suspicious targets the two key points closest to the spine image of the data set are found; if the ratio of the second-closest distance to the closest distance is smaller than a certain threshold, the match is judged to be a false detection and is rejected.
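The distance-ratio check described above can be sketched as a small filter. The default threshold of 1.5 and the handling of a zero nearest distance are assumptions for illustration; the source only speaks of "a certain threshold".

```python
def is_false_detection(nearest_dist, second_dist, threshold):
    """Flag a match when second_dist / nearest_dist falls below the threshold,
    i.e. when the two best candidates are almost equally close."""
    if nearest_dist == 0.0:
        # Exact hit: ambiguous only if the runner-up is an exact hit as well.
        return second_dist == 0.0
    return second_dist / nearest_dist < threshold

def reject_false_matches(matches, threshold=1.5):
    """matches: list of (query_id, nearest_dist, second_dist).
    Returns only the matches that are not flagged as false detections."""
    return [m for m in matches if not is_false_detection(m[1], m[2], threshold)]
```

Because the ratio is second-closest over closest, it is always at least 1, and a higher threshold rejects more matches.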

The false detection rejection model XingxiJudge trains a classifier. This is a binary classification problem: a false detection is labeled as a negative sample, 0, and a correct detection as a positive sample, 1, i.e. for the class y,

y∈{0,1}

Let h_θ(x) be the probability that a sample is positive, so that

0≤h_θ(x)≤1

where θ is the parameter to be optimized, such that when a sample x₀ of unknown class is classified, h_θ(x₀) is the probability that the sample is a positive sample. The classification criteria are as follows:
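One common way to realize such a criterion is a logistic model with a probability cutoff: predict y = 1 when h_θ(x₀) ≥ 0.5 and y = 0 otherwise. The sketch below uses that rule; the logistic form of h_θ, the 0.5 cutoff, the feature vector x describing a picture pair, and the function names are all assumptions, as the source does not give them.

```python
import math

def h_theta(theta, x):
    """Logistic model (an assumption): probability that x is a positive sample."""
    z = sum(t * xi for t, xi in zip(theta, x))
    return 1.0 / (1.0 + math.exp(-z))

def classify(theta, x0, cutoff=0.5):
    """Return 1 (correct detection) if h_theta(x0) >= cutoff, else 0 (false detection)."""
    return 1 if h_theta(theta, x0) >= cutoff else 0

def select_target(theta, candidates):
    """candidates: list of (candidate_id, feature_vector) for each picture pair.
    Rejects candidates classified as 0 and returns the id of the most probable
    survivor, or None if every candidate is rejected."""
    kept = [(h_theta(theta, x), cid) for cid, x in candidates
            if classify(theta, x) == 1]
    return max(kept)[1] if kept else None
```

Choosing the survivor with the highest probability is one simple way to end with a unique target when more than one candidate passes the classifier.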

Because the feature space is high-dimensional and millions of spine images must be matched, many candidates lie at similar distances, which entails a large number of mismatches, and the corresponding ratio is also high. Raising this threshold therefore reduces the number of matching points and can effectively improve the accuracy of spine matching.

S8, after a unique target is left by the false detection rejection model XingxiJudge, finally outputting the spine to be queried.

Example 2

The invention discloses a spine matching method of a book checking system based on vision, which comprises the following steps:

It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article or apparatus that comprises the element.

It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.

While the invention has been described above with reference to various embodiments, it should be understood that many changes and modifications can be made without departing from the scope of the invention. It is therefore intended that the foregoing detailed description be regarded as illustrative rather than limiting, and that it be understood that it is the following claims, including all equivalents, that are intended to define the spirit and scope of this invention. The above examples should be understood as illustrative only and not limiting the scope of the invention. Various changes and modifications to the present invention may be made by one skilled in the art after reading the teachings herein, and such equivalent changes and modifications are intended to fall within the scope of the invention as defined in the appended claims.

Claims

1. A spine matching method of a book checking system based on vision is characterized by comprising the following steps:

step 2, extracting features of the acquired spine image, extracting feature points and descriptors through a preset algorithm, wherein the preset algorithm uses two networks: one is the BaseDetector network, which is used for detecting corner points of basic geometric images; the other is the XingxiPoint model network, which is used for extracting feature points and descriptors;

step 4, automatically labeling feature points, wherein real spine picture data are used and the network trained in step 3 is used to extract corner points;

step 5, auditing the real spine picture data to obtain a new picture and forming a picture pair with a known position relationship, inputting the picture pair into a network, and extracting feature points and descriptors; the features extracted through the preset algorithm are sent to a Flann feature matcher for rough matching, and a feature subset is obtained;

2. The spine matching method of the vision-based book checking system of claim 1, wherein step 5 further comprises: the result of feature matching is a list of correspondences between the two feature sets; before the matching function is called, FLANN trains a matcher in order to improve the matching speed; the features of the query spine set are then matched against the trained matcher one by one, and the feature points of the query spine set match a plurality of suspicious targets.

3. The spine matching method of the vision-based book checking system of claim 1, wherein step 6 further comprises: at a later stage, the false detection rejection model XingxiJudge verifies the correctness of the matching, and if a false detection exists, rejection continues until the optimal match is obtained.

4. The spine matching method of the vision-based book checking system of claim 1, wherein the XingxiPoint model is composed of a BACKBONE module, a PPN module, a ROIPOOL module and a KP module; the KP module is responsible for generating feature points and descriptors; the BACKBONE module is respectively connected with the PPN module and the ROIPOOL module, the PPN module is connected with the ROIPOOL module, and the ROIPOOL module is connected with the KP module.

5. The spine matching method of the vision-based book checking system of claim 1, wherein noise reduction is performed on the spine data set A; feature points of data set A are automatically labeled using affine transformations and the ORB algorithm to obtain a strongly supervised labeled spine data set G; the XingxiPoint model is initialized; XingxiPoint is trained with data set G, the optimizer being the mini-batch gradient descent algorithm; XingxiPoint is applied to the query spine Q to obtain a series of feature points and descriptors; and a plurality of suspicious targets for Q are matched in the library's spine database using the BF matching algorithm.

6. An electronic device, comprising:

a processor; and

a memory for storing executable instructions of the processor;

wherein the processor is configured to perform the spine matching method of the vision-based book checking system of any one of claims 1-5 via execution of the executable instructions.

7. A computer readable storage medium having stored thereon a computer program, which when executed by a processor, implements the spine matching method of the vision-based book checking system of any one of claims 1-5.