CN110909743B - Book checking method and book checking system - Google Patents

Book checking method and book checking system

Info

Publication number
CN110909743B
Authority
CN
China
Prior art keywords: book, image, neural network, convolutional neural, name
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911163170.0A
Other languages
Chinese (zh)
Other versions
CN110909743A (en)
Inventor
章志亮
曹仁杰
李钢
顾明良
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Zhongqi Zhiyuan Digital Information Technology Co ltd
Original Assignee
Beijing Zhongqi Zhiyuan Digital Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Zhongqi Zhiyuan Digital Information Technology Co ltd filed Critical Beijing Zhongqi Zhiyuan Digital Information Technology Co ltd
Priority to CN201911163170.0A
Publication of CN110909743A
Application granted
Publication of CN110909743B
Legal status: Active
Anticipated expiration

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 - Character recognition; recognising digital ink; document-oriented image-based pattern recognition
    • G06V30/10 - Character recognition
    • G06V30/14 - Image acquisition
    • G06V30/148 - Segmentation of character regions
    • G06V30/153 - Segmentation of character regions using recognition of characters or words
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; database structures therefor; file system structures therefor
    • G06F16/50 - Information retrieval of still image data
    • G06F16/58 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/5846 - Retrieval using metadata automatically derived from the content using extracted text
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/40 - Extraction of image or video features
    • G06V10/44 - Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; connectivity analysis, e.g. of connected components
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P - CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 - Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30 - Computing systems specially adapted for manufacturing


Abstract

The invention discloses a book checking method and a book checking system. The method comprises the following steps: obtaining book images in which each image of a single book contains only a single character; inputting each single-character image one by one into a first pre-trained convolutional neural network for character recognition; inputting each character recognition result output by the first network into a book name library in sequence, and confirming whether each book obtains a book name; if a book name is obtained, saving it to an inventory table; if not, inputting the image of the book into a second pre-trained convolutional neural network for image recognition; inputting the image recognition result into the book name library and confirming whether the book now obtains a book name; if so, saving the book name to the inventory table, and otherwise routing the book to a data area to be marked; and counting the book stock according to the names and pricing of the books stored in the inventory table.

Description

Book checking method and book checking system
Technical Field
The invention relates to a book checking method based on image recognition, and to a book checking system adopting the method, belonging to the technical field of book inventory.
Background
For a long time, stocktaking in physical bookstores and book warehouses has been laborious: it is time-consuming, costly in manpower and material resources, and disruptive to business while in progress. Existing book checking schemes fall mainly into the following two types:
The first scheme: as shown in fig. 1, the one-dimensional bar code on each book cover is scanned one by one with a laser bar-code scanning gun or an RF (radio frequency) device, the results are then processed manually one by one, and the scanned data are entered into a computer system. This is currently the approach used by most operators. It is time-consuming and labor-intensive, checking costs are high, and physical bookstores must suspend business, wholly or in part, for the duration of the count.
The second scheme: an electronic tag is affixed to each book in advance before shelving; at checking time, an RFID reader collects the checking data, which are then entered into the computer system. Compared with the first scheme, processing after tagging is faster and data acquisition easier, but the economic cost is excessive, so this scheme is rarely used in the book trade. In addition, because of the limited read rate of electronic tags, checking accuracy cannot be guaranteed, and removing the tags from returned books adds workload.
Disclosure of Invention
In view of the defects of the prior art, the primary technical problem to be solved by the invention is to provide a book checking method based on image recognition.
A further object of the invention is to provide a book checking system adopting the method.
In order to achieve the above purpose, the present invention adopts the following technical scheme:
according to a first aspect of an embodiment of the present invention, there is provided a book checking method, including the steps of:
step S1: obtaining book images in which each image of a single book contains only a single character;
step S2: inputting each single-character book image one by one into a first pre-trained convolutional neural network for character recognition;
step S3: inputting each character recognition result output by the first pre-trained convolutional neural network into a book name library in sequence, and confirming whether each book obtains a book name;
step S4: if a book obtains a book name, saving the book name to an inventory table;
step S5: if a book does not obtain a book name, inputting the image of the book into a second pre-trained convolutional neural network for image recognition;
step S6: inputting the image recognition result output by the second pre-trained convolutional neural network into the book name library, and confirming whether the book that did not obtain a book name now obtains one;
step S7: if the book in step S6 obtains a book name, saving the book name and pricing to the inventory table; otherwise, routing the book to a data area to be marked;
step S8: counting the book stock according to the names and pricing of the books stored in the inventory table.
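The control flow of steps S1 to S8 can be sketched as follows. This is a minimal illustration only: the recognizers are stand-in callables and the title library is a plain dictionary, not the claimed trained networks or database.

```python
from collections import Counter

# Hypothetical sketch of steps S1-S8. `recognize_chars`, `recognize_image`
# and `title_db` are illustrative stand-ins for the two trained networks
# and the book name library.

def check_books(char_images_per_book, book_images, recognize_chars,
                recognize_image, title_db):
    """char_images_per_book: {book_id: [char_image, ...]} (step S1).
    title_db: {title: price}, acting as the book name library."""
    inventory = []          # (title, price) rows (steps S4/S7)
    to_be_marked = []       # books no recognizer could resolve (step S7)
    for book_id, char_images in char_images_per_book.items():
        # S2/S3: character recognition, then title lookup.
        title = "".join(recognize_chars(img) for img in char_images)
        if title in title_db:                        # S3/S4
            inventory.append((title, title_db[title]))
            continue
        # S5/S6: fall back to whole-image recognition.
        title = recognize_image(book_images[book_id])
        if title in title_db:                        # S7
            inventory.append((title, title_db[title]))
        else:
            to_be_marked.append(book_id)             # data area to be marked
    # S8: tally counts and stock value.
    counts = Counter(t for t, _ in inventory)
    value = sum(p for _, p in inventory)
    return inventory, to_be_marked, counts, value
```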
Preferably, before the single-character book images are obtained, a spine image of a bookshelf is acquired and preprocessed.
Preferably, preprocessing the acquired spine image comprises the following sub-steps:
step S10: performing image segmentation on the spine image to obtain single book images;
step S11: performing font segmentation on each single book image to obtain book images each containing a single character.
Preferably, obtaining a single book image comprises the following sub-steps:
step S100: obtaining an effective spine image from the spine image;
step S101: cutting the effective spine image to obtain the single book image.
Preferably, cutting the effective spine image to obtain the single book image comprises the following sub-steps:
step S1010: converting the effective spine image into a grayscale image;
step S1011: filtering the grayscale image;
step S1012: extracting an edge point set of the single book image from the filtered grayscale image;
step S1013: linearizing the edge point set to obtain the effective edge lines of the single book image, along which the single book image is cut.
Wherein preferably the first pre-trained convolutional neural network is trained by the sub-steps of:
step S20: establishing the first convolutional neural network;
step S21: and training the first convolutional neural network to obtain first convolutional neural network model data.
Wherein preferably the second pre-trained convolutional neural network is trained by the sub-steps of:
step S60: establishing the second convolutional neural network;
step S61: and training the second convolutional neural network to obtain second convolutional neural network model data.
Preferably, the first convolutional neural network and the second convolutional neural network each comprise, in order, an input layer, N alternating convolutional and pooling layers, a fully connected layer and an output layer, wherein N is a positive integer;
the corresponding convolutional neural network model data are obtained by setting the filter parameters of each convolutional layer and the pooling length of each pooling layer.
According to a second aspect of the embodiment of the present invention, there is provided a book checking system including a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the following steps when executing the program:
step S1: obtaining book images in which each image of a single book contains only a single character;
step S2: inputting each single-character book image one by one into a first pre-trained convolutional neural network for character recognition;
step S3: inputting each character recognition result output by the first pre-trained convolutional neural network into a book name library in sequence, and confirming whether each book obtains a book name;
step S4: if a book obtains a book name, saving the book name to an inventory table;
step S5: if a book does not obtain a book name, inputting the image of the book into a second pre-trained convolutional neural network for image recognition;
step S6: inputting the image recognition result output by the second pre-trained convolutional neural network into the book name library, and confirming whether the book that did not obtain a book name now obtains one;
step S7: if the book in step S6 obtains a book name, saving the book name and pricing to the inventory table; otherwise, routing the book to a data area to be marked;
step S8: counting the book stock according to the names and pricing of the books stored in the inventory table.
According to a third aspect of embodiments of the present invention, there is provided a machine readable medium having stored thereon a computer program which when executed by a processor performs the steps of:
step S1: obtaining book images in which each image of a single book contains only a single character;
step S2: inputting each single-character book image one by one into a first pre-trained convolutional neural network for character recognition;
step S3: inputting each character recognition result output by the first pre-trained convolutional neural network into a book name library in sequence, and confirming whether each book obtains a book name;
step S4: if a book obtains a book name, saving the book name to an inventory table;
step S5: if a book does not obtain a book name, inputting the image of the book into a second pre-trained convolutional neural network for image recognition;
step S6: inputting the image recognition result output by the second pre-trained convolutional neural network into the book name library, and confirming whether the book that did not obtain a book name now obtains one;
step S7: if the book in step S6 obtains a book name, saving the book name and pricing to the inventory table; otherwise, routing the book to a data area to be marked;
step S8: counting the book stock according to the names and pricing of the books stored in the inventory table.
In the book checking method provided by the invention, a collected image of the books placed on a bookshelf is preprocessed to form single book images, and each single book image is then segmented into single characters; the characters are recognized with one convolutional neural network, and the single book images with another; database techniques then ensure that the recognized book name information reaches an accuracy fit for application. The method effectively addresses the pain points of bookstore inventory and shelf management, greatly improving checking efficiency and reducing checking cost.
Drawings
FIG. 1 is a schematic diagram of a device for checking a book in the prior art;
FIG. 2 is a schematic diagram of a spine image of a book collected in the book checking method provided by the invention;
FIG. 3 is a schematic diagram of an effective spine image cut from an acquired spine image in the book checking method provided by the invention;
FIG. 4 is a schematic diagram of a single book image obtained by segmenting an effective spine image in the book checking method provided by the invention;
FIG. 5 is a schematic flow chart of the book checking method provided by the invention;
fig. 6 is a schematic diagram of a book checking system according to the present invention.
Detailed Description
The technical contents of the present invention will be described in further detail with reference to the accompanying drawings and specific examples.
The book checking method provided by the invention realizes rapid book checking based on artificial-intelligence visual recognition, for books placed on bookshelves and goods spaces in physical bookstore sales areas and book warehouses. As shown in fig. 5, the book checking method includes the following steps:
step S1: book images are obtained in which each image of a single book contains only a single character.
Before the single-character book images are obtained, a spine image of a bookshelf is acquired and preprocessed.
As shown in fig. 2, the spines of the books displayed on a bookshelf are photographed with a mobile phone, camera, high-definition camera or other dedicated image-acquisition equipment to obtain a spine image, and the acquired spine image is preprocessed to yield book images each containing a single character. The preprocessing comprises the following sub-steps:
step S10: and carrying out image segmentation on the acquired spine image of the book to obtain a single book image.
Because the acquired spine image of the book contains all books displayed on a bookshelf, if the number of each book is counted, a single book image is required to be acquired from the acquired spine image of the book; the method for obtaining the single book image comprises the following substeps:
step S100: an effective spine image is obtained from the acquired spine image of the book.
Because of the randomness of data collection, not all pictures of the collected spine image of the book are valid data, so that the valid images contained in the collected spine image of the book need to be segmented.
The process of obtaining the effective image from the acquired spine image of the book comprises the following steps: copying the acquired spine images of the books to obtain two spine images of the same books; converting a spine image of one book into a gray image by utilizing a color space conversion function, wherein in the gray image, the gray value of the book part is continuously changed due to the fact that the background has the same gray value, namely the gray change rate of the background is relatively stable, and the gray value of the book part is relatively high due to the fact that the book part is provided with characters or color small pictures; then, transversely scanning the book spine gray level image of the book, extracting vertex coordinates of an effective image from the book spine gray level image of the book according to the limit between the gray level change rate of the background and the gray level change rate of the book part, and obtaining coordinates (upper left, lower left, upper right and lower right) of four vertices of the effective image; as shown in fig. 4, the effective spine image is cut out from the spine map of another book according to the vertex coordinates of the effective image.
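A hedged sketch of step S100 follows: the book-bearing region is located by contrasting the background's stable gray values with the high-variation book area, and the bounding vertices are reported. The variance threshold and the synthetic test image are illustrative assumptions, not values from the patent.

```python
import numpy as np

# Illustrative sketch of step S100: scan a grayscale spine photo and find
# the effective region where gray values vary (books), as opposed to the
# stable-gray background. `var_thresh` is an assumed tuning parameter.

def effective_region(gray, var_thresh=10.0):
    """gray: 2-D array. Returns (top, bottom, left, right) bounds."""
    col_var = gray.var(axis=0)        # horizontal scan: variation per column
    row_var = gray.var(axis=1)        # variation per row
    cols = np.where(col_var > var_thresh)[0]
    rows = np.where(row_var > var_thresh)[0]
    if cols.size == 0 or rows.size == 0:
        return None                   # no effective region found
    return rows[0], rows[-1], cols[0], cols[-1]
```

The four returned bounds correspond to the four vertex coordinates (upper left, lower left, upper right, lower right) described above.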
Step 101: and cutting the effective spine image to obtain a single book image.
And obtaining a single book image according to the effective spine image obtained in the step S100, wherein the method comprises the following substeps.
Step S1010: the effective spine image is converted to a grayscale image.
The effective spine image is converted into a grayscale image using a color-space conversion function.
step S1011: the grayscale image is filtered.
The grayscale image converted from the effective spine image is enhanced in the y-axis direction of each image region according to the two-dimensional Gaussian function

G(x, y) = (1 / (2πσ²)) · e^(-(x² + y²) / (2σ²))

wherein e is the natural constant serving as the base of the exponential function, σ is the standard deviation, and x and y are coordinate values on the x-axis and y-axis of the whole effective spine image. According to the degree of gray-value enhancement actually required, different standard deviation values are applied to the pixels of each segmented image region in the x and y directions, so that the pixel values along x or y are correspondingly increased or decreased, achieving the enhancement of the effective spine image in the y direction.
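A minimal numpy sketch of the step-S1011 filter follows. It builds a normalized 2-D Gaussian kernel with separate x and y standard deviations (matching the text's use of different deviations per direction) and applies it with a naive same-size filter. The kernel size and sigma values are illustrative assumptions.

```python
import numpy as np

# Sketch of step S1011: anisotropic 2-D Gaussian smoothing,
# G(x, y) ∝ exp(-(x²/(2·sx²) + y²/(2·sy²))), normalized to sum to 1.
# sx != sy weights the y direction differently, as described in the text.

def gaussian_kernel(size, sx, sy):
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    k = np.exp(-(x**2 / (2 * sx**2) + y**2 / (2 * sy**2)))
    return k / k.sum()                 # normalize so the filter preserves mean

def filter2d(img, kernel):
    """Naive same-size correlation with edge padding (for clarity, not speed)."""
    kh, kw = kernel.shape
    pad = np.pad(img, ((kh // 2,), (kw // 2,)), mode="edge")
    out = np.zeros_like(img, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = (pad[i:i + kh, j:j + kw] * kernel).sum()
    return out
```

In practice a library routine (e.g. an OpenCV Gaussian blur) would replace the explicit loops; the sketch only makes the kernel construction visible.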
Step S1012: and extracting an edge point set of the single book image from the gray level image subjected to filtering.
And extracting an edge point set of the single book image by setting the ranges of the high value and the low value of the edge extraction function so as to highlight the local edge in the single book image.
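The high/low double-threshold idea can be sketched as follows in numpy. This is a simplified stand-in for a Canny-style extractor: strong gradient points are kept outright, and weak points survive only if they touch a strong point. The threshold values are illustrative assumptions.

```python
import numpy as np

# Sketch of step S1012: edge-point extraction with high/low thresholds
# on the gradient magnitude. A single relaxation pass stands in for full
# hysteresis tracking; `low` and `high` are assumed tuning values.

def edge_points(gray, low, high):
    gy, gx = np.gradient(gray.astype(float))
    mag = np.hypot(gx, gy)
    strong = mag >= high
    weak = (mag >= low) & ~strong
    keep = strong.copy()
    for di in (-1, 0, 1):             # promote weak points next to strong ones
        for dj in (-1, 0, 1):
            shifted = np.roll(np.roll(strong, di, axis=0), dj, axis=1)
            keep |= weak & shifted
    return np.argwhere(keep)          # (row, col) edge-point set
```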
Step S1013: and linearizing the edge point set of the single book image to obtain an effective edge line of the single book image so as to cut the single book image.
The boundary formed by the set of edge points of the extracted single book image may be widened or intermittent at some point due to the presence of noise and blurring. Therefore, a linear algorithm is needed to remove some edge points or fill up edge break points from the edge point set of the single book image, and the edge points are connected into a complete line to obtain an effective edge line of the single book image. Specifically, the linearization process of the edge point set of the single book image comprises the following steps: establishing a rectangular coordinate system, bringing the coordinates of each edge point into the rectangular coordinate system one by one from the first edge point to obtain a slope, eliminating edge points outside the line where the slope exists, filling in edge break points, and obtaining an effective edge line of the single book image. As shown in fig. 5, the single book image is cut out according to the effective edge line of the single book image.
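The slope-based linearization can be sketched as follows. A least-squares line stands in for the slope estimation, outlying edge points are dropped by a residual tolerance, and break points are filled by regenerating the line at every integer x in the span. The tolerance value is an assumption.

```python
import numpy as np

# Illustrative sketch of step S1013: fit a line to the edge points,
# eliminate points off the line, and fill break points along it.
# `tol` (max residual to keep a point) is an assumed parameter.

def linearize(points, tol=1.5):
    pts = np.asarray(points, dtype=float)       # rows of (x, y)
    x, y = pts[:, 0], pts[:, 1]
    slope, intercept = np.polyfit(x, y, 1)      # y = slope*x + intercept
    resid = np.abs(y - (slope * x + intercept))
    kept = pts[resid <= tol]                    # drop stray edge points
    # fill break points: regenerate y for every integer x in the kept span
    xs = np.arange(kept[:, 0].min(), kept[:, 0].max() + 1)
    return np.column_stack([xs, slope * xs + intercept])
```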
Step 11: and carrying out font segmentation on the single book image to obtain the book image of which the single book only contains single characters.
Because of the different design styles of each book and the different thicknesses of the books, the single book image obtained in the step S10 needs to be preprocessed and then is subjected to font segmentation, so that the book image with single characters is obtained.
Specifically, if the book in the single book image is a thin book, in order to avoid influencing the recognition of fonts due to the thinner thickness of the spine of the thin book, all pixels in the single book image need to be transversely enlarged so as to enlarge the whole single book image. If the fonts in the single book image are double-row fonts, the fonts can be separated from each other or can be tightly formed into a column, the coordinates of each font in the single book image are extracted, an intermediate file is generated, the intermediate file comprises single-column fonts and double-column fonts, and the double-column fonts are cut into the single-column fonts according to the principle that the pixel values between the left-row font and the right-row font are unchanged. And finding out the start and stop positions of each font by adopting a horizontal projection method for single-column fonts in the obtained single book image so as to be convenient for cutting out the book image of which the single book only contains single characters according to the start and stop positions of each font.
Step S2: and respectively inputting the book images of each single character into a first pre-trained convolutional neural network one by one to perform character recognition.
By respectively inputting the book images of each single character into the first pre-trained convolutional neural network, the identification of the words with the highest probability of occurrence in the word stock data set from the book images of each single character can be realized, namely, the words with the highest probability of occurrence in the word stock data set of the words related to the book name are identified. Since the name, author and publisher of each book are generally printed on the spine of the book, the names of books referred to in the present invention mainly refer to the name, author and publisher of each book. The first pre-trained convolutional neural network can be used for identifying the characters with highest probability of appearing in the database data set of the names, authors and publishers printed on the spine of each book.
The word stock data set is formed by the following steps: generating a label file related to coding according to the content of the existing book name file (the existing characters in the existing book name library), and obtaining a word library file; in the tag file, each character corresponds to a code, corresponding characters are found out from a character library according to the codes in the tag file, the characters are converted into character images, a character library image file is obtained, and the images corresponding to each character in the character library image file are processed by rotation, noise adding and the like to generate a character library data set. For example, taking a word as an example, the word has multiple fonts, each font corresponds to a word image, that is, each word has multiple word images, and each word image of the word is processed by rotation, noise, and the like to generate a data set of the word. Thus, the generated font data set is a processed set of text images of the characters already in the existing book name library.
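The augmentation of the word-stock data set can be sketched as follows. For simplicity, salt-and-pepper noise and one-pixel shifts stand in for the rotation-and-noise processing the text mentions; the perturbation set and counts are illustrative assumptions.

```python
import numpy as np

# Hedged sketch of the word-stock augmentation: each binary glyph image is
# multiplied into several perturbed training samples. Shifts and pixel-flip
# noise are simplified stand-ins for the described rotation/noise steps.

def augment(glyph, n_noisy=3, noise_frac=0.05, seed=0):
    rng = np.random.default_rng(seed)
    out = [glyph]                                    # original sample
    for axis, shift in ((0, 1), (0, -1), (1, 1), (1, -1)):
        out.append(np.roll(glyph, shift, axis=axis)) # 1-px shifted copies
    for _ in range(n_noisy):
        noisy = glyph.copy()
        mask = rng.random(glyph.shape) < noise_frac
        noisy[mask] = 1 - noisy[mask]                # flip binary pixels
        out.append(noisy)
    return out
```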
The first pre-trained convolutional neural network is trained by the sub-steps of:
step S20: a first convolutional neural network is established.
In this step, the first convolutional neural network is built to comprise, in order, an input layer, N alternating convolutional and pooling layers, a fully connected layer and an output layer, wherein N is a positive integer, and the number of convolutional and pooling layers is set according to the required neural network model data. For example, as shown in the figure, in one specific embodiment of the invention the first convolutional neural network contains 2 alternating convolutional and pooling layers, i.e. it comprises, in order, an input layer, a convolutional layer, a pooling layer, a second convolutional layer, a second pooling layer, a fully connected layer and an output layer.
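The layer ordering for N = 2 can be sketched with a toy numpy forward pass. All weights are random and untrained, and the image size, filter counts and class count are illustrative assumptions; the sketch only demonstrates the shape flow, not the patent's actual model data.

```python
import numpy as np

# Toy forward pass mirroring the described architecture for N = 2:
# input -> conv -> pool -> conv -> pool -> fully connected -> output.

def conv(x, k):
    """x: (C, H, W) input; k: (O, C, kh, kw) filters. 'Valid' conv + ReLU."""
    O, C, kh, kw = k.shape
    H, W = x.shape[1] - kh + 1, x.shape[2] - kw + 1
    out = np.zeros((O, H, W))
    for o in range(O):
        for i in range(H):
            for j in range(W):
                out[o, i, j] = (x[:, i:i + kh, j:j + kw] * k[o]).sum()
    return np.maximum(out, 0)                     # ReLU activation

def maxpool(fmaps, s=2):
    c, h, w = fmaps.shape
    f = fmaps[:, :h - h % s, :w - w % s]          # trim odd borders
    return f.reshape(c, f.shape[1] // s, s, f.shape[2] // s, s).max(axis=(2, 4))

def forward(img, k1, k2, w_fc):
    x = maxpool(conv(img, k1))                    # conv layer 1 + pool layer 1
    x = maxpool(conv(x, k2))                      # conv layer 2 + pool layer 2
    logits = w_fc @ x.ravel()                     # fully connected layer
    e = np.exp(logits - logits.max())
    return e / e.sum()                            # softmax output layer
```

With an assumed 28x28 single-character input, 4 then 8 filters of size 3x3 yield an 8x5x5 feature map, so the fully connected weight matrix has shape (num_classes, 200).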
Step S21: training the established first convolutional neural network to obtain first convolutional neural network model data.
The training data set and the test data set are input into the established first convolutional neural network, and the required first convolutional neural network model data are obtained after multiple rounds of training. During training, the optimal first convolutional neural network model data are obtained by setting the filter parameters of each convolutional layer and the pooling length of each pooling layer. Both sets are drawn from the word-stock data set, with the split ratio chosen according to the size of the word-stock image file. For example, 80% of the word-stock data set may serve as the training data set and 20% as the test data set.
Step S3: and sequentially inputting each character recognition result output by the first pre-trained convolutional neural network into a book name library, and confirming whether each book obtains a book name.
Each character recognition result output by the first pre-trained convolutional neural network means that the characters with the highest probability of appearing in the database data set of the names, authors and publishers printed on the spine of each book are recognized through the first pre-trained convolutional neural network. And (3) bringing the first character of each identified book into a book name library to find out all the book names containing the characters, starting from the second character of each identified book, and bringing the next character into the book name containing the last character found out by the last character, thereby further reducing the found book name range and finally obtaining the book name of each book. The obtained book name of each book comprises the name of the book, the author and the publishing company. And according to the obtained book names of each book, pricing of the book can be found out from a book name library so as to facilitate subsequent inventory of the books.
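The progressive narrowing can be sketched as follows: the first recognized character selects all titles containing it, and each subsequent character keeps only titles in which it appears after the previous match. The sample title library and the exact ordering rule are illustrative assumptions.

```python
# Sketch of the step-S3 narrowing over a {title: price} library.
# `chars` is the sequence of recognized characters in spine order.

def narrow_title(chars, title_db):
    candidates = {t for t in title_db if chars and chars[0] in t}
    for prev, cur in zip(chars, chars[1:]):
        # keep titles where `cur` occurs after the first occurrence of `prev`
        candidates = {t for t in candidates
                      if prev in t and cur in t[t.index(prev) + 1:]}
    return candidates
```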
Step S4: if a book is obtained, the book name of the book is saved to the inventory table.
If the book name of a book can be obtained according to the method of the step S3, the book name of the book and the pricing found in the book name library according to the book name are respectively stored in the inventory table.
Step S5: if a book does not obtain a book name, inputting an image of the book into a second pre-trained convolutional neural network for image recognition;
if the book name of a book is not obtained according to the method of the step S3, the image of the book is input into a second pre-trained convolutional neural network for image recognition.
Step S6: and inputting the image recognition result output by the second pre-trained convolutional neural network into a book name library, and confirming whether a book which does not obtain the book name obtains the book name or not.
And (3) respectively inputting the single book images obtained in the step (S10) into a second pre-trained convolutional neural network, and confirming whether the single book images can be found from the image data set, so that the book name and pricing information of each book are confirmed from the image data set.
The image data set is formed by the following steps: and (3) marking the single book image obtained in the step (S10) respectively, wherein the marked content is the book number and the book name of each book. All the marked book images form an image data set.
The second pre-trained convolutional neural network is trained by the sub-steps of:
step S60: a second convolutional neural network is established.
In this step, the established second convolutional neural network comprises, in order, an input layer, N alternating convolutional and pooling layers, a fully connected layer and an output layer, wherein N is a positive integer, and the number of convolutional and pooling layers is set according to the required neural network model data.
Step S61: training the established second convolutional neural network to obtain second convolutional neural network model data.
The training image data set and the test image data set are respectively input into the established second convolutional neural network, and the required second convolutional neural network model data is obtained after multiple rounds of training. During training, the optimal model data is obtained by setting the filtering parameters of each convolutional layer and the pooling length of each pooling layer. The training and test image data sets are both drawn from the image data set, and their proportions are chosen according to the size of the image data set. For example, 70% of the image data set is used as the training image data set and 30% as the test image data set.
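The proportional split into training and test subsets can be sketched as a shuffled partition; the 70/30 ratio follows the example above, while the seeded shuffle is an implementation choice for reproducibility, not something the patent specifies:

```python
import random

def split_image_dataset(samples, train_ratio=0.7, seed=0):
    """Partition the labeled image data set into training and test subsets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]
```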
Step S7: if the book in the step S6 can obtain the book name, the book name and pricing are saved to an inventory table; otherwise, entering the data area to be marked.
If the book name of a book can be obtained according to the method of the step S6, the book name of the book and the pricing found in the book name library according to the book name are respectively stored in the inventory table.
If a book still cannot obtain its name after image comparison by the second convolutional neural network, its image enters the data area to be labeled. The images in this area are labeled manually and then merged into the image data set, so that the next time the same book image is encountered, the book name can be obtained from the updated image data set.
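The merge of manually labeled images back into the image data set can be sketched as follows; the pending queue and label mapping are hypothetical structures standing in for the data area to be labeled:

```python
def merge_manual_labels(image_dataset, pending_images, manual_labels):
    """Fold manually labeled images into the data set; keep the rest pending.

    pending_images: list of (image_id, image) awaiting manual labeling.
    manual_labels: mapping of image_id -> {"book_number": ..., "book_name": ...}.
    """
    updated = list(image_dataset)
    still_pending = []
    for image_id, image in pending_images:
        label = manual_labels.get(image_id)
        if label is None:
            still_pending.append((image_id, image))
        else:
            updated.append({"image": image, **label})
    return updated, still_pending
```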
Step S8: counting the book inventory according to the names and pricing of the books stored in the inventory table.
From the names of the books stored in the inventory table and the pricing of each book, the number of copies of each book can be counted, and from the pricing the value of the book inventory can be obtained. Copies with the same title, author, and publisher are counted as the same book.
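The counting rule above (copies with the same title, author, and publisher are the same book) can be sketched as a grouped tally over the inventory table; the sample rows in the usage are invented:

```python
from collections import Counter

def count_inventory(inventory_table):
    """Count copies per (title, author, publisher) and total stock value."""
    copies = Counter(
        (row["name"], row["author"], row["publisher"])
        for row in inventory_table
    )
    total_value = sum(row["pricing"] for row in inventory_table)
    return copies, total_value
```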
The book checking method provided by the invention preprocesses a collected image of the books placed on a given bookshelf to form images of single books, and then performs character segmentation on each single book image to obtain single characters; the characters are recognized with the corresponding convolutional neural network, and the single book images are recognized as well; database techniques then keep the recognized book name information at an accurate, application-ready level. In this way, application pain points in bookstore inventory and shelf management can be effectively solved, inventory efficiency is greatly improved, and inventory cost is reduced.
As shown in fig. 6, an embodiment of the present invention further provides a book checking system, comprising a memory 61 and a processor 62. The memory 61 stores a computer control program which, when executed by the processor 62, realizes the steps of the book checking method provided by the present invention (steps S1 to S8 described above). In addition, an embodiment of the invention also provides a machine-readable medium storing a computer control program which, when executed by a processor, implements the steps of the book checking method provided by the invention (steps S1 to S8 described above).
Those of ordinary skill in the art will appreciate that all or some of the steps, systems, and functional modules/units in the apparatus and methods disclosed above may be implemented as software, firmware, hardware, or suitable combinations thereof. In a hardware implementation, the division between the functional modules/units mentioned in the above description does not necessarily correspond to the division of physical components; for example, one physical component may have multiple functions, or one function or step may be performed cooperatively by several physical components. Some or all of the components may be implemented as software executed by a processor, such as a digital signal processor or microprocessor, as hardware, or as an integrated circuit, such as an application-specific integrated circuit. Such software may be distributed on machine-readable media (e.g., computer-readable media), which may include computer storage media (or non-transitory media) and communication media (or transitory media). As is known to those skilled in the art, the term computer storage media includes both volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer-readable instructions, data structures, program modules, or other data. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, Digital Versatile Discs (DVDs) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by a computer.
Furthermore, as is well known to those of ordinary skill in the art, communication media typically embodies computer readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media.
The book checking method and the book checking system provided by the invention have been described in detail above. Any obvious modifications made by those skilled in the art without departing from the spirit of the present invention fall within the scope of the present patent claims.

Claims (10)

1. The book checking method is characterized by comprising the following steps:
step S1: obtaining a book image of which a single book only contains a single character;
before a book image of a single book only containing single characters is obtained, a book spine image of a bookshelf is obtained for preprocessing;
the step S1 specifically includes:
step S10: carrying out image segmentation on the book spine image to obtain a single book image;
step S11: carrying out font segmentation on the single book image according to the start and stop positions of each font to obtain a book image of which the single book only contains single characters;
step S2: respectively inputting book images of each single character into a first pre-trained convolutional neural network one by one for character recognition;
step S3: sequentially inputting each character recognition result output by the first pre-trained convolutional neural network into a book name library, and confirming whether each book can obtain a book name;
step S4: if a book gets a book name, the book name of the book is saved to an inventory table;
step S5: if a book does not obtain a book name, inputting an image of the book into a second pre-trained convolutional neural network for image recognition;
step S6: inputting the image recognition result output by the second pre-trained convolutional neural network into a book name library, and confirming whether a book which does not obtain a book name obtains the book name or not;
step S7: if the book in the step S6 obtains the book name, the book name and pricing are saved to an inventory table; otherwise, the book image enters a data area to be labeled;
step S8: counting the book inventory according to the names and pricing of the books stored in the inventory table.
2. The book checking method of claim 1, wherein in the step S11, if the book in the single book image is a thin book, all pixels in the single book image are enlarged laterally, so that the thin thickness of the spine of the thin book does not impair recognition of the fonts.
3. The book checking method according to claim 1, wherein in the step S11, if the fonts in the single book image are double-row fonts, coordinates of each font in the single book image are extracted to generate an intermediate file, the intermediate file includes single-row fonts and double-row fonts, the double-row fonts are cut into single-row fonts according to the principle that pixel values between the left-row and right-row fonts are unchanged, the single-row fonts in the obtained single book image are horizontally projected, and start and stop positions of each font are found out, so that the single book image only including single characters is cut out according to the start and stop positions of each font.
4. The book checking method of claim 1, wherein the process of obtaining the single book image comprises the sub-steps of:
step S100: obtaining an effective spine image from the spine image of the book;
step S101: cutting the effective spine image to obtain a single book image.
5. The book checking method of claim 4, wherein the cutting of the effective spine image to obtain the single book image comprises the sub-steps of:
step S1010: converting the effective spine image into a gray image;
step S1011: filtering the gray level image;
step S1012: extracting an edge point set of the single book image from the gray level image;
step S1013: and linearizing the edge point set of the single book image to obtain an effective edge line of the single book image so as to cut out the single book image.
6. The book inventory method of claim 1, characterized in that the first pre-trained convolutional neural network is trained by the sub-steps of:
step S20: establishing the first convolutional neural network;
step S21: and training the first convolutional neural network to obtain first convolutional neural network model data.
7. The book inventory method of claim 1, characterized in that the second pre-trained convolutional neural network is trained by the sub-steps of:
step S60: establishing the second convolutional neural network;
step S61: and training the second convolutional neural network to obtain second convolutional neural network model data.
8. The book checking method of claim 1, wherein:
the first convolutional neural network and the second convolutional neural network sequentially comprise an input layer, N convolutional layers and pooling layers which are alternately arranged, a full-connection layer and an output layer, wherein N is a positive integer;
and obtaining corresponding convolutional neural network model data by setting the filtering parameters of each convolutional layer and the pooling length of each pooling layer.
9. A book checking system comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, characterized in that: the processor, when executing the program, performs the steps of:
step S1: obtaining a book image of which a single book only contains a single character;
step S2: respectively inputting book images of each single character into a first pre-trained convolutional neural network one by one for character recognition;
step S3: sequentially inputting each character recognition result output by the first pre-trained convolutional neural network into a book name library, and confirming whether each book can obtain a book name;
step S4: if a book gets a book name, the book name of the book is saved to an inventory table;
step S5: if a book does not obtain a book name, inputting an image of the book into a second pre-trained convolutional neural network for image recognition;
step S6: inputting the image recognition result output by the second pre-trained convolutional neural network into a book name library, and confirming whether a book which does not obtain a book name obtains the book name or not;
step S7: if the book in the step S6 obtains the book name, the book name and pricing are saved to an inventory table; otherwise, the book image enters a data area to be labeled;
step S8: counting the book inventory according to the names and pricing of the books stored in the inventory table.
10. A machine readable medium having stored thereon a computer program, characterized in that the program when executed by a processor performs the steps of:
step S1: obtaining a book image of which a single book only contains a single character;
step S2: respectively inputting book images of each single character into a first pre-trained convolutional neural network one by one for character recognition;
step S3: sequentially inputting each character recognition result output by the first pre-trained convolutional neural network into a book name library, and confirming whether each book can obtain a book name;
step S4: if a book gets a book name, the book name of the book is saved to an inventory table;
step S5: if a book does not obtain a book name, inputting an image of the book into a second pre-trained convolutional neural network for image recognition;
step S6: inputting the image recognition result output by the second pre-trained convolutional neural network into a book name library, and confirming whether a book which does not obtain a book name obtains the book name or not;
step S7: if the book in the step S6 obtains the book name, the book name and pricing are saved to an inventory table; otherwise, the book image enters a data area to be labeled;
step S8: counting the book inventory according to the names and pricing of the books stored in the inventory table.
CN201911163170.0A 2019-11-25 2019-11-25 Book checking method and book checking system Active CN110909743B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911163170.0A CN110909743B (en) 2019-11-25 2019-11-25 Book checking method and book checking system


Publications (2)

Publication Number Publication Date
CN110909743A CN110909743A (en) 2020-03-24
CN110909743B true CN110909743B (en) 2023-08-11

Family

ID=69819034

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911163170.0A Active CN110909743B (en) 2019-11-25 2019-11-25 Book checking method and book checking system

Country Status (1)

Country Link
CN (1) CN110909743B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111797939A (en) * 2020-07-20 2020-10-20 天津中德应用技术大学 Intelligent recognition system and method for deep learning of unmanned library based on wavelet analysis
CN111898555B (en) * 2020-07-31 2023-05-19 上海交通大学 Book checking identification method, device, equipment and system based on images and texts
CN112395939A (en) * 2020-09-07 2021-02-23 江苏感创电子科技股份有限公司 Book checking method and system
CN113569871A (en) * 2021-08-03 2021-10-29 内蒙古工业大学 Library automatic book-making method and system based on deep learning

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2012208591A (en) * 2011-03-29 2012-10-25 Visual Japan Inc Book inventory confirmation system
CN106611174A (en) * 2016-12-29 2017-05-03 成都数联铭品科技有限公司 OCR recognition method for unusual fonts
CN109241374A (en) * 2018-06-07 2019-01-18 广东数相智能科技有限公司 A kind of book information library update method and books in libraries localization method
CN109800749A (en) * 2019-01-17 2019-05-24 湖南师范大学 A kind of character recognition method and device
CN110321894A (en) * 2019-04-23 2019-10-11 浙江工业大学 A kind of library book method for rapidly positioning based on deep learning OCR

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9977955B2 (en) * 2014-06-19 2018-05-22 Rakuten Kobo, Inc. Method and system for identifying books on a bookshelf




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant