CN111524148A - Book page identification method and device, electronic equipment and storage medium - Google Patents
- Publication number
- CN111524148A
- Authority
- CN
- China
- Prior art keywords
- detected
- picture
- book page
- book
- pages
- Prior art date
- Legal status
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/11—Region-based segmentation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/12—Edge-based segmentation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10004—Still image; Photographic image
Landscapes
- Engineering & Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Image Analysis (AREA)
Abstract
The invention belongs to the technical field of computers and discloses a method and a device for identifying book pages, electronic equipment, and a storage medium. The method comprises the following steps: identifying at least one book page in a picture to be detected by performing instance segmentation on the picture to be detected; performing coordinate regression on each book page; and mapping the regression coordinate values of each book page back to the picture to be detected to obtain the position of each book page in the picture to be detected. With this scheme, one or more book pages can be identified and detected accurately and efficiently.
Description
Technical Field
The present invention relates to the field of computer technologies, and in particular, to a method and an apparatus for identifying pages of a book, an electronic device, and a storage medium.
Background
Book page detection products currently on the market generally extract contours with a traditional image algorithm to obtain the book edges, and then calculate the corner points of the book page to obtain the book page area; this approach has poor resistance to interference. Alternatively, they detect the book edges by regressing the corner points of the book page with a regression algorithm; if the corners of the book page are occluded or some corners fall outside the image area, the corner coordinates may regress to the wrong positions, so detection is inaccurate. In addition, prior-art book page detection products detect only a single book page and cannot detect multiple book pages at the same time.
Disclosure of Invention
The invention aims to provide a method and a device for identifying book pages, electronic equipment, and a storage medium that can detect one or more book pages accurately and efficiently.
The technical scheme provided by the invention is as follows:
In one aspect, a method for identifying book pages is provided, which includes the steps of:
identifying at least one book page in the picture to be detected by performing instance segmentation on the picture to be detected;
performing coordinate regression on each book page; and
mapping the regression coordinate values of each book page back to the picture to be detected to obtain the position of each book page in the picture to be detected.
In this scheme, based on the idea of image instance segmentation, each book page is segmented first, so one or more book pages can be detected and identified. This improves the accuracy of book page detection, and because multiple book pages are detected at the same time, it also improves detection efficiency.
Further preferably, identifying at least one book page in the picture to be detected by performing instance segmentation on the picture to be detected includes:
framing the different instances in the picture to be detected by a target detection method, and identifying whether at least one book page exists in the picture to be detected; and
when at least one book page exists, acquiring the rectangular area coordinates of each book page and the corresponding mask map.
Further preferably, performing coordinate regression on each book page includes:
cutting the mask map into at least one mask sub-image according to the rectangular area coordinates; and performing corner coordinate regression on the mask sub-image corresponding to each book page to obtain the corner abscissas and corner ordinates of that book page.
Further preferably, mapping the regression coordinate values of each book page back to the picture to be detected to obtain the position of each book page in the picture to be detected includes:
mapping the corner abscissas and corner ordinates of the book page corresponding to each mask sub-image back to the picture to be detected according to the rectangular area coordinates, to obtain a coordinate array of each book page in the picture to be detected; and
obtaining the position of each book page in the picture to be detected according to the coordinate array of that book page in the picture to be detected.
In this scheme, the image instance segmentation algorithm turns global regression over the whole image into regression over local sub-images, and the regression is performed on mask maps, so most interference from non-book areas is removed and the accuracy of identifying the four corners of a book is improved. The scheme accurately handles both the case of a single book page and the case of multiple book pages.
Further preferably, after framing the different instances in the picture to be detected by the target detection method and identifying whether at least one book page exists in the picture to be detected, the method further includes the step of: when no book page exists, exiting the identification of the picture to be detected.
Further preferably, before identifying at least one book page in the picture to be detected by performing instance segmentation on the picture to be detected, the method includes the step of:
inputting the picture to be detected, and emptying the coordinate array identified in the picture to be detected.
In this scheme, emptying means that the coordinate array is set to empty, that is, no coordinates are recorded, and memory is allocated for caching the coordinate array.
In another aspect, an apparatus for identifying book pages is provided, including:
an identification module, configured to identify at least one book page in the picture to be detected by performing instance segmentation on the picture to be detected;
a regression processing module, configured to perform coordinate regression on each book page; and
an acquisition module, configured to map the regression coordinate values of each book page back to the picture to be detected to obtain the position of each book page in the picture to be detected.
Further preferably, the identification module is further configured to:
frame the different instances in the picture to be detected by a target detection method, and identify whether at least one book page exists in the picture to be detected; and when at least one book page exists, acquire the rectangular area coordinates of each book page and the corresponding mask map.
In this scheme, based on the idea of image instance segmentation, each book page is segmented first, so one or more book pages can be detected and identified. This improves the accuracy of book page detection, and because multiple book pages are detected at the same time, it also improves detection efficiency.
In another aspect, an electronic device is also provided, which includes:
a processor; and a memory storing computer-executable instructions that, when executed, cause the processor to perform the method for identifying book pages.
In another aspect, a storage medium is provided, in which at least one instruction is stored, the instruction being loaded and executed by a processor to implement the operations performed by the method for identifying book pages.
The technical effects of the invention are as follows: based on the idea of image instance segmentation, each book page is segmented first, so identification of one or more book pages is supported. Meanwhile, the image instance segmentation algorithm turns global regression over the whole image into regression over local sub-images, and the regression is performed on mask maps, so most interference from non-book-page areas is removed and the accuracy of book page identification is improved.
Drawings
The invention is described in further detail below with reference to the following figures and detailed description:
FIG. 1 is a schematic flow chart diagram illustrating an embodiment of a method for identifying pages of a book in accordance with the present invention;
FIG. 2 is a schematic flow chart diagram illustrating a method for identifying pages of a book in accordance with another embodiment of the present invention;
FIG. 3 is a schematic flow chart diagram illustrating a method for identifying pages of a book according to yet another embodiment of the present invention;
FIG. 4 is a schematic structural diagram of a device for identifying book pages according to the present invention;
FIG. 5 is a schematic diagram of an electronic device according to the present invention;
FIG. 6 is a schematic structural diagram of a storage medium of the present invention.
Detailed Description
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in describing them are briefly introduced below. The drawings described below show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.
For the sake of simplicity, the drawings only schematically show the parts relevant to the present invention, and they do not represent the actual structure of a product. In addition, to keep the drawings concise and understandable, where several components in a drawing have the same structure or function, only one of them is schematically illustrated or labeled. In this document, "one" means not only "only one" but also covers the case of "more than one".
It should be further understood that the term "and/or" as used in this specification and the appended claims refers to and includes any and all possible combinations of one or more of the associated listed items.
In this context, it is to be understood that, unless otherwise explicitly stated or limited, the terms "mounted," "coupled," and "connected" are to be construed broadly: a connection may be fixed, removable, or integral; it may be mechanical or electrical; and elements may be connected directly, connected indirectly through an intervening medium, or in internal communication with each other. The specific meanings of these terms in the present invention can be understood by those skilled in the art according to the specific case.
In addition, in the description of the present application, the terms "first", "second", and the like are used only for distinguishing the description, and are not intended to indicate or imply relative importance.
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the following description will be made with reference to the accompanying drawings. It is obvious that the drawings in the following description are only some examples of the invention, and that for a person skilled in the art, other drawings and embodiments can be derived from them without inventive effort.
Referring to fig. 1, the present invention provides an embodiment of a method for identifying pages of a book, including the steps of:
S110: performing instance segmentation on the picture to be detected, and identifying at least one book page in the picture to be detected.
Specifically, instance segmentation is performed on the picture to be detected, which may contain readable paper materials such as books and test papers. A target detection method automatically frames the different instances in the picture to be detected and identifies the book page instances, and a semantic segmentation method then labels the pixels within each instance area one by one. Each instance has its own rectangular area coordinates, so different book page instances have different rectangular area coordinates.
The image instance segmentation and regression algorithms in this scheme are generally implemented through deep learning. The features of book pages need to be analyzed thoroughly so that the models can be optimized in a targeted way to improve accuracy, and a large amount of training data also needs to be prepared.
S120: performing coordinate regression on each book page.
Specifically, corner coordinate regression is performed for each book page.
Illustratively, the rectangular area coordinates of each book page and the corresponding mask map are acquired; the mask map is cut into at least one mask sub-image according to the rectangular area coordinates; and corner coordinate regression is performed on the mask sub-image corresponding to each book page to obtain the corner abscissas and corner ordinates of that book page.
The corner points of each book page correspond to 4 coordinate points, which are used to construct the spatial structure of the book page and provide a mechanism for mutual checking and calculation.
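The cutting step described above can be sketched as follows. This is only a minimal illustration, not the implementation of the invention: the mask is assumed to be a 2-D list of 0/1 values, the rectangular area coordinates are assumed to be given as (x_min, y_min, x_max, y_max) with exclusive maxima, and the name `crop_mask` is hypothetical.

```python
from typing import List, Tuple

Mask = List[List[int]]            # assumed layout: mask[y][x] is 0 or 1
Box = Tuple[int, int, int, int]   # assumed: (x_min, y_min, x_max, y_max), maxima exclusive


def crop_mask(mask: Mask, box: Box) -> Mask:
    """Cut out the mask sub-image that lies inside the rectangular area."""
    x_min, y_min, x_max, y_max = box
    return [row[x_min:x_max] for row in mask[y_min:y_max]]


# A 4 x 6 mask map in which one book page occupies a 2 x 3 block of pixels.
full_mask = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 0],
    [0, 0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
sub_mask = crop_mask(full_mask, (2, 1, 5, 3))
print(sub_mask)
```

Regressing on the cropped sub-image rather than on the whole mask map is what turns global regression into local sub-image regression in this scheme.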
S130: mapping the regression coordinate values of each book page back to the picture to be detected to obtain the position of each book page in the picture to be detected.
Specifically, this scheme identifies the book pages and the specific position of each book page in the picture to be detected. First, the regression coordinate values (the abscissas and ordinates) are obtained from the mask sub-image corresponding to each book page; these values are then mapped back to the picture to be detected to obtain the position of each book page in the whole picture. This is an iterative process within model learning.
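A minimal sketch of the mapping step, under the assumption that the regressed corner coordinates are expressed in pixels relative to the top-left corner of the mask sub-image, so that mapping back is a translation by the rectangle's origin. The function name `map_corners_to_picture` and the coordinate conventions are illustrative assumptions, not taken from the source.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]   # assumed: (x_min, y_min, x_max, y_max)
Point = Tuple[float, float]       # (abscissa, ordinate)


def map_corners_to_picture(corners: List[Point], box: Box) -> List[Point]:
    """Translate sub-image corner coordinates into whole-picture coordinates."""
    x_min, y_min, _, _ = box
    return [(x + x_min, y + y_min) for x, y in corners]


# Four corners regressed inside a sub-image whose rectangle starts at (2, 1).
corners_in_sub_image = [(0, 0), (2, 0), (2, 1), (0, 1)]
corners_in_picture = map_corners_to_picture(corners_in_sub_image, (2, 1, 5, 3))
print(corners_in_picture)
```

Under this assumption a corner that is occluded or lies outside the rectangle simply has coordinates below zero or beyond the sub-image size, and the same translation still applies.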
In this scheme, based on the idea of image instance segmentation, each book page is segmented first, so one or more book pages can be detected and identified. This improves the accuracy of book page detection, and because multiple book pages are detected at the same time, it also improves detection efficiency.
The method supports the detection and identification of one or more book pages, gives the corner coordinates of each book page, and can predict the coordinates of book corners that are occluded or fall outside the picture.
Referring to FIG. 2, the present invention provides another embodiment of a method for identifying book pages, comprising the steps of:
S210: inputting a picture to be detected, and emptying the coordinate array identified in the picture to be detected.
Specifically, emptying means that the coordinate array is set to empty, that is, no coordinates are recorded, and memory is allocated for caching the coordinate array.
S220: framing the different instances in the picture to be detected by a target detection method, and identifying whether at least one book page exists in the picture to be detected.
Specifically, image instance segmentation produces three results: the classification result, that is, the specific objects that have been classified; the coordinate position of each classified object; and the corresponding mask map. The classified objects are the instances, so instance segmentation must not only classify at the pixel level but also distinguish different instances within each class.
For example, if an image contains several persons A, B, and C, semantic segmentation labels all of them as "person", whereas instance segmentation yields three different objects: A, B, and C. On the basis of image instance segmentation, the first step is to identify whether a book page instance exists in the picture to be detected.
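The three results of instance segmentation listed above can be pictured as one record per instance. The sketch below is illustrative only: the `InstanceResult` type, its field names, and the class label "book_page" are assumptions made for the example, not the output format of any particular segmentation model.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class InstanceResult:
    label: str                        # classification result
    box: Tuple[int, int, int, int]    # rectangular area coordinates
    mask: List[List[int]]             # mask map for this instance


def book_pages(results: List[InstanceResult]) -> List[InstanceResult]:
    """Keep only the instances classified as book pages (assumed label)."""
    return [r for r in results if r.label == "book_page"]


results = [
    InstanceResult("book_page", (0, 0, 2, 2), [[1, 1], [1, 1]]),
    InstanceResult("pen", (3, 0, 4, 2), [[1], [1]]),
]
pages = book_pages(results)
has_book_page = len(pages) > 0   # the check made in step S220
print(has_book_page, [p.box for p in pages])
```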
S230: when at least one book page exists, acquiring the rectangular area coordinates of each book page and the corresponding mask map.
Specifically, after instance segmentation is performed on the picture to be detected, the picture may turn out to contain no book page, a single book page, or multiple book pages.
In addition, instance segmentation of the picture to be detected yields the rectangular area coordinates and the mask map of every instance, so whenever book pages exist, the rectangular area coordinates and mask map of each book page can be acquired.
For example, when there are two cats in an image, semantic segmentation predicts all pixels of both cats together as the single category "cat". In contrast, instance segmentation must distinguish which pixels belong to the first cat and which belong to the second. In the same way, each book page and its rectangular area coordinates are identified by the instance segmentation method.
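The difference between the two kinds of output can be shown on a toy image. In this sketch, which assumes masks stored as 2-D lists of 0/1 values and uses the hypothetical helper `bounding_box`, two separate instance masks collapse into a single semantic map, while each instance mask still yields its own rectangular area.

```python
from typing import List, Tuple

Mask = List[List[int]]


def bounding_box(mask: Mask) -> Tuple[int, int, int, int]:
    """Smallest rectangle (x_min, y_min, x_max, y_max), maxima exclusive, around the 1-pixels."""
    ys = [y for y, row in enumerate(mask) if any(row)]
    xs = [x for row in mask for x, value in enumerate(row) if value]
    return (min(xs), min(ys), max(xs) + 1, max(ys) + 1)


# Instance segmentation: one mask per object (two objects of the same class).
instance_masks = [
    [[1, 1, 0, 0, 0],
     [1, 1, 0, 0, 0],
     [0, 0, 0, 0, 0]],
    [[0, 0, 0, 0, 0],
     [0, 0, 0, 1, 1],
     [0, 0, 0, 1, 1]],
]

# Semantic segmentation: a single map that only says "this pixel is of the class".
semantic_map = [
    [int(any(m[y][x] for m in instance_masks)) for x in range(5)]
    for y in range(3)
]

boxes = [bounding_box(m) for m in instance_masks]
print(semantic_map)
print(boxes)
```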
S240: cutting the mask map into at least one mask sub-image according to the rectangular area coordinates; and performing corner coordinate regression on the mask sub-image corresponding to each book page to obtain the corner abscissas and corner ordinates of that book page.
S250: mapping the corner abscissas and corner ordinates of the book page corresponding to each mask sub-image back to the picture to be detected according to the rectangular area coordinates, to obtain a coordinate array of each book page in the picture to be detected.
S260: obtaining the position of each book page in the picture to be detected according to the coordinate array of that book page in the picture to be detected.
In this scheme, the image instance segmentation algorithm turns global regression over the whole image into regression over local sub-images, and the regression is performed on mask maps, so most interference from non-book areas is removed and the accuracy of book page identification is improved.
Image instance segmentation is a method that combines target detection and semantic segmentation: it identifies the different objects and assigns a separate pixel-level classification mask to each of them. Target detection identifies the class of each target and predicts the position of each target with a bounding box; semantic segmentation identifies the class to which each pixel belongs but, unlike target detection, does not distinguish between different objects of the same class. Global regression means performing regression on the overall feature map of the input picture. Local sub-image regression means performing regression on a local feature map within the input picture. A deep learning model is a model built on a deep neural network; its general form is a probability distribution function or a decision function.
Specifically, when one or more book pages exist, coordinate regression is performed on each book page, and the regression coordinate values of each book page are mapped back to the picture to be detected to obtain the position of each book page in the picture to be detected.
Further preferably, identifying at least one book page in the picture to be detected by performing instance segmentation on the picture to be detected further includes: when no book page exists, exiting the identification of the picture to be detected.
Illustratively, the specific steps are as follows: input a picture containing n book pages, where n ≥ 0; detect the rectangular area coordinates and corresponding mask maps of the n book pages with an image instance segmentation algorithm; if n is 0, report that no book page exists and exit the identification; if n is not 0, cut n mask sub-images out of the n mask maps according to the rectangular area coordinates; apply a regression algorithm to each of the n book page mask sub-images to regress 4 coordinate points, giving 8 coordinate values for the book page in each mask sub-image; and, using the n rectangular area coordinates, map the regression coordinate values of the n mask sub-images back to the whole picture to obtain the coordinates of the n book pages in the whole picture, that is, the detected book page areas. Here, the 4 coordinate points are the four corner points of the book page, and the 8 coordinate values are the abscissas and ordinates of those four corner points.
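The steps listed above can be sketched end to end as follows. This is a structural illustration only: the segmentation and regression models are passed in as the hypothetical callables `segment` and `regress`, the toy stand-ins below take the place of trained deep learning models, and the data layouts (masks as 2-D lists, rectangles as (x_min, y_min, x_max, y_max) with exclusive maxima) are assumptions.

```python
from typing import Callable, List, Sequence, Tuple

Mask = List[List[int]]
Box = Tuple[int, int, int, int]
Point = Tuple[float, float]
Instance = Tuple[Box, Mask]   # rectangular area coordinates and whole-picture mask map


def identify_pages(
    picture: object,
    segment: Callable[[object], Sequence[Instance]],
    regress: Callable[[Mask], List[Point]],
) -> List[List[Point]]:
    """Return one list of 4 corner points (8 coordinate values) per detected page."""
    coordinate_array: List[List[Point]] = []       # start from an empty coordinate array
    instances = segment(picture)                    # n rectangles and n mask maps
    if not instances:                               # n == 0: no book page, exit
        return coordinate_array
    for (x_min, y_min, x_max, y_max), mask in instances:
        sub_mask = [row[x_min:x_max] for row in mask[y_min:y_max]]   # cut the sub-image
        corners = regress(sub_mask)                                   # 4 points in the sub-image
        coordinate_array.append([(x + x_min, y + y_min) for x, y in corners])  # map back
    return coordinate_array


# Toy stand-ins for the trained models (assumptions for the example only).
def fake_segment(picture: object) -> Sequence[Instance]:
    mask = [[0, 0, 0, 0, 0, 0],
            [0, 0, 1, 1, 1, 0],
            [0, 0, 1, 1, 1, 0],
            [0, 0, 0, 0, 0, 0]]
    return [((2, 1, 5, 3), mask)]


def fake_regress(sub_mask: Mask) -> List[Point]:
    height, width = len(sub_mask), len(sub_mask[0])
    return [(0, 0), (width, 0), (width, height), (0, height)]


pages = identify_pages("picture", fake_segment, fake_regress)
print(pages)
```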
On the other hand, as shown in FIG. 3, the present invention further provides another embodiment of a method for identifying book pages, including:
inputting a detection picture, and setting the identified coordinate array to empty;
performing book page instance segmentation on the detection picture;
judging whether a book page instance exists;
when a book page instance exists, performing coordinate regression of the four corner points of the book page for each book page instance;
mapping the regressed coordinates of each book page instance back to the detection picture; and
returning the coordinate array of each identified book page in the detection picture.
Specifically, a person recognizes a book by first separating out each book page and then looking at the positions of the four corner points of that page; if the page is occluded or extends beyond the image area, the likely positions of the corner points have to be predicted from the other points and the edges of the book.
Illustratively, the four corner points of a book page can also be regressed directly by a deep learning model; alternatively, the book edges can be found by a traditional image contour algorithm and the four corner points of the book page calculated from them.
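As a geometric illustration of calculating a corner point from book edges, the sketch below intersects two edge lines, which also gives an estimate for a corner that lies outside the picture or under an occlusion. This is a plain line-intersection calculation offered as an assumption about how such a corner could be computed; it is not the learned regression described in this scheme, and the function name `intersect` is hypothetical.

```python
from typing import Optional, Tuple

Point = Tuple[float, float]


def intersect(p1: Point, p2: Point, p3: Point, p4: Point) -> Optional[Point]:
    """Intersection of the line through p1, p2 with the line through p3, p4.

    Returns None when the two lines are parallel.
    """
    x1, y1 = p1
    x2, y2 = p2
    x3, y3 = p3
    x4, y4 = p4
    denom = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    if denom == 0:
        return None
    a = x1 * y2 - y1 * x2
    b = x3 * y4 - y3 * x4
    x = (a * (x3 - x4) - (x1 - x2) * b) / denom
    y = (a * (y3 - y4) - (y1 - y2) * b) / denom
    return (x, y)


# Visible part of the top edge (y = 0) and of the right edge (x = 10);
# the top-right corner itself is assumed to be hidden.
corner = intersect((0, 0), (4, 0), (10, 2), (10, 8))
print(corner)
```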
On the other hand, as shown in FIG. 4, there is also provided an apparatus for identifying book pages, comprising:
an identification module 410, configured to identify at least one book page in the picture to be detected by performing instance segmentation on the picture to be detected.
The identification module 410 is further configured to frame the different instances in the picture to be detected by a target detection method, and identify whether at least one book page exists in the picture to be detected; and when at least one book page exists, to acquire the rectangular area coordinates of each book page and the corresponding mask map.
A regression processing module 420, configured to perform coordinate regression on each book page.
The regression processing module 420 further comprises:
a cutting sub-module, configured to cut the mask map into at least one mask sub-image according to the rectangular area coordinates; and
a regression sub-module, configured to perform corner coordinate regression on the mask sub-image corresponding to each book page to obtain the corner abscissas and corner ordinates of that book page.
An obtaining module 430, configured to map the regression coordinate values of each book page back to the picture to be detected to obtain the position of each book page in the picture to be detected.
The obtaining module 430 further comprises:
a mapping sub-module, configured to map the corner abscissas and corner ordinates of the book page corresponding to each mask sub-image back to the picture to be detected according to the rectangular area coordinates, to obtain a coordinate array of each book page in the picture to be detected; and
a position acquisition sub-module, configured to obtain the position of each book page in the picture to be detected according to the coordinate array of that book page in the picture to be detected.
With the apparatus for identifying book pages in this scheme, based on the idea of image instance segmentation, each book page is segmented first, so detection and identification of one or more book pages is supported. This improves the accuracy of book page detection, and because multiple book pages are detected at the same time, it also improves detection efficiency.
On the other hand, as shown in fig. 5, the present invention provides an electronic device 100, which includes a processor 110, a memory 120, wherein the memory 120 is used for storing a computer program 121; the processor 110 is configured to execute the computer program 121 stored in the memory 120 to implement the method in the corresponding method embodiment.
The electronic device 100 may be a desktop computer, a notebook computer, a palm computer, a tablet computer, a mobile phone, a human-computer interaction screen, or the like. The electronic device 100 may include, but is not limited to, a processor 110 and a memory 120. Those skilled in the art will appreciate that FIG. 5 is merely an example of the electronic device 100 and does not limit it; the electronic device 100 may include more or fewer components than illustrated, combine some components, or use different components. For example, the electronic device 100 may also include input/output interfaces, display devices, network access devices, communication buses, communication interfaces, and the like. Where the electronic device 100 includes a communication interface, a communication bus, and an input/output interface, the processor 110, the memory 120, the input/output interface, and the communication interface communicate with each other through the communication bus. The memory 120 stores a computer program 121, and the processor 110 is configured to execute the computer program 121 stored in the memory 120 to implement the method in the corresponding method embodiment.
The Processor 110 may be a Central Processing Unit (CPU), other general purpose Processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other Programmable logic device, discrete gate or transistor logic device, discrete hardware component, etc. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
The memory 120 may be an internal storage unit of the electronic device 100, for example, a hard disk or a memory of the electronic device. The memory may also be an external storage device of the electronic device, for example, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash memory card (Flash Card) fitted to the electronic device. Further, the memory 120 may include both an internal storage unit and an external storage device of the electronic device 100. The memory 120 is used for storing the computer program 121 and other programs and data required by the electronic device 100. The memory may also be used to temporarily store data that has been output or is to be output.
A communication bus is a circuit that connects the described elements and enables transmission between them. Illustratively, the processor 110 receives commands from other elements through the communication bus, decrypts the received commands, and performs calculations or data processing according to the decrypted commands. The memory 120 may include program modules, illustratively a kernel, middleware, an Application Programming Interface (API), and applications. The program modules may consist of software, firmware, or hardware, or of at least two of these. The input/output interface forwards commands or data that a user enters through an input/output device (e.g., a sensor, keypad, or touch screen). The communication interface connects the electronic device 100 with other network devices, user equipment, and networks. For example, the communication interface may be connected to the network by wire or wirelessly in order to reach other external network devices or user devices. The wireless communication may include at least one of: wireless fidelity (WiFi), Bluetooth (BT), Near Field Communication (NFC), Global Positioning System (GPS), cellular communications, and the like. The wired communication may include at least one of: Universal Serial Bus (USB), High-Definition Multimedia Interface (HDMI), Recommended Standard 232 (RS-232), and the like. The network may be a telecommunications network or a communications network. The communications network may be a computer network, the Internet of Things, or a telephone network. The electronic device 100 may be connected to the network through the communication interface, and the protocol by which the electronic device 100 communicates with other network devices may be supported by at least one of an application, an Application Programming Interface (API), middleware, a kernel, and the communication interface.
In another aspect, as shown in fig. 6, the present invention provides a storage medium, where at least one instruction is stored, and the instruction is loaded and executed by a processor to implement the operations performed by the corresponding embodiments of the method. The storage medium may be, for example, a read-only memory (ROM), a Random Access Memory (RAM), a compact disc read-only memory (CD-ROM), a magnetic tape, a floppy disk, an optical data storage device, and the like.
They may be implemented in program code executable by a computing device and executed by that device, or they may be fabricated separately as individual integrated circuit modules, or a plurality of the modules or steps may be fabricated as a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to the related descriptions of other embodiments for parts that are not described or recited in detail in a certain embodiment.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus/electronic device and method may be implemented in other ways. The above-described embodiments of the apparatus/electronic device are merely exemplary, and the division of the modules or units is merely an example of a logical division, and there may be other divisions when the actual implementation is performed, and for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection of some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated modules/units may be stored in a storage medium if they are implemented in the form of software functional units and sold or used as separate products. Based on such understanding, all or part of the flow in the method according to the embodiments of the present invention may also be implemented by the computer program 121 sending instructions to relevant hardware, where the computer program 121 may be stored in a storage medium, and when the computer program 121 is executed by a processor, the steps of the above-described method embodiments may be implemented. The computer program 121 may be in source code form, object code form, an executable file, or some intermediate form. The storage medium may include: any entity or device capable of carrying the computer program 121, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunication signal, a software distribution medium, and the like. It should be noted that the content covered by the storage medium may be appropriately increased or decreased according to the requirements of legislation and patent practice in the jurisdiction; for example, in some jurisdictions, the computer-readable storage medium does not include electrical carrier signals and telecommunication signals according to legislation and patent practice.
It should be noted that the above embodiments can be freely combined as necessary. The foregoing is only a preferred embodiment of the present invention. For those skilled in the art, various modifications and refinements can be made without departing from the principle of the present invention, and these modifications and refinements should also be regarded as falling within the protection scope of the present invention.
Claims (10)
1. A method for identifying pages of a book, comprising the steps of:
identifying at least one book page in the picture to be detected by carrying out instance segmentation processing on the picture to be detected;
performing coordinate regression processing on each book page;
and mapping the regression coordinate value of each book page back to the picture to be detected to obtain the position of each book page in the picture to be detected.
2. The book page identification method according to claim 1, wherein said identifying at least one book page in the picture to be detected by performing instance segmentation processing on the picture to be detected comprises the steps of:
framing different instances of the picture to be detected by a target detection method, and identifying whether the picture to be detected has at least one book page;
and when at least one book page exists, acquiring the coordinates of the rectangular area of each book page and the corresponding mask map.
3. The method of claim 2, wherein said performing a coordinate regression process on each of said book pages comprises the steps of:
cutting the mask map into at least one mask sub-image according to the rectangular area coordinates;
and performing corner coordinate regression processing on each mask sub-image corresponding to the book page to obtain a corner abscissa and a corner ordinate of the book page corresponding to the mask sub-image.
4. The method for recognizing book pages according to claim 3, wherein said step of mapping the regression coordinate value of each book page back to the picture to be detected to obtain the position of each book page in the picture to be detected comprises the steps of:
mapping the corner abscissa and the corner ordinate of the book page corresponding to each mask sub-image back to the picture to be detected according to the rectangular area coordinates to obtain a coordinate array of each book page in the picture to be detected;
and obtaining the position of each book page in the picture to be detected according to the coordinate array of each book page in the picture to be detected.
5. The method for identifying book pages according to claim 2, wherein after framing different instances of the picture to be detected by the target detection method and identifying whether at least one book page exists in the picture to be detected, the method further comprises the following step: when no book page exists, exiting the identification of the picture to be detected.
6. The book page identification method according to claim 4, wherein before identifying at least one book page in the picture to be detected by performing instance segmentation processing on the picture to be detected, the method comprises the steps of:
inputting the picture to be detected, and emptying the coordinate array identified in the picture to be detected.
7. An apparatus for identifying pages of a book, comprising:
an identification module, configured to identify at least one book page in a picture to be detected by carrying out instance segmentation processing on the picture to be detected;
a regression processing module, configured to perform coordinate regression processing on each book page;
an acquisition module, configured to map the regression coordinate value of each book page back to the picture to be detected, so as to obtain the position of each book page in the picture to be detected.
8. The apparatus for recognizing pages of a book according to claim 7, wherein said identification module is further configured to: frame different instances of the picture to be detected by a target detection method, and identify whether the picture to be detected has at least one book page; and when at least one book page exists, acquire the coordinates of the rectangular area of each book page and the corresponding mask map.
9. An electronic device, characterized in that the electronic device comprises:
a processor; and a memory storing computer executable instructions that, when executed, cause the processor to perform a method of identification of pages of a book according to any one of claims 1 to 6.
10. A storage medium having stored therein at least one instruction, which is loaded and executed by a processor to perform operations performed by a method of identifying pages of a book according to any one of claims 1 to 6.
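The geometric part of method claims 2 to 6 can be illustrated with a minimal Python sketch. It is not the claimed implementation: every name here (`crop_mask`, `estimate_corners`, `map_back`, `locate_pages`) is hypothetical, the instance segmentation result is assumed to arrive as ready-made `(box, mask)` pairs, and the corner step is a simple geometric stand-in for the corner coordinate regression, which the claims do not specify in detail.

```python
from typing import List, Tuple

Mask = List[List[int]]            # binary mask map, mask[y][x] is 0 or 1
Box = Tuple[int, int, int, int]   # rectangular area (x0, y0, x1, y1), x1/y1 exclusive
Point = Tuple[int, int]           # (abscissa, ordinate)


def crop_mask(mask: Mask, box: Box) -> Mask:
    """Cut one page's mask sub-image out of the full-size mask map."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in mask[y0:y1]]


def estimate_corners(sub_mask: Mask) -> List[Point]:
    """Assumed stand-in for corner regression (not the patent's model).

    Picks the foreground pixels that are extreme along the two diagonals,
    ordered top-left, top-right, bottom-right, bottom-left.
    """
    pts = [(x, y) for y, row in enumerate(sub_mask)
           for x, value in enumerate(row) if value]
    if not pts:
        return []
    return [
        min(pts, key=lambda p: p[0] + p[1]),  # top-left
        max(pts, key=lambda p: p[0] - p[1]),  # top-right
        max(pts, key=lambda p: p[0] + p[1]),  # bottom-right
        min(pts, key=lambda p: p[0] - p[1]),  # bottom-left
    ]


def map_back(corners: List[Point], box: Box) -> List[Point]:
    """Shift sub-image corner coordinates by the origin of the rectangular area."""
    x0, y0, _, _ = box
    return [(x + x0, y + y0) for x, y in corners]


def locate_pages(instances: List[Tuple[Box, Mask]]) -> List[List[Point]]:
    """Return one corner list per detected page, in picture coordinates."""
    coordinate_array: List[List[Point]] = []  # start from an emptied array
    if not instances:                         # no book page: exit early
        return coordinate_array
    for box, mask in instances:
        corners = estimate_corners(crop_mask(mask, box))
        coordinate_array.append(map_back(corners, box))
    return coordinate_array
```

Because the corners are found inside the cropped sub-image, adding the rectangle origin `(x0, y0)` is all that is needed to express them in the coordinate system of the picture to be detected. In a real system the `(box, mask)` pairs would come from an instance segmentation model and `estimate_corners` would be replaced by the trained regression step.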
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010321083.XA CN111524148A (en) | 2020-04-22 | 2020-04-22 | Book page identification method and device, electronic equipment and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010321083.XA CN111524148A (en) | 2020-04-22 | 2020-04-22 | Book page identification method and device, electronic equipment and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111524148A (en) | 2020-08-11 |
Family
ID=71904450
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010321083.XA Pending CN111524148A (en) | 2020-04-22 | 2020-04-22 | Book page identification method and device, electronic equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111524148A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112765394A (en) * | 2021-01-07 | 2021-05-07 | 上海喜日电子科技有限公司 | Data processing method and device, electronic equipment and storage medium |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107403183A (en) * | 2017-07-21 | 2017-11-28 | 桂林电子科技大学 | The intelligent scissor method that conformity goal is detected and image segmentation is integrated |
CN108876795A (en) * | 2018-06-07 | 2018-11-23 | 四川斐讯信息技术有限公司 | A kind of dividing method and system of objects in images |
CN109948510A (en) * | 2019-03-14 | 2019-06-28 | 北京易道博识科技有限公司 | A kind of file and picture example dividing method and device |
Non-Patent Citations (2)
Title |
---|
Kaiming He et al.: "Mask R-CNN", 2017 IEEE International Conference on Computer Vision * |
Yi Li et al.: "Fully Convolutional Instance-aware Semantic Segmentation", 2017 IEEE Conference on Computer Vision and Pattern Recognition * |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111209827B (en) | Method and system for OCR (optical character recognition) bill problem based on feature detection | |
CN105046254A (en) | Character recognition method and apparatus | |
CN109347898B (en) | Scene information sending method, scene information display method, server and mobile terminal | |
WO2019105457A1 (en) | Image processing method, computer device and computer readable storage medium | |
CN108764051B (en) | Image processing method and device and mobile terminal | |
CN113627428A (en) | Document image correction method and device, storage medium and intelligent terminal device | |
CN109033935B (en) | Head-up line detection method and device | |
CN108304562B (en) | Question searching method and device and intelligent terminal | |
WO2018184255A1 (en) | Image correction method and device | |
US11847812B2 (en) | Image generation method and apparatus, device, and storage medium | |
US20240212336A1 (en) | Security check ct object recognition method and apparatus | |
CN111428570A (en) | Detection method and device for non-living human face, computer equipment and storage medium | |
CN109214995A (en) | The determination method, apparatus and server of picture quality | |
WO2022103519A1 (en) | Semantic segmentation for stroke classification in inking application | |
CN111524148A (en) | Book page identification method and device, electronic equipment and storage medium | |
EP4181013A1 (en) | Method and apparatus for determining labeling information | |
CN105096355A (en) | Image processing method and system | |
CN110610178A (en) | Image recognition method, device, terminal and computer readable storage medium | |
CN108776959B (en) | Image processing method and device and terminal equipment | |
CN108270973B (en) | Photographing processing method, mobile terminal and computer readable storage medium | |
CN116486153A (en) | Image classification method, device, equipment and storage medium | |
CN115223173A (en) | Object identification method and device, electronic equipment and storage medium | |
CN108021648B (en) | Question searching method and device and intelligent terminal | |
CN114663418A (en) | Image processing method and device, storage medium and electronic equipment | |
CN114373078A (en) | Target detection method and device, terminal equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20200811 |