WO2018053756A1 - Image detection method and terminal - Google Patents

Image detection method and terminal

Info

Publication number
WO2018053756A1
Authority
WO
WIPO (PCT)
Prior art keywords
quadrilateral
target
candidate
terminal
image
Prior art date
Application number
PCT/CN2016/099730
Other languages
English (en)
French (fr)
Inventor
秦超
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to CN201680080721.5A priority Critical patent/CN108604374B/zh
Priority to PCT/CN2016/099730 priority patent/WO2018053756A1/zh
Publication of WO2018053756A1 publication Critical patent/WO2018053756A1/zh

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis

Definitions

  • the invention relates to the field of image detection technologies, and in particular, to an image detection method and a terminal.
  • Here, "documents" also covers rectangular objects such as invoices, books, business cards, handouts, photos, advertisements, billboards, televisions, movies, screens, and so on.
  • the angle between opposite sides of the quadrilateral is less than 30°; or the distance between opposite sides of the quadrilateral occupies a certain proportion of the length or width of the image, such as one-fifth; or the angle between adjacent sides is close to perpendicular (90°), allowing a deviation of 30°; or the quadrilateral should be large enough, for example its perimeter must exceed a certain proportion of the width and height of the picture.
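The conventional screening conditions listed above can be sketched as follows. This is only an illustration under stated assumptions: the function name `passes_conventional_checks`, the decision to require all checked conditions at once, and the 0.25 perimeter fraction are hypothetical; only the 30° thresholds come from the text.

```python
import math

def _direction_deg(p, q):
    """Undirected direction of segment pq in degrees, in [0, 180)."""
    return math.degrees(math.atan2(q[1] - p[1], q[0] - p[0])) % 180.0

def _line_angle(d1, d2):
    """Smallest angle between two undirected lines, in [0, 90]."""
    diff = abs(d1 - d2) % 180.0
    return min(diff, 180.0 - diff)

def passes_conventional_checks(quad, img_w, img_h,
                               max_dev=30.0, min_perimeter_frac=0.25):
    """quad: four (x, y) corners in order around the quadrilateral.

    min_perimeter_frac is an assumed value; the text only says
    'a certain proportion'.
    """
    dirs = [_direction_deg(quad[i], quad[(i + 1) % 4]) for i in range(4)]
    # Opposite sides (0 vs 2, 1 vs 3) must be within 30 degrees of parallel.
    if _line_angle(dirs[0], dirs[2]) >= max_dev:
        return False
    if _line_angle(dirs[1], dirs[3]) >= max_dev:
        return False
    # Adjacent sides must be within 30 degrees of perpendicular.
    for i in range(4):
        if abs(90.0 - _line_angle(dirs[i], dirs[(i + 1) % 4])) > max_dev:
            return False
    # The quadrilateral must be large relative to the picture.
    perimeter = sum(math.dist(quad[i], quad[(i + 1) % 4]) for i in range(4))
    return perimeter >= min_perimeter_frac * (img_w + img_h)
```

Checks of this kind look only at the two-dimensional shape, which is why lines inside or outside the rectangle can still produce false positives.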
  • the embodiment of the invention provides an image detection method and a terminal, so as to at least solve the problem of the high false-positive rate of rectangle detection in the existing document correction process.
  • the embodiment of the present invention provides the following technical solutions:
  • the image detection method provided by the embodiment of the present invention takes into account interference from lines inside or outside the rectangle during document correction: a plurality of candidate quadrilaterals are detected from the original image, and the candidate quadrilateral that matches the target quadrilateral projected by the target rectangle in the image is determined as the actual quadrilateral projected by the target rectangle in the image, so that interference inside and outside the rectangle can be accurately eliminated.
  • This is an important issue in document correction: if the quadrilateral is detected incorrectly in this step, all subsequent steps operate on the erroneous quadrilateral, and the correction result is misled beyond recovery.
  • the rectangular pose parameter of the target rectangle includes: an angle ⁇ between the object plane where the target rectangle is located and the image plane where the image is located, and an angle ⁇ between a side of the target rectangle and the intersection line of the object plane and the image plane.
  • the shape of the quadrilateral image formed by the rectangle that is, the quadrilateral
  • the rectangular pose parameter of the target rectangle includes: the larger angle ⁇ of the angles between the two pairs of opposite sides of the first candidate quadrilateral.
  • the implementation is relatively simple.
  • the angle library further includes a ratio ⁇ of the projections of unit lengths of two adjacent sides for each of the plurality of rectangular postures. The method further includes: the terminal matches the angle library according to the rectangular pose parameter of the target rectangle and determines the ratio ⁇ 1 of the projections of unit lengths of the adjacent sides of the target rectangle. After the terminal determines the quadrilateral with the highest reliability of matching the target quadrilateral among the plurality of candidate quadrilaterals as the actual quadrilateral projected by the target rectangle in the image, the method further includes: the terminal determines the true ratio of the adjacent sides of the actual quadrilateral according to the ratio ⁇ 1 and the projection ratio of the adjacent sides of the actual quadrilateral; and the terminal acquires and outputs the target rectangle according to the true ratio of the lengths of the adjacent sides.
  • the image detection method provided by the embodiment of the present invention can not only find the actual quadrilateral corresponding to the target rectangle, but also obtain the true width-to-height ratio; the terminal can then obtain the projection transformation matrix from the ratio of the real rectangle and restore the quadrilateral to a real rectangle.
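The text does not spell out how the true ratio follows from the two ratios, so the sketch below is one plausible reading rather than the patent's formula. It assumes both ratios are taken in the same order (first side over second side): if a unit length on each side projects to lengths whose ratio is `unit_projection_ratio`, then the projected side ratio equals the true side ratio multiplied by that value. The names `true_side_ratio` and `output_size` are hypothetical.

```python
def true_side_ratio(projected_ratio: float, unit_projection_ratio: float) -> float:
    """Recover the real ratio of two adjacent sides.

    projected_ratio: ratio of the two adjacent sides as measured in the image.
    unit_projection_ratio: ratio of the projections of unit lengths on the
        same two sides, as looked up from the angle library.
    Assumed relation: projected_ratio = true_ratio * unit_projection_ratio.
    """
    if projected_ratio <= 0 or unit_projection_ratio <= 0:
        raise ValueError("ratios must be positive")
    return projected_ratio / unit_projection_ratio

def output_size(width_px: int, ratio: float) -> tuple:
    """Choose an output rectangle with the recovered width-to-height ratio."""
    return width_px, round(width_px / ratio)
```

With the output size fixed, the four corners of the actual quadrilateral and the four corners of the output rectangle give the point correspondences from which a projection transformation matrix can be computed.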
  • before the terminal acquires the image of the photographed target rectangle and the depth information in the image, the method further includes: the terminal acquires and stores the angle library.
  • the terminal acquiring the angle library includes: the terminal receives the angle library sent by an angle library acquisition device; or the terminal detects the target quadrilateral corresponding to each of the plurality of rectangular postures, calculates the sequence of interior angle values of the target quadrilateral corresponding to each rectangular posture and the ratio of the projections of unit lengths of the adjacent sides of that target quadrilateral, and thereby obtains the angle library.
  • before the terminal matches the pre-stored angle library according to the rectangular pose parameter of the target rectangle, the method further includes: the terminal determines a distance d between the target rectangle and the camera; the terminal matches a pre-stored database according to the distance d and determines the angle library corresponding to d as the pre-stored angle library.
  • the projection of the target rectangle may change slightly within the same posture; for example, when the target rectangle is far away from the camera, an angle greater than 90 degrees in the image plane becomes larger and an angle smaller than 90 degrees becomes smaller, equal to the angle of 90 degrees.
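Selecting an angle library by the distance d can be sketched as a nearest-key lookup. The dictionary layout (distance mapped to a library object) and the name `select_angle_library` are assumptions made for illustration; the text only says that the database is matched according to d.

```python
def select_angle_library(libraries_by_distance, d):
    """Return the library whose reference distance is closest to d.

    libraries_by_distance: dict mapping a reference camera distance to the
        angle library built for that distance (hypothetical layout).
    """
    if not libraries_by_distance:
        raise ValueError("no angle libraries stored")
    nearest = min(libraries_by_distance, key=lambda ref: abs(ref - d))
    return libraries_by_distance[nearest]
```

For example, with libraries stored for 0.3, 1.0 and 3.0 distance units, a measured d of 0.9 selects the library built for 1.0.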
  • the terminal determines that the sequence of interior angle values of the target quadrilateral projected by the target rectangle in the image is {90°, 90°, 90°, 90°}.
  • the terminal determining, according to the statistical value, the reliability of the first candidate quadrilateral matching the target quadrilateral includes: the terminal determines the difference between a preset value and the statistical value as the reliability of the first candidate quadrilateral matching the target quadrilateral; or the terminal queries a pre-stored correspondence according to the statistical value and determines the reliability of the first candidate quadrilateral matching the target quadrilateral, where the correspondence includes the reliabilities corresponding to a plurality of values.
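The two ways of turning the statistical value into a reliability can be sketched as below. The preset value of 100, the table layout (ascending upper bounds paired with reliabilities), and the function names are assumptions; the text does not give concrete numbers.

```python
def reliability_by_difference(statistic, preset=100.0):
    """Reliability = preset value minus the statistical value.

    A smaller statistic (closer match) yields a higher reliability.
    """
    return preset - statistic

def reliability_by_lookup(statistic, table):
    """Look up the reliability in a pre-stored correspondence.

    table: list of (upper_bound, reliability) pairs sorted by ascending
        upper_bound (hypothetical layout).
    Returns 0.0 when the statistic exceeds every bound.
    """
    for upper_bound, reliability in table:
        if statistic <= upper_bound:
            return reliability
    return 0.0
```

The subtraction keeps the ordering of candidates, while the lookup table lets the mapping be non-linear.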
  • the rectangular pose parameter of the target rectangle includes: an angle ⁇ between the object plane where the target rectangle is located and the image plane where the image is located, and an angle ⁇ between a side of the target rectangle and the intersection line of the object plane and the image plane.
  • the rectangular pose parameter of the target rectangle includes: the larger angle ⁇ of the angles between the two pairs of opposite sides of the first candidate quadrilateral.
  • the terminal further includes a display module;
  • the angle library further includes a ratio ⁇ of the projections of unit lengths of two adjacent sides corresponding to each of the plurality of rectangular postures;
  • the processing module is further configured to match the angle library according to the rectangular posture parameter of the target rectangle, and determine a ratio ⁇ 1 of the projections of unit lengths of the two adjacent sides corresponding to the target rectangle;
  • the processing module is further configured to, after the quadrilateral with the highest reliability of matching the target quadrilateral among the plurality of candidate quadrilaterals is determined as the actual quadrilateral projected by the target rectangle in the image, determine the true ratio of the adjacent sides of the actual quadrilateral according to the ratio ⁇ 1 and the projection ratio of the adjacent sides of the actual quadrilateral;
  • the processing module is further configured to obtain the target rectangle according to the true ratio of the lengths of the adjacent sides; the display module is configured to display the target rectangle.
  • the terminal further includes a storage module, and the processing module is further configured to acquire the angle library before acquiring the image of the shooting target rectangle and the depth information in the image; the storage module is configured to: Store the angle library.
  • the terminal further includes a communication module; the processing module is specifically configured to: receive, through the communication module, the angle library sent by an angle library acquisition device; or, the processing module is specifically configured to: detect the target quadrilateral corresponding to each of the plurality of rectangular postures, and calculate the sequence of interior angle values of the target quadrilateral corresponding to each rectangular posture and the ratio of the projections of unit lengths of the adjacent sides of that target quadrilateral, to obtain the angle library.
  • the processing module is further configured to, if the sides of the first candidate quadrilateral are on the same plane, determine a distance d between the target rectangle and the camera before matching the pre-stored angle library according to the rectangular pose parameter of the target rectangle; the processing module is further configured to match a pre-stored database according to the distance d, and determine the angle library corresponding to d as the pre-stored angle library.
  • the processing module determines that the sequence of interior angle values of the target quadrilateral is {90°, 90°, 90°, 90°}.
  • the processing module is specifically configured to: determine the difference between a preset value and the statistical value as the reliability of the first candidate quadrilateral matching the target quadrilateral; or query a pre-stored correspondence according to the statistical value and determine the reliability of the first candidate quadrilateral matching the target quadrilateral, where the correspondence includes the reliabilities corresponding to a plurality of values.
  • the terminal provided by the embodiment of the present invention can be used to perform the foregoing image detection method. Therefore, the technical effects that can be obtained can be referred to the foregoing method embodiments, and details are not described herein again.
  • the rectangular pose parameter of the target rectangle includes: an angle ⁇ between the object plane where the target rectangle is located and the image plane where the image is located, and an angle ⁇ between a side of the target rectangle and the intersection line of the object plane and the image plane.
  • the rectangular pose parameter of the target rectangle includes: the larger angle ⁇ of the angles between the two pairs of opposite sides of the first candidate quadrilateral.
  • the terminal further includes a display;
  • the angle library further includes a ratio ⁇ of the projections of unit lengths of two adjacent sides corresponding to each of the plurality of rectangular postures;
  • the processor is further configured to match the angle library according to the rectangular posture parameter of the target rectangle, and determine a ratio ⁇ 1 of the projections of unit lengths of the two adjacent sides corresponding to the target rectangle;
  • the processor is further configured to, after the quadrilateral with the highest reliability of matching the target quadrilateral among the plurality of candidate quadrilaterals is determined as the actual quadrilateral projected by the target rectangle in the image, determine the true ratio of the adjacent sides of the actual quadrilateral according to the ratio ⁇ 1 and the projection ratio of the adjacent sides of the actual quadrilateral;
  • the processor is further configured to obtain the target rectangle according to the true ratio of the lengths of the adjacent sides;
  • the display is configured to display the target rectangle.
  • the terminal further includes a memory; the processor is further configured to acquire the angle library before acquiring the image of the shooting target rectangle and the depth information in the image; the memory is configured to store the Angle library.
  • the terminal further includes a communication interface; the processor is specifically configured to: receive, through the communication interface, the angle library sent by an angle library acquisition device; or the processor is specifically configured to: detect the target quadrilateral corresponding to each of the plurality of rectangular postures, and calculate the sequence of interior angle values of the target quadrilateral corresponding to each rectangular posture and the ratio of the projections of unit lengths of the adjacent sides of that target quadrilateral, to obtain the angle library.
  • the processor is further configured to, if the sides of the first candidate quadrilateral are on the same plane, determine a distance d between the target rectangle and the camera before matching the pre-stored angle library according to the rectangular pose parameter of the target rectangle; the processor is further configured to match a pre-stored database according to the distance d, and determine the angle library corresponding to d as the pre-stored angle library.
  • the processor determines that the sequence of interior angle values of the target quadrilateral is {90°, 90°, 90°, 90°}.
  • the processor is specifically configured to: determine the difference between a preset value and the statistical value as the reliability of the first candidate quadrilateral matching the target quadrilateral; or query a pre-stored correspondence according to the statistical value and determine the reliability of the first candidate quadrilateral matching the target quadrilateral, where the correspondence includes the reliabilities corresponding to a plurality of values.
  • the terminal provided by the embodiment of the present invention can be used to perform the foregoing image detection method. Therefore, the technical effects that can be obtained can be referred to the foregoing method embodiments, and details are not described herein again.
  • an embodiment of the present invention provides a computer storage medium for storing computer software instructions for use in the terminal, including a program designed to perform the above aspects.
  • an embodiment of the present invention provides an image detection method, the method comprising: acquiring, by a terminal, an image of a photographed target rectangle; the terminal detecting edges of the image to obtain a plurality of candidate quadrilaterals; and the terminal is among the plurality of candidate quadrilaterals
  • the image detection method provided by the embodiment of the present invention takes into account interference from lines inside or outside the rectangle during document correction: a plurality of candidate quadrilaterals are detected from the original image, and the candidate quadrilateral that matches the target quadrilateral projected by the target rectangle in the image is determined as the actual quadrilateral projected by the target rectangle in the image, so that interference inside and outside the rectangle can be accurately eliminated.
  • this embodiment does not require preset data, so it is the simplest to implement compared with the image detection methods described above.
  • the internal angles are respectively obtained by comparing the difference values; the terminal determines the reliability of matching the first candidate quadrilateral with the target quadrilateral according to the statistical value.
  • the terminal determining, according to the statistical value, the reliability of the first candidate quadrilateral matching the target quadrilateral includes: the terminal determines the difference between a preset value and the statistical value as the reliability of the first candidate quadrilateral matching the target quadrilateral; or the terminal queries a pre-stored correspondence according to the statistical value and determines the reliability of the first candidate quadrilateral matching the target quadrilateral, where the correspondence includes the reliabilities corresponding to a plurality of values.
  • the processing module is specifically configured to: determine the difference between a preset value and the statistical value as the reliability of the first candidate quadrilateral matching the target quadrilateral; or query a pre-stored correspondence according to the statistical value and determine the reliability of the first candidate quadrilateral matching the target quadrilateral, where the correspondence includes the reliabilities corresponding to a plurality of values.
  • the terminal provided by the embodiment of the present invention can be used to perform the foregoing image detection method. Therefore, the technical effects that can be obtained can be referred to the foregoing method embodiments, and details are not described herein again.
  • the processor is specifically configured to: determine the difference between a preset value and the statistical value as the reliability of the first candidate quadrilateral matching the target quadrilateral; or query a pre-stored correspondence according to the statistical value and determine the reliability of the first candidate quadrilateral matching the target quadrilateral, where the correspondence includes the reliabilities corresponding to a plurality of values.
  • the terminal provided by the embodiment of the present invention can be used to perform the foregoing image detection method. Therefore, the technical effects that can be obtained can be referred to the foregoing method embodiments, and details are not described herein again.
  • an embodiment of the present invention provides a computer storage medium for storing computer software instructions for use in the terminal, including a program designed to perform the above aspects.
  • the image detection method and the terminal provided by the embodiment of the present invention take into account interference from lines inside or outside the rectangle during document correction: a plurality of candidate quadrilaterals are detected from the original image, and the candidate quadrilateral that matches the target quadrilateral projected by the target rectangle in the image is determined as the actual quadrilateral projected by the target rectangle in the image, so that interference inside and outside the rectangle can be accurately eliminated.
  • This is an important issue in document correction: if the quadrilateral is detected incorrectly in this step, all subsequent steps operate on the wrong quadrilateral, and the correction result is misled beyond recovery.
  • FIG. 1 is a schematic diagram of a conventional image detection result
  • FIG. 2 is an abstract schematic diagram of rectangular imaging according to an embodiment of the present invention.
  • FIG. 3 is a schematic diagram of a case where the angle ⁇ is not 0 when rectangular imaging is provided according to an embodiment of the present invention
  • FIG. 4 is a schematic structural diagram of hardware of a terminal according to an embodiment of the present disclosure.
  • FIG. 5 is a schematic flowchart of an image detection method according to an embodiment of the present disclosure.
  • FIG. 6 is a schematic diagram of depth information of an image according to an embodiment of the present invention.
  • FIG. 7 is a schematic diagram showing the angle ⁇ of an embodiment of the present invention.
  • FIG. 8 is a schematic diagram showing the angle ⁇ of an embodiment of the present invention.
  • FIG. 9 is a simplified schematic diagram of an angle ⁇ and an angle ⁇ according to an embodiment of the present invention.
  • FIG. 10 is a schematic flowchart of an acquisition process of an angle library according to an embodiment of the present disclosure.
  • FIG. 11 is a schematic diagram of angular rotation according to an embodiment of the present invention.
  • FIG. 12 is a schematic diagram of a candidate quadrilateral 1 according to an embodiment of the present invention.
  • FIG. 13 is a schematic diagram of a candidate quadrilateral 2 according to an embodiment of the present invention.
  • Figure 14 is a conventional document correction result
  • FIG. 15 is a schematic flowchart of another image detecting method according to an embodiment of the present invention.
  • FIG. 16 is a schematic flowchart of still another image detecting method according to an embodiment of the present invention.
  • FIG. 17 is a schematic structural diagram of another terminal according to an embodiment of the present invention.
  • FIG. 2 is an abstract schematic diagram of a rectangular image according to an embodiment of the present invention.
  • the resulting quadrilateral image is located on the image plane c perpendicular to the line of sight.
  • the angle between the image plane c and the object plane w is an ⁇ vector
  • the image of the rectangle on the image plane is a quadrangle
  • the quadrilateral image formed by the rectangle has a one-to-one correspondence in the imaging posture, and it is impossible to form a quadrilateral of other shapes.
  • This posture can be uniquely determined by the angle ⁇ between the image plane c and the object plane w and the angle ⁇ between the side of the rectangle on the object plane w and the line AB of the object plane w and the image plane c. Assuming that one side of the rectangle on the object plane w is P3P4, it can be seen from Fig. 2 that P3P4 is parallel to AB, so the angle ⁇ is 0 and is not indicated. Figure 3 illustrates a case where the angle ⁇ is not zero.
  • a component can be, but is not limited to being, a process running on a processor, a processor, an object, an executable, a thread in execution, a program, and/or a computer.
  • an application running on a computing device and the computing device can be a component.
  • One or more components can reside within a process and/or thread of execution, and a component can be located in a computer and/or distributed between two or more computers. Moreover, these components can execute from various computer readable media having various data structures thereon.
  • These components may communicate by way of local and/or remote processes, for example according to a signal having one or more data packets (e.g., data from one component interacting with another component in a local system or a distributed system, and/or interacting with other systems by way of the signal across a network such as the Internet).
  • the application will present various aspects, embodiments, or features in a system that can include multiple devices, components, modules, and the like. It is to be understood and appreciated that the various systems may include additional devices, components, modules, etc. and/or may not include all of the devices, components, modules, etc. discussed in connection with the figures. In addition, a combination of these schemes can also be used.
  • the word "exemplary" is used to mean an example, an illustration, or a description. Any embodiment or design described as "exemplary" in this application should not be construed as preferred or advantageous over other embodiments or designs. Rather, use of the word is intended to present concepts in a concrete manner.
  • the scenario described in the embodiment of the present invention is intended to illustrate the technical solution of the embodiment more clearly, and does not constitute a limitation on the technical solution provided by the embodiment of the present invention. A person skilled in the art knows that, as new scenarios emerge, the technical solution provided by the embodiment of the present invention is equally applicable to similar technical problems.
  • FIG. 4 is a schematic structural diagram of a hardware structure of a terminal according to an embodiment of the present invention.
  • the terminal 400 includes a processor 401, a camera 402, a display 403, a communication interface 404, a memory 405, and a bus 406.
  • the processor 401, the camera 402, the display 403, the communication interface 404, and the memory 405 are connected to one another via a bus 406.
  • the processor 401 is the control center of the terminal 400. It connects the various parts of the entire terminal 400 via the bus 406, and performs the various functions of the terminal 400 and processes data by running or executing software programs and/or modules stored in the memory 405 and calling data stored in the memory 405, thereby monitoring the terminal 400 as a whole.
  • the processor 401 may include one or more processing units; preferably, the processor 401 may integrate an application processor and a modem processor, where the application processor mainly processes an operating system, a user interface, an application, and the like.
  • the modem processor primarily handles wireless communications. It can be understood that the above modem processor may not be integrated into the processor 401.
  • the camera 402 is used to capture an object and obtain image data of the object.
  • the camera 402 may be a depth camera or a dual camera, which is not specifically limited in this embodiment of the present invention.
  • the camera 402 is further configured to acquire depth data of the image.
  • the display 403 is for displaying an image in which the object is photographed and processed.
  • the communication interface 404 is used to support communication between the terminal and other external devices.
  • the memory 405 can be used to store software programs and modules, and the processor 401 executes various functional applications and data processing of the terminal 400 by running software programs and modules stored in the memory 405.
  • the memory 405 mainly includes a storage program area and a storage data area, wherein the storage program area can store an operating system, an application required for at least one function (such as a photographing function or a document correction function), and the like; the storage data area can store data created through use of the terminal 400 (such as the angle library of preset rectangular postures) and the like.
  • memory 405 can include high speed random access memory, and can also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid state storage device.
  • the bus 406 may be a peripheral component interconnect (PCI) bus or an extended industry standard architecture (EISA) bus.
  • the bus can be divided into an address bus, a data bus, a control bus, and the like. For ease of representation, only one thick line is shown in Figure 4, but it does not mean that there is only one bus or one type of bus.
  • the terminal 400 may also include a radio frequency (RF) circuit, an audio circuit, and/or a plurality of sensors, which are not specifically limited in the embodiment of the present invention.
  • an image detection method according to an embodiment of the present invention includes steps S501-S504:
  • the terminal acquires an image of a shooting target rectangle and depth information in the image.
  • the depth information is used to characterize the distance of the target rectangle from the camera.
  • Figure 6 shows a schematic diagram of the depth information of the image, with a portion of the depth value indicated, the number indicating the distance of the target rectangle from the camera.
  • the sampling points in the schematic diagram of the actual depth information are more dense, and the embodiment of the present invention is only a schematic description, which is not specifically limited.
  • the terminal detects an edge of the image to obtain a plurality of candidate quadrilaterals.
  • the terminal detects the edges of the image to obtain a plurality of edge lines, and the plurality of candidate quadrilaterals are formed from these edge lines.
  • the terminal processes each of the plurality of candidate quadrilaterals in the same way as the operations T1-T3 described below for a first candidate quadrilateral:
  • T1: The terminal determines, according to the depth information, whether the sides of the first candidate quadrilateral are on the same plane in three-dimensional space.
  • the sequence of interior angle values of the target quadrilateral ⁇ i(i 1, 2, 3, 4) ⁇ .
  • the target quadrilateral here specifically refers to the theoretical projection of the target rectangle in the image.
  • the terminal determines a quadrilateral with the highest reliability matching the target quadrilateral among the plurality of candidate quadrilaterals as an actual quadrilateral projected by the target rectangle in the image.
  • the image detection method provided by the embodiment of the present invention takes into account interference from lines inside or outside the rectangle during document correction: a plurality of candidate quadrilaterals are detected from the original image, and the candidate quadrilateral that matches the target quadrilateral projected by the target rectangle in the image is determined as the actual quadrilateral projected by the target rectangle in the image, so that interference inside and outside the rectangle can be accurately eliminated.
  • This is an important issue in document correction: if the quadrilateral is detected incorrectly in this step, all subsequent steps operate on the erroneous quadrilateral, and the correction result is misled beyond recovery.
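The final selection step can be sketched as scoring every candidate against the target interior angle sequence and keeping the best one. This is an illustration only: the mean absolute angle difference used as the statistical value, the preset value of 100, and the names `angle_statistic` and `pick_actual_quadrilateral` are assumptions, since this passage does not fix a particular statistic.

```python
def angle_statistic(candidate_angles, target_angles):
    """Mean absolute difference between corresponding interior angles.

    Both sequences must list the four angles in the same order.
    (The choice of statistic is an assumption.)
    """
    diffs = [abs(c - t) for c, t in zip(candidate_angles, target_angles)]
    return sum(diffs) / len(diffs)

def pick_actual_quadrilateral(candidates, target_angles, preset=100.0):
    """Return (name, reliability) of the best-matching candidate.

    candidates: dict mapping a candidate name to its four interior angles.
    Reliability is preset minus the statistic, so a closer match scores higher.
    """
    scores = {
        name: preset - angle_statistic(angles, target_angles)
        for name, angles in candidates.items()
    }
    best = max(scores, key=scores.get)
    return best, scores[best]
```

A candidate built from an interfering line, such as a table edge, typically has interior angles far from the target sequence and therefore receives a low reliability.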
  • the embodiment of the present invention provides a specific implementation in which the terminal determines, according to the depth information, whether the sides of the first candidate quadrilateral are on the same plane in three-dimensional space, including:
  • the terminal determines the positions of the sides and corners of the first candidate quadrilateral to distinguish the inside and the outside of the first candidate quadrilateral.
  • the terminal starts from a small block area at one side (such as the lower side) or one corner (such as the lower left corner) of the first candidate quadrilateral, and calculates the amount of change of the depth value in the small block area in different directions (such as upward, rightward, and toward the upper right; in FIG. 6 the change is 10 depth units per grid cell in the upward direction).
  • the terminal expands from the small block area outward in different directions, and checks whether the depth change of each extended area in a given direction is consistent with the initial amount of change, until the extended area covers the entire interior of the first candidate quadrilateral. If no inconsistent area appears, the sides of the first candidate quadrilateral are on the same plane; if an inconsistent area appears, they are not. The comparison should take into account that the amount of change also increases as the absolute depth value increases.
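The consistency check described above can be sketched roughly as follows. This is only an illustration under simplifying assumptions: the depth map is a small grid, `inside` is a hypothetical mask marking cells within the candidate quadrilateral, the grid is scanned in full instead of being grown outward from a seed block, and the growth of the change with absolute depth is ignored.

```python
def is_planar(depth, inside, tol=1e-6):
    """Return True if the depth changes by a constant step per cell, both
    rightward and downward, over all cells flagged as inside the candidate."""
    rows, cols = len(depth), len(depth[0])
    ref_right = None  # reference depth change when moving one cell to the right
    ref_down = None   # reference depth change when moving one cell down
    for r in range(rows):
        for c in range(cols):
            if not inside[r][c]:
                continue
            if c + 1 < cols and inside[r][c + 1]:
                step = depth[r][c + 1] - depth[r][c]
                if ref_right is None:
                    ref_right = step
                elif abs(step - ref_right) > tol:
                    return False  # inconsistent region: not one plane
            if r + 1 < rows and inside[r + 1][c]:
                step = depth[r + 1][c] - depth[r][c]
                if ref_down is None:
                    ref_down = step
                elif abs(step - ref_down) > tol:
                    return False
    return True
```

A real implementation would need a tolerance suited to sensor noise and a perspective-aware model of how the depth step grows with distance.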
  • the terminal determines the angle θ1 between two adjacent sides of the quadrilateral, and then determines the remaining angles θ2 and θ3 in the clockwise or counterclockwise direction starting from θ1; the last angle θ4 can be obtained as the interior angle sum of the quadrilateral, 360°, minus the first three interior angles.
  • the angle θ1 may be the angle between the leftmost side and the lowermost side, or the nearest or farthest angle in the depth map, and the like, which is not specifically limited in the embodiment of the present invention.
  • the angle library includes a sequence of interior angle values of the target quadrilateral corresponding to each of the plurality of rectangular postures.
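As a minimal sketch of the interior angle computation described above, assuming the candidate quadrilateral is convex and given as four hypothetical `(x, y)` vertices in clockwise or counterclockwise order, the first three angles can be measured directly and the fourth derived from the 360° angle sum:

```python
import math

def interior_angles(quad):
    """quad: four (x, y) vertices of a convex quadrilateral, in order.
    Returns [theta1, theta2, theta3, theta4] in degrees."""
    angles = []
    for i in range(3):
        px, py = quad[i - 1]          # previous vertex
        vx, vy = quad[i]              # vertex whose angle is measured
        nx, ny = quad[(i + 1) % 4]    # next vertex
        ax, ay = px - vx, py - vy
        bx, by = nx - vx, ny - vy
        cos_t = (ax * bx + ay * by) / (math.hypot(ax, ay) * math.hypot(bx, by))
        cos_t = max(-1.0, min(1.0, cos_t))  # guard against rounding
        angles.append(math.degrees(math.acos(cos_t)))
    # The last angle is the angle sum of a quadrilateral minus the first three.
    angles.append(360.0 - sum(angles))
    return angles
```

For an isosceles trapezoid such as `[(0, 0), (4, 0), (3, 2), (1, 2)]` the two base angles come out equal and each is supplementary to the angle above it.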
  • the sequence of internal angle values in the embodiment of the present invention may be arranged in a clockwise or counterclockwise manner from one direction (for example, starting from the lower left), or may be arranged in a manner from large to small or from small to large. Or, according to the depth information in the image, starting from a corner closest to the camera, and the like, the embodiment of the present invention does not specifically limit this.
  • the rectangular posture parameter of the target rectangle in the embodiment of the present invention may specifically include: the angle α between the object plane where the target rectangle is located and the image plane where the image is located, and the angle β between a side of the target rectangle and the intersection line of the object plane and the image plane.
  • the terminal can determine the rectangular pose parameter of the target rectangle by:
  • the angle α can be decomposed into a magnitude component αv and a direction component αd. Since the picture obtained by the camera is rectangular, it is natural to take the image plane c as the reference and two adjacent sides of the picture as axes, forming a Cartesian coordinate system xOy. The normal vector of the image plane c is v1, and the normal vector of the object plane w is v2.
  • one component is the magnitude: the angle between v1 and v2 equals the angle between the two planes and gives αv. The other component is the direction of the angle: (v2 - v1) gives the vector d, whose direction represents the direction of the angle; projecting d onto the xOy plane yields the vector d', and the angle between d' and an axis such as Oy is the direction component αd.
  • let e be the intersection line of the object plane and the image plane; then d' and e are perpendicular to each other. Therefore, the direction component αd of the angle α between the two planes can also be expressed by the inclination angle of the intersection line e.
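The decomposition of the angle between the object plane and the image plane can be illustrated with a short sketch. It assumes, beyond what the text states, that the image plane is the xOy plane so that its normal v1 is the z axis, and that the direction component is measured from the Oy axis toward Ox; the function name is hypothetical.

```python
import math

def plane_angle_components(v2, v1=(0.0, 0.0, 1.0)):
    """Return (magnitude, direction) in degrees of the angle between the image
    plane (normal v1, assumed along z) and the object plane (normal v2)."""
    n1 = math.sqrt(sum(a * a for a in v1))
    n2 = math.sqrt(sum(a * a for a in v2))
    u1 = tuple(a / n1 for a in v1)
    u2 = tuple(a / n2 for a in v2)
    # Magnitude component: angle between the two unit normals.
    dot = max(-1.0, min(1.0, sum(a * b for a, b in zip(u1, u2))))
    magnitude = math.degrees(math.acos(dot))
    # Direction component: project d = v2 - v1 onto xOy (drop z) and
    # measure its angle from the Oy axis.
    dx, dy = u2[0] - u1[0], u2[1] - u1[1]
    direction = math.degrees(math.atan2(dx, dy))
    return magnitude, direction

def intersection_direction(v2):
    """Direction of the intersection line e, computed as v1 x v2 with v1 = z."""
    return (-v2[1], v2[0])
```

With v1 along z, the projection of d is `(v2[0], v2[1])` and e is `(-v2[1], v2[0])`, so their dot product is zero, which matches the statement that d' and e are perpendicular.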
  • the factor that affects the projected shape of the target rectangle in the image plane is the angle β between a side of the target rectangle and the intersection line of the object plane and the image plane. That is, once the relative inclination of the object plane and the image plane and the position of the target rectangle relative to these planes are determined, translating or rotating the image plane does not affect the interior angle values of the projection of the rectangle.
  • the angle β is projected onto the image plane to become β'; the two angles are monotonically related but not equal, that is, β ≠ β'.
  • β' equals the angle between the side of the target rectangle and the edge of the image minus the angle between the intersection line of the object plane and the image plane and the edge of the image, the latter being the direction component αd.
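  • once the angle α between the object plane and the image plane and the angle β between one side of the rectangle and the intersection line of the two planes are determined, the shape of the quadrilateral image formed by the rectangle, that is, the interior angles θi (i = 1, 2, 3, 4) of the quadrilateral, is uniquely determined.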
  • the rectangular posture parameter of the target rectangle in the embodiment of the present invention may specifically include: the larger angle γ of the angles between the two pairs of opposite sides of the first candidate quadrilateral. That is, the embodiment of the present invention can simplify the angle α and the angle β in the above embodiment to the larger of the angles between the two pairs of opposite sides of the quadrilateral, so the implementation is relatively simple.
  • the angle α and the angle β in the above embodiment can be simplified as γ1 or γ2 in FIG.
  • This angle is positively related to the angle α and less affected by β, so it can be used as a simple approximation.
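A minimal sketch of this simplified posture parameter, assuming the candidate quadrilateral is given as four hypothetical `(x, y)` vertices in order: the angle between each pair of opposite sides is measured as the acute angle between their supporting lines, and the larger of the two is returned.

```python
import math

def line_angle(u, v):
    """Acute angle in degrees between the lines carrying vectors u and v."""
    cross = u[0] * v[1] - u[1] * v[0]
    dot = u[0] * v[0] + u[1] * v[1]
    return math.degrees(math.atan2(abs(cross), abs(dot)))

def larger_opposite_side_angle(quad):
    """Return the larger of the angles between the two pairs of opposite sides."""
    sides = [
        (quad[(i + 1) % 4][0] - quad[i][0], quad[(i + 1) % 4][1] - quad[i][1])
        for i in range(4)
    ]
    return max(line_angle(sides[0], sides[2]), line_angle(sides[1], sides[3]))
```

For a true rectangle seen head-on both pairs of opposite sides are parallel and the value is 0°; the value grows as the projection becomes more trapezoidal.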
  • the angle library in the embodiment of the present invention is obtained and stored in advance by the terminal, where the terminal can obtain the angle library in the following two ways:
  • the terminal respectively detects a target quadrilateral corresponding to each rectangular posture of the plurality of rectangular postures, and calculates a sequence of internal angle values of the target quadrilateral corresponding to each rectangular posture to obtain the angle library.
  • the terminal receives the angle library sent by an angle library acquisition device.
  • the angle library acquisition device can obtain the angle library in the same manner as the terminal does, and details are not described herein.
  • the case in which the terminal obtains the angle library is taken as an example for description.
  • the terminal can obtain the angle library as follows:
  • step S1005 is performed;
  • step S1003 is performed.
  • step S1002 is performed.
  • the internal angle value sequence of the target quadrilateral corresponding to each rectangular posture may be stored in the memory shown in FIG. 4, and the storage format may be a database or a common file, which is not specifically limited in the embodiment of the present invention.
  • the storage of the sequence of interior angle values of the target quadrilateral corresponding to each rectangular posture can be as shown in Table 1.
  • Table 1: sequence of interior angle values of the target quadrilateral corresponding to each rectangular posture.
  • the value of the fourth interior angle can be obtained by subtracting the values of the first three interior angles in Table 1 from the interior angle sum of the quadrilateral, 360°.
  • the sequence of the inner angle values of the target quadrilateral corresponding to each rectangular posture stored in the memory may also include the values of all the four inner corners, which is not specifically limited in the embodiment of the present invention.
  • the initialization in step S1001 and step S1002 in this example starts from 0°, and the angle increment in step S1004 and step S1005 ends at 90°.
  • whatever the initialization, as long as all possible values of the angle α and the angle β are enumerated, all scenes in which the camera is tilted can be covered.
  • the angle α and the angle β both range from 0° to 90°; together with mirror symmetry, this covers all postures in the shooting scene.
  • since the angles are continuous, it is impossible to enumerate all cases of α and β.
  • the angles can be discretized and approximated with a small interval (that is, the step size described above). The smaller the interval, the higher the accuracy. For example, an interval between 0.1° and 10° can be used; an interval of 1° is typical and gives relatively high accuracy.
  • the interior angle values in this example can be obtained by using an actual rectangular flat plate that is clearly distinguishable from the background and rotating the rectangle or moving the camera step by step, as shown in FIG. 11;
  • alternatively, they can be obtained by stepping the angles of a virtual rectangle in 3D software, which is not specifically limited in the embodiment of the present invention.
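The virtual-rectangle route can be sketched as below. Everything about the camera is an assumption of this sketch and is not taken from the text: a pinhole model with focal length `focal`, the rectangle centred on the optical axis at `distance`, the object plane tilted about the x axis (so the intersection line with the image plane is parallel to x), and the 90° endpoint skipped because an edge-on rectangle projects to a degenerate quadrilateral.

```python
import math

def interior_angles(quad):
    """Interior angles (degrees) of a convex quadrilateral given in order."""
    out = []
    for i in range(4):
        px, py = quad[i - 1]
        vx, vy = quad[i]
        nx, ny = quad[(i + 1) % 4]
        ax, ay, bx, by = px - vx, py - vy, nx - vx, ny - vy
        c = (ax * bx + ay * by) / (math.hypot(ax, ay) * math.hypot(bx, by))
        out.append(math.degrees(math.acos(max(-1.0, min(1.0, c)))))
    return out

def project_rectangle(alpha_deg, beta_deg, width=1.0, height=1.0,
                      distance=5.0, focal=1.0):
    """Project a rectangle whose plane is tilted by alpha and whose side makes
    the angle beta with the intersection line of object and image planes."""
    a, b = math.radians(alpha_deg), math.radians(beta_deg)
    corners = [(-width / 2, -height / 2), (width / 2, -height / 2),
               (width / 2, height / 2), (-width / 2, height / 2)]
    projected = []
    for x, y in corners:
        xr = x * math.cos(b) - y * math.sin(b)   # rotate within the object plane
        yr = x * math.sin(b) + y * math.cos(b)
        z = distance + yr * math.sin(a)          # tilt the plane about the x axis
        projected.append((focal * xr / z, focal * yr * math.cos(a) / z))
    return projected

def build_angle_library(step=10):
    """Map each discretized (alpha, beta) posture to its interior angle sequence."""
    return {
        (alpha, beta): interior_angles(project_rectangle(alpha, beta))
        for alpha in range(0, 90, step)
        for beta in range(0, 90, step)
    }
```

A smaller `step` gives a denser library at the cost of more entries, which mirrors the accuracy trade-off described above.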
  • Specifically, in T3 of step S503 in the embodiment of the present invention:
  • the reliability of the quadrilateral matching the target quadrilateral may specifically include T31 and T32:
  • the terminal computes the differences between the four interior angles of the first candidate quadrilateral and the corresponding interior angles of the target quadrilateral, and then computes a statistical value from the four differences.
  • the terminal determines, according to the statistical value, a reliability that the first candidate quadrilateral matches the target quadrilateral.
  • the statistical value may be the mean of the absolute values of the four differences, or their variance, or their standard deviation, or the sum of the squares of the four differences, or the square root of that sum of squares;
  • the embodiment of the present invention does not specifically limit this.
  • the terminal determines, according to the statistic value, the credibility of the matching of the first candidate quadrilateral with the target quadrilateral, which may include:
  • the terminal determines the difference between the preset value and the statistical value as the reliability of the first candidate quadrilateral matching the target quadrilateral.
  • the reliability of the first candidate quadrilateral matching the target quadrilateral can be determined.
  • the terminal determines, according to the statistic value, the credibility of the matching between the first candidate quadrilateral and the target quadrilateral, and specifically includes:
  • the terminal queries the pre-stored correspondence according to the statistic value, and determines the credibility of the first candidate quadrilateral to match the target quadrilateral, where the correspondence includes the credibility corresponding to the plurality of values.
  • Table 2 merely shows exemplary data for the credibility corresponding to a set of values.
  • the numerical correspondence between the statistical value and the credibility may take other forms, which is not specifically limited.
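Both ways of turning the statistical value into a credibility can be sketched as follows. The mean absolute difference is one of the statistics listed above; the preset value of 100 and the threshold table are hypothetical illustration values and are not the contents of Table 2.

```python
import bisect

def angle_statistic(theta, psi):
    """Mean of the absolute differences between corresponding interior angles."""
    return sum(abs(t - p) for t, p in zip(theta, psi)) / 4.0

def credibility_from_preset(stat, preset=100.0):
    """Credibility as the difference between a preset value and the statistic."""
    return preset - stat

# Hypothetical correspondence: statistic below 1 -> 100, below 5 -> 90, and so on.
THRESHOLDS = [1.0, 5.0, 10.0, 20.0]
CREDIBILITY = [100, 90, 70, 40, 0]

def credibility_from_table(stat):
    """Credibility looked up from a prestored correspondence."""
    return CREDIBILITY[bisect.bisect_right(THRESHOLDS, stat)]
```

For a candidate with angles `[88, 92, 85, 95]` against a target of four right angles, the statistic is 3.5, so the preset variant gives 96.5 and the table variant gives 90.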
  • the method may further include: the terminal determines the distance d between the target rectangle and the camera; the terminal matches a prestored database according to the distance d, and determines the angle library corresponding to d as the prestored angle library.
  • when the distance d between the target rectangle and the camera takes different values, the vertex angle values of the target rectangle in the same posture may change slightly. For example, when the target rectangle moves away from the camera, an angle greater than 90° in the image plane becomes larger, an angle smaller than 90° becomes smaller, and an angle equal to 90° remains unchanged.
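One simple way to match the database by distance is a nearest-key lookup, sketched below. The text does not say how the match is performed, so both the nearest-neighbour rule and the `libraries` structure (a dict from reference distance to angle library) are assumptions of this sketch.

```python
def select_angle_library(libraries, d):
    """Return the angle library whose reference distance is closest to d.

    libraries: dict mapping a reference distance to the angle library
    that was built at that distance (hypothetical structure).
    """
    nearest = min(libraries, key=lambda ref: abs(ref - d))
    return libraries[nearest]
```

With libraries stored for 0.5, 1.0 and 2.0 distance units, a measured distance of 1.2 would select the library built at 1.0.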
  • the image detection method provided by the embodiment of the present invention takes into account, during document correction, interference from interference lines inside or outside the rectangle: a plurality of candidate quadrilaterals are detected from the original image, and the candidate quadrilateral with the highest reliability of matching the target quadrilateral projected by the target rectangle in the image is determined as the actual quadrilateral projected by the target rectangle in the image, so that interference inside and outside the rectangle can be accurately eliminated.
  • the candidate quadrilateral 1 (the quadrilateral formed by the line corresponding to the thick line) and the candidate quadrilateral 2 (the quadrilateral formed by the line corresponding to the thick line) detected by the terminal are as shown in FIGS. 12 and 13, respectively, and the candidate quadrilateral 1 and the candidate are respectively shown.
  • the angle library may further include the ratio λ of the projections of unit lengths of two adjacent sides corresponding to each of the plurality of rectangular postures.
  • the image detecting method provided by the embodiment of the present invention may further include:
  • the terminal matches the angle library according to the rectangular posture parameter of the target rectangle, and determines the ratio λ1 of the projections of unit lengths of the two adjacent sides corresponding to the target rectangle.
  • the method may further include:
  • the terminal determines the true ratio of the adjacent sides of the actual quadrilateral according to the ratio λ1 and the projection ratio of the adjacent sides of the actual quadrilateral. Furthermore, the terminal acquires and outputs the target rectangle according to the true ratio of the lengths of the adjacent sides of the target quadrilateral.
  • the lengths of the lower left and the lower right side of the actual quadrilateral projected in the image are {0.9, 1.5}, respectively, according to the stored The ratio
  • the image detection method provided by the embodiment of the present invention can not only find the actual quadrilateral corresponding to the target rectangle, but also obtain the true ratio of width to height; the terminal can then derive the projection transformation matrix from the proportions of the real rectangle, thereby restoring the quadrilateral to the real rectangle.
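The ratio recovery can be illustrated with a small worked sketch. It assumes the stored ratio is defined as the projected unit length of the first side divided by that of the second side, so that the projected side ratio equals the true ratio multiplied by the stored ratio; the text does not spell out this convention, and the value 0.5 used below is purely hypothetical.

```python
def true_side_ratio(projected_a, projected_b, unit_projection_ratio):
    """Recover the true length ratio a/b of two adjacent sides.

    projected_a, projected_b: measured lengths of the two sides in the image.
    unit_projection_ratio: stored ratio of the projections of unit lengths of
    side a to side b for the matched posture (assumed convention).
    """
    return (projected_a / projected_b) / unit_projection_ratio
```

For projected lengths 0.9 and 1.5 and a hypothetical stored ratio of 0.5, the projected ratio is 0.6 and the recovered true ratio is about 1.2.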
  • the terminal determines that the sequence of interior angle values of the target quadrilateral projected by the target rectangle in the image is {90°, 90°, 90°, 90°}. The terminal can then determine, for each detected candidate quadrilateral, the reliability of matching the target quadrilateral, and determine the quadrilateral with the highest matching reliability as the actual quadrilateral projected by the target rectangle in the image.
  • this embodiment does not require preset data, so it is the simplest to implement, but its judgment is also the coarsest, and the aspect ratio of the rectangle may not be estimated accurately.
  • an embodiment of the present invention further provides an image detection method. As shown in FIG. 15, the method includes steps S1501-S1504:
  • the terminal acquires an image of a shooting target rectangle.
  • the terminal detects an edge of the image to obtain a plurality of candidate quadrilaterals.
  • the plurality of candidate quadrilaterals are composed of a plurality of edge lines after the terminal detects the edges of the image to obtain a plurality of edge lines.
  • the terminal processes each candidate quadrilateral of the plurality of candidate quadrilaterals according to the following operations K1-K2 for the first candidate quadrilateral:
  • the terminal determines a quadrilateral with the highest reliability matching the target quadrilateral among the plurality of candidate quadrilaterals as an actual quadrilateral projected by the target rectangle in the image.
  • Specifically, in K2 of step S1503 in the embodiment of the present invention:
  • determining the credibility of the first candidate quadrilateral matching the target quadrilateral may specifically include:
  • the terminal computes the differences between the four interior angles of the first candidate quadrilateral and the corresponding interior angles of the target quadrilateral, and then computes a statistical value from the four differences.
  • K22 The terminal determines, according to the statistical value, the reliability of matching the first candidate quadrilateral with the target quadrilateral.
  • the statistical value may be the mean of the absolute values of the four differences, or their variance, or their standard deviation, or the sum of the squares of the four differences, or the square root of that sum of squares;
  • the embodiment of the present invention does not specifically limit this.
  • step K22 For the specific implementation of the step K22, reference may be made to the embodiment shown in FIG. 5, and details are not described herein again.
  • based on the image detection method provided by the embodiment of the present invention, interference from interference lines inside or outside the rectangle is taken into account during document correction: a plurality of candidate quadrilaterals are detected from the original image, and the candidate quadrilateral with the highest reliability of matching the target quadrilateral projected by the target rectangle in the image is determined as the actual quadrilateral projected by the target rectangle in the image, so that interference inside and outside the rectangle can be accurately eliminated.
  • this embodiment does not require preset data, so the implementation is the simplest with respect to the image detecting method described above.
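The preset-free variant can be put together in a few lines. This is a sketch under assumptions: candidates are convex quadrilaterals given as four hypothetical `(x, y)` vertices in order, the statistic is the mean absolute difference from 90°, and the credibility is a preset value (180 here, chosen arbitrarily) minus that statistic.

```python
import math

def interior_angles(quad):
    """Interior angles (degrees) of a convex quadrilateral given in order."""
    out = []
    for i in range(4):
        px, py = quad[i - 1]
        vx, vy = quad[i]
        nx, ny = quad[(i + 1) % 4]
        ax, ay, bx, by = px - vx, py - vy, nx - vx, ny - vy
        c = (ax * bx + ay * by) / (math.hypot(ax, ay) * math.hypot(bx, by))
        out.append(math.degrees(math.acos(max(-1.0, min(1.0, c)))))
    return out

def pick_actual_quadrilateral(candidates, preset=180.0):
    """Return (best_candidate, credibility) against the target {90, 90, 90, 90}."""
    target = [90.0] * 4
    best, best_score = None, float("-inf")
    for quad in candidates:
        theta = interior_angles(quad)
        stat = sum(abs(t - p) for t, p in zip(theta, target)) / 4.0
        score = preset - stat          # higher score means a closer match
        if score > best_score:
            best, best_score = quad, score
    return best, best_score
```

Given a strongly skewed trapezoid and a nearly rectangular page outline, the page outline receives the higher credibility and is selected.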
  • the terminal includes corresponding hardware structures and/or software modules for performing various functions.
  • the present invention can be implemented in hardware or in a combination of hardware and computer software, in conjunction with the units and algorithm steps of the examples described in the embodiments disclosed herein. Whether a function is implemented in hardware or by computer software driving hardware depends on the specific application and design constraints of the solution. A person skilled in the art can use different methods to implement the described functions for each particular application, but such implementation should not be considered to be beyond the scope of the present invention.
  • the embodiment of the present invention may divide the terminal into function modules according to the foregoing method example.
  • each function module may be divided according to each function, or two or more functions may be integrated into one processing module.
  • the above integrated modules can be implemented in the form of hardware or in the form of software functional modules. It should be noted that the division of the module in the embodiment of the present invention is schematic, and is only a logical function division, and the actual implementation may have another division manner.
  • FIG. 17 shows a possible structural diagram of the terminal involved in the above embodiment.
  • the terminal 1700 includes: a camera module 1701 and a processing module 1702.
  • the camera module 1701 is configured to support the terminal 1700 to perform step S501 in FIG. 5, and the processing module 1702 is configured to support the terminal 1700 to perform steps S502-S504 in FIG. 5; or the camera module 1701 is configured to support the terminal 1700 to perform FIG. 15 and FIG.
  • the processing module 1702 is configured to support the terminal 1700 to perform steps S1502-S1504 in FIGS. 15 and 16.
  • the terminal 1700 may further include a display module 1703 and a communication module 1704.
  • the display module 1703 is configured to support the terminal 1700 to display a target rectangle
  • the communication module 1704 is configured to support communication between the terminal and other external devices, for example, the above-mentioned angle library acquisition device.
  • the terminal 1700 may further include a storage module 1705 for storing program codes and data of the terminal, which is not specifically limited in this embodiment of the present invention.
  • the camera module 1701 can be the camera 402 in FIG. 4 .
  • the processing module 1702 may be a processor or a controller, for example, the processor 401 in FIG. 4, or a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA) or another programmable logic device, a transistor logic device, a hardware component, or any combination thereof. It can implement or execute the various illustrative logical blocks, modules, and circuits described in connection with the present disclosure.
  • the processor can also be a combination of computing functions, for example, including one or more microprocessor combinations, a combination of a DSP and a microprocessor, and the like.
  • Display module 1703 can be display 403 in FIG.
  • the communication module 1704 may be the communication interface 404 in FIG. 4, or may be a transceiver or the like.
  • the storage module 1705 can be the memory 405 of FIG.
  • when the processing module 1702 is a processor, the display module 1703 is a display, the communication module 1704 is a communication interface, and the storage module 1705 is a memory, the terminal involved in the embodiment of the present invention may be the terminal shown in FIG. 4. For details, refer to the related description of FIG. 4; details are not described herein again.
  • in the embodiment of the present invention, interference from interference lines inside or outside the rectangle is taken into account during document correction: a plurality of candidate quadrilaterals are detected from the original image, and the candidate quadrilateral with the highest reliability of matching the target quadrilateral projected by the target rectangle in the image is determined as the actual quadrilateral projected by the target rectangle in the image, so that interference inside and outside the rectangle can be accurately eliminated.
  • This is an important issue in document correction: if the quadrilateral is detected incorrectly in this step, all subsequent steps are based on the erroneous quadrilateral, which misleads the correction result irrecoverably.
  • the steps of a method or algorithm described in connection with the present disclosure may be implemented in hardware, or may be implemented by a processor executing software instructions.
  • the software instructions may be composed of corresponding software modules, which may be stored in random access memory (RAM), flash memory, read-only memory (ROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), registers, a hard disk, a removable hard disk, a compact disc read-only memory (CD-ROM), or any other form of storage medium known in the art.
  • An exemplary storage medium is coupled to the processor to enable the processor to read information from, and write information to, the storage medium.
  • the storage medium can also be an integral part of the processor.
  • the processor and the storage medium can be located in an ASIC. Additionally, the ASIC can be located in a core network interface device.
  • the processor and the storage medium may also exist as discrete components in the core network interface device.
  • the functions described herein can be implemented in hardware, software, firmware, or any combination thereof.
  • the functions may be stored in a computer readable medium or transmitted as one or more instructions or code on a computer readable medium.
  • Computer readable media includes both computer storage media and communication media including any medium that facilitates transfer of a computer program from one location to another.
  • a storage medium may be any available media that can be accessed by a general purpose or special purpose computer.

Abstract

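Embodiments of the present invention provide an image detection method and a terminal, to at least resolve the high rectangle misjudgment rate in existing document correction. The method includes: a terminal obtains an image captured of a target rectangle and depth information in the image; the terminal detects edges of the image to obtain a plurality of candidate quadrilaterals; the terminal determines, according to the depth information, whether the sides of a first candidate quadrilateral lie in a same plane in three-dimensional space; if so, the terminal determines the interior angle value sequence {θi (i = 1, 2, 3, 4)} of the first candidate quadrilateral and the interior angle value sequence {ψi (i = 1, 2, 3, 4)} of the target quadrilateral projected by the target rectangle in the image, and determines, according to the sequence {θi (i = 1, 2, 3, 4)} and the sequence {ψi (i = 1, 2, 3, 4)}, the credibility with which the first candidate quadrilateral matches the target quadrilateral; and the terminal determines the quadrilateral with the highest credibility of matching the target quadrilateral among the plurality of candidate quadrilaterals as the actual quadrilateral projected by the target rectangle in the image. The present invention is applicable to the field of image detection technologies.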

Description

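Image detection method and terminal
Technical Field
The present invention relates to the field of image detection technologies, and in particular, to an image detection method and a terminal.
Background
In recent years, the photographing functions of mobile phones have developed rapidly, and many new photographing technologies have emerged, such as panoramic photographing, all-focus mode, and document correction. Document correction is in essence a method of image detection, transformation, and processing whose input is an image and whose output is also an image. The term "document" also broadly refers to document-like rectangular images that carry information, such as files, invoices, books, business cards, certificates, handouts, photos, advertisements, display boards, televisions, movies, and screens.
In the prior art, relatively coarse rules are usually used to identify a rectangle during document correction. For example, the angle between opposite sides of the quadrilateral must be less than 30°; or the distance between opposite sides must account for a certain proportion, such as one fifth, of the length or width of the image; or the angle between adjacent sides must be close to perpendicular (90°), with an allowed deviation of 30°; or the quadrilateral must be large enough, for example, its perimeter must exceed a certain proportion of the width and height of the picture.
However, because these rules are coarse, they are easily disturbed by interference lines inside or outside the rectangle, such as the desk edge or book edge shown in FIG. 1, which affects the judgment of the rectangular shape and results in a high misjudgment rate.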
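Summary of the Invention
Embodiments of the present invention provide an image detection method and a terminal, to at least resolve the high rectangle misjudgment rate in existing document correction.
To achieve the foregoing objective, the embodiments of the present invention provide the following technical solutions: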
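According to one aspect, an embodiment of the present invention provides an image detection method, including: a terminal obtains an image captured of a target rectangle and depth information in the image, where the depth information represents the distance of the target rectangle from a camera; the terminal detects edges of the image to obtain a plurality of candidate quadrilaterals; the terminal processes each of the plurality of candidate quadrilaterals according to the following operations for a first candidate quadrilateral: the terminal determines, according to the depth information, whether the sides of the first candidate quadrilateral lie in a same plane in three-dimensional space; if they do, the terminal determines the interior angle value sequence {θi (i = 1, 2, 3, 4)} of the first candidate quadrilateral and the interior angle value sequence {ψi (i = 1, 2, 3, 4)} of the target quadrilateral projected by the target rectangle in the image; the terminal determines, according to the sequence {θi (i = 1, 2, 3, 4)} and the sequence {ψi (i = 1, 2, 3, 4)}, the credibility with which the first candidate quadrilateral matches the target quadrilateral; and after the terminal has processed each of the plurality of candidate quadrilaterals according to the foregoing operations, the terminal determines the candidate quadrilateral with the highest credibility of matching the target quadrilateral as the actual quadrilateral projected by the target rectangle in the image.
Based on the image detection method provided by the embodiment of the present invention, interference from interference lines inside or outside the rectangle is taken into account during document correction: a plurality of candidate quadrilaterals are detected from the original image, and the candidate quadrilateral with the highest credibility of matching the target quadrilateral projected by the target rectangle in the image is determined as the actual quadrilateral projected by the target rectangle in the image, so that interference inside and outside the rectangle can be accurately eliminated. This is an important issue in document correction: if the quadrilateral is detected incorrectly in this step, all subsequent steps are based on the erroneous quadrilateral, which misleads the correction result irrecoverably.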
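In a possible design, that the terminal determines the interior angle value sequence {ψi (i = 1, 2, 3, 4)} of the target quadrilateral projected by the target rectangle in the image includes: the terminal determines a rectangular posture parameter of the target rectangle; and the terminal matches a prestored angle library according to the rectangular posture parameter of the target rectangle, and determines the sequence {ψi (i = 1, 2, 3, 4)}, where the angle library contains the interior angle value sequence of the target quadrilateral corresponding to each of a plurality of rectangular postures.
In a possible design, the rectangular posture parameter of the target rectangle includes: the angle α between the object plane where the target rectangle is located and the image plane where the image is located, and the angle β between a side of the target rectangle and the intersection line of the object plane and the image plane.
Once the angle α between the object plane and the image plane and the angle β between a side of the rectangle and the intersection line of the two planes are determined, the shape of the quadrilateral image formed by the rectangle, that is, the interior angles θi (i = 1, 2, 3, 4) of the quadrilateral, is uniquely determined. Therefore, the actual quadrilateral determined by matching the candidate quadrilaterals against the target quadrilateral obtained from the angle library for that rectangular posture is more accurate.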
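In a possible design, the rectangular posture parameter of the target rectangle includes: the larger angle γ of the angles between the two pairs of opposite sides of the first candidate quadrilateral.
Because this method simplifies the angle α and the angle β in the foregoing embodiment to the larger angle γ of the angles between the two pairs of opposite sides of the first candidate quadrilateral, it is relatively simple to implement.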
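In a possible design, the angle library further contains the ratio λ of the projections of unit lengths of two adjacent sides corresponding to each of the plurality of rectangular postures. The method further includes: the terminal matches the angle library according to the rectangular posture parameter of the target rectangle, and determines the ratio λ1 of the projections of unit lengths of the two adjacent sides corresponding to the target rectangle. After the terminal determines the candidate quadrilateral with the highest credibility of matching the target quadrilateral as the actual quadrilateral projected by the target rectangle in the image, the method further includes: the terminal determines the true ratio of the two adjacent sides of the actual quadrilateral according to the ratio λ1 and the projection ratio of the two adjacent sides of the actual quadrilateral; and the terminal obtains and outputs the target rectangle according to the true ratio of the lengths of the two adjacent sides of the target quadrilateral.
With the image detection method provided by the embodiment of the present invention, not only can the actual quadrilateral corresponding to the target rectangle be found, but the true ratio of width to height can also be obtained; the terminal can then derive the projection transformation matrix from the proportions of the real rectangle, thereby restoring the quadrilateral to the real rectangle.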
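In a possible design, before the terminal obtains the image captured of the target rectangle and the depth information in the image, the method further includes: the terminal obtains and stores the angle library.
In a possible design, that the terminal obtains the angle library includes: the terminal receives the angle library sent by an angle library acquisition device; or the terminal separately detects the target quadrilateral corresponding to each of the plurality of rectangular postures, and calculates the interior angle value sequence of the target quadrilateral corresponding to each rectangular posture and the ratio of the projections of unit lengths of two adjacent sides of the target quadrilateral corresponding to each rectangular posture, to obtain the angle library.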
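In a possible design, if the sides of the first candidate quadrilateral lie in a same plane, before the terminal matches the prestored angle library according to the rectangular posture parameter of the target rectangle, the method further includes: the terminal determines the distance d between the target rectangle and the camera; and the terminal matches a prestored database according to the distance d, and determines the angle library corresponding to d as the prestored angle library.
That is, when the distance d between the target rectangle and the camera takes different values, the vertex angle values of the target rectangle in the same posture may change slightly. For example, when the target rectangle moves away from the camera, an angle greater than 90 degrees in the image plane becomes larger, an angle smaller than 90 degrees becomes smaller, and an angle equal to 90 degrees remains unchanged. Determining the interior angle value sequence of the target quadrilateral by matching the angle library selected according to the distance d makes the determined sequence more accurate, and therefore makes the image detection result more accurate.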
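In a possible design, the terminal determines that the interior angle value sequence of the target quadrilateral projected by the target rectangle in the image is {90°, 90°, 90°, 90°}.
Because this embodiment does not require preset data, it is the simplest to implement.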
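In a possible design, that the terminal determines, according to the interior angle value sequence {θi (i = 1, 2, 3, 4)} of the first candidate quadrilateral and the interior angle value sequence {ψi (i = 1, 2, 3, 4)} of the target quadrilateral, the credibility with which the first candidate quadrilateral matches the target quadrilateral includes: the terminal computes, according to the two sequences, the differences between the four interior angles of the first candidate quadrilateral and those of the target quadrilateral, and then computes a statistical value from the differences; and the terminal determines, according to the statistical value, the credibility with which the first candidate quadrilateral matches the target quadrilateral.
In a possible design, that the terminal determines, according to the statistical value, the credibility with which the first candidate quadrilateral matches the target quadrilateral includes: the terminal determines the difference between a preset value and the statistical value as the credibility; or the terminal queries a prestored correspondence according to the statistical value and determines the credibility, where the correspondence includes the credibility corresponding to each of a plurality of values.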
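According to another aspect, an embodiment of the present invention provides a terminal, including a processing module and a camera module. The camera module is configured to obtain an image captured of a target rectangle and depth information in the image, where the depth information represents the distance of the target rectangle from a camera. The processing module is configured to detect edges of the image to obtain a plurality of candidate quadrilaterals. The processing module is further configured to process each of the plurality of candidate quadrilaterals according to the following operations for a first candidate quadrilateral: determine, according to the depth information, whether the sides of the first candidate quadrilateral lie in a same plane in three-dimensional space; if they do, determine the interior angle value sequence {θi (i = 1, 2, 3, 4)} of the first candidate quadrilateral and the interior angle value sequence {ψi (i = 1, 2, 3, 4)} of the target quadrilateral projected by the target rectangle in the image; and determine, according to the sequence {θi (i = 1, 2, 3, 4)} and the sequence {ψi (i = 1, 2, 3, 4)}, the credibility with which the first candidate quadrilateral matches the target quadrilateral. The processing module is further configured to, after each of the plurality of candidate quadrilaterals has been processed according to the foregoing operations, determine the candidate quadrilateral with the highest credibility of matching the target quadrilateral as the actual quadrilateral projected by the target rectangle in the image.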
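In a possible design, the processing module is specifically configured to: determine a rectangular posture parameter of the target rectangle; and match a prestored angle library according to the rectangular posture parameter of the target rectangle to determine the interior angle value sequence {ψi (i = 1, 2, 3, 4)} of the target quadrilateral projected by the target rectangle in the image, where the angle library contains the interior angle value sequence of the target quadrilateral corresponding to each of a plurality of rectangular postures.
In a possible design, the rectangular posture parameter of the target rectangle includes: the angle α between the object plane where the target rectangle is located and the image plane where the image is located, and the angle β between a side of the target rectangle and the intersection line of the object plane and the image plane.
In a possible design, the rectangular posture parameter of the target rectangle includes: the larger angle γ of the angles between the two pairs of opposite sides of the first candidate quadrilateral.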
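In a possible design, the terminal further includes a display module, and the angle library further contains the ratio λ of the projections of unit lengths of two adjacent sides corresponding to each of the plurality of rectangular postures. The processing module is further configured to match the angle library according to the rectangular posture parameter of the target rectangle and determine the ratio λ1 of the projections of unit lengths of the two adjacent sides corresponding to the target rectangle. The processing module is further configured to, after determining the candidate quadrilateral with the highest credibility of matching the target quadrilateral as the actual quadrilateral projected by the target rectangle in the image, determine the true ratio of the two adjacent sides of the actual quadrilateral according to the ratio λ1 and the projection ratio of the two adjacent sides of the actual quadrilateral. The processing module is further configured to obtain the target rectangle according to the true ratio of the lengths of the two adjacent sides of the target quadrilateral. The display module is configured to display the target rectangle.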
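In a possible design, the terminal further includes a storage module. The processing module is further configured to obtain the angle library before the image captured of the target rectangle and the depth information in the image are obtained, and the storage module is configured to store the angle library.
In a possible design, the terminal further includes a communication module. The processing module is specifically configured to receive, through the communication module, the angle library sent by an angle library acquisition device; or the processing module is specifically configured to separately detect the target quadrilateral corresponding to each of the plurality of rectangular postures, and calculate the interior angle value sequence of the target quadrilateral corresponding to each rectangular posture and the ratio of the projections of unit lengths of two adjacent sides of the target quadrilateral corresponding to each rectangular posture, to obtain the angle library.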
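In a possible design, the processing module is further configured to, if the sides of the first candidate quadrilateral lie in a same plane, determine the distance d between the target rectangle and the camera before matching the prestored angle library according to the rectangular posture parameter of the target rectangle. The processing module is further configured to match a prestored database according to the distance d and determine the angle library corresponding to d as the prestored angle library.
In a possible design, the processing module determines that the interior angle value sequence of the target quadrilateral is {90°, 90°, 90°, 90°}.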
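In a possible design, the processing module is specifically configured to: compute, according to the interior angle value sequence {θi (i = 1, 2, 3, 4)} of the first candidate quadrilateral and the interior angle value sequence {ψi (i = 1, 2, 3, 4)} of the target quadrilateral, the differences between the four interior angles of the first candidate quadrilateral and those of the target quadrilateral, and then compute a statistical value from the differences; and determine, according to the statistical value, the credibility with which the first candidate quadrilateral matches the target quadrilateral.
In a possible design, the processing module is specifically configured to: determine the difference between a preset value and the statistical value as the credibility with which the first candidate quadrilateral matches the target quadrilateral; or query a prestored correspondence according to the statistical value and determine the credibility, where the correspondence includes the credibility corresponding to each of a plurality of values.
Because the terminal provided by the embodiment of the present invention can be used to perform the foregoing image detection method, for the technical effects it can achieve, refer to the foregoing method embodiment; details are not described herein again.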
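According to still another aspect, an embodiment of the present invention provides a terminal, including a processor and a camera. The camera is configured to obtain an image captured of a target rectangle and depth information in the image, where the depth information represents the distance of the target rectangle from the camera. The processor is configured to detect edges of the image to obtain a plurality of candidate quadrilaterals. The processor is further configured to process each of the plurality of candidate quadrilaterals according to the following operations for a first candidate quadrilateral: determine, according to the depth information, whether the sides of the first candidate quadrilateral lie in a same plane in three-dimensional space; if they do, determine the interior angle value sequence {θi (i = 1, 2, 3, 4)} of the first candidate quadrilateral and the interior angle value sequence {ψi (i = 1, 2, 3, 4)} of the target quadrilateral projected by the target rectangle in the image; and determine, according to the sequence {θi (i = 1, 2, 3, 4)} and the sequence {ψi (i = 1, 2, 3, 4)}, the credibility with which the first candidate quadrilateral matches the target quadrilateral. The processor is further configured to, after each of the plurality of candidate quadrilaterals has been processed according to the foregoing operations, determine the candidate quadrilateral with the highest credibility of matching the target quadrilateral as the actual quadrilateral projected by the target rectangle in the image.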
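In a possible design, the processor is specifically configured to: determine a rectangular posture parameter of the target rectangle; and match a prestored angle library according to the rectangular posture parameter of the target rectangle to determine the interior angle value sequence {ψi (i = 1, 2, 3, 4)} of the target quadrilateral projected by the target rectangle in the image, where the angle library contains the interior angle value sequence of the target quadrilateral corresponding to each of a plurality of rectangular postures.
In a possible design, the rectangular posture parameter of the target rectangle includes: the angle α between the object plane where the target rectangle is located and the image plane where the image is located, and the angle β between a side of the target rectangle and the intersection line of the object plane and the image plane.
In a possible design, the rectangular posture parameter of the target rectangle includes: the larger angle γ of the angles between the two pairs of opposite sides of the first candidate quadrilateral.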
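In a possible design, the terminal further includes a display, and the angle library further contains the ratio λ of the projections of unit lengths of two adjacent sides corresponding to each of the plurality of rectangular postures. The processor is further configured to match the angle library according to the rectangular posture parameter of the target rectangle and determine the ratio λ1 of the projections of unit lengths of the two adjacent sides corresponding to the target rectangle. The processor is further configured to, after determining the candidate quadrilateral with the highest credibility of matching the target quadrilateral as the actual quadrilateral projected by the target rectangle in the image, determine the true ratio of the two adjacent sides of the actual quadrilateral according to the ratio λ1 and the projection ratio of the two adjacent sides of the actual quadrilateral. The processor is further configured to obtain the target rectangle according to the true ratio of the lengths of the two adjacent sides of the target quadrilateral. The display is configured to display the target rectangle.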
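In a possible design, the terminal further includes a memory. The processor is further configured to obtain the angle library before the image captured of the target rectangle and the depth information in the image are obtained, and the memory is configured to store the angle library.
In a possible design, the terminal further includes a communication interface. The processor is specifically configured to receive, through the communication interface, the angle library sent by an angle library acquisition device; or the processor is specifically configured to separately detect the target quadrilateral corresponding to each of the plurality of rectangular postures, and calculate the interior angle value sequence of the target quadrilateral corresponding to each rectangular posture and the ratio of the projections of unit lengths of two adjacent sides of the target quadrilateral corresponding to each rectangular posture, to obtain the angle library.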
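In a possible design, the processor is further configured to, if the sides of the first candidate quadrilateral lie in a same plane, determine the distance d between the target rectangle and the camera before matching the prestored angle library according to the rectangular posture parameter of the target rectangle. The processor is further configured to match a prestored database according to the distance d and determine the angle library corresponding to d as the prestored angle library.
In a possible design, the processor determines that the interior angle value sequence of the target quadrilateral is {90°, 90°, 90°, 90°}.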
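In a possible design, the processor is specifically configured to: compute, according to the interior angle value sequence {θi (i = 1, 2, 3, 4)} of the first candidate quadrilateral and the interior angle value sequence {ψi (i = 1, 2, 3, 4)} of the target quadrilateral, the differences between the four interior angles of the first candidate quadrilateral and those of the target quadrilateral, and then compute a statistical value from the differences; and determine, according to the statistical value, the credibility with which the first candidate quadrilateral matches the target quadrilateral.
In a possible design, the processor is specifically configured to: determine the difference between a preset value and the statistical value as the credibility with which the first candidate quadrilateral matches the target quadrilateral; or query a prestored correspondence according to the statistical value and determine the credibility, where the correspondence includes the credibility corresponding to each of a plurality of values.
Because the terminal provided by the embodiment of the present invention can be used to perform the foregoing image detection method, for the technical effects it can achieve, refer to the foregoing method embodiment; details are not described herein again.
According to still another aspect, an embodiment of the present invention provides a computer storage medium, configured to store computer software instructions used by the foregoing terminal, including a program designed to perform the foregoing aspects.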
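According to still another aspect, an embodiment of the present invention provides an image detection method, including: a terminal obtains an image captured of a target rectangle; the terminal detects edges of the image to obtain a plurality of candidate quadrilaterals; the terminal processes each of the plurality of candidate quadrilaterals according to the following operations for a first candidate quadrilateral: the terminal determines the interior angle value sequence {θi (i = 1, 2, 3, 4)} of the first candidate quadrilateral; the terminal determines, according to the sequence {θi (i = 1, 2, 3, 4)} and the interior angle value sequence {90°, 90°, 90°, 90°} of the target quadrilateral projected by the target rectangle in the image, the credibility with which the first candidate quadrilateral matches the target quadrilateral; and after the terminal has processed each of the plurality of candidate quadrilaterals according to the foregoing operations, the terminal determines the candidate quadrilateral with the highest credibility of matching the target quadrilateral as the actual quadrilateral projected by the target rectangle in the image.
Based on the image detection method provided by the embodiment of the present invention, interference from interference lines inside or outside the rectangle is taken into account during document correction: a plurality of candidate quadrilaterals are detected from the original image, and the candidate quadrilateral with the highest credibility of matching the target quadrilateral projected by the target rectangle in the image is determined as the actual quadrilateral projected by the target rectangle in the image, so that interference inside and outside the rectangle can be accurately eliminated. In addition, this embodiment does not require preset data, so it is the simplest to implement compared with the foregoing image detection method.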
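In a possible design, that the terminal determines, according to the interior angle value sequence {θi (i = 1, 2, 3, 4)} of the first candidate quadrilateral and the interior angle value sequence {90°, 90°, 90°, 90°} of the target quadrilateral projected by the target rectangle in the image, the credibility with which the first candidate quadrilateral matches the target quadrilateral includes: the terminal computes, according to the two sequences, the differences between the four interior angles of the first candidate quadrilateral and those of the target quadrilateral, and then computes a statistical value from the differences; and the terminal determines, according to the statistical value, the credibility with which the first candidate quadrilateral matches the target quadrilateral.
In a possible design, that the terminal determines, according to the statistical value, the credibility with which the first candidate quadrilateral matches the target quadrilateral includes: the terminal determines the difference between a preset value and the statistical value as the credibility; or the terminal queries a prestored correspondence according to the statistical value and determines the credibility, where the correspondence includes the credibility corresponding to each of a plurality of values.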
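In yet another aspect, an embodiment of the invention provides a terminal including a camera module and a processing module. The camera module is configured to obtain an image in which a target rectangle is photographed. The processing module is configured to detect edges in the image to obtain multiple candidate quadrilaterals. The processing module is further configured to process each of the candidate quadrilaterals according to the following operations described for a first candidate quadrilateral: determine the interior angle sequence {θi (i=1,2,3,4)} of the first candidate quadrilateral; and determine the confidence that the first candidate quadrilateral matches the target quadrilateral according to {θi (i=1,2,3,4)} and the interior angle sequence {90°, 90°, 90°, 90°} of the target quadrilateral projected by the target rectangle in the image. The processing module is further configured to, after every candidate quadrilateral has been processed in this way, determine the candidate quadrilateral with the highest confidence of matching the target quadrilateral as the actual quadrilateral projected by the target rectangle in the image.
In a possible design, the processing module is specifically configured to: according to the interior angle sequence {θi (i=1,2,3,4)} of the first candidate quadrilateral and the interior angle sequence {90°, 90°, 90°, 90°} of the target quadrilateral, take the differences between the four corresponding interior angles and compute a statistic of those differences; and determine, according to the statistic, the confidence that the first candidate quadrilateral matches the target quadrilateral.
In a possible design, the processing module is specifically configured to: take the difference between a preset value and the statistic as the confidence that the first candidate quadrilateral matches the target quadrilateral; or look up a pre-stored correspondence according to the statistic to determine that confidence, where the correspondence contains the confidence corresponding to each of multiple values.
Because the terminal provided in this embodiment of the invention can be used to perform the image detection method described above, for the technical effects it achieves, refer to the foregoing method embodiments; details are not repeated here.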
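In yet another aspect, an embodiment of the invention provides a terminal including a camera and a processor. The camera is configured to obtain an image in which a target rectangle is photographed. The processor is configured to detect edges in the image to obtain multiple candidate quadrilaterals. The processor is further configured to process each of the candidate quadrilaterals according to the following operations described for a first candidate quadrilateral: determine the interior angle sequence {θi (i=1,2,3,4)} of the first candidate quadrilateral; and determine the confidence that the first candidate quadrilateral matches the target quadrilateral according to {θi (i=1,2,3,4)} and the interior angle sequence {90°, 90°, 90°, 90°} of the target quadrilateral projected by the target rectangle in the image. The processor is further configured to, after every candidate quadrilateral has been processed in this way, determine the candidate quadrilateral with the highest confidence of matching the target quadrilateral as the actual quadrilateral projected by the target rectangle in the image.
In a possible design, the processor is specifically configured to: according to the interior angle sequence {θi (i=1,2,3,4)} of the first candidate quadrilateral and the interior angle sequence {90°, 90°, 90°, 90°} of the target quadrilateral, take the differences between the four corresponding interior angles and compute a statistic of those differences; and determine, according to the statistic, the confidence that the first candidate quadrilateral matches the target quadrilateral.
In a possible design, the processor is specifically configured to: take the difference between a preset value and the statistic as the confidence that the first candidate quadrilateral matches the target quadrilateral; or look up a pre-stored correspondence according to the statistic to determine that confidence, where the correspondence contains the confidence corresponding to each of multiple values.
Because the terminal provided in this embodiment of the invention can be used to perform the image detection method described above, for the technical effects it achieves, refer to the foregoing method embodiments; details are not repeated here.
In yet another aspect, an embodiment of the invention provides a computer storage medium for storing computer software instructions used by the foregoing terminal, including a program designed to perform the foregoing aspects.
In summary, with the image detection method and terminal provided in the embodiments of the invention, interference from lines inside or outside the rectangle is taken into account during document correction: multiple candidate quadrilaterals are detected from the original image, and the one with the highest confidence of matching the target quadrilateral projected by the target rectangle in the image is determined as the actual quadrilateral projected by the target rectangle. Interference from inside and outside the rectangle can therefore be eliminated accurately. This matters in document correction because, if the quadrilateral is detected incorrectly at this step, all later steps operate on the wrong quadrilateral and the correction result is misled beyond recovery.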
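Brief Description of the Drawings
Figure 1 is a schematic diagram of an existing image detection result;
Figure 2 is an abstract illustration of the imaging of a rectangle according to an embodiment of the invention;
Figure 3 is a schematic diagram of rectangle imaging in which the angle β is not 0, according to an embodiment of the invention;
Figure 4 is a schematic diagram of the hardware structure of a terminal according to an embodiment of the invention;
Figure 5 is a schematic flowchart of an image detection method according to an embodiment of the invention;
Figure 6 is a schematic diagram of the depth information of an image according to an embodiment of the invention;
Figure 7 is a schematic representation of the angle α according to an embodiment of the invention;
Figure 8 is a schematic representation of the angle β according to an embodiment of the invention;
Figure 9 is a simplified schematic diagram of the angles α and β according to an embodiment of the invention;
Figure 10 is a schematic flowchart of obtaining an angle library according to an embodiment of the invention;
Figure 11 is a schematic diagram of angle rotation according to an embodiment of the invention;
Figure 12 is a schematic diagram of candidate quadrilateral 1 according to an embodiment of the invention;
Figure 13 is a schematic diagram of candidate quadrilateral 2 according to an embodiment of the invention;
Figure 14 shows an existing document correction result;
Figure 15 is a schematic flowchart of another image detection method according to an embodiment of the invention;
Figure 16 is a schematic flowchart of yet another image detection method according to an embodiment of the invention;
Figure 17 is a schematic structural diagram of another terminal according to an embodiment of the invention.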
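Detailed Description
To keep the description of the following embodiments clear and concise, the underlying idea of the embodiments of the invention is given first:
Figure 2 is an abstract illustration of the imaging of a rectangle according to an embodiment of the invention. In world coordinates, a rectangle of width W and height H lies on the object plane w, with four vertices Pi (i=1,2,3,4). When a user or camera observes the rectangle at a certain angle, the resulting quadrilateral image lies on the image plane c, which is perpendicular to the line of sight. The angle between image plane c and object plane w is the vector α, and the image of the rectangle on the image plane is a quadrilateral with vertices pi (i=1,2,3,4). For a given imaging pose, the rectangle and its quadrilateral image are in one-to-one correspondence; the rectangle cannot be imaged as a quadrilateral of any other shape. The pose is uniquely determined by the angle α between image plane c and object plane w, together with the angle β between one side of the rectangle on object plane w and the intersection line AB of object plane w and image plane c. Taking P3P4 as that side, Figure 2 shows that P3P4 is parallel to AB, so β is 0 and is not marked. Figure 3 illustrates a case in which β is not 0.
In other words, once the angle α between the object plane and the image plane and the angle β between one side of the rectangle and the intersection line of the two planes are determined, the shape of the quadrilateral image of the rectangle, that is, the sizes of its interior angles θi (i=1,2,3,4), is uniquely determined. During image detection, a detected quadrilateral can be regarded as the mapping of a rectangle only if its interior angles are close or equal to θi (i=1,2,3,4). Any other quadrilateral that is consistent with α and β but whose interior angles are not close or equal to θi (i=1,2,3,4) is not the mapping of a rectangle.
The principle above, described for one camera pose, extends easily to more general poses. Any rectangle pose can be represented by changing α and β: changing α changes the angle between the object plane and the image plane, and changing β changes the tilt of the rectangle within its plane. Combinations of arbitrary α and arbitrary β cover the photographing of a rectangle from any angle.
In theory, the vertex angles θi (i=1,2,3,4) of the quadrilateral that a rectangle forms in the image can be obtained in advance for every α and every β. At capture time, technical means are used to obtain the angle α between the object plane and the image plane and the angle β between one side of the rectangle and the intersection line of the two planes. The vertex angles θi (i=1,2,3,4) of the quadrilateral on the image plane are then computed and compared with the vertex angles obtained in advance for that α and β. If θi (i=1,2,3,4) matches the preset values, the quadrilateral is the projection of a rectangle; otherwise it is not.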
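The technical solutions in the embodiments of the invention are described clearly and completely below with reference to the accompanying drawings.
It should be noted that, to describe the technical solutions clearly, words such as "first" and "second" are used in the embodiments to distinguish between identical or similar items whose functions and effects are basically the same. A person skilled in the art will understand that such words do not limit quantity or order of execution.
It should be noted that "/" in this document means "or"; for example, A/B may denote A or B. "And/or" merely describes an association between associated objects and indicates that three relationships are possible; for example, "A and/or B" may denote A alone, both A and B, or B alone. "Multiple" means two or more.
As used in this application, the terms "component", "module", "system" and the like are intended to refer to a computer-related entity, which may be hardware, firmware, a combination of hardware and software, software, or software in execution. For example, a component may be, but is not limited to, a process running on a processor, a processor, an object, an executable file, a thread of execution, a program, and/or a computer. By way of example, both an application running on a computing device and the computing device itself may be components. One or more components may reside within a process and/or thread of execution, and a component may be located on one computer and/or distributed between two or more computers. In addition, these components can execute from various computer-readable media having various data structures stored on them. The components may communicate by way of local and/or remote processes, for example according to a signal having one or more data packets (for example, data from one component interacting with another component in a local system or distributed system, and/or interacting with other systems across a network such as the Internet by way of the signal).
This application presents aspects, embodiments, and features in terms of systems that may include multiple devices, components, modules, and the like. It should be understood that each system may include additional devices, components, modules, and the like, and/or may not include all of the devices, components, and modules discussed with reference to the drawings. A combination of these approaches may also be used.
In addition, in the embodiments of the invention, the word "exemplary" is used to mean serving as an example, illustration, or explanation. Any embodiment or design described as an "example" in this application should not be construed as preferred or advantageous over other embodiments or designs. Rather, the word is used to present a concept in a concrete manner.
The scenarios described in the embodiments of the invention are intended to explain the technical solutions more clearly and do not limit them. A person of ordinary skill in the art will appreciate that, as new scenarios emerge, the technical solutions provided in the embodiments are equally applicable to similar technical problems.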
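Figure 4 is a schematic diagram of the hardware structure of a terminal according to an embodiment of the invention. The terminal 400 includes a processor 401, a camera 402, a display 403, a communication interface 404, a memory 405, and a bus 406. The processor 401, camera 402, display 403, communication interface 404, and memory 405 are connected to one another through the bus 406.
The processor 401 is the control center of the terminal 400. It connects the parts of the terminal 400 through the bus 406, and it performs the functions of the terminal 400 and processes data by running or executing the software programs and/or modules stored in the memory 405 and by calling the data stored in the memory 405, thereby monitoring the terminal 400 as a whole. Optionally, the processor 401 may include one or more processing units. Preferably, the processor 401 may integrate an application processor and a modem processor, where the application processor mainly handles the operating system, user interface, and application programs, and the modem processor mainly handles wireless communication. It can be understood that the modem processor may also not be integrated into the processor 401.
The camera 402 is configured to photograph a target object and obtain image data of the target object. The camera 402 may be a depth camera or a dual camera; the embodiments of the invention do not specifically limit this. Optionally, if the camera 402 is a depth camera, the camera 402 is further configured to obtain depth data of the image.
The display 403 is configured to display the image obtained by photographing and processing the target object.
The communication interface 404 is configured to support communication between the terminal and other external devices.
The memory 405 may be used to store software programs and modules; the processor 401 performs the functional applications and data processing of the terminal 400 by running the software programs and modules stored in the memory 405. The memory 405 mainly includes a program storage area and a data storage area. The program storage area may store an operating system and the application programs required by at least one function (such as a photographing function or a document correction function); the data storage area may store data created through use of the terminal 400 (such as a preset angle library of rectangle poses). In addition, the memory 405 may include high-speed random access memory and may also include non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or another volatile solid-state storage device.
The bus 406 may be a peripheral component interconnect (PCI) bus, an extended industry standard architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of representation, only one thick line is used in Figure 4, but this does not mean that there is only one bus or one type of bus.
Although not shown, the terminal 400 may also include a radio frequency (RF) circuit, an audio circuit, and/or various sensors; the embodiments of the invention do not specifically limit this.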
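The image detection method provided in the embodiments of the invention is described below on the basis of the terminal shown in Figure 4. As shown in Figure 5, an image detection method provided in an embodiment of the invention includes steps S501 to S504:
S501. The terminal obtains an image in which a target rectangle is photographed, together with the depth information in the image.
The depth information represents the distance of the target rectangle from the camera.
Figure 6 is a schematic diagram of the depth information of an image. Some depth values are marked in the figure; the numbers indicate the distance of the target rectangle from the camera. In practice the sampling points in a depth map are denser; the figure is only illustrative and is not a specific limitation.
S502. The terminal detects edges in the image to obtain multiple candidate quadrilaterals.
The candidate quadrilaterals are formed from the many edge lines that the terminal obtains by detecting edges in the image.
S503. The terminal processes each of the candidate quadrilaterals according to operations T1 to T3 below, described for a first candidate quadrilateral:
T1: The terminal determines, according to the depth information, whether the sides of the first candidate quadrilateral lie in the same plane in three-dimensional space.
T2: If the sides of the first candidate quadrilateral lie in the same plane in three-dimensional space, the terminal determines the interior angle sequence {θi (i=1,2,3,4)} of the first candidate quadrilateral and the interior angle sequence {ψi (i=1,2,3,4)} of the target quadrilateral projected by the target rectangle in the image.
Here the target quadrilateral is the quadrilateral that the target rectangle theoretically projects in the image, such as the quadrilateral with vertices pi (i=1,2,3,4) in Figure 2.
T3: The terminal determines the confidence that the first candidate quadrilateral matches the target quadrilateral according to the interior angle sequence {θi (i=1,2,3,4)} of the first candidate quadrilateral and the interior angle sequence {ψi (i=1,2,3,4)} of the target quadrilateral.
S504. The terminal determines the candidate quadrilateral with the highest confidence of matching the target quadrilateral as the actual quadrilateral projected by the target rectangle in the image.
With the image detection method provided in this embodiment, interference from lines inside or outside the rectangle is taken into account during document correction: multiple candidate quadrilaterals are detected from the original image, and the one with the highest confidence of matching the target quadrilateral projected by the target rectangle in the image is determined as the actual quadrilateral projected by the target rectangle. Interference from inside and outside the rectangle can therefore be eliminated accurately. This matters in document correction because, if the quadrilateral is detected incorrectly at this step, all later steps operate on the wrong quadrilateral and the correction result is misled beyond recovery.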
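Specifically, for T1 in step S503:
With reference to the depth information shown in Figure 6, the terminal may determine, according to the depth information, whether the sides of the first candidate quadrilateral lie in the same plane in three-dimensional space as follows:
First, the terminal determines the positions of the sides and corners of the first candidate quadrilateral and distinguishes its interior from its exterior.
Next, inside the first candidate quadrilateral, starting from a small region at one side (for example the bottom side) or one corner (for example the lower-left corner), the terminal calculates how the depth values in that small region change in different directions (for example upward, to the right, or to the upper right). In Figure 6 the change is 10 depth units per cell in the upward direction.
Then the terminal expands from the small region outward in the different directions and, at the same time, checks whether the depth change of the expanded region in each direction stays consistent with the initial change, until the expanded region covers the whole interior of the first candidate quadrilateral. If no inconsistent region is found, the sides of the first candidate quadrilateral lie in the same plane; if an inconsistent region is found, they do not. The check should allow for the fact that the depth change grows as the absolute depth increases.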
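The consistency check above can be sketched roughly as follows. This is a simplified illustration, not the patent's exact region-growing procedure: it assumes the depth samples inside the candidate quadrilateral have already been resampled onto a regular grid (with `None` marking cells outside the quadrilateral), and the function name `is_coplanar` and the tolerance `tol` are hypothetical. It treats the depth step as constant and does not model the growth of the step with absolute depth.

```python
def is_coplanar(depth, tol=0.15):
    """Return True when the depth step between neighbouring samples stays
    consistent along both grid directions.

    depth: 2D list of depth samples (rows of equal length); None marks a
           cell that lies outside the candidate quadrilateral.
    tol:   hypothetical tolerance, relative to the first observed step
           (with an absolute floor of `tol` depth units).
    """
    def steps(lines):
        # Differences between adjacent valid samples along each line.
        out = []
        for line in lines:
            for a, b in zip(line, line[1:]):
                if a is not None and b is not None:
                    out.append(b - a)
        return out

    rows = depth
    cols = [list(col) for col in zip(*depth)]
    for diffs in (steps(rows), steps(cols)):
        if not diffs:
            continue
        ref = diffs[0]  # the "initial change" in this direction
        limit = tol * max(abs(ref), 1.0)
        if any(abs(d - ref) > limit for d in diffs):
            return False
    return True
```

For a grid like the one in Figure 6, where depth changes by 10 units per cell in one direction and is constant in the other, the function returns `True`; a single strongly deviating sample makes it return `False`.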
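Specifically, for T2 in step S503:
The terminal may determine the interior angle sequence {θi (i=1,2,3,4)} of the first candidate quadrilateral as follows:
The terminal determines the size of the angle θ1 between two adjacent sides of the quadrilateral and then, starting from θ1 and proceeding clockwise or counter-clockwise, determines the remaining angles θ2 and θ3 in turn. The last angle θ4 can be obtained by subtracting the first three interior angles from 360°, the sum of the interior angles of a quadrilateral.
The angle θ1 may be the angle between the leftmost and bottom sides, or the nearest or farthest angle in the depth map, and so on; the embodiments of the invention do not specifically limit this.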
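A minimal sketch of this angle computation is shown below. It assumes the four vertices are given as image coordinates in clockwise or counter-clockwise order and that the quadrilateral is convex; the function name `interior_angles` is illustrative only.

```python
import math

def interior_angles(vertices):
    """Interior angle sequence (degrees) of a convex quadrilateral.

    vertices: four (x, y) points listed clockwise or counter-clockwise.
    The first three angles are measured between adjacent sides; the
    fourth is 360 degrees minus the other three, as described in the text.
    """
    angles = []
    for i in range(3):
        px, py = vertices[i - 1]          # previous vertex
        cx, cy = vertices[i]              # current corner
        nx, ny = vertices[(i + 1) % 4]    # next vertex
        v1 = (px - cx, py - cy)
        v2 = (nx - cx, ny - cy)
        cos_t = (v1[0] * v2[0] + v1[1] * v2[1]) / (
            math.hypot(*v1) * math.hypot(*v2))
        cos_t = max(-1.0, min(1.0, cos_t))  # guard against rounding
        angles.append(math.degrees(math.acos(cos_t)))
    angles.append(360.0 - sum(angles))
    return angles
```

Which vertex comes first (for example the lower-left corner) is a choice made by the caller through the order of `vertices`.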
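Specifically, also for T2 in step S503:
The terminal may determine the interior angle sequence {ψi (i=1,2,3,4)} of the target quadrilateral projected by the target rectangle in the image as follows:
The terminal determines the rectangle pose parameters of the target rectangle and then, according to those parameters, matches a pre-stored angle library to determine the interior angle sequence {ψi (i=1,2,3,4)} of the target quadrilateral projected by the target rectangle in the image. The angle library contains the interior angle sequence of the target quadrilateral for each of multiple rectangle poses.
Preferably, an interior angle sequence may be ordered clockwise or counter-clockwise starting from one direction (for example the lower left), ordered from largest to smallest or from smallest to largest, or ordered starting from the corner nearest the camera according to the depth information in the image, and so on; the embodiments of the invention do not specifically limit this.
Optionally, in a possible implementation, the rectangle pose parameters of the target rectangle may include: the angle α between the object plane in which the target rectangle lies and the image plane in which the image lies, and the angle β between one side of the target rectangle and the intersection line of the object plane and the image plane.
By way of example, the terminal may determine the rectangle pose parameters of the target rectangle as follows: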
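1. Determining the angle α
As shown in Figure 7, the angle α can be decomposed into a magnitude component αv and a direction component αd. Because the picture obtained by the camera is rectangular, it is natural to take the image plane c as the reference and the adjacent edges of the picture as axes, forming a rectangular coordinate system xOy. The normal vector of image plane c is v1, and the normal vector of object plane w is v2.
Translating v2 so that it shares its starting point with v1, the size of the angle between the two vectors is the magnitude component αv = |α|.
The other component is the direction of the angle. The difference (v2 − v1) gives a vector d whose direction indicates the direction of the angle. Projecting d onto the xOy plane gives a vector d′, and the angle between d′ and one of the axes, for example Oy, is the direction component αd. Let e be the intersection line of the object plane and the image plane; then d′ and e are perpendicular to each other, so the direction component αd of the angle α between the two planes can also be expressed by the tilt of the intersection line e.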
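The decomposition of α into αv and αd can be sketched as below. This is an illustration under stated assumptions: both normals are expressed in a coordinate frame whose x and y axes are the image axes (so projecting onto xOy simply drops the z component), the normals are normalised before subtraction, and the sign convention for αd (measured from the Oy axis via `atan2`) is a choice made here, not something the text specifies. The function name `plane_angle` is hypothetical.

```python
import math

def plane_angle(v1, v2):
    """Return (alpha_v, alpha_d) in degrees for two plane normals.

    v1: normal of the image plane, v2: normal of the object plane,
    both 3-vectors in a frame where xOy is the image plane.
    alpha_d is None when the planes are parallel (no direction exists).
    """
    n1 = math.sqrt(sum(a * a for a in v1))
    n2 = math.sqrt(sum(a * a for a in v2))
    u1 = [a / n1 for a in v1]
    u2 = [a / n2 for a in v2]
    cos_a = max(-1.0, min(1.0, sum(a * b for a, b in zip(u1, u2))))
    alpha_v = math.degrees(math.acos(cos_a))       # magnitude |alpha|
    d = [b - a for a, b in zip(u1, u2)]            # d = v2 - v1
    dx, dy = d[0], d[1]                            # d' = projection on xOy
    if math.hypot(dx, dy) < 1e-12:
        return alpha_v, None
    alpha_d = math.degrees(math.atan2(dx, dy))     # angle from the Oy axis
    return alpha_v, alpha_d
```

For an object plane tilted by 30° about the image x axis, this returns a magnitude of 30° and a direction of 0° relative to Oy.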
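2. Determining the angle β
As described earlier in the Detailed Description, when the magnitude |α| is fixed, the factor that affects the projected shape of the target rectangle in the image plane is the angle β between one side of the target rectangle and the intersection line of the object plane and the image plane. That is, once the relative tilt between the object and image planes and the position of the target rectangle relative to them are determined, translating or rotating the image plane does not affect the interior angles of the rectangle's projection.
As shown in Figure 8, the angle β projects onto the image plane as β′. The two angles are monotonically related but not equal, that is, β ≠ β′.
Because β′ can be read directly from the captured image, it is relatively easy to calculate: β′ = λ − αd, that is, β′ equals the angle between the side of the target rectangle and the image edge minus the angle between the intersection line of the object and image planes and the image edge (here λ denotes the former angle and is distinct from the projection ratio λ used later). The angles are related by the formula
cos β′ · cos α = cos β, which gives β = arccos(cos β′ · cos α).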
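Once the terminal has determined the rectangle pose parameters of the target rectangle, it can determine the interior angle sequence {θi (i=1,2,3,4)} of the first candidate quadrilateral and the interior angle sequence {ψi (i=1,2,3,4)} of the target quadrilateral according to the methods above.
Because the shape of the quadrilateral image of the rectangle, that is, the sizes of its interior angles θi (i=1,2,3,4), is uniquely determined once the angle α between the object plane and the image plane and the angle β between one side of the rectangle and the intersection line of the two planes are determined, matching the candidate quadrilaterals against the target quadrilateral obtained from the angle library for that rectangle pose yields a more accurate actual quadrilateral.
Optionally, in a possible implementation, the rectangle pose parameters of the target rectangle may include: the larger angle γ of the two angles formed by the two pairs of opposite sides of the first candidate quadrilateral. That is, the angles α and β of the foregoing embodiment can be simplified to the larger of the angles between the two pairs of opposite sides of the quadrilateral, which is simpler to implement.
By way of example, as shown in Figure 9, the angles α and β of the foregoing embodiment can be simplified to θ1 or θ2 in Figure 9. This angle is positively correlated with α and is only slightly affected by β, so it can serve as a simple approximation.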
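As a rough sketch of this simplification, the larger of the two angles between opposite sides can be computed from the four vertices. The code assumes the vertices are listed in clockwise or counter-clockwise order and treats each side as an undirected line, so the result lies between 0° and 90°; the function name `opposite_side_angle` is hypothetical.

```python
import math

def opposite_side_angle(vertices):
    """Return gamma (degrees): the larger of the two angles formed by
    the two pairs of opposite sides of a quadrilateral.

    vertices: four (x, y) points a, b, c, d in order around the shape.
    """
    def direction(p, q):
        return (q[0] - p[0], q[1] - p[1])

    def line_angle(u, v):
        # Angle between two undirected lines, in [0, 90] degrees.
        cos_t = abs(u[0] * v[0] + u[1] * v[1]) / (
            math.hypot(*u) * math.hypot(*v))
        return math.degrees(math.acos(min(1.0, cos_t)))

    a, b, c, d = vertices
    g1 = line_angle(direction(a, b), direction(d, c))  # sides ab and dc
    g2 = line_angle(direction(b, c), direction(a, d))  # sides bc and ad
    return max(g1, g2)
```

A rectangle seen head-on gives γ = 0°, and γ grows as the perspective distortion makes one pair of opposite sides converge.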
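Specifically, the angle library in the embodiments of the invention is obtained and stored by the terminal in advance. The terminal may obtain the angle library in either of the following two ways:
First, the terminal detects the target quadrilateral corresponding to each of multiple rectangle poses and calculates the interior angle sequence of the target quadrilateral for each pose, thereby obtaining the angle library.
Second, the terminal receives an angle library sent by an angle library acquisition device. The angle library acquisition device may obtain the angle library in the same way as the terminal does in the first approach; details are not repeated here.
The following takes acquisition of the angle library by the terminal as an example.
Assuming that the rectangle pose parameters include the angle α between the object plane in which the target rectangle lies and the image plane in which the image lies, and the angle β between one side of the target rectangle and the intersection line of the object plane and the image plane, the terminal may obtain the angle library as follows:
S1001. Initialize the angle α between the object plane of the target rectangle and the image plane of the image to α = 0.
S1002. Initialize the angle β between one side of the target rectangle and the intersection line of the object plane and the image plane to β = 0.
S1003. For this pose, detect the target quadrilateral projected by the target rectangle in the image, and calculate and store the interior angle sequence of the target quadrilateral.
S1004. Increase the angle β by one step and judge whether it has reached 90 degrees.
If it has reached 90 degrees, perform step S1005;
if it has not reached 90 degrees, perform step S1003.
S1005. Increase the angle α by one step and judge whether it has reached 90 degrees.
If it has reached 90 degrees, end;
if it has not reached 90 degrees, perform step S1002.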
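Steps S1001 to S1005 can be sketched with a simulated rectangle in place of a physical one. The sketch below is an illustration under assumptions that the text does not fix: a pinhole camera model, the intersection line taken parallel to the image x axis, and hypothetical values for the rectangle size, distance, and focal length. The function names are illustrative.

```python
import math

def interior_angles(vertices):
    """Interior angles (degrees) of a convex quadrilateral, in vertex order."""
    angles = []
    for i in range(3):
        (px, py), (cx, cy), (nx, ny) = (
            vertices[i - 1], vertices[i], vertices[(i + 1) % 4])
        v1, v2 = (px - cx, py - cy), (nx - cx, ny - cy)
        cos_t = (v1[0] * v2[0] + v1[1] * v2[1]) / (
            math.hypot(*v1) * math.hypot(*v2))
        angles.append(math.degrees(math.acos(max(-1.0, min(1.0, cos_t)))))
    angles.append(360.0 - sum(angles))
    return angles

def project_rectangle(alpha_deg, beta_deg, width=2.0, height=3.0,
                      distance=10.0, focal=1.0):
    """Project a simulated rectangle for the pose (alpha, beta).
    width, height, distance and focal are hypothetical values."""
    a, b = math.radians(alpha_deg), math.radians(beta_deg)
    corners = [(-width / 2, -height / 2), (width / 2, -height / 2),
               (width / 2, height / 2), (-width / 2, height / 2)]
    image = []
    for x, y in corners:
        # Rotate by beta inside the object plane.
        xr = x * math.cos(b) - y * math.sin(b)
        yr = x * math.sin(b) + y * math.cos(b)
        # Tilt the object plane by alpha about the x axis.
        yt = yr * math.cos(a)
        zt = yr * math.sin(a) + distance
        image.append((focal * xr / zt, focal * yt / zt))
    return image

def build_angle_library(step=10):
    """Enumerate poses as in S1001-S1005 and store the angle sequences."""
    library = {}
    alpha = 0                                   # S1001
    while alpha < 90:
        beta = 0                                # S1002
        while beta < 90:
            library[(alpha, beta)] = interior_angles(
                project_rectangle(alpha, beta))  # S1003
            beta += step                        # S1004
        alpha += step                           # S1005
    return library
```

With `step=10` the library holds 81 poses. The stored angle values depend on the assumed distance and focal length, so they are not expected to reproduce the figures in the tables of this document.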
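The interior angle sequence of the target quadrilateral for each rectangle pose may be stored in the memory shown in Figure 4, in a database or in an ordinary file; the embodiments of the invention do not specifically limit the storage format. By way of example, the stored interior angle sequences may be laid out as shown in Table 1.

Table 1

| No. | α | β | Interior angle 1 | Interior angle 2 | Interior angle 3 |
|---|---|---|---|---|---|
| 1 | 0 | 0 | 90° | 90° | 90° |
| i | 0 | 10° | 90° | 90° | 90° |
| x | 30° | 0 | 86° | 86° | 94° |

The angles in Table 1 may be expressed in another form, for example 90° = π/2, 30° = π/6, and so on; the embodiments of the invention do not specifically limit this.
Table 1 lists, by way of example, only three of the interior angles in the interior angle sequence of the target quadrilateral for each rectangle pose. A person skilled in the art will understand that the fourth interior angle can be obtained by subtracting the first three interior angles in Table 1 from 360°, the sum of the interior angles of a quadrilateral. The interior angle sequence stored in the memory for each rectangle pose may of course also contain all four interior angles; the embodiments of the invention do not specifically limit this.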
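It should be noted that in this example the angles are initialized from 0 in steps S1001 and S1002, and the increments in steps S1004 and S1005 end at 90 degrees. Initialization may of course start from other values; as long as all possible values of α and β are enumerated, every camera tilt scenario is covered. In addition, because of the central symmetry of the angles, α and β each range from 0° to 90°, and mirror symmetry extends this range to the poses of all shooting scenarios.
Because angles are continuous, it is impossible to obtain every combination of α and β. The angles can instead be discretized with a small interval (the step mentioned above); the smaller the interval, the higher the accuracy. For example, the interval may be between 0.1° and 10°. An interval of 1° is typical and already gives fairly high accuracy, requiring 90 × 90 = 8100 sets of data to be obtained in advance. Data falling between two integer angles can be obtained by interpolating the data of adjacent angles.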
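One way to realise the interpolation mentioned above is bilinear interpolation over the (α, β) grid. The text does not say which interpolation scheme is intended, so this is an assumed choice. The sketch also assumes the library is a dictionary keyed by `(alpha, beta)` grid values, that all entries list their angles in the same order, and that the needed neighbouring grid points exist; `lookup_angles` is a hypothetical name.

```python
import math

def lookup_angles(library, alpha, beta, step=1):
    """Bilinearly interpolate an interior angle sequence for (alpha, beta).

    library: dict mapping (alpha, beta) grid points to 4-angle lists.
    step:    grid spacing in degrees used when the library was built.
    """
    a0 = math.floor(alpha / step) * step
    b0 = math.floor(beta / step) * step
    ta = (alpha - a0) / step
    tb = (beta - b0) / step
    # On an exact grid hit, reuse the same grid point as its own neighbour.
    a1 = a0 + step if ta > 0 else a0
    b1 = b0 + step if tb > 0 else b0
    result = []
    for i in range(4):
        low = (1 - tb) * library[(a0, b0)][i] + tb * library[(a0, b1)][i]
        high = (1 - tb) * library[(a1, b0)][i] + tb * library[(a1, b1)][i]
        result.append((1 - ta) * low + ta * high)
    return result
```

A query that falls exactly on a grid point returns the stored sequence unchanged; a query midway between four grid points returns the average of the four stored sequences.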
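It should be noted that the interior angle values in this example may be obtained with a real rectangular board that is clearly distinguishable from the background, by rotating the rectangle or moving the camera position step by step, as shown in Figure 11; they may also be obtained with a virtual rectangle in 3D software by stepping through the angles. The embodiments of the invention do not specifically limit this.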
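Specifically, for T3 in step S503:
Determining the confidence that the first candidate quadrilateral matches the target quadrilateral according to the interior angle sequence {θi (i=1,2,3,4)} of the first candidate quadrilateral and the interior angle sequence {ψi (i=1,2,3,4)} of the target quadrilateral may include T31 and T32:
T31. According to the interior angle sequence {θi (i=1,2,3,4)} of the first candidate quadrilateral and the interior angle sequence {ψi (i=1,2,3,4)} of the target quadrilateral, the terminal takes the differences between the four corresponding interior angles of the two quadrilaterals and computes a statistic of those differences.
T32. The terminal determines, according to the statistic, the confidence that the first candidate quadrilateral matches the target quadrilateral.
In step T31, the statistic may be the mean of the absolute values of the four differences, their variance, their standard deviation, the sum of the squares of the four differences, or the square root of that sum of squares, and so on; the embodiments of the invention do not specifically limit this.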
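The statistics listed for step T31 can be sketched as follows. The function name `angle_statistic` and the `method` labels are hypothetical, and the variance here is the population variance of the four differences, which is one possible reading of the text.

```python
import math

def angle_statistic(theta, psi, method="mean_abs"):
    """Statistic of the four angle differences theta_i - psi_i (degrees)."""
    diffs = [t - p for t, p in zip(theta, psi)]
    n = len(diffs)
    if method == "mean_abs":        # mean of absolute differences
        return sum(abs(d) for d in diffs) / n
    if method == "sum_sq":          # sum of squared differences
        return sum(d * d for d in diffs)
    if method == "root_sum_sq":     # square root of the sum of squares
        return math.sqrt(sum(d * d for d in diffs))
    mean = sum(diffs) / n
    variance = sum((d - mean) ** 2 for d in diffs) / n
    if method == "variance":
        return variance
    if method == "std":
        return math.sqrt(variance)
    raise ValueError("unknown method: " + method)
```

Note that the variance and standard deviation ignore a constant offset shared by all four differences, whereas the other options do not, so the choice of statistic changes which candidates score well.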
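In step T32, the terminal may determine the confidence that the first candidate quadrilateral matches the target quadrilateral according to the statistic as follows:
The terminal takes the difference between a preset value and the statistic as the confidence that the first candidate quadrilateral matches the target quadrilateral.
That is, because a smaller statistic means a higher matching confidence, subtracting the statistic from a preset value yields the confidence that the first candidate quadrilateral matches the target quadrilateral.
Alternatively, the terminal may determine the confidence according to the statistic as follows:
The terminal looks up a pre-stored correspondence according to the statistic to determine the confidence that the first candidate quadrilateral matches the target quadrilateral, where the correspondence contains the confidence corresponding to each of multiple values.
By way of example, the correspondence may be as shown in Table 2:

Table 2

| Statistic (X) | Confidence |
|---|---|
| 0 ≤ X < 0.1 | 10 |
| 0.1 ≤ X < 0.2 | 9 |
| 0.9 ≤ X < 1 | 1 |

It should be noted that Table 2 only gives, by way of example, one set of confidences corresponding to multiple values; the numerical correspondence between statistic and confidence may of course be different, and the embodiments of the invention do not specifically limit this.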
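Both ways of turning the statistic into a confidence can be sketched briefly. The preset value of 10, the fallback confidence of 0, and the function names are assumptions made for illustration; the lookup table below contains only the three rows that Table 2 actually prints, so values in the unlisted ranges fall through to the fallback.

```python
def confidence_from_preset(statistic, preset=10.0):
    """Confidence as the preset value minus the statistic
    (preset=10.0 is a hypothetical choice)."""
    return preset - statistic

# Only the rows printed in Table 2; the intermediate rows are not given
# in the text and are deliberately not filled in here.
TABLE_2 = [(0.0, 0.1, 10), (0.1, 0.2, 9), (0.9, 1.0, 1)]

def confidence_from_table(statistic, table=TABLE_2, default=0):
    """Look up the confidence for a statistic in half-open ranges
    [low, high); `default` is a hypothetical fallback value."""
    for low, high, confidence in table:
        if low <= statistic < high:
            return confidence
    return default
```

The subtraction variant keeps the full resolution of the statistic, while the table variant quantises it into a small number of confidence levels.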
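Further, if the sides of the first candidate quadrilateral lie in the same plane, before the terminal matches the pre-stored angle library according to the rectangle pose parameters of the target rectangle, the method may further include: the terminal determines the distance d between the target rectangle and the camera; and the terminal matches a pre-stored database according to d and takes the angle library corresponding to d as the pre-stored angle library.
That is, when the distance d between the target rectangle and the camera takes different values, the vertex angles of the target rectangle in the same pose may change slightly. For example, as the target rectangle moves away from the camera, angles greater than 90 degrees in the image plane become larger, angles less than 90 degrees become smaller, and angles equal to 90 degrees are unchanged. Determining the interior angle sequence of the target quadrilateral from an angle library chosen according to the distance d makes that sequence more accurate and therefore makes the image detection result more accurate.
In summary, with the image detection method provided in this embodiment, interference from lines inside or outside the rectangle is taken into account during document correction: multiple candidate quadrilaterals are detected from the original image, and the one with the highest confidence of matching the target quadrilateral projected by the target rectangle in the image is determined as the actual quadrilateral projected by the target rectangle. Interference from inside and outside the rectangle can therefore be eliminated accurately. This matters in document correction because, if the quadrilateral is detected incorrectly at this step, all later steps operate on the wrong quadrilateral and the correction result is misled beyond recovery.
By way of example, suppose the terminal detects candidate quadrilateral 1 and candidate quadrilateral 2 (in each case the quadrilateral formed by the lines drawn in bold), shown in Figure 12 and Figure 13 respectively, and the sides of both candidate quadrilaterals lie in the same plane in three-dimensional space. The terminal can then determine the interior angle sequence {ψi (i=1,2,3,4)} of the target quadrilateral projected by the target rectangle in the image according to the method above and, after determining the interior angle sequence {θi (i=1,2,3,4)} of candidate quadrilateral 1 and the interior angle sequence {θi (i=1,2,3,4)} of candidate quadrilateral 2, determine the confidence with which each of them matches the target quadrilateral. If candidate quadrilateral 1 has the higher confidence, the terminal determines candidate quadrilateral 1 as the actual quadrilateral projected by the target rectangle in the image.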
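Further, with existing aspect ratio estimation techniques, when a picture is taken at a large tilt angle the focal length varies sharply and its estimate can deviate greatly, which introduces a large error into the estimate of the rectangle's original aspect ratio and makes that estimate very inaccurate. Figure 14 shows an existing document correction result: the actual quadrilateral corresponding to the target rectangle is detected correctly, but the aspect ratio is estimated incorrectly. To solve this problem, the angle library may also contain, for each of the multiple rectangle poses, the ratio λ of the projections of unit lengths along two adjacent sides. Accordingly, the image detection method provided in the embodiments of the invention may further include:
The terminal matches the angle library according to the rectangle pose parameters of the target rectangle and determines the ratio λ1 of the unit-length projections of two adjacent sides corresponding to the target rectangle.
After the terminal determines the candidate quadrilateral with the highest confidence of matching the target quadrilateral as the actual quadrilateral projected by the target rectangle in the image (step S504), the method may further include:
The terminal determines the true ratio of two adjacent sides of the actual quadrilateral according to λ1 and the projected ratio of those two adjacent sides. The terminal then obtains and outputs the target rectangle according to the true ratio of the lengths of two adjacent sides of the target quadrilateral.
Building on Table 1, the ratio λ of the unit-length projections of two adjacent sides for each of the multiple rectangle poses may be stored as shown in Table 3.

Table 3

| No. | α | β | Interior angle 1 | Interior angle 2 | Interior angle 3 | Projection ratio |
|---|---|---|---|---|---|---|
| 1 | 0 | 0 | 90° | 90° | 90° | 1 |
| i | 0 | 10° | 90° | 90° | 90° | 1 |
| x | 30° | 0 | 86° | 86° | 94° | 0.936 |

By way of example, suppose the stored ratio of the unit-length projections of two adjacent sides for the target rectangle is λ = 0.9, and the lower-left and lower-right sides of the actual quadrilateral projected in the image have lengths {0.9, 1.5}. Then, according to the stored ratio, the true length ratio of the target rectangle is 0.9/0.9/1.5 = 1/1.5.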
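The arithmetic of this example can be written as a one-line function. The sketch assumes that λ applies to the first of the two sides (the one whose projected length is divided by λ), which is how the worked example reads; the function name `true_side_ratio` is hypothetical.

```python
def true_side_ratio(projected_a, projected_b, lam):
    """True length ratio a:b of two adjacent rectangle sides.

    projected_a, projected_b: side lengths measured in the image.
    lam: stored ratio of unit-length projections for the matched pose,
         assumed here to scale side a relative to side b.
    """
    return projected_a / lam / projected_b

# Worked example from the text: lam = 0.9, projected sides {0.9, 1.5}.
ratio = true_side_ratio(0.9, 1.5, 0.9)   # 0.9 / 0.9 / 1.5 = 1 / 1.5
```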
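With the image detection method provided in the embodiments of the invention, the terminal can not only find the actual quadrilateral corresponding to the target rectangle but also obtain the true ratio of width to height. From the proportions of the true rectangle, the terminal can then compute the projective transformation matrix and restore the quadrilateral to the true rectangle.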
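One standard way to obtain such a projective transformation matrix is to solve for the 3×3 homography that maps the four corners of the detected quadrilateral to the four corners of a rectangle with the recovered proportions. The text does not specify how the matrix is computed, so the following is a generic sketch in plain Python; the function names are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= f * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]

def homography(src, dst):
    """3x3 matrix mapping four src points (x, y) to four dst points (u, v)."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]

def apply_homography(h, point):
    """Map one (x, y) point through the homography h."""
    x, y = point
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)
```

For instance, the corners of a detected quadrilateral can be mapped to a rectangle of width 1 and height 1.5, matching the 1 : 1.5 ratio of the example above; applying the same matrix to every pixel position then rectifies the document.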
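Optionally, in a possible implementation, when the object plane is parallel to the image plane the two planes have no intersection line and, whatever the tilt of a rectangle within the object plane, its projection in the image plane is still a rectangle, that is, its interior angle sequence is {90°, 90°, 90°, 90°}. The terminal can therefore simplify image detection by judging plausibility using only a statistic of the differences between the four interior angles of the quadrilateral and the interior angles of a rectangle. That is, in T2 of step S503, the terminal determines the interior angle sequence of the target quadrilateral projected by the target rectangle in the image to be {90°, 90°, 90°, 90°}. The terminal can then determine the confidence with which each detected candidate quadrilateral matches this target quadrilateral and determine the one with the highest confidence as the actual quadrilateral projected by the target rectangle in the image.
Because this embodiment requires no preset data, it is the simplest to implement. It is also the coarsest judgment, however, and may not allow the aspect ratio of the rectangle to be estimated accurately.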
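Optionally, on the basis of the terminal shown in Figure 4, an embodiment of the invention further provides an image detection method. As shown in Figure 15, the method includes steps S1501 to S1504:
S1501. The terminal obtains an image in which a target rectangle is photographed.
S1502. The terminal detects edges in the image to obtain multiple candidate quadrilaterals.
The candidate quadrilaterals are formed from the many edge lines that the terminal obtains by detecting edges in the image.
S1503. The terminal processes each of the candidate quadrilaterals according to operations K1 and K2 below, described for a first candidate quadrilateral:
K1: The terminal determines the interior angle sequence {θi (i=1,2,3,4)} of the first candidate quadrilateral.
K2: The terminal determines the confidence that the first candidate quadrilateral matches the target quadrilateral according to the interior angle sequence {θi (i=1,2,3,4)} of the first candidate quadrilateral and the interior angle sequence {90°, 90°, 90°, 90°} of the target quadrilateral.
S1504. The terminal determines the candidate quadrilateral with the highest confidence of matching the target quadrilateral as the actual quadrilateral projected by the target rectangle in the image.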
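Steps S1503 and S1504 of this simplified method can be sketched as a short selection loop. The sketch assumes the interior angle sequences of the candidates have already been computed, uses the mean absolute difference from 90° as the statistic, and takes the confidence as a preset value minus that statistic; the preset of 90 and the function name are hypothetical.

```python
def pick_actual_quadrilateral(candidate_angles, preset=90.0):
    """Return (index, confidence) of the candidate whose interior angles
    are closest to {90, 90, 90, 90}.

    candidate_angles: list of 4-angle sequences in degrees, one per
                      candidate quadrilateral.
    preset:           hypothetical preset value for the confidence.
    """
    best_index, best_confidence = None, None
    for index, angles in enumerate(candidate_angles):
        statistic = sum(abs(a - 90.0) for a in angles) / 4   # K2
        confidence = preset - statistic
        if best_confidence is None or confidence > best_confidence:
            best_index, best_confidence = index, confidence
    return best_index, best_confidence                       # S1504
```

Given the sequences `[60, 120, 60, 120]`, `[88, 91, 92, 89]` and `[45, 135, 90, 90]`, the second candidate is selected because its angles deviate least from 90°.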
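Specifically, for K1 in step S1503:
For how the terminal determines the interior angle sequence {θi (i=1,2,3,4)} of the first candidate quadrilateral, refer to the embodiment shown in Figure 5; details are not repeated here.
Specifically, as shown in Figure 16, for K2 in step S1503:
Determining the confidence that the first candidate quadrilateral matches the target quadrilateral according to the interior angle sequence {θi (i=1,2,3,4)} of the first candidate quadrilateral and the interior angle sequence {90°, 90°, 90°, 90°} of the target quadrilateral may include:
K21: According to the interior angle sequence {θi (i=1,2,3,4)} of the first candidate quadrilateral and the interior angle sequence {90°, 90°, 90°, 90°} of the target quadrilateral, the terminal takes the differences between the four corresponding interior angles of the two quadrilaterals and computes a statistic of those differences.
K22: The terminal determines, according to the statistic, the confidence that the first candidate quadrilateral matches the target quadrilateral.
In step K21, the statistic may be the mean of the absolute values of the four differences, their variance, their standard deviation, the sum of the squares of the four differences, or the square root of that sum of squares, and so on; the embodiments of the invention do not specifically limit this.
For the specific implementation of step K22, refer to the embodiment shown in Figure 5; details are not repeated here.
With the image detection method provided in this embodiment, interference from lines inside or outside the rectangle is taken into account during document correction: multiple candidate quadrilaterals are detected from the original image, and the one with the highest confidence of matching the target quadrilateral projected by the target rectangle in the image is determined as the actual quadrilateral projected by the target rectangle. Interference from inside and outside the rectangle can therefore be eliminated accurately. In addition, this embodiment requires no preset data, so it is the simplest to implement among the image detection methods described above.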
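The foregoing mainly describes the solutions provided in the embodiments of the invention from the perspective of the terminal. It can be understood that, to implement the foregoing functions, the terminal includes hardware structures and/or software modules corresponding to each function. A person skilled in the art will readily appreciate that, in combination with the units and algorithm steps of the examples described in the embodiments disclosed here, the invention can be implemented in hardware or in a combination of hardware and computer software. Whether a function is performed by hardware or by computer software driving hardware depends on the particular application and design constraints of the technical solution. A skilled person may use different methods to implement the described functions for each particular application, but such an implementation should not be considered beyond the scope of the invention.
In the embodiments of the invention, the terminal may be divided into functional modules according to the foregoing method examples. For example, each functional module may correspond to one function, or two or more functions may be integrated into one processing module. The integrated module may be implemented in the form of hardware or in the form of a software functional module. It should be noted that the division into modules in the embodiments of the invention is illustrative and is merely a division by logical function; other divisions are possible in an actual implementation.
Where integrated units are used, Figure 17 shows a possible schematic structure of the terminal involved in the foregoing embodiments. The terminal 1700 includes a camera module 1701 and a processing module 1702. The camera module 1701 supports the terminal 1700 in performing step S501 in Figure 5, and the processing module 1702 supports the terminal 1700 in performing steps S502 to S504 in Figure 5; or the camera module 1701 supports the terminal 1700 in performing step S1501 in Figures 15 and 16, and the processing module 1702 supports the terminal 1700 in performing steps S1502 to S1504 in Figures 15 and 16. Optionally, the terminal 1700 may further include a display module 1703 and a communication module 1704. The display module 1703 supports the terminal 1700 in displaying the target rectangle, and the communication module 1704 supports communication between the terminal and other external devices, for example the angle library acquisition device described above. The terminal 1700 may of course also include a storage module 1705 for storing the program code and data of the terminal; the embodiments of the invention do not specifically limit this.
The camera module 1701 may be the camera 402 in Figure 4.
The processing module 1702 may be a processor or a controller, for example the processor 401 in Figure 4, or a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA) or another programmable logic device, a transistor logic device, a hardware component, or any combination of these. It can implement or execute the various exemplary logical blocks, modules, and circuits described in connection with the disclosure of the invention. The processor may also be a combination that implements a computing function, for example a combination of one or more microprocessors, or a combination of a DSP and a microprocessor.
The display module 1703 may be the display 403 in Figure 4.
The communication module 1704 may be the communication interface 404 in Figure 4, or may be a transceiver or the like.
The storage module 1705 may be the memory 405 in Figure 4.
When the camera module 1701 is a camera, the processing module 1702 is a processor, the display module 1703 is a display, the communication module 1704 is a communication interface, and the storage module 1705 is a memory, the terminal involved in the embodiments of the invention may be the terminal shown in Figure 4. For details, see the description of Figure 4; they are not repeated here.
With the terminal provided in the embodiments of the invention, interference from lines inside or outside the rectangle is taken into account during document correction: multiple candidate quadrilaterals are detected from the original image, and the one with the highest confidence of matching the target quadrilateral projected by the target rectangle in the image is determined as the actual quadrilateral projected by the target rectangle. Interference from inside and outside the rectangle can therefore be eliminated accurately. This matters in document correction because, if the quadrilateral is detected incorrectly at this step, all later steps operate on the wrong quadrilateral and the correction result is misled beyond recovery.
The steps of the methods or algorithms described in connection with the disclosure of the invention may be implemented in hardware or by a processor executing software instructions. The software instructions may consist of corresponding software modules, which may be stored in random access memory (RAM), flash memory, read-only memory (ROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), a register, a hard disk, a removable hard disk, a compact disc read-only memory (CD-ROM), or any other form of storage medium well known in the art. An exemplary storage medium is coupled to the processor so that the processor can read information from, and write information to, the storage medium. The storage medium may of course also be a component of the processor. The processor and the storage medium may be located in an ASIC. In addition, the ASIC may be located in a core network interface device. The processor and the storage medium may of course also exist as discrete components in a core network interface device.
A person skilled in the art will appreciate that, in one or more of the foregoing examples, the functions described in the invention may be implemented in hardware, software, firmware, or any combination of these. When implemented in software, the functions may be stored in a computer-readable medium or transmitted as one or more instructions or code on a computer-readable medium. Computer-readable media include computer storage media and communication media, where communication media include any medium that facilitates transfer of a computer program from one place to another. A storage medium may be any available medium that can be accessed by a general-purpose or special-purpose computer.
The specific embodiments described above further explain the objectives, technical solutions, and beneficial effects of the invention in detail. It should be understood that the foregoing are merely specific embodiments of the invention and are not intended to limit its scope of protection. Any modification, equivalent replacement, or improvement made on the basis of the technical solutions of the invention shall fall within the scope of protection of the invention.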

Claims (28)

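  1. An image detection method, wherein the method comprises:
    obtaining, by a terminal, an image in which a target rectangle is photographed and depth information in the image, wherein the depth information represents the distance of the target rectangle from a camera;
    detecting, by the terminal, edges in the image to obtain multiple candidate quadrilaterals;
    processing, by the terminal, each of the multiple candidate quadrilaterals according to the following operations described for a first candidate quadrilateral:
    determining, by the terminal according to the depth information, whether the sides of the first candidate quadrilateral lie in the same plane in three-dimensional space;
    if the sides of the first candidate quadrilateral lie in the same plane in three-dimensional space, determining, by the terminal, the interior angle sequence {θi (i=1,2,3,4)} of the first candidate quadrilateral and the interior angle sequence {ψi (i=1,2,3,4)} of the target quadrilateral projected by the target rectangle in the image;
    determining, by the terminal according to the interior angle sequence {θi (i=1,2,3,4)} of the first candidate quadrilateral and the interior angle sequence {ψi (i=1,2,3,4)} of the target quadrilateral, the confidence that the first candidate quadrilateral matches the target quadrilateral; and
    after the terminal has processed each of the multiple candidate quadrilaterals according to the foregoing operations for the first candidate quadrilateral, determining, by the terminal, the candidate quadrilateral with the highest confidence of matching the target quadrilateral as the actual quadrilateral projected by the target rectangle in the image.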
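  2. The method according to claim 1, wherein determining, by the terminal, the interior angle sequence {ψi (i=1,2,3,4)} of the target quadrilateral projected by the target rectangle in the image comprises:
    determining, by the terminal, rectangle pose parameters of the target rectangle; and
    matching, by the terminal according to the rectangle pose parameters of the target rectangle, a pre-stored angle library to determine the interior angle sequence {ψi (i=1,2,3,4)} of the target quadrilateral projected by the target rectangle in the image, wherein the angle library contains the interior angle sequence of the target quadrilateral corresponding to each of multiple rectangle poses.
  3. The method according to claim 2, wherein the rectangle pose parameters of the target rectangle comprise: the angle α between the object plane in which the target rectangle lies and the image plane in which the image lies, and the angle β between one side of the target rectangle and the intersection line of the object plane and the image plane.
  4. The method according to claim 2, wherein the rectangle pose parameters of the target rectangle comprise: the larger angle γ of the angles between the two pairs of opposite sides of the first candidate quadrilateral.
  5. The method according to any one of claims 2 to 4, wherein the angle library further contains, for each of the multiple rectangle poses, the ratio λ of the projections of unit lengths along two adjacent sides; and the method further comprises:
    matching, by the terminal according to the rectangle pose parameters of the target rectangle, the angle library to determine the ratio λ1 of the unit-length projections of two adjacent sides corresponding to the target rectangle;
    and, after the terminal determines the candidate quadrilateral with the highest confidence of matching the target quadrilateral as the actual quadrilateral projected by the target rectangle in the image, further comprises:
    determining, by the terminal according to the ratio λ1 and the projected ratio of two adjacent sides of the actual quadrilateral, the true ratio of the two adjacent sides of the actual quadrilateral; and
    obtaining and outputting, by the terminal, the target rectangle according to the true ratio of the lengths of two adjacent sides of the target quadrilateral.
  6. The method according to any one of claims 2 to 5, wherein, before the terminal obtains the image in which the target rectangle is photographed and the depth information in the image, the method further comprises:
    obtaining and storing, by the terminal, the angle library.
  7. The method according to claim 6, wherein obtaining, by the terminal, the angle library comprises:
    receiving, by the terminal, an angle library sent by an angle library acquisition device; or
    detecting, by the terminal, the target quadrilateral corresponding to each of the multiple rectangle poses, and calculating the interior angle sequence of the target quadrilateral corresponding to each rectangle pose and the ratio of the unit-length projections of two adjacent sides of the target quadrilateral corresponding to each rectangle pose, to obtain the angle library.
  8. The method according to any one of claims 2 to 7, wherein, if the sides of the first candidate quadrilateral lie in the same plane, before the terminal matches the pre-stored angle library according to the rectangle pose parameters of the target rectangle, the method further comprises:
    determining, by the terminal, the distance d between the target rectangle and the camera; and
    matching, by the terminal according to the distance d, a pre-stored database, and determining the angle library corresponding to d as the pre-stored angle library.
  9. The method according to claim 1, wherein the terminal determines the interior angle sequence of the target quadrilateral projected by the target rectangle in the image to be {90°, 90°, 90°, 90°}.
  10. The method according to any one of claims 1 to 9, wherein determining, by the terminal according to the interior angle sequence {θi (i=1,2,3,4)} of the first candidate quadrilateral and the interior angle sequence {ψi (i=1,2,3,4)} of the target quadrilateral, the confidence that the first candidate quadrilateral matches the target quadrilateral comprises:
    taking, by the terminal according to the interior angle sequence {θi (i=1,2,3,4)} of the first candidate quadrilateral and the interior angle sequence {ψi (i=1,2,3,4)} of the target quadrilateral, the differences between the four corresponding interior angles of the first candidate quadrilateral and the target quadrilateral and computing a statistic of the differences; and
    determining, by the terminal according to the statistic, the confidence that the first candidate quadrilateral matches the target quadrilateral.
  11. The method according to claim 10, wherein determining, by the terminal according to the statistic, the confidence that the first candidate quadrilateral matches the target quadrilateral comprises:
    determining, by the terminal, the difference between a preset value and the statistic as the confidence that the first candidate quadrilateral matches the target quadrilateral;
    or looking up, by the terminal according to the statistic, a pre-stored correspondence to determine the confidence that the first candidate quadrilateral matches the target quadrilateral, wherein the correspondence contains the confidence corresponding to each of multiple values.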
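  12. An image detection method, wherein the method comprises:
    obtaining, by a terminal, an image in which a target rectangle is photographed;
    detecting, by the terminal, edges in the image to obtain multiple candidate quadrilaterals;
    processing, by the terminal, each of the multiple candidate quadrilaterals according to the following operations described for a first candidate quadrilateral:
    determining, by the terminal, the interior angle sequence {θi (i=1,2,3,4)} of the first candidate quadrilateral;
    determining, by the terminal according to the interior angle sequence {θi (i=1,2,3,4)} of the first candidate quadrilateral and the interior angle sequence {90°, 90°, 90°, 90°} of the target quadrilateral projected by the target rectangle in the image, the confidence that the first candidate quadrilateral matches the target quadrilateral; and
    after the terminal has processed each of the multiple candidate quadrilaterals according to the foregoing operations for the first candidate quadrilateral, determining, by the terminal, the candidate quadrilateral with the highest confidence of matching the target quadrilateral as the actual quadrilateral projected by the target rectangle in the image.
  13. The method according to claim 12, wherein determining, by the terminal according to the interior angle sequence {θi (i=1,2,3,4)} of the first candidate quadrilateral and the interior angle sequence {90°, 90°, 90°, 90°} of the target quadrilateral projected by the target rectangle in the image, the confidence that the first candidate quadrilateral matches the target quadrilateral comprises:
    taking, by the terminal according to the interior angle sequence {θi (i=1,2,3,4)} of the first candidate quadrilateral and the interior angle sequence {90°, 90°, 90°, 90°} of the target quadrilateral projected by the target rectangle in the image, the differences between the four corresponding interior angles of the first candidate quadrilateral and the target quadrilateral and computing a statistic of the differences; and
    determining, by the terminal according to the statistic, the confidence that the first candidate quadrilateral matches the target quadrilateral.
  14. The method according to claim 13, wherein determining, by the terminal according to the statistic, the confidence that the first candidate quadrilateral matches the target quadrilateral comprises:
    determining, by the terminal, the difference between a preset value and the statistic as the confidence that the first candidate quadrilateral matches the target quadrilateral;
    or looking up, by the terminal according to the statistic, a pre-stored correspondence to determine the confidence that the first candidate quadrilateral matches the target quadrilateral, wherein the correspondence contains the confidence corresponding to each of multiple values.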
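  15. A terminal, wherein the terminal comprises a processor and a camera;
    the camera is configured to obtain an image in which a target rectangle is photographed and depth information in the image, wherein the depth information represents the distance of the target rectangle from the camera;
    the processor is configured to detect edges in the image to obtain multiple candidate quadrilaterals;
    the processor is further configured to process each of the multiple candidate quadrilaterals according to the following operations described for a first candidate quadrilateral:
    determining, according to the depth information, whether the sides of the first candidate quadrilateral lie in the same plane in three-dimensional space;
    if the sides of the first candidate quadrilateral lie in the same plane in three-dimensional space, determining the interior angle sequence {θi (i=1,2,3,4)} of the first candidate quadrilateral and the interior angle sequence {ψi (i=1,2,3,4)} of the target quadrilateral projected by the target rectangle in the image;
    determining, according to the interior angle sequence {θi (i=1,2,3,4)} of the first candidate quadrilateral and the interior angle sequence {ψi (i=1,2,3,4)} of the target quadrilateral, the confidence that the first candidate quadrilateral matches the target quadrilateral; and
    the processor is further configured to, after each of the multiple candidate quadrilaterals has been processed according to the foregoing operations for the first candidate quadrilateral, determine the candidate quadrilateral with the highest confidence of matching the target quadrilateral as the actual quadrilateral projected by the target rectangle in the image.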
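  16. The terminal according to claim 15, wherein the processor is specifically configured to:
    determine rectangle pose parameters of the target rectangle; and
    match, according to the rectangle pose parameters of the target rectangle, a pre-stored angle library to determine the interior angle sequence {ψi (i=1,2,3,4)} of the target quadrilateral projected by the target rectangle in the image, wherein the angle library contains the interior angle sequence of the target quadrilateral corresponding to each of multiple rectangle poses.
  17. The terminal according to claim 16, wherein the rectangle pose parameters of the target rectangle comprise: the angle α between the object plane in which the target rectangle lies and the image plane in which the image lies, and the angle β between one side of the target rectangle and the intersection line of the object plane and the image plane.
  18. The terminal according to claim 17, wherein the rectangle pose parameters of the target rectangle comprise: the larger angle γ of the angles between the two pairs of opposite sides of the first candidate quadrilateral.
  19. The terminal according to any one of claims 16 to 18, wherein the terminal further comprises a display;
    the angle library further contains, for each of the multiple rectangle poses, the ratio λ of the projections of unit lengths along two adjacent sides;
    the processor is further configured to match the angle library according to the rectangle pose parameters of the target rectangle and determine the ratio λ1 of the unit-length projections of two adjacent sides corresponding to the target rectangle;
    the processor is further configured to, after determining the candidate quadrilateral with the highest confidence of matching the target quadrilateral as the actual quadrilateral projected by the target rectangle in the image, determine the true ratio of two adjacent sides of the actual quadrilateral according to the ratio λ1 and the projected ratio of the two adjacent sides of the actual quadrilateral;
    the processor is further configured to obtain the target rectangle according to the true ratio of the lengths of two adjacent sides of the target quadrilateral; and
    the display is configured to display the target rectangle.
  20. The terminal according to any one of claims 16 to 19, wherein the terminal further comprises a memory;
    the processor is further configured to obtain the angle library before the image in which the target rectangle is photographed and the depth information in the image are obtained; and
    the memory is configured to store the angle library.
  21. The terminal according to claim 20, wherein the terminal further comprises a communication interface;
    the processor is specifically configured to:
    receive, through the communication interface, an angle library sent by an angle library acquisition device; or
    the processor is specifically configured to:
    detect the target quadrilateral corresponding to each of the multiple rectangle poses, and calculate the interior angle sequence of the target quadrilateral corresponding to each rectangle pose and the ratio of the unit-length projections of two adjacent sides of the target quadrilateral corresponding to each rectangle pose, to obtain the angle library.
  22. The terminal according to any one of claims 16 to 21, wherein
    the processor is further configured to, if the sides of the first candidate quadrilateral lie in the same plane, determine the distance d between the target rectangle and the camera before matching the pre-stored angle library according to the rectangle pose parameters of the target rectangle; and
    the processor is further configured to match a pre-stored database according to the distance d and determine the angle library corresponding to d as the pre-stored angle library.
  23. The terminal according to claim 15, wherein the processor determines the interior angle sequence of the target quadrilateral to be {90°, 90°, 90°, 90°}.
  24. The terminal according to any one of claims 15 to 23, wherein the processor is specifically configured to:
    take, according to the interior angle sequence {θi (i=1,2,3,4)} of the first candidate quadrilateral and the interior angle sequence {ψi (i=1,2,3,4)} of the target quadrilateral, the differences between the four corresponding interior angles of the first candidate quadrilateral and the target quadrilateral and compute a statistic of the differences; and
    determine, according to the statistic, the confidence that the first candidate quadrilateral matches the target quadrilateral.
  25. The terminal according to claim 24, wherein the processor is specifically configured to:
    determine the difference between a preset value and the statistic as the confidence that the first candidate quadrilateral matches the target quadrilateral;
    or look up, according to the statistic, a pre-stored correspondence to determine the confidence that the first candidate quadrilateral matches the target quadrilateral, wherein the correspondence contains the confidence corresponding to each of multiple values.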
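  26. A terminal, wherein the terminal comprises a camera and a processor;
    the camera is configured to obtain an image in which a target rectangle is photographed;
    the processor is configured to detect edges in the image to obtain multiple candidate quadrilaterals;
    the processor is further configured to process each of the multiple candidate quadrilaterals according to the following operations described for a first candidate quadrilateral:
    determining the interior angle sequence {θi (i=1,2,3,4)} of the first candidate quadrilateral;
    determining, according to the interior angle sequence {θi (i=1,2,3,4)} of the first candidate quadrilateral and the interior angle sequence {90°, 90°, 90°, 90°} of the target quadrilateral projected by the target rectangle in the image, the confidence that the first candidate quadrilateral matches the target quadrilateral; and
    the processor is further configured to, after each of the multiple candidate quadrilaterals has been processed according to the foregoing operations for the first candidate quadrilateral, determine the candidate quadrilateral with the highest confidence of matching the target quadrilateral as the actual quadrilateral projected by the target rectangle in the image.
  27. The terminal according to claim 26, wherein the processor is specifically configured to:
    take, according to the interior angle sequence {θi (i=1,2,3,4)} of the first candidate quadrilateral and the interior angle sequence {90°, 90°, 90°, 90°} of the target quadrilateral projected by the target rectangle in the image, the differences between the four corresponding interior angles of the first candidate quadrilateral and the target quadrilateral and compute a statistic of the differences; and
    determine, according to the statistic, the confidence that the first candidate quadrilateral matches the target quadrilateral.
  28. The terminal according to claim 27, wherein the processor is specifically configured to:
    determine the difference between a preset value and the statistic as the confidence that the first candidate quadrilateral matches the target quadrilateral;
    or look up, according to the statistic, a pre-stored correspondence to determine the confidence that the first candidate quadrilateral matches the target quadrilateral, wherein the correspondence contains the confidence corresponding to each of multiple values.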
PCT/CN2016/099730 2016-09-22 2016-09-22 Image detection method and terminal WO2018053756A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201680080721.5A CN108604374B (zh) 2016-09-22 2016-09-22 Image detection method and terminal
PCT/CN2016/099730 WO2018053756A1 (zh) 2016-09-22 2016-09-22 Image detection method and terminal

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2016/099730 WO2018053756A1 (zh) 2016-09-22 2016-09-22 Image detection method and terminal

Publications (1)

Publication Number Publication Date
WO2018053756A1 true WO2018053756A1 (zh) 2018-03-29

Family

ID=61689765

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/099730 WO2018053756A1 (zh) 2016-09-22 2016-09-22 Image detection method and terminal

Country Status (2)

Country Link
CN (1) CN108604374B (zh)
WO (1) WO2018053756A1 (zh)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110677586B (zh) 2019-10-09 2021-06-25 Oppo广东移动通信有限公司 图像显示方法、图像显示装置及移动终端
CN115760620B (zh) * 2022-11-18 2023-10-20 荣耀终端有限公司 一种文档矫正方法、装置及电子设备

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2001351102A (ja) * 2001-04-20 2001-12-21 Mitsubishi Electric Corp 画像処理装置
CN101989349A (zh) * 2009-08-03 2011-03-23 夏普株式会社 图像输出装置及方法、便携终端装置、拍摄图像处理系统
JP2012065247A (ja) * 2010-09-17 2012-03-29 Toshiba Corp 情報処理装置、方法およびプログラム
CN105096299A (zh) * 2014-05-08 2015-11-25 北京大学 多边形检测方法和多边形检测装置
CN105354866A (zh) * 2015-10-21 2016-02-24 郑州航空工业管理学院 一种多边形轮廓相似度检测方法
CN105931239A (zh) * 2016-04-20 2016-09-07 北京小米移动软件有限公司 图像处理的方法及装置

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111445566A (zh) * 2020-03-27 2020-07-24 腾讯科技(深圳)有限公司 一种信息处理方法、装置及计算机可读存储介质
CN111445566B (zh) * 2020-03-27 2022-05-06 腾讯科技(深圳)有限公司 一种信息处理方法、装置及计算机可读存储介质
CN111858996A (zh) * 2020-06-10 2020-10-30 北京百度网讯科技有限公司 室内定位方法、装置、电子设备及存储介质
CN111858996B (zh) * 2020-06-10 2023-06-23 北京百度网讯科技有限公司 室内定位方法、装置、电子设备及存储介质
US11721037B2 (en) 2020-06-10 2023-08-08 Beijing Baidu Netcom Science And Technology Co., Ltd. Indoor positioning method and apparatus, electronic device and storage medium
WO2022267027A1 (zh) * 2021-06-25 2022-12-29 闻泰科技(深圳)有限公司 图像矫正方法、装置、电子设备和存储介质
CN113743396A (zh) * 2021-08-31 2021-12-03 支付宝(杭州)信息技术有限公司 在证件识别过程中识别注入攻击的方法及装置
CN113743396B (zh) * 2021-08-31 2023-11-10 支付宝(杭州)信息技术有限公司 在证件识别过程中识别注入攻击的方法及装置

Also Published As

Publication number Publication date
CN108604374A (zh) 2018-09-28
CN108604374B (zh) 2020-03-10

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16916495

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 16916495

Country of ref document: EP

Kind code of ref document: A1