WO2018141109A1 - Method and Device for Image Processing - Google Patents

Method and Device for Image Processing

Info

Publication number
WO2018141109A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
quadrilateral
information
occlusion
area
Application number
PCT/CN2017/072949
Other languages
English (en)
French (fr)
Inventor
王雅辉
陈心
张运超
Original Assignee
华为技术有限公司
Application filed by 华为技术有限公司 (Huawei Technologies Co., Ltd.)
Priority to US16/483,950 (US11074679B2)
Priority to CN201780005041.1A (CN108513664B)
Priority to PCT/CN2017/072949
Publication of WO2018141109A1


Classifications

    • G06V 20/10 Terrestrial scenes
    • G06T 5/94
    • G06T 3/608 Skewing or deskewing, e.g. by two-pass or three-pass rotation
    • G06T 5/50 Image enhancement or restoration by the use of more than one image, e.g. averaging, subtraction
    • G06T 5/77
    • G06T 7/13 Edge detection
    • G06T 7/50 Depth or shape recovery
    • G06T 7/543 Depth or shape recovery from line drawings
    • G06V 10/768 Recognition using context analysis, e.g. recognition aided by known co-occurring patterns
    • H04N 1/3876 Recombination of partial images to recreate the original image
    • H04N 23/64 Computer-aided capture of images, e.g. transfer from script file into camera, check of taken image quality, advice or proposal for image composition or decision on when to take image
    • H04N 23/80 Camera processing pipelines; components thereof
    • G06T 2207/10028 Range image; depth image; 3D point clouds
    • G06T 2207/30176 Document

Definitions

  • the present application relates to the field of communications and, more particularly, to a method and apparatus for image processing.
  • the embodiments of the present application provide a method and device for image processing that can use depth data to determine whether an obstruction is present in a captured image and, when it is determined that an obstruction exists, remove the obstruction and restore the occluded information.
  • an embodiment of the present application provides a method for image processing, including: capturing a first image, where the first image includes a quadrilateral whose region contains a plurality of sampling points; establishing a reference plane, where the reference plane is the plane in which the quadrilateral lies and the plurality of sampling points each have a corresponding projection point on the reference plane; calculating the difference between first depth data and second depth data, where the first depth data is the distance between each of the plurality of sampling points and the plane in which the camera lies, and the second depth data is the distance between the projection point corresponding to each sampling point and the plane in which the camera lies; and determining, according to the difference, whether an obstruction exists in the region of the quadrilateral.
  • the reference plane is established according to the captured image, and whether an obstruction exists in the captured image may be determined according to the difference between the depth data of the captured image and the depth data of the reference plane, implementing a judgment of whether an obstruction exists within the range of the captured image.
  • before the establishing of the reference plane, the method further includes: performing edge detection on the first image to determine at least four edge line segments; acquiring depth data of each of the at least four edge line segments; and, according to the depth data of each edge line segment, selecting four coplanar edge line segments to form the quadrilateral.
  • edge detection is performed on the quadrilateral on the reference plane to obtain a more accurate quadrilateral region.
  • determining, according to the difference, whether there is an obstruction in the region of the quadrilateral includes: if the percentage of sampling points whose difference is greater than the first threshold, out of the total number of the plurality of sampling points, is greater than the second threshold, determining that an obstruction exists in the region of the quadrilateral; if that percentage is less than or equal to the second threshold, or if the difference of every sampling point is less than or equal to the first threshold, determining that no obstruction exists in the region of the quadrilateral.
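  • as a concrete illustration of this decision rule, a minimal Python sketch follows; the threshold values and the sign convention (occluding points being closer to the camera than their projections on the reference plane) are illustrative assumptions, not values from the application:

```python
import numpy as np

def has_obstruction(sample_depths, projection_depths,
                    diff_threshold=0.02, ratio_threshold=0.05):
    """Return True if the depth differences indicate an obstruction.

    sample_depths:     distance of each sampling point to the camera plane
    projection_depths: distance of each point's projection on the
                       reference plane to the camera plane
    """
    # An obstruction sits in front of the reference plane, so occluded
    # sampling points are closer to the camera than their projections.
    diff = np.asarray(projection_depths) - np.asarray(sample_depths)
    occluded_ratio = np.mean(diff > diff_threshold)  # fraction of points
    return occluded_ratio > ratio_threshold
```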
  • the establishing of the reference plane includes: taking the plane in which the camera that captures the first image lies as the xy plane, the distance from the camera to the quadrilateral as the z-axis, and a point on the camera's plane as the origin, and establishing a spatial coordinate system.
  • according to the spatial coordinate system, the reference plane is established using the spatial plane equation Ax + By + Cz + D = 0, where x, y and z are the three variables of the spatial coordinate system, and A, B, C and D are constants.
  • the method further includes: when an obstruction exists in the quadrilateral region, acquiring, within the quadrilateral region, the first sampling points whose difference is greater than the first threshold; determining a first occlusion region according to the positions of the first sampling points; and restoring the occluded information in the first occlusion region.
  • the occlusion region is determined according to the depth data of the captured image and the depth data of the reference plane, and the occluded information is restored.
  • the recovering of the occluded information in the first occlusion region includes: capturing a second image, where the second image includes part or all of the occluded information in the first occlusion region; the occluded information in the first occlusion region is restored based on the partial or complete occluded information.
  • the first feature points in the first image are extracted, and first feature descriptors are calculated; the second feature points in the second image are extracted, and second feature descriptors are calculated; a transformation matrix between the first image and the second image is calculated according to the first feature descriptors and the second feature descriptors; and a second occlusion region in the second image is calculated according to the transformation matrix.
  • the second occlusion region includes some or all of the occlusion information in the first occlusion region.
  • when the second image includes all of the occluded information in the first occlusion region, the gray values of the pixels in the second image and the transformation matrix are used to interpolate the first occlusion region, so as to recover the occluded information in the first occlusion region.
  • when the second image includes only part of the occluded information, it is determined that the first occlusion region and the second occlusion region have a first intersection region, and a third occlusion region is calculated, the third occlusion region being the part of the second occlusion region other than the first intersection region; according to the transformation matrix, a fourth occlusion region in the first image corresponding to the third occlusion region is calculated; and the fourth occlusion region is interpolated using the gray values of the pixels in the second image and the transformation matrix, so as to restore the occluded information in the fourth occlusion region.
  • the recovering of the occluded information in the first occlusion region further includes: capturing a third image, where the third image includes part of the occluded information in the first occlusion region; and recovering the occluded information in the first occlusion region according to the partial occluded information in the first occlusion region included in the second image and the partial occluded information in the first occlusion region included in the third image.
  • the method further includes: generating first prompt information, where the first prompt information is used to prompt a location of the obstruction.
  • the method further includes: generating second prompt information, where the second prompt information is used to prompt the user to move the camera.
  • the method further includes: performing document correction on the quadrilateral after restoring the occluded information in the first occlusion region.
  • the method further includes: performing document correction on the quadrilateral when there is no obstruction in the quadrilateral region.
  • the first image is an image that includes document information.
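  • document correction of the quadrilateral amounts to a perspective warp of its four corners onto an upright rectangle. A minimal OpenCV sketch, assuming the corners have already been located (the corner ordering and the output size here are illustrative assumptions):

```python
import cv2
import numpy as np

def correct_document(image, corners, width=1000, height=750):
    """Warp the quadrilateral with the given corners (top-left, top-right,
    bottom-right, bottom-left order) into a regular rectangle."""
    src = np.float32(corners)
    dst = np.float32([[0, 0], [width, 0], [width, height], [0, height]])
    H = cv2.getPerspectiveTransform(src, dst)
    return cv2.warpPerspective(image, H, (width, height))
```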
  • the embodiment of the present application provides an apparatus for image processing, configured to perform the method in the first aspect or any possible implementation of the first aspect.
  • the apparatus comprises modular units for performing the method in the first aspect or any possible implementation of the first aspect.
  • the embodiment of the present application provides an apparatus for image processing, configured to perform the method in the first aspect or any possible implementation of the first aspect.
  • the image processing apparatus includes a processor, a memory, and a camera, where the memory is configured to store instructions, the camera is configured to capture images, and the processor is configured to execute the instructions stored in the memory.
  • the camera is configured to capture a first image, where the first image includes a quadrilateral whose region contains a plurality of sampling points;
  • the processor is configured to establish a reference plane, where the reference plane is a plane in which the quadrilateral is located, and the plurality of sampling points respectively have corresponding projection points on the reference plane;
  • the processor is configured to calculate a difference between the first depth data and the second depth data, where the first depth data is a distance between each of the plurality of sampling points and a plane where the camera is located, where the second depth data is The distance between the projection point corresponding to each sampling point and the plane of the camera;
  • the processor is configured to determine, according to the difference, whether an obstruction exists in the area of the quadrilateral;
  • the processor is further configured to restore the occluded information in the region of the quadrilateral when it is determined that an obstruction exists in the region of the quadrilateral;
  • the processor is also configured to perform document correction after the occluded information is restored.
  • an embodiment of the present application provides a terminal device, where the terminal device includes the image processing apparatus of the second aspect, a display panel, a read-only memory, a random access memory, a register, and at least one button.
  • an embodiment of the present application provides a computer program product comprising instructions, which when executed on a computer, causes the computer to perform the method in the possible implementation manners of the foregoing aspects.
  • the embodiment of the present application provides a computer-readable storage medium, where the computer-readable storage medium stores instructions that, when run on a computer, cause the computer to perform the method in the possible implementations of the foregoing aspects.
  • FIG. 1 is a schematic structural diagram of a mobile phone applicable to an embodiment of the present application.
  • FIG. 2 is a schematic flowchart of a method of image processing according to an embodiment of the present application.
  • FIG. 3 is a schematic flowchart of a method of image processing according to another embodiment of the present application.
  • FIG. 4 is a schematic flowchart of restoring the occluded information in the region of the quadrilateral.
  • FIG. 5 is a schematic diagram of one way of restoring occluded information in the quadrilateral region.
  • FIG. 6 is a schematic diagram of another way of restoring occluded information in the quadrilateral region.
  • FIG. 7 is a schematic block diagram of an apparatus for image processing according to an embodiment of the present application.
  • FIG. 8 is a schematic block diagram of an apparatus for image processing according to another embodiment of the present application.
  • FIG. 9 is a schematic block diagram of an apparatus for image processing provided by an embodiment of the present application.
  • FIG. 10 is a schematic structural diagram of a terminal device according to an embodiment of the present application.
  • the image processing device may be a camera, or may be a terminal having a camera function, for example, a user equipment (User Equipment, referred to as "UE"), an access terminal, a subscriber unit, Mobile device, user terminal, aerial camera, drone device, monitoring device or user device.
  • the access terminal may be a cellular phone, a cordless phone, a Session Initiation Protocol ("SIP") phone, a Wireless Local Loop ("WLL") station, a Personal Digital Assistant ("PDA"), a handheld device with a wireless communication function, a computing device or other processing device connected to a wireless modem, an in-vehicle device, a wearable device, a terminal device in a fifth-generation ("5G") network, or a terminal device in a future evolved Public Land Mobile Network ("PLMN").
  • FIG. 1 is a schematic structural diagram of a mobile phone 100 applicable to the embodiment of the present application.
  • the mobile phone 100 includes a radio frequency (Radio Frequency, abbreviated as "RF") circuit 110, a memory 120, an input unit 130, a display unit 140, an audio circuit 150, a processor 160, and a power source 170.
  • the structure of the handset 100 shown in FIG. 1 does not constitute a limitation on the handset, which may include more or fewer components than those illustrated, combine certain components, or arrange the components differently.
  • the components of the terminal device 100 will be specifically described below with reference to FIG. 1 :
  • the RF circuit 110 can be used to receive and transmit signals when sending and receiving information or during a call; in particular, downlink information from the base station is received and handed to the processor 160 for processing, and uplink data is sent to the base station.
  • RF circuits include, but are not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a Low Noise Amplifier (LNA), a duplexer, and the like.
  • RF circuitry 110 can also communicate with the network and other devices via wireless communication.
  • the wireless communication can use any communication standard or protocol, including but not limited to Global System for Mobile communication ("GSM"), General Packet Radio Service ("GPRS"), Code Division Multiple Access ("CDMA"), Wideband Code Division Multiple Access ("WCDMA"), Long Term Evolution ("LTE"), e-mail, and Short Messaging Service ("SMS").
  • the memory 120 can be used to store software programs and modules, and the processor 160 executes the various functional applications and data processing of the mobile phone 100 by running the software programs and modules stored in the memory 120.
  • the memory 120 may mainly include a program storage area and a data storage area, where the program storage area may store an operating system and the applications required by at least one function (such as a sound playing function, an image playing function, a photographing function, a document correction function, and an occlusion detection and removal function), and the data storage area may store data created according to the use of the mobile phone 100 (such as audio data, image data, and a phone book).
  • the memory 120 can include high-speed random access memory, and can also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device.
  • the input unit 130 can be configured to receive input numeric or character information and to generate key signal inputs related to user settings and function control of the handset 100.
  • the input unit 130 may include a button 131, a touch screen 132, a camera 133, and other input devices 134.
  • the button 131 can sense the pressing operation performed by the user and drive the corresponding connecting device according to a preset program.
  • the button includes a power switch button, a volume control button, a home button, and a shortcut button.
  • the touch screen 132, also called a touch panel, can collect touch operations on or near it, such as operations performed by the user on or near the touch screen 132 with a finger, a stylus, or any other suitable object or accessory, and drive the corresponding connection device according to a preset program.
  • optionally, the touch screen 132 may include two parts, a touch detection device and a touch controller. The touch detection device detects the user's touch position, detects the signal brought by the touch operation, and transmits the signal to the touch controller; the touch controller receives the touch information from the touch detection device, converts it into contact coordinates, sends them to the processor 160, and can receive commands from the processor 160 and execute them.
  • the touch screen 132 can be implemented in various types such as resistive, capacitive, infrared, and surface acoustic waves.
  • optionally, the camera 133 can acquire depth data of the target image; for example, it may be a dual camera (a depth-from-stereo sketch follows this list).
  • the input unit 130 may also include other input devices 134.
  • other input devices 134 may include, but are not limited to, one or more of a physical keyboard, a trackball, a mouse, a joystick, and the like.
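  • as a sketch of how a dual camera could yield such depth data from stereo disparity (the matcher parameters, focal length, and baseline are illustrative assumptions; real devices often expose a depth map directly from the camera subsystem):

```python
import cv2

def depth_from_stereo(left_gray, right_gray, focal_px, baseline_m):
    """Rough depth map from a rectified 8-bit grayscale stereo pair."""
    stereo = cv2.StereoBM_create(numDisparities=64, blockSize=15)
    # StereoBM returns fixed-point disparities scaled by 16.
    disparity = stereo.compute(left_gray, right_gray).astype("float32") / 16.0
    disparity[disparity <= 0] = 0.1               # avoid division by zero
    return focal_px * baseline_m / disparity      # depth = f * B / d
```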
  • the display unit 140 can be used to display information input by the user or information provided to the user and various menus of the mobile phone 100.
  • the display unit 140 may include a display panel 141.
  • optionally, the display panel 141 may be configured in the form of a Liquid Crystal Display ("LCD"), an Organic Light-Emitting Diode ("OLED") display, or the like. When the button 131 detects a pressing operation, it can be transmitted to the processor 160 to determine the type of the button event, and the processor 160 then provides a corresponding visual output on the display panel 141 according to the type of the button event. Further, the touch screen 132 may cover the display panel 141.
  • when the touch screen 132 detects a touch operation on or near it, the touch screen 132 transmits it to the processor 160 to determine the type of the touch event, and the processor 160 then provides a corresponding visual output on the display panel 141 according to the type of the touch event.
  • although the touch screen 132 and the display panel 141 are shown in FIG. 1 as two separate components to implement the input and output functions of the mobile phone 100, in some embodiments the touch screen 132 can be integrated with the display panel 141 to implement the input and output functions of the mobile phone 100.
  • the audio circuit 150, the speaker 151, and the microphone 152 can provide an audio interface between the user and the handset 100.
  • the audio circuit 150 can convert received audio data into an electrical signal and transmit it to the speaker 151, which converts it into a sound signal for output; on the other hand, the microphone 152 converts a collected sound signal into an electrical signal, which the audio circuit 150 receives and converts into audio data; the audio data is then output to the RF circuit 110 for transmission to, for example, another terminal device, or output to the memory 120 for further processing.
  • the processor 160 is the control center of the mobile phone 100; it connects the various parts of the entire terminal device using various interfaces and lines, and performs the various functions of the mobile phone 100 and processes data by running or executing software programs and/or modules stored in the memory 120 and recalling data stored in the memory 120, thereby monitoring the mobile phone 100 as a whole.
  • the processor 160 may include one or more processing units; preferably, the processor 160 may integrate an application processor and a modem processor, where the application processor mainly processes an operating system, a user interface, an application, and the like.
  • the modem processor primarily handles wireless communications. It will be appreciated that the above described modem processor may also not be integrated into the processor 160.
  • the mobile phone 100 also includes a power source 170 (such as a battery) that supplies power to various components.
  • the power source can be logically coupled to the processor 160 through a power management system to manage functions such as charging, discharging, and power management through the power management system.
  • although not shown, the mobile phone 100 may further include a Wireless Fidelity ("WiFi") module, a Bluetooth module, and the like, and may also be equipped with a light sensor, a gyroscope, a barometer, a hygrometer, a thermometer, an infrared sensor, and the like, which are not described here.
  • FIG. 2 is a schematic flowchart of a method 200 of image processing according to an embodiment of the present application.
  • the method 200 may be performed by a terminal having a shooting function; for example, it may be performed by the mobile phone 100 shown in FIG. 1, or by other terminals, which is not specifically limited in the embodiments of the present application.
  • in 201, a first image is captured, the first image including a quadrilateral having a plurality of sampling points in its region.
  • optionally, the quadrilateral may be a square, a rectangle, or an irregular quadrilateral, and its four sides may be straight lines or approximately straight curves.
  • optionally, the region of the quadrilateral may contain document information, such as a PPT slide, a poster, a sign, or a license plate.
  • the quadrilateral can be determined by edge detection.
  • specifically, edge detection is performed on the first image to determine at least four edge line segments; depth data of each of the at least four edge line segments is acquired; and, according to the depth data of each edge line segment, four coplanar edge line segments are selected to form the quadrilateral.
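  • one plausible realization of this step is Canny edge detection followed by a probabilistic Hough transform; coplanarity would then be checked by testing whether the depth data of each segment's endpoints satisfy a single plane equation. A minimal sketch (all parameters are illustrative assumptions):

```python
import cv2
import numpy as np

def detect_edge_segments(image):
    """Return candidate edge line segments as (x1, y1, x2, y2) tuples."""
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 50, 150)
    lines = cv2.HoughLinesP(edges, rho=1, theta=np.pi / 180, threshold=80,
                            minLineLength=100, maxLineGap=10)
    return [] if lines is None else [tuple(l[0]) for l in lines]
```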
  • in 202, a reference plane is established, where the reference plane is the plane in which the quadrilateral lies, and the plurality of sampling points respectively have corresponding projection points on the reference plane.
  • the reference plane can be established according to the quadrilateral.
  • the plane equation of the reference plane may be determined in multiple iterations.
  • specifically, three of the plurality of sampling points in the region of the quadrilateral are selected, and the reference plane is established using the spatial plane equation Ax + By + Cz + D = 0, where x, y and z are the three variables of a spatial coordinate system and A, B, C and D are constants. The sampling points in the region of the quadrilateral lie in a spatial coordinate system established with the plane in which the camera that captured the first image lies as the xy plane, the distance from the camera to the quadrilateral as the z-axis, and a point on the plane in which the camera lies as the origin.
  • for example, the spatial coordinates of the quadrilateral are acquired, and three points of those spatial coordinates are selected at least twice. With the three points selected each time, the coefficients A, B, C and D of the plane equation are solved, and the coefficients obtained from each of the at least two solutions are compared; if the error between the coefficients A, B, C and D solved each time is smaller than a third threshold, the plane equation determined by the coefficients whose error is smaller than the third threshold is taken as the plane equation of the reference plane.
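  • a minimal sketch of this repeated three-point fit follows; the normalization of the coefficients and the error metric are illustrative assumptions (the application only requires that the coefficients of successive solutions agree to within a third threshold):

```python
import numpy as np

def fit_reference_plane(points, err_threshold=1e-2, max_tries=20):
    """Fit Ax + By + Cz + D = 0 to repeated three-point samples until two
    solutions agree to within err_threshold.

    points: (N, 3) array of sampling-point coordinates in the camera frame.
    """
    rng = np.random.default_rng()
    prev = None
    for _ in range(max_tries):
        p1, p2, p3 = points[rng.choice(len(points), 3, replace=False)]
        n = np.cross(p2 - p1, p3 - p1)            # normal gives (A, B, C)
        if np.linalg.norm(n) < 1e-9:              # skip collinear triples
            continue
        n = n / np.linalg.norm(n)                 # normalize for comparison
        coeffs = np.append(n, -n.dot(p1))         # D = -n . p1
        if prev is not None:
            # The normal's sign is arbitrary, so compare up to sign.
            err = min(np.abs(coeffs - prev).max(), np.abs(coeffs + prev).max())
            if err < err_threshold:
                return coeffs
        prev = coeffs
    raise RuntimeError("no stable reference plane found")
```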
  • the plurality of sampling points of the quadrilateral respectively have corresponding projection points on the reference plane.
  • the depth data of the plurality of sampling points of the quadrilateral may be partially the same as the depth data of the projection points corresponding to the plurality of sampling points, or may be all the same or all different.
  • performing edge detection on the reference plane can determine the quadrilateral more precisely.
  • in 203, the difference between the first depth data and the second depth data is calculated, where the first depth data is the distance between each of the plurality of sampling points and the plane where the camera is located, and the second depth data is the distance between the projection point corresponding to each sampling point and the plane where the camera is located.
  • for example, calculating the difference between the first depth data and the second depth data may be calculating the difference between the distance from sampling point 1 to the plane where the camera is located and the distance from the projection point of sampling point 1 to that plane.
  • the difference values corresponding to different sampling points may be the same or different.
  • in 204, whether an obstruction exists in the region of the quadrilateral is determined according to the difference.
  • optionally, when no obstruction exists in the region of the quadrilateral, the quadrilateral is document-corrected.
  • whether there is an obstruction in a certain area of the captured image is determined by a difference between the depth data of the captured image and the depth data of the reference plane.
  • optionally, as an embodiment, as shown in FIG. 3, the method 200 further includes the following content.
  • in 205, when an obstruction exists in the region of the quadrilateral, the first sampling points whose difference is greater than the first threshold are acquired within the region of the quadrilateral.
  • optionally, when an obstruction exists, the first depth data in the quadrilateral region changes under its influence: at positions where the obstruction is present, the first depth data is significantly smaller than at positions where it is absent.
  • optionally, when an obstruction is present, the difference between the first depth data and the second depth data is greater than the first threshold.
  • the first sampling points whose difference is greater than the first threshold may be all of the sampling points in the quadrilateral region, or only some of them.
  • in 206, a first occlusion region is determined based on the positions of the first sampling points.
  • the first sampling points whose difference is greater than the first threshold are obtained, and the first obstruction area is determined according to the positions of the sampling points.
  • all of the first sampling points whose difference is greater than the first threshold are sequentially connected by line segments, and the area enclosed by the line segments is determined as the first occlusion region.
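  • for example, a binary mask of the occlusion region can be built from the flagged sampling points; filling their convex hull here is an illustrative simplification of connecting the points with line segments:

```python
import cv2
import numpy as np

def occlusion_mask(flagged_points, image_shape):
    """Build a binary mask of the occlusion region from the sampling
    points whose depth difference exceeded the first threshold.

    flagged_points: (N, 2) array of (x, y) pixel positions.
    """
    mask = np.zeros(image_shape[:2], dtype=np.uint8)
    hull = cv2.convexHull(np.int32(flagged_points).reshape(-1, 1, 2))
    cv2.fillConvexPoly(mask, hull, 255)
    return mask
```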
  • in 207, the occluded information in the first occlusion region is restored.
  • optionally, a second image is captured, where the second image includes part or all of the occluded information in the first occlusion region, and the occluded information in the first occlusion region is restored based on that partial or complete information.
  • the first prompt information is generated according to the first occlusion region, where the first prompt information is used to prompt the position of the occlusion object, and the user captures the second image according to the first prompt information.
  • the second prompt information is generated according to the first occlusion region, where the second prompt information is used to prompt the user to move the camera, and the user captures the second image according to the second prompt information.
  • part or all of the occluded information in the first occlusion region may be recovered according to the second image.
  • optionally, the feature descriptors of the first image may be calculated according to the feature points of the first image, and the feature descriptors of the second image according to the feature points of the second image; a transformation matrix between the first image and the second image is calculated according to the feature descriptors of the two images; finally, the first occlusion region is interpolated according to the gray values of the second image and the transformation matrix, thereby restoring the occluded information of the first occlusion region.
  • the feature points may be points that are more easily distinguished from surrounding pixel points in areas where the gray scale changes drastically, and thus are easy to detect, such as the corner of a rectangular border in the image.
  • These points can usually be described by a Feature Descriptor calculated from points in a surrounding area.
  • the feature descriptor can be the Scale-Invariant Feature Transform (SIFT), Speeded-Up Robust Features (SURF), the Histogram of Oriented Gradients (HOG), and the like.
  • optionally, to calculate the transformation matrix between two images based on corresponding feature points, an initial value may first be calculated using a linear algorithm such as Direct Linear Transformation (DLT), and then further optimized using a nonlinear algorithm such as Gauss-Newton, Gradient Descent, or Levenberg-Marquardt (LM).
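  • a sketch of this matching and estimation step with OpenCV: SIFT supplies the feature descriptors, and cv2.findHomography estimates the homography robustly and refines it; the ratio-test constant and the reprojection threshold are illustrative assumptions:

```python
import cv2
import numpy as np

def estimate_transform(img0, img1):
    """Estimate the homography H mapping points of img0 into img1."""
    g0 = cv2.cvtColor(img0, cv2.COLOR_BGR2GRAY)
    g1 = cv2.cvtColor(img1, cv2.COLOR_BGR2GRAY)
    sift = cv2.SIFT_create()
    kp0, des0 = sift.detectAndCompute(g0, None)
    kp1, des1 = sift.detectAndCompute(g1, None)

    # Lowe ratio test on the two nearest neighbours of each descriptor.
    matches = cv2.BFMatcher(cv2.NORM_L2).knnMatch(des0, des1, k=2)
    good = [m for m, n in matches if m.distance < 0.75 * n.distance]

    src = np.float32([kp0[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    dst = np.float32([kp1[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    return H
```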
  • the information of the occluded area in the second image may be restored by using the first image.
  • the quadrilateral is subjected to document correction.
  • the user takes the third image according to the first prompt information.
  • the second prompt information is generated according to the first occlusion region, the second prompt information is used to prompt the user to move the camera, and the user captures the third image according to the second prompt information.
  • optionally, the feature descriptors of the first image may be calculated according to the feature points of the first image, and the feature descriptors of the third image according to the feature points of the third image;
  • a transformation matrix between the first image and the third image is calculated according to the feature descriptors of the first image and of the third image; finally, the first occlusion region is interpolated according to the gray values of the third image and the transformation matrix, so as to recover the occluded information of the first occlusion region.
  • optionally, a fourth image, a fifth image, and so on may be captured, and the image restored using multiple images.
  • FIG. 4 is a schematic flowchart of restoring the occluded information in the region of the quadrilateral.
  • the specific operation flow for restoring the occluded information in the first occlusion region in the quadrilateral region is as follows:
  • a feature point in the occluded first image I0 is extracted, and a feature descriptor of the first image I0 is calculated.
  • a feature point in the second image I1 is extracted, and a feature descriptor of the second image I1 is calculated.
  • a transformation matrix H between the first image I0 and the second image I1 is calculated according to the feature descriptor of the first image I0 and the feature descriptor of the second image I1.
  • the region M01 in the second image I1 that corresponds to the occlusion region M0 of the first image I0 is calculated based on the transformation matrix H.
  • as shown in FIG. 5, the occlusion region of the first image I0 is M0 and the occlusion region of the second image I1 is M1; the region M01 corresponding to the occlusion region M0 is calculated in the second image I1 according to the transformation matrix H. Here M01 and M1 have no intersection, so the occlusion region M0 of the first image I0 can be restored entirely according to the second image I1.
  • the specific recovery is as follows: the occlusion region M0 is interpolated using the gray values of the second image I1 and the transformation matrix H between the first image I0 and the second image I1.
  • as shown in FIG. 6, the occlusion region of the first image I0 is M0 and the occlusion region of the second image I1 is M1; the region M01 corresponding to M0 is calculated in the second image I1 according to the transformation matrix H. Here M01 and M1 have an intersection Mc; let Ms be the part of M01 outside Mc. The position Ms0 of Ms in the first image I0 is calculated from the transformation matrix H; at this time, the region Ms0 of the first image I0 can be restored according to the second image I1.
  • the specific recovery is as follows: the region Ms0 is interpolated using the gray values of the second image I1 and the transformation matrix H between the first image I0 and the second image I1.
  • at this time, the new occlusion region of the first image I0 (the part of M0 that has not yet been recovered) is a non-empty set, and the recovery may continue with a further image.
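  • a sketch of the interpolation step, in the notation of FIG. 5 and FIG. 6: the second image I1 is warped back into the frame of I0 through H, and only the pixels of M0 that are visible in I1 are copied. Everything beyond the names I0, I1, H, M0 and M1 is an illustrative assumption (nearest/linear resampling stands in for the application's gray-value interpolation):

```python
import cv2
import numpy as np

def recover_occluded(I0, I1, H, M0, M1):
    """Fill the occlusion region M0 of I0 with pixels warped from I1.

    H maps I0 coordinates to I1 coordinates; M0 and M1 are uint8 masks of
    the occlusion regions of I0 and I1. Returns the repaired image and the
    mask of still-unrecovered pixels (empty when M01 and M1 do not meet)."""
    h, w = I0.shape[:2]
    # Since H maps I0 -> I1, warp I1 (and its mask) back into I0's frame.
    I1_in_I0 = cv2.warpPerspective(
        I1, H, (w, h), flags=cv2.INTER_LINEAR | cv2.WARP_INVERSE_MAP)
    M1_in_I0 = cv2.warpPerspective(
        M1, H, (w, h), flags=cv2.INTER_NEAREST | cv2.WARP_INVERSE_MAP,
        borderValue=255)
    # Recoverable pixels: occluded in I0 but visible (unoccluded) in I1.
    recoverable = (M0 > 0) & (M1_in_I0 == 0)
    out = I0.copy()
    out[recoverable] = I1_in_I0[recoverable]
    remaining = ((M0 > 0) & ~recoverable).astype(np.uint8) * 255
    return out, remaining
```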
  • the occlusion region is determined by the depth data of the captured image and the depth data of the reference plane, and the occluded information is restored.
  • FIG. 7 is a schematic block diagram of an apparatus 400 for image processing in accordance with an embodiment of the present application. As shown in Figure 7, the device includes:
  • the photographing module 401 is configured to capture a first image, where the first image includes a quadrilateral whose region contains a plurality of sampling points;
  • the creating module 402 is configured to establish a reference plane, where the reference plane is a plane where the quadrilateral is located, and the plurality of sampling points respectively have corresponding projection points on the reference plane;
  • the calculating module 403 is configured to calculate a difference between the first depth data and the second depth data, where the first depth data is a distance between each sampling point of the plurality of sampling points and a plane where the camera is located, and the second depth data is The distance between the projection point corresponding to each sampling point and the plane of the camera;
  • the determining module 404 is configured to determine, according to the difference, whether an obstruction exists in the area of the quadrilateral.
  • optionally, before the creating module 402 establishes the reference plane, the device 400 further includes:
  • a determining module 405, configured to perform edge detection on the first image, and determine at least four edge line segments;
  • An obtaining module 406, configured to acquire depth data of each edge line segment of the at least four edge line segments
  • the selecting module 407 is configured to select four coplanar edge line segments to form the quadrilateral according to the depth data of each edge segment.
  • the modules included in FIG. 8 do not necessarily all exist.
  • the determining module 404 is specifically configured to: if the percentage of sampling points whose difference is greater than the first threshold, out of the total number of the plurality of sampling points, is greater than the second threshold, determine that an obstruction exists in the region of the quadrilateral; if that percentage is less than or equal to the second threshold, or if the difference of every sampling point is less than or equal to the first threshold, determine that no obstruction exists in the region of the quadrilateral.
  • the terminal 400 further includes:
  • the obtaining module 406 is further configured to acquire, in the area of the quadrilateral, the first sampling point whose difference is greater than the first threshold;
  • the determining module 405 is further configured to determine a first obstruction area according to the location of the first sampling point;
  • the recovery module 408 is configured to recover the occluded information in the first occlusion region.
  • the modules included in FIG. 8 do not necessarily all exist.
  • the recovery module 408 is specifically configured to:
  • the photographing module 401 is further configured to capture a second image, where the second image includes some or all of the occluded information in the first obstruction region;
  • the recovery module 408 restores the occluded information in the first occlusion region based on the partial or complete occluded information.
  • the recovery module 408 is specifically configured to:
  • the photographing module 401 is further configured to capture a third image, where the third image includes partially occluded information in the first occlusion region;
  • the recovery module 408 restores the occluded information in the first occlusion region according to the partial occluded information in the first occlusion region included in the second image and the partial occluded information in the first occlusion region included in the third image.
  • the terminal 400 further includes:
  • the generating module 409 is configured to generate first prompt information, where the first prompt information is used to prompt the location of the obstruction.
  • the modules included in FIG. 8 do not necessarily all exist.
  • the generating module 409 is further configured to:
  • a second prompt information is generated, and the second prompt information is used to prompt the user to move the camera.
  • optionally, the terminal 400 further includes a document correction module 410, configured to perform document correction on the quadrilateral after the recovery module 408 restores the occluded information in the first occlusion region.
  • the document correction module 410 is further configured to: when the determining module 404 determines that no obstruction exists in the region of the quadrilateral, perform document correction on the quadrilateral.
  • the first image is an image including document information.
  • FIG. 9 is a schematic block diagram of an apparatus 500 for image processing provided by an embodiment of the present application.
  • the apparatus 500 includes:
  • a memory 510 configured to store program code
  • the processor 530 is configured to execute program code in the memory 510.
  • the program code includes execution code of each operation of the method 200 of the embodiment of the present application.
  • the processor 530 may be a central processing unit ("CPU"), and the processor 530 may also be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like.
  • the general-purpose processor may be a microprocessor, or any conventional processor, or the like.
  • the memory 510 can include read only memory and random access memory and provides instructions and data to the processor 530. A portion of the memory 510 can also include a non-volatile random access memory. For example, the memory 510 can also store information of the device type.
  • the camera 520 is configured to capture a first image, where the first image includes a quadrilateral whose region contains a plurality of sampling points;
  • the processor 530 is configured to establish a reference plane, where the reference plane is a plane where the quadrilateral is located, and the plurality of sampling points respectively have corresponding projection points on the reference plane;
  • the processor 530 is configured to calculate a difference between the first depth data and the second depth data, where the first depth data is a distance between each of the plurality of sampling points and a plane where the camera is located.
  • the second depth data is a distance between the projection point corresponding to each sampling point and a plane where the camera is located;
  • the processor 530 is configured to determine, according to the difference, whether an obstruction exists in the area of the quadrilateral.
  • optionally, the processor 530 is further configured to: perform edge detection on the first image to determine at least four edge line segments; acquire depth data of each of the at least four edge line segments; and, according to the depth data of each edge line segment, select four coplanar edge line segments to form the quadrilateral.
  • optionally, the processor 530 is further configured to: if the percentage of sampling points whose difference is greater than the first threshold is greater than the second threshold, determine that an obstruction exists in the region of the quadrilateral; if that percentage is less than or equal to the second threshold, or if the difference of each sampling point is less than or equal to the first threshold, determine that no obstruction exists in the region of the quadrilateral.
  • optionally, the processor 530 is further configured to: when an obstruction exists in the region of the quadrilateral, acquire the first sampling points whose difference is greater than the first threshold, determine the first occlusion region according to the positions of the first sampling points, and recover the occluded information in the first occlusion region.
  • the camera 520 is further configured to capture a second image, where the second image includes some or all of the occluded information in the first occlusion region;
  • the processor 530 is further configured to recover the occluded information in the first occlusion region according to the part or all of the occlusion information.
  • the camera 520 is further configured to capture a third image, where the third image includes partially occluded information in the first occlusion region;
  • the processor 530 is further configured to: according to partial occlusion information in the first occlusion region included in the second image, and a portion in the first occlusion region included in the third image The occlusion information is restored to recover the occluded information in the first occlusion region.
  • processor 530 is further configured to:
  • the first prompt information is generated, and the first prompt information is used to prompt the position of the obstruction.
  • processor 530 is further configured to:
  • the second prompt information is generated, and the second prompt information is used to prompt the user to move the camera.
  • the processor 530 is further configured to perform document correction on the quadrilateral after restoring the occluded information in the first occlusion region.
  • the processor 530 is further configured to perform document correction on the quadrilateral when there is no obstruction in the area of the quadrilateral.
  • the first image is an image including document information.
  • the above steps may be completed by an integrated logic circuit of hardware in the processor 530 or an instruction in the form of software.
  • the steps of the method 200 disclosed in the embodiments of the present application may be directly implemented as hardware processor execution completion, or performed by a combination of hardware and software modules in the processor.
  • the software module can be located in a conventional storage medium such as random access memory, flash memory, read only memory, programmable read only memory or electrically erasable programmable memory, registers, and the like.
  • the storage medium is located in the memory, and the processor 530 reads the information in the memory and completes the steps of the method 200 described above in conjunction with its hardware. To avoid repetition, it will not be described in detail here.
  • FIG. 10 is a schematic structural diagram of a terminal device 600 according to an embodiment of the present application.
  • the terminal device 600 includes an image processing device 400, a display panel 610, a read-only memory (ROM) 620, a random access memory (RAM) 630, a register 640, and buttons 651 to 655.
  • the image processing device 400 can sense a button event of the button 651 to the button 655, such as a shooting event, an image processing event, or a document correction event, and can perform a corresponding image processing function.
  • the ROM 620 can be used to store the code that the image processing device 400 needs to execute; the RAM 630 is used when the image processing device 400 executes the code stored in the ROM 620 to implement the corresponding functions; the register 640 is used to store the type of the image processing device 400 with which the terminal device 600 starts image processing.
  • the display panel 610 is configured to display the running state of the terminal device 600 and the image processing device 400 to perform image processing, and provide a user with a friendly interaction interface, and can guide the user to perform related operations.
  • the display panel 610 can be covered with a touch screen, and the touch screen can be used to sense the user's touch events and control the image processing device 400 to perform corresponding operations, giving the user a certain space of choice.
  • the structure of the terminal device 600 shown in FIG. 10 does not constitute a limitation on the terminal device, and the terminal device 600 may further include more or fewer components than illustrated, or combine certain components, or arrange the components differently.
  • the embodiment of the present application provides a computer-readable storage medium for storing instructions that, when run on a computer, cause the computer to execute the image processing method 200 of the embodiments of the present application.
  • the readable medium may be a ROM or a RAM, which is not limited in the embodiments of the present application.
  • the disclosed systems, devices, and methods may be implemented in other manners.
  • the device embodiments described above are merely illustrative.
  • the division of units is only a logical function division; in actual implementation there may be other division manners. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, device or unit, and may be in an electrical, mechanical or other form.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
  • This functionality if implemented as a software functional unit and sold or used as a standalone product, can be stored on a computer readable storage medium.
  • the technical solution of the present application, in essence or in the part contributing to the prior art, or a part of the technical solution, may be embodied in the form of a software product stored in a storage medium, including instructions used to cause a computer device (which may be a personal computer, a server, or a network device, etc.) to perform all or part of the steps of the methods described in the various embodiments of the present application.
  • the foregoing storage medium includes various media that can store program codes, such as a USB flash drive, a mobile hard disk, a read only memory, a random access memory, a magnetic disk, or an optical disk.

Abstract

The present application provides a method and device for image processing, which use depth data acquired by a dual camera to determine whether an obstruction exists in a captured image and, when an obstruction exists, restore the occluded information, improving the user experience. The method includes: capturing a first image, where the first image includes a quadrilateral whose region contains a plurality of sampling points; establishing a reference plane, where the reference plane is the plane in which the quadrilateral lies and the sampling points each have a corresponding projection point on the reference plane; calculating the difference between first depth data and second depth data, where the first depth data is the distance between each sampling point and the plane in which the camera lies, and the second depth data is the distance between the projection point corresponding to each sampling point and the plane in which the camera lies; and determining, according to the difference, whether an obstruction exists in the region of the quadrilateral.

Description

Method and Device for Image Processing
Technical Field
This application relates to the field of communications, and more specifically, to a method and device for image processing.
Background
With the advance of technology, photography has become an indispensable way of acquiring information in everyday life. At the same time, users' expectations for photos keep rising, and image processing has become a common means of meeting them. For example, when photographing a PPT slide or a poster being shown in a meeting, the photographer's position and angle are often poor and not squarely facing the screen, so a photo of the PPT taken in normal mode comes out skewed, and the far side may be illegible. At present, such cases are mainly handled by document correction, which rectifies the document into a regular rectangle. In actual use, however, it is found that when a PPT is shown in a meeting scene, the presenter's body or arm often blocks the PPT, or the head of a front-row audience member enters the frame, so the occlusion remains after document correction, leaving the corrected document incomplete and unattractive.
To overcome the effect of obstructions when photographing document information such as PPT slides and posters, improving image processing capability is a problem that urgently needs to be solved.
Summary
Embodiments of the present application provide a method and device for image processing that can use depth data to determine whether an obstruction exists in a captured image and, when it is determined that an obstruction exists, remove the obstruction and restore the occluded information.
In a first aspect, an embodiment of the present application provides a method for image processing, including: capturing a first image, where the first image includes a quadrilateral whose region contains a plurality of sampling points; establishing a reference plane, where the reference plane is the plane in which the quadrilateral lies, and the plurality of sampling points each have a corresponding projection point on the reference plane; calculating the difference between first depth data and second depth data, where the first depth data is the distance between each of the sampling points and the plane in which the camera lies, and the second depth data is the distance between the projection point corresponding to each sampling point and the plane in which the camera lies; and determining, according to the difference, whether an obstruction exists in the region of the quadrilateral.
Therefore, in the embodiments of the present application, when an image is captured, a reference plane is established from the captured image, and whether an obstruction exists in the captured image can be determined from the difference between the depth data of the captured image and the depth data of the reference plane, implementing a judgment of whether an obstruction exists within the range of the captured image.
Optionally, in an implementation of the first aspect, before the reference plane is established, the method further includes: performing edge detection on the first image to determine at least four edge line segments; acquiring depth data of each of the at least four edge line segments; and selecting, according to the depth data of each edge line segment, four coplanar edge line segments to form the quadrilateral.
Optionally, in an implementation of the first aspect, after the reference plane is established, edge detection is performed on the quadrilateral on the reference plane according to the first depth data, to obtain a more accurate region of the quadrilateral.
Optionally, in an implementation of the first aspect, determining, according to the difference, whether an obstruction exists in the region of the quadrilateral includes: if the percentage of sampling points whose difference is greater than a first threshold, out of the total number of sampling points, is greater than a second threshold, determining that an obstruction exists in the region of the quadrilateral; if that percentage is less than or equal to the second threshold, or if the difference of every sampling point is less than or equal to the first threshold, determining that no obstruction exists in the region of the quadrilateral.
Optionally, in an implementation of the first aspect, establishing the reference plane includes: taking the plane in which the camera that captures the first image lies as the xy plane, the distance from the camera to the quadrilateral as the z-axis, and a point on the plane in which the camera lies as the origin, establishing a spatial coordinate system; and, according to the spatial coordinate system, establishing the reference plane using the spatial plane equation Ax + By + Cz + D = 0, where x, y and z are the three variables of the spatial coordinate system and A, B, C and D are constants.
Optionally, in an implementation of the first aspect, establishing the reference plane according to the spatial coordinate system using the spatial plane equation Ax + By + Cz + D = 0 includes: acquiring the spatial coordinates of the quadrilateral; selecting the spatial coordinates of the quadrilateral at least twice, three points of the spatial coordinates being selected each time; solving the coefficients A, B, C and D of the plane equation with the three points selected each time; and comparing the coefficients A, B, C and D solved each of the at least two times; if the error between the coefficients A, B, C and D solved each time is smaller than a third threshold, the plane equation determined by the coefficients A, B, C and D whose error is smaller than the third threshold is the plane equation of the reference plane.
Optionally, in an implementation of the first aspect, the method further includes: when an obstruction exists in the region of the quadrilateral, acquiring, within the region of the quadrilateral, the first sampling points whose difference is greater than the first threshold; determining a first occlusion region according to the positions of the first sampling points; and restoring the occluded information within the first occlusion region.
Therefore, in the embodiments of the present application, when it is determined that an obstruction exists in a certain region of the captured image, the occlusion region is determined from the depth data of the captured image and the depth data of the reference plane, and the occluded information is restored.
Optionally, in an implementation of the first aspect, restoring the occluded information within the first occlusion region includes: capturing a second image, where the second image includes part or all of the occluded information in the first occlusion region; and restoring the occluded information in the first occlusion region according to the partial or complete occluded information.
Optionally, in an implementation of the first aspect, first feature points in the first image are extracted and first feature descriptors are computed; second feature points in the second image are extracted and second feature descriptors are computed; a transformation matrix between the first image and the second image is computed from the first feature descriptors and the second feature descriptors; and a second occlusion region in the second image is computed according to the transformation matrix.
Optionally, in an implementation of the first aspect, the second occlusion region includes part or all of the occluded information in the first occlusion region.
Optionally, in an implementation of the first aspect, when the second image includes all of the occluded information in the first occlusion region, the first occlusion region is interpolated using the gray values of the pixels in the second image and the transformation matrix, so as to restore the occluded information in the first occlusion region.
Optionally, in an implementation of the first aspect, when the second image includes only part of the occluded information in the first occlusion region, it is determined that the first occlusion region and the second occlusion region have a first intersection region, and a third occlusion region is computed, the third occlusion region being the part of the second occlusion region other than the first intersection region; according to the transformation matrix, a fourth occlusion region in the first image corresponding to the third occlusion region is computed; and the fourth occlusion region is interpolated using the gray values of the pixels in the second image and the transformation matrix, so as to restore the occluded information in the fourth occlusion region.
Optionally, in an implementation of the first aspect, it is determined whether all the occluded information in the first occlusion region has been restored; when all the occluded information in the first occlusion region has been restored, the procedure exits; when the occluded information in the first occlusion region is only partially restored and there is no new image, indication information indicating that clearing the fifth occlusion region failed is generated; when the occluded information in the first occlusion region is only partially restored and a new image exists, a third image is captured.
Optionally, in an implementation of the first aspect, restoring the occluded information within the first occlusion region further includes: capturing a third image, where the third image includes part of the occluded information in the first occlusion region; and restoring the occluded information in the first occlusion region according to the partial occluded information in the first occlusion region included in the second image and the partial occluded information in the first occlusion region included in the third image.
Optionally, in an implementation of the first aspect, the method further includes: generating first prompt information, where the first prompt information is used to indicate the position of the obstruction.
Optionally, in an implementation of the first aspect, the method further includes: generating second prompt information, where the second prompt information is used to prompt the user to move the camera.
Optionally, in an implementation of the first aspect, the method further includes: performing document correction on the quadrilateral after the occluded information within the first occlusion region is restored.
Optionally, in an implementation of the first aspect, the method further includes: performing document correction on the quadrilateral when no obstruction exists in the region of the quadrilateral.
Optionally, in an implementation of the first aspect, the first image is an image that includes document information.
In a second aspect, an embodiment of the present application provides a device for image processing, configured to perform the method in the first aspect or any possible implementation of the first aspect. Specifically, the device includes modular units for performing the method in the first aspect or any possible implementation of the first aspect.
In a third aspect, an embodiment of the present application provides a device for image processing, configured to perform the method in the first aspect or any possible implementation of the first aspect. The device includes a processor, a memory, and a camera, where the memory is configured to store instructions, the camera is configured to capture images, and the processor is configured to execute the instructions stored in the memory.
The camera is configured to capture a first image, where the first image includes a quadrilateral whose region contains a plurality of sampling points;
the processor is configured to establish a reference plane, where the reference plane is the plane in which the quadrilateral lies, and the plurality of sampling points each have a corresponding projection point on the reference plane;
the processor is configured to calculate the difference between first depth data and second depth data, where the first depth data is the distance between each of the sampling points and the plane in which the camera lies, and the second depth data is the distance between the projection point corresponding to each sampling point and the plane in which the camera lies;
the processor is configured to determine, according to the difference, whether an obstruction exists in the region of the quadrilateral;
the processor is further configured to restore the occluded information in the region of the quadrilateral when it is determined that an obstruction exists in the region of the quadrilateral;
the processor is further configured to perform document correction after the occluded information is restored.
In a fourth aspect, an embodiment of the present application provides a terminal device, where the terminal device includes the image processing device of the second aspect, a display panel, a read-only memory, a random access memory, a register, and at least one button.
In a fifth aspect, an embodiment of the present application provides a computer program product containing instructions that, when run on a computer, cause the computer to perform the method in the possible implementations of the foregoing aspects.
In a sixth aspect, an embodiment of the present application provides a computer-readable storage medium storing instructions that, when run on a computer, cause the computer to perform the method in the possible implementations of the foregoing aspects.
Brief Description of the Drawings
FIG. 1 is a schematic structural diagram of a mobile phone to which the embodiments of the present application are applicable.
FIG. 2 is a schematic flowchart of a method of image processing according to an embodiment of the present application.
FIG. 3 is a schematic flowchart of a method of image processing according to another embodiment of the present application.
FIG. 4 is a schematic flowchart of restoring the occluded information in the region of the quadrilateral.
FIG. 5 is a schematic diagram of one way of restoring occluded information in the quadrilateral region.
FIG. 6 is a schematic diagram of another way of restoring occluded information in the quadrilateral region.
FIG. 7 is a schematic block diagram of a device for image processing according to an embodiment of the present application.
FIG. 8 is a schematic block diagram of a device for image processing according to another embodiment of the present application.
FIG. 9 is a schematic block diagram of a device for image processing provided by an embodiment of the present application.
FIG. 10 is a schematic structural diagram of a terminal device according to an embodiment of the present application.
具体实施方式
下面将结合附图,对本申请实施例中的技术方案进行描述。
应理解,在本申请实施例中,图像处理的设备可以是照相机,也可以是一些具有拍照功能的终端,例如,用户设备(User Equipment,简称为“UE”)、接入终端、用户单元、移动设备、用户终端、航拍设备、无人机设备、监控设备或用户装置。接入终端可以是蜂窝电话、无绳电话、会话启动协议(Session Initiation Protocol,简称为“SIP”)电话、无线本地环路(Wireless Local Loop,简称为“WLL”)站、个人数字处理(Personal Digital Assistant,简称为“PDA”)、具有无线通信功能的手持设备、计算设备或连接到无线调制解调器的其它处理设备、车载设备、可穿戴设备,未来第五代(5th-Generation,简称为“5G”)网络中的终端设备或者未来演进的公共陆地移动网络(Public Land Mobile Network,简称为“PLMN”)网络中的终端设备等。
可选地,以该图像处理设备为手机为例,图1示出了可应用于本申请实施例的一种手机100的示意性结构图。
如图1所示,该手机100包括射频(Radio Frequency,简称为“RF”)电路110、存储器120、输入单元130、显示单元140、音频电路150、处理器160、以及电源170等部件。本领域技术人员可以理解,图1中示出的手机100的结构并不构成对手机的限定,可以包括比图示更多或更少的部件,或者组合某些部件,或者不同的部件布置。
下面结合图1对终端设备100的各个构成部件进行具体的介绍:
RF电路110可用于收发信息或通话过程中,信号的接收和发送,特别地,将基站的下行信息接收后,给处理器160处理;另外,将上行的数据发送给基站。通常,RF电路包括但不限于天线、至少一个放大器、收发信机、耦合器、低噪声放大器(Low Noise Amplifier,简称为“LNA”)、双工器等。此外,RF电路110还可以通过无线通信与网络和其他设备通信。该无线通信可以使用任一通信标准或协议,包括但不限于全球移动通讯系统(Global System of Mobile communication,简称为“GSM”)、通用分组无线服务(General Packet Radio Service,简称为“GPRS”)、宽带码分多址(Code Division Multiple Access,码分多址)、WCDMA(Wideband Code Division Multiple Access,简称为“CDMA”)、长期演进(Long Term Evolution,简称为“LTE”)、电子邮件、短消息服务(Short Messaging Service,简称为“SMS”)等。
存储器120可用于存储软件程序以及模块,处理器160通过运行存储在存储器120 的软件程序以及模块,从而执行手机100的各种功能应用以及数据处理。存储器160可主要包括存储程序区和存储数据区,其中,存储程序区可存储操作系统、至少一个功能所需的应用程序(比如声音播放功能、图像播放功能、拍照功能、文档校正功能、遮挡检测与清除功能等)等;存储数据区可存储根据手机100的使用所创建的数据(比如音频数据、图像数据、电话本等)等。此外,存储器120可以包括高速随机存取存储器,还可以包括非易失性存储器,例如至少一个磁盘存储器件、闪存器件、或其他易失性固态存储器件。
输入单元130可用于接收输入的数字或字符信息,以及产生与手机100的用户设置以及功能控制有关的键信号输入。具体地,输入单元130可包括按键131、触摸屏132、摄像头133以及其它输入设备134。按键131可以感应用户对其进行的按压操作,并根据预先设定的程式驱动相应的连接装置。可选地,该按键包括电源开关按键、音量控制按键、home键和快捷键等。触摸屏132,也称为触控面板,可收集用户在其上或附近的触摸操作,比如用户使用手指、触笔等任何适合的物体或附件在触摸屏132上或在触摸屏132附近的操作,并根据预先设定的程式驱动相应的连接装置。可选的,触摸屏132可包括触摸检测装置和触摸控制器两个部分。其中,触摸检测装置检测用户的触摸方位,并检测触摸操作带来的信号,将信号传送给触摸控制器;触摸控制器从触摸检测装置上接收触摸信息,并将它转换成触点坐标,再送给处理器160,并能接收处理器160发来的命令并加以执行。此外,可以采用电阻式、电容式、红外线以及表面声波等多种类型实现触摸屏132。可选地,摄像头133可以采集目标图像的深度数据,例如,双摄像头。除了按键131、触摸屏132和摄像头133,输入单元130还可以包括其它输入设备134。具体地,其它输入设备134可以包括但不限于物理键盘、轨迹球、鼠标、操作杆等中的一种或多种。
显示单元140可用于显示由用户输入的信息或提供给用户的信息以及手机100的各种菜单。显示单元140可包括显示面板141,可选的,可以采用液晶显示器(Liquid Crystal Display,简称为“LCD”)、有机发光二极管(Organic Light-Emitting Diode,简称为“OLED”)等形式来配置显示面板141。其中,在按键131检测到按压操作时,可以传送给处理器160以确定按键事件的类型,随后处理器160根据按键事件的类型在显示面板141上提供相应的视觉输出。进一步的,触摸屏132可覆盖显示面板141,当触摸屏132检测到在其上或附近的触摸操作后,传送给处理器160以确定触摸事件的类型,随后处理器160根据触摸事件的类型在显示面板141上提供相应的视觉输出。虽然在图1中示出触摸屏132与显示面板141是作为两个独立的部件来实现手机100的输入和输入功能,但是在某些实施例中,可以将触摸屏132与显示面板141集成而实现手机100的输入和输出功能。
音频电路150、扬声器151、麦克风152可提供用户与手机100之间的音频接口。音频电路150可将接收到的音频数据转换为电信号后,传输到扬声器151,由扬声器151转换为声音信号输出;另一方面,麦克风152将收集的声音信号转换为电信号,由音频电路150接收后转换为音频数据,再将音频数据输出至RF电路110以发送给比如另一终端设备,或者将音频数据输出至存储器120以便进一步处理。
处理器160是手机100的控制中心,利用各种接口和线路连接整个终端设备的各个部分,通过运行或执行存储在存储器120内的软件程序和/或模块,以及调用存储在存储器120内的数据,执行手机100的各种功能和处理数据,从而对手机100进行整体监控。 可选的,处理器160可包括一个或多个处理单元;优选的,处理器160可集成应用处理器和调制解调处理器,其中,应用处理器主要处理操作系统、用户界面和应用程序等,调制解调处理器主要处理无线通信。可以理解的是,上述调制解调处理器也可以不集成到处理器160中。
手机100还包括给各个部件供电的电源170(比如电池),优选的,电源可以通过电源管理系统与处理器160逻辑相连,从而通过电源管理系统实现管理充电、放电、以及功耗管理等功能。
尽管未示出,手机100还可以包括无线保真(Wireless Fidelity,简称为“WiFi”)模块、蓝牙模块等,还可配置有光传感器、陀螺仪、气压计、湿度计、温度计、红外线传感器等,在此不再赘述。
图2是根据本申请实施例的一种图像处理的方法200的示意性流程图。该方法200可以由具有拍摄功能的一些终端执行,例如,可以由图1所示的手机100执行,当然,也可以由其他终端执行,本申请实施例并不对此进行特别限定。
在201中,拍摄第一图像,该第一图像中包括四边形,该四边形的区域内具有多个采样点。
可选地,四边形可以是正方形,也可以是长方形,还可以是一些不规则的四边形,四边形的四条边可以是直线,也可以是近似直线的曲线。
可选地,该四边形范围内可以包括文档信息,例如PPT、海报、指示牌、车牌等。
可选地,在拍摄该第一图像之前先判断该第一图像中是否存在该四边形,确定存在该四边形时,拍摄该第一图像。
可选地,可以通过边缘检测确定该四边形。
具体地,对该第一图像进行边缘检测,确定至少四条边缘线段,获取该至少四条边缘线段中每一条边缘线段的深度数据,根据该每一条边缘线段的深度数据,选取四条共面的边缘线段构成该四边形。
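作为帮助理解的示意,下面用一段Python/OpenCV代码勾勒“边缘检测——取每条边缘线段的深度数据——选取四条共面边缘”的大致过程。该草图假设深度图depth与第一图像逐像素对齐、相机内参(fx, fy, cx, cy)已知;其中的阈值、采样数目和函数划分均为说明用途的假设,并非本申请限定的实现。

```python
import cv2
import numpy as np
from itertools import combinations

def select_coplanar_quad_edges(gray, depth, fx, fy, cx, cy, plane_tol=0.02):
    # Canny边缘检测 + 概率Hough变换,得到候选边缘线段
    edges = cv2.Canny(gray, 50, 150)
    segs = cv2.HoughLinesP(edges, 1, np.pi / 180, threshold=80,
                           minLineLength=60, maxLineGap=5)
    if segs is None:
        return None
    segs = segs[:, 0]  # 每条线段表示为 (x1, y1, x2, y2)

    def backproject(u, v):
        # 按针孔模型,把像素 (u, v) 及其深度反投影为相机坐标系下的三维点
        z = float(depth[int(v), int(u)])
        return np.array([(u - cx) * z / fx, (v - cy) * z / fy, z])

    def seg_points_3d(s):
        # 沿线段均匀采样若干像素,结合其深度数据得到三维点
        p0, p1 = np.array(s[:2], float), np.array(s[2:], float)
        return np.array([backproject(*(p0 + t * (p1 - p0)))
                         for t in np.linspace(0.0, 1.0, 8)])

    # 穷举四条线段的组合,用SVD拟合平面并检验共面性
    for four in combinations(segs[:12], 4):
        pts = np.vstack([seg_points_3d(s) for s in four])
        centroid = pts.mean(axis=0)
        normal = np.linalg.svd(pts - centroid)[2][-1]   # 最小奇异值方向即法向
        if np.abs((pts - centroid) @ normal).max() < plane_tol:
            return four   # 四条边缘线段近似共面,可构成该四边形
    return None
```

草图中以“所有采样点到拟合平面的距离都足够小”作为共面判据,与文中“根据每一条边缘线段的深度数据,选取四条共面的边缘线段”的思路一致。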
在202中,建立基准平面,该基准平面为该四边形所在的平面,该多个采样点在该基准平面上分别具有对应的投影点。
可选地,可以根据该四边形,建立该基准平面。
可选地,可以采用多次迭代的方式确定该基准平面的平面方程。
具体地,选取该四边形的区域内的多个采样点中的三点,利用空间平面方程Ax+By+Cz+D=0建立该基准平面,其中,x、y、z为空间坐标系中的三个变量,A、B、C和D为常数,该四边形的区域内的多个采样点处于一个空间坐标系中,该空间坐标系是以捕获该第一图像的摄像头所在的平面为xy平面,以该摄像头指向该四边形的方向为z轴,以该摄像头所在的平面上的一点为原点建立的。
例如,获取该四边形的空间坐标,至少两次选取该四边形的空间坐标,该至少两次中每次选取该四边形的空间坐标中的三点,利用该至少两次中每次选取的该四边形的空间坐标中的三点,求解平面方程的系数A、B、C和D,比较该至少两次中每次求解的平面方程的系数A、B、C和D,如果该至少两次中每次求解的平面方程的系数A、B、C和D的误差小于第三阈值,则根据误差小于第三阈值的系数A、B、C和D确定的平面方程为该基准平面的平面方程。
例如,第一次选取该四边形区域内的三点a、b、c,它们的坐标值具体为a(x1,y1,z1),b(x2,y2,z2),c(x3,y3,z3),将该a、b、c代入空间平面方程Ax+By+Cz+D=0中,计算得到A1、B1、C1和D1;第二次选取该四边形区域内的三点d、e、f,它们的坐标值具体为d(x4,y4,z4),e(x5,y5,z5),f(x6,y6,z6),将该d、e、f代入空间平面方程Ax+By+Cz+D=0中,计算得到A2、B2、C2和D2。比较两次求解的平面方程的系数A1、B1、C1和D1与A2、B2、C2和D2:如果两次求解的平面方程的系数A、B、C和D的误差小于第三阈值,则根据误差小于第三阈值的系数A、B、C和D确定的平面方程为该基准平面的平面方程;如果两次求解的平面方程的系数A、B、C和D的误差大于或者等于第三阈值,则再次选取该四边形区域内的三点j、h、i,它们的坐标值具体为j(x7,y7,z7),h(x8,y8,z8),i(x9,y9,z9),将该j、h、i代入空间平面方程Ax+By+Cz+D=0中,计算得到A3、B3、C3和D3,再比较三次求解的平面方程的系数A1、B1、C1和D1与A3、B3、C3和D3,以及A2、B2、C2和D2与A3、B3、C3和D3:如果其中两次求解的平面方程的系数A、B、C和D的误差小于第三阈值,则根据误差小于第三阈值的系数A、B、C和D确定的平面方程为该基准平面的平面方程;如果还是无法满足建立该基准平面的平面方程的要求,就继续取点求系数A、B、C和D,直至满足系数A、B、C和D的误差小于第三阈值的要求。
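上述多次取点、比较系数误差的迭代过程,可以概括为如下示意性的Python草图。其中points为该四边形区域内采样点的三维坐标,eps对应文中的第三阈值;系数在比较前先做单位化并统一符号,这些细节均为草图层面的假设。

```python
import numpy as np

def plane_from_three(p1, p2, p3):
    # 由三点求平面 Ax + By + Cz + D = 0 的系数,并单位化以便比较
    normal = np.cross(p2 - p1, p3 - p1)          # (A, B, C)
    norm = np.linalg.norm(normal)
    if norm < 1e-9:
        return None                              # 三点近似共线,本次取点无效
    normal = normal / norm
    coeffs = np.append(normal, -normal.dot(p1))  # 末位为 D
    # 统一符号,避免 (A,B,C,D) 与 (-A,-B,-C,-D) 被误判为两个平面
    return coeffs if coeffs[0] >= 0 else -coeffs

def fit_reference_plane(points, eps=1e-2, max_iter=20, seed=0):
    # 反复随机取三点求系数,直到相邻两次求得的系数误差小于第三阈值 eps
    rng = np.random.default_rng(seed)
    prev = None
    for _ in range(max_iter):
        cur = plane_from_three(*points[rng.choice(len(points), 3, replace=False)])
        if cur is None:
            continue
        if prev is not None and np.abs(cur - prev).max() < eps:
            return cur                           # 作为基准平面的平面方程系数
        prev = cur
    return None                                  # 无法满足误差要求
```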
可选地,该四边形的多个采样点在该基准平面上分别具有对应的投影点。
可选地,该四边形的多个采样点的深度数据与该多个采样点对应的投影点的深度数据可以部分相同,也可以全部相同,也可以全部不同。
可选地,在该基准平面上进行边缘检测可以确定更加精确的该四边形。
在203中,计算第一深度数据与第二深度数据的差值,该第一深度数据为该多个采样点中每个采样点与摄像头所在平面之间的距离,该第二深度数据为该每个采样点对应的该投影点与摄像头所在平面之间的距离。
例如,计算该第一深度数据与该第二深度数据的差值,可以是:求采样点1与摄像头所在平面之间的距离,以及采样点1的投影点与摄像头所在平面之间的距离,再取这两个距离之差。
可选地,计算该多个采样点中每个采样点对应的差值。
可选地,不同的采样点对应的差值可以相同,也可以不同。
在204中,根据该差值,判断该四边形的区域内是否存在遮挡物。
可选地,可以通过如下方式确定该四边形的区域内是否存在遮挡物:
如果该差值大于第一阈值的采样点占该多个采样点总数的百分比大于第二阈值,则确定在该四边形的区域内存在遮挡物;
如果该差值大于第一阈值的采样点占该多个采样点总数的百分比小于或者等于第二阈值,或如果该每个采样点的该差值小于或等于第一阈值,则确定在该四边形的区域内不存在遮挡物。
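203与204中“求差值——按百分比判断”的逻辑可以写成如下草图。文中未限定差值的符号约定,这里假设取投影点深度减去采样点深度(存在遮挡物时该值为正);t1、t2分别对应第一阈值和第二阈值,均为示意性参数。

```python
import numpy as np

def has_occlusion(first_depth, second_depth, t1, t2):
    # first_depth: 每个采样点与摄像头所在平面之间的距离(第一深度数据)
    # second_depth: 对应投影点与摄像头所在平面之间的距离(第二深度数据)
    diff = second_depth - first_depth   # 遮挡物使采样点深度明显变小,差值为正
    ratio = np.mean(diff > t1)          # 差值大于第一阈值的采样点所占百分比
    return ratio > t2                   # 大于第二阈值则判定四边形区域内存在遮挡物
```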
可选地,当该四边形的区域内不存在遮挡物时,对该四边形进行文档校正。
因此,在本申请实施例中,通过所拍摄图像的深度数据与基准平面的深度数据的差值,确定所拍摄图像的某一区域内是否存在遮挡物。
可选地,可以作为一个实施例,如图3所示,该方法200还包括以下内容。
在205中,当该四边形的区域内存在遮挡物时,在该四边形的区域内,获取该差值大于第一阈值的第一采样点。
可选地,当该四边形的区域内存在遮挡物时,该四边形区域内的第一深度数据会因遮挡物的影响发生变化,在存在遮挡物的位置,该第一深度数据与不存在遮挡物的位置的该第一深度数据相比,明显变小。
可选地,在存在遮挡物时,第一深度数据与第二深度数据的差值大于第一阈值。
可选地,该差值大于第一阈值的第一采样点可以是该四边形区域内的所有采样点,也可以是该四边形区域内的部分采样点。
在206中,根据该第一采样点的位置,确定第一遮挡物区域。
可选地,获取所有该差值大于第一阈值的该第一采样点,根据这些采样点的位置,确定该第一遮挡物区域。
例如,将所有该差值大于第一阈值的这些第一采样点依次用线段连接起来,线段圈起来的区域确定为该第一遮挡物区域。
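把差值大于第一阈值的第一采样点“依次用线段连接起来”圈出区域,一种常见的近似做法是取这些点的凸包并填充,如下面的Python/OpenCV草图所示;函数划分与参数均为示意性假设。

```python
import cv2
import numpy as np

def first_occluder_mask(points_xy, diff, t1, image_shape):
    # points_xy: 各采样点的像素坐标,形状 (N, 2);diff: 各采样点的深度差值
    first_pts = points_xy[diff > t1].astype(np.int32)   # 第一采样点
    mask = np.zeros(image_shape[:2], dtype=np.uint8)
    if len(first_pts) >= 3:
        hull = cv2.convexHull(first_pts)      # 依次连接外围的第一采样点
        cv2.fillConvexPoly(mask, hull, 255)   # 线段圈起的区域即第一遮挡物区域
    return mask
```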
在207中,恢复该第一遮挡物区域内的被遮挡信息。
可选地,拍摄第二图像,该第二图像中包括该第一遮挡物区域内的部分或全部被遮挡信息;
根据该部分或全部被遮挡信息,恢复该第一遮挡物区域内的被遮挡信息。
可选地,根据该第一遮挡物区域,生成第一提示信息,该第一提示信息用于提示遮挡物的位置,用户根据该第一提示信息拍摄该第二图像。
可选地,根据该第一遮挡物区域,生成第二提示信息,该第二提示信息用于提示用户移动摄像头,用户根据该第二提示信息拍摄该第二图像。
可选地,根据该第二图像可以恢复该第一遮挡物区域内的部分或全部被遮挡信息。
可选地,可以根据该第一图像的特征点,计算该第一图像的特征描述子;根据该第二图像的特征点,计算该第二图像的特征描述子;根据该第一图像的特征描述子和该第二图像的特征描述子计算该第一图像和该第二图像的变换矩阵;最后根据该第二图像的灰度值和该变换矩阵对该第一遮挡物区域进行插值运算,进而恢复该第一遮挡物区域的被遮挡信息。
应理解,特征点可以是位于灰度剧烈变化的区域、较易于与周围像素点区分开因而易于检测的点,如图像中矩形边框的角点(Corner)。这些点通常可以用其周围一块区域中的点计算出的特征描述子(Feature Descriptor)来描述,同时,特征描述子可以是尺度不变特征转换(Scale-invariant Feature Transform,SIFT)、加速稳健特征(Speeded Up Robust Features,SURF)和方向梯度直方图特征(Histogram of Oriented Gradient,HOG)等,特征描述子通常为一个向量。通过检测不同图像中的特征点并计算各特征点的描述子之间的相似性(如欧式距离等),即可确定两个特征点是否匹配,以实现特征点在不同帧图像间的跟踪。
还应理解,基于对应特征点计算两幅图像之间的变换矩阵时,可以使用直接线性变换(Direct Linear Transformation,DLT)等线性算法首先计算一个初始值,再使用高斯-牛顿算法(Gauss-Newton algorithm)、梯度下降算法(Gradient Descent)、列文伯格-马夸尔特算法(Levenberg-Marquardt,LM)等非线性算法进一步优化。
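上述“提取特征点——计算特征描述子——求变换矩阵”的流程可以用如下草图表示。这里以ORB特征为例(文中列举的SIFT、SURF等同理);OpenCV的cv2.findHomography在RANSAC框架下先用线性法求初值、再做Levenberg-Marquardt优化,与文中“DLT求初值、非线性算法进一步优化”的思路一致。具体特征类型与参数均为示意性选择。

```python
import cv2
import numpy as np

def estimate_homography(img0, img1):
    # img0、img1 假设为灰度图像
    orb = cv2.ORB_create(nfeatures=2000)
    kp0, des0 = orb.detectAndCompute(img0, None)   # 特征点与特征描述子
    kp1, des1 = orb.detectAndCompute(img1, None)

    # 按描述子之间的距离匹配特征点(ORB使用汉明距离)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(des0, des1), key=lambda m: m.distance)
    if len(matches) < 4:
        return None                                # 对应点不足,无法求变换矩阵

    src = np.float32([kp0[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([kp1[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)

    # 线性初值 + RANSAC 剔除误匹配 + LM 非线性优化
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 3.0)
    return H
```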
可选地,可以利用该第一图像恢复该第二图像中的被遮挡区域的信息。
可选地,在恢复该第一遮挡物区域内的被遮挡信息后,对该四边形进行文档校正。
可选地,当根据该第二图像只恢复该第一遮挡物区域内的部分被遮挡信息时,拍摄第三图像,该第三图像中包括该第一遮挡物区域内还未被恢复的部分或全部被遮挡信息。
可选地,根据该第一遮挡物区域,生成第一提示信息,该第一提示信息用于提示还未被恢复的遮挡物的位置,用户根据该第一提示信息拍摄该第三图像。
可选地,根据该第一遮挡物区域,生成第二提示信息,该第二提示信息用于提示用户移动摄像头,用户根据该第二提示信息拍摄该第三图像。
可选地,根据该第二图像中包括的该第一遮挡物区域内的部分遮挡信息、以及该第三图像中包括的该第一遮挡物区域内的部分被遮挡信息,恢复该第一遮挡物区域内的被遮挡信息。
可选地,同样可以根据该第一图像的特征点,计算该第一图像的特征描述子;根据该第三图像的特征点,计算该第三图像的特征描述子;根据该第一图像的特征描述子和该第三图像的特征描述子计算该第一图像和该第三图像的变换矩阵;最后根据该第三图像的灰度值和该变换矩阵对该第一遮挡物区域进行插值运算,进而恢复该第一遮挡物区域的被遮挡信息。
可选地,当该第二图像和该第三图像结合也无法完全恢复该第一图像中的被遮挡区域时,拍摄第四图像、第五图像……,利用多张图像结合来恢复该第一图像中的被遮挡区域。
可选地,当拍摄多张图像也无法恢复该第一图像中的被遮挡区域时,则该第一图像中的某些被遮挡区域无法恢复,返回这部分遮挡区域恢复失败的消息。
例如,图4是恢复四边形的区域内的遮挡物信息的示意性流程图。如图4所示,确定四边形区域内存在遮挡物时,恢复该四边形的区域内的第一遮挡物区域内的被遮挡信息的具体操作流程如下:
在301中,提取有遮挡的第一图像I0中的特征点,并计算该第一图像I0的特征描述子。
在302中,提取第二图像I1中的特征点,并计算该第二图像I1的特征描述子。
在303中,根据该第一图像I0的特征描述子和该第二图像I1的特征描述子计算该第一图像I0与该第二图像I1之间的变换矩阵H。
在304中,根据该变换矩阵H计算第一图像I0中遮挡物区域M0在第二图像I1中对应的遮挡物区域M01。
在305中,判断遮挡物区域M01与第二图像I1的遮挡物区域M1是否存在交集区域Mc
在306中,当遮挡物区域M01与第二图像I1的遮挡物区域M1不存在交集区域Mc时,利用第二图像I1中像素的灰度值和变换矩阵H对第一图像I0的遮挡物区域M0进行插值运算。
例如,如图5所示,第一图像I0的遮挡物区域是M0,第二图像I1的遮挡物区域是M1,根据变换矩阵H计算遮挡物区域M0在第二图像I1的遮挡物区域M01,其中,M01和M1没有交集,则根据该第二图像I1就可以恢复该第一图像I0的遮挡物区域M0。具体恢复方式如下:根据该第二图像I1的灰度值和该第一图像I0与该第二图像I1的变换矩阵H对该遮挡物区域M0进行插值运算。
在307中,遮挡物区域M01与第二图像I1的遮挡物区域M1存在交集区域Mc
在308中,计算M01中的非交集区域Ms,Ms=M01-Mc
在309中,根据变换矩阵H计算Ms在第一图像I0中的位置Ms0。
在310中,利用该第二图像I1的灰度值和该第一图像I0与该第二图像I1的变换矩阵H对该遮挡物区域Ms0进行插值运算。
在311中,更新该第一图像I0的新遮挡物区域M0新,M0新=M0-Ms0
例如,如图6所示,第一图像I0的遮挡物区域是M0,第二图像I1的遮挡物区域是M1,根据变换矩阵H计算遮挡物区域M0在第二图像I1的遮挡物区域M01,其中,M01和M1存在交集Mc,计算M01中的非交集区域Ms,Ms=M01-Mc,根据变换矩阵H计算Ms在第一图像I0中的位置Ms0,此时,根据该第二图像I1可以恢复该第一图像I0的遮挡物区域Ms0。具体恢复方式如下:根据该第二图像I1的灰度值和该第一图像I0与该第二图像I1的变换矩阵H对该遮挡物区域Ms0进行插值运算。
在312中,判断新遮挡物区域M0新是否为空集。
在313中,新遮挡物区域M0新为空集时,则该第一图像I0的遮挡物区域M0的遮挡信息已经恢复,结束该恢复遮挡信息的流程。
在314中,新遮挡物区域M0新为非空集。
在315中,判断是否还有新图像。
在316中,如果还有新图像,取得下一张新图像I2,即拍摄第三图像。
在317中,如果没有新图像,返回去除遮挡失败消息。
在318中,在拍摄第三图像后,重新开始执行301操作。
在319中,在返回去除遮挡失败消息后,结束该恢复遮挡信息的流程。
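图4中301至319的循环可以概括为如下草图:用变换矩阵把第一图像的遮挡物区域映射到新图像,扣除新图像自身的遮挡区域后把可用像素回填,并更新剩余遮挡区域,直到其为空集或没有新图像为止。其中estimate_homography即上文草图中的函数,其余名称与掩码表示均为示意性假设(掩码中非零处为被遮挡像素,图像为灰度图)。

```python
import cv2
import numpy as np

def recover_occluded_info(img0, mask0, new_images, new_masks):
    # img0/mask0: 第一图像 I0 及其遮挡物区域 M0
    h, w = mask0.shape
    for img_i, mask_i in zip(new_images, new_masks):   # 依次取得新图像 I1, I2, ...
        H = estimate_homography(img0, img_i)           # 301~303
        if H is None:
            continue
        # 304: 计算 M0 映射到新图像中的区域 M01
        mask0_in_i = cv2.warpPerspective(mask0, H, (w, h), flags=cv2.INTER_NEAREST)
        # 305/308: 扣除与新图像遮挡区域 M1 的交集 Mc,得到非交集区域 Ms
        usable = cv2.bitwise_and(mask0_in_i, cv2.bitwise_not(mask_i))
        # 309/310: 逆变换回 I0 坐标系得到 Ms0,用新图像的灰度值回填
        warped = cv2.warpPerspective(img_i, np.linalg.inv(H), (w, h))
        usable0 = cv2.bitwise_and(
            cv2.warpPerspective(usable, np.linalg.inv(H), (w, h),
                                flags=cv2.INTER_NEAREST),
            mask0)                                     # 限定在 M0 之内
        img0[usable0 > 0] = warped[usable0 > 0]
        # 311: 更新剩余遮挡区域 M0新 = M0 - Ms0
        mask0 = cv2.bitwise_and(mask0, cv2.bitwise_not(usable0))
        if cv2.countNonZero(mask0) == 0:
            return img0, mask0                         # 313: 遮挡信息已全部恢复
    return img0, mask0   # 315~317: 没有新图像且仍有剩余区域,对应恢复失败
```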
因此,在本申请实施例中,在确定所拍摄的图像的某个区域内存在遮挡物时,通过所拍摄图像的深度数据与基准平面的深度数据,确定遮挡物区域,并恢复被遮挡的信息。
图7是根据本申请实施例的图像处理的设备400的示意性框图。如图7所示,该设备包括:
拍摄模块401,用于拍摄第一图像,该第一图像中包括四边形,该四边形的区域内具有多个采样点;
创建模块402,用于建立基准平面,该基准平面为该四边形所在的平面,该多个采样点在该基准平面上分别具有对应的投影点;
计算模块403,用于计算第一深度数据与第二深度数据的差值,该第一深度数据为该多个采样点中每个采样点与摄像头所在平面之间的距离,该第二深度数据为该每个采样点对应的该投影点与摄像头所在平面之间的距离;
判断模块404,用于根据该差值,判断该四边形的区域内是否存在遮挡物。
可选地,在本申请实施例中,如图8所示,在该创建模块402建立该基准平面之前,该设备400还包括:
确定模块405,用于对该第一图像进行边缘检测,确定至少四条边缘线段;
获取模块406,用于获取该至少四条边缘线段中每一条边缘线段的深度数据;
选取模块407,用于根据该每一条边缘线段的深度数据,选取四条共面的边缘线段构成该四边形。
可选地,在本申请实施例中,图8中所包括的模块不一定都存在。
可选地,根据该差值,该判断模块404具体用于:
如果该差值大于第一阈值的采样点占该多个采样点总数的百分比大于第二阈值,则确定在该四边形的区域内存在遮挡物;
如果该差值大于第一阈值的采样点占该多个采样点总数的百分比小于或者等于第二阈值,或如果该每个采样点的该差值小于或等于第一阈值,则确定在该四边形的区域内不存在遮挡物。
可选地,在本申请实施例中,如图8所示,该设备400还包括:
当该判断模块404确定该四边形的区域内存在遮挡物时,该获取模块406还用于在该四边形的区域内,获取该差值大于第一阈值的第一采样点;
该确定模块405还用于根据该第一采样点的位置,确定第一遮挡物区域;
恢复模块408,用于恢复该第一遮挡物区域内的被遮挡信息。
可选地,在本申请实施例中,图8中所包括的模块不一定都存在。
可选地,该恢复模块408具体用于:
该拍摄模块401还用于拍摄第二图像,该第二图像中包括该第一遮挡物区域内的部分或全部被遮挡信息;
根据该部分或全部被遮挡信息,该恢复模块408恢复该第一遮挡物区域内的被遮挡信息。
可选地,该恢复模块408具体用于:
该拍摄模块401还用于拍摄第三图像,该第三图像中包括该第一遮挡物区域内的部分被遮挡信息;
根据该第二图像中包括的该第一遮挡物区域内的部分遮挡信息、以及该第三图像中包括的该第一遮挡物区域内的部分被遮挡信息,该恢复模块408恢复该第一遮挡物区域内的被遮挡信息。
可选地,在本申请实施例中,如图8所示,该设备400还包括:
生成模块409,用于生成第一提示信息,该第一提示信息用于提示遮挡物的位置。
可选地,在本申请实施例中,图8中所包括的模块不一定都存在。
可选地,该生成模块409还用于:
生成第二提示信息,该第二提示信息用于提示用户移动摄像头。
可选地,在本申请实施例中,如图8所示,该设备400还包括:文档校正模块410,用于在该恢复模块408恢复该第一遮挡物区域内的被遮挡信息后,对该四边形进行文档校正。
可选地,该文档校正模块410还用于:当该判断模块404确定该四边形的区域内存在遮挡物时,对该四边形进行文档校正。
可选地,该第一图像为包括文档信息的图像。
应理解,根据本申请实施例的图像处理的设备400中的各个模块的上述和其它操作和/或功能分别为了实现本申请实施例的方法200的相应流程,为了简洁,在此不再赘述。
图9示出了本申请实施例提供的图像处理的设备500的示意性框图,该设备500包括:
存储器510,用于存储程序代码;
摄像头520,用于拍摄图像;
处理器530,用于执行存储器510中的程序代码。
可选地,该程序代码中包括本申请实施例的方法200的各个操作的执行代码。
应理解,在本申请实施例中,该处理器530可以是中央处理单元(Central Processing Unit,简称为“CPU”),该处理器530还可以是其他通用处理器、数字信号处理器(DSP)、专用集成电路(ASIC)、现场可编程门阵列(FPGA)或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件等。通用处理器可以是微处理器或者该处理器也可以是任何常规的处理器等。
该存储器510可以包括只读存储器和随机存取存储器,并向处理器530提供指令和数据。存储器510的一部分还可以包括非易失性随机存取存储器。例如,存储器510还可以存储设备类型的信息。
可选地,所述摄像头520,用于拍摄第一图像,第一图像中包括四边形,所述四边形的区域内具有多个采样点;
所述处理器530,用于建立基准平面,所述基准平面为所述四边形所在的平面,所述多个采样点在所述基准平面上分别具有对应的投影点;
所述处理器530,用于计算第一深度数据与第二深度数据的差值,所述第一深度数据为所述多个采样点中每个采样点与摄像头所在平面之间的距离,所述第二深度数据为所述每个采样点对应的所述投影点与摄像头所在平面之间的距离;
所述处理器530,用于根据所述差值,判断所述四边形的区域内是否存在遮挡物。
可选地,在建立所述基准平面之前,所述处理器530还用于:
对所述第一图像进行边缘检测,确定至少四条边缘线段;
获取所述至少四条边缘线段中每一条边缘线段的深度数据;
根据所述每一条边缘线段的深度数据,选取四条共面的边缘线段构成所述四边形。
可选地,所述处理器530还用于:
如果所述差值大于第一阈值的采样点占所述多个采样点总数的百分比大于第二阈值,则确定在所述四边形的区域内存在遮挡物;
如果所述差值大于第一阈值的采样点占所述多个采样点总数的百分比小于或者等于第二阈值,或如果所述每个采样点的所述差值小于或等于第一阈值,则确定在所述四边形的区域内不存在遮挡物。
可选地,所述处理器530还用于:
当确定所述四边形的区域内存在遮挡物时,在所述四边形的区域内,获取所述差值大于第一阈值的第一采样点;
根据所述第一采样点的位置,确定第一遮挡物区域;
恢复所述第一遮挡物区域内的被遮挡信息。
可选地,所述摄像头520还用于拍摄第二图像,所述第二图像中包括所述第一遮挡物区域内的部分或全部被遮挡信息;
所述处理器530还用于根据所述部分或全部被遮挡信息,恢复所述第一遮挡物区域内的被遮挡信息。
可选地,所述摄像头520还用于拍摄第三图像,所述第三图像中包括所述第一遮挡物区域内的部分被遮挡信息;
所述处理器530还用于根据所述第二图像中包括的所述第一遮挡物区域内的部分遮挡信息、以及所述第三图像中包括的所述第一遮挡物区域内的部分被遮挡信息,恢复所述第一遮挡物区域内的被遮挡信息。
可选地,所述处理器530还用于:
生成第一提示信息,所述第一提示信息用于提示遮挡物的位置。
可选地,所述处理器530还用于:
生成第二提示信息,所述第二提示信息用于提示用户移动摄像头。
可选地,所述处理器530还用于:在恢复所述第一遮挡物区域内的被遮挡信息后,对所述四边形进行文档校正。
可选地,所述处理器530还用于:当所述四边形的区域内不存在遮挡物时,对所述四边形进行文档校正。
可选地,所述第一图像为包括文档信息的图像。
在实现过程中,上述各步骤可以通过处理器530中的硬件的集成逻辑电路或者软件形式的指令完成。结合本申请实施例所公开的方法200的步骤可以直接体现为硬件处理器执行完成,或者用处理器中的硬件及软件模块组合执行完成。软件模块可以位于随机存储器,闪存、只读存储器,可编程只读存储器或者电可擦写可编程存储器、寄存器等本领域成熟的存储介质中。该存储介质位于存储器,处理器530读取存储器中的信息,结合其硬件完成上述方法200的步骤。为避免重复,这里不再详细描述。
图10是根据本申请实施例的终端设备600的示意性结构图。如图10所示,该终端设备600包括本申请实施例的图像处理的设备400、显示面板610、只读存储器(Read Only Memory,ROM)620、随机存取存储器(Random Access Memory,RAM)630、寄存器640以及按键651至按键655。
其中,该图像处理的设备400可以感知按键651至按键655的按键事件,例如拍摄事件、图像处理事件或文档校正事件,并可以进行相应的图像处理功能。
该ROM620可以用于存储该图像处理的设备400需要执行的代码,该RAM630用于该图像处理的设备400执行存储在ROM620上的代码,从而实现对应的功能;该寄存器640用于存储该终端设备600中启动该图像处理的设备400的类型。
该显示面板610用于显示该终端设备600的运行状态以及该图像处理的设备400进行图像处理的情况,为用户提供友好的交互界面,能够引导用户进行相关操作。
可选地,该显示面板610上可以覆盖有触摸屏,该触摸屏可以用于感知用户的触摸事件,并控制该图像处理的设备400进行相应的操作,给予用户一定的选择空间。
应理解,在本申请实施例中,图10所示的终端设备600结构并不构成对终端设备的限定,该终端设备600还可以包括比图示更多或更少的部件,或者组合某些部件,或者不同的部件布置。
本申请实施例提供了一种计算机可读存储介质,用于存储指令,当该指令在计算机上运行时,该计算机可以用于执行上述本申请实施例的图像处理的方法200。该可读介质可以是ROM或RAM,本申请实施例对此不做限制。
应理解,本文中术语“和/或”以及“A或B中的至少一种”,仅仅是一种描述关联对象的关联关系,表示可以存在三种关系,例如,A和/或B,可以表示:单独存在A,同时存在A和B,单独存在B这三种情况。另外,本文中字符“/”,一般表示前后关联对象是一种“或”的关系。
本领域普通技术人员可以意识到,结合本文中所公开的实施例描述的各示例的单元及算法步骤,能够以电子硬件、或者计算机软件和电子硬件的结合来实现。这些功能究竟以硬件还是软件方式来执行,取决于技术方案的特定应用和设计约束条件。专业技术人员可以对每个特定的应用来使用不同方法来实现所描述的功能,但是这种实现不应认为超出本申请的范围。
所属领域的技术人员可以清楚地了解到,为描述的方便和简洁,上述描述的系统、装置和单元的具体工作过程,可以参考前述方法实施例中的对应过程,在此不再赘述。
在本申请所提供的几个实施例中,应该理解到,所揭露的系统、装置和方法,可以通过其它的方式实现。例如,以上所描述的装置实施例仅仅是示意性的,例如,该单元的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,例如多个单元或组件可以结合或者可以集成到另一个系统,或一些特征可以忽略,或不执行。另一点,所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些接口,装置或单元的间接耦合或通信连接,可以是电性,机械或其它的形式。
该作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部单元来实现本实施例方案的目的。
另外,在本申请各个实施例中的各功能单元可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。
该功能如果以软件功能单元的形式实现并作为独立的产品销售或使用时,可以存储在一个计算机可读取存储介质中。基于这样的理解,本申请的技术方案本质上或者说对现有技术做出贡献的部分或者该技术方案的部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括若干指令用以使得一台计算机设备(可以是个人计算机,服务器,或者网络设备等)执行本申请各个实施例所述方法的全部或部分步骤。而前述的存储介质包括:U盘、移动硬盘、只读存储器、随机存取存储器、磁碟或者光盘等各种可以存储程序代码的介质。
以上所述,仅为本申请的具体实施方式,但本申请的保护范围并不局限于此,任何熟悉本技术领域的技术人员在本申请揭露的技术范围内,可轻易想到变化或替换,都应涵盖在本申请的保护范围之内。因此,本申请的保护范围应该以权利要求的保护范围为准。

Claims (23)

  1. 一种图像处理的方法,其特征在于,包括:
    拍摄第一图像,所述第一图像中包括四边形,所述四边形的区域内具有多个采样点;
    建立基准平面,所述基准平面为所述四边形所在的平面,所述多个采样点在所述基准平面上分别具有对应的投影点;
    计算第一深度数据与第二深度数据的差值,所述第一深度数据为所述多个采样点中每个采样点与摄像头所在平面之间的距离,所述第二深度数据为所述每个采样点对应的所述投影点与摄像头所在平面之间的距离;
    根据所述差值,判断所述四边形的区域内是否存在遮挡物。
  2. 根据权利要求1所述的方法,其特征在于,
    在建立所述基准平面之前,所述方法还包括:
    对所述第一图像进行边缘检测,确定至少四条边缘线段;
    获取所述至少四条边缘线段中每一条边缘线段的深度数据;
    根据所述每一条边缘线段的深度数据,选取四条共面的边缘线段构成所述四边形。
  3. 根据权利要求1或2所述的方法,其特征在于,所述根据所述差值,判断所述四边形的区域内是否存在遮挡物,包括:
    如果所述差值大于第一阈值的采样点占所述多个采样点总数的百分比大于第二阈值,则确定在所述四边形的区域内存在遮挡物;
    如果所述差值大于第一阈值的采样点占所述多个采样点总数的百分比小于或者等于第二阈值,或如果所述每个采样点的所述差值小于或等于第一阈值,则确定在所述四边形的区域内不存在遮挡物。
  4. 根据权利要求1至3中任一所述的方法,其特征在于,所述方法还包括:
    当所述四边形的区域内存在遮挡物时,在所述四边形的区域内,获取所述差值大于第一阈值的第一采样点;
    根据所述第一采样点的位置,确定第一遮挡物区域;
    恢复所述第一遮挡物区域内的被遮挡信息。
  5. 根据权利要求4所述的方法,其特征在于,所述恢复所述第一遮挡物区域内的被遮挡信息,包括:
    拍摄第二图像,所述第二图像中包括所述第一遮挡物区域内的部分或全部被遮挡信息;
    根据所述部分或全部被遮挡信息,恢复所述第一遮挡物区域内的被遮挡信息。
  6. 根据权利要求5所述的方法,其特征在于,所述恢复所述第一遮挡物区域内的被遮挡信息,还包括:
    拍摄第三图像,所述第三图像中包括所述第一遮挡物区域内的部分被遮挡信息;
    根据所述第二图像中包括的所述第一遮挡物区域内的部分遮挡信息、以及所述第三图像中包括的所述第一遮挡物区域内的部分被遮挡信息,恢复所述第一遮挡物区域内的被遮挡信息。
  7. 根据权利要求1至6中任一所述的方法,其特征在于,所述方法还包括:
    生成第一提示信息,所述第一提示信息用于提示遮挡物的位置。
  8. 根据权利要求1至7中任一所述的方法,其特征在于,所述方法还包括:
    生成第二提示信息,所述第二提示信息用于提示用户移动摄像头。
  9. 根据权利要求4至8中任一所述的方法,其特征在于,所述方法还包括:在恢复所述第一遮挡物区域内的被遮挡信息后,对所述四边形进行文档校正。
  10. 根据权利要求1至3中任一所述的方法,其特征在于,所述方法还包括:所述四边形的区域内不存在遮挡物时,对所述四边形进行文档校正。
  11. 根据权利要求1至10中任一所述的方法,其特征在于,所述第一图像为包括文档信息的图像。
  12. 一种图像处理的设备,其特征在于,包括:摄像头、处理器和存储器,
    所述摄像头,用于拍摄第一图像,第一图像中包括四边形,所述四边形的区域内具有多个采样点;
    所述处理器,用于建立基准平面,所述基准平面为所述四边形所在的平面,所述多个采样点在所述基准平面上分别具有对应的投影点;
    所述处理器,用于计算第一深度数据与第二深度数据的差值,所述第一深度数据为所述多个采样点中每个采样点与摄像头所在平面之间的距离,所述第二深度数据为所述每个采样点对应的所述投影点与摄像头所在平面之间的距离;
    所述处理器,用于根据所述差值,判断所述四边形的区域内是否存在遮挡物。
  13. 根据权利要求12所述的设备,其特征在于,在建立所述基准平面之前,所述处理器还用于:
    对所述第一图像进行边缘检测,确定至少四条边缘线段;
    获取所述至少四条边缘线段中每一条边缘线段的深度数据;
    根据所述每一条边缘线段的深度数据,选取四条共面的边缘线段构成所述四边形。
  14. 根据权利要求12或13所述的设备,其特征在于,所述处理器还用于:
    如果所述差值大于第一阈值的采样点占所述多个采样点总数的百分比大于第二阈值,则确定在所述四边形的区域内存在遮挡物;
    如果所述差值大于第一阈值的采样点占所述多个采样点总数的百分比小于或者等于第二阈值,或如果所述每个采样点的所述差值小于或等于第一阈值,则确定在所述四边形的区域内不存在遮挡物。
  15. 根据权利要求12至14中任一所述的设备,其特征在于,所述处理器还用于:
    当确定所述四边形的区域内存在遮挡物时,在所述四边形的区域内,获取所述差值大于第一阈值的第一采样点;
    根据所述第一采样点的位置,确定第一遮挡物区域;
    恢复所述第一遮挡物区域内的被遮挡信息。
  16. 根据权利要求15所述的设备,其特征在于,
    所述摄像头还用于拍摄第二图像,所述第二图像中包括所述第一遮挡物区域内的部分或全部被遮挡信息;
    所述处理器还用于根据所述部分或全部被遮挡信息,恢复所述第一遮挡物区域内的被遮挡信息。
  17. 根据权利要求16所述的设备,其特征在于,
    所述摄像头还用于拍摄第三图像,所述第三图像中包括所述第一遮挡物区域内的部分被遮挡信息;
    所述处理器还用于根据所述第二图像中包括的所述第一遮挡物区域内的部分遮挡信息、以及所述第三图像中包括的所述第一遮挡物区域内的部分被遮挡信息,恢复所述第一遮挡物区域内的被遮挡信息。
  18. 根据权利要求12至17中任一所述的设备,其特征在于,所述处理器还用于:
    生成第一提示信息,所述第一提示信息用于提示遮挡物的位置。
  19. 根据权利要求12至18中任一所述的设备,其特征在于,所述处理器还用于:
    生成第二提示信息,所述第二提示信息用于提示用户移动摄像头。
  20. 根据权利要求15至19中任一所述的设备,其特征在于,所述处理器还用于:在恢复所述第一遮挡物区域内的被遮挡信息后,对所述四边形进行文档校正。
  21. 根据权利要求12至14中任一所述的设备,其特征在于,所述处理器还用于:当所述四边形的区域内不存在遮挡物时,对所述四边形进行文档校正。
  22. 根据权利要求12至21中任一所述的设备,其特征在于,所述第一图像为包括文档信息的图像。
  23. 一种计算机可读存储介质,包括指令,当其在计算机上运行时,使得计算机执行权利要求1至11中任一所述的方法。
PCT/CN2017/072949 2017-02-06 2017-02-06 图像处理的方法和设备 WO2018141109A1 (zh)

Priority Applications (3)

Application Number Priority Date Filing Date Title
US16/483,950 US11074679B2 (en) 2017-02-06 2017-02-06 Image correction and display method and device
CN201780005041.1A CN108513664B (zh) 2017-02-06 2017-02-06 图像处理的方法和设备
PCT/CN2017/072949 WO2018141109A1 (zh) 2017-02-06 2017-02-06 图像处理的方法和设备

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2017/072949 WO2018141109A1 (zh) 2017-02-06 2017-02-06 图像处理的方法和设备

Publications (1)

Publication Number Publication Date
WO2018141109A1 true WO2018141109A1 (zh) 2018-08-09

Family

ID=63039363

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/072949 WO2018141109A1 (zh) 2017-02-06 2017-02-06 图像处理的方法和设备

Country Status (3)

Country Link
US (1) US11074679B2 (zh)
CN (1) CN108513664B (zh)
WO (1) WO2018141109A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021026642A1 (en) * 2019-08-13 2021-02-18 Avigilon Corporation Method and system for enhancing use of two-dimensional video analytics by using depth data

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110266913A (zh) * 2019-06-17 2019-09-20 苏州佳世达光电有限公司 影像捕获设备及影像补偿方法
CN111624617A (zh) * 2020-05-28 2020-09-04 联想(北京)有限公司 一种数据处理方法及电子设备
CN114885086A (zh) * 2021-01-21 2022-08-09 华为技术有限公司 一种图像处理方法、头戴式设备和计算机可读存储介质

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2007194873A (ja) * 2006-01-19 2007-08-02 Kyocera Mita Corp 画像形成システムおよび画像形成装置
CN101267493A (zh) * 2007-03-16 2008-09-17 富士通株式会社 透视变形文档图像的校正装置和校正方法
JP2010276642A (ja) * 2009-05-26 2010-12-09 Oki Data Corp 複合装置
CN103366165A (zh) * 2012-03-30 2013-10-23 富士通株式会社 图像处理装置、图像处理方法以及设备

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN100576934C (zh) 2008-07-03 2009-12-30 浙江大学 基于深度和遮挡信息的虚拟视点合成方法
KR20110102321A (ko) 2008-12-08 2011-09-16 알까뗄 루슨트 이미지 움직임 검출 방법 및 장치
CN102510506B (zh) 2011-09-30 2014-04-16 北京航空航天大学 一种基于双目图像和距离信息的虚实遮挡处理方法
CN103310218B (zh) 2013-05-21 2016-08-10 常州大学 一种重叠遮挡果实精确识别方法
US9122921B2 (en) * 2013-06-12 2015-09-01 Kodak Alaris Inc. Method for detecting a document boundary
JP6120989B2 (ja) * 2013-12-09 2017-04-26 株式会社Pfu オーバーヘッド型画像読取装置、画像処理方法、および、プログラム
US20150187101A1 (en) * 2013-12-30 2015-07-02 Trax Technology Solutions Pte Ltd. Device and method with orientation indication
CN104657993B (zh) 2015-02-12 2018-04-17 北京格灵深瞳信息技术有限公司 一种镜头遮挡检测方法及装置
US9684965B1 (en) * 2015-12-01 2017-06-20 Sony Corporation Obstacle removal using point cloud and depth map data
CN106327445A (zh) 2016-08-24 2017-01-11 王忠民 一种图像处理方法、装置及摄影器材、使用方法

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2007194873A (ja) * 2006-01-19 2007-08-02 Kyocera Mita Corp 画像形成システムおよび画像形成装置
CN101267493A (zh) * 2007-03-16 2008-09-17 富士通株式会社 透视变形文档图像的校正装置和校正方法
JP2010276642A (ja) * 2009-05-26 2010-12-09 Oki Data Corp 複合装置
CN103366165A (zh) * 2012-03-30 2013-10-23 富士通株式会社 图像处理装置、图像处理方法以及设备

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021026642A1 (en) * 2019-08-13 2021-02-18 Avigilon Corporation Method and system for enhancing use of two-dimensional video analytics by using depth data
US11303877B2 (en) 2019-08-13 2022-04-12 Avigilon Corporation Method and system for enhancing use of two-dimensional video analytics by using depth data

Also Published As

Publication number Publication date
CN108513664A (zh) 2018-09-07
US20200051226A1 (en) 2020-02-13
CN108513664B (zh) 2019-11-29
US11074679B2 (en) 2021-07-27

Similar Documents

Publication Publication Date Title
TWI683259B (zh) 一種相機姿態資訊確定的方法及相關裝置
US11846877B2 (en) Method and terminal for acquiring panoramic image
CN106558025B (zh) 一种图片的处理方法和装置
EP3370204B1 (en) Method for detecting skin region and device for detecting skin region
WO2021036536A1 (zh) 视频拍摄方法及电子设备
US20210097715A1 (en) Image generation method and device, electronic device and storage medium
CN108989672B (zh) 一种拍摄方法及移动终端
US20210158560A1 (en) Method and device for obtaining localization information and storage medium
CN108038825B (zh) 一种图像处理方法及移动终端
WO2018141109A1 (zh) 图像处理的方法和设备
CN108307106B (zh) 一种图像处理方法、装置及移动终端
WO2016019926A1 (zh) 照片拍摄方法、装置及移动终端
CN108776822B (zh) 目标区域检测方法、装置、终端及存储介质
WO2018219275A1 (zh) 对焦方法、装置、计算机可读存储介质和移动终端
WO2015196715A1 (zh) 图像重定位方法、装置及终端
WO2022017140A1 (zh) 目标检测方法及装置、电子设备和存储介质
WO2020187065A1 (zh) 一种视频的评价方法、终端、服务器及相关产品
WO2023273499A1 (zh) 深度检测方法及装置、电子设备和存储介质
WO2023273498A1 (zh) 深度检测方法及装置、电子设备和存储介质
WO2017020671A1 (zh) 视频交互方法、装置及视频源设备
US10270963B2 (en) Angle switching method and apparatus for image captured in electronic terminal
CN105513098B (zh) 一种图像处理的方法和装置
US9665925B2 (en) Method and terminal device for retargeting images
CN111556248A (zh) 拍摄方法、装置、存储介质及移动终端
CN110996003B (zh) 一种拍照定位方法、装置及移动终端

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17895152

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 17895152

Country of ref document: EP

Kind code of ref document: A1