WO2018072483A1 - Image segmentation method, image segmentation system and storage medium, and device comprising same - Google Patents

Image segmentation method, image segmentation system and storage medium, and device comprising same Download PDF

Info

Publication number
WO2018072483A1
Authority
WO
WIPO (PCT)
Prior art keywords
connected domain
target object
image
depth
pixel points
Prior art date
Application number
PCT/CN2017/091986
Other languages
English (en)
French (fr)
Inventor
赵骥伯
唐小军
Original Assignee
京东方科技集团股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 京东方科技集团股份有限公司 (BOE Technology Group Co., Ltd.)
Priority to EP17835447.8A (granted as EP3537375B1)
Priority to US15/750,410 (granted as US10650523B2)
Publication of WO2018072483A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/11 Region-based segmentation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/155 Segmentation; Edge detection involving morphological operators
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/194 Segmentation; Edge detection involving foreground-background segmentation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/50 Depth or shape recovery
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/60 Analysis of geometric attributes
    • G06T7/62 Analysis of geometric attributes of area, perimeter, diameter or volume
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras
    • G06T7/73 Determining position or orientation of objects or cameras using feature-based methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/255 Detecting or recognising potential candidate objects based on visual cues, e.g. shapes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/107 Static hand or arm
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/107 Static hand or arm
    • G06V40/113 Recognition of static hand signs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20 Movements or behaviour, e.g. gesture recognition
    • G06V40/28 Recognition of hand or arm movements, e.g. recognition of deaf sign language
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10028 Range image; Depth image; 3D point clouds
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20112 Image segmentation details
    • G06T2207/20132 Image cropping
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30196 Human being; Person
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds

Definitions

  • Embodiments of the present invention relate to an image segmentation method, an image segmentation system, a storage medium, and an apparatus including the same.
  • Gesture recognition can be applied to smart devices such as AR glasses: a camera in the smart device captures an image including a human hand, the image undergoes hand segmentation to obtain an image containing only the hand, and the hand-only image is then analyzed to determine the movements of the hand (i.e., gestures) or to extract fingertip information.
  • The quality of the hand segmentation directly affects the accuracy of subsequent feature extraction and recognition.
  • Embodiments of the present invention provide an image segmentation method, an image segmentation system, a storage medium, and an apparatus including the same, with which a high-quality image of a target object can be obtained.
  • At least one embodiment of the present invention provides an image segmentation method including: obtaining, from a depth image, a connected domain in which a target object is located; determining a primary direction or a secondary direction of the connected domain by a principal component analysis method; and acquiring an image of the target object from the connected domain according to a relationship between the form of the target object and the primary or secondary direction.
  • For example, obtaining the connected domain in which the target object is located from the depth image includes: detecting all connected domains in the depth image and the same parameter of each connected domain; and taking the connected domain having a set parameter as the connected domain in which the target object is located.
  • For example, the same parameter is the minimum depth value and the set parameter is the smallest minimum depth value.
  • For example, the image segmentation method further includes: step S11, taking a set pixel point in the depth image as an initial point and adding it to a set queue; step S12, determining the adjacent pixel points spatially adjacent to the initial point; step S13, calculating the absolute value of the depth difference between each adjacent pixel point and the initial point and, where the absolute value of the depth difference is less than or equal to a set depth difference, adding the adjacent pixel point to the connected domain in which the initial point is located; step S14, taking the adjacent pixel point as the next initial point and adding it to the set queue; and repeating steps S12 to S14 to determine the connected domain in which the initial point is located.
  • the set depth difference is 10 mm to 15 mm.
  • For example, acquiring the image of the target object includes: determining the trend of variation, along the main direction, of the number of pixel points at a plurality of positions in the connected domain, wherein the pixel points at each position are arranged sequentially along the secondary direction of the connected domain, the secondary direction being perpendicular to the main direction; comparing this trend with the trend of variation of the form of the target object along the main direction; and determining, according to the comparison result, a segmentation position in the connected domain for acquiring the image of the target object.
  • For example, acquiring the image of the target object includes: determining the true width of the connected domain along the secondary direction at each of a plurality of positions, wherein the secondary direction is perpendicular to the primary direction, the positions are arranged sequentially along the main direction, and the pixel points at each position are arranged sequentially along the secondary direction; and comparing the true width with a reference width to determine a segmentation position in the connected domain for acquiring the image of the target object.
  • For example, the target object is a human hand and the reference width is 40 mm to 100 mm.
  • For example, the true width of the connected domain at each position is determined according to the number of pixel points at that position, the average depth value of the pixel points at that position, and the focal length ratio of the camera that acquired the depth image.
  • For example, acquiring the image of the target object further includes: determining the true distance from each of the plurality of positions to the vertex of the connected domain; and comparing the true distance with a reference length to determine the segmentation position.
  • For example, the target object is a human hand and the reference length is 40 mm to 200 mm.
  • For example, the true distance from each position to the vertex of the connected domain is calculated according to the difference in average depth between each pair of adjacent positions lying between that position and the vertex, together with the true distance along the main direction between those adjacent positions.
  • For example, acquiring the image of the target object further includes: obtaining a plurality of reference positions among the plurality of positions; calculating the difference between the coordinates of the two pixel points farthest apart at each reference position; and determining the segmentation position according to the magnitude relationship between the difference and the number of pixel points at each reference position.
  • For example, the plurality of reference positions include a first reference position and a second reference position; the difference between the coordinates of the two pixel points farthest apart at the first reference position is ΔX1, which is greater than 0, and the number of pixel points at the first reference position is N1; where (ΔX1-N1)/N1 is less than or equal to a set value, the set value being 10% to 15%, the first reference position is taken as the segmentation position.
  • For example, the difference between the coordinates of the two pixel points farthest apart at the second reference position is ΔX2, which is greater than 0, and the number of pixel points at the second reference position is N2; where (ΔX2-N2)/N2 is greater than the set value, the distance from the segmentation position to the second reference position is greater than a set distance, the set distance being 24 mm to 26 mm.
  • At least one embodiment of the present invention further provides an image segmentation system comprising: a first image segmentation device configured to process a depth image so as to obtain, from the depth image, the connected domain in which a target object is located; an analysis device coupled to the first image segmentation device and configured to determine, by a principal component analysis method, a primary or secondary direction of the connected domain acquired by the first image segmentation device; and a second image segmentation device coupled to the analysis device and configured to acquire an image of the target object from the connected domain according to a relationship between the form of the target object and the primary or secondary direction.
  • For example, the second image segmentation device includes: a computing device coupled to the analysis device and configured to calculate the number of pixel points at a plurality of positions in the connected domain and to determine the trend of variation of that number along the main direction, wherein the pixel points at each position are arranged sequentially along the secondary direction of the connected domain, the secondary direction being perpendicular to the main direction; and a comparison device coupled to the computing device and configured to compare this trend with the trend of variation of the form of the target object along the main direction, so as to determine a segmentation position in the connected domain for acquiring the image of the target object.
  • For example, the second image segmentation device includes: a computing device coupled to the analysis device and configured to calculate the true width of the connected domain along the secondary direction at each of a plurality of positions, wherein the secondary direction is perpendicular to the main direction, the positions are arranged sequentially along the main direction, and the pixel points at each position are arranged sequentially along the secondary direction; and a comparison device coupled to the computing device and configured to compare the true width with a reference width, so as to determine a segmentation position in the connected domain for acquiring the image of the target object.
  • For example, the computing device is further configured to calculate the true distance from each of the plurality of positions to the vertex of the connected domain, and the comparison device is further configured to compare the true distance with a reference length to determine the segmentation position.
  • At least one embodiment of the present invention further provides an image segmentation system including a processor, a memory, and computer program instructions stored in the memory which, when run by the processor, execute: obtaining, from a depth image, the connected domain in which a target object is located; determining a primary or secondary direction of the connected domain by a principal component analysis method; and acquiring the image of the target object according to the relationship between the form of the target object and the primary or secondary direction.
  • At least one embodiment of the present invention further provides a storage medium in which computer program instructions are stored, the computer program instructions being adapted to be loaded by a processor and to execute: obtaining, from a depth image, the connected domain in which a target object is located; determining a primary or secondary direction of the connected domain by a principal component analysis method; and acquiring the image of the target object according to the relationship between the form of the target object and the primary or secondary direction.
  • At least one embodiment of the present invention also provides an apparatus comprising the image segmentation system of any of the above or the storage medium described above.
  • FIG. 1 is a flowchart of an image segmentation method according to an embodiment of the present invention
  • FIG. 2 is a schematic diagram of the distribution of a pixel point p and its neighborhoods in a depth image according to an embodiment of the present invention
  • FIG. 3a schematically shows a depth image captured when the depth camera faces the front of the user in an embodiment of the present invention
  • FIG. 3b schematically shows a depth image captured when the depth camera is oriented in the same direction as the user is facing in an embodiment of the present invention
  • FIG. 4 is a schematic diagram of the connected domain of a hand obtained according to an embodiment of the present invention
  • FIG. 5 schematically shows the boundary of the connected domain of a hand obtained according to an embodiment of the present invention
  • FIG. 6 schematically shows the main direction of the connected domain of a hand obtained according to an embodiment of the present invention
  • FIG. 7 is a schematic diagram of the new coordinate system obtained after rotating the coordinate system of the depth image according to the main direction of the connected domain of the hand, according to an embodiment of the present invention
  • FIG. 8 is a schematic diagram of multiple reference positions in an embodiment of the present invention.
  • FIG. 9 is a schematic diagram of multiple reference positions in another embodiment of the present invention.
  • Figure 10 is a schematic illustration of a hand image obtained in accordance with an embodiment of the invention.
  • FIG. 11 is a flowchart of an image segmentation method according to an embodiment of the present invention.
  • FIG. 12 is a structural block diagram of an image segmentation system according to an embodiment of the present invention.
  • FIG. 13 is a structural block diagram of another image segmentation system according to an embodiment of the present invention.
  • Embodiments of the present invention provide an image segmentation method, an image segmentation system, a storage medium, and an apparatus including the image segmentation system or the storage medium.
  • In the image segmentation method provided by embodiments of the present invention, the connected domain in which the target object is located is obtained from a depth image, the main direction of the connected domain is then determined by a principal component analysis method, and the image of the target object is then acquired from the connected domain according to the relationship between the form of the target object and the main or secondary direction; in this way, the image of the target object can be extracted from the depth image and a high-quality image of the target object obtained.
  • For example, embodiments of the present invention may be used for hand segmentation in gesture recognition; in this case the target object is a human hand, so that embodiments of the invention can obtain a high-quality hand image, which both narrows the scanning range for fingertip extraction and reduces the error rate, and also provides accurate test samples for machine-learning-based gesture recognition.
  • Of course, embodiments of the present invention can also be used for any image segmentation other than gesture recognition; hand segmentation in gesture recognition is used here only as an example.
  • As shown in FIG. 1, at least one embodiment of the present invention provides an image segmentation method including: step S1, obtaining, from a depth image, the connected domain in which a target object is located; step S2, determining a primary or secondary direction of the connected domain by a principal component analysis method; and step S3, acquiring the image of the target object from the connected domain according to the relationship between the form of the target object and the primary or secondary direction, so as to extract the image of the target object from the depth image.
  • The depth image is a two-dimensional depth image acquired by a depth camera; the value of each pixel in the two-dimensional depth image is the distance from that pixel point to the depth camera (i.e., its depth value), usually in millimeters (mm). Note that where there is no object at a pixel point in the depth image, the depth value of that pixel point is represented by 0. For non-zero depth values, the larger the depth value, the farther the pixel point is from the camera.
  • The resolution of the depth image can be set according to actual needs and is not limited here. For example, the resolution of the depth image may be 320 × 240. For example, to simplify computation, the resolution of the depth image can be reduced, e.g., to 160 × 120.
  • The connected domain is composed of a plurality of mutually connected pixel points; that is, each pixel point in the connected domain and its spatially adjacent pixel points satisfy a similarity rule, for example, that the absolute value of the depth difference between each pixel point and the adjacent pixel point is less than or equal to a set depth difference. For example, the set depth difference is 10 mm to 15 mm, so as to obtain a high-quality connected domain of the target object.
  • Taking the target object being a human hand as an example: within the connected domain of the hand, the depth values of spatially adjacent pixel points usually do not change abruptly, so setting the depth difference to 10 mm to 15 mm is advantageous for obtaining a high-quality connected domain of the hand. Of course, the set depth difference is not limited to 10 mm to 15 mm.
  • Spatial adjacency of two pixel points is explained below in conjunction with FIG. 2. In the horizontal and vertical directions, a pixel point p at (x, y) has four spatially adjacent pixel points, namely those at (x, y+1) directly above, (x, y-1) directly below, (x-1, y) directly to the left, and (x+1, y) directly to the right; these four horizontal and vertical neighbors constitute the 4-neighborhood of p. In the diagonal directions, p has four spatially adjacent pixel points, namely those at the upper left (x-1, y+1), upper right (x+1, y+1), lower left (x-1, y-1), and lower right (x+1, y-1); these four diagonal neighbors constitute the diagonal neighborhood of p. Together, the 4-neighborhood and the diagonal neighborhood constitute the 8-neighborhood of p.
  • For example, where the depth image includes multiple objects and the target object is closest to the depth camera, obtaining the connected domain in which the target object is located from the depth image in step S1 includes: detecting all connected domains in the depth image and the same parameter of each connected domain; and taking the connected domain having a set parameter as the connected domain in which the target object is located. For example, the same parameter is the minimum depth value and the set parameter is the smallest minimum depth value; that is, step S1 includes detecting all connected domains in the depth image and the minimum depth value of each, and taking the connected domain with the smallest minimum depth value as the connected domain in which the target object is located.
  • the minimum depth value of each connected domain refers to the depth value of the pixel with the smallest depth value among all the pixels in the connected domain.
  • Fig. 3a shows a depth image taken with the depth camera facing the front side of the user
  • Fig. 3b shows the depth image taken with the depth camera facing the same frontal orientation as the user.
  • the depth image generally includes a plurality of connected domains. Since the hand is closest to the depth camera, in the depth image, the connected domain where the hand is located is the connected domain with the smallest depth value. According to this, all connected domains in the depth image and the minimum depth value of each connected domain can be detected, and then the connected domain having the smallest minimum depth value is used as the connected domain in which the hand is located.
  • For example, for the depth image shown in FIG. 3b, the acquired connected domain of the hand may be as shown by the hand image in FIG. 4.
  • In other embodiments, the connected domain in which the target object is located may also be extracted according to parameters of the connected domain other than the minimum depth value (e.g., the contour shape, length, or area of the connected domain). The minimum-depth selection rule itself is sketched below.
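As an illustration of the minimum-depth selection rule above, the following sketch picks, from a set of already-extracted connected domains, the one whose minimum depth value is smallest. This is a minimal sketch, not the patent's implementation; the function name and the data layout (one array of depth values per domain) are our assumptions.

```python
import numpy as np

def select_target_domain(domains):
    """Pick the connected domain whose minimum depth value is smallest.

    domains: list of 1-D numpy arrays, each holding the depth values (in mm)
    of one connected domain's pixels. Returns the index of the domain assumed
    to contain the target object (e.g., the hand closest to the depth camera).
    """
    min_depths = [d[d > 0].min() for d in domains]  # 0 marks "no object" pixels
    return int(np.argmin(min_depths))
```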
  • the image segmentation method provided by at least one embodiment of the present disclosure may include the following steps S11 to S14, which are described in detail below.
  • Step S11: the set pixel point in the depth image is taken as an initial point and added to the set queue.
  • For example, in step S11, the set pixel point is selected by the person processing the depth image or located by an image processing algorithm.
  • For example, in step S11, the set queue is a FIFO (First In, First Out) queue, in which the instruction that enters first is completed and retired first, after which the next instruction is executed, and so on.
  • Step S12: determine the adjacent pixel points spatially adjacent to the initial point.
  • For example, in step S12, the 4-neighborhood or the 8-neighborhood of the initial point may be examined to determine the pixel points spatially adjacent to it.
  • Step S13: calculate the absolute value of the depth difference between each adjacent pixel point and the initial point, and add the adjacent pixel point to the connected domain in which the initial point is located if the absolute value of the depth difference is less than or equal to the set depth difference.
  • the set depth difference may be 10 mm to 15 mm, for example, 10 mm, 13 mm, or 15 mm.
  • Step S14: take the adjacent pixel point as the next initial point and add it to the set queue, so that its neighborhood can subsequently be examined.
  • Steps S12 to S14 are repeated until every pixel point in the set queue has been processed, thereby determining the connected domain in which the initial point is located. After all connected domains in the depth image have been detected in a similar way, the connected domain having the smallest minimum depth value is selected, thereby detecting the connected domain in which the target object is located. A minimal sketch of this region growing is given below.
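The queue-based region growing of steps S11 to S14 can be sketched as follows; this is a minimal illustration assuming a 2-D numpy depth map in millimeters and 4-neighborhood connectivity, and the function and variable names are ours, not the patent's.

```python
from collections import deque

import numpy as np

def grow_connected_domain(depth, seed, max_diff=13):
    """Steps S11 to S14: grow the connected domain around a set pixel point.

    depth:    2-D array of depth values in mm (0 means no object).
    seed:     (row, col) of the set pixel point (step S11).
    max_diff: the set depth difference, e.g. 10 to 15 mm.
    Returns a boolean mask of the connected domain containing the seed.
    """
    h, w = depth.shape
    mask = np.zeros((h, w), dtype=bool)
    queue = deque([seed])  # the FIFO "set queue"
    mask[seed] = True
    while queue:
        r, c = queue.popleft()  # current initial point
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):  # 4-neighborhood (step S12)
            nr, nc = r + dr, c + dc
            if 0 <= nr < h and 0 <= nc < w and not mask[nr, nc] and depth[nr, nc] > 0:
                # Step S13: similarity rule on the absolute depth difference.
                if abs(int(depth[nr, nc]) - int(depth[r, c])) <= max_diff:
                    mask[nr, nc] = True
                    queue.append((nr, nc))  # step S14: becomes the next initial point
    return mask
```

Running this once per unvisited non-zero pixel yields all connected domains, after which the domain with the smallest minimum depth can be selected as sketched earlier.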
  • For example, after the connected domain of the target object has been obtained, the depth values of all remaining pixels of the depth image outside that connected domain may be set to the null value 0 in order to simplify subsequent computation. Taking the target object being a human hand as an example, as shown in FIG. 4, the depth values of all pixels except those in the connected domain of the hand are set to 0.
  • For example, while all connected domains in the depth image are being detected, the upper, lower, left, and right boundaries of each connected domain may also be detected before the connected domain of the target object is selected; alternatively, the boundary of the connected domain of the hand may be detected after that connected domain has been found. Taking the target object being a human hand as an example, the boundary of the connected domain of the target object is shown by the white rectangular frame in FIG. 5. To simplify subsequent computation, calculations in later steps can then be confined to within the boundary of the connected domain of the target object, without processing the pixels of the entire depth image.
  • The Principal Component Analysis (PCA) method is a multivariate statistical analysis method that transforms data into a new coordinate system by a linear transformation, such that the variance of the data along the first coordinate axis is maximal, the variance along the second coordinate axis is the second largest, and so on. The direction of the first coordinate axis is the main direction (that is, the main direction is the direction in which the variance of the data is largest), and the directions of the other coordinate axes are secondary directions (that is, directions in which the variance of the data is not the largest).
  • For example, taking the target object being a human hand as an example, the main direction of the connected domain obtained in step S1, as determined by the PCA method, may be as shown by the white line in FIG. 6. A sketch of this step is given below.
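A minimal sketch of determining the main direction by PCA on the pixel coordinates of the connected domain, using an eigendecomposition of the 2 × 2 covariance matrix; this is our illustration, not the patent's own code.

```python
import numpy as np

def main_direction(mask):
    """Return unit vectors of the main and secondary directions of a
    connected domain given as a boolean mask (PCA on pixel coordinates)."""
    ys, xs = np.nonzero(mask)
    pts = np.stack([xs, ys], axis=1).astype(float)
    pts -= pts.mean(axis=0)            # center the coordinates
    cov = np.cov(pts, rowvar=False)    # 2 x 2 covariance matrix
    vals, vecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    main = vecs[:, 1]                  # largest variance: the main direction
    secondary = vecs[:, 0]             # perpendicular axis: the secondary direction
    return main, secondary
```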
  • Step S3, acquiring the image of the target object from the connected domain in which it is located according to the relationship between the form of the target object and the primary or secondary direction, is described in detail below.
  • the shape of the target object may be a contour shape, a contour change trend, a size, an area, or other parameters of the target object or an object including the target object.
  • Taking the target object being a human hand as an example, the form of the human hand may be the contour trend of gradually narrowing from the arm to the wrist and gradually widening upward from the wrist. Therefore, the wrist position can be used as the segmentation position to extract the image of the hand from the connected domain and remove the image of the arm.
  • a suitable segmentation position can be found according to the specific shape of the object.
  • For example, if the target object is a human head: since the head and the part of the body below it differ significantly in contour shape, contour trend, length, and area, the form of the human head can be its contour shape, contour trend, length, or area.
  • Taking the human head as an example, the relationship between the form of the target object and the secondary direction is, for example, that the shoulder width of the human body is largest along the secondary direction running from the body's left-hand side to its right-hand side. Therefore, the image of the head can be extracted from the connected domain of the body using the shoulders as the segmentation position.
  • Below, step S3 is described in detail, taking the target object as a human hand and taking as an example the acquisition of the image of the target object from its connected domain according to the relationship between the form of the target object and the main direction. This is based on the above observation that the human hand gradually narrows from the arm to the wrist and gradually widens from the wrist.
  • For example, acquiring the image of the target object in step S3 includes: determining the trend of variation, along the main direction, of the number of pixel points at a plurality of different positions in the connected domain of the target object, the pixel points at each position being arranged sequentially along the secondary direction of the connected domain, the secondary direction being perpendicular to the main direction; comparing this trend with the trend of variation of the form of the target object along the main direction; and determining, according to the comparison result, a segmentation position in the connected domain for acquiring the image of the target object. Since the trend of the number of pixel points at different positions along the main direction reflects the trend of the form of the target object along the main direction, comparing the two can be used to find a suitable segmentation position, so that an image including only the target object is extracted from the connected domain according to that position.
  • The trend of the number of pixel points along the main direction does not reflect the actual width of the connected domain. To determine the segmentation position more accurately, in at least one embodiment acquiring the image of the target object in step S3 may include: determining the true width of the connected domain along the secondary direction at each of a plurality of positions, the secondary direction being perpendicular to the main direction, the positions being arranged sequentially along the main direction, and the pixel points at each position being arranged sequentially along the secondary direction; and comparing the true width with a reference width to determine a segmentation position in the connected domain for acquiring the image of the target object.
  • For example, the trend of the number of pixel points along the main direction may be determined first, after which the true width of the connected domain at each position is calculated to determine the segmentation position; alternatively, the step of determining the pixel-count trend along the main direction may be omitted, the thickness trend of the connected domain along the main direction judged from the true widths at the plurality of positions, and a position whose true width lies within the reference width range selected, so as to determine the segmentation position.
  • the width of the wrist is about 40 mm to 100 mm, so the reference width can be set to 40 mm to 100 mm, for example, 40 mm, 60 mm, 80 mm, or 100 mm.
  • For example, the true width of the connected domain at each position may be determined based on the number of pixel points at that position, the average depth value of the pixel points at that position, and the focal length ratio of the camera that acquired the depth image, as sketched below.
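A sketch of this width estimate under a pinhole-camera assumption: the metric size covered by one pixel grows linearly with depth, so the row width is the pixel count times the average depth divided by the focal length expressed in pixels. The patent does not spell out its exact formula, so this reading is an assumption.

```python
def true_width_mm(num_pixels, avg_depth_mm, ratio):
    """Estimate the metric width of one row of the connected domain.

    num_pixels:   number of pixels at this position along the secondary direction
    avg_depth_mm: average depth value of those pixels, in mm
    ratio:        focal length ratio of the depth camera (here read as the
                  focal length expressed in pixels; an assumption)
    """
    return num_pixels * avg_depth_mm / ratio  # pinhole model: X = x_pixels * Z / f
```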
  • For example, acquiring the image of the target object in step S3 further includes: determining the true distance from each of the plurality of positions to the vertex of the connected domain; and comparing the true distance with a reference length to determine the segmentation position.
  • the vertices of the connected domain may be determined according to the positional relationship between the coordinates of the vertices and the main direction.
  • the reference length can be set to 40 mm to 200 mm.
  • the position of the true distance within the reference length can be used as the split position.
  • For example, the true distance from each position to the vertex of the connected domain can be calculated from the difference in average depth between each pair of adjacent positions lying between that position and the vertex, together with the true distance along the main direction between those adjacent positions, as sketched below.
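A sketch of this accumulation: for each pair of adjacent positions, the in-image step along the main direction and the difference of their average depths form the two legs of a small right triangle, and the hypotenuses are summed. Helper names are ours.

```python
import math

def true_distances_mm(avg_depths_mm, step_widths_mm):
    """Accumulate the metric distance from the vertex of the connected domain
    to each position along the main direction.

    avg_depths_mm:  average depth at position 0 (the vertex), 1, 2, ...
    step_widths_mm: true distance along the main direction between each pair
                    of adjacent positions, already converted to mm.
    Returns a list whose entry n is the true distance from position n to the vertex.
    """
    dist, out = 0.0, [0.0]
    for i in range(1, len(avg_depths_mm)):
        depth_step = avg_depths_mm[i] - avg_depths_mm[i - 1]
        dist += math.hypot(step_widths_mm[i - 1], depth_step)  # hypotenuse of one segment
        out.append(dist)
    return out
```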
  • Taking the target object as a human hand, acquiring the image of the target object in step S3 includes, for example, the following steps S31 to S35, described below with reference to FIGS. 7 to 10.
  • Step S31: determine the number of pixel points at a plurality of different positions in the connected domain of the hand, the positions being arranged sequentially along the main direction and the pixel points at each position being arranged sequentially along the secondary direction of the connected domain, the secondary direction being perpendicular to the main direction.
  • For example, using the main direction as a reference, the original coordinate system of the depth image is rotated until the Y axis of the new coordinate system (see the Cartesian coordinate system XOY shown in FIG. 7) is parallel to the main direction (the orientation of the Y axis may be the same as or opposite to that of the main direction) and the X axis is parallel to the secondary direction (shown by the white line in FIG. 7), and all pixels of the original depth image are given new coordinate values in the new coordinate system.
  • For example, the pixel point having the largest Y coordinate is the vertex of the connected domain, as indicated by the circle in FIG. 7.
  • The array disData(k) can represent the trend of the thickness of the connected domain in the two-dimensional depth image along the main direction. Based on this thickness trend, combined with the trend of the form of the human hand along the main direction, a basis can be provided for finding the wrist and removing the arm. A sketch of the rotation and of building disData is given below.
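A sketch of the rotation and of building the disData array (pixel counts per row of the new coordinate system); projecting each pixel coordinate onto the PCA unit vectors performs the rotation, and the row binning granularity of one pixel is our assumption.

```python
import numpy as np

def thickness_profile(mask, main, secondary):
    """Rotate pixel coordinates so Y is parallel to the main direction, then
    count pixels row by row to obtain disData(k).

    mask:            boolean mask of the connected domain.
    main, secondary: unit vectors from the PCA step.
    Returns (new_x, new_y, dis_data), where dis_data[k] is the number of pixels
    whose rotated Y coordinate falls in row k, counted from the vertex downward.
    """
    ys, xs = np.nonzero(mask)
    new_x = xs * secondary[0] + ys * secondary[1]      # coordinate along the secondary direction
    new_y = xs * main[0] + ys * main[1]                # coordinate along the main direction
    rows = np.round(new_y.max() - new_y).astype(int)   # row 0 is the vertex (largest Y)
    dis_data = np.bincount(rows)                       # pixels per row: the thickness trend
    return new_x, new_y, dis_data
```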
  • Step S32: determine the true width of the connected domain of the hand along the secondary direction at each position.
  • ratio is the focal length ratio of the camera, which is determined according to the camera itself.
  • Step S33: determine the true distance from each of the plurality of positions to the vertex of the connected domain.
  • For example, an array realLength(n) representing the true length may also be acquired in step S33; realLength(n) is calculated as described above, by accumulating, between each pair of adjacent positions, the true distance along the main direction and the difference in average depth.
  • Step S34: compare the true width at each position obtained in step S32 with the reference width of the hand, combining this with the thickness trend of the connected domain along the main direction (determined, for example, from the number of pixel points obtained in step S31 or from the true widths obtained in step S32), and compare the true distance from each position to the vertex of the connected domain obtained in step S33 with the reference length of the hand, to determine the wrist position (an example of the segmentation position).
  • the reference width of the wrist can be set to 40 mm to 100 mm, such as 40 mm, 80 mm or 100 mm.
  • the reference length of the hand can be set to 40 mm to 200 mm.
  • a plurality of positions that satisfy the wrist condition may be obtained within the reference length and reference width range of the hand.
  • For example, as shown in FIG. 8, the reference position indicated by the upper white straight line is not suitable as the segmentation position, while the reference position indicated by the lower white line can be used as the segmentation position; both have widths close to that of the wrist, exhibit the feature of gradually widening from the bottom up, and lie within the minimum-to-maximum range of hand length. Using the upper reference position in FIG. 8 could cause a segmentation error; for example, only four fingers might be obtained.
  • The inventors of the present application found that, for a human hand, at a reference position that is not suitable as the segmentation position (for example, the upper reference position in FIG. 8), null values exist between some adjacent pixel points, so that the difference between the coordinates of the two farthest-apart pixel points of the hand's connected domain at that position (the difference being positive) is noticeably larger than the number of pixel points there, whereas at a reference position that is suitable as the segmentation position, the difference is approximately equal to the number of pixel points at that reference position.
  • Accordingly, for example, acquiring the image of the target object in step S3 further includes: obtaining a plurality of reference positions among the plurality of positions; calculating the difference between the coordinates of the two pixel points farthest apart at each reference position (the difference being positive); and determining the segmentation position according to the magnitude relationship between the difference and the number of pixel points at each reference position.
  • For example, the plurality of reference positions include a first reference position and a second reference position; the difference between the coordinates of the two pixel points farthest apart at the first reference position is ΔX1 (ΔX1 > 0) and the number of pixel points at the first reference position is N1; when (ΔX1-N1)/N1 is less than or equal to a set value, the set value being 10% to 15%, the first reference position is taken as the segmentation position.
  • In this way, the segmentation position can be selected from among the first and second reference positions. For example, the lower reference position in FIG. 8 is then the segmentation position, while the upper reference position cannot serve as the segmentation position. Of course, the set value is not limited to 10% to 15% and can be set according to actual needs. The gap test is sketched below.
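The gap test at one reference position can be sketched as follows; the default set_value of 0.12 sits inside the 10% to 15% band, and the function name is ours.

```python
def passes_gap_test(row_x_coords, set_value=0.12):
    """Check whether a reference position can serve as the segmentation position.

    row_x_coords: rotated X coordinates of the pixels at this reference position
                  (one row of the connected domain).
    Returns True when (dX - N) / N <= set_value, i.e. when the row has no
    significant null-value gaps between its pixels.
    """
    n = len(row_x_coords)
    dx = max(row_x_coords) - min(row_x_coords)  # the two farthest pixels (dX > 0)
    return (dx - n) / n <= set_value
```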
  • The inventors of the present application also found that, for a human hand, in the case where at least three reference positions are obtained (for example, first, second, and third reference positions), if the second reference position has been determined to be unsuitable as the segmentation position through the magnitude relationship between the coordinate difference and the number of pixel points, and the distance along the main direction from the third reference position to the second reference position is less than or equal to a set distance (for example, 24 mm to 26 mm), then the third reference position is also unsuitable as the segmentation position.
  • For example, suppose the difference between the coordinates of the two pixel points farthest apart at the second reference position is ΔX2 (ΔX2 > 0) and the number of pixel points at the second reference position is N2. When (ΔX2-N2)/N2 is greater than the above set value (for example, greater than 15%), any third reference position whose distance to the second reference position is less than or equal to the set distance is not selected as the segmentation position, whether that third reference position lies between the vertex of the connected domain and the second reference position or on the side of the second reference position away from the vertex; that is, the distance from the correct segmentation position to the second reference position is greater than the set distance.
  • the set distance may be 24 mm to 26 mm, for example, 24 mm, 25 mm, or 26 mm.
  • For example, each reference position can be examined in turn along the main direction (i.e., starting from the vertex of the connected domain). Taking FIG. 9 as an example: first, reference position 1 is judged unsuitable as the segmentation position through the magnitude relationship between the coordinate difference and the number of pixel points; then, from the distance from reference position 2 to reference position 1 and the set distance, reference position 2 is judged unsuitable as the segmentation position; after that, reference position 3 is judged unsuitable according to the magnitude relationship between the coordinate difference and the number of pixel points; it is thereby found that reference position 4 can be used as the segmentation position.
  • Step S35: after the wrist position is determined in step S34, the image of the hand is acquired from the connected domain of the hand according to the wrist position and the image of the arm is removed. For example, a good hand image such as that shown in FIG. 10 can be obtained.
  • For example, the flow of the method provided by at least one embodiment of the present invention can be summarized as in FIG. 11; that is, the method includes: extracting the connected domain from the depth image; analyzing the main direction of the connected domain by the PCA method; then calculating the true width and true length of the connected domain; and finally judging the segmentation position by heuristic features, so as to extract the image of the hand from the connected domain according to the segmentation position.
  • For example, the heuristic features may include: the reference width of the hand; the reference length of the hand; the thickness trend of the connected domain of the hand along the main direction; the presence of null values at a reference position unsuitable as the segmentation position, so that the difference between the coordinates of the farthest-apart pixel points there is noticeably larger than the number of pixel points; and the rule that a reference position whose distance to a reference position already found unsuitable is less than or equal to the set distance (for example, 24 mm to 26 mm) is likewise unsuitable as the segmentation position. A sketch combining these heuristics is given below.
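Putting the heuristic features together, the wrist search might look like the following sketch. The thresholds are those stated above; the scan order, the helper names, and the approximation of the 24 mm to 26 mm set distance by a fixed number of rows are all our assumptions.

```python
def find_wrist_row(widths_mm, lengths_mm, gap_ok, min_gap_rows=25):
    """Scan candidate rows from the vertex downward and return the first one
    satisfying all wrist heuristics, or None if no row qualifies.

    widths_mm:    true width of each row (wrist band: 40 to 100 mm).
    lengths_mm:   true distance of each row to the vertex (hand band: 40 to 200 mm).
    gap_ok:       per-row result of the (dX - N) / N gap test.
    min_gap_rows: rows skipped after a failed gap test, approximating the set distance.
    """
    skip_until = -1
    for k in range(len(widths_mm)):
        if k <= skip_until:
            continue  # too close to a row already rejected by the gap test
        if not (40 <= widths_mm[k] <= 100 and 40 <= lengths_mm[k] <= 200):
            continue  # outside the wrist-width or hand-length bands
        if not gap_ok[k]:
            skip_until = k + min_gap_rows  # rejected; also skip the set distance
            continue
        return k  # first row passing all heuristics: the wrist / split position
    return None
```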
  • At least one embodiment of the present invention further provides an image segmentation system, as shown in FIG. 12, comprising: a first image segmentation device configured to process a depth image so as to obtain, from the depth image, the connected domain in which a target object is located; an analysis device coupled to the first image segmentation device and configured to determine, by a principal component analysis method, a primary or secondary direction of the connected domain acquired by the first image segmentation device; and a second image segmentation device connected to the analysis device and configured to acquire the image of the target object from the connected domain according to the relationship between the form of the target object and the primary or secondary direction.
  • the second image segmentation device includes a computing device coupled to the analysis device and a comparison device coupled to the computing device.
  • For example, the computing device is configured to calculate the number of pixel points at a plurality of positions in the connected domain and to determine the trend of variation of that number along the main direction, the pixel points at each position being arranged sequentially along the secondary direction of the connected domain, the secondary direction being perpendicular to the main direction; correspondingly, the comparison device is configured to compare this trend with the trend of variation of the form of the target object along the main direction, so as to determine a segmentation position in the connected domain for acquiring the image of the target object.
  • For example, the computing device is configured to calculate the true width of the connected domain along the secondary direction at each of a plurality of positions, the secondary direction being perpendicular to the main direction, the positions being arranged sequentially along the main direction and the pixel points at each position being arranged sequentially along the secondary direction; correspondingly, the comparison device is configured to compare the true width with a reference width to determine a segmentation position in the connected domain for acquiring the image of the target object.
  • the reference width may be set to 40 mm to 100 mm, for example, 40 mm, 80 mm, or 100 mm.
  • For example, the computing device can also be configured both to determine the trend of the number of pixel points along the main direction and to calculate the true width of the connected domain at each position along the secondary direction; correspondingly, the comparison device can be configured both to compare the trend of the number of pixel points along the main direction with the trend of the form of the target object along the main direction and to compare the true width of the connected domain at each position with the reference width, so as to determine the segmentation position.
  • For example, the computing device is further configured to calculate the true distance from each of the plurality of positions to the vertex of the connected domain; correspondingly, the comparison device can compare the true distance with a reference length to determine the segmentation position.
  • For example, the reference length can be set to 40 mm to 200 mm.
  • For example, the image segmentation system may further include a depth camera configured to acquire the depth image and output it to the first image segmentation device.
  • For example, the specific structures of the first image segmentation device, the analysis device, the second image segmentation device, the computing device, and the comparison device in the above image segmentation system may each correspond to a processor; for example, the processor may be a central processing unit (CPU), a microcontroller unit (MCU), a digital signal processor (DSP), a programmable logic controller (PLC), or another electronic component, or a collection of electronic components, with processing capability.
  • For example, the above devices in the embodiments of the present invention may all be integrated in one processor, or each implemented by a different processor, or any two or more of them integrated in one processor; the above devices may be implemented in the form of hardware or in the form of hardware plus software functional units.
  • At least one embodiment of the present invention further provides another image segmentation system, as shown in FIG. 13, comprising: a processor; a memory; and computer program instructions stored in the memory which, when run by the processor, execute: obtaining, from a depth image, the connected domain in which a target object is located; determining a primary or secondary direction of the connected domain by principal component analysis; and acquiring the image of the target object from the connected domain according to the relationship between the form of the target object and the primary or secondary direction.
  • the memory can include at least one of a read only memory and a random access memory and provides instructions and data to the processor.
  • For example, a portion of the memory may also include non-volatile random access memory (NVRAM).
  • For example, the processor can be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component; a general-purpose processor can be a microprocessor, any conventional processor, or the like.
  • For example, this image segmentation system may also further include a depth camera configured to acquire the depth image and provide it to the processor.
  • At least one embodiment of the present invention further provides a storage medium in which computer program instructions are stored, the computer program instructions being adapted to be loaded by a processor and to execute: obtaining, from a depth image, the connected domain in which a target object is located; determining a primary or secondary direction of the connected domain by a principal component analysis method; and acquiring the image of the target object according to the relationship between the form of the target object and the primary or secondary direction.
  • For example, the storage medium can be a semiconductor memory, a magnetic surface memory, a laser memory, a random access memory, a read-only memory, a serial access memory, a non-permanent memory, a permanent memory, or any other form of storage medium well known in the art.
  • For example, the processor can be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component; a general-purpose processor can be a microprocessor, any conventional processor, or the like.
  • At least one embodiment of the present invention also provides an apparatus comprising the image segmentation system provided by any of the above embodiments or the storage medium described above.
  • For example, the apparatus may be a human-computer interaction device such as AR smart glasses or a display; the apparatus acquires an image containing an instruction of the user (for example, an image of a target object) by means of the image segmentation system included in it, and analyzes the image with the image segmentation system to achieve human-computer interaction.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Human Computer Interaction (AREA)
  • Geometry (AREA)
  • General Health & Medical Sciences (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

An image segmentation method, an image segmentation system, a storage medium, and a device comprising the same. The image segmentation method comprises: obtaining, from a depth image, the connected domain in which a target object is located; determining a primary or secondary direction of the connected domain by a principal component analysis method; and acquiring an image of the target object from the connected domain according to the relationship between the form of the target object and the primary or secondary direction. Embodiments of the present invention can obtain a high-quality image of a target object.

Description

Image segmentation method, image segmentation system and storage medium, and device comprising same
Technical Field
Embodiments of the present invention relate to an image segmentation method, an image segmentation system, a storage medium, and a device comprising the same.
Background Art
With the development of human-computer interaction technology, gesture recognition based on computer vision has become one of the important research directions in human-computer interaction, owing to its advantage of allowing people to interact with machines in a natural way.
For example, gesture recognition can be applied to smart devices such as AR glasses: a camera in the smart device captures an image including a human hand, the image undergoes hand segmentation to obtain an image containing only the hand, and the hand-only image is then analyzed to determine the movements of the hand (i.e., gestures) or to extract fingertip information. In this process, the quality of the hand segmentation directly affects the accuracy of subsequent feature extraction and recognition.
Summary of the Invention
Embodiments of the present invention provide an image segmentation method, an image segmentation system, a storage medium, and a device comprising the same; embodiments of the present invention can obtain a high-quality image of a target object.
At least one embodiment of the present invention provides an image segmentation method, comprising: obtaining, from a depth image, the connected domain in which a target object is located; determining a primary or secondary direction of the connected domain by a principal component analysis method; and acquiring an image of the target object from the connected domain according to the relationship between the form of the target object and the primary or secondary direction.
For example, obtaining the connected domain in which the target object is located from the depth image comprises: detecting all connected domains in the depth image and the same parameter of each connected domain; and taking the connected domain having a set parameter as the connected domain in which the target object is located.
For example, the same parameter is the minimum depth value, and the set parameter is the smallest minimum depth value.
For example, the image segmentation method further comprises: step S11, taking a set pixel point in the depth image as an initial point and adding it to a set queue; step S12, determining the adjacent pixel points spatially adjacent to the initial point; step S13, calculating the absolute value of the depth difference between each adjacent pixel point and the initial point and, where the absolute value of the depth difference is less than or equal to a set depth difference, adding the adjacent pixel point to the connected domain in which the initial point is located; step S14, taking the adjacent pixel point as the next initial point and adding it to the set queue; and repeating steps S12 to S14 to determine the connected domain in which the initial point is located.
For example, the set depth difference is 10 mm to 15 mm.
For example, acquiring the image of the target object comprises: determining the trend of variation, along the main direction, of the number of pixel points at a plurality of positions in the connected domain, wherein the pixel points at each position are arranged sequentially along the secondary direction of the connected domain, the secondary direction being perpendicular to the main direction; comparing this trend with the trend of variation of the form of the target object along the main direction; and determining, according to the comparison result, a segmentation position in the connected domain for acquiring the image of the target object.
For example, acquiring the image of the target object comprises: determining the true width of the connected domain along the secondary direction at each of a plurality of positions, wherein the secondary direction is perpendicular to the main direction, the plurality of positions are arranged sequentially along the main direction, and the pixel points at each position are arranged sequentially along the secondary direction; and comparing the true width with a reference width to determine a segmentation position in the connected domain for acquiring the image of the target object.
For example, the target object is a human hand, and the reference width is 40 mm to 100 mm.
For example, the true width of the connected domain at each position is determined according to the number of pixel points at that position, the average depth value of the pixel points at that position, and the focal length ratio of the camera that acquired the depth image.
For example, acquiring the image of the target object further comprises: determining the true distance from each of the plurality of positions to the vertex of the connected domain; and comparing the true distance with a reference length to determine the segmentation position.
For example, the target object is a human hand, and the reference length is 40 mm to 200 mm.
For example, the true distance from each position to the vertex of the connected domain is calculated according to the difference in average depth between each pair of adjacent positions lying between that position and the vertex of the connected domain, and the true distance between those adjacent positions along the main direction.
For example, acquiring the image of the target object further comprises: obtaining a plurality of reference positions among the plurality of positions; calculating the difference between the coordinates of the two pixel points farthest apart at each reference position; and determining the segmentation position according to the magnitude relationship between the difference and the number of pixel points at each reference position.
For example, the plurality of reference positions include a first reference position and a second reference position; the difference between the coordinates of the two pixel points farthest apart at the first reference position is ΔX1, which is greater than 0, and the number of pixel points at the first reference position is N1; in the case where (ΔX1-N1)/N1 is less than or equal to a set value, the set value being 10% to 15%, the first reference position is taken as the segmentation position.
For example, the difference between the coordinates of the two pixel points farthest apart at the second reference position is ΔX2, which is greater than 0, and the number of pixel points at the second reference position is N2; in the case where (ΔX2-N2)/N2 is greater than the set value, the distance from the segmentation position to the second reference position is greater than a set distance, the set distance being 24 mm to 26 mm.
At least one embodiment of the present invention further provides an image segmentation system, comprising: a first image segmentation device configured to process a depth image so as to obtain, from the depth image, the connected domain in which a target object is located; an analysis device connected to the first image segmentation device and configured to determine, by a principal component analysis method, a primary or secondary direction of the connected domain acquired by the first image segmentation device; and a second image segmentation device connected to the analysis device and configured to acquire an image of the target object from the connected domain according to the relationship between the form of the target object and the primary or secondary direction.
For example, the second image segmentation device comprises: a computing device connected to the analysis device and configured to calculate the number of pixel points at a plurality of positions in the connected domain and to determine the trend of variation of that number along the main direction, wherein the pixel points at each position are arranged sequentially along the secondary direction of the connected domain, the secondary direction being perpendicular to the main direction; and a comparison device connected to the computing device and configured to compare this trend with the trend of variation of the form of the target object along the main direction, so as to determine a segmentation position in the connected domain for acquiring the image of the target object.
For example, the second image segmentation device comprises: a computing device connected to the analysis device and configured to calculate the true width of the connected domain along the secondary direction at each of a plurality of positions, wherein the secondary direction is perpendicular to the main direction, the plurality of positions are arranged sequentially along the main direction, and the pixel points at each position are arranged sequentially along the secondary direction; and a comparison device connected to the computing device and configured to compare the true width with a reference width, so as to determine a segmentation position in the connected domain for acquiring the image of the target object.
For example, the computing device is further configured to calculate the true distance from each of the plurality of positions to the vertex of the connected domain, and the comparison device is further configured to compare the true distance with a reference length to determine the segmentation position.
At least one embodiment of the present invention further provides an image segmentation system, comprising a processor, a memory, and computer program instructions stored in the memory which, when run by the processor, execute: obtaining, from a depth image, the connected domain in which a target object is located; determining a primary or secondary direction of the connected domain by a principal component analysis method; and acquiring the image of the target object according to the relationship between the form of the target object and the primary or secondary direction.
At least one embodiment of the present invention further provides a storage medium in which computer program instructions are stored, the computer program instructions being adapted to be loaded by a processor and to execute: obtaining, from a depth image, the connected domain in which a target object is located; determining a primary or secondary direction of the connected domain by a principal component analysis method; and acquiring the image of the target object according to the relationship between the form of the target object and the primary or secondary direction.
At least one embodiment of the present invention further provides a device comprising the image segmentation system of any of the above or the storage medium described above.
Brief Description of the Drawings
To explain the technical solutions of the embodiments of the present invention more clearly, the drawings of the embodiments are briefly introduced below. Obviously, the drawings described below relate only to some embodiments of the present invention and are not a limitation on the present invention.
FIG. 1 is a flowchart of an image segmentation method provided by an embodiment of the present invention;
FIG. 2 is a schematic diagram of the distribution of a pixel point p and its neighborhoods in a depth image in an embodiment of the present invention;
FIG. 3a schematically shows a depth image captured when the depth camera faces the front of the user in an embodiment of the present invention;
FIG. 3b schematically shows a depth image captured when the depth camera is oriented in the same direction as the user is facing in an embodiment of the present invention;
FIG. 4 is a schematic diagram of the connected domain of a hand obtained according to an embodiment of the present invention;
FIG. 5 schematically shows the boundary of the connected domain of a hand obtained according to an embodiment of the present invention;
FIG. 6 schematically shows the main direction of the connected domain of a hand obtained according to an embodiment of the present invention;
FIG. 7 is a schematic diagram of the new coordinate system obtained after rotating the coordinate system of the depth image according to the main direction of the connected domain of the hand in an embodiment of the present invention;
FIG. 8 is a schematic diagram of a plurality of reference positions in an embodiment of the present invention;
FIG. 9 is a schematic diagram of a plurality of reference positions in another embodiment of the present invention;
FIG. 10 is a schematic diagram of a hand image obtained according to an embodiment of the invention;
FIG. 11 is a flowchart of an image segmentation method provided by an embodiment of the present invention;
FIG. 12 is a structural block diagram of an image segmentation system provided by an embodiment of the present invention;
FIG. 13 is a structural block diagram of another image segmentation system provided by an embodiment of the present invention.
Detailed Description
To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions of the embodiments are described clearly and completely below in conjunction with the drawings of the embodiments. Obviously, the described embodiments are some, not all, of the embodiments of the present invention. Based on the described embodiments, all other embodiments obtained by a person of ordinary skill in the art without creative effort fall within the scope of protection of the present invention.
Unless otherwise defined, the technical or scientific terms used in this disclosure shall have the ordinary meanings understood by a person with ordinary skill in the field to which the present invention belongs. "First", "second", and similar words used in this disclosure do not denote any order, quantity, or importance, but are merely used to distinguish different components. "Comprise", "include", and similar words mean that the element or item preceding the word covers the elements or items listed after the word and their equivalents, without excluding other elements or items. "Connect", "connected", and similar words are not limited to physical or mechanical connections, but may include electrical connections, whether direct or indirect. "Up", "down", "left", "right", and the like are only used to indicate relative positional relationships; when the absolute position of the described object changes, the relative positional relationship may change accordingly.
本发明实施例提供一种图像分割方法、图像分割系统和存储介质以及包括该图像分割系统或存储介质的设备。在本发明实施例提供的图像分割方法中,通过从深度图像中获取目标物体所在的连通域,然后通过主成分分析方法确定该连通域的主方向,之后根据该目标物体的形态与该主方向或次方向之间的关系从该连通域中获取该目标物体的图像,能够将目标物体的图像从深度图像中提取出来并且获得高质量的目标物体的图像。
例如,本发明实施例可以用于手势识别中的手分割,在这种情况下,上 述目标物体为人手,从而本发明实施例可以获取高质量的手部图像,从而既能使指尖提取的扫描范围缩小并且错误率降低,又能为基于机器学习的手势识别提供精准的测试样本。当然,本发明实施例也可以用于除手势识别之外的任何其他图像分割情况,本发明实施例仅以用于手势识别中的手分割为例进行说明。
如图1所示,本发明的至少一个实施例提供一种图像分割方法,其包括:步骤S1,从深度图像中获取目标物体所在的连通域;步骤S2,通过主成分分析方法确定该连通域的主方向或次方向;以及步骤S3,根据目标物体的形态与该主方向或次方向之间的关系,从所述连通域中获取目标物体的图像,以将目标物体的图像从深度图像中提取出来。
下面结合图2至图5,对步骤S1所述的从深度图像中获取目标物体所在的连通域进行详细说明。
在步骤S1中,深度图像为深度摄像头获取的二维深度图像,该二维深度图像中每个像素点的取值是该像素点到深度摄像头的距离(即深度值),通常用毫米(mm)作为单位。需要说明的是,深度图像中有的像素点处没有物体,则该像素点的深度值用0表示。对于不为零的深度值,深度值越大,则表示像素点距离摄像头越远。
深度图像的分辨率可以根据实际需要进行设置,这里不做限定。例如,深度图像的分辨率可以为320×240。例如,为了简化计算,可以对深度图像的分辨率进行缩小,例如缩小为160×120。
In step S1, the connected domain is composed of a plurality of mutually connected pixels; that is, each pixel in the connected domain and its spatially adjacent neighboring pixels satisfy a similarity rule, e.g., the similarity rule is that the absolute value of the depth difference between each pixel and its neighboring pixel is less than or equal to a set depth difference. For example, in the case where no abrupt change occurs between the depth values of spatially adjacent pixels of the target object, the set depth difference is 10 mm to 15 mm, so as to acquire a high-quality connected domain of the target object. Taking the target object being a human hand as an example, for the connected domain of a hand, the depth values of spatially adjacent pixels usually do not change abruptly; therefore, setting the set depth difference to 10 mm to 15 mm is advantageous for acquiring a high-quality connected domain of the hand. Of course, embodiments of the set depth difference include but are not limited to 10 mm to 15 mm.
The spatial adjacency of two pixels is explained below with reference to FIG. 2.
For example, as shown in FIG. 2, take the pixel p with coordinates (x, y) in the xoy coordinate system as an example. In the horizontal and vertical directions, the pixel p has four spatially adjacent neighboring pixels, i.e., the pixels directly above (x, y+1), directly below (x, y−1), directly to the left (x−1, y), and directly to the right (x+1, y); these four horizontal and vertical neighbors constitute the 4-neighborhood of the pixel p. In the diagonal directions, the pixel p has four spatially adjacent neighboring pixels, i.e., the pixels at the upper left (x−1, y+1), upper right (x+1, y+1), lower left (x−1, y−1), and lower right (x+1, y−1); these four diagonal neighbors constitute the diagonal neighborhood of the pixel p. In addition, the 4-neighborhood and the diagonal neighborhood together constitute the 8-neighborhood of the pixel p. The pixel p is spatially adjacent to each pixel in its 4-neighborhood, diagonal neighborhood, and 8-neighborhood.
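A minimal sketch of these neighborhoods as coordinate offsets (the names are illustrative, not from the patent):

```python
# Offsets (dx, dy) relative to a pixel p at (x, y).
NEIGHBORS_4 = [(0, 1), (0, -1), (-1, 0), (1, 0)]        # 4-neighborhood
NEIGHBORS_DIAG = [(-1, 1), (1, 1), (-1, -1), (1, -1)]   # diagonal neighborhood
NEIGHBORS_8 = NEIGHBORS_4 + NEIGHBORS_DIAG              # 8-neighborhood

def neighbors(x, y, offsets=NEIGHBORS_8):
    """Yield the coordinates spatially adjacent to (x, y)."""
    for dx, dy in offsets:
        yield x + dx, y + dy
```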
For example, in the case where the depth image includes a plurality of objects and the target object is closest to the depth camera, acquiring the connected domain where the target object is located from the depth image in step S1 includes: detecting all connected domains in the depth image and a same parameter of each connected domain; and taking the connected domain having a set parameter as the connected domain where the target object is located.
For example, the same parameter is the minimum depth value, and the set parameter is the smallest minimum depth value. That is, acquiring the connected domain where the target object is located from the depth image in step S1 includes: detecting all connected domains in the depth image and the minimum depth value of each connected domain; and taking the connected domain with the smallest minimum depth value as the connected domain where the target object is located. It should be noted that the minimum depth value of each connected domain refers to the depth value of the pixel whose depth value is smallest among all pixels in that connected domain.
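A hedged sketch of this selection rule (the representation of a connected domain as a list of (x, y, depth) tuples is an assumption for illustration):

```python
def pick_target_domain(domains):
    """Return the connected domain whose minimum depth value is smallest.

    `domains` is a list of connected domains, each a non-empty list of
    (x, y, depth) tuples with depth > 0 (0 marks empty pixels and is
    excluded when the domains are built).
    """
    def min_depth(domain):
        return min(d for _, _, d in domain)
    return min(domains, key=min_depth)
```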
Take the target object being a human hand as an example. FIG. 3a shows a depth image captured when the depth camera faces the front of the user; FIG. 3b shows a depth image captured when the depth camera faces the same direction as the front of the user. As shown in FIG. 3a and FIG. 3b, a depth image generally includes a plurality of connected domains; since the hand is closest to the depth camera, the connected domain where the hand is located is the connected domain with the smallest depth value in the depth image. Accordingly, all connected domains in the depth image and the minimum depth value of each connected domain can be detected, after which the connected domain with the smallest minimum depth value is taken as the connected domain where the hand is located. For example, for the depth image shown in FIG. 3b, the acquired connected domain where the hand is located may be as shown by the hand image in FIG. 4.
In other embodiments, the connected domain where the target object is located may also be extracted according to parameters of the connected domain other than the minimum depth value (e.g., the contour shape, length, or area of the connected domain).
For example, the image segmentation method provided by at least one embodiment of the present disclosure may include the following steps S11 to S14, which are described in detail below.
Step S11: taking a set pixel in the depth image as an initial point and adding it to a set queue.
For example, in step S11, the set pixel is selected by the person processing the depth image or located by an image processing algorithm.
For example, in step S11, the set queue is a FIFO (First In, First Out) queue; in such a queue, the element that enters first is processed and removed first, after which the next element is processed, and so on.
Step S12: determining the neighboring pixels spatially adjacent to the initial point.
For example, in step S12, the 4-neighborhood or 8-neighborhood of the initial point can be examined to determine the neighboring pixels spatially adjacent to it.
Step S13: calculating the absolute value of the depth difference between a neighboring pixel and the initial point, and, in the case where the absolute value of the depth difference is less than or equal to the set depth difference, adding the neighboring pixel to the connected domain where the initial point is located.
For example, in step S13, the set depth difference may be 10 mm to 15 mm, e.g., 10 mm, 13 mm, or 15 mm.
Step S14: taking the neighboring pixel as the next initial point and adding it to the set queue, so that its neighborhood can be examined subsequently.
Steps S12 to S14 are repeated until every pixel in the set queue has been processed, thereby determining the connected domain where the initial point is located. After all connected domains in the depth image are detected in a similar manner, the connected domain with the smallest minimum depth value is selected, thereby detecting the connected domain where the target object is located.
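For illustration only, a minimal runnable sketch of steps S11 to S14 follows; the function name, the choice of the 8-neighborhood, the 13 mm threshold, and the array layout are assumptions:

```python
from collections import deque
import numpy as np

def grow_connected_domain(depth, seed, max_diff=13):
    """Region growing per steps S11-S14: BFS from `seed` over the
    8-neighborhood, joining each pixel whose absolute depth difference
    from the pixel that reached it is <= max_diff (in mm).

    depth: 2D array of depth values in mm, 0 where there is no object.
    seed:  (x, y) coordinates of the set pixel on the object (step S11).
    """
    h, w = depth.shape
    queue = deque([seed])                    # FIFO set queue (step S11)
    domain = {seed}
    offsets = [(0, 1), (0, -1), (-1, 0), (1, 0),
               (-1, 1), (1, 1), (-1, -1), (1, -1)]
    while queue:
        x, y = queue.popleft()
        for dx, dy in offsets:               # step S12: spatial neighbors
            nx, ny = x + dx, y + dy
            if 0 <= nx < w and 0 <= ny < h and (nx, ny) not in domain:
                # int() casts avoid unsigned-integer wraparound.
                if depth[ny, nx] > 0 and \
                   abs(int(depth[ny, nx]) - int(depth[y, x])) <= max_diff:
                    domain.add((nx, ny))     # step S13: join the domain
                    queue.append((nx, ny))   # step S14: enqueue as next point
    return domain
```

The FIFO queue makes this a breadth-first traversal; a stack would give the same domain in depth-first order.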
For example, after the connected domain where the target object is located is obtained, in order to simplify subsequent calculation, the depth values of all pixels of the depth image other than those in the connected domain may be set to the null value 0. Taking the target object being a human hand as an example, as shown in FIG. 4, the depth values of all pixels except those in the connected domain where the hand is located are set to 0.
For example, while detecting all connected domains in the depth image, the upper, lower, left, and right boundaries of each connected domain may also be detected, after which the connected domain where the target object is located is selected; alternatively, the boundary of the connected domain where the hand is located may be detected after that connected domain has been detected. Taking the target object being a human hand as an example, the boundary of the connected domain where the target object is located is shown by the white rectangular box in FIG. 5. To simplify subsequent calculation, the calculations in the subsequent steps may be performed only within the boundary of the connected domain where the target object is located, without processing the pixels of the entire depth image.
Determining the main direction or the secondary direction of the connected domain by the principal component analysis method in step S2 is described in detail below.
The principal component analysis (PCA) method is a multivariate statistical analysis method that transforms data into a new coordinate system by a linear transformation, such that the variance of the data is maximal on the first coordinate axis, second largest on the second coordinate axis, and so on; the direction of the first coordinate axis is the main direction (that is, the main direction is the direction in which the variance of the data is maximal), and the directions of the other coordinate axes are secondary directions (i.e., a secondary direction is a direction in which the variance of the data is not maximal).
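As an illustrative sketch (the patent only specifies that PCA is used; the names and the eigen-decomposition of the coordinate covariance are assumptions), the main direction of a connected domain could be computed as follows:

```python
import numpy as np

def main_direction(domain):
    """Return the unit vector of the main direction of a connected domain.

    `domain` is an iterable of (x, y) pixel coordinates. The main
    direction is the eigenvector of the 2x2 covariance matrix of the
    coordinates with the largest eigenvalue; the other eigenvector
    spans the secondary direction.
    """
    pts = np.array(list(domain), dtype=float)   # shape (N, 2)
    centered = pts - pts.mean(axis=0)
    cov = np.cov(centered, rowvar=False)        # 2x2 covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)      # eigenvalues ascending
    return eigvecs[:, np.argmax(eigvals)]       # main direction
```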
For example, taking the target object being a human hand as an example, the main direction of the connected domain acquired in step S1, obtained by the PCA method, may be as shown by the white straight line in FIG. 6.
Acquiring the image of the target object from the connected domain where the target object is located, according to the relationship between the form of the target object and the main direction or the secondary direction, in step S3 is described in detail below.
The form of the target object may be the contour shape, contour variation trend, size, area, or other parameters of the target object or of an object that includes the target object. Taking the target object being a human hand as an example, the form of the human hand may be the contour variation trend of gradually narrowing from the arm to the wrist and gradually widening upward from the wrist. Therefore, the wrist position can be taken as the segmentation position to extract the hand image from the connected domain and remove the arm image. In the case where the target object is another object, a suitable segmentation position can be found according to the specific form of that object. For example, if the target object is a human head, since the human head differs obviously from the part of the human body below the head in contour shape, contour variation trend, length, and area, the form of the human head may be its contour shape, contour variation trend, length, or area.
Taking the target object being a human head as an example, the relationship between the form of the target object and the secondary direction is, for example: in the secondary direction from the left-hand side to the right-hand side of the human body, the width of the human body is maximal at the shoulders. Therefore, the shoulders can be taken as the segmentation position to extract the image of the human head from the connected domain where the human body is located.
Step S3 is described in detail below, taking the target object being a human hand and taking acquiring the image of the target object from the connected domain where it is located, according to the relationship between the form of the target object and the main direction, as an example.
Based on the above finding that the form of a human hand gradually narrows from the arm to the wrist and gradually widens upward from the wrist, for example, acquiring the image of the target object in step S3 includes: determining the variation trend, along the main direction, of the number of pixels at a plurality of different positions in the connected domain where the target object is located, where the pixels at each position are arranged in sequence along the secondary direction of the connected domain and the secondary direction is perpendicular to the main direction; comparing this variation trend with the variation trend of the form of the target object along the main direction; and determining, in the connected domain, a segmentation position for acquiring the image of the target object according to the comparison result. Since the variation trend of the number of pixels at different positions along the main direction reflects the variation trend of the form of the target object along the main direction, comparing the two can be used to find a suitable segmentation position, so as to extract, according to that segmentation position, an image that includes only the target object from the connected domain.
The variation trend of the number of pixels along the main direction cannot reflect the actual width of the connected domain. To determine the segmentation position more accurately, for example, in at least one embodiment, acquiring the image of the target object in step S3 may include: determining a real width of the connected domain along a secondary direction at each of a plurality of positions, where the secondary direction is perpendicular to the main direction, the plurality of positions are arranged in sequence along the main direction, and the pixels at each position are arranged in sequence along the secondary direction; and comparing the real width with a reference width to determine, in the connected domain, a segmentation position for acquiring the image of the target object.
For example, the variation trend of the number of pixels along the main direction may be determined first, after which the real width of the connected domain at each position is calculated to determine the segmentation position; alternatively, the step of determining the variation trend of the number of pixels along the main direction may be omitted, and the thickness variation trend of the connected domain along the main direction may be judged from the real widths of the connected domain at the plurality of positions, a position whose real width falls within the reference width range being selected to determine the segmentation position.
For example, in the case where the target object is a human hand, the width of the wrist is approximately 40 mm to 100 mm, so the reference width can be set to 40 mm to 100 mm, e.g., 40 mm, 60 mm, 80 mm, or 100 mm. For example, the real width of the connected domain at each position can be determined according to the number of pixels at each position, the average depth value of the pixels at each position, and the focal length ratio of the camera that captures the depth image.
For example, in order to determine the segmentation position more accurately, acquiring the image of the target object in step S3 further includes: determining a real distance from each of the above plurality of positions to a vertex of the connected domain; and comparing the real distance with a reference length to determine the segmentation position. It should be noted that, for example, the vertex of the connected domain can be determined according to the positional relationship between the coordinates of the vertex and the main direction.
For example, in the case where the target object is a human hand, since the length of a human hand (from fingertip to wrist) is 40 mm to 200 mm, the reference length can be set to 40 mm to 200 mm. In the case where the real width falls within the reference width range, a position whose real distance falls within the reference length range can be taken as the segmentation position.
For example, the real distance from each position to the vertex of the connected domain can be calculated according to, for every two adjacent positions between that position and the vertex of the connected domain, the difference of their average depths and the real distance between them along the main direction.
Taking the target object being a human hand as an example, acquiring the image of the target object in step S3 includes, for example, the following steps S31 to S35, which are described below with reference to FIG. 7 to FIG. 10.
Step S31: determining the number of pixels at a plurality of different positions in the connected domain where the hand is located, where the plurality of different positions are arranged in sequence along the main direction, the pixels at each position are arranged in sequence along the secondary direction of the connected domain, and the secondary direction is perpendicular to the main direction.
For example, to simplify the calculation, after the main direction of the connected domain where the hand is located is determined (e.g., the main direction is defined to run roughly from top to bottom, as shown by the white arrow in FIG. 7), the original coordinate system of the depth image is rotated with this main direction as reference until the Y axis of the new coordinate system (see the rectangular coordinate system XOY shown in FIG. 7) is parallel to the main direction (the orientation of the Y axis may be the same as, or opposite to, that of the main direction); in this case, the X axis is parallel to the above secondary direction (as shown by the white straight line in FIG. 7), and every pixel of the original depth image is given a new coordinate value in the new coordinate system.
In the new coordinate system, in the case shown in FIG. 7 where the Y axis points opposite to the main direction, starting from the pixel with the largest Y coordinate (i.e., the vertex of the connected domain, marked by the circle in FIG. 7) and proceeding in the direction of decreasing Y coordinate (as indicated by the arrow), the number of pixels with Y=k is counted in sequence, e.g., k = k0, k0−1, k0−2, ..., where k0 is the Y coordinate of the vertex of the connected domain. That is, the number of pixels at the first position Y=k0 is counted, then the number at the second position Y=k0−1, then the number at the third position Y=k0−2, and so on, thereby obtaining the variation trend, along the Y axis (i.e., the main direction), of the number of pixels at the plurality of different positions. After the numbers of pixels at all positions Y=k along the arrow direction in the connected domain have been counted, these numbers are stored in an array disData(k). disData(k) represents the number of pixels at the position Y=k in the connected domain where the hand is located; e.g., disData(k0) represents the number of pixels at the position Y=k0, disData(k0−1) represents the number of pixels at the position Y=k0−1, and so on.
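A sketch of this rotation and per-row count, building on the `main_direction` sketch above (the rotation matrix, the rounding of Y coordinates, and the Y-axis orientation are implementation assumptions):

```python
import numpy as np

def pixels_per_row(domain, main_dir):
    """Rotate pixel coordinates so the Y axis is parallel to the main
    direction, then count the pixels at each position Y = k.

    domain:   iterable of (x, y) pixel coordinates.
    main_dir: unit vector of the main direction (e.g., from PCA).
    Returns (disData, k0): a dict mapping k to the pixel count at Y = k,
    and the Y coordinate k0 of the vertex of the connected domain.
    """
    mx, my = main_dir
    # Pure rotation sending main_dir onto the negative Y axis, so the
    # Y axis points opposite to the main direction (as in FIG. 7) and
    # the vertex has the largest Y coordinate.
    rot = np.array([[-my,  mx],
                    [-mx, -my]])
    pts = np.array(list(domain), dtype=float) @ rot.T
    ys = np.rint(pts[:, 1]).astype(int)     # new Y coordinate of each pixel
    disData = {}
    for k in ys:
        disData[k] = disData.get(k, 0) + 1
    k0 = int(ys.max())                      # vertex: largest Y coordinate
    return disData, k0
```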
Briefly, the array disData(k) can represent the thickness variation trend, along the main direction, of the connected domain where the hand is located in the two-dimensional depth image; this thickness variation trend, combined with the variation trend of the form of a human hand along the main direction, can provide a basis for finding the wrist and removing the arm.
Step S32: determining the real width of the connected domain where the hand is located along the secondary direction at each position.
The disData(k) acquired in step S31 represents the variation, along the main direction, of the number of pixels at different positions, but disData(k) cannot reflect the actual width of the connected domain where the hand is located. Therefore, in order to determine the wrist position more accurately, the real width of the connected domain at Y=k can be calculated.
For example, to calculate the real width of the connected domain where the hand is located at Y=k, the average of the depth values of all pixels in the connected domain with Y=k is first calculated and stored in an array aveDepth(k). The array aveDepth(k) represents the average depth value of the pixels at the position Y=k in disData(k); this average depth value can be used to approximate the average distance from the pixels at the position Y=k to the camera. Afterwards, the real width realDis(k) of the connected domain where the hand is located at Y=k can be calculated by the following formula:
realDis(k) = disData(k) × aveDepth(k) / ratio,
The unit of realDis(k) is millimeters (mm). It should be noted that ratio is the focal length ratio of the camera, determined by the camera itself.
From the formula for realDis(k), it can be seen that the real width realDis(k) of the connected domain at each position can be determined according to the number of pixels disData(k) at each position obtained in step S31, the average depth value aveDepth(k) of the pixels at each position, and the focal length ratio ratio of the camera that captures the depth image.
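A sketch of step S32 under the same assumptions (the width formula above is taken as reconstructed; the default `ratio` value is illustrative only and must be replaced by the actual camera's calibrated focal length ratio):

```python
def real_widths(domain_rotated, disData, ratio=285.0):
    """Compute aveDepth(k) and realDis(k) for each row Y = k.

    domain_rotated: iterable of (k, depth) pairs, one per pixel, where
                    k is the pixel's Y coordinate in the rotated frame
                    and depth is its value in mm.
    disData:        dict mapping k to the pixel count at Y = k.
    """
    depth_sum = {}
    for k, d in domain_rotated:
        depth_sum[k] = depth_sum.get(k, 0.0) + d
    aveDepth = {k: depth_sum[k] / disData[k] for k in disData}
    realDis = {k: disData[k] * aveDepth[k] / ratio for k in disData}
    return aveDepth, realDis
```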
For example, the real width of the connected domain at the position Y=k0−1 can be calculated by realDis(k0−1), the real width of the connected domain at the position Y=k0−2 can be calculated by realDis(k0−2), and so on, whereby the variation of the real width of the object along the main direction can be acquired.
Step S33: determining the real distance from each of the above plurality of positions to the vertex of the connected domain.
In order to acquire the wrist position still more accurately, after the array realDis(k) representing the real width is obtained in step S32, an array realLength(n) representing the real length can also be acquired in step S33. realLength(n) represents the real distance from the position Y=k=k0−n to the position Y=k0 (i.e., the vertex of the connected domain); this distance is a 3D distance, i.e., a distance in three-dimensional space. That is, starting from the vertex of the connected domain, the real distance corresponding to the length n along the main direction in the depth image is realLength(n), where at the position Y=k0, n=k0−k0=0 and realLength(0)=0. The calculation of realLength(n) is explained below.
The real distance (a 2D distance) dy(k) along the main direction between every two adjacent positions Y=k and Y=k−1 is calculated as:
dy(k) = (Y(k) − Y(k−1)) × aveDepth(k) / ratio,
where Y(k) and Y(k−1) respectively denote the Y coordinates, in the new coordinate system, of two pixels adjacent along the main direction, i.e., Y(k) is the Y coordinate of a pixel at the position Y=k, and Y(k−1) is the Y coordinate of a pixel at the position Y=k−1.
Since the real length of a human hand is affected to a large extent by depth variation, the real length of the connected domain where the hand is located is calculated differently from its real width; the difference of the average depths of the two adjacent positions Y=k and Y=k−1, i.e., the z-axis difference dz(k), needs to be calculated:
dz(k) = aveDepth(k) − aveDepth(k−1),
after which all elements of the array realLength(n), i.e., the real distance from each position in the connected domain to the vertex of the connected domain, can be calculated by the following formula:
realLength(n) = Σ_{k = k0−n+1, …, k0} √( dy(k)² + dz(k)² ),
where n = k0 − k.
From the above formula for realLength(n), it can be seen that the real distance realLength(n) from each position to the vertex of the connected domain can be obtained from, for every two adjacent positions between that position and the vertex, the difference of their average depths dz(k) and the real distance dy(k) between them along the main direction; moreover, from the above formula for dy(k), it can be seen that the real distance dy(k) along the main direction between every two adjacent positions can be obtained from the average depth value aveDepth(k) of the pixels at each position and the focal length ratio ratio of the camera that captures the depth image.
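A sketch of step S33 built on the reconstructed formulas above (same assumed data layout and illustrative `ratio` as before; adjacent rows are one pixel apart, so Y(k) − Y(k−1) = 1):

```python
import math

def real_lengths(aveDepth, k0, n_max, ratio=285.0):
    """Accumulate realLength(n), the 3D distance from row Y = k0 - n to
    the vertex Y = k0, from the per-row steps dy(k) and dz(k).

    aveDepth: dict mapping k to the average depth (mm) of row Y = k;
              must contain k0 - n_max, ..., k0.
    """
    realLength = {0: 0.0}
    for n in range(1, n_max + 1):
        k = k0 - n + 1                      # segment between rows k and k-1
        dy = aveDepth[k] / ratio            # (Y(k) - Y(k-1)) = 1 pixel
        dz = aveDepth[k] - aveDepth[k - 1]  # z-axis difference
        realLength[n] = realLength[n - 1] + math.hypot(dy, dz)
    return realLength
```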
Step S34: comparing the real width at each position obtained in step S32 with the reference width of the hand, in combination with the thickness variation trend of the connected domain along the main direction (e.g., determined from the number of pixels obtained in step S31 or from the real width obtained in step S32), and comparing the real distance from each position to the vertex of the connected domain obtained in step S33 with the reference length of the hand, so as to determine the wrist position (an example of the segmentation position).
For example, the reference width of the wrist can be set to 40 mm to 100 mm, e.g., 40 mm, 80 mm, or 100 mm. For example, the reference length of the hand can be set to 40 mm to 200 mm.
In at least one embodiment, a plurality of positions satisfying the wrist conditions (hereinafter referred to as reference positions) may be obtained within the ranges of the reference length and reference width of the hand. For example, as shown in FIG. 8, the reference position indicated by the upper white straight line is not suitable as the segmentation position, while the reference position indicated by the lower white straight line can serve as the segmentation position; both reference positions have a width close to the wrist width and the property of gradually widening from bottom to top, and at both the intercepted length of the hand lies between the minimum and maximum hand lengths. In such a case, a segmentation error may result; e.g., only four fingers might be obtained according to the upper reference position in FIG. 8.
In research, the inventors of the present application found that, for a human hand, at a reference position that is not suitable as the segmentation position (e.g., the upper reference position in FIG. 8), null values exist between some adjacent pixels; therefore, when counting pixels, at a reference position unsuitable as the segmentation position, the difference (a positive value) obtained by subtracting the coordinates of the two pixels farthest apart in the connected domain where the hand is located is obviously larger than the number of pixels at that reference position; at the segmentation position (as indicated by the lower reference position in FIG. 8), however, the difference obtained by subtracting the coordinates of the two pixels farthest apart in the connected domain is approximately equal to the number of pixels at that reference position.
Based on the above finding, in at least one embodiment of the present invention, acquiring the image of the target object in step S3 further includes: acquiring a plurality of reference positions among the plurality of positions; calculating, at each reference position, the difference (a positive value) between the coordinates of the two pixels farthest apart; and determining the segmentation position according to the magnitude relationship between the difference and the number of pixels at each reference position.
For example, the plurality of reference positions include a first reference position and a second reference position; the difference between the coordinates of the two pixels farthest apart at the first reference position is ΔX1 (ΔX1 > 0), and the number of pixels at the first reference position is N1; in the case where (ΔX1−N1)/N1 is less than or equal to a set value and the set value is 10% to 15%, the first reference position is taken as the segmentation position. In this way, the segmentation position can be selected from the first and second reference positions. For example, in this way, the lower reference position in FIG. 8 is found to be the segmentation position, while the upper reference position cannot serve as the segmentation position. Of course, embodiments of the set value include but are not limited to 10% to 15%; it can be set according to actual needs.
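A sketch of this gap test (the row representation and the 15% default threshold are assumptions consistent with the text):

```python
def is_solid_row(xs, threshold=0.15):
    """Return True if row Y = k has no significant internal gaps.

    xs: X coordinates of the pixels of the connected domain at Y = k.
    At a genuine wrist row, the span between the two farthest pixels
    (max - min) roughly equals the pixel count N; null-valued gaps
    make the span noticeably larger than N.
    """
    n = len(xs)
    span = max(xs) - min(xs)        # coordinate difference, e.g. deltaX1
    return (span - n) / n <= threshold
```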
In research, the inventors of the present application also found that, for a human hand, in the case where at least three reference positions are obtained, e.g., first, second, and third reference positions, after the second reference position has been determined to be unsuitable as the segmentation position through the above magnitude relationship between the coordinate difference and the number of pixels, if the distance along the main direction from the third reference position to the second reference position is less than or equal to a set distance (e.g., 24 mm to 26 mm), the third reference position is likewise unsuitable as the segmentation position.
That is, for example, in at least one embodiment of the present invention, for the above second reference position that cannot serve as the segmentation position, suppose the difference between the coordinates of the two pixels farthest apart at the second reference position is ΔX2 (ΔX2 > 0) and the number of pixels at the second reference position is N2; in the case where (ΔX2−N2)/N2 is greater than the above set value (e.g., greater than 15%), a third reference position whose distance along the main direction to the second reference position is less than or equal to the set distance (the third reference position being located between the vertex of the connected domain and the second reference position, or on the side of the second reference position away from the vertex) will not be selected as the segmentation position; i.e., the distance from the correct segmentation position to the second reference position is greater than the set distance. For example, in the case where the target object is a human hand, the set distance may be 24 mm to 26 mm, e.g., 24 mm, 25 mm, or 26 mm.
Taking FIG. 9 as an example, FIG. 9 shows four reference positions, numbered 1, 2, 3, and 4 from top to bottom, where reference positions 1-3 are all unsuitable as the segmentation position and reference position 4 can serve as the segmentation position. In determining the reference positions, each reference position can be judged in turn along the main direction (i.e., starting from the vertex of the connected domain). For example, reference position 1 can be determined to be unsuitable as the segmentation position by the magnitude relationship between the coordinate difference of the pixels and the number of pixels; then, reference position 2 is judged to be unsuitable as the segmentation position according to the magnitude relationship between the distance from reference position 2 to reference position 1 and the set distance; afterwards, reference position 3 is judged to be unsuitable as the segmentation position according to the magnitude relationship between the coordinate difference and the number of pixels, whereby reference position 4 can serve as the segmentation position.
Step S35: after the wrist position is determined in step S34, acquiring the hand image from the connected domain where the hand is located and removing the arm image according to the wrist position. For example, a hand image with a good effect, as shown in FIG. 10, can be obtained.
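A minimal sketch of step S35 in the rotated frame (the data layout and the wrist row index coming from step S34 are assumptions):

```python
def crop_hand(domain_rotated, k_wrist):
    """Keep only the pixels between the wrist row and the vertex.

    domain_rotated: iterable of (x, k) coordinates in the rotated
                    frame, where k decreases away from the vertex.
    k_wrist:        Y coordinate of the segmentation (wrist) position.
    """
    return [(x, k) for (x, k) in domain_rotated if k >= k_wrist]
```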
In summary, for the connected domain where the hand is located, for example, the flow of the method provided by at least one embodiment of the present invention can be summarized as FIG. 11; that is, the method includes: extracting the connected domain from the depth image, analyzing the main direction of the connected domain by the PCA method, then calculating the real width and real length of the connected domain, and finally judging the segmentation position by heuristic features, so as to extract the hand image from the connected domain according to the segmentation position. For example, for a human hand, as described above, the heuristic features may include: the reference width of the hand; the reference length of the hand; the thickness variation trend of the connected domain of the hand along the main direction; the fact that, at a reference position unsuitable as the segmentation position, the presence of null values makes the coordinate difference of the farthest-apart pixels at that reference position differ considerably from the number of pixels; the fact that a reference position whose distance to a reference position unsuitable as the segmentation position is less than or equal to the set distance (e.g., 24 mm to 26 mm) is likewise unsuitable as the segmentation position; and so on.
At least one embodiment of the present invention further provides an image segmentation system. As shown in FIG. 12, the system includes: a first image segmentation device configured to process a depth image so as to acquire, from the depth image, a connected domain where a target object is located; an analysis device connected with the first image segmentation device and configured to determine, by a principal component analysis method, a main direction or a secondary direction of the connected domain acquired by the first image segmentation device; and a second image segmentation device connected with the analysis device and configured to acquire an image of the target object from the connected domain according to the relationship between the form of the target object and the main direction or the secondary direction.
For example, still as shown in FIG. 12, the second image segmentation device includes a calculation device connected with the analysis device and a comparison device connected with the calculation device.
For example, the calculation device is configured to calculate the number of pixels at a plurality of positions in the connected domain and to determine the variation trend of the number of pixels along the main direction, where the pixels at each position are arranged in sequence along the secondary direction of the connected domain and the secondary direction is perpendicular to the main direction; correspondingly, the comparison device is configured to compare the variation trend with the variation trend of the form of the target object along the main direction, so as to determine, in the connected domain, a segmentation position for acquiring the image of the target object.
For example, the calculation device is configured to calculate a real width of the connected domain along a secondary direction at each of a plurality of positions, where the secondary direction is perpendicular to the main direction, the plurality of positions are arranged in sequence along the main direction, and the pixels at each position are arranged in sequence along the secondary direction; correspondingly, the comparison device is configured to compare the real width with a reference width, so as to determine, in the connected domain, a segmentation position for acquiring the image of the target object. For example, in the case where the target object is a human hand, the reference width can be set to 40 mm to 100 mm, e.g., 40 mm, 80 mm, or 100 mm.
For example, the calculation device may also be configured both to determine the variation trend of the number of pixels along the main direction and to determine the real width of the connected domain along the secondary direction at each position; correspondingly, the comparison device may be configured both to compare the variation trend of the number of pixels along the main direction with the variation trend of the form of the target object along the main direction and to compare the real width of the connected domain at each position with the reference width, so as to determine the segmentation position.
For example, the calculation device is further configured to calculate the real distance from each of the plurality of positions to the vertex of the connected domain; correspondingly, the comparison device can compare the real distance with a reference length to determine the segmentation position. For example, in the case where the target object is a human hand, the reference length can be set to 40 mm to 200 mm.
For example, the image segmentation system may further include a depth camera configured to acquire the depth image and output the depth image to the first image segmentation device.
For the functions of the components in the image segmentation system of the embodiments of the present invention, reference may be made to the relevant descriptions in the foregoing embodiments of the image segmentation method.
For example, the specific structures of the first image segmentation device, the analysis device, the second image segmentation device, the calculation device, and the comparison device in the image segmentation system may each correspond to a processor; e.g., the processor may be an electronic component, or a collection of electronic components, having a processing function, such as a central processing unit (CPU), a microcontroller unit (MCU), a digital signal processor (DSP), or a programmable logic controller (PLC).
In addition, the above devices in the embodiments of the present invention may all be integrated in one processor, or implemented by different processors respectively, or any two or more of the devices may be integrated in one processor; each of the above devices may be implemented in the form of hardware, or in the form of hardware plus software functional units.
At least one embodiment of the present invention further provides another image segmentation system. As shown in FIG. 13, the system includes: a processor; a memory; and computer program instructions stored in the memory; when the computer program instructions are run by the processor, the following is executed: acquiring, from a depth image, a connected domain where a target object is located; determining a main direction or a secondary direction of the connected domain by a principal component analysis method; and acquiring an image of the target object from the connected domain according to the relationship between the form of the target object and the main direction or the secondary direction.
The memory may include at least one of a read-only memory and a random access memory, and provides instructions and data to the processor. A portion of the memory may also include a non-volatile random access memory (NVRAM).
The processor may be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The general-purpose processor may be a microprocessor, any conventional processor, or the like.
For example, the image segmentation system may further include a depth camera configured to acquire the depth image and output the depth image to the processor.
At least one embodiment of the present invention further provides a storage medium in which computer program instructions are stored, the computer program instructions being adapted to be loaded by a processor and to execute: acquiring, from a depth image, a connected domain where a target object is located; determining a main direction or a secondary direction of the connected domain by a principal component analysis method; and acquiring an image of the target object according to the relationship between the form of the target object and the main direction or the secondary direction.
For example, the storage medium may be a semiconductor memory, a magnetic surface memory, a laser memory, a random access memory, a read-only memory, a serial access memory, a non-permanent memory, a permanent memory, or any other form of storage medium well known in the art.
For example, the processor may be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The general-purpose processor may be a microprocessor, any conventional processor, or the like.
At least one embodiment of the present invention further provides an apparatus that includes the image segmentation system provided by any one of the above embodiments or the storage medium described above.
For example, the apparatus may be a human-computer interaction device such as AR smart glasses or a display; the apparatus uses the image segmentation system it includes to acquire an image containing a user's instruction (e.g., an image of the target object) and realizes human-computer interaction by analyzing and processing the image.
The above embodiments of the image segmentation method, the image segmentation system, the storage medium, and the apparatus including the same may be cross-referenced. In addition, the embodiments of the present invention and the features in the embodiments may be combined with each other in the absence of conflict.
The foregoing is merely exemplary embodiments of the present invention and is not intended to limit the protection scope of the present invention, which is determined by the appended claims.
The present application claims priority to Chinese Patent Application No. 201610905435.X filed on October 17, 2016; the entire disclosure of that Chinese patent application is incorporated herein by reference as a part of the present application.

Claims (22)

  1. An image segmentation method, comprising:
    acquiring, from a depth image, a connected domain where a target object is located;
    determining a main direction or a secondary direction of the connected domain by a principal component analysis method; and
    acquiring an image of the target object from the connected domain according to a relationship between a form of the target object and the main direction or the secondary direction.
  2. The method according to claim 1, wherein acquiring the connected domain where the target object is located from the depth image comprises:
    detecting all connected domains in the depth image and a same parameter of each connected domain; and
    taking a connected domain having a set parameter as the connected domain where the target object is located.
  3. The method according to claim 2, wherein the same parameter is a minimum depth value, and the set parameter is a smallest minimum depth value.
  4. The method according to any one of claims 1 to 3, further comprising:
    step S11: taking a set pixel in the depth image as an initial point and adding it to a set queue;
    step S12: determining neighboring pixels spatially adjacent to the initial point;
    step S13: calculating an absolute value of a depth difference between a neighboring pixel and the initial point, wherein, in a case where the absolute value of the depth difference is less than or equal to a set depth difference, the neighboring pixel is added to the connected domain where the initial point is located;
    step S14: taking the neighboring pixel as a next initial point and adding it to the set queue; and
    repeating steps S12 to S14 to determine the connected domain where the initial point is located.
  5. The method according to claim 4, wherein the set depth difference is 10 mm to 15 mm.
  6. The method according to any one of claims 1 to 5, wherein acquiring the image of the target object comprises:
    determining a variation trend, along the main direction, of the number of pixels at a plurality of positions in the connected domain, wherein the pixels at each position are arranged in sequence along a secondary direction of the connected domain, and the secondary direction is perpendicular to the main direction;
    comparing the variation trend with a variation trend of the form of the target object along the main direction; and
    determining, in the connected domain, a segmentation position for acquiring the image of the target object according to a comparison result.
  7. The method according to any one of claims 1 to 5, wherein acquiring the image of the target object comprises:
    determining a real width of the connected domain along a secondary direction at each of a plurality of positions, wherein the secondary direction is perpendicular to the main direction, the plurality of positions are arranged in sequence along the main direction, and the pixels at each position are arranged in sequence along the secondary direction; and
    comparing the real width with a reference width to determine, in the connected domain, a segmentation position for acquiring the image of the target object.
  8. The method according to claim 7, wherein the target object is a human hand, and the reference width is 40 mm to 100 mm.
  9. The method according to claim 7 or 8, wherein the real width of the connected domain at each position is determined according to the number of pixels at each position, an average depth value of the pixels at each position, and a focal length ratio of a camera that captures the depth image.
  10. The method according to claim 6 or 7, wherein acquiring the image of the target object further comprises:
    determining a real distance from each of the plurality of positions to a vertex of the connected domain; and
    comparing the real distance with a reference length to determine the segmentation position.
  11. The method according to claim 10, wherein the target object is a human hand, and the reference length is 40 mm to 200 mm.
  12. The method according to claim 10 or 11, wherein the real distance from each position to the vertex of the connected domain is calculated according to, for every two adjacent positions between that position and the vertex of the connected domain, a difference of average depths and a real distance along the main direction.
  13. The method according to any one of claims 7-12, wherein acquiring the image of the target object further comprises:
    acquiring a plurality of reference positions among the plurality of positions;
    calculating, at each reference position, a difference between coordinates of two pixels farthest apart; and
    determining the segmentation position according to a magnitude relationship between the difference and the number of pixels at each reference position.
  14. The method according to claim 13, wherein
    the plurality of reference positions comprise a first reference position and a second reference position, a difference between coordinates of two pixels farthest apart at the first reference position is ΔX1 greater than 0, the number of pixels at the first reference position is N1, and, in a case where (ΔX1−N1)/N1 is less than or equal to a set value and the set value is 10% to 15%, the first reference position is taken as the segmentation position.
  15. The method according to claim 14, wherein
    a difference between coordinates of two pixels farthest apart at the second reference position is ΔX2 greater than 0, the number of pixels at the second reference position is N2, and, in a case where (ΔX2−N2)/N2 is greater than the set value, a distance from the segmentation position to the second reference position is greater than a set distance, the set distance being 24 mm to 26 mm.
  16. An image segmentation system, comprising:
    a first image segmentation device configured to process a depth image so as to acquire, from the depth image, a connected domain where a target object is located;
    an analysis device connected with the first image segmentation device and configured to determine, by a principal component analysis method, a main direction or a secondary direction of the connected domain acquired by the first image segmentation device; and
    a second image segmentation device connected with the analysis device and configured to acquire an image of the target object from the connected domain according to a relationship between a form of the target object and the main direction or the secondary direction.
  17. The system according to claim 16, wherein the second image segmentation device comprises:
    a calculation device connected with the analysis device and configured to calculate the number of pixels at a plurality of positions in the connected domain and to determine a variation trend of the number of pixels along the main direction, wherein the pixels at each position are arranged in sequence along a secondary direction of the connected domain, and the secondary direction is perpendicular to the main direction; and
    a comparison device connected with the calculation device and configured to compare the variation trend with a variation trend of the form of the target object along the main direction, so as to determine, in the connected domain, a segmentation position for acquiring the image of the target object.
  18. The system according to claim 16, wherein the second image segmentation device comprises:
    a calculation device connected with the analysis device and configured to calculate a real width of the connected domain along a secondary direction at each of a plurality of positions, wherein the secondary direction is perpendicular to the main direction, the plurality of positions are arranged in sequence along the main direction, and the pixels at each position are arranged in sequence along the secondary direction; and
    a comparison device connected with the calculation device and configured to compare the real width with a reference width, so as to determine, in the connected domain, a segmentation position for acquiring the image of the target object.
  19. The system according to claim 17 or 18, wherein the calculation device is further configured to calculate a real distance from each of the plurality of positions to a vertex of the connected domain, and the comparison device is further configured to compare the real distance with a reference length to determine the segmentation position.
  20. An image segmentation system, comprising:
    a processor;
    a memory; and
    computer program instructions stored in the memory, wherein, when the computer program instructions are run by the processor, the following is executed:
    acquiring, from a depth image, a connected domain where a target object is located;
    determining a main direction or a secondary direction of the connected domain by a principal component analysis method; and
    acquiring an image of the target object according to a relationship between a form of the target object and the main direction or the secondary direction.
  21. A storage medium in which computer program instructions are stored, the computer program instructions being adapted to be loaded by a processor and to execute:
    acquiring, from a depth image, a connected domain where a target object is located;
    determining a main direction or a secondary direction of the connected domain by a principal component analysis method; and
    acquiring an image of the target object according to a relationship between a form of the target object and the main direction or the secondary direction.
  22. An apparatus, comprising the image segmentation system according to any one of claims 16-19, the image segmentation system according to claim 20, or the storage medium according to claim 21.