US11729367B2 - Wide viewing angle stereo camera apparatus and depth image processing method using the same - Google Patents



Publication number
US11729367B2
Authority
US
United States
Prior art keywords
image
human
stereo camera
camera apparatus
depth
Prior art date
Legal status
Active
Application number
US17/405,615
Other versions
US20220060677A1 (en
Inventor
Ki Yeong PARK
Dong Suk Kim
Current Assignee
Argosvision Inc
Argos Vision Inc
Original Assignee
Argos Vision Inc
Priority date
Filing date
Publication date
Priority claimed from KR1020200104281A external-priority patent/KR102310958B1/en
Priority claimed from KR1020210023553A external-priority patent/KR102430274B1/en
Priority claimed from KR1020210023552A external-priority patent/KR102430273B1/en
Priority claimed from KR1020210097826A external-priority patent/KR20230016390A/en
Assigned to ARGO VISION INC. reassignment ARGO VISION INC. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: PARK, KI YEONG
Application filed by Argos Vision Inc filed Critical Argos Vision Inc
Assigned to ArgosVision Inc. reassignment ArgosVision Inc. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: KIM, DONG SUK
Assigned to ArgosVision Inc. reassignment ArgosVision Inc. CORRECTIVE ASSIGNMENT TO CORRECT THE ASSIGNEE'S NAME PREVIOUSLY RECORDED AT REEL: 057218 FRAME: 0641. ASSIGNOR(S) HEREBY CONFIRMS THE ASSIGNMENT. Assignors: ARGO VISION INC.
Publication of US20220060677A1 publication Critical patent/US20220060677A1/en
Assigned to ARGOSVISION, INC. reassignment ARGOSVISION, INC. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: KIM, DONG SUK
Assigned to ARGOSVISION, INC. reassignment ARGOSVISION, INC. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: PARK, KI YEONG
Priority to US18/343,900 priority Critical patent/US20230344979A1/en
Publication of US11729367B2 publication Critical patent/US11729367B2/en
Application granted
Legal status: Active

Classifications

    • H04N 13/271: Image signal generators wherein the generated image signals comprise depth maps or disparity maps
    • G06T 7/593: Depth or shape recovery from multiple images, from stereo images
    • G06V 20/10: Scenes; scene-specific elements: terrestrial scenes
    • G06V 20/64: Scenes; scene-specific elements: three-dimensional objects
    • G06V 40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; body parts, e.g. hands
    • G06V 40/20: Movements or behaviour, e.g. gesture recognition
    • H04N 13/232: Image signal generators using stereoscopic image cameras using a single 2D image sensor using fly-eye lenses, e.g. arrangements of circular lenses
    • H04N 13/239: Image signal generators using stereoscopic image cameras using two 2D image sensors having a relative position equal to or related to the interocular distance
    • H04N 23/698: Control of cameras or camera modules for achieving an enlarged field of view, e.g. panoramic image capture
    • G06T 2207/10021: Stereoscopic video; stereoscopic image sequence
    • G06T 2207/10032: Satellite or aerial image; remote sensing
    • G06T 2207/30196: Human being; person
    • G06T 2207/30252: Vehicle exterior; vicinity of vehicle
    • H04N 2013/0081: Depth or disparity estimation from stereoscopic image signals

Definitions

  • Example embodiments relate to a wide viewing angle stereo camera apparatus and a depth image processing method using the same, and more particularly, to a stereo camera apparatus in which two lenses, for example, fisheye lenses, are arranged in a vertical direction to improve both the horizontal viewing angle and the vertical viewing angle, and to a depth image processing method using the apparatus.
  • An RGB-D (red, green, blue-depth) camera captures a color image together with per-pixel depth, that is, three-dimensional (3D) information. RGB-D cameras may be classified into stereo cameras and time-of-flight (TOF) cameras based on the method of obtaining depth information.
  • A typical stereo camera includes two cameras installed a baseline apart and obtains depth information from the viewpoint mismatch between them. To obtain depth information, a stereo matching process that compares the images from both cameras needs to be performed.
  • A patterned light or a structured light may be used to increase the speed and accuracy of the stereo matching process.
  • A TOF camera emits laser light toward a subject, calculates distance from the time the irradiated light takes to reflect back from the subject, and provides depth information together with a two-dimensional (2D) image.
  • Compared to a typical stereo scheme, the TOF camera provides precise distance information but has a relatively low spatial resolution and some constraints on measurement distance and use environment.
  • General cameras use a lens designed for rectilinear projection.
  • With rectilinear projection, a straight line in the real world appears as a straight line in the image, so an image similar to that perceived by a human may be obtained.
  • However, objects near the periphery of the image are rendered relatively larger than objects at the center of the image.
  • A rectilinear projection lens is used up to a horizontal viewing angle of about 120 degrees; a fisheye lens is used if a viewing angle greater than 120 degrees is required.
  • The fisheye lens projects the real world into an image with barrel distortion and may capture a viewing angle of more than 180 degrees.
  • As the use environment of the RGB-D camera expands to moving objects such as robots, drones, and autonomous driving vehicles, there is an increasing demand for a wide viewing angle.
  • many RGB-D cameras use the rectilinear projection lens and thus, have difficulty in providing the horizontal viewing angle of 120 degrees or more.
  • Some stereo cameras use the fisheye lens and thus, provide a wider viewing angle.
  • However, a fisheye lens image does not allow efficient stereo matching through a one-dimensional (1D) search as in a rectilinear projection image.
  • Also, distance information may not be obtained in the same direction as the baseline of the stereo camera; accordingly, the horizontal angle of view is limited to within 180 degrees.
  • Example embodiments provide a stereo camera apparatus having a wide viewing angle in which two lenses, for example, fisheye lenses, are arranged in a vertical direction to improve the horizontal viewing angle up to the maximum horizontal viewing angle of the lens and the vertical viewing angle up to a maximum of 180 degrees, and a depth image processing method using the same.
  • example embodiments provide a wide viewing angle stereo camera-based human-machine interface system that may recognize a motion of a whole body of a user at a close distance within 1 m using a stereo camera apparatus described herein and a depth image processing method using the same.
  • example embodiments provide a wide viewing angle stereo camera-based first-person vision system that may recognize a motion of a user or a surrounding environment of the user using a stereo camera apparatus described herein and a depth image processing method using the same.
  • example embodiments provide a wide viewing angle stereo camera-based human following system that may recognize and follow a human at a close distance using a stereo camera apparatus described herein and a method thereof.
  • According to an aspect, there is provided a stereo camera apparatus including a receiver configured to receive a first image and a second image of a subject captured through a first lens and a second lens that are provided in a vertical direction; a converter configured to convert the received first image and second image using a map projection scheme; and a processor configured to extract a depth of the subject by performing stereo matching, in a height direction, on the first image and the second image converted using the map projection scheme.
  • the converter may be configured to convert the first image and the second image using an equirectangular projection scheme.
  • the converter may be configured to perform rotation transformation of matching an epipolar line to a longitudinal line of a spherical surface using a spherical coordinate system.
  • The processor may be configured to perform stereo matching by searching the first image and the second image converted using the map projection scheme along a vertical line corresponding to the longitudinal line.
  • Each of the first lens and the second lens may include a fisheye lens.
  • the first lens and the second lens may be fisheye lenses each with a horizontal viewing angle of 180 degrees or more.
  • The processor may be configured to obtain an image of the subject having a horizontal wide viewing angle of 180 degrees or more using the extracted depth of the subject.
  • The receiver may include an image sensor configured to capture the first image and an image sensor configured to capture the second image. When each image sensor has a rectangular shape whose width is greater than its height, the sensors may be oriented in the width direction to obtain a horizontal viewing angle wider than the vertical viewing angle.
  • The stereo camera apparatus may further include a vision processor configured to recognize a motion of a user and a surrounding environment, including objects and terrain present around the user, based on a depth map of the subject extracted from the stereo camera apparatus, and to provide the user with the recognized motion and surrounding environment.
  • the stereo camera apparatus may further include a controller configured to recognize a human and a distance from the human based on a depth map of the subject extracted from the stereo camera apparatus, and to control a human following system to follow the human at a predetermined distance based on the recognized human and distance from the human.
  • the controller may be configured to recognize a travel direction of the human and a surrounding environment that includes an object and a terrain present around the human based on the depth map of the subject, to detect presence or absence of a collision based on the travel direction of the human and the surrounding environment, and to control the human following system to follow the human by applying a collision detection result.
  • According to another aspect, there is provided a depth image processing method of a stereo camera apparatus, the method including receiving a first image and a second image of a subject captured through a first lens and a second lens that are provided in a vertical direction; converting the received first image and second image using a map projection scheme; and extracting a depth of the subject by performing stereo matching, in a height direction, on the first image and the second image converted using the map projection scheme.
  • the depth image processing method may further include recognizing a motion of a user and a surrounding environment that includes an object and a terrain present around the user based on a depth map of the extracted subject and providing the user with the recognized motion of the user and surrounding environment.
  • the depth image processing method may further include recognizing a human and a distance from the human based on a depth map of the extracted subject, and controlling a human following system to follow the human at a predetermined distance based on the recognized human and distance from the human.
  • According to example embodiments, it is possible to obtain a wide viewing angle by improving the horizontal and vertical viewing angles, to implement a red, green, blue-depth (RGB-D) camera with a field of view wider than that of a human, and to easily detect an object using a wide viewing angle image converted using an equirectangular projection scheme.
  • In a typical wide-angle image, an upright subject appears tilted or bent at the periphery of the image.
  • In an image converted using the equirectangular projection scheme, an upright subject is expressed in an upright state, which facilitates object detection.
  • Due to the wide viewing angle, an RGB-D camera may be mounted on a moving object such as a robot, a vehicle, or a drone and may detect the whole body of a human present within 1 m. Therefore, if attached to a human following robot, the whole body may be detected even when the target to be followed is close. Even when the target shifts its direction of movement from side to side, it does not deviate from the field of view, so stable following is possible. Accordingly, the example embodiments may be used for human-robot interaction and various moving objects as well as for a human-computer interface.
  • Since the wide viewing angle stereo camera apparatus may be attached to a body part, such as the head or torso of a user, to capture the user's front direction, it is possible to recognize motions of the user's hands and arms and to grasp the user's posture and behavior thanks to the wide viewing angle. Also, by extracting a depth map using the wide viewing angle stereo camera apparatus, it is possible to detect or recognize the surrounding environment captured by the apparatus, for example, possible collisions and terrain features such as whether the floor is flat or inclined.
  • For example, such a first-person configuration may be applied to a virtual reality (VR) head mounted display (HMD).
  • If a human following system, such as a mobile robot, recognizes a human at a close distance, for example, within 1 m, and follows the human, professional manpower for operating the mobile robot is not required and manpower consumption may be reduced. That is, the example embodiments may apply various algorithms for detecting and following a captured human.
  • By recognizing a human and the surrounding environment, including objects and terrain around the human, using a depth map of a subject extracted by the stereo camera apparatus, it is possible to detect the presence or absence of a collision based on the surrounding environment.
  • If a collision is determined to occur, it is possible to control the human following system to reflect the presence of the collision and to reduce the probability of damage to the moving robot.
  • the present disclosure may apply to various fields including logistics fields, such as, for example, a human-following logistics robot, a postman-following robot, a personal cargo robot, and an electric golf trolley as well as a distribution field.
  • FIG. 1 illustrates a configuration of a stereo camera apparatus according to an example embodiment;
  • FIG. 2 illustrates an example for describing an arrangement of fisheye lenses, and the width direction arrangement and viewing angle of each of the image sensors configured to obtain a horizontal viewing angle wider than a vertical viewing angle, in a stereo camera apparatus of the present disclosure;
  • FIG. 3 illustrates an example for describing an epipolar line geometry in a stereo camera apparatus of the present disclosure;
  • FIG. 4 illustrates an example for describing a method of extracting a depth;
  • FIG. 5 illustrates an example for describing conversion of a fisheye lens image using an equirectangular projection scheme;
  • FIG. 6 illustrates an example of a depth for a disparity of the same size;
  • FIG. 7 illustrates an example for describing a viewing angle of a stereo camera apparatus of the present disclosure;
  • FIG. 8 is a flowchart illustrating a depth image processing method of a stereo camera apparatus according to an example embodiment;
  • FIG. 9 illustrates a configuration of a first-person vision system based on a stereo camera apparatus according to an example embodiment;
  • FIG. 10 is a flowchart illustrating an additional operation of a depth image processing method when a stereo camera apparatus according to an example embodiment operates as a first-person vision system;
  • FIG. 11 illustrates a configuration of a human following system based on a stereo camera apparatus according to an example embodiment; and
  • FIG. 12 is a flowchart illustrating an additional operation of a depth image processing method performed when a stereo camera apparatus according to an example embodiment operates as a human following system.
  • the example embodiments relate to providing a stereo camera apparatus capable of having a wide viewing angle by arranging two lenses, for example, fisheye lenses in a vertical direction to improve a horizontal viewing angle and a vertical viewing angle.
  • By converting the fisheye images respectively captured through the two fisheye lenses using a map projection scheme, for example, an equirectangular projection scheme in which a meridian is displayed as a straight line, and by performing stereo matching on the converted images in a height (vertical) direction, the present disclosure may extract a depth of the captured subject and obtain a wide viewing angle image based on the extracted depth.
  • two fisheye lenses or fisheye lens cameras are vertically arranged to face the same direction. Therefore, if a spherical coordinate system is used, an epipolar line matches a longitudinal line (a meridian) of a spherical surface and if a fisheye lens image is converted using an equirectangular projection scheme, the longitudinal line is expressed as a vertical line in an image. Therefore, efficient stereo matching may be performed by conducting a search along the vertical line.
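The vertical-epipolar property described above can be checked with a small numerical sketch (an illustrative model, not the patent's implementation; the camera positions, image size, and function name are assumptions). Two viewpoints on a vertical baseline see any scene point at the same azimuth about the baseline axis, so in equirectangular coordinates the point falls in the same column for both views and matching reduces to a 1D search along that column:

```python
import math

def equirect_coords(point, cam_y, width=1000, height=500):
    # Column/row at which a camera at (0, cam_y, 0) sees a 3D point,
    # with the spherical poles aligned to the vertical baseline (y) axis.
    x, y, z = point[0], point[1] - cam_y, point[2]
    azimuth = math.atan2(x, z)                    # identical for both cameras
    elevation = math.atan2(y, math.hypot(x, z))   # differs between cameras
    col = (azimuth / (2 * math.pi) + 0.5) * width
    row = (elevation / math.pi + 0.5) * height
    return col, row

baseline = 0.1                 # upper lens 10 cm above the lower lens
p = (1.0, 0.3, 2.0)            # an arbitrary scene point
c_low = equirect_coords(p, 0.0)
c_up = equirect_coords(p, baseline)
# Same column for both views: the epipolar line is vertical, so stereo
# matching reduces to a 1D search along that column.
assert abs(c_low[0] - c_up[0]) < 1e-12
assert c_low[1] != c_up[1]     # only the row (elevation) differs
```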
  • the stereo camera apparatus disclosed herein may extract or calculate a depth of a subject from an angular disparity between images converted using an equirectangular projection scheme for images captured using an upper fisheye lens and a lower fisheye lens.
  • FIG. 1 illustrates a configuration of a stereo camera apparatus according to an example embodiment, that is, illustrates a conceptual configuration of a stereo camera apparatus or system.
  • FIG. 2 illustrates an example for describing an arrangement of fisheye lenses and a width direction arrangement and a viewing angle of each of image sensors configured to obtain a horizontal viewing angle wider than a vertical viewing angle in a stereo camera apparatus of the present disclosure
  • FIG. 3 illustrates an example for describing an epipolar line geometry in a stereo camera apparatus of the present disclosure
  • FIG. 4 illustrates an example for describing a method of extracting a depth
  • FIG. 5 illustrates an example for describing conversion of a fisheye lens image using an equirectangular projection scheme
  • FIG. 6 illustrates an example of a depth for a disparity of the same size
  • FIG. 7 illustrates an example for describing a viewing angle of a stereo camera apparatus of the present disclosure.
  • A stereo camera apparatus 100 includes a receiver 110, a converter 120, and a processor 130.
  • the receiver 110 receives a first image (or a first fisheye image) and a second image (or a second fisheye image) of a subject that are captured through a first lens, for example, a first fisheye lens (or a first fisheye lens camera), and a second lens, for example, a second fisheye lens (or a second fisheye lens camera) that are arranged in a vertical direction.
  • the receiver 110 receives the first fisheye image and the second fisheye image of the subject respectively captured through the first fisheye lens and the second fisheye lens that are arranged in the vertical direction.
  • The first fisheye lens and the second fisheye lens are provided to face the same direction; the horizontal viewing angle may reach the maximum horizontal viewing angle of the lens, and the vertical viewing angle may be improved up to a maximum of 180 degrees.
  • the first fisheye lens and the second fisheye lens arranged in the vertical direction may have a horizontal viewing angle of 200 degrees or more and a vertical viewing angle of 180 degrees or more.
  • each fisheye lens is not limited to or restricted by the horizontal viewing angle of 200 degrees or more and the vertical viewing angle of 180 degrees or more.
  • the horizontal viewing angle and the vertical viewing angle may vary depending on a fisheye lens to be used.
  • The receiver 110 may arrange the respective image sensors in a width direction to obtain a width (horizontal) viewing angle wider than a height (vertical) viewing angle.
  • Alternatively, each of the image sensors of the receiver 110 may be arranged in the height direction.
  • the converter 120 converts the first fisheye image and the second fisheye image received through the receiver 110 using a map projection scheme.
  • the converter 120 may convert the first fisheye image and the second fisheye image using an equirectangular projection scheme.
  • The equirectangular projection scheme is a map projection scheme that represents latitudinal lines as horizontal lines and longitudinal lines as vertical lines.
  • the converter 120 may convert a 220-degree fisheye image to an equirectangular image using the equirectangular projection scheme.
  • The equirectangular projection scheme may be used, for example, by 360-degree cameras, to represent a wide viewing angle that cannot be captured in an existing rectilinear projection image.
  • However, distortion occurs in which the upper and lower portions of the image are greatly stretched left and right.
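As a hedged sketch of this kind of conversion (assuming an ideal equidistant fisheye model, r = f·θ, which the text above does not specify; the function name and image sizes are illustrative), the inverse mapping from an output equirectangular pixel to a source fisheye pixel might look like:

```python
import math

def equirect_to_fisheye(u, v, out_w, out_h, fish_w, fish_h, fov_deg=220.0):
    # Output pixel -> longitude (azimuth about the vertical axis) and latitude.
    lon = (u / out_w - 0.5) * math.radians(fov_deg)
    lat = (v / out_h - 0.5) * math.pi
    # Longitude/latitude -> unit viewing ray (poles on the vertical y axis).
    x = math.cos(lat) * math.sin(lon)
    y = math.sin(lat)
    z = math.cos(lat) * math.cos(lon)
    # Ray -> source pixel under an equidistant fisheye model (r = f * theta,
    # theta measured from the optical axis +z).
    theta = math.acos(max(-1.0, min(1.0, z)))
    f = fish_w / math.radians(fov_deg)            # pixels per radian
    r = f * theta
    phi = math.atan2(y, x)                        # angle around the optical axis
    return fish_w / 2 + r * math.cos(phi), fish_h / 2 + r * math.sin(phi)
```

Iterating this over every output pixel (with bilinear sampling of the fisheye image) yields the converted image; the same mapping would be applied to both the upper and lower fisheye images.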
  • The converter 120 may perform image rectification, rotating the first fisheye image and the second fisheye image to be aligned with the direction of a meridian, prior to converting them using the equirectangular projection scheme.
  • If image rectification is performed, the comparison between the two images, that is, the first fisheye image and the second fisheye image, may be simplified to a 1D search, which may improve the speed of stereo matching.
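The rotation underlying such rectification can be sketched with a generic Rodrigues vector alignment (an assumption for illustration; the text above does not give a rectification formula): build the rotation that maps the measured baseline direction onto the spherical pole axis, so that epipolar curves become meridians.

```python
import numpy as np

def rectify_rotation(baseline_vec):
    """Rotation matrix mapping the measured baseline direction onto the
    spherical pole axis (+y), so that epipolar curves become meridians."""
    b = np.asarray(baseline_vec, dtype=float)
    b = b / np.linalg.norm(b)
    yaxis = np.array([0.0, 1.0, 0.0])
    v = np.cross(b, yaxis)                 # rotation axis (unnormalized)
    s, c = np.linalg.norm(v), float(b @ yaxis)
    if s < 1e-12:                          # already aligned or anti-aligned
        return np.eye(3) if c > 0 else np.diag([1.0, -1.0, -1.0])
    vx = np.array([[0.0, -v[2], v[1]],
                   [v[2], 0.0, -v[0]],
                   [-v[1], v[0], 0.0]])
    # Rodrigues alignment: R = I + [v]x + [v]x^2 * (1 - c) / s^2
    return np.eye(3) + vx + vx @ vx * ((1.0 - c) / s**2)

# A slightly tilted baseline is rotated exactly onto +y.
b = np.array([0.02, 0.999, -0.01])
R = rectify_rotation(b)
assert np.allclose(R @ (b / np.linalg.norm(b)), [0.0, 1.0, 0.0], atol=1e-9)
```

In practice the baseline direction would come from extrinsic calibration of the two fisheye cameras.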
  • The processor 130 extracts a depth or a depth map of the subject captured through the first fisheye lens and the second fisheye lens by performing stereo matching, in a height (vertical) direction, on the converted images, that is, the first equirectangular image and the second equirectangular image produced by the converter 120 using the equirectangular projection scheme.
  • As noted above, the two fisheye lenses or fisheye lens cameras are vertically provided to face the same direction. Therefore, if the spherical coordinate system is used, the epipolar line matches a longitudinal line (a meridian) of the spherical surface, and if a fisheye lens image is converted using the equirectangular projection scheme, the longitudinal line is expressed as a vertical line in the image. Therefore, the processor 130 may perform efficient stereo matching by conducting a search along the vertical line.
  • The processor 130 may extract or calculate a depth d of the subject from the angular disparity, that is, θ1 − θ0, between the images converted using the equirectangular projection scheme.
  • the depth d may be extracted or calculated according to the following Equation 1.
  • Note that, depending on the direction of the subject, the depth corresponding to an angular disparity of the same size may vary.
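Equation 1 is not reproduced in this text, but the triangulation it stands for can be illustrated with a standard law-of-sines sketch for two ideal, vertically separated viewpoints (an assumption for illustration; `theta_lower` and `theta_upper` denote the elevation angles at which the lower and upper cameras see the same point):

```python
import math

def depth_from_angular_disparity(baseline, theta_lower, theta_upper):
    """Distance from the lower camera to a subject point seen at elevation
    theta_lower by the lower camera and theta_upper by the upper camera
    (radians). Law of sines in the triangle (lower cam, upper cam, point):
        baseline / sin(theta_lower - theta_upper) = r / cos(theta_upper)
    """
    return baseline * math.cos(theta_upper) / math.sin(theta_lower - theta_upper)

# Synthetic check: a point 2 m ahead and 0.5 m above the lower camera,
# with the upper camera 0.1 m higher on the baseline.
theta_lo = math.atan2(0.5, 2.0)
theta_up = math.atan2(0.5 - 0.1, 2.0)
r = depth_from_angular_disparity(0.1, theta_lo, theta_up)
assert abs(r - math.hypot(2.0, 0.5)) < 1e-9
```

The smaller the angular disparity for a given baseline, the farther the point, which is consistent with the varying depth per disparity noted above.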
  • The processor 130 may obtain an image of the subject having a wide viewing angle that includes a horizontal viewing angle of 180 degrees or more and a vertical viewing angle of up to 180 degrees using the extracted depth of the subject.
  • a stereo camera apparatus may obtain a wide viewing angle by arranging two lenses, for example, fisheye lenses in a vertical direction and thereby improving a horizontal viewing angle and a vertical viewing angle.
  • the present disclosure may implement a wide viewing angle RGB-D camera apparatus in which the vertical viewing angle is a maximum of 180 degrees and a horizontal viewing angle is much larger than 180 degrees.
  • a viewing angle having a horizontal viewing angle of 280 degrees and a vertical viewing angle of 180 degrees may be implemented.
  • Using the stereo camera apparatus, it is possible to obtain a wide viewing angle by improving the horizontal and vertical viewing angles, to implement an RGB-D camera with a viewing angle wider than that of a human, and to easily detect objects accordingly.
  • The stereo camera apparatus may easily detect close objects and thus may detect the whole body of a human within 1 m.
  • Accordingly, the stereo camera apparatus may detect the whole body even when the target to be followed is close. Even when the target changes its travel direction left and right, it does not deviate from the field of view, so stable following is possible. Therefore, the present disclosure may be employed not only for a human-computer interface but also for human-robot interaction and various moving objects.
  • the stereo camera apparatus disclosed herein may perform efficient stereo matching between an upper image and a lower image by converting images captured by arranging two fisheye lenses or fisheye lens cameras in a vertical direction or a height direction, using an equirectangular projection scheme.
  • An existing stereo camera may arrange its cameras in whichever direction is convenient for mounting, in consideration of the direction of the subject.
  • The shape of the image distortion may vary according to the installation direction of the camera. Referring to FIG. 6, since the accuracy of a depth obtained through stereo matching also varies, the arrangement of the stereo camera needs to be chosen in consideration of the image distortion and of obtaining a more accurate depth value for the subject.
  • In a rectilinear projection image, a straight line in the real world is always expressed as a straight line, so the installation direction of the camera does not introduce distortion in the image.
  • In an image converted using the equirectangular projection scheme, a vertical line in the real world is expressed as a straight line while a horizontal line is expressed as a curved line, so the shape of the distortion in the image varies according to the installation direction of the camera.
  • the camera using the image converted through the equirectangular projection scheme may be assumed to be used in a state in which the camera generally faces the front.
  • the wide viewing angle stereo camera in which two lenses are arranged in a height direction may use advantages found in arrangement in the height direction into consideration of the following three aspects.
  • occlusion occurs between two cameras.
  • a side camera that constitutes the stereo camera is visible in an image. Therefore, two cameras may be arranged in a vertical direction such that a direction of the occlusion may be upward and downward.
  • meaningful subjects are highly likely to be present in the front or in a side direction rather than in a ceiling direction or a floor direction. Therefore, it may be advantageous for the direction of the occlusion to be upward and downward.
  • an image is excessively enlarged toward the upper end and the lower end of the image, which correspond to the poles.
  • a direction in which such distortion occurs may be matched to an upward direction and a downward direction.
  • the stereo camera apparatus of the present disclosure may minimize an area in which a depth is unobtainable by matching the direction in which interference between the two cameras or two lenses occurs to the direction in which the depth is unobtainable, and may increase the horizontal viewing angle of the depth map to 180 degrees or more.
  • the stereo camera apparatus of the present disclosure may obtain a 280-degree depth map in a horizontal direction in the case of using a 280-degree fisheye lens.
  • when a stereo camera apparatus performs stereo matching by searching the upper camera image based on the image of the lower camera, a hole caused by occlusion is generated below an object.
  • in a conventional horizontal arrangement, the hole is generated on the left or the right of the object.
  • the present disclosure may be more advantageous for object detection since, if a hole is generated at all, it is generated below the object.
  • FIG. 8 is a flowchart illustrating a depth image processing method of a stereo camera apparatus according to an example embodiment, and also illustrates a flowchart in the stereo camera apparatus of FIGS. 1 to 7 .
  • the depth image processing method of the stereo camera apparatus receives a first image, for example, a first fisheye image, and a second image, for example, a second fisheye image, of a subject captured through a first lens and a second lens that are arranged in a vertical direction.
  • the received first fisheye image and the second fisheye image are converted using a map projection scheme in operation S 820 .
  • the first fisheye image and the second fisheye image may be converted using an equirectangular projection scheme.
  • transformation may be performed to match an epipolar line to a longitudinal line of a spherical surface using a spherical coordinate system.
  • a depth of the subject is extracted by performing stereo matching on the first fisheye image and the second fisheye image converted using the map projection scheme, in a height direction in operation S 830 .
  • a depth image of the subject having a wide viewing angle in which a horizontal viewing angle is 180 degrees or more and a vertical viewing angle is a maximum of 180 degrees is obtained using the extracted depth of the subject.
  • stereo matching may be performed by searching for the first image and the second image converted using the map projection scheme along a vertical line corresponding to the longitudinal line.
  • although the method of FIG. 8 is described above, it is apparent to those skilled in the art that the description made with reference to FIGS. 1 to 7 may apply to each of the operations of FIG. 8 .
  • the stereo camera apparatus may recognize arm and hand motions of a user being captured, a posture of the user, and a surrounding environment, for example, a surrounding terrain and a surrounding object, and may provide such information.
  • further description related to the disclosure is made with reference to FIGS. 9 and 10 .
  • FIG. 9 illustrates a configuration of a first-person vision system based on a stereo camera apparatus according to an example embodiment.
  • the stereo camera apparatus 100 may operate as a first-person vision system by further including a vision processing unit 910 .
  • the stereo camera apparatus 100 may be attached to a body portion of the user, such as the head or the torso, or further to a cap and the like, and may capture a front direction of the user. Since the stereo camera apparatus 100 has a wide viewing angle, it may capture a body portion, such as an arm and a hand of the user, and may also capture a surrounding environment, such as, for example, an object, a building, and a terrain, even while capturing the front of the user.
  • the vision processing unit 910 recognizes a motion of the user and a surrounding environment that includes an object and a terrain present around the user using the depth map of the subject obtained by the stereo camera apparatus 100 , and provides the user with the recognized information, that is, information about the motion of the user and the surrounding environment.
  • the first-person vision system, that is, the stereo camera apparatus 100 , may provide a field of view similar to that of a human through, for example, an RGB-D camera apparatus having a wide viewing angle of horizontally 180 degrees or more. Therefore, a hand motion of the user wearing the first-person vision system may enter the field of view in all directions. Accordingly, although the stereo camera apparatus 100 captures the front of the user, the vision processing unit 910 may recognize hand and arm motions of the user and a behavior and a posture of the user.
  • the vision processing unit 910 may detect a collision probability of the user by detecting a surrounding object and may provide the user with information about the detected collision probability.
  • the vision processing unit 910 may provide information, such as that the hand of the user is present behind an object or that an object is present in the hand of the user, and may also provide information about the surrounding terrain and objects through recognition of the surrounding environment in real time.
  • the stereo camera apparatus 100 may detect hand and arm motions of the user or a close object by further including the vision processing unit 910 .
  • FIG. 10 is a flowchart illustrating an additional operation of a depth image processing method when a stereo camera apparatus according to an example embodiment operates as a first-person vision system, and illustrates an operation additionally performed by the stereo camera apparatus that operates as the first-person vision system of FIG. 9 after operation S 840 of the depth image processing method of FIG. 8 .
  • the stereo camera apparatus recognizes a motion of the user and a surrounding environment that includes an object and a terrain present around the user using the extracted depth map of the subject and provides the user with the recognized motion and surrounding environment of the user.
  • the stereo camera apparatus may recognize a human and a surrounding environment that includes an object and a terrain present around the human using the depth map of the subject extracted by the stereo camera apparatus and may detect presence or absence of a collision based on the recognized surrounding environment. When the collision is determined to occur, the stereo camera apparatus may control a human following system by reflecting the collision.
  • further description related to the disclosure is made with reference to FIGS. 11 and 12 .
  • FIG. 11 illustrates a configuration of a human following system based on a stereo camera apparatus according to an example embodiment.
  • the stereo camera apparatus 100 may operate as the human following system by further including a controller 1110 .
  • the stereo camera apparatus 100 may capture a human to be followed by the human following system and a surrounding environment, for example, an object, a building and a terrain.
  • the controller 1110 may recognize a human to be followed and a distance from the human based on a depth map of the subject obtained by the stereo camera apparatus 100 and may control the human following system, for example, the stereo camera apparatus 100 to follow the human at a predetermined distance based on the recognized human and distance from the human.
  • the controller 1110 may recognize a travel direction of the human and the surrounding environment that includes an object present around the human, a collidable object, and a terrain as well as the human and the distance from the human using the depth map of the subject, may detect presence or absence of a collision of the human following system, that is, the stereo camera apparatus 100 , based on the recognized information, that is, the travel direction of the human and the surrounding environment, and may control the human following system, that is, the stereo camera apparatus 100 to follow the human by applying a collision detection result.
  • the human following system, that is, the stereo camera apparatus 100 , may provide a field of view similar to that of the human through the stereo camera apparatus 100 , for example, an RGB-D camera apparatus having a wide viewing angle in which a horizontal viewing angle is much greater than 180 degrees, and thus may follow the human at a close distance, for example, within 1 m, and may recognize the whole body of the human at that distance. Therefore, the human following system may verify an accurate position and may follow a human that quickly moves not only in a longitudinal direction but also in a transverse direction.
  • FIG. 12 is a flowchart illustrating an additional operation of a depth image processing method performed when a stereo camera apparatus according to an example embodiment operates as a human following system, and illustrates an operation additionally performed by the stereo camera apparatus that operates as the human following system of FIG. 11 after operation S 840 of the depth image processing method of FIG. 8 .
  • a controller included in the stereo camera apparatus recognizes a human and a distance from the human based on the extracted depth map of the subject and controls the human following system to follow the human at a predetermined distance based on the recognized human and distance from the human.
  • the controller may recognize a travel direction of the human and a surrounding environment that includes an object and a terrain around the human based on the depth map of the subject, may detect presence or absence of a collision based on the travel direction of the human and the surrounding environment, and may control the human following system to follow the human by applying a collision detection result.
  • the apparatuses described herein may be implemented using hardware components, software components, and/or a combination thereof.
  • the systems, the apparatuses, and the components described herein may be implemented using one or more general-purpose or special purpose computers, such as, for example, a processor, a controller, an arithmetic logic unit (ALU), a digital signal processor, a microcomputer, a field programmable array (FPA), a programmable logic unit (PLU), a microprocessor, or any other device capable of responding to and executing instructions in a defined manner.
  • the processing device may run an operating system (OS) and one or more software applications that run on the OS.
  • the processing device also may access, store, manipulate, process, and create data in response to execution of the software.
  • a processing device may include multiple processing elements and/or multiple types of processing elements.
  • a processing device may include multiple processors or a processor and a controller.
  • different processing configurations are possible, such as parallel processors.
  • the software may include a computer program, a piece of code, an instruction, or some combinations thereof, for independently or collectively instructing or configuring the processing device to operate as desired.
  • Software and/or data may be permanently or temporarily embodied in any type of machine, component, physical equipment, virtual equipment, computer storage medium or device, or a signal wave to be transmitted, to be interpreted by the processing device or to provide an instruction or data to the processing device.
  • the software also may be distributed over network coupled computer systems so that the software is stored and executed in a distributed fashion.
  • the software and data may be stored by one or more computer readable storage media.
  • the methods according to the above-described example embodiments may be configured in a form of program instructions performed through various computer devices and recorded in non-transitory computer-readable media.
  • the media may also include, alone or in combination with the program instructions, data files, data structures, and the like.
  • the program instructions recorded in the media may be specially designed and configured for the example embodiments or may be known to those skilled in the computer software art and thereby available.
  • Examples of the media include magnetic media such as hard disks, floppy disks, and magnetic tapes; optical media such as CD-ROM and DVDs; magneto-optical media such as floptical disks; and hardware devices that are specially configured to store program instructions, such as read-only memory (ROM), random access memory (RAM), flash memory, and the like.
  • Examples of program instructions include both machine code, such as produced by a compiler, and files containing higher level code that may be executed by the computer using an interpreter.
  • the hardware device may be configured to operate as one or more software modules to perform the operation of the example embodiments or vice versa.

Abstract

Disclosed are a wide viewing angle stereo camera apparatus and a depth image processing method using the same. A stereo camera apparatus includes a receiver configured to receive a first image and a second image of a subject captured through a first lens and a second lens that are provided in a vertical direction; a converter configured to convert the received first image and second image using a map projection scheme; and a processing unit configured to extract a depth of the subject by performing stereo matching on the first image and the second image converted using the map projection scheme, in a height direction.

Description

CROSS-REFERENCE TO RELATED APPLICATION
This application claims the priority benefit of Korean Patent Application No. 10-2020-0104281, filed on Aug. 20, 2020, No. 10-2021-0023552, filed on Feb. 22, 2021, No. 10-2021-0023553, filed on Feb. 22, 2021 and No. 10-2021-0097826, filed on Jul. 26, 2021, in the Korean Intellectual Property Office, the disclosures of which are incorporated herein by reference in their entirety.
BACKGROUND 1. Field of the Invention
The following description of example embodiments relates to a wide viewing angle stereo camera apparatus and a depth image processing method using the same, and more particularly, to a stereo camera apparatus having a wide viewing angle in which two lenses, for example, fisheye lenses, are arranged in a vertical direction to improve a horizontal viewing angle and a vertical viewing angle, and a depth image processing method using the same.
2. Description of the Related Art
Much research on red, green, blue-depth (RGB-D) cameras, which output a color image and a depth map corresponding to three-dimensional (3D) information, is being conducted to implement human-computer interfaces. The Kinect sensor, an RGB-D camera Microsoft has supplied for its games, provides a function of recognizing the body gestures of a plurality of humans and is used for various human-computer interfaces as well as for games, its original purpose. Various RGB-D camera products have been released and are expanding the scope of use to 3D content creation, autonomous driving, and robots.
The RGB-D camera may be classified into a stereo camera and a time of flight (TOF) camera based on the method of obtaining depth information. A typical stereo camera includes two cameras installed as far apart as a baseline and obtains depth information based on the viewpoint mismatch between the cameras. To obtain depth information, a stereo matching process of comparing the images of both cameras needs to be performed. A patterned light or a structured light may be used to increase the speed and accuracy of the stereo matching process. The TOF camera refers to a camera that irradiates a laser toward a subject, calculates a distance based on the amount of time taken for the irradiated light to be reflected back from the subject, and provides depth information with a two-dimensional (2D) image. The TOF camera provides precise distance information compared to a general stereo scheme, but provides a relatively low spatial resolution and has some constraints in measurement distance and use environment.
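For a conventional rectilinear stereo pair as described above, depth follows the standard triangulation relation Z = f·B/d, where f is the focal length in pixels, B the baseline, and d the disparity. A minimal sketch for illustration only (the function and parameter names are not from the disclosure):

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Classic pinhole-stereo triangulation: Z = f * B / d.

    focal_px: focal length in pixels, baseline_m: camera separation in
    meters, disparity_px: pixel shift of the subject between the views.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px

# Example: f = 700 px, baseline = 0.5 m, disparity = 175 px -> Z = 2.0 m
print(depth_from_disparity(700, 0.5, 175))  # 2.0
```

The same inverse relation between disparity and depth underlies the stereo matching in the present disclosure, although the search there proceeds along vertical lines of equirectangular images rather than horizontal scanlines.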
General cameras use a lens designed for a rectilinear projection. In the rectilinear projection, a straight line in the real world appears as a straight line in an image and thus, an image similar to that perceived by a human may be obtained. However, as the viewing angle of the camera increases, objects present around the edges of the image are expressed relatively larger than objects present in the center of the image. In general, a rectilinear projection lens is used up to a horizontal viewing angle of about 120 degrees and a fisheye lens is used if a viewing angle of greater than 120 degrees is required. The fisheye lens may project the real world into an image with barrel distortion and may capture a viewing angle of more than 180 degrees in the image.
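The difference between the two projections can be sketched numerically. A rectilinear lens follows r = f·tan θ, which diverges as the incident angle approaches 90 degrees, while under the common equidistant fisheye model the image radius grows linearly, r = f·θ, so angles beyond 90 degrees (a field of view over 180 degrees) still land on the sensor. The model choice and focal value below are illustrative assumptions, not parameters from the disclosure:

```python
import math

def rectilinear_radius(f, theta):
    """Image radius of a rectilinear (perspective) lens: r = f * tan(theta).
    Diverges as theta approaches 90 degrees, so the half viewing angle
    must stay below 90 degrees."""
    return f * math.tan(theta)

def equidistant_fisheye_radius(f, theta):
    """Image radius of an equidistant fisheye lens: r = f * theta.
    Grows linearly, so angles beyond 90 degrees still map to a finite
    radius on the sensor."""
    return f * theta

f = 1.0
# At 60 degrees both models give a finite radius.
print(rectilinear_radius(f, math.radians(60)))        # ~1.732
print(equidistant_fisheye_radius(f, math.radians(60)))   # ~1.047
# At 100 degrees (half of a 200-degree FOV) only the fisheye model is usable.
print(equidistant_fisheye_radius(f, math.radians(100)))  # ~1.745
```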
As the use environment of the RGB-D camera expands to moving objects, such as, for example, a robot, a drone, and an autonomous driving vehicle, there is an increasing demand for a wide viewing angle. However, many RGB-D cameras use the rectilinear projection lens and thus have difficulty in providing a horizontal viewing angle of 120 degrees or more. Some stereo cameras use the fisheye lens and thus provide a wider viewing angle. However, a fisheye lens image does not allow efficient stereo matching through a one-dimensional (1D) search as a rectilinear projection image does. Also, even when the fisheye lens is used, distance information may not be obtained in the same direction as the baseline of the stereo camera. Accordingly, the horizontal angle of view is limited to within 180 degrees.
SUMMARY
Example embodiments provide a stereo camera apparatus having a wide viewing angle in which two lenses, for example, fisheye lenses, are arranged in a vertical direction to improve the horizontal viewing angle up to the maximum horizontal viewing angle of the lens and to improve the vertical viewing angle up to a maximum of 180 degrees, and a depth image processing method using the same.
Also, example embodiments provide a wide viewing angle stereo camera-based human-machine interface system that may recognize a motion of a whole body of a user at a close distance within 1 m using a stereo camera apparatus described herein and a depth image processing method using the same.
Also, example embodiments provide a wide viewing angle stereo camera-based first-person vision system that may recognize a motion of a user or a surrounding environment of the user using a stereo camera apparatus described herein and a depth image processing method using the same.
Also, example embodiments provide a wide viewing angle stereo camera-based human following system that may recognize and follow a human at a close distance using a stereo camera apparatus described herein and a method thereof.
According to an aspect of an example embodiment, there is provided a stereo camera apparatus including a receiver configured to receive a first image and a second image of a subject captured through a first lens and a second lens that are provided in a vertical direction; a converter configured to convert the received first image and second image using a map projection scheme; and a processing unit configured to extract a depth of the subject by performing stereo matching on the first image and the second image converted using the map projection scheme, in a height direction.
The converter may be configured to convert the first image and the second image using an equirectangular projection scheme.
The converter may be configured to perform rotation transformation of matching an epipolar line to a longitudinal line of a spherical surface using a spherical coordinate system.
The processing unit may be configured to perform stereo matching by searching the first image and the second image converted using the map projection scheme along a vertical line corresponding to the longitudinal line.
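A sketch of the kind of mapping such a converter might build is shown below: each equirectangular output pixel is assigned a longitude and latitude, turned into a viewing direction, and projected back into the fisheye image, so that a meridian (the epipolar line after the rotation above) becomes a vertical image column. The equidistant fisheye model and all parameter names are assumptions for illustration, not the patented implementation:

```python
import numpy as np

def equirect_to_fisheye_map(out_w, out_h, fisheye_f, cx, cy):
    """For each equirectangular output pixel, compute the source pixel
    in an equidistant fisheye image (r = f * theta).

    Longitude spans [-pi, pi] across the width and latitude
    [pi/2, -pi/2] down the height, so a longitudinal line (meridian)
    maps to a vertical column of the output image.
    """
    lon = (np.arange(out_w) / out_w - 0.5) * 2.0 * np.pi   # -pi .. pi
    lat = (0.5 - np.arange(out_h) / out_h) * np.pi          # pi/2 .. -pi/2
    lon, lat = np.meshgrid(lon, lat)                        # (out_h, out_w)
    # Unit viewing direction; the camera looks along +z.
    x = np.cos(lat) * np.sin(lon)
    y = np.sin(lat)
    z = np.cos(lat) * np.cos(lon)
    theta = np.arccos(np.clip(z, -1.0, 1.0))   # angle from the optical axis
    phi = np.arctan2(y, x)                     # angle around the axis
    r = fisheye_f * theta                      # equidistant fisheye model
    map_x = cx + r * np.cos(phi)
    map_y = cy + r * np.sin(phi)
    return map_x.astype(np.float32), map_y.astype(np.float32)

# The two maps can be fed to an image-warping routine such as cv2.remap.
mx, my = equirect_to_fisheye_map(64, 32, fisheye_f=20.0, cx=32.0, cy=32.0)
print(mx.shape, my.shape)  # (32, 64) (32, 64)
```

Note that the image center (longitude 0, latitude 0) maps back to the fisheye principal point (cx, cy), since the viewing direction there coincides with the optical axis.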
Each of the first lens and the second lens may include a fisheye lens.
The first lens and the second lens may be fisheye lenses each with a horizontal viewing angle of 180 degrees or more.
The processing unit may be configured to obtain an image of the subject having a horizontal wide viewing angle of 180 degrees or more using the extracted depth of the subject.
The receiver may include an image sensor configured to capture the first image and an image sensor configured to capture the second image, and each image sensor, in the case of a rectangular sensor whose width is greater than its height, may be oriented in a width direction to obtain a horizontal viewing angle wider than the vertical viewing angle.
The stereo camera apparatus may further include a vision processing unit configured to recognize a motion of a user and a surrounding environment that includes an object and a terrain present around the user based on a depth map of the subject extracted from the stereo camera apparatus and to provide the user with the recognized motion of the user and surrounding environment.
The stereo camera apparatus may further include a controller configured to recognize a human and a distance from the human based on a depth map of the subject extracted from the stereo camera apparatus, and to control a human following system to follow the human at a predetermined distance based on the recognized human and distance from the human.
The controller may be configured to recognize a travel direction of the human and a surrounding environment that includes an object and a terrain present around the human based on the depth map of the subject, to detect presence or absence of a collision based on the travel direction of the human and the surrounding environment, and to control the human following system to follow the human by applying a collision detection result.
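A following behavior of the kind the controller describes could, purely for illustration, be reduced to a proportional rule on the recognized distance, with the collision detection result overriding motion. The gains, limits, and function names below are hypothetical, not taken from the disclosure:

```python
def follow_command(distance_m, target_m=1.0, gain=0.8, max_speed=1.5,
                   collision_ahead=False):
    """Toy proportional controller for a following behavior: move
    forward when the human is farther than the target distance, back
    off when closer, and stop when a collision is detected ahead."""
    if collision_ahead:
        return 0.0                       # collision result overrides motion
    speed = gain * (distance_m - target_m)
    return max(-max_speed, min(max_speed, speed))

print(follow_command(2.0))                        # 0.8  (approach)
print(follow_command(0.5))                        # -0.4 (back off)
print(follow_command(3.0, collision_ahead=True))  # 0.0  (stop)
```

A real system would combine such a distance rule with the recognized travel direction of the human and the surrounding terrain, as the paragraph above describes.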
According to an aspect of an example embodiment, there is provided a depth image processing method of a stereo camera apparatus, the method including receiving a first image and a second image of a subject captured through a first lens and a second lens that are provided in a vertical direction; converting the received first image and second image using a map projection scheme; and extracting a depth of the subject by performing stereo matching on the first image and the second image converted using the map projection scheme, in a height direction.
The depth image processing method may further include recognizing a motion of a user and a surrounding environment that includes an object and a terrain present around the user based on a depth map of the extracted subject and providing the user with the recognized motion of the user and surrounding environment.
The depth image processing method may further include recognizing a human and a distance from the human based on a depth map of the extracted subject, and controlling a human following system to follow the human at a predetermined distance based on the recognized human and distance from the human.
According to some example embodiments, it is possible to obtain a wide viewing angle by arranging two lenses, for example, fisheye lenses in a vertical direction and thereby improving a horizontal viewing angle and a vertical viewing angle.
According to some example embodiments, it is possible to obtain a wide viewing angle by improving a horizontal viewing angle and a vertical viewing angle, to implement a red, green, blue, depth (RGB-D) camera wider than a field of view of a human, and to easily detect an object using a wide viewing angle image converted using an equirectangular projection scheme.
In the case of a fisheye lens image, an upright subject appears tilted or bent at the periphery of the image. However, in the image converted using the equirectangular projection scheme, the upright subject is expressed in an upright state, which facilitates object detection.
Herein, a viewing angle wider than that of a human is provided such that an RGB-D camera may be mounted on a moving object, such as a robot, a vehicle, or a drone, and may detect the whole body of a human present within 1 m. Therefore, if attached to a human-following robot, the whole body may be detected even when the target to be followed is close. Even when the target shifts its direction of movement side to side, the target does not deviate from the field of view and stable following is possible. Accordingly, the example embodiments may be used for human-robot interaction and various moving objects as well as for the human-computer interface.
Also, according to some example embodiments, it is possible to recognize a motion of a user or to recognize a surrounding environment of the user using a wide viewing angle stereo camera apparatus.
According to some example embodiments, although a wide viewing angle stereo camera apparatus is attached to a body portion, such as a head and a torso of a user, to capture a front direction of the user, it is possible to recognize a motion of a hand and an arm of the user and to grasp a posture and a behavior of the user due to the wide viewing angle. Also, by extracting a depth map using the wide viewing angle stereo camera apparatus, it is possible to detect or recognize a surrounding environment captured by the stereo camera apparatus, for example, a collision and a terrain such as whether a floor is flat or inclined.
When the example embodiments are applied to an apparatus, such as virtual reality (VR) and a head mounted display (HMD), it is possible to implement information about a recognized motion of a user and a surrounding environment as a virtual reality.
Also, according to some example embodiments, it is possible to recognize and follow a human at a close distance using a wide viewing angle stereo camera apparatus.
Therefore, according to some example embodiments, since a human following system, such as a moving robot, recognizes a human within a close distance, for example, within 1 m, and follows the human, professional manpower for operating the mobile robot may not be required and manpower consumption may be reduced. That is, the example embodiments may apply various algorithms for detecting and following a captured human.
According to some example embodiments, it is possible to recognize a human within a close distance, for example, 1 m, to verify an accurate position and thereby to follow a human that quickly moves not only in a longitudinal direction but also in a latitudinal direction.
According to some example embodiments, by recognizing a human and a surrounding environment that includes an object and a terrain around the human using a depth map of a subject extracted by a stereo camera apparatus, it is possible to detect presence or absence of a collision based on the surrounding environment. When the collision is determined to occur, it is possible to control a human following system by reflecting presence of the collision and to reduce a probability of damage to a moving robot.
The present disclosure may apply to various fields including logistics fields, such as, for example, a human-following logistics robot, a postman-following robot, a personal cargo robot, and an electric golf trolley as well as a distribution field.
Further areas of applicability will become apparent from the description provided herein. The description and specific examples in this summary are intended for purposes of illustration only and are not intended to limit the scope of the present disclosure.
BRIEF DESCRIPTION OF THE DRAWINGS
These and/or other aspects, features, and advantages of the invention will become apparent and more readily appreciated from the following description of embodiments, taken in conjunction with the accompanying drawings of which:
FIG. 1 illustrates a configuration of a stereo camera apparatus according to an example embodiment;
FIG. 2 illustrates an example for describing an arrangement of fisheye lenses and a width direction arrangement and a viewing angle of each of image sensors configured to obtain a horizontal viewing angle wider than a vertical viewing angle in a stereo camera apparatus of the present disclosure;
FIG. 3 illustrates an example for describing an epipolar line geometry in a stereo camera apparatus of the present disclosure;
FIG. 4 illustrates an example for describing a method of extracting a depth;
FIG. 5 illustrates an example for describing conversion of a fisheye lens image using an equirectangular projection scheme;
FIG. 6 illustrates an example of a depth for a disparity of the same size;
FIG. 7 illustrates an example for describing a viewing angle of a stereo camera apparatus of the present disclosure;
FIG. 8 is a flowchart illustrating a depth image processing method of a stereo camera apparatus according to an example embodiment;
FIG. 9 illustrates a configuration of a first-person vision system based on a stereo camera apparatus according to an example embodiment;
FIG. 10 is a flowchart illustrating an additional operation of a depth image processing method when a stereo camera apparatus according to an example embodiment operates as a first-person vision system;
FIG. 11 illustrates a configuration of a human following system based on a stereo camera apparatus according to an example embodiment; and
FIG. 12 is a flowchart illustrating an additional operation of a depth image processing method performed when a stereo camera apparatus according to an example embodiment operates as a human following system.
DETAILED DESCRIPTION
Aspects and features of the disclosure and methods to achieve the same may become clear with reference to the accompanying drawings and the following example embodiments. Example embodiments, however, may be embodied in various different forms, and should not be construed as being limited to only the illustrated embodiments. Rather, the illustrated embodiments are provided as examples so that this disclosure will be thorough and complete, and are defined by the scope of the claims.
The terms used herein are to describe the example embodiments and not to limit the disclosure. As used herein, the singular forms “a,” “an,” and “the,” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms “comprises” and/or “comprising,” when used in this specification, specify the presence of stated components, steps, operations, and/or elements, but do not preclude the presence or addition of one or more other components, steps, operations, and elements.
Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which the example embodiments belong. Terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the relevant art and/or this disclosure, and should not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
Hereinafter, the example embodiments will be described in more detail with reference to the accompanying drawings. Like reference numerals refer to like elements throughout and further description related thereto is omitted.
The example embodiments relate to providing a stereo camera apparatus capable of having a wide viewing angle by arranging two lenses, for example, fisheye lenses in a vertical direction to improve a horizontal viewing angle and a vertical viewing angle.
Here, the present disclosure may extract a depth of a subject captured in fisheye images and may obtain a wide viewing angle image based on the extracted depth. To this end, the fisheye images respectively captured through the two fisheye lenses are converted using a map projection scheme, for example, an equirectangular projection scheme in which a meridian is displayed as a straight line, and stereo matching is performed on the converted fisheye images in a height direction or a vertical direction.
In the stereo camera apparatus disclosed herein, two fisheye lenses or fisheye lens cameras are vertically arranged to face the same direction. Therefore, if a spherical coordinate system is used, an epipolar line matches a longitudinal line (a meridian) of a spherical surface and if a fisheye lens image is converted using an equirectangular projection scheme, the longitudinal line is expressed as a vertical line in an image. Therefore, efficient stereo matching may be performed by conducting a search along the vertical line.
The stereo camera apparatus disclosed herein may extract or calculate a depth of a subject from an angular disparity between images converted using an equirectangular projection scheme for images captured using an upper fisheye lens and a lower fisheye lens.
Hereinafter, the example embodiments are described with reference to FIGS. 1 to 12 .
FIG. 1 illustrates a configuration of a stereo camera apparatus according to an example embodiment, that is, a conceptual configuration of a stereo camera apparatus or system. FIG. 2 illustrates an example for describing an arrangement of fisheye lenses and a width direction arrangement and a viewing angle of each of image sensors configured to obtain a horizontal viewing angle wider than a vertical viewing angle in a stereo camera apparatus of the present disclosure. FIG. 3 illustrates an example for describing an epipolar line geometry in a stereo camera apparatus of the present disclosure. FIG. 4 illustrates an example for describing a method of extracting a depth. FIG. 5 illustrates an example for describing conversion of a fisheye lens image using an equirectangular projection scheme. FIG. 6 illustrates an example of a depth for a disparity of the same size. FIG. 7 illustrates an example for describing a viewing angle of a stereo camera apparatus of the present disclosure.
Referring to FIGS. 1 to 7 , a stereo camera apparatus 100 according to an example embodiment includes a receiver 110, a converter 120, and a processing unit 130.
The receiver 110 receives a first image (or a first fisheye image) and a second image (or a second fisheye image) of a subject that are captured through a first lens, for example, a first fisheye lens (or a first fisheye lens camera), and a second lens, for example, a second fisheye lens (or a second fisheye lens camera) that are arranged in a vertical direction.
For example, referring to FIG. 2 , the receiver 110 receives the first fisheye image and the second fisheye image of the subject respectively captured through the first fisheye lens and the second fisheye lens that are arranged in the vertical direction. Here, the first fisheye lens and the second fisheye lens are provided to face the same direction and may provide a maximized horizontal viewing angle and a vertical viewing angle improved up to a maximum of 180 degrees. For example, the first fisheye lens and the second fisheye lens arranged in the vertical direction may have a horizontal viewing angle of 200 degrees or more and a vertical viewing angle of 180 degrees or more. However, this is provided as an example only, and each fisheye lens is not limited to or restricted by the horizontal viewing angle of 200 degrees or more and the vertical viewing angle of 180 degrees or more. The horizontal viewing angle and the vertical viewing angle may vary depending on a fisheye lens to be used.
Also, when each of a first image sensor of a first camera for capturing the first image and a second image sensor of a second camera for capturing the second image has a rectangular shape of which a width is greater than a height, the respective image sensors of the receiver 110 may be arranged in a width direction to obtain a width viewing angle (or a horizontal viewing angle) wider than a height viewing angle (or a vertical viewing angle). However, without being limited thereto or restricted thereby, each of the image sensors of the receiver 110 may instead be arranged in the height direction.
The converter 120 converts the first fisheye image and the second fisheye image received through the receiver 110 using a map projection scheme.
Here, the converter 120 may convert the first fisheye image and the second fisheye image using an equirectangular projection scheme.
Here, the equirectangular projection scheme refers to a map projection scheme that represents a latitudinal line and a longitudinal line as a horizontal line and a vertical line, respectively. For example, referring to FIG. 3 , the converter 120 may convert a 220-degree fisheye image to an equirectangular image using the equirectangular projection scheme. The equirectangular projection scheme may be used to capture a wide viewing angle that cannot be captured with an existing rectilinear projection image, for example, in a 360-degree camera. When a fisheye lens image with a viewing angle of 180 degrees or more is converted using the equirectangular projection scheme, distortion may occur in which upper and lower portions of the image are greatly stretched left and right.
Also, the converter 120 may perform an image rectification for rotating the first fisheye image and the second fisheye image to be matched in a direction of a meridian prior to converting the first fisheye image and the second fisheye image using the equirectangular projection scheme. When the image rectification is performed, a comparison between the two images, that is, the first fisheye image and the second fisheye image may be simplified to a 1D search, which may lead to improving a speed of stereo matching.
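As an illustration of the conversion described above, the following Python sketch remaps a fisheye image to an equirectangular image in which meridians become vertical lines. It assumes an equidistant fisheye model (r = f·θ) and nearest-neighbor sampling; the function name, parameters, and the choice of image-axis conventions are illustrative assumptions, not taken from the disclosure.

```python
import numpy as np

def fisheye_to_equirect(img, fov_deg=220.0, out_h=256, out_w=512):
    """Remap an equidistant-model fisheye image to an equirectangular image.

    Assumes the optical axis is +Z, output rows index latitude and output
    columns index longitude, so meridians map to vertical lines.
    """
    h, w = img.shape[:2]
    cx, cy = (w - 1) / 2.0, (h - 1) / 2.0
    max_theta = np.radians(fov_deg) / 2.0
    focal = min(cx, cy) / max_theta            # equidistant model: r = focal * theta

    # Longitude spans [-pi, pi), latitude spans [-pi/2, pi/2).
    lon = (np.arange(out_w) / out_w - 0.5) * 2.0 * np.pi
    lat = (np.arange(out_h) / out_h - 0.5) * np.pi
    lon, lat = np.meshgrid(lon, lat)

    # Unit ray for each (lat, lon); +Z is the camera's forward direction.
    x = np.cos(lat) * np.sin(lon)
    y = np.sin(lat)
    z = np.cos(lat) * np.cos(lon)

    theta = np.arccos(np.clip(z, -1.0, 1.0))   # angle from the optical axis
    phi = np.arctan2(y, x)                     # angle around the optical axis
    r = focal * theta

    # Image v axis is assumed to point toward positive latitude.
    u = cx + r * np.cos(phi)
    v = cy + r * np.sin(phi)

    out = np.zeros((out_h, out_w) + img.shape[2:], dtype=img.dtype)
    valid = (theta <= max_theta) & (u >= 0) & (u < w) & (v >= 0) & (v < h)
    out[valid] = img[v[valid].astype(int), u[valid].astype(int)]
    return out
```

In practice a lookup-table remap (for example, a precomputed map fed to an interpolating resampler) would be used instead of nearest-neighbor sampling; the sketch only shows the geometry.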
The processing unit 130 extracts a depth or a depth map of the subject captured through the first fisheye lens and the second fisheye lens by performing stereo matching on fisheye images, that is, a first equirectangular image and a second equirectangular image converted by the converter 120 using the equirectangular projection scheme, in a height direction or a vertical direction.
Here, referring to FIG. 4 , two fisheye lenses or fisheye lens cameras are vertically provided to face the same direction. Therefore, if the spherical coordinate system is used, an epipolar line matches a longitudinal line (a meridian) of a spherical surface, and if a fisheye lens image is converted using the equirectangular projection scheme, the longitudinal line is expressed as a vertical line in the image. Therefore, the processing unit 130 may perform efficient stereo matching by conducting a search along the vertical line.
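The vertical 1-D search described above can be sketched as a minimal block-matching routine. It assumes grayscale equirectangular images given as NumPy arrays and uses a brute-force sum-of-absolute-differences (SAD) cost; the function name, window size, and search range are illustrative assumptions rather than the matcher of the disclosure.

```python
import numpy as np

def match_vertical(top, bot, max_disp=16, patch=3):
    """Per-pixel vertical block matching between two equirectangular images.

    Because epipolar lines coincide with the vertical lines of the
    equirectangular images, the search for each pixel is a 1-D scan along
    its own column. Returns an integer disparity map in pixels.
    """
    h, w = top.shape
    half = patch // 2
    disp = np.zeros((h, w), dtype=np.int32)
    for y in range(half, h - half - max_disp):
        for x in range(half, w - half):
            ref = top[y - half:y + half + 1, x - half:x + half + 1].astype(np.int64)
            best, best_d = None, 0
            for d in range(max_disp):              # scan down the same column
                cand = bot[y + d - half:y + d + half + 1,
                           x - half:x + half + 1].astype(np.int64)
                cost = np.abs(ref - cand).sum()    # SAD matching cost
                if best is None or cost < best:
                    best, best_d = cost, d
            disp[y, x] = best_d
    return disp
```

A production matcher would use a vectorized or semi-global cost aggregation; the point here is only that the search axis is the image column.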
Referring to FIG. 5 , for images captured through an upper fisheye lens and a lower fisheye lens, the processing unit 130 may extract or calculate a depth d of a subject from an angular disparity, that is, φ1−φ0, between images converted using the equirectangular projection scheme. The depth d may be extracted or calculated according to the following Equation 1.
d = B·cos φ1/sin(φ1−φ0)  [Equation 1]
Here, although the angular disparity is the same, the depth d varies according to φ1. Also, the depth d cannot be obtained in the upward and downward directions corresponding to the direction of the baseline B, for example, when φ1=−90 degrees or 90 degrees.
Here, referring to FIG. 6 , when a direction in which a camera faces is a Z axis and a direction of a baseline is a Y axis, a depth of an angular disparity with the same size may vary.
In existing stereo with a limited viewing angle, calculation is performed through approximation based on the fact that a depth (or a distance) and a size of a disparity are in inverse proportion. However, when using a wide viewing angle, the depth and the angular disparity have a complex relationship as in Equation 1, which needs to be considered.
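A direct implementation of Equation 1 illustrates this relationship: the same angular disparity maps to different depths depending on φ1, and the depth collapses toward the baseline direction. The helper name and the degree-based interface are illustrative assumptions.

```python
import math

def depth_from_angular_disparity(baseline_m, phi1_deg, phi0_deg):
    """Depth of a point from the angular disparity between the two views,
    following Equation 1: d = B * cos(phi1) / sin(phi1 - phi0)."""
    phi1 = math.radians(phi1_deg)
    phi0 = math.radians(phi0_deg)
    denom = math.sin(phi1 - phi0)
    if abs(denom) < 1e-12:
        raise ValueError("zero angular disparity: depth is unobtainable")
    return baseline_m * math.cos(phi1) / denom

# The same 2-degree disparity yields different depths at phi1 = 2 degrees
# and phi1 = 32 degrees, and cos(phi1) drives the depth toward zero as
# phi1 approaches +/-90 degrees (the baseline direction).
```

This is why the approximation "depth is inversely proportional to disparity" used in narrow-angle stereo cannot be reused directly at wide viewing angles.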
Also, the processing 130 may obtain an image of the subject having a wide viewing angle that includes a horizontal viewing angle of 180 degrees or more and a vertical viewing angle of maximum 180 degrees using the extracted depth of the subject.
As described above, a stereo camera apparatus according to an example embodiment may obtain a wide viewing angle by arranging two lenses, for example, fisheye lenses in a vertical direction and thereby improving a horizontal viewing angle and a vertical viewing angle. For example, the present disclosure may implement a wide viewing angle RGB-D camera apparatus in which the vertical viewing angle is a maximum of 180 degrees and a horizontal viewing angle is much larger than 180 degrees. Here, in the case of using a 280-degree fisheye lens, a viewing angle having a horizontal viewing angle of 280 degrees and a vertical viewing angle of 180 degrees may be implemented.
Also, with a stereo camera apparatus according to an example embodiment, it is possible to obtain a wide viewing angle by improving a horizontal viewing angle and a vertical viewing angle, to implement an RGB-D camera with a viewing angle wider than that of a human, and to easily detect an object accordingly. For example, referring to FIG. 7 , a stereo camera apparatus according to an example embodiment may easily detect a close object and thus, may detect the whole body of a human within 1 m.
Also, when a stereo camera apparatus according to an example embodiment is attached to a following robot, the stereo camera apparatus may detect a whole body even when a target to be followed is close. Even when the target to be followed changes its travel direction left or right, the target does not deviate from the field of view and thus stable following is possible. Therefore, the present disclosure may be employed not only for a human-computer interface but also for a human-robot interaction and various moving objects.
As described above, the stereo camera apparatus disclosed herein may perform efficient stereo matching between an upper image and a lower image by converting images captured by arranging two fisheye lenses or fisheye lens cameras in a vertical direction or a height direction, using an equirectangular projection scheme.
In the case of an existing stereo camera using a rectilinear projection image, there is no big difference between arranging two cameras horizontally and arranging them vertically. Therefore, the existing stereo camera may arrange the cameras in a direction convenient for mounting, in consideration of a direction of a subject. However, in the case of a wide viewing angle stereo camera using an image converted using the equirectangular projection scheme, a shape of an image distortion may vary according to an installation direction of the camera. Also, referring to FIG. 6 , since the accuracy of a depth obtainable through stereo matching varies, an arrangement of the stereo camera capable of reducing image distortion and obtaining a more accurate depth value for the subject needs to be considered.
In the case of a rectilinear projection image, a straight line in the real world is expressed as a straight line in the image at all times and thus, a distortion does not occur in the image due to the installation direction of the camera. However, in the case of the image converted using the equirectangular projection scheme, a vertical line in the real world is expressed as a straight line, whereas a horizontal line in the real world is expressed as a curved line and a shape of a distortion in the image varies according to the installation direction of the camera.
That is, upright subjects may be expressed with the least distortion in the image converted using the equirectangular projection scheme only when the camera is not tilted left and right.
Therefore, the camera using the image converted through the equirectangular projection scheme may be assumed to be used in a state in which the camera generally faces the front. The wide viewing angle stereo camera in which two lenses are arranged in a height direction benefits from the arrangement in the height direction in consideration of the following three aspects.
First, in stereo with a viewing angle of 180 degrees or more, occlusion occurs between two cameras. For example, a side camera that constitutes the stereo camera is visible in an image. Therefore, two cameras may be arranged in a vertical direction such that a direction of the occlusion may be upward and downward. In a general situation, meaningful subjects are highly likely to be present in the front or in a side direction rather than in a ceiling direction or a floor direction. Therefore, that the direction of the occlusion is upward and downward may be advantageous.
Second, in the case of using an equirectangular projection scheme, an image is expressed in an excessively enlarged manner with getting closer to an upper end and a lower end of the image corresponding to a pole. By arranging two cameras in the vertical direction, a direction in which such distortion occurs may be matched to an upward direction and a downward direction.
Third, in the existing stereo with a limited viewing angle, calculation is performed through approximation based on the fact that a depth and a size of a disparity are in inverse proportion. However, if the viewing angle increases, the depth decreases, even for a disparity of the same size, from the direction the camera faces toward the baseline direction of the stereo camera. Therefore, it is difficult to obtain an accurate depth near the baseline direction. When the two cameras are arranged in the vertical direction, the baseline is directed upward and downward. Therefore, an area with a degraded accuracy may be matched upward and downward.
The stereo camera apparatus of the present disclosure may minimize an area in which a depth is unobtainable by matching a direction in which interference between two cameras or two lenses occurs and a direction in which the depth is unobtainable and may increase a horizontal viewing angle of a depth map by 180 degrees or more. The stereo camera apparatus of the present disclosure may obtain a 280-degree depth map in a horizontal direction in the case of using a 280-degree fisheye lens.
Also, if a stereo camera apparatus according to an example embodiment performs stereo matching by searching an upper camera image based on an image of a lower camera, a hole caused by occlusion is generated in a downward direction of an object. In a stereo camera apparatus arranged in the horizontal direction, the hole is generated on the left or the right of the object. In comparison, the present disclosure may be more advantageous for object detection since the hole, if generated, is generated downward of the object.
FIG. 8 is a flowchart illustrating a depth image processing method of a stereo camera apparatus according to an example embodiment, and also illustrates a flowchart in the stereo camera apparatus of FIGS. 1 to 7 .
Referring to FIG. 8 , in operation S810, the depth image processing method of the stereo camera apparatus receives a first image, for example, a first fisheye image, and a second image, for example, a second fisheye image, of a subject captured through a first lens and a second lens that are arranged in a vertical direction.
When the first fisheye image and the second fisheye image are received in operation S810, the received first fisheye image and second fisheye image are converted using a map projection scheme in operation S820.
Here, in operation S820, the first fisheye image and the second fisheye image may be converted using an equirectangular projection scheme. Also, in operation S820, rotation transformation may be performed to match an epipolar line to a longitudinal line of a spherical surface using a spherical coordinate system.
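The rotation transformation of operation S820 can be sketched as follows: once the baseline direction between the two cameras is known, a single rotation maps it onto the pole axis of the spherical coordinate system, so that epipolar curves coincide with meridians. The Rodrigues-formula construction below is an illustrative sketch under that assumption, not the exact transformation of the disclosure.

```python
import numpy as np

def rectify_rotation(baseline_vec):
    """Rotation matrix that maps the stereo baseline onto the pole (Y) axis,
    so that epipolar curves become meridians of the sphere."""
    b = np.asarray(baseline_vec, dtype=float)
    b = b / np.linalg.norm(b)
    y = np.array([0.0, 1.0, 0.0])
    v = np.cross(b, y)                  # rotation axis (unnormalized)
    s = np.linalg.norm(v)               # sine of the rotation angle
    c = float(b @ y)                    # cosine of the rotation angle
    if s < 1e-12:                       # already aligned or antiparallel
        return np.eye(3) if c > 0 else np.diag([1.0, -1.0, -1.0])
    vx = np.array([[0.0, -v[2], v[1]],
                   [v[2], 0.0, -v[0]],
                   [-v[1], v[0], 0.0]])
    # Rodrigues formula for the rotation taking b to y.
    return np.eye(3) + vx + vx @ vx * ((1.0 - c) / s**2)
```

Applying this rotation to every viewing ray before the equirectangular conversion reduces the subsequent comparison of the two images to the 1-D vertical search mentioned above.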
When the first fisheye image and the second fisheye image are converted using the map projection scheme in operation S820, a depth of the subject is extracted by performing stereo matching on the first fisheye image and the second fisheye image converted using the map projection scheme, in a height direction in operation S830. In operation S840, a depth image of the subject having a wide viewing angle in which a horizontal viewing angle is 180 degrees or more and a vertical viewing angle is a maximum of 180 degrees is obtained using the extracted depth of the subject.
Here, in operation S830, stereo matching may be performed by searching the first image and the second image converted using the map projection scheme along a vertical line corresponding to the longitudinal line.
Although the method of FIG. 8 is described, it is apparent to those skilled in the art that the aforementioned description made above with reference to FIGS. 1 to 7 may apply to each of operations of FIG. 8 .
Also, the stereo camera apparatus may recognize arm and hand motions of a user being captured, a posture of the user, and a surrounding environment, for example, a surrounding terrain and a surrounding object, and may provide such information.
Further description related to the disclosure is made with reference to FIGS. 9 and 10 .
FIG. 9 illustrates a configuration of a first-person vision system based on a stereo camera apparatus according to an example embodiment.
Referring to FIG. 9 , the stereo camera apparatus 100 may operate as a first-person vision system by further including a vision processing unit 910.
In this case, the stereo camera apparatus 100 may be attached to a body portion, such as a head and a torso of the user, further a cap and the like, and may capture a front direction of the user. Since the stereo camera apparatus 100 has a wide viewing angle, the stereo camera apparatus 100 may capture a body portion, such as an arm and a hand of the user, and may also capture a surrounding environment, such as, for example, an object, a building, and a terrain, although the front of the user is captured. Herein, anything captured in the above manner is described as a subject.
When the stereo camera apparatus 100 operates as the first-person vision system, the vision processing unit 910 recognizes a motion of the user and a surrounding environment that includes an object and a terrain present around the user using the depth map of the subject obtained by the stereo camera apparatus 100 and provides the user with the recognized information, that is, information about the motion of the user and the surrounding environment.
The first-person vision system, that is, the stereo camera apparatus 100 may provide a field of view similar to that of a human through, for example, an RGB-D camera apparatus having a wide viewing angle of horizontally 180 degrees or more. Therefore, a hand motion of the user wearing the first-person vision system, that is, the stereo camera apparatus 100 may enter the field of view in all directions. Accordingly, although the stereo camera apparatus 100 captures the front of the user, the vision processing unit 910 may recognize hand and arm motions of the user and a behavior and a posture of the user.
Since a depth map is extracted by the stereo camera apparatus 100, the vision processing unit 910 may detect a collision probability of the user by detecting a surrounding object and may provide the user with information about the detected collision probability. When the user is holding an object with a hand, the vision processing unit 910 may provide information such as whether the hand of the user is behind the object or the object is within the hand of the user, and may also provide information about the surrounding terrain and object through recognition of the surrounding environment in real time.
The stereo camera apparatus 100 may detect hand and arm motions of the user or a close object by further including the vision processing unit 910.
FIG. 10 is a flowchart illustrating an additional operation of a depth image processing method when a stereo camera apparatus according to an example embodiment operates as a first-person vision system, and illustrates an operation additionally performed by the stereo camera apparatus that operates as the first-person vision system of FIG. 9 after operation S840 of the depth image processing method of FIG. 8 .
Referring to FIG. 10 , in operation S1010, after the depth or the depth map of the subject is extracted through operations S830 and S840 of FIG. 8 , the stereo camera apparatus recognizes a motion of the user and a surrounding environment that includes an object and a terrain present around the user using the extracted depth map of the subject and provides the user with the recognized motion and surrounding environment of the user.
Also, the stereo camera apparatus may recognize a human and a surrounding environment that includes an object and a terrain present around the human using the depth map of the subject extracted by the stereo camera apparatus and may detect presence or absence of a collision based on the recognized surrounding environment. When the collision is determined to occur, the stereo camera apparatus may control a human following system by reflecting the collision.
Further description related to the disclosure is made with reference to FIGS. 11 and 12 .
FIG. 11 illustrates a configuration of a human following system based on a stereo camera apparatus according to an example embodiment.
Referring to FIG. 11 , the stereo camera apparatus 100 may operate as the human following system by further including a controller 1110.
In this case, since the stereo camera apparatus 100 has a wide viewing angle, the stereo camera apparatus 100 may capture a human to be followed by the human following system and a surrounding environment, for example, an object, a building and a terrain. Herein, anything captured in the above manner is described as a subject.
When the stereo camera apparatus 100 operates as the human following system, the controller 1110 may recognize a human to be followed and a distance from the human based on a depth map of the subject obtained by the stereo camera apparatus 100 and may control the human following system, for example, the stereo camera apparatus 100 to follow the human at a predetermined distance based on the recognized human and distance from the human.
Here, the controller 1110 may recognize a travel direction of the human and the surrounding environment that includes an object present around the human, a collidable object, and a terrain as well as the human and the distance from the human using the depth map of the subject, may detect presence or absence of a collision of the human following system, that is, the stereo camera apparatus 100, based on the recognized information, that is, the travel direction of the human and the surrounding environment, and may control the human following system, that is, the stereo camera apparatus 100 to follow the human by applying a collision detection result.
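As a sketch of the control logic described above, a simple proportional rule can keep the recognized human at the predetermined distance while stopping when a collision is detected ahead. The function name, gains, and interface are hypothetical illustrations, not the controller 1110 of the disclosure.

```python
def follow_command(distance_m, bearing_rad, target_distance_m=1.0,
                   k_lin=0.8, k_ang=1.5, collision_ahead=False):
    """Velocity command (linear m/s, angular rad/s) for following a human.

    Proportional control on the distance error keeps the predetermined
    following distance; proportional control on the bearing keeps the human
    centered; a detected collision ahead stops the platform.
    """
    if collision_ahead:
        return 0.0, 0.0
    linear = k_lin * (distance_m - target_distance_m)   # close the distance gap
    angular = k_ang * bearing_rad                        # turn toward the human
    return linear, angular
```

Because the wide viewing angle keeps the whole body of the human in view even within 1 m, the bearing input remains available during fast transverse motion, which is what makes such a simple rule workable.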
The human following system, that is, the stereo camera apparatus 100 may provide a field of view similar to that of the human through the stereo camera apparatus 100, for example, an RGB-D camera apparatus having a wide viewing angle in which a horizontal viewing angle is much greater than 180 degrees. Thus, the system may follow the human at a close distance, for example, within 1 m, and may recognize the whole body of the human at that close distance. Therefore, the human following system, that is, the stereo camera apparatus 100 may verify an accurate position of the human and may follow the human even when the human quickly moves not only in a longitudinal direction but also in a transverse direction.
FIG. 12 is a flowchart illustrating an additional operation of a depth image processing method performed when a stereo camera apparatus according to an example embodiment operates as a human following system, and illustrates an operation additionally performed by the stereo camera apparatus that operates as the human following system of FIG. 11 after operation S840 of the depth image processing method of FIG. 8 .
Referring to FIG. 12 , in operation S1210, after the depth or the depth map of the subject is extracted through operations S830 and S840 of FIG. 8 , a controller included in the stereo camera apparatus recognizes a human and a distance from the human based on the extracted depth map of the subject and controls the human following system to follow the human at a predetermined distance based on the recognized human and distance from the human.
Here, in operation S1210, the controller may recognize a travel direction of the human and a surrounding environment that includes an object and a terrain around the human based on the depth map of the subject, may detect presence or absence of a collision based on the travel direction of the human and the surrounding environment, and may control the human following system to follow the human by applying a collision detection result.
The apparatuses described herein may be implemented using hardware components, software components, and/or a combination thereof. For example, the systems, the apparatuses, and the components described herein may be implemented using one or more general-purpose or special purpose computers, such as, for example, a processor, a controller, an arithmetic logic unit (ALU), a digital signal processor, a microcomputer, a field programmable array (FPA), a programmable logic unit (PLU), a microprocessor, or any other device capable of responding to and executing instructions in a defined manner. The processing device may run an operating system (OS) and one or more software applications that run on the OS. The processing device also may access, store, manipulate, process, and create data in response to execution of the software. For purposes of simplicity, the description of a processing device is used as singular; however, one skilled in the art will appreciate that a processing device may include multiple processing elements and/or multiple types of processing elements. For example, a processing device may include multiple processors or a processor and a controller. In addition, different processing configurations are possible, such as parallel processors.
The software may include a computer program, a piece of code, an instruction, or some combinations thereof, for independently or collectively instructing or configuring the processing device to operate as desired. Software and/or data may be permanently or temporally embodied in any type of machine, component, physical equipment, virtual equipment, computer storage medium or device, or a signal wave to be transmitted, to be interpreted by the processing device or to provide an instruction or data to the processing device. The software also may be distributed over network coupled computer systems so that the software is stored and executed in a distributed fashion. The software and data may be stored by one or more computer readable storage media.
The methods according to the above-described example embodiments may be configured in a form of program instructions performed through various computer devices and recorded in non-transitory computer-readable media. The media may also include, alone or in combination with the program instructions, data files, data structures, and the like. The program instructions recorded in the media may be specially designed and configured for the example embodiments or may be known to those skilled in the computer software art and thereby available. Examples of the media include magnetic media such as hard disks, floppy disks, and magnetic tapes; optical media such as CD-ROM and DVDs; magneto-optical media such as floptical disks; and hardware devices that are specially configured to store program instructions, such as read-only memory (ROM), random access memory (RAM), flash memory, and the like. Examples of program instructions include both machine code, such as produced by a compiler, and files containing higher level code that may be executed by the computer using an interpreter. The hardware device may be configured to operate as one or more software modules to perform the operation of the example embodiments or vice versa.
While the example embodiments are described with reference to specific example embodiments and drawings, it will be apparent to one of ordinary skill in the art that various alterations and modifications in form and details may be made in these example embodiments without departing from the spirit and scope of the claims and their equivalents. For example, suitable results may be achieved if the described techniques are performed in a different order, and/or if components in a described system, architecture, device, or circuit are combined in a different manner, or replaced or supplemented by other components or their equivalents.
Therefore, other implementations, other example embodiments, and equivalents of the claims are to be construed as being included in the claims.

Claims (12)

What is claimed is:
1. A stereo camera apparatus comprising:
a receiver configured to receive a first image and a second image of a subject captured through a first lens and a second lens that are provided in a vertical direction;
a converter configured to convert the received first image and second image using a map projection scheme; and
a processing unit configured to extract a depth of the subject by performing stereo matching on the first image and the second image converted using the map projection scheme, in a height direction,
wherein the converter is configured to convert the first image and the second image using an equirectangular projection scheme,
wherein the converter is configured to perform rotation transformation of matching an epipolar line to a longitudinal line of a spherical surface using a spherical coordinate system.
2. The stereo camera apparatus of claim 1, wherein the processing unit is configured to perform stereo matching by searching the first image and the second image converted using the map projection scheme along a vertical line corresponding to the longitudinal line.
3. The stereo camera apparatus of claim 1, wherein each of the first lens and the second lens comprises a fisheye lens.
4. The stereo camera apparatus of claim 3, wherein the first lens and the second lens are fisheye lenses each with a horizontal viewing angle of 180 degrees or more.
5. The stereo camera apparatus of claim 1, wherein the processing unit is configured to obtain an image of the subject having a horizontal wide viewing angle of 180 degrees or more using the extracted depth of the subject.
6. The stereo camera apparatus of claim 1, wherein the receiver comprises an image sensor configured to capture the first image and an image sensor configured to capture the second image, and
each of the image sensor configured to capture the first image and the image sensor configured to capture the second image has a rectangular shape of which a width is greater than a height, and is provided in a width direction to obtain a horizontal viewing angle wider than a vertical viewing angle.
7. The stereo camera apparatus of claim 1, further comprising:
a vision processor configured to recognize a motion of a user and a surrounding environment that comprises an object and a terrain present around the user based on a depth map of the subject extracted from the stereo camera apparatus, and to provide the user with the recognized motion of the user and surrounding environment.
8. The stereo camera apparatus of claim 1, further comprising:
a controller configured to recognize a human and a distance from the human based on a depth map of the subject extracted from the stereo camera apparatus, and to control a human following system to follow the human at a predetermined distance based on the recognized human and distance from the human.
9. The stereo camera apparatus of claim 8, wherein the controller is configured to recognize a travel direction of the human and a surrounding environment that comprises an object and a terrain present around the human based on the depth map of the subject, to detect presence or absence of a collision based on the travel direction of the human and the surrounding environment, and to control the human following system to follow the human by applying a collision detection result.
10. A depth image processing method of a stereo camera apparatus, the depth image processing method comprising:
receiving a first image and a second image of a subject captured through a first lens and a second lens that are provided in a vertical direction;
converting the received first image and second image using a map projection scheme; and
extracting a depth of the subject by performing stereo matching on the first image and the second image converted using the map projection scheme, in a height direction,
wherein the converting includes converting the first image and the second image using an equirectangular projection scheme,
wherein the converting includes performing rotation transformation of matching an epipolar line to a longitudinal line of a spherical surface using a spherical coordinate system.
11. The depth image processing method of claim 10, further comprising:
recognizing a motion of a user and a surrounding environment that comprises an object and a terrain present around the user based on the extracted depth map of the subject, and providing the user with the recognized motion of the user and surrounding environment.
12. The depth image processing method of claim 10, further comprising:
recognizing a human and a distance from the human based on the extracted depth map of the subject, and controlling a human following system to follow the human at a predetermined distance based on the recognized human and distance from the human.
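The geometry recited in claims 1 and 10 — rotating a vertically displaced lens pair so that the epipolar lines coincide with the meridians of an equirectangular (spherical) projection, then triangulating depth from the angular disparity along each meridian — can be illustrated with a short numeric sketch. The function names, baseline, and sample point below are hypothetical illustrations, not values from the patent; the sketch only assumes a spherical camera model with the projection pole aligned to the baseline axis.

```python
import math

def ray_to_equirect(x, y, z):
    """Map a viewing ray to (longitude, polar angle) on a sphere whose
    pole lies along the vertical baseline (the y axis). Every epipolar
    plane contains the baseline, so it projects to a meridian: a line
    of constant longitude."""
    lon = math.atan2(z, x)                   # angle around the baseline axis
    theta = math.atan2(math.hypot(x, z), y)  # polar angle from the baseline axis
    return lon, theta

def range_from_disparity(theta_top, theta_bot, baseline):
    """Spherical-stereo triangulation: range from the top camera given the
    polar angles seen by the two cameras and the baseline b,
    r = b * sin(theta_bot) / sin(theta_top - theta_bot)."""
    return baseline * math.sin(theta_bot) / math.sin(theta_top - theta_bot)

# Synthetic scene point; top camera at the origin, bottom camera 0.1 m below it.
b = 0.1
px, py, pz = 0.5, 0.3, 2.0

lon_top, theta_top = ray_to_equirect(px, py, pz)
lon_bot, theta_bot = ray_to_equirect(px, py + b, pz)

# Rectified as claimed: both views share one meridian (equal longitude),
# so stereo matching reduces to a 1-D search in the height direction.
assert abs(lon_top - lon_bot) < 1e-12

# The triangulated range agrees with the ground-truth distance to the point.
r = range_from_disparity(theta_top, theta_bot, b)
print(round(r, 6), round(math.sqrt(px * px + py * py + pz * pz), 6))
```

Because only the y component of the ray differs between the two cameras, the longitude `atan2(z, x)` is identical in both views, which is exactly why the claimed rotation makes the height-direction search sufficient.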
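Claims 8, 9, and 12 describe controlling a human following system to keep a predetermined distance using the distance recognized from the depth map. A minimal proportional-control sketch of that distance-keeping loop is shown below; the gain, target distance, and speed limit are hypothetical values chosen for illustration, not taken from the patent.

```python
def follow_speed(measured_distance_m, target_distance_m=1.0,
                 gain=0.8, max_speed_mps=1.5):
    """Hypothetical proportional controller: derive a forward speed from
    the error between the depth-map distance to the human and the
    predetermined following distance (positive error -> speed up,
    negative error -> back off), clamped to a speed limit."""
    error = measured_distance_m - target_distance_m
    return max(-max_speed_mps, min(max_speed_mps, gain * error))

print(follow_speed(2.0))   # human 1 m beyond the target distance -> 0.8 m/s forward
print(follow_speed(1.0))   # at the target distance -> 0.0 (hold position)
print(follow_speed(0.5))   # too close -> -0.4 m/s (back off)
```

A deployed controller would add the collision handling of claim 9 (suppressing or redirecting motion when an obstacle lies in the human's travel direction), but the clamp-and-gain structure above is the core of following at a fixed distance.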
US17/405,615 2020-08-20 2021-08-18 Wide viewing angle stereo camera apparatus and depth image processing method using the same Active US11729367B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US18/343,900 US20230344979A1 (en) 2020-08-20 2023-06-29 Wide viewing angle stereo camera apparatus and depth image processing method using the same

Applications Claiming Priority (8)

Application Number Priority Date Filing Date Title
KR1020200104281A KR102310958B1 (en) 2020-08-20 2020-08-20 Wide viewing angle stereo camera apparatus and depth image processing method using the same
KR10-2020-0104281 2020-08-20
KR10-2021-0023553 2021-02-22
KR1020210023553A KR102430274B1 (en) 2021-02-22 2021-02-22 Wide viewing angle stereo camera-based people following system and method therefor
KR1020210023552A KR102430273B1 (en) 2021-02-22 2021-02-22 Wide viewing angle stereo camera-based first person vision system and image processing method using the same
KR10-2021-0023552 2021-02-22
KR1020210097826A KR20230016390A (en) 2021-07-26 2021-07-26 Wide viewing angle stereo camera apparatus and depth image processing method using the same
KR10-2021-0097826 2021-07-26

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US18/343,900 Continuation US20230344979A1 (en) 2020-08-20 2023-06-29 Wide viewing angle stereo camera apparatus and depth image processing method using the same

Publications (2)

Publication Number Publication Date
US20220060677A1 US20220060677A1 (en) 2022-02-24
US11729367B2 true US11729367B2 (en) 2023-08-15

Family

ID=80270038

Family Applications (2)

Application Number Title Priority Date Filing Date
US17/405,615 Active US11729367B2 (en) 2020-08-20 2021-08-18 Wide viewing angle stereo camera apparatus and depth image processing method using the same
US18/343,900 Pending US20230344979A1 (en) 2020-08-20 2023-06-29 Wide viewing angle stereo camera apparatus and depth image processing method using the same

Family Applications After (1)

Application Number Title Priority Date Filing Date
US18/343,900 Pending US20230344979A1 (en) 2020-08-20 2023-06-29 Wide viewing angle stereo camera apparatus and depth image processing method using the same

Country Status (5)

Country Link
US (2) US11729367B2 (en)
EP (1) EP4057626A4 (en)
JP (1) JP7372708B2 (en)
CN (1) CN115191113A (en)
WO (1) WO2022039404A1 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2022088142A (en) * 2020-12-02 2022-06-14 ヤマハ発動機株式会社 Distance recognition system, control method thereof, and marine vessel
US20220321859A1 (en) * 2021-03-25 2022-10-06 Korea Advanced Institute Of Science And Technology Real-time omnidirectional stereo matching method using multi-view fisheye lenses and system thereof
JP2022184139A (en) * 2021-05-31 2022-12-13 キヤノン株式会社 Image processing device, image processing method and program

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20170017700A (en) 2015-08-07 2017-02-15 삼성전자주식회사 Electronic Apparatus generating 360 Degrees 3D Stereoscopic Panorama Images and Method thereof
US20190325644A1 (en) * 2018-04-19 2019-10-24 Microsoft Technology Licensing, Llc Surface reconstruction for environments with moving objects
US20210334569A1 (en) * 2020-04-28 2021-10-28 NextVPU (Shanghai) Co., Ltd. Image depth determining method and living body identification method, circuit, device, and medium

Family Cites Families (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5160640B2 (en) * 2007-06-20 2013-03-13 トムソン ライセンシング System and method for stereo matching of images
JP6353289B2 (en) * 2014-06-23 2018-07-04 株式会社Soken Ranging correction device
KR20180132893A (en) 2016-05-26 2018-12-12 후지필름 가부시키가이샤 A polishing liquid, a method of producing a polishing liquid, a polishing liquid stock solution, and a chemical mechanical polishing method
JP2017220051A (en) * 2016-06-08 2017-12-14 ソニー株式会社 Image processing device, image processing method and vehicle
KR101853269B1 (en) * 2017-04-12 2018-06-14 주식회사 씨오티커넥티드 Apparatus of stitching depth maps for stereo images
KR102454161B1 (en) * 2017-06-15 2022-10-14 삼성전자주식회사 Method and apparatus for processing omnidirectional image
JP6793608B2 (en) * 2017-08-07 2020-12-02 日立オートモティブシステムズ株式会社 Stereo image processing device
ES2702999B2 (en) 2017-09-05 2022-06-28 Univ Valencia Politecnica Porous material for the detection of Candida albicans, diagnostic method that uses it and method of preparing it
KR20190119231A (en) * 2018-04-04 2019-10-22 엘지전자 주식회사 Driving control device improved position correcting function and robot cleaner using the same
CN108888187A (en) * 2018-05-31 2018-11-27 四川斐讯信息技术有限公司 A kind of sweeping robot based on depth camera
KR102345230B1 (en) 2019-08-23 2021-12-30 주식회사 제우스이엔피 Etching composition for silicon nitride and method for preparing the same
KR20210023553A (en) 2019-08-23 2021-03-04 주식회사 에싸 Wood chopping apparatus and method using the same

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20170017700A (en) 2015-08-07 2017-02-15 삼성전자주식회사 Electronic Apparatus generating 360 Degrees 3D Stereoscopic Panorama Images and Method thereof
US20180278916A1 (en) * 2015-08-07 2018-09-27 Samsung Electronics Co., Ltd. Electronic device for generating 360-degree three-dimensional image and method therefor
US20190325644A1 (en) * 2018-04-19 2019-10-24 Microsoft Technology Licensing, Llc Surface reconstruction for environments with moving objects
US20210334569A1 (en) * 2020-04-28 2021-10-28 NextVPU (Shanghai) Co., Ltd. Image depth determining method and living body identification method, circuit, device, and medium

Also Published As

Publication number Publication date
JP2023502239A (en) 2023-01-23
US20230344979A1 (en) 2023-10-26
EP4057626A1 (en) 2022-09-14
US20220060677A1 (en) 2022-02-24
CN115191113A (en) 2022-10-14
WO2022039404A1 (en) 2022-02-24
EP4057626A4 (en) 2023-11-15
JP7372708B2 (en) 2023-11-01

Similar Documents

Publication Publication Date Title
US11729367B2 (en) Wide viewing angle stereo camera apparatus and depth image processing method using the same
CN110582798B (en) System and method for virtual enhanced vision simultaneous localization and mapping
EP3627446B1 (en) System, method and medium for generating a geometric model
US10410089B2 (en) Training assistance using synthetic images
US8824775B2 (en) Robot and control method thereof
US8265425B2 (en) Rectangular table detection using hybrid RGB and depth camera sensors
WO2022078467A1 (en) Automatic robot recharging method and apparatus, and robot and storage medium
US11788845B2 (en) Systems and methods for robust self-relocalization in a visual map
EP3455686A1 (en) Systems and methods for initializing a robot to autonomously travel a trained route
WO2019127518A1 (en) Obstacle avoidance method and device and movable platform
Rituerto et al. Comparison of omnidirectional and conventional monocular systems for visual SLAM
WO2015093130A1 (en) Information processing device, information processing method, and program
JP2013205175A (en) Device, method and program for recognizing three-dimensional target surface
US9734429B2 (en) Method, system and computer program product for detecting an obstacle with a camera
JP6410231B2 (en) Alignment apparatus, alignment method, and computer program for alignment
KR20220039101A (en) Robot and controlling method thereof
Kondaxakis et al. Real-time recognition of pointing gestures for robot to robot interaction
KR102310958B1 (en) Wide viewing angle stereo camera apparatus and depth image processing method using the same
KR102430273B1 (en) Wide viewing angle stereo camera-based first person vision system and image processing method using the same
KR20230016390A (en) Wide viewing angle stereo camera apparatus and depth image processing method using the same
Fu et al. Multi-waypoint visual homing in piecewise linear trajectory
TWI751735B (en) Automatic guided vehicle tracking system and automatic guided vehicle tracking method
CN116612459B (en) Target detection method, target detection device, electronic equipment and storage medium
TAREEN Large Scale 3D Simultaneous Localization and Mapping (LS-3D-SLAM) using Monocular Vision

Legal Events

Date Code Title Description
AS Assignment

Owner name: ARGO VISION INC., KOREA, REPUBLIC OF

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:PARK, KI YEONG;REEL/FRAME:057218/0641

Effective date: 20210817

FEPP Fee payment procedure

Free format text: ENTITY STATUS SET TO UNDISCOUNTED (ORIGINAL EVENT CODE: BIG.); ENTITY STATUS OF PATENT OWNER: SMALL ENTITY

FEPP Fee payment procedure

Free format text: ENTITY STATUS SET TO SMALL (ORIGINAL EVENT CODE: SMAL); ENTITY STATUS OF PATENT OWNER: SMALL ENTITY

AS Assignment

Owner name: ARGOSVISION INC., KOREA, REPUBLIC OF

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:KIM, DONG SUK;REEL/FRAME:057427/0781

Effective date: 20210902

STPP Information on status: patent application and granting procedure in general

Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION

AS Assignment

Owner name: ARGOSVISION INC., KOREA, REPUBLIC OF

Free format text: CORRECTIVE ASSIGNMENT TO CORRECT THE ASSIGNEE'S NAME PREVIOUSLY RECORDED AT REEL: 057218 FRAME: 0641. ASSIGNOR(S) HEREBY CONFIRMS THE ASSIGNMENT;ASSIGNOR:ARGO VISION INC.;REEL/FRAME:058454/0334

Effective date: 20210817

STPP Information on status: patent application and granting procedure in general

Free format text: NON FINAL ACTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER

STPP Information on status: patent application and granting procedure in general

Free format text: PUBLICATIONS -- ISSUE FEE PAYMENT RECEIVED

AS Assignment

Owner name: ARGOSVISION, INC., KOREA, REPUBLIC OF

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:KIM, DONG SUK;REEL/FRAME:064113/0801

Effective date: 20210902

Owner name: ARGOSVISION, INC., KOREA, REPUBLIC OF

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:PARK, KI YEONG;REEL/FRAME:064113/0285

Effective date: 20210817

STCF Information on status: patent grant

Free format text: PATENTED CASE

CC Certificate of correction