WO2023121410A1 - Intelligent through-obstruction 3D imaging system using ultra-wideband electromagnetic sensing to detect objects - Google Patents

Intelligent through-obstruction 3D imaging system using ultra-wideband electromagnetic sensing to detect objects

Info

Publication number
WO2023121410A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
information
activity
representation
depth
Prior art date
Application number
PCT/KR2022/021243
Other languages
English (en)
Inventor
Fikriansyah ADZAKA
Junaidillah Fadlil
Ricky Setiawan
Muhammad Ishlahul HANIF
Wisma Chaerul Karunianto
Dinan Fakhri
Immanuel Catur TRIWIBOWO
Sumaryanto S
Original Assignee
Samsung Electronics Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Samsung Electronics Co., Ltd. filed Critical Samsung Electronics Co., Ltd.
Publication of WO2023121410A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/77Retouching; Inpainting; Scratch removal
    • AHUMAN NECESSITIES
    • A61MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61BDIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00Measuring for diagnostic purposes; Identification of persons
    • A61B5/02Detecting, measuring or recording pulse, heart rate, blood pressure or blood flow; Combined pulse/heart-rate/blood pressure determination; Evaluating a cardiovascular condition not otherwise provided for, e.g. using combinations of techniques provided for in this group with electrocardiography or electroauscultation; Heart catheters for measuring blood pressure
    • A61B5/024Detecting, measuring or recording pulse rate or heart rate
    • AHUMAN NECESSITIES
    • A61MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61BDIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00Measuring for diagnostic purposes; Identification of persons
    • A61B5/05Detecting, measuring or recording for diagnosis by means of electric currents or magnetic fields; Measuring using microwaves or radio waves 
    • A61B5/0507Detecting, measuring or recording for diagnosis by means of electric currents or magnetic fields; Measuring using microwaves or radio waves  using microwaves or terahertz waves
    • AHUMAN NECESSITIES
    • A61MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61BDIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00Measuring for diagnostic purposes; Identification of persons
    • A61B5/103Detecting, measuring or recording devices for testing the shape, pattern, colour, size or movement of the body or parts thereof, for diagnostic purposes
    • A61B5/11Measuring movement of the entire body or parts thereof, e.g. head or hand tremor, mobility of a limb
    • A61B5/1126Measuring movement of the entire body or parts thereof, e.g. head or hand tremor, mobility of a limb using a particular sensing technique
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01SRADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S13/00Systems using the reflection or reradiation of radio waves, e.g. radar systems; Analogous systems using reflection or reradiation of waves whose nature or wavelength is irrelevant or unspecified
    • G01S13/02Systems using reflection of radio waves, e.g. primary radar systems; Analogous systems
    • G01S13/0209Systems with very large relative bandwidth, i.e. larger than 10 %, e.g. baseband, pulse, carrier-free, ultrawideband
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01SRADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S13/00Systems using the reflection or reradiation of radio waves, e.g. radar systems; Analogous systems using reflection or reradiation of waves whose nature or wavelength is irrelevant or unspecified
    • G01S13/02Systems using reflection of radio waves, e.g. primary radar systems; Analogous systems
    • G01S13/50Systems of measurement based on relative movement of target
    • G01S13/52Discriminating between fixed and moving objects or between objects moving at different speeds
    • G01S13/56Discriminating between fixed and moving objects or between objects moving at different speeds for presence detection
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01SRADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S13/00Systems using the reflection or reradiation of radio waves, e.g. radar systems; Analogous systems using reflection or reradiation of waves whose nature or wavelength is irrelevant or unspecified
    • G01S13/66Radar-tracking systems; Analogous systems
    • G01S13/72Radar-tracking systems; Analogous systems for two-dimensional tracking, e.g. combination of angle and range tracking, track-while-scan radar
    • G01S13/723Radar-tracking systems; Analogous systems for two-dimensional tracking, e.g. combination of angle and range tracking, track-while-scan radar by using numerical data
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01SRADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S13/00Systems using the reflection or reradiation of radio waves, e.g. radar systems; Analogous systems using reflection or reradiation of waves whose nature or wavelength is irrelevant or unspecified
    • G01S13/88Radar or analogous systems specially adapted for specific applications
    • G01S13/886Radar or analogous systems specially adapted for specific applications for alarm systems
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01SRADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S13/00Systems using the reflection or reradiation of radio waves, e.g. radar systems; Analogous systems using reflection or reradiation of waves whose nature or wavelength is irrelevant or unspecified
    • G01S13/88Radar or analogous systems specially adapted for specific applications
    • G01S13/887Radar or analogous systems specially adapted for specific applications for detection of concealed objects, e.g. contraband or weapons
    • G01S13/888Radar or analogous systems specially adapted for specific applications for detection of concealed objects, e.g. contraband or weapons through wall detection
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01SRADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S7/00Details of systems according to groups G01S13/00, G01S15/00, G01S17/00
    • G01S7/02Details of systems according to groups G01S13/00, G01S15/00, G01S17/00 of systems according to group G01S13/00
    • G01S7/41Details of systems according to groups G01S13/00, G01S15/00, G01S17/00 of systems according to group G01S13/00 using analysis of echo signal for target characterisation; Target signature; Target cross-section
    • G01S7/415Identification of targets based on measurements of movement associated with the target
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00Three dimensional [3D] modelling, e.g. data description of 3D objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/25Determination of region of interest [ROI] or a volume of interest [VOI]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/62Extraction of image or video features relating to a temporal dimension, e.g. time-based feature extraction; Pattern tracking
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/60Type of objects
    • G06V20/64Three-dimensional objects
    • G06V20/647Three-dimensional objects by matching two-dimensional images to three-dimensional objects
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01SRADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S13/00Systems using the reflection or reradiation of radio waves, e.g. radar systems; Analogous systems using reflection or reradiation of waves whose nature or wavelength is irrelevant or unspecified
    • G01S13/86Combinations of radar systems with non-radar systems, e.g. sonar, direction finder
    • G01S13/867Combination of radar systems with cameras
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01SRADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S13/00Systems using the reflection or reradiation of radio waves, e.g. radar systems; Analogous systems using reflection or reradiation of waves whose nature or wavelength is irrelevant or unspecified
    • G01S13/88Radar or analogous systems specially adapted for specific applications
    • G01S13/881Radar or analogous systems specially adapted for specific applications for robotics
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01SRADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S13/00Systems using the reflection or reradiation of radio waves, e.g. radar systems; Analogous systems using reflection or reradiation of waves whose nature or wavelength is irrelevant or unspecified
    • G01S13/88Radar or analogous systems specially adapted for specific applications
    • G01S13/89Radar or analogous systems specially adapted for specific applications for mapping or imaging
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30196Human being; Person
    • G06T2207/30201Face

Definitions

  • The present invention relates generally to a system and method that utilize information from an imaging sensor and a UWB sensor on devices to perform object recognition, object tracking, activity recognition, vital sign detection, and 3D image reconstruction. These tasks can be performed on-device to decrease latency, and on practically any device as long as it has an image sensor, a UWB sensor, and the capability to perform image processing.
  • 3D image reconstruction technology has advanced tremendously.
  • 3D image modeling has been widely adopted in a broad range of fields and applications such as healthcare, construction, automotive, and many more.
  • New systems and methods are constantly being invented and even combined with the latest technologies.
  • 3D image models have changed how imaging systems are viewed and understood, enabling better visualization and analysis and improving the accuracy, efficiency, and precision of decision making, especially in critical situations.
  • Ultra-wideband (UWB) technology is currently widely applied in positioning and localization applications. It can also be applied to imaging, and currently only a few applications exploit this capability.
  • UWB signals can pass through obstructions while being reflected by impermeable objects such as living bodies and metal. This invention focuses on the 3D reconstruction of such impermeable objects using essential information obtained by synchronizing several apparatuses and sensors, such as cameras and UWB sensors.
  • The main concept of this invention is to utilize multiple apparatuses and sensors, mainly cameras and UWB sensors, to construct a 3D image from a single image/video or multiple images/videos combined with information from UWB sensors.
  • Apparatuses embedded with UWB can connect and synchronize with each other. This way, much information can be gathered, and the 3D image can be reconstructed by combining and synchronizing the gathered information.
  • This enables the reconstruction of a 3D object even in situations where there are obstructions between the apparatuses and the objects, such as a wall or smoke, and even when the object is hidden from view or located in a separate room.
  • This invention reconstructs a 3D image model along with its metadata by utilizing image restoration, correction factors, and body vital signs such as pulse rate, respiration rate, etc.
  • Patent US20110298898A1 discloses a three-dimensional image generating system and method accommodating multi-view imaging.
  • The method used in this prior art may generate corrected depth maps.
  • The maps were generated by merging disparity information associated with a disparity between color images and depth maps. In contrast, the proposed invention focuses on combining multiple sensors, instead of only a depth sensor, to enrich the data, and on utilizing a UWB sensor to perform sensing through obstructions.
  • Patent US10360718B2 discloses method and apparatus for constructing three dimensional model of object.
  • This patent generates a 3D model of the object by mapping textures onto a previously generated surface mesh. That surface mesh was generated by scanning the object along a trajectory around the object, estimating positions of the scanner that respectively correspond to the captured images of the object, refining the estimated positions of the scanner based on at least two locations on the trajectory, and estimating the depth maps corresponding to the refined positions of the scanner.
  • In contrast, the proposed invention focuses on constructing a three-dimensional model of an object by processing one or more photos and utilizing a UWB sensor to perform sensing through obstructions. This eliminates the need to scan the object thoroughly from different angles to reconstruct the 3D model.
  • Patent CN103454630B discloses a method for ultra-wideband three-dimensional imaging based on multi-element transmitting technology. Echoes of the multiple pulse signals are received at a receiving terminal through an array with spatial three-dimensional resolution capability. Matched filtering is applied to the echoes using replicas of the transmitted pulses so that the echo arrival times corresponding to the different transmitted pulses can be separated and extracted. In contrast, our proposed invention focuses on enriching the data by combining multiple sensors, including UWB, to perform sensing through obstructions.
  • Our invention proposes a novel method to reconstruct a 3D model using electromagnetic sensing to detect objects through any obstruction.
  • This invention proposes a novel Intelligent Through-Obstruction 3D Imaging System that utilizes multiple apparatuses and sensors to reconstruct a 3D model of an object using images or videos and data gathered from UWB sensors.
  • This invention also has the ability to recognize, detect, and track objects, and to track the vital signs of living objects.
  • Conventional 3D image reconstruction uses an imaging sensor as the main source of data, namely images and videos.
  • Such reconstruction methods may also require scanning objects from multiple angles and using interpolation methods to create the 3D model.
  • This approach still needs assistance from other devices, such as an infrared (IR) device, to improve the reconstruction.
  • In emergency situations, the speed of response and action is an important factor for survival.
  • Important information such as body vital signs and object class is crucial.
  • This calls for a system that can react quickly and present precise, detailed information.
  • Therefore, this invention proposes a system and method that utilize information from imaging and UWB sensors to perform object recognition, object tracking, activity recognition, and 3D image reconstruction. These tasks can be performed on practically any device as long as it has an image sensor, a UWB sensor, and the capability to perform image processing.
  • This invention has three main features that improve on conventional schemes:
  • First, the present invention extracts useful information from video, images, and UWB data and converts it into feature representations.
  • Second, the present invention estimates depth information by utilizing general and detailed information from the extracted feature representations.
  • Third, the present invention generates accurate 3D image reconstruction by sensing through obstructing objects.
  • FIG. 1 is the general overview of the invention utilizing Feature Extraction, Depth Estimation, Object Recognition, Object Tracking, Activity Recognition, Vital Sign Tracking and 3D Reconstruction methods in accordance with the present invention.
  • FIG. 2 is a sample use case scenario of using through-obstruction 3D reconstruction for smart home monitoring.
  • FIG. 3 is a sample use case scenario of using through-obstruction 3D reconstruction for public place monitoring.
  • FIG. 4 is a sample use case scenario of using through-obstruction 3D reconstruction for disaster rescue.
  • FIG. 5 is a sample use case scenario of using through-obstruction 3D reconstruction for online exam cheating prevention.
  • FIG. 6 is a sample use case scenario of using through-obstruction 3D reconstruction for detecting number of participants and vital sign tracking during a conference call.
  • FIG. 7 is a sample use case scenario of using through-obstruction 3D reconstruction for detecting objects behind a vision-obstructing object.
  • FIG. 8 is the Feature Extraction diagram.
  • FIG. 9 is the general overview of the adaptive learned parameters used in feature fusion.
  • FIG. 10 is the Depth Estimation diagram.
  • FIG. 11 is the Object Recognition diagram.
  • FIG. 12 is the Object Tracking diagram.
  • FIG. 13 is the Activity Recognition diagram.
  • FIG. 14 is the Vital Sign Tracking diagram.
  • FIG. 15 is the 3D Reconstruction diagram.
  • FIG. 16 is an illustration of the system implementation in several situations, such as a moving object, a fully or partially covered object, and overlapping objects.
  • FIG. 17 is the flow diagram of the 3D reconstruction process.
  • FIG. 18 is an illustration of the obstruction prediction area and inpainting process in which an object that was originally not visible becomes visible.
  • FIG. 19 is an illustration of the obstruction prediction area and inpainting process in which an object that was originally visible becomes not visible.
  • FIG. 20 is an illustration of the obstruction prediction area and inpainting process in which objects overlap with each other.
  • The Intelligent Through-Obstruction 3D Imaging System using UWB Electromagnetic Sensing for Object Detection, hereinafter referred to as the Intelligent Through-Obstruction 3D Imaging System, in accordance with the present invention is shown.
  • Most 3D imaging technologies rely mainly on camera inputs. To create an accurate and precise 3D model of an object, such a system needs images of the object from many different angles, which limits the application of 3D imaging to stationary objects.
  • Our proposed system combines camera and UWB sensors, which makes it possible to create a 3D model using only a single image or video.
  • The combination of imaging and UWB sensors opens up the possibility of adoption on any device, such as mobile devices, autonomous devices (robots, drones, CCTV, etc.), or any other device equipped with a camera and UWB as perception sensors.
  • the input component consists of image and signal data.
  • the image data can be retrieved from any device that produces image data.
  • Video is also defined as image data in this system, as it is treated as a sequence of images.
  • The signal data comes from the UWB sensor device.
  • The main component consists of three main processes: the feature extraction process, the electromagnetic sensing process, and the 3D reconstruction process.
  • The output component consists of the reconstructed 3D model and object metadata such as vital sign status, object class, and object activity.
  • The first module is Feature Extraction, which extracts useful information from the image and signal data.
  • The second module is Depth Estimation, which generates a distance map to obtain depth information.
  • The third module is Object Recognition, which recognizes the objects that will later be used for object and vital sign tracking.
  • The fourth module is Object Tracking, which determines the object trajectory.
  • The fifth module is Activity Recognition, which identifies the current activity of the object.
  • The sixth module is Vital Sign Tracking, which provides information about the vital signs of living objects (humans or animals).
  • The last module is 3D Reconstruction, which generates the 3D image model from the given inputs.
  • at least one of the modules may be implemented by at least one processor.
  • each module may be implemented by at least one processor.
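As a concrete illustration of how these modules could fit together, the sketch below composes them into a simple on-device pipeline. The class names, method names, and data containers (FrameInput, Reconstruction, ThroughObstruction3DPipeline) are assumptions introduced here for illustration and do not appear in the patent.

```python
from dataclasses import dataclass, field
from typing import Any, Dict
import numpy as np

@dataclass
class FrameInput:
    image: np.ndarray          # H x W x C image (a video is a sequence of these)
    uwb_packets: np.ndarray    # raw UWB wave packets (n_packets x n_samples)

@dataclass
class Reconstruction:
    model_3d: Any                                            # e.g. mesh or voxel grid
    metadata: Dict[str, Any] = field(default_factory=dict)   # class, activity, vital signs

class ThroughObstruction3DPipeline:
    """Illustrative composition of the seven modules described above."""
    def __init__(self, feature_extractor, depth_estimator, recognizer,
                 tracker, activity_recognizer, vital_tracker, reconstructor):
        self.feature_extractor = feature_extractor
        self.depth_estimator = depth_estimator
        self.recognizer = recognizer
        self.tracker = tracker
        self.activity_recognizer = activity_recognizer
        self.vital_tracker = vital_tracker
        self.reconstructor = reconstructor

    def process(self, frame: FrameInput) -> Reconstruction:
        features = self.feature_extractor(frame.image, frame.uwb_packets)
        depth = self.depth_estimator(features)
        objects = self.recognizer(features, depth)
        trajectories = self.tracker(objects, features)
        activities = self.activity_recognizer(depth, objects)
        vitals = self.vital_tracker(features, objects)
        model_3d = self.reconstructor(features, activities, trajectories)
        return Reconstruction(model_3d=model_3d,
                              metadata={"objects": objects,
                                        "activities": activities,
                                        "vital_signs": vitals})
```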
  • TABLE 1 summarizes the variations and differences among the use cases for the Intelligent Through-Obstruction 3D Imaging System. Referring now to TABLE 1, it describes the various use cases in which this invention can be implemented: when a user applies 3D imaging with UWB electromagnetic sensing in daily life for monitoring and security purposes, and in emergency situations that require fast response and analysis. There are five different use cases to which this invention can be applied.
  • FIG. 2 describes the user scenario of using Intelligent Through-Obstruction 3D Imaging System for smart home monitoring.
  • The system can be implemented as a smart home monitoring system to monitor the surrounding environment from a different room. For example, parents can monitor their children's activities using a smartphone even though they are located in different rooms. It can also be used for elderly care monitoring or other family activities.
  • The most common practice for monitoring activities is to install multiple CCTV cameras in each room. With this invention, CCTV is no longer necessary: it enables real-time monitoring and a continuous activity data feed via UWB once the user activates the system.
  • FIG. 2 shows the initial steps for using the system for smart home monitoring.
  • The process starts with setting up the device, in which the user, referred to as Actor 1, has the option to activate or deactivate features such as capturing images through an obstruction (e.g., a wall), fall detection monitoring, vital signs, etc.
  • The user can start capturing objects located in another room (Room 2) via UWB and, at any time during this process, point the camera at the objects to complete the 3D reconstruction of the objects' surface appearance.
  • After capturing the object, the device will start the feature extraction, depth estimation, and 3D reconstruction processes. Then, the device will show the resulting 3D image and metadata output containing essential information such as object type, position, vital signs, etc. If the user cannot capture the object's surface appearance through the camera, the 3D model will still be available, but it will not have a surface appearance.
  • FIG. 3 describes the user scenario of using Intelligent Through-Obstruction 3D Imaging System for public place security.
  • CCTVs are commonly used in public places such as shopping centers, restaurants, public service offices or public transportation.
  • However, the use of CCTV has several weaknesses, such as the inability to monitor private places like public restrooms and the inability to identify humans who are covered by another object. Therefore, UWB radar can be used as a solution to monitor public spaces more accurately and conveniently.
  • UWB has the ability to scan objects in a closed space, so it can detect the presence of humans in a public restroom without disturbing their privacy.
  • UWB can be combined with CCTV to complement the function, or installed separately.
  • UWB radar works by scanning an object with the emitted signal and then receiving the reflected signal to process the data. The received signal is processed to identify a human or object and its location. After the object detection process succeeds, the tracking process is carried out.
  • The results of object scanning can be processed in various ways as desired, such as detecting violations of social distancing boundaries, counting the number of visitors, monitoring visitors, and making sure there are no visitors hiding in the vicinity.
  • the processed data can also be presented in graphs and charts, making it easier to read and understand.
  • FIG. 4 describes the user scenario of using Intelligent Through-Obstruction 3D Imaging System for disaster rescue.
  • Disasters such as earthquakes, landslides, tsunamis, floods, fires, and volcanic activity can happen at any time. When they do, rescuing survivors can take a lot of effort because of the difficult and chaotic situation. For example, when a building collapses, the rescue team may be unable to reach survivors because of the debris and rubble from the building. In such situations, technology plays an important role in helping rescuers save more survivors.
  • radar is one of the best options that can be relied on to assist the disaster rescue process because of its ability to scan objects.
  • One of the radar types that can be used in this situation is UWB, which has several advantages such as a wide range and low power usage.
  • UWB can be utilized through various tools such as robots or drones.
  • UWB can easily scan objects from the air using drones and make it easier for the rescue team to understand the terrain and the victim's position.
  • UWB can be used to scan the heart rate.
  • Once the robot retrieves information about the victims and recognizes human objects, it becomes easier for the rescue team to prioritize rescues and save more lives.
  • A UWB sensor embedded in a robot or drone can collect and scan data on the surrounding environment within radar range. The object detection process begins when a radar signal is received. The data can be processed in various ways, and the results, such as a localization map, the number of detected humans, and the heart rate of each victim, are displayed on the rescue team's screen.
  • FIG. 5 and FIG. 6 describe the user scenario of using the Intelligent Through-Obstruction 3D Imaging System for conference call monitoring.
  • These scenarios involve a conference call such as an online exam or a court hearing.
  • With 3D reconstruction and through-obstruction technology, the activity of students and witnesses can be monitored using only one device in a conference call environment.
  • The system can monitor a student's hand gestures even when their hands are under the table, without using another camera behind the student. The invigilator can then decide whether the examinee's hand gestures indicate that they are using a phone or opening a book.
  • FIG. 6 describes a situation in which another party might influence the victim's or eyewitness's statement.
  • The system utilizes UWB's properties to detect how many people are in the room and to predict what another person is doing by observing their movements and gestures.
  • The system can also monitor the witness's heart rate and breathing pattern. By processing these data, the system can predict whether the witness is not telling the truth or is feeling nervous. It is critical to assess the witness's situation, since the witness should be free from any external influence during a court hearing.
  • the system will use a camera and UWB device.
  • Many smartphones are also equipped with UWB, so it is safe to assume that most people have easy access to this invention's use cases.
  • The system records a video using the camera to obtain image data of the object.
  • The system will construct a 3D model of the object and the room.
  • UWB supplements the 3D reconstruction with data on the target's full posture and gestures that are outside the camera's view because of the camera's range or vision-obstructing objects such as a table.
  • The full-body posture 3D reconstruction can be used to observe examinee activities that may indicate cheating, such as hand gestures that suggest opening a book under the table.
  • The UWB sensor can also detect whether there is another person in the room. In the case of an online exam, that person could be helping the examinee with the exam; in the case of a court hearing, that person could be threatening the subject into giving a false statement. The target's heart rate and breathing pattern can also be added to the data for use as a lie detector.
  • FIG. 7 describes the user scenario of using Intelligent Through-Obstruction 3D Imaging System for self-driving car.
  • FIG. 7 illustrates the system detecting an object such as a car or a person behind a vision-obstructing object such as a wall, a building, or an inclined road.
  • The system can be paired via UWB with a state-of-the-art self-driving system.
  • the system collects a 3D depth map data using LiDAR, camera, or other image sensor.
  • The UWB sensor fires a signal to detect surrounding objects. This signal can penetrate walls, but it bounces back if it hits a human or metal. Using this behavior, the system can detect a human located behind a wall or building.
  • The system then integrates the object's position with the 3D depth estimation map. With these improvements, the system should be able to respond better to the driving environment.
  • The Feature Extraction module consists of two processing parts, depending on the input.
  • The first part is the image data feature extraction process, which obtains spatial features and spectral features.
  • The image data is first processed in the Image Channel subprocess to encode it for spatial feature extraction.
  • Several image channel representations can be used in this process, such as RGB, CMYK, HSV, or any other channel format.
  • This subprocess adds an abstract representation of the image data.
  • The data from the image channels are fed to Gaussian synthesis to capture each channel's distribution. This subprocess models texture through texture analysis and synthesis using a Gaussian random vector that maps the mean (μ) and standard deviation (σ) parameters onto the input.
  • The next subprocess is the Virtual Image Channel, where the system can add as many layers of abstraction as needed. However, the number of abstraction layers is calculated so that the model is not affected by the dimensionality problem.
  • This subprocess can be seen as a decoding process because it maps back to the encoded image channel information.
  • The fourth subprocess is Spatial Feature Extraction. Its output, the result of the encoding process, can be seen as the spatial feature representation.
  • The fifth subprocess is Dimensionality Reduction, which reduces the dimensionality and captures the most important information from the given input.
  • The dimensionality reduction process can be parametric or non-parametric depending on the size of the input data.
  • the output of the dimensionality reduction process can be seen as the spectral feature.
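One possible reading of this image branch is sketched below: the image is split into channels, virtual channels are synthesized from each channel's Gaussian statistics (mean μ, standard deviation σ), and PCA serves as the non-parametric dimensionality reduction that yields a spectral feature. The number of virtual channels, the use of PCA, and the function names are assumptions for illustration only.

```python
import numpy as np
from sklearn.decomposition import PCA

def extract_image_features(image: np.ndarray, n_virtual: int = 2, n_spectral: int = 8):
    """image: H x W x C array in [0, 1]. Returns (spatial_feature, spectral_feature)."""
    h, w, c = image.shape
    channels = [image[..., i] for i in range(c)]          # Image Channel subprocess

    # Gaussian synthesis: model each channel by its mean/std and draw virtual channels.
    rng = np.random.default_rng(0)
    virtual = []
    for ch in channels:
        mu, sigma = ch.mean(), ch.std() + 1e-8
        for _ in range(n_virtual):
            virtual.append(rng.normal(mu, sigma, size=(h, w)))  # Virtual Image Channel

    stack = np.stack(channels + virtual, axis=-1)          # original + virtual channels

    # Spatial feature: here simply the stacked channel tensor (a CNN could refine it).
    spatial_feature = stack

    # Spectral feature: PCA over per-pixel channel vectors (non-parametric reduction).
    flat = stack.reshape(-1, stack.shape[-1])
    n_comp = min(n_spectral, flat.shape[-1])
    spectral_feature = PCA(n_components=n_comp).fit_transform(flat).reshape(h, w, n_comp)
    return spatial_feature, spectral_feature

# Example usage on a random image:
spatial, spectral = extract_image_features(np.random.rand(64, 64, 3))
```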
  • the second part is feature extraction of the signal data.
  • The UWB signal is transmitted and received in the form of wave packets.
  • The signal reflected from inanimate objects has a constant phase, since the distance travelled between transmission and reception is always the same. However, signals reflected from living or moving objects have varying phases.
  • The human vital activities monitored by this system have unique frequency ranges. The human heart beats at a frequency of 1-3 Hz, while the lungs have a respiration frequency between 0.16 Hz and 0.33 Hz. Thus, the system can identify the waveforms by limiting the processed frequency range to the stated bands.
  • The UWB signal receiver captures the signals reflected by surrounding objects. Since the proposed system monitors living objects and optimizes the depth map, the signals need to be categorized into those from living objects and those from the environment. Signals are selected based on phase difference: when the phase of a signal varies, it can be concluded that the signal was reflected from a living object; when the received signal has a stable phase, it can be concluded that it came from inanimate objects in the surrounding environment. In addition, prior to the Signal Selection subprocess, the system automatically corrects the reference point of the phase calculation, because the system uses phase differences to distinguish living objects from inanimate objects.
  • The system allows users to move around from one location to another.
  • The user's movement causes ambiguity in the received signal and makes it difficult to differentiate signals reflected from living objects and from inanimate objects, because the movement shortens or lengthens the distance traveled by the reflected signal. Therefore, Signal Correction is needed to calculate the magnitude of the movement and the resulting phase shift, allowing for a more accurate signal selection.
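A rough sketch of the Signal Selection and Signal Correction steps is given below, assuming complex-baseband UWB frames arranged as slow time by fast time (range bins). The phase-variance threshold and the range-bin shift used to compensate the sensor's own displacement are illustrative assumptions.

```python
import numpy as np

def correct_and_select(frames: np.ndarray, sensor_shift_m: float,
                       range_res_m: float = 0.05, phase_var_thresh: float = 0.1):
    """frames: complex array (n_slow, n_fast) of received UWB wave packets.

    Returns the motion-compensated frames and a boolean mask over range bins
    marking 'living/moving' reflections."""
    # Signal Correction: shift range bins to undo the sensor/user displacement.
    bin_shift = int(round(sensor_shift_m / range_res_m))
    corrected = np.roll(frames, -bin_shift, axis=1)

    # Signal Selection: phase that varies over slow time -> living/moving object;
    # stable phase -> inanimate background.
    phase = np.unwrap(np.angle(corrected), axis=0)
    living_mask = phase.var(axis=0) > phase_var_thresh
    return corrected, living_mask

# Example with synthetic data: 200 slow-time frames, 128 range bins.
rng = np.random.default_rng(1)
frames = np.exp(1j * rng.normal(0, 0.01, size=(200, 128)))        # mostly static scene
t = np.arange(200) / 20.0                                         # 20 frames per second
frames[:, 40] *= np.exp(1j * 0.5 * np.sin(2 * np.pi * 1.2 * t))   # "vital sign" range bin
_, mask = correct_and_select(frames, sensor_shift_m=0.0)
print(mask[40], mask[10])   # expected: True for the modulated bin, False elsewhere
```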
  • the next subprocesses are done to identify the vital signs of living objects in the surrounding environment.
  • the Band Pass Filter is used to eliminate the waves that are not categorized as vital signs of living objects.
  • The input for this process is the living-object signal resulting from the Signal Selection. This signal is then passed through a band pass filter.
  • The filter ranges used are 1-3 Hz and 0.16-0.33 Hz, i.e., the frequency ranges of the human heart rate and respiratory activity.
  • After the band pass filter, the remaining signal contains information on human vital activity. However, this signal is still unusable as it remains mixed with noise that exists in the same frequency range.
  • Signal restoration is performed using the PCA algorithm to reduce the dimensionality of the signal. After PCA, the signal has the shape of a human heartbeat and can be analyzed to identify the subject's condition. The restored signal is then passed to a low pass filter, which trims the signal and searches for the dominant frequency representing the restored signal used as input to this process. Next, the system calculates the average interval between signal peaks to obtain the rate, in beats per minute (bpm), of the living object tracked by the system.
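The vital-sign chain described above (band-pass filtering, PCA-based restoration, low-pass filtering, and conversion of the average peak interval to a per-minute rate) could look roughly like the SciPy sketch below; the filter orders and the use of a single principal component are assumptions.

```python
import numpy as np
from scipy.signal import butter, filtfilt, find_peaks
from sklearn.decomposition import PCA

def bandpass(x, lo, hi, fs, order=4):
    b, a = butter(order, [lo / (fs / 2), hi / (fs / 2)], btype="band")
    return filtfilt(b, a, x, axis=-1)

def estimate_rate_bpm(living_bins: np.ndarray, fs: float, lo: float, hi: float) -> float:
    """living_bins: (n_bins, n_samples) slow-time signals from range bins tagged as living."""
    filtered = bandpass(living_bins, lo, hi, fs)                     # Band Pass Filter
    restored = PCA(n_components=1).fit_transform(filtered.T)[:, 0]   # Signal Restoration (PCA)
    b, a = butter(4, hi / (fs / 2), btype="low")
    smooth = filtfilt(b, a, restored)                                # Low Pass Filter
    peaks, _ = find_peaks(smooth, distance=int(fs / hi))             # dominant beats
    if len(peaks) < 2:
        return 0.0
    mean_interval_s = np.mean(np.diff(peaks)) / fs                   # average peak interval
    return 60.0 / mean_interval_s                                    # per-minute rate

# Example: a synthetic 1.2 Hz "heartbeat" sampled at 20 Hz for 30 s -> ~72 bpm.
fs, t = 20.0, np.arange(0, 30, 1 / 20.0)
sig = np.sin(2 * np.pi * 1.2 * t) + 0.3 * np.random.randn(2, t.size)
print(round(estimate_rate_bpm(sig, fs, lo=1.0, hi=3.0)))
```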
  • The last two subprocesses are used to obtain the depth of surrounding objects to optimize the 3D reconstruction.
  • The Time of Flight (ToF) Calculation is used to calculate the travel time of each signal received by the system.
  • The ToF is used to determine the distance of the objects that reflect the UWB signals.
  • The last process combines all the distance values to create a depth map of the surrounding environment. After this process, the system produces a depth map giving a rough estimate of the distance to every object around the system user.
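The ToF-to-distance step follows from the round-trip propagation of the pulse, d = c·t/2. The sketch below converts per-direction ToF values into distances for a coarse depth map; how range bins map to viewing directions depends on the antenna layout, which the description does not specify, so that part is left abstract.

```python
import numpy as np

C = 299_792_458.0  # speed of light in m/s

def tof_to_distance(tof_seconds: np.ndarray) -> np.ndarray:
    """Round-trip time of flight -> one-way distance: d = c * t / 2."""
    return C * tof_seconds / 2.0

def build_depth_map(tof_grid: np.ndarray) -> np.ndarray:
    """tof_grid: (n_az, n_el) ToF per scanned direction -> rough depth map in metres."""
    return tof_to_distance(tof_grid)

# Example: an echo arriving after 20 ns corresponds to an object ~3 m away.
print(tof_to_distance(np.array([20e-9])))   # -> [~2.998]
```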
  • The feature fusion is defined as a weighted combination of the image feature representation and the signal feature representation, controlled by learned parameters.
  • Learned parameters in feature fusion are important for dealing with various situations that can reduce the quality of the image and signal data.
  • The learned parameters can change adaptively when the image data is not clear (i.e., low light, the object is outside the frame, the object overlaps with another object, or the living object is difficult to detect) or when the UWB signal data is not clear. For example, if the image data obtained is unclear, feature fusion reduces the image-related parameter value so that the system uses more signal data in the feature extraction process; conversely, when the UWB signal is not clear, the system utilizes more image data by decreasing the signal-related parameter value.
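The exact fusion formula appears in the original figure; a common form consistent with this description is a learned weighted sum of the two modalities, F = w_img·F_img + w_sig·F_sig, with weights that adapt to per-frame quality estimates. The PyTorch sketch below implements that assumed form with a small gating network; it is an illustration, not the patent's definition.

```python
import torch
import torch.nn as nn

class AdaptiveFeatureFusion(nn.Module):
    """Fuse image and UWB feature vectors with learned, quality-adaptive weights."""
    def __init__(self, img_dim: int, sig_dim: int, fused_dim: int):
        super().__init__()
        self.img_proj = nn.Linear(img_dim, fused_dim)
        self.sig_proj = nn.Linear(sig_dim, fused_dim)
        # Maps the two quality scores (image clarity, signal clarity) to two weights.
        self.gate = nn.Sequential(nn.Linear(2, 16), nn.ReLU(), nn.Linear(16, 2))

    def forward(self, f_img, f_sig, quality):
        # quality: (batch, 2) with [image_quality, signal_quality] in [0, 1].
        w = torch.softmax(self.gate(quality), dim=-1)            # adaptive weights
        return w[:, :1] * self.img_proj(f_img) + w[:, 1:] * self.sig_proj(f_sig)

# Example: low image quality pushes the fusion toward the UWB signal feature.
fusion = AdaptiveFeatureFusion(img_dim=128, sig_dim=64, fused_dim=96)
f_img, f_sig = torch.randn(1, 128), torch.randn(1, 64)
out = fusion(f_img, f_sig, quality=torch.tensor([[0.1, 0.9]]))
print(out.shape)   # torch.Size([1, 96])
```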
  • The image/video is separated into its image channels. Each channel is then interpolated using Gaussian synthesis, and new virtual image channels result from this interpolation. The virtual image channels are passed into a network to extract the spatial features. To obtain the spectral features, the system first applies a dimensionality reduction technique to the image; the reduced representation is then passed into a network to extract the spectral features.
  • The UWB signal is received in the form of wave packets and selected based on phase difference.
  • A dynamic phase represents a moving object, while a static phase represents a static object.
  • The signal is corrected based on the displacement of the user's position to distinguish non-stationary signals from stationary signals.
  • the signal is passed through a bandpass filter (BPF) to filter the frequency range of heartbeat (1-3 Hz) and respiration (0.16-0.33 Hz).
  • The signal is restored using the signal restoration process to obtain the heartbeat and respiratory waveforms.
  • The peak interval is calculated using a low-pass filter (LPF) to determine the heart rate (beats per minute) and respiration rate (breaths per minute).
  • The Depth Estimation module is designed to estimate depth from the extracted features.
  • This module is crucial in this invention, because depth information is essential to the 3D reconstruction system; a 3D object will not have a good representation without good depth information.
  • In the Semantic Image Representation submodule, two components are needed to generate the semantic image representation: an up-sampling transformation to encode more general (global) information, and a down-sampling transformation to encode more detailed (local) information.
  • Each encoded representation is then partitioned into several grid granularities in the Automatic Image Partition Grid Granularity submodule, which together represent a more holistic semantic feature representation.
  • The semantic image representation process computes and processes the input from the partition grid granularities to obtain the best representation. This can also be referred to as the first filtering process.
  • The output of the semantic image process is fed into the Depth Calculation submodule, which comprises several subprocesses.
  • The Encoder submodule captures the most important information by squeezing and reducing the amount of data and removing data below some threshold. This can be referred to as the second filtering process.
  • The Segmentation submodule captures segmentation information from the semantic image and splits it into several segments. The goal of the segmentation module is to simplify and change the representation so as to provide a more abstract view.
  • The context extraction submodule obtains pixel-level context. After processing all the features, the depth calculation is performed by a CNN-based model that returns the depth representation of the object.
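A compact sketch of the described depth path, an encoder that squeezes the semantic representation, a segmentation branch, a context branch, and a CNN head that returns a per-pixel depth map, is shown below. The layer sizes and channel counts are placeholders, not the patent's architecture.

```python
import torch
import torch.nn as nn

class DepthCalculation(nn.Module):
    """Encoder -> (segmentation, context) -> CNN depth head, per the module description."""
    def __init__(self, in_ch: int = 16, n_segments: int = 8):
        super().__init__()
        self.encoder = nn.Sequential(                      # second filtering / squeeze
            nn.Conv2d(in_ch, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU())
        self.segmentation = nn.Conv2d(64, n_segments, 1)   # coarse segment logits
        self.context = nn.Conv2d(64, 32, 3, padding=1)     # pixel-based context
        self.depth_head = nn.Sequential(                   # CNN-based depth calculation
            nn.Conv2d(64 + n_segments + 32, 32, 3, padding=1), nn.ReLU(),
            nn.Upsample(scale_factor=4, mode="bilinear", align_corners=False),
            nn.Conv2d(32, 1, 3, padding=1))

    def forward(self, semantic_repr):
        z = self.encoder(semantic_repr)
        seg = self.segmentation(z)
        ctx = self.context(z)
        return self.depth_head(torch.cat([z, seg, ctx], dim=1))   # (B, 1, H, W) depth

# Example: a 16-channel semantic representation of a 64x64 frame.
depth = DepthCalculation()(torch.randn(1, 16, 64, 64))
print(depth.shape)   # torch.Size([1, 1, 64, 64])
```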
  • The Object Recognition module detects and identifies objects given the extracted features as input. The recognized objects are later used in subsequent processes such as vital sign tracking or activity recognition, if they are living objects.
  • The first step in object recognition is data preprocessing, which processes the extracted features and the depth representation information. The goal of data preprocessing is to obtain proposed object regions.
  • The next step is region of interest (RoI) pooling, which extracts features specific to each candidate region. Given the output of the pooling process, a shape- and region-based representation is generated and fed to the classification model, which then predicts the object class.
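A brief sketch of the recognition flow (proposed regions, RoI pooling over a feature map, then a classification head) is shown below using torchvision's roi_align; the feature-map scale, pooling size, and classifier head are assumptions made for illustration.

```python
import torch
import torch.nn as nn
from torchvision.ops import roi_align

class RoIClassifier(nn.Module):
    """RoI pooling over a feature map followed by a small classification head."""
    def __init__(self, feat_ch: int = 64, n_classes: int = 5, pool: int = 7):
        super().__init__()
        self.pool = pool
        self.head = nn.Sequential(
            nn.Flatten(), nn.Linear(feat_ch * pool * pool, 256), nn.ReLU(),
            nn.Linear(256, n_classes))

    def forward(self, feature_map, boxes):
        # boxes: list of (N_i, 4) tensors in (x1, y1, x2, y2) image coordinates.
        rois = roi_align(feature_map, boxes, output_size=self.pool,
                         spatial_scale=feature_map.shape[-1] / 256.0)  # assume 256-px input
        return self.head(rois)                                         # class logits per RoI

# Example: two proposed regions on a 64x64 feature map of a 256x256 image.
feats = torch.randn(1, 64, 64, 64)
proposals = [torch.tensor([[10., 10., 120., 200.], [30., 40., 90., 110.]])]
print(RoIClassifier()(feats, proposals).shape)   # torch.Size([2, 5])
```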
  • The Object Tracking module consists of three submodules.
  • The first submodule is Motion Estimation, which determines motion vectors that describe the transformation from one 2D image to another, usually between adjacent frames in a video sequence.
  • The process starts with input whose object class has already been recognized; this input is fed to the motion estimation submodule.
  • Optical flow extraction is applied to capture the motion of objects between consecutive frames of the sequence. The optical flow is caused by the relative movement between the object and the input device. The output is used by the block tracking selection submodule.
  • The optical flow representation is encoded using an encoder layer to obtain the most important representation.
  • The output of the encoder layer is then fed into an RNN-based model to learn the sequence pattern, while in parallel a decoder model extracts the encoded information.
  • The outputs of the RNN-based model and the decoder model are both passed into motion reconstruction.
  • The output of motion reconstruction contains several candidate object blocks.
  • The second submodule is Block Tracking Selection, which selects the object blocks using probability- or threshold-based selection.
  • The third submodule is Block Tracking Correction, which corrects the output of the object selection by comparing it with user input or manual labels. However, this submodule is optional in real applications because the system may not receive user input or have ground-truth labels.
  • The output of the motion estimation model is the object trajectory, which can be used in the next process of the system.
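A minimal sketch of the motion estimation front end is given below, using OpenCV's Farnebäck dense optical flow between consecutive frames followed by a simple threshold-based block selection; the RNN/decoder refinement and block tracking correction stages are omitted, and the block size and threshold are assumptions.

```python
import numpy as np
import cv2

def candidate_blocks(prev_gray: np.ndarray, curr_gray: np.ndarray,
                     block: int = 16, mag_thresh: float = 1.0):
    """Return (row, col) indices of blocks whose mean flow magnitude exceeds a threshold."""
    flow = cv2.calcOpticalFlowFarneback(prev_gray, curr_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)   # dense optical flow
    mag = np.linalg.norm(flow, axis=-1)                             # per-pixel motion magnitude
    h, w = mag.shape
    selected = []
    for r in range(0, h - block + 1, block):
        for c in range(0, w - block + 1, block):
            if mag[r:r + block, c:c + block].mean() > mag_thresh:   # block tracking selection
                selected.append((r // block, c // block))
    return selected

# Example: a bright square shifted by 5 pixels between two synthetic frames.
f0 = np.zeros((128, 128), np.uint8); f0[40:70, 40:70] = 255
f1 = np.zeros((128, 128), np.uint8); f1[45:75, 45:75] = 255
print(candidate_blocks(f0, f1))   # blocks around the moving square
```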
  • The Activity Recognition module identifies the activity performed by a living object using pose recognition of the object.
  • This process comprises several steps, starting with activity recognition preprocessing given the depth representation and the recognized object class as input.
  • The system performs activity recognition if the object class is identified as a living object.
  • The next process is activity recognition prediction, whose output is the activity of the living object.
  • Activity recognition starts from the depth representation provided by the previous process.
  • The first step removes the background by filtering out irrelevant information in the depth representation.
  • The next step is living object selection, given the object class as input.
  • Activity recognition prediction starts with 3D pose estimation, which uses the depth information to estimate the pose of the objects. After the pose is estimated, the joint angle information of the living object is extracted. A codebook generation method is then applied to obtain the feature representation. Finally, this information is passed into a model to recognize the current activity of the living object.
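A small sketch of the prediction stage is shown below: joint angles are computed from an estimated 3D pose, quantized against a learned codebook (here k-means) to form a histogram feature, and passed to a classifier. The joint triplets, codebook size, and choice of classifier are assumptions, and the 3D pose estimator itself is out of scope.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.neighbors import KNeighborsClassifier

def joint_angles(pose: np.ndarray, triplets) -> np.ndarray:
    """pose: (n_joints, 3) 3D joint positions; each triplet (a, b, c) is the angle at joint b."""
    angles = []
    for a, b, c in triplets:
        v1, v2 = pose[a] - pose[b], pose[c] - pose[b]
        cosang = v1 @ v2 / (np.linalg.norm(v1) * np.linalg.norm(v2) + 1e-8)
        angles.append(np.arccos(np.clip(cosang, -1.0, 1.0)))
    return np.array(angles)

def codebook_histogram(angle_seq: np.ndarray, codebook: KMeans) -> np.ndarray:
    """Quantize per-frame angle vectors and build a normalized codeword histogram."""
    words = codebook.predict(angle_seq)
    hist = np.bincount(words, minlength=codebook.n_clusters).astype(float)
    return hist / max(hist.sum(), 1.0)

# Example with random poses: 10 joints, 3 angle triplets, 30 frames per sequence.
rng = np.random.default_rng(0)
triplets = [(0, 1, 2), (1, 2, 3), (2, 3, 4)]
sequences = [np.stack([joint_angles(rng.normal(size=(10, 3)), triplets)
                       for _ in range(30)]) for _ in range(20)]
codebook = KMeans(n_clusters=8, n_init=10, random_state=0).fit(np.vstack(sequences))
X = np.stack([codebook_histogram(s, codebook) for s in sequences])
y = rng.integers(0, 2, size=20)                      # dummy activity labels
clf = KNeighborsClassifier(n_neighbors=3).fit(X, y)  # final activity recognizer
print(clf.predict(X[:2]))
```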
  • The Vital Sign Tracking module focuses on tracking and monitoring the vital signs of the recognized object. This process is important, especially in emergencies, so that decisions can be made quickly based on vital sign anomalies. First, the outputs of feature extraction and object recognition are used to determine whether the object is a living thing. Then, the heart rate (beats per minute) and respiration rate (breaths per minute) are obtained from feature extraction. These values are compared against thresholds to determine the presence of anomalies. The output of vital sign tracking is metadata in the form of beats and breaths per minute; the metadata also includes heart rate and respiratory anomalies, if any.
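A sketch of the threshold-based anomaly check that produces the vital-sign metadata is given below; the normal ranges used (60-100 beats per minute and 12-20 breaths per minute for a resting adult) are illustrative assumptions rather than values taken from the patent.

```python
from dataclasses import dataclass

@dataclass
class VitalSignMetadata:
    heart_rate_bpm: float
    respiration_bpm: float
    heart_anomaly: bool
    respiration_anomaly: bool

def track_vital_signs(is_living: bool, heart_rate_bpm: float, respiration_bpm: float,
                      hr_range=(60.0, 100.0), rr_range=(12.0, 20.0)):
    """Return vital-sign metadata for living objects, flagging out-of-range values."""
    if not is_living:
        return None                                   # no vital signs for inanimate objects
    return VitalSignMetadata(
        heart_rate_bpm=heart_rate_bpm,
        respiration_bpm=respiration_bpm,
        heart_anomaly=not (hr_range[0] <= heart_rate_bpm <= hr_range[1]),
        respiration_anomaly=not (rr_range[0] <= respiration_bpm <= rr_range[1]))

print(track_vital_signs(True, heart_rate_bpm=48.0, respiration_bpm=16.0))
# -> heart_anomaly=True, respiration_anomaly=False
```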
  • The 3D Reconstruction module takes several inputs from the extracted features, activity recognition, and motion estimation to reconstruct the full 3D representation.
  • The 3D reconstruction module consists of surface reconstruction, object placement, image inpainting, and 3D refinement.
  • The surface of the target 3D model is first reconstructed using the surface reconstruction module.
  • The object activity is included because, in the through-obstruction scenario, the noise level in the depth representation is higher than in the no-obstruction scenario.
  • The output of surface reconstruction is then placed in the image by the object placement module, given the input from the motion estimation module. This approach is important because it enables more accurate positioning of the object.
  • An image inpainting method is then performed by combining both image and signal data, utilizing the pose and activity information, and estimating the obstructed part of the object. This process generates a rendered 3D image that contains all the objects. Finally, a series of 3D refinement methods is applied to obtain a more natural result.
  • The first scenario is when the object is partially covered, but the UWB device is still able to detect the object located behind the obstructing layer or object.
  • The second scenario is when the target object overlaps with other objects and the UWB device is not able to detect the object behind the obstructing layer.
  • The third scenario is when the object moves from being visible to being fully covered by an obstructing object.
  • The last scenario is when the object is fully covered and not visible.
  • In the first scenario, the system interpolates from the visible body and extrapolates from the historical scene.
  • In the second scenario, the system uses extrapolation and relies only on image data. However, this solution is less accurate than the previous one.
  • The third scenario can be solved with Object Tracking and Object Activity.
  • In cases where there are unknown objects (i.e., partially visible or with no image information available), as described in the fourth scenario, a generic 3D model is generated for the inpainting, and the system relies more on Activity Tracking.
  • The system uses the results from various inputs, such as the object activity classification, the extracted features, and the object trajectory output. Based on the extracted features and the pose estimation from the activity representation, the system generates the living-object model in generic form. Given the object trajectory input, the system infers the object position in the image or frame. This is crucial especially when there are multiple objects behind an obstruction, so that the system can identify the location and identity of each object. Next, the system places the model into the image or frame and checks whether all living objects have been placed.
  • Given the object location in the image frame, the system predicts the missing region in the image and reconstructs that missing region based on the four scenarios, referred to as tn, tn+1, tn+2, and tn+3.
  • The inpainting core of this invention consists of two main functions: obstruction area prediction and inpainting of the obstructed area or region.
  • For these functions, the system uses two inputs: image data and UWB data.
  • Each input is encoded separately as a feature representation.
  • These feature representations are concatenated as the input to the inpainting process.
  • The main step in inpainting is decoding the combined feature representation in order to generate the obstructed image region.
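A schematic sketch of this two-branch inpainting core is shown below: the masked image and a UWB-derived map are encoded separately, the feature maps are concatenated, and a decoder fills the obstructed region. The channel counts, single-scale design, and use of a mask channel are simplifications assumed here.

```python
import torch
import torch.nn as nn

class ObstructionInpainter(nn.Module):
    """Encode image and UWB features separately, concatenate, decode the hidden region."""
    def __init__(self, img_ch: int = 3, uwb_ch: int = 1):
        super().__init__()
        def enc(c_in):
            return nn.Sequential(nn.Conv2d(c_in, 32, 3, stride=2, padding=1), nn.ReLU(),
                                 nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU())
        self.img_enc = enc(img_ch + 1)       # image + obstruction mask channel
        self.uwb_enc = enc(uwb_ch)           # UWB depth/occupancy map branch
        self.decoder = nn.Sequential(        # decode concatenated features to RGB
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, img_ch, 3, padding=1), nn.Sigmoid())

    def forward(self, image, mask, uwb_map):
        # mask is 1 where the view is obstructed; the decoder fills those pixels.
        f_img = self.img_enc(torch.cat([image * (1 - mask), mask], dim=1))
        f_uwb = self.uwb_enc(uwb_map)
        out = self.decoder(torch.cat([f_img, f_uwb], dim=1))
        return image * (1 - mask) + out * mask   # keep visible pixels, inpaint the rest

# Example: a 3x64x64 frame with its obstruction mask and a UWB-derived map.
net = ObstructionInpainter()
img, mask, uwb = torch.rand(1, 3, 64, 64), torch.ones(1, 1, 64, 64), torch.rand(1, 1, 64, 64)
print(net(img, mask, uwb).shape)   # torch.Size([1, 3, 64, 64])
```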
  • The system detects a new object "B", and at one point objects "A" and "B" overlap with each other. Since the system can track object trajectories, it can infer that the object on the left is object "A" and the object on the right is object "B". Hence, the inpainting output uses the object model for "A", while the system uses a generic model for object "B" since it has no image information regarding object "B".

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Software Systems (AREA)
  • Veterinary Medicine (AREA)
  • Public Health (AREA)
  • Animal Behavior & Ethology (AREA)
  • Evolutionary Computation (AREA)
  • Surgery (AREA)
  • Pathology (AREA)
  • Molecular Biology (AREA)
  • Heart & Thoracic Surgery (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Databases & Information Systems (AREA)
  • Physiology (AREA)
  • Cardiology (AREA)
  • Electromagnetism (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Computer Graphics (AREA)
  • Geometry (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Radiology & Medical Imaging (AREA)
  • Dentistry (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Image Processing (AREA)

Abstract

Disclosed herein are a system and method for using information from an imaging sensor and an ultra-wideband (UWB) sensor to perform object recognition, object tracking, activity recognition, vital sign detection, and 3D image reconstruction through any obstruction. The present invention uses a deep learning algorithm and a neural network to perform feature extraction on image data and signal data, to identify a depth estimate, to recognize an object and track its activity, and to construct a 3D model of the object even if the object is not visible or is overlapped by other objects.
PCT/KR2022/021243 2021-12-23 2022-12-23 Intelligent through-obstruction 3D imaging system using ultra-wideband electromagnetic sensing to detect objects WO2023121410A1

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
IDP00202112003 2021-12-23
IDP00202112003 2021-12-23

Publications (1)

Publication Number Publication Date
WO2023121410A1 2023-06-29

Family

ID=86903198

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2022/021243 WO2023121410A1 2021-12-23 2022-12-23 Intelligent through-obstruction 3D imaging system using ultra-wideband electromagnetic sensing to detect objects

Country Status (1)

Country Link
WO (1) WO2023121410A1 (fr)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160078636A1 (en) * 2009-07-07 2016-03-17 Trimble Navigation Limited Image-based surface tracking
US20150146964A1 (en) * 2013-11-27 2015-05-28 Industrial Technology Research Institute Inspection apparatus, method, and computer program product for machine vision inspection
JP6482816B2 (ja) * 2014-10-21 2019-03-13 Kddi株式会社 Living body detection device, system, method, and program
US20180211399A1 (en) * 2017-01-26 2018-07-26 Samsung Electronics Co., Ltd. Modeling method and apparatus using three-dimensional (3d) point cloud

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
KIM JISU, LEE DEOKWOO: "Activity Recognition with Combination of Deeply Learned Visual Attention and Pose Estimation", APPLIED SCIENCES, vol. 11, no. 9, pages 4153, XP093074221, DOI: 10.3390/app11094153 *

Similar Documents

Publication Publication Date Title
US10095930B2 (en) System and method for home health care monitoring
Zhang et al. A survey on vision-based fall detection
Cardinaux et al. Video based technology for ambient assisted living: A review of the literature
US11854275B2 (en) Systems and methods for detecting symptoms of occupant illness
CN100556130C (zh) Surveillance system using multiple cameras
Zhang et al. Evaluating depth-based computer vision methods for fall detection under occlusions
KR101839827B1 (ko) Intelligent surveillance system applying recognition of facial feature information (age, gender, worn accessories, facial identification) for distant dynamic objects
CN108898108B (zh) User abnormal behavior monitoring system and method based on a sweeping robot
Humenberger et al. Embedded fall detection with a neural network and bio-inspired stereo vision
CN114241379B (zh) Passenger abnormal behavior recognition method, apparatus and device, and passenger monitoring system
KR20080018642A (ko) Remote emergency monitoring system and method
RU2370817C2 (ru) System and method for tracking an object
CN114359976A (zh) Intelligent security method and apparatus based on person recognition
JP2003518251A (ja) Method and system for detecting an object relative to a surface
KR101446422B1 (ko) Video surveillance system and method
Hung et al. Fall detection with two cameras based on occupied area
Abobeah et al. Wearable RGB Camera-based Navigation System for the Visually Impaired.
Ghidoni et al. A distributed perception infrastructure for robot assisted living
WO2023121410A1 (fr) Intelligent through-obstruction 3D imaging system using ultra-wideband electromagnetic sensing to detect objects
WO2015198284A1 (fr) System and method for describing reality
CN103155002B (zh) Method and device for identifying virtual visual information in images
An et al. Support vector machine algorithm for human fall recognition kinect-based skeletal data
CN116152906A (zh) Image recognition method and apparatus, communication device, and readable storage medium
WO2022022809A1 (fr) Masking device
KR101688910B1 (ko) Face masking method and apparatus using multi-level facial features

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22912043

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE