WO2023060746A1 - Small image multi-object detection method based on super-resolution - Google Patents

Small image multi-object detection method based on super-resolution Download PDF

Info

Publication number
WO2023060746A1
Authority
WO
WIPO (PCT)
Prior art keywords
resolution
resolution image
image
super
target
Prior art date
Application number
PCT/CN2021/138098
Other languages
French (fr)
Chinese (zh)
Inventor
Qin Wenjian (秦文健)
Gao Shuaiqiang (高帅强)
Original Assignee
Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences (中国科学院深圳先进技术研究院)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences (中国科学院深圳先进技术研究院)
Publication of WO2023060746A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformations in the plane of the image
    • G06T3/40Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4053Scaling of whole images or parts thereof, e.g. expanding or contracting based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformations in the plane of the image
    • G06T3/40Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4046Scaling of whole images or parts thereof, e.g. expanding or contracting using neural networks

Definitions

  • the present invention relates to the technical field of natural image processing, and more specifically, to a super-resolution-based multi-target detection method for small images.
  • blind-guiding technology based on deep target detection usually uploads the collected images to a server, trains a network with supervised or semi-supervised methods to process them, and then combines the results with other sensor information to guide the blind.
  • This type of method makes full use of the advantages of deep learning in processing complex images and performs very well in general blind-guiding scenarios.
  • through deep learning, blind-guiding equipment can fairly accurately identify common objects in blind people's daily scenes, such as trash cans, chairs, and people.
  • the detection results of such methods are not satisfactory.
  • current super-resolution techniques generally learn the correspondence between low-resolution and high-resolution images and are divided into image super-resolution, feature-map super-resolution, and target super-resolution: a low-resolution image or feature map is taken as input, a high-resolution image or feature map is output, and the output is compared with the real high-resolution image or feature map.
  • Existing image object detection is usually divided into two categories: one is a two-stage detector, such as Faster R-CNN.
  • the other is a one-stage detector, such as YOLO, SSD.
  • Two-stage detectors have high localization and object recognition accuracy, while one-stage detectors have high inference speed.
  • the existing high-performance target detection algorithm takes a high-resolution image as input and outputs the coordinates and category of the target.
  • the obstacle detection methods of blind guide equipment are divided into traditional non-vision, traditional machine vision and machine vision methods based on deep learning.
  • Traditional non-vision methods use only ultrasonic and infrared sensors; their judgment of obstacles is limited to direction and distance, and their accuracy is low.
  • Traditional machine vision mainly uses pre-written algorithms to recognize target features in images; this approach transfers poorly to new scenes and is not intelligent.
  • the machine vision method based on deep learning learns image features from training datasets, can recognize images of various scenes, and performs target detection with very good results, but it requires high-resolution image acquisition equipment and high-performance transmission and processing hardware. In a wearable blind-guiding scenario, image acquisition and processing must account for power consumption, volume, and weight, and because a low-resolution image contains very little object information, this method struggles to detect obstacles effectively.
  • the purpose of the present invention is to overcome the above defects of the prior art and provide a super-resolution-based method for multi-target detection in small images, which includes: acquiring a first-resolution image of the original scene; using a reversible neural network model to convert the first-resolution image into a second-resolution image for transmission and then restoring it to the first-resolution image, where the resolution of the second-resolution image is lower than that of the first-resolution image; feeding the restored first-resolution image into a trained super-resolution diffusion model, which performs super-resolution reconstruction through a stochastic iterative denoising process and outputs an ultra-high-resolution image; and performing target detection on the ultra-high-resolution image to obtain target recognition information.
  • compared with the prior art, the present invention introduces a super-resolution structure into the blind-guiding auxiliary detection pipeline to enrich image information, and introduces a diffusion probability model that adds the characteristics of high-resolution images, improving obstacle detection accuracy in low-resolution scenarios.
  • Fig. 1 is a flowchart of the super-resolution-based small-image multi-target detection method according to one embodiment of the present invention.
  • Fig. 2 is a schematic diagram of the spatial structure of the super-resolution-based small-image multi-target detection method according to one embodiment of the present invention.
  • Fig. 3 is a network structure diagram of the image scaling module according to one embodiment of the present invention.
  • Fig. 4 is a network structure diagram of the super-resolution module according to one embodiment of the present invention.
  • Fig. 5 is a schematic diagram of the target detection module according to one embodiment of the present invention.
  • the super-resolution-based small-image multi-target detection method provided by the present invention generally includes image acquisition, image scaling, super-resolution (that is, reconstructing a corresponding high-resolution image from a low-resolution image), target detection, and post-processing.
  • the provided super-resolution-based small image multi-target detection method includes the following steps:
  • Step S110 acquiring an original scene image.
  • the original image of the scene is acquired by the camera in the headset and passed to the image scaling module. While acquiring the image, the device records location and status information, such as its height and inclination, so that this can later be processed together with the target location information into cues the blind user can perceive.
  • Step S120 reducing the resolution of the original image, and transmitting the reduced resolution image to the server to restore the original resolution.
  • the original image is input to the scaling module, which outputs a low-resolution image and latent variables; both are transmitted to the server, where the server-side scaling module restores them to the original resolution.
  • Normalizing flow is a powerful generative probabilistic model; here, a reversible neural network learns both the downscaling and upscaling used for image rescaling.
  • Reversible neural networks are used to implement the mapping of implicit parameters to measurable values, which is called the forward process.
  • the reverse process is to obtain the implicit parameters according to the measured values. Since the reversible neural network model is bijective, it can recover high-resolution images with high accuracy after downscaling.
  • the image scaling process is shown in Figure 2 and comprises M1, M2, and M3, where the structure of M1 is shown in Figure 3, M2 is a convolutional feature-extraction network, and M3 consists of P flow-steps, each containing an activation normalization layer (act-norm), a 1×1 convolutional layer (1×1 conv), and an affine coupling layer; y denotes the reduced-resolution image and a denotes the intermediate feature layer.
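The affine coupling layer is what makes each flow-step exactly invertible. A minimal sketch in plain Python follows; the scale and translation branches `s` and `t` are toy stand-ins for the small networks used in practice (any choice of `s` and `t` preserves invertibility, because they only ever read the untouched half of the input):

```python
import math

def s(h):  # scale branch (placeholder for a small conv network)
    return [math.tanh(v) for v in h]

def t(h):  # translation branch (placeholder for a small conv network)
    return [0.5 * v for v in h]

def coupling_forward(x):
    half = len(x) // 2
    x1, x2 = x[:half], x[half:]
    # first half passes through unchanged; second half is scaled and shifted
    y2 = [b * math.exp(si) + ti for b, si, ti in zip(x2, s(x1), t(x1))]
    return x1 + y2

def coupling_inverse(y):
    half = len(y) // 2
    y1, y2 = y[:half], y[half:]
    # y1 == x1, so s(y1) and t(y1) are the same values used in the forward pass
    x2 = [(b - ti) * math.exp(-si) for b, si, ti in zip(y2, s(y1), t(y1))]
    return y1 + x2

x = [0.3, -1.2, 0.8, 2.0]
x_rec = coupling_inverse(coupling_forward(x))
err = max(abs(a - b) for a, b in zip(x, x_rec))
```

This bijectivity is what lets the server-side module recover the original resolution from the low-resolution image and latent variables with high accuracy.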
  • the loss function for training the reversible neural network takes the form L = λ1·ℓ(y*, y) + λ2·ℓ(x, x̂) + λ3·ℓz(z), where:
  • x is the original-resolution input;
  • y is the low-resolution output;
  • z is the latent-variable output;
  • x̂ is the high-resolution image restored from y and z;
  • y* is the low-resolution image obtained from x by bicubic interpolation;
  • ℓ(y*, y) is the pixel loss between y* and y, ℓ(x, x̂) is the pixel loss between x and x̂, and ℓz(z) is the term constraining the latent variable z;
  • λ1, λ2, λ3 are the weights of the corresponding terms.
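Assuming L2 pixel losses and a simple quadratic term pushing the latent variable toward a standard normal prior (the exact norms are not spelled out here), the three-term loss can be sketched as:

```python
def l2(a, b):
    # mean squared error between two flat pixel lists
    return sum((u - v) ** 2 for u, v in zip(a, b)) / len(a)

def rescaling_loss(x, x_rec, y, y_star, z, lam1=1.0, lam2=1.0, lam3=1.0):
    guide = l2(y, y_star)                    # low-res output vs. bicubic y*
    recon = l2(x, x_rec)                     # restored image vs. original x
    latent = sum(v * v for v in z) / len(z)  # assumed regularizer toward N(0, I)
    return lam1 * guide + lam2 * recon + lam3 * latent

# toy usage with flat lists standing in for images
loss = rescaling_loss(x=[0.0], x_rec=[1.0], y=[0.0], y_star=[0.0], z=[0.0])
```

The weights lam1–lam3 correspond to λ1–λ3 above; their values are illustrative defaults, not values taken from the patent.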
  • the image scaling module scales the image to its original size.
  • Step S130 performing super-resolution reconstruction on the restored image to obtain a super-resolution image.
  • the restored output image is super-resolved by a factor of 16 to the high-resolution size using the super-resolution diffusion model; the denoising diffusion probability model performs super-resolution through a stochastic iterative denoising process.
  • the super-resolution model SR3 (Super-Resolution via Repeated Refinement), a conditional denoising diffusion probability model, is used for image super-resolution reconstruction.
  • its working principle is to learn to transform a standard normal distribution into the empirical data distribution through a series of refinement steps.
  • the super-resolution network structure is shown in Figure 4; it uses the U-Net architecture, trained with a denoising objective to iteratively remove noise of various levels from the output.
  • the conditional diffusion probabilistic denoising model generates the target image y 0 in T refinement steps.
  • the model starts from a pure-noise image y T ~ N(0, I) and iteratively refines it through learned conditional transition distributions pθ(y t-1 | y t , x).
  • the distribution of intermediate images in the inference chain is defined according to a forward diffusion process that gradually adds Gaussian noise to the signal via a fixed Markov chain, denoted q(y t | y t-1 ).
  • the goal of the model is to reverse the Gaussian diffusion process by iteratively recovering the signal from the noise via a reverse Markov chain conditioned on x (the low-resolution image).
  • the inverse chain is learned using a denoising model f ⁇ that takes as input a source image and a noisy target image and estimates the noise.
  • the training objective function is set as, for example: E(x,y) Eγ,ϵ ‖ fθ(x, √γ·y0 + √(1−γ)·ϵ, γ) − ϵ ‖ᵖ, with ϵ ~ N(0, I), where:
  • x represents a low-resolution image;
  • y represents a high-resolution image;
  • (x, y) is sampled from the training data set;
  • y 0 represents the original high-resolution image;
  • γ represents the noise scale, sampled from its distribution p(γ);
  • the norm exponent p ∈ {1, 2}: when p is 1, the loss is the L1 loss, and when p is 2, it is the squared L2 loss;
  • T is the total number of diffusion steps;
  • t is the index of the diffusion step;
  • fθ is the conditional diffusion probability denoising model;
  • αt is a hyperparameter with value range 0 < αt < 1, which determines the variance of the noise added in each iteration.
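A toy sketch of one training step for an SR3-style objective of this form; the denoiser `f_theta` is a stub returning zeros (in practice it is the conditioned U-Net of Figure 4), and `alphas` are the per-step noise hyperparameters, so this illustrates the sampling and loss computation rather than the patent's exact network:

```python
import math
import random

def f_theta(x_lowres, y_noisy, gamma):
    # stub noise predictor standing in for the conditioned U-Net
    return [0.0] * len(y_noisy)

def training_step(x_lowres, y0, alphas, p=2):
    t = random.randrange(len(alphas))      # pick a random diffusion step
    gamma = 1.0
    for a in alphas[: t + 1]:              # gamma_t = product of alpha_1..alpha_t
        gamma *= a
    eps = [random.gauss(0.0, 1.0) for _ in y0]
    # corrupt the high-res target: sqrt(gamma)*y0 + sqrt(1-gamma)*eps
    y_noisy = [math.sqrt(gamma) * yi + math.sqrt(1.0 - gamma) * ei
               for yi, ei in zip(y0, eps)]
    eps_hat = f_theta(x_lowres, y_noisy, gamma)  # predict the added noise
    # Lp loss between true and predicted noise (p in {1, 2})
    return sum(abs(e - eh) ** p for e, eh in zip(eps, eps_hat)) / len(eps)

random.seed(0)
loss = training_step(x_lowres=[0.1] * 4, y0=[0.5] * 4, alphas=[0.98] * 10)
```

In a real implementation the loss would then be backpropagated to update the U-Net weights, as described for formula 2.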
  • Step S140 detecting the category and position of the target based on the ultra-high-resolution image.
  • the ultra-high resolution image is input into the target detector, and the category and coordinate information of the target are output.
  • feature pyramids are used to achieve multi-scale object detection.
  • Feature pyramids are a fundamental building block in multi-scale object detection.
  • although high-level features contain rich semantic information, their low resolution makes it difficult to preserve object location information accurately.
  • although low-level features carry less semantic information, their high resolution preserves object location information accurately.
  • the target detection module interpolates the ultra-low-resolution image, concatenates it with the high-resolution image, and feeds both to the feature extraction module; the resulting detections are sorted by weight.
  • Step S150 fusing the target information with the device status information and transforming it into perceivable information.
  • the target information is fused with the device status information by using the post-processing module, and converted into information that blind people can feel.
  • experiment setup is as follows:
  • the low-resolution images of shape (256, 3, 8, 8) are upsampled by a factor of 16 via transposed convolution to (256, 3, 128, 128), then concatenated with the noise images into (256, 6, 128, 128) as the network input.
  • the network loss is obtained by formula 2, and then the gradient is calculated and backpropagated to update the network weights.
  • the inference process is: concatenate the interpolated low-resolution image x with y T and obtain y T-1 from formula 3; similarly, obtain y T-2 from x and y T-1 ; after T iterations, y 0 is obtained.
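The iterative refinement above can be sketched as a standard DDPM-style reverse loop conditioned on the low-resolution image. The update rule below is the usual posterior-mean step and `f_theta` is again a stub noise predictor, so this illustrates the control flow, not the patent's exact formula 3:

```python
import math
import random

def f_theta(x_lowres, y, gamma):
    # stub noise predictor standing in for the conditioned U-Net
    return [0.0] * len(y)

def refine(x_lowres, alphas):
    y = [random.gauss(0.0, 1.0) for _ in x_lowres]  # y_T ~ N(0, I)
    gammas, g = [], 1.0
    for a in alphas:                                # cumulative products gamma_t
        g *= a
        gammas.append(g)
    for t in range(len(alphas) - 1, -1, -1):        # steps T ... 1
        a_t, g_t = alphas[t], gammas[t]
        eps_hat = f_theta(x_lowres, y, g_t)
        # posterior-mean update: remove the predicted noise for this step
        y = [(yi - (1.0 - a_t) / math.sqrt(1.0 - g_t) * ei) / math.sqrt(a_t)
             for yi, ei in zip(y, eps_hat)]
        if t > 0:                                   # add fresh noise except at t = 0
            sigma = math.sqrt(1.0 - a_t)
            y = [yi + sigma * random.gauss(0.0, 1.0) for yi in y]
    return y                                        # y_0, the super-resolved estimate

random.seed(0)
y0 = refine(x_lowres=[0.0] * 4, alphas=[0.95] * 5)
```

Each iteration would, in practice, concatenate the interpolated low-resolution image with the current estimate as the network input, as described above.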
  • the interpolated low-resolution image x and y 0 are concatenated and input into the target detector to obtain two sets of target positions and categories; after weighted sorting, non-maximum suppression is applied to obtain the final result.
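The weighted sorting followed by non-maximum suppression can be sketched as follows; the per-source weights `w_sr` and `w_lr` and the IoU threshold are illustrative assumptions, not values from the patent:

```python
def iou(a, b):
    # intersection-over-union of two (x1, y1, x2, y2) boxes
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0

def fuse_and_nms(dets_sr, dets_lr, w_sr=1.0, w_lr=0.5, thresh=0.5):
    # each detection is (box, score, class); weight the two sources, then sort
    pool = [(b, s * w_sr, c) for b, s, c in dets_sr]
    pool += [(b, s * w_lr, c) for b, s, c in dets_lr]
    pool.sort(key=lambda d: d[1], reverse=True)
    keep = []
    for box, score, cls in pool:
        # greedy NMS: drop a box that overlaps a kept box of the same class
        if all(kc != cls or iou(box, kb) < thresh for kb, _, kc in keep):
            keep.append((box, score, cls))
    return keep

sr = [((0, 0, 10, 10), 0.9, "chair")]
lr = [((1, 1, 10, 10), 0.8, "chair"), ((50, 50, 60, 60), 0.7, "person")]
result = fuse_and_nms(sr, lr)
```

Here the duplicate chair box from the low-resolution branch is suppressed, while the non-overlapping person detection survives.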
  • the present invention super-resolves the low-resolution image with the diffusion probability model, achieving 16-fold upscaling from an ultra-low-resolution image (e.g., as small as 8×8 pixels) to a high-resolution image (e.g., 128×128 pixels);
  • the high-resolution image is then processed by the target detection module, which solves the poor robustness and low accuracy of target detection in low-resolution scenarios faced by blind-guiding technology, and reduces the power consumption of the device.
  • the present invention designs a super-resolution-based small-image multi-target detection method, which addresses the degradation of obstacle detection in blind-guiding technology under ultra-low-resolution scenarios. Image scaling technology downscales the original image to a low-resolution image for low-cost transmission and then restores it to a high-quality original image; diffusion-probability-model-based image super-resolution then enables target detection on low-resolution images of blind people's daily scenes, providing a solution for existing blind-guiding technology; at the same time, both low-resolution and high-resolution image information are used to improve detection accuracy.
  • the present invention uses lower-resolution images as the original input, so the blind-guiding device can use a low-resolution camera; applying image scaling technology reduces the amount of data transmitted, lowering power consumption and device volume during data transmission, so that the device can work for a long time and the burden on users is reduced.
  • the present invention can be a system, method and/or computer program product.
  • a computer program product may include a computer readable storage medium having computer readable program instructions thereon for causing a processor to implement various aspects of the present invention.
  • a computer readable storage medium may be a tangible device that can retain and store instructions for use by an instruction execution device.
  • a computer readable storage medium may be, for example, but is not limited to, an electrical storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing.
  • A non-exhaustive list of computer-readable storage media includes: portable computer diskettes, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), static random access memory (SRAM), compact disc read-only memory (CD-ROM), digital versatile discs (DVD), memory sticks, floppy disks, mechanically encoded devices such as punch cards or raised structures in grooves with instructions stored thereon, and any suitable combination of the above.
  • computer-readable storage media are not to be construed as transient signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through waveguides or other transmission media (e.g., pulses of light through fiber optic cables), or transmitted electrical signals.
  • Computer-readable program instructions described herein may be downloaded from a computer-readable storage medium to a respective computing/processing device, or downloaded to an external computer or external storage device over a network, such as the Internet, a local area network, a wide area network, and/or a wireless network.
  • the network may include copper transmission cables, fiber optic transmission, wireless transmission, routers, firewalls, switches, gateway computers, and/or edge servers.
  • a network adapter card or network interface in each computing/processing device receives computer-readable program instructions from the network and forwards them for storage in a computer-readable storage medium within that device.
  • Computer program instructions for carrying out operations of the present invention may be assembly instructions, instruction set architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state-setting data, or source or object code written in any combination of one or more programming languages, including object-oriented languages such as Smalltalk, C++, or Python, and conventional procedural languages such as the "C" language or similar programming languages.
  • Computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server.
  • the remote computer can be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or the connection can be made to an external computer (for example, through the Internet using an Internet service provider).
  • in some embodiments, an electronic circuit, such as a programmable logic circuit, field-programmable gate array (FPGA), or programmable logic array (PLA), can execute the computer-readable program instructions to implement aspects of the present invention.
  • These computer-readable program instructions may be provided to a processor of a general-purpose computer, special-purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, when executed by the processor of the computer or other programmable data processing apparatus, create an apparatus for implementing the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.
  • These computer-readable program instructions may also be stored in a computer-readable storage medium and cause computers, programmable data processing apparatuses, and/or other devices to work in a specific way, so that the computer-readable medium storing the instructions comprises an article of manufacture including instructions that implement aspects of the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.
  • each block in a flowchart or block diagram may represent a module, a program segment, or a portion of instructions, which comprises one or more executable instructions for implementing the specified logical functions.
  • the functions noted in the block may occur out of the order noted in the figures. For example, two blocks in succession may, in fact, be executed substantially concurrently, or they may sometimes be executed in the reverse order, depending upon the functionality involved.
  • each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or actions, or by a combination of dedicated hardware and computer instructions. It is well known to those skilled in the art that implementation by hardware, implementation by software, and implementation by a combination of software and hardware are all equivalent.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

A small image multi-object detection method based on super-resolution. The method comprises: acquiring a first-resolution image of an original scene (S110); converting the first-resolution image into a second-resolution image by using a reversible neural network model, then transmitting the second-resolution image, and then restoring the second-resolution image to the first-resolution image, wherein the resolution of the second-resolution image is lower than that of the first-resolution image (S120); inputting the first-resolution image obtained after restoration into a trained super-resolution diffusion model, executing super-resolution reconstruction by means of a random iterative denoising process, and outputting an ultra-high-resolution image (S130); and executing object detection on the ultra-high-resolution image, so as to obtain object identification information (S140). By means of the method, the obstacle detection precision in a low-resolution scenario is improved, and a blind guidance device can operate for a long time, thereby alleviating the burden of a user.

Description

A Method of Multi-Target Detection in Small Images Based on Super-Resolution

Technical Field

The present invention relates to the technical field of natural image processing and, more specifically, to a super-resolution-based multi-target detection method for small images.

Background Art
At present, travel involves many inconveniences for the visually impaired. Intelligent blind-guiding design not only helps them identify obstacles better when traveling, but also brings great convenience to their daily lives. With the explosion of artificial intelligence, deep learning and convolutional neural networks have enabled computer vision to gradually supersede traditional blind-guiding technology that relies on ultrasound and similar sensors for obstacle avoidance, solving obstacle detection problems that were previously complex and hard to handle.

In the existing technology, blind-guiding based on deep target detection usually uploads the collected images to a server, trains a network with supervised or semi-supervised methods to process them, and then combines the results with other sensor information to guide the blind. This type of method makes full use of the advantages of deep learning in processing complex images and performs very well in general blind-guiding scenarios. Through deep learning, blind-guiding equipment can fairly accurately identify common objects in blind people's daily scenes, such as trash cans, chairs, and people. For low-resolution scenes, however, the detection results of such methods are unsatisfactory. Most vision-based blind-guiding technologies train networks on high-resolution color images, but device constraints make it difficult to collect high-resolution image information, or detection on high-resolution images demands substantial computing power and time. In low-resolution scenes, the target features of an image lose much of their effectiveness: the image contains very little information, and object contours and categories are hard to identify.

Current super-resolution techniques generally learn the correspondence between low-resolution and high-resolution images and are divided into image super-resolution, feature-map super-resolution, and target super-resolution: a low-resolution image or feature map is taken as input, a high-resolution image or feature map is output, and the output is compared with the real high-resolution image or feature map.

Existing image target detection is usually divided into two categories: two-stage detectors, such as Faster R-CNN, and one-stage detectors, such as YOLO and SSD. Two-stage detectors have higher localization and object recognition accuracy, while one-stage detectors have higher inference speed. Existing high-performance target detection algorithms take a high-resolution image as input and output the coordinates and categories of the targets.

In general, the obstacle detection methods of blind-guiding equipment are divided into traditional non-vision methods, traditional machine vision, and deep-learning-based machine vision. Traditional non-vision methods use only ultrasonic and infrared sensors; their judgment of obstacles is limited to direction and distance, and their accuracy is low. Traditional machine vision mainly uses pre-written algorithms to recognize target features in images; this approach transfers poorly and is not intelligent. Deep-learning-based machine vision learns image features from training datasets, can recognize images of various scenes, and performs target detection with very good results, but it requires high-resolution image acquisition equipment and high-performance transmission and processing hardware. In a wearable blind-guiding scenario, image acquisition and processing must account for power consumption, volume, and weight, and because a low-resolution image contains very little object information, this method struggles to detect obstacles effectively.
Summary of the Invention

The purpose of the present invention is to overcome the above defects of the prior art and provide a super-resolution-based method for multi-target detection in small images, which includes: acquiring a first-resolution image of the original scene; using a reversible neural network model to convert the first-resolution image into a second-resolution image for transmission and then restoring it to the first-resolution image, where the resolution of the second-resolution image is lower than that of the first-resolution image; feeding the restored first-resolution image into a trained super-resolution diffusion model, which performs super-resolution reconstruction through a stochastic iterative denoising process and outputs an ultra-high-resolution image; and performing target detection on the ultra-high-resolution image to obtain target recognition information.

Compared with the prior art, the present invention introduces a super-resolution structure into the blind-guiding auxiliary detection pipeline to enrich image information, and introduces a diffusion probability model that adds the characteristics of high-resolution images, improving obstacle detection accuracy in low-resolution scenarios.

Other features and advantages of the present invention will become apparent from the following detailed description of exemplary embodiments with reference to the accompanying drawings.
Brief Description of the Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and, together with the description, serve to explain its principles.

Fig. 1 is a flowchart of the super-resolution-based small-image multi-target detection method according to one embodiment of the present invention;

Fig. 2 is a schematic diagram of the spatial structure of the super-resolution-based small-image multi-target detection method according to one embodiment of the present invention;

Fig. 3 is a network structure diagram of the image scaling module according to one embodiment of the present invention;

Fig. 4 is a network structure diagram of the super-resolution module according to one embodiment of the present invention;

Fig. 5 is a schematic diagram of the target detection module according to one embodiment of the present invention.
Detailed Description of Embodiments
Various exemplary embodiments of the present invention will now be described in detail with reference to the accompanying drawings. It should be noted that, unless otherwise specified, the relative arrangement of components and steps, the numerical expressions, and the numerical values set forth in these embodiments do not limit the scope of the present invention.
The following description of at least one exemplary embodiment is merely illustrative and in no way limits the invention, its application, or its uses.
Techniques, methods, and devices known to those of ordinary skill in the relevant art may not be discussed in detail, but where appropriate such techniques, methods, and devices should be regarded as part of the specification.
In all examples shown and discussed herein, any specific value should be construed as merely exemplary rather than limiting; other examples of the exemplary embodiments may therefore have different values.
It should be noted that similar reference numerals and letters denote similar items in the following figures; once an item has been defined in one figure, it need not be discussed further in subsequent figures.
The super-resolution-based small-image multi-target detection method provided by the present invention comprises, as a whole, image acquisition, image scaling, super-resolution (i.e., reconstructing a corresponding high-resolution image from a low-resolution image), target detection, and post-processing.
Specifically, as shown in Fig. 1 and Fig. 2, the provided super-resolution-based small-image multi-target detection method includes the following steps.
Step S110: acquire an original scene image.
For example, the original image of the scene is captured by a camera in a head-mounted device and passed to the image scaling module. While the image is acquired, position and state information of the device, such as its height and inclination, is recorded so that it can later be combined with the target position information and converted into information perceivable by a blind user.
Step S120: reduce the resolution of the original image, transmit the reduced-resolution image to a server, and restore it to the original resolution.
In this step, the original image is input into the scaling module, which outputs a low-resolution image together with latent variables; both are transmitted to the server, whose scaling module restores the low-resolution image and the latent variables to the original resolution. Reducing the image resolution lowers bandwidth and latency, and thereby reduces transmission cost.
For example, normalizing flows are powerful generative probabilistic models; an invertible neural network is used to learn both the downscaling and the upscaling for image rescaling. The invertible neural network implements a mapping from latent parameters to measurable values, called the forward process; the inverse process recovers the latent parameters from the measured values. Because the invertible neural network model is bijective, a high-resolution image can be recovered with high accuracy after downscaling.
The image scaling process is illustrated in Fig. 2 and comprises modules M1, M2, and M3, where the structure of M1 is shown in Fig. 3, M2 is a convolutional feature-extraction network, and M3 consists of P flow-steps, each containing an activation normalization layer (ActNorm), a 1×1 convolutional layer (1×1 conv), and an affine coupling layer; y denotes the reduced-resolution image and a denotes the intermediate feature layer.
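As an illustrative sketch (not the patent's trained network), the affine coupling layer that makes each flow-step invertible can be written as a pair of exactly inverse transforms; the scale s and shift t, which in practice would come from a small sub-network conditioned on one half of the input, are fixed hypothetical arrays here:

```python
import numpy as np

# One affine coupling step: the input is split into halves (x1, x2);
# x2 is rescaled and shifted by quantities computed from x1, so the
# transform is exactly invertible.

def coupling_forward(x1, x2, s, t):
    y2 = x2 * np.exp(s) + t   # affine transform of the second half
    return x1, y2             # first half passes through unchanged

def coupling_inverse(y1, y2, s, t):
    x2 = (y2 - t) * np.exp(-s)  # undo the affine transform exactly
    return y1, x2

x1 = np.array([0.5, -1.0])
x2 = np.array([2.0, 3.0])
s = np.array([0.1, -0.2])   # hypothetical scale
t = np.array([0.3, 0.0])    # hypothetical shift

y1, y2 = coupling_forward(x1, x2, s, t)
r1, r2 = coupling_inverse(y1, y2, s, t)
assert np.allclose(r1, x1) and np.allclose(r2, x2)
```

Because every layer of M3 is invertible in this way, the server-side scaling module can reconstruct the high-resolution input from the low-resolution image and the latent variables.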
In one embodiment, the loss function for training the invertible neural network is set to:
L = λ₁·L_pix(y*, y) + λ₂·L_pix(x, x_{τ-1}) + λ₃·R(z)    (Formula 1)
where x is the original-resolution input, y is the low-resolution output, z is the latent-variable output, x_{τ-1} is the high-resolution image restored from y and z, and y* is the low-resolution image obtained from x by bicubic interpolation; L_pix(y*, y) is the pixel loss between y* and y, L_pix(x, x_{τ-1}) is the pixel loss between x and x_{τ-1}, R(z) is the regularization of the latent variable z, and λ₁, λ₂, λ₃ are the weights of the corresponding terms.
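A minimal sketch of the training loss described above, under the assumption that both pixel losses are L1 and the latent regularization is squared L2 (the source text does not fix the exact norms), follows:

```python
import numpy as np

# Hypothetical weighting of the three terms: low-resolution guidance,
# high-resolution reconstruction, and latent regularization.

def rescaling_loss(x, x_restored, y, y_bicubic, z,
                   lam1=1.0, lam2=1.0, lam3=0.1):
    l_lr = np.abs(y_bicubic - y).mean()    # pixel loss between y* and y
    l_hr = np.abs(x - x_restored).mean()   # pixel loss between x and x_{tau-1}
    r_z = np.mean(z ** 2)                  # regularization of latent z
    return lam1 * l_lr + lam2 * l_hr + lam3 * r_z

loss = rescaling_loss(np.zeros((2, 2)), np.ones((2, 2)),
                      np.zeros((1, 1)), np.ones((1, 1)), np.ones(4))
assert loss > 0.0
```

In training, x_restored would be produced by running the inverse flow on (y, z), and the three weights balance fidelity of the low-resolution output against exact recoverability.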
In this step, the image scaling module restores the image to its original size.
Step S130: perform super-resolution reconstruction on the restored image to obtain an ultra-high-resolution image.
For example, the restored output image is super-resolved by a factor of 16 to the high-resolution size using a super-resolution diffusion model; a denoising diffusion probabilistic model performs the super-resolution through a stochastic iterative denoising process.
In one embodiment, image super-resolution reconstruction uses the super-resolution model SR3 (Image Super-Resolution), also called a conditional diffusion probabilistic denoising model, which learns to transform a standard normal distribution into the empirical data distribution through a series of refinement steps. The super-resolution network structure, shown in Fig. 4, adopts a U-Net architecture trained with a denoising objective to iteratively remove noise of various levels from the output.
The conditional diffusion probabilistic denoising model generates the target image y₀ in T refinement steps. The model starts from a pure-noise image y_T ~ N(0, I) and, according to the learned conditional transition distribution p_θ(y_{t-1} | y_t, x), iterates through (y_{T-1}, y_{T-2}, ..., y₀) so that y₀ ~ p(y | x).
Still referring to Fig. 4, and taking a low-resolution image of size 8×8 as an example, to condition the model on the input x, the low-resolution image is upsampled to the target resolution by a deconvolution computation, and the result is concatenated with y_t.
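The conditioning step can be sketched as follows; nearest-neighbour repetition stands in here for the learned deconvolution, and the shapes match the 8×8 → 128×128 example:

```python
import numpy as np

# Upsample the low-resolution input to the target size and concatenate
# it channel-wise with the current noisy estimate y_t, giving the
# 6-channel conditioning input of the denoising U-Net.

def condition(x_lr, y_t, factor=16):
    # (3, 8, 8) -> (3, 128, 128) by plain repetition (stand-in upsampler)
    x_up = x_lr.repeat(factor, axis=1).repeat(factor, axis=2)
    return np.concatenate([x_up, y_t], axis=0)   # -> (6, 128, 128)

x_lr = np.zeros((3, 8, 8))
y_t = np.zeros((3, 128, 128))
assert condition(x_lr, y_t).shape == (6, 128, 128)
```

The resulting 6-channel tensor is what the network consumes at every refinement step, matching the (256, 6, 128, 128) batch shape used later in the training example.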
The distribution of the intermediate images in the inference chain is defined by a forward diffusion process that gradually adds Gaussian noise to the signal via a fixed Markov chain denoted q(y_t | y_{t-1}). The goal of the model is to reverse the Gaussian diffusion process by iteratively recovering the signal from the noise through a reverse Markov chain conditioned on x (the low-resolution image). The reverse chain is learned with a denoising model f_θ that takes the source image and a noisy target image as input and estimates the noise. The training objective function is set, for example, to:
E_{(x,y)} E_{γ,∈} ‖ f_θ(x, ỹ, γ) − ∈ ‖_p^p    (Formula 2)
where ∈ ~ N(0, I); x denotes the low-resolution image and y the high-resolution image, with (x, y) sampled from the training dataset; y₀ denotes the original high-resolution image; ỹ = √γ·y₀ + √(1−γ)·∈ denotes the noisy image; γ denotes the noise scale, drawn from the distribution p(γ), i.e., γ ~ p(γ); p ∈ {1, 2}, where p = 1 gives the L₁ loss and p = 2 the squared L₂ loss; T denotes the total number of diffusion steps, t the diffusion-step index, and f_θ the conditional diffusion probabilistic denoising model.
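A sketch of one evaluation of the training objective above, with a hypothetical stand-in for the noise predictor f_θ (a real model would be the trained U-Net):

```python
import numpy as np

def f_theta(x, y_tilde, gamma):
    # hypothetical noise predictor; returns a zero estimate for the sketch
    return np.zeros_like(y_tilde)

def sr3_loss(x, y0, gamma, eps, p=1):
    # noisy target: y_tilde = sqrt(gamma)*y0 + sqrt(1-gamma)*eps
    y_tilde = np.sqrt(gamma) * y0 + np.sqrt(1.0 - gamma) * eps
    # penalize the gap between predicted and true noise (L1 when p=1)
    return np.mean(np.abs(f_theta(x, y_tilde, gamma) - eps) ** p)

eps = np.random.default_rng(0).standard_normal((3, 8, 8))
loss = sr3_loss(np.zeros((3, 8, 8)), np.ones((3, 8, 8)), 0.5, eps)
assert loss >= 0.0
```

In training, γ is sampled from p(γ) and ∈ afresh at every step, and the loss gradient updates the network weights.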
Each iteration of the iterative refinement under the model takes the following form:
y_{t-1} = (1/√α_t) · ( y_t − ((1−α_t)/√(1−γ_t)) · f_θ(x, y_t, γ_t) ) + √(1−α_t)·∈_t    (Formula 3)
where ∈_t ~ N(0, I); α_t is a hyperparameter with value range 0 < α_t < 1 that determines the variance of the noise added at each iteration; and γ_t = ∏_{i=1}^{t} α_i.
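One refinement iteration can be sketched as below; γ_t is taken as the running product of the α_i, an assumption consistent with the stated range 0 < α_t < 1, and f_θ is a stand-in for the trained network:

```python
import numpy as np

def refine_step(x, y_t, alpha_t, gamma_t, f_theta, rng):
    eps_t = rng.standard_normal(y_t.shape)          # eps_t ~ N(0, I)
    # remove the estimated noise, rescaled by the step's schedule values
    mean = (y_t - (1.0 - alpha_t) / np.sqrt(1.0 - gamma_t)
            * f_theta(x, y_t, gamma_t)) / np.sqrt(alpha_t)
    # re-inject a small amount of noise with variance (1 - alpha_t)
    return mean + np.sqrt(1.0 - alpha_t) * eps_t

rng = np.random.default_rng(0)
x = np.zeros((3, 16, 16))
y_t = rng.standard_normal((3, 16, 16))
y_prev = refine_step(x, y_t, 0.99, 0.9,
                     lambda x, y, g: np.zeros_like(y), rng)
assert y_prev.shape == y_t.shape
```

Each step moves y_t a little closer to a clean sample, with α_t controlling how much fresh noise is re-injected.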
Step S140: detect the category and position of targets based on the ultra-high-resolution image.
In this step, the ultra-high-resolution image is fed into the target detector, which outputs the category and coordinate information of each target.
For example, as shown in Fig. 5, a feature pyramid is used to achieve multi-scale target detection. The feature pyramid is a basic component of multi-scale object detection. High-level features contain rich semantic information but, because of their low resolution, can hardly preserve object positions accurately; low-level features, in contrast, carry less semantic information but, thanks to their high resolution, locate objects precisely. Low-level and high-level features are therefore fused to build a feature pyramid, and every feature map is fed into a prediction head, yielding a target detection system that is accurate in both recognition and localization and that outputs target information, for example the category and position of each target.
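The top-down fusion described above can be sketched as follows; the lateral 1×1 convolutions that equalize channel counts in a full feature pyramid are omitted, so all levels are assumed to share the same channel count:

```python
import numpy as np

def upsample2x(f):
    # nearest-neighbour doubling of the spatial dimensions
    return f.repeat(2, axis=-2).repeat(2, axis=-1)

def build_pyramid(features):
    # features ordered low level -> high level (large map -> small map);
    # each high-level map is upsampled and added to the next level down,
    # so every output carries both semantics and localization detail.
    outputs = [features[-1]]
    for f in reversed(features[:-1]):
        outputs.append(f + upsample2x(outputs[-1]))
    return list(reversed(outputs))   # one fused map per input level

pyr = build_pyramid([np.ones((8, 32, 32)),
                     np.ones((8, 16, 16)),
                     np.ones((8, 8, 8))])
assert [p.shape for p in pyr] == [(8, 32, 32), (8, 16, 16), (8, 8, 8)]
```

Each fused map would then be passed to its own prediction head for classification and box regression.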
Preferably, since even simple upsampling can substantially improve target detection performance, the target detection module interpolates the ultra-low-resolution image, concatenates it with the high-resolution image, feeds both into the feature-extraction module, and ranks the resulting detections by weight.
Step S150: fuse the target information with the device state information and convert it into perceivable information.
In this step, a post-processing module fuses the target information with the device state information and converts it into information that a blind user can perceive.
For a further understanding of the present invention, an embodiment of the super-resolution reconstruction process is described below, taking 8×8 → 128×128 as an example.
1) Building the training set
Images whose short side is smaller than 128 pixels are discarded; the remaining images are center-cropped to 128×128 as high-resolution images y₀. Each high-resolution image is downsampled by a factor of 16 to 8×8 using a bicubic interpolation algorithm to form the low-resolution image x; all pairs of high- and low-resolution images constitute the training set.
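The training-pair construction can be sketched as follows; box-average downsampling stands in for the bicubic interpolation named in the text:

```python
import numpy as np

def center_crop(img, size=128):
    h, w = img.shape[:2]
    if min(h, w) < size:
        return None              # short side too small: skip the image
    top, left = (h - size) // 2, (w - size) // 2
    return img[top:top + size, left:left + size]

def downsample16(img):
    # average over 16x16 blocks (stand-in for bicubic interpolation)
    h, w = img.shape[:2]
    return img.reshape(h // 16, 16, w // 16, 16, -1).mean(axis=(1, 3))

img = np.ones((200, 300, 3))
hi = center_crop(img)            # high-resolution sample y0
lo = downsample16(hi)            # low-resolution sample x
assert hi.shape[:2] == (128, 128) and lo.shape[:2] == (8, 8)
```

Iterating this over an image collection produces the (x, y₀) pairs used for training.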
2) Training the super-resolution diffusion model
For example, the experimental settings are as follows:
batch size: 256;
optimizer: Adam;
learning rate: 1e-4;
number of iterations: 2000 for training, 100 for inference; α₀ = 0.9, α_T = -19.
During training, the low-resolution images (256, 3, 8, 8) are upsampled by a factor of 16 to (256, 3, 128, 128) by a deconvolution computation and concatenated with the noise images into (256, 6, 128, 128) as the network input. The network loss is obtained from Formula 2, and the gradients are then computed and back-propagated to update the network weights.
3) Inference with the trained model
Specifically, the inference process is: concatenate the interpolated low-resolution image x with y_T to obtain y_{T-1} from Formula 3; likewise obtain y_{T-2} from x and y_{T-1}; after T iterations, y₀ is obtained.
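The inference loop just described can be sketched as below, with a stand-in noise predictor; x_up denotes the low-resolution image already upsampled to the target size, and γ_t is again assumed to be the running product of the α_i:

```python
import numpy as np

def super_resolve(x_up, alphas, f_theta, rng):
    gammas = np.cumprod(alphas)          # gamma_t (assumed schedule)
    y = rng.standard_normal(x_up.shape)  # y_T ~ N(0, I)
    for t in range(len(alphas) - 1, -1, -1):
        eps_hat = f_theta(x_up, y, gammas[t])
        # one refinement step per Formula 3
        y = (y - (1.0 - alphas[t]) / np.sqrt(1.0 - gammas[t])
             * eps_hat) / np.sqrt(alphas[t])
        if t > 0:                        # no fresh noise at the last step
            y = y + np.sqrt(1.0 - alphas[t]) * rng.standard_normal(y.shape)
    return y

rng = np.random.default_rng(0)
x_up = np.zeros((3, 16, 16))
y0_hat = super_resolve(x_up, np.full(10, 0.99),
                       lambda x, y, g: np.zeros_like(y), rng)
assert y0_hat.shape == x_up.shape
```

With a trained f_θ, y0_hat would be the super-resolved estimate of the original high-resolution image.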
Further, the interpolated low-resolution image x and y₀ are concatenated and fed into the target detector, which yields two sets of target positions and categories; after weighted ranking, a non-maximum suppression operation is applied to obtain the final result.
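The final fusion step can be sketched as a plain score-sorted non-maximum suppression over the merged detections; the box format (x1, y1, x2, y2, score) and the threshold are illustrative:

```python
def iou(a, b):
    # intersection-over-union of two axis-aligned boxes
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter + 1e-9)

def nms(dets, thresh=0.5):
    # keep the highest-scoring box of every overlapping group
    dets = sorted(dets, key=lambda d: d[4], reverse=True)
    keep = []
    for d in dets:
        if all(iou(d, k) < thresh for k in keep):
            keep.append(d)
    return keep

dets = [(0, 0, 10, 10, 0.9), (1, 1, 10, 10, 0.8), (20, 20, 30, 30, 0.7)]
kept = nms(dets)
assert len(kept) == 2   # the two overlapping boxes collapse to one
```

In the method above, the two detection sets (from the interpolated image and from y₀) would first be merged and score-weighted before this suppression pass.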
The present invention applies a diffusion probabilistic model to super-resolve low-resolution images, realizing a 16× conversion from ultra-low-resolution images (e.g., as small as 8×8 pixels) to high-resolution images (e.g., 128×128 pixels); the target detection module then detects on the high-resolution image. This addresses the poor robustness and low accuracy of target detection in the low-resolution scenarios faced by blind-guidance technology, and reduces device power consumption.
In summary, the present invention designs a super-resolution-based small-image multi-target detection method that solves the problem of degraded obstacle detection in ultra-low-resolution scenarios in blind-guidance technology. Image scaling technology is used to scale the original image down to a low-resolution image for low-cost transmission and then restore the low-resolution image to a high-quality original image; image super-resolution based on a diffusion probabilistic model enables target detection on low-resolution images of a blind user's daily scenes during guidance, providing a solution for existing blind-guidance technology; and low-resolution and high-resolution image information are used jointly to improve detection accuracy. In short, the present invention takes lower-resolution images as the original input, so that the guidance device can accommodate low-resolution cameras, while image scaling reduces the amount of data transferred, lowering power consumption and device volume, so that the guidance device can operate for long periods and the burden on the user is reduced.
The present invention may be a system, a method, and/or a computer program product. The computer program product may include a computer-readable storage medium carrying computer-readable program instructions for causing a processor to implement various aspects of the present invention.
The computer-readable storage medium may be a tangible device that can retain and store instructions for use by an instruction-execution device. The computer-readable storage medium may be, for example but not limited to, an electrical storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer-readable storage medium include: a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a static random access memory (SRAM), a portable compact disc read-only memory (CD-ROM), a digital versatile disc (DVD), a memory stick, a floppy disk, a mechanically encoded device such as a punch card or a raised structure in a groove having instructions recorded thereon, and any suitable combination of the foregoing. A computer-readable storage medium, as used herein, is not to be construed as a transitory signal per se, such as a radio wave or other freely propagating electromagnetic wave, an electromagnetic wave propagating through a waveguide or other transmission medium (e.g., light pulses through a fiber-optic cable), or an electrical signal transmitted through a wire.
The computer-readable program instructions described herein may be downloaded from a computer-readable storage medium to the respective computing/processing devices, or to an external computer or external storage device via a network such as the Internet, a local area network, a wide area network, and/or a wireless network. The network may comprise copper transmission cables, optical-fiber transmission, wireless transmission, routers, firewalls, switches, gateway computers, and/or edge servers. A network adapter card or network interface in each computing/processing device receives the computer-readable program instructions from the network and forwards them for storage in a computer-readable storage medium within the respective computing/processing device.
The computer program instructions for carrying out operations of the present invention may be assembler instructions, instruction-set-architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state-setting data, or source code or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk, C++, and Python, and conventional procedural programming languages such as the "C" language or similar programming languages. The computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In scenarios involving a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it may be connected to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, an electronic circuit such as a programmable logic circuit, a field-programmable gate array (FPGA), or a programmable logic array (PLA) may be personalized with state information of the computer-readable program instructions, and the electronic circuit may execute the computer-readable program instructions, thereby implementing various aspects of the present invention.
Aspects of the present invention are described herein with reference to flowcharts and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It should be understood that each block of the flowcharts and/or block diagrams, and combinations of blocks in the flowcharts and/or block diagrams, can be implemented by computer-readable program instructions.
These computer-readable program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, or another programmable data-processing apparatus to produce a machine, such that the instructions, when executed via the processor of the computer or other programmable data-processing apparatus, create means for implementing the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams. These computer-readable program instructions may also be stored in a computer-readable storage medium; the instructions cause a computer, a programmable data-processing apparatus, and/or other devices to function in a particular manner, so that the computer-readable medium storing the instructions comprises an article of manufacture including instructions that implement aspects of the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.
The computer-readable program instructions may also be loaded onto a computer, another programmable data-processing apparatus, or another device, causing a series of operational steps to be performed on the computer, other programmable apparatus, or other device to produce a computer-implemented process, such that the instructions executed on the computer, other programmable apparatus, or other device implement the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present invention. In this regard, each block in the flowcharts or block diagrams may represent a module, a program segment, or a portion of instructions that comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the blocks may occur out of the order noted in the figures; for example, two successive blocks may in fact be executed substantially concurrently, or sometimes in the reverse order, depending on the functionality involved. It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or acts, or by a combination of dedicated hardware and computer instructions. It is well known to those skilled in the art that implementation in hardware, implementation in software, and implementation by a combination of software and hardware are all equivalent.
The embodiments of the present invention have been described above. The foregoing description is exemplary, not exhaustive, and is not limited to the disclosed embodiments. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein was chosen to best explain the principles of the embodiments, their practical application, or technical improvements over technologies found in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein. The scope of the invention is defined by the appended claims.

Claims (10)

  1. A super-resolution-based small-image multi-target detection method, comprising the following steps:
    Step S1: acquiring a first-resolution image of an original scene;
    Step S2: converting the first-resolution image into a second-resolution image with an invertible neural network model, transmitting it, and then restoring it to the first-resolution image, wherein the resolution of the second-resolution image is lower than that of the first-resolution image;
    Step S3: inputting the restored first-resolution image into a trained super-resolution diffusion model, performing super-resolution reconstruction through a stochastic iterative denoising process, and outputting an ultra-high-resolution image;
    Step S4: performing target detection on the ultra-high-resolution image to obtain target recognition information.
  2. The method according to claim 1, wherein the loss function for training the invertible neural network model is set to:
    L = λ₁·L_pix(y*, y) + λ₂·L_pix(x, x_{τ-1}) + λ₃·R(z)
    where x is the first-resolution image input, y is the second-resolution image output, z is the latent-variable output, x_{τ-1} is the first-resolution image restored from y and z, and y* is the second-resolution image obtained from x by bicubic interpolation; L_pix(y*, y) is the pixel loss between y* and y, L_pix(x, x_{τ-1}) is the pixel loss between x and x_{τ-1}, R(z) is the regularization of the latent variable z, and λ₁, λ₂, λ₃ are the weights of the corresponding terms.
  3. The method according to claim 1, wherein the super-resolution diffusion model adopts a U-Net architecture and learns to transform a standard normal distribution into an empirical data distribution through T refinement steps.
  4. The method according to claim 3, wherein, in the T refinement steps, the super-resolution diffusion model starts from a pure-noise image and, according to the learned conditional transition distribution, iterates continuously so that the generated target image conforms to a preset probability distribution.
  5. The method according to claim 1, wherein the training objective function of the super-resolution diffusion model is set to:
    E_{(x,y)} E_{γ,∈} ‖ f_θ(x, ỹ, γ) − ∈ ‖_p^p
    where ∈ ~ N(0, I); x denotes the low-resolution image and y the high-resolution image, with (x, y) sampled from the training dataset; y₀ denotes the original high-resolution image; ỹ = √γ·y₀ + √(1−γ)·∈ denotes the noisy image; γ denotes the noise scale with γ ~ p(γ); p ∈ {1, 2}, where p = 1 gives the L₁ loss and p = 2 the squared L₂ loss; T denotes the total number of diffusion steps, t the diffusion-step index, and f_θ the super-resolution diffusion model; each iteration under the model takes the following form:
    y_{t-1} = (1/√α_t) · ( y_t − ((1−α_t)/√(1−γ_t)) · f_θ(x, y_t, γ_t) ) + √(1−α_t)·∈_t
    where ∈_t ~ N(0, I), α_t is a hyperparameter with value range 0 < α_t < 1, and γ_t = ∏_{i=1}^{t} α_i.
  6. The method according to claim 1, wherein, in step S4, low-level features and high-level features are fused to build a feature pyramid, and each feature map is input into a prediction head to obtain the category and position information of the targets.
  7. The method according to claim 1, wherein the training set of the super-resolution diffusion model is constructed according to the following steps:
    cropping the collected pictures to the target high-resolution size to obtain high-resolution pictures;
    downsampling the high-resolution pictures to the target low-resolution size with a bicubic interpolation algorithm to obtain low-resolution pictures;
    all pairs of high- and low-resolution images constituting the training set.
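The pair-construction steps above can be sketched as follows. For simplicity this sketch uses average pooling in place of the bicubic interpolation named in the claim (a real pipeline would use a bicubic resampler, e.g. Pillow's `Image.resize` with its bicubic filter); all sizes are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

def center_crop(img, size):
    # Crop the collected picture to the target high-resolution size.
    h, w = img.shape[:2]
    top, left = (h - size) // 2, (w - size) // 2
    return img[top:top + size, left:left + size]

def downsample(img, factor):
    # Average-pooling downsample standing in for bicubic interpolation
    # (illustration only; the claim specifies bicubic).
    h, w = img.shape[:2]
    return img[:h - h % factor, :w - w % factor] \
        .reshape(h // factor, factor, w // factor, factor) \
        .mean(axis=(1, 3))

hr_size, scale = 64, 4
collected = [rng.random((80, 90)) for _ in range(3)]  # raw collected pictures
pairs = []
for pic in collected:
    hr = center_crop(pic, hr_size)   # high-resolution picture
    lr = downsample(hr, scale)       # low-resolution picture
    pairs.append((lr, hr))           # one (LR, HR) training pair

print(len(pairs), pairs[0][0].shape, pairs[0][1].shape)
```

All (LR, HR) pairs together then form the training set of the super-resolution model.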
  8. The method according to claim 1, wherein a camera in a head-mounted device is used to acquire the first-resolution image of the original scene, and the obtained target recognition information is fused with device state information and converted into information that the user can perceive.
  9. A computer-readable storage medium on which a computer program is stored, wherein, when the program is executed by a processor, the steps of the method according to any one of claims 1 to 8 are implemented.
  10. A computer device comprising a memory and a processor, wherein a computer program capable of running on the processor is stored in the memory, and wherein the processor, when executing the program, implements the steps of the method according to any one of claims 1 to 8.
PCT/CN2021/138098 2021-10-14 2021-12-14 Small image multi-object detection method based on super-resolution WO2023060746A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111198028.7A CN113920013B (en) 2021-10-14 2021-10-14 Super-resolution-based small image multi-target detection method
CN202111198028.7 2021-10-14

Publications (1)

Publication Number Publication Date
WO2023060746A1 true WO2023060746A1 (en) 2023-04-20

Family

ID=79240553

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/138098 WO2023060746A1 (en) 2021-10-14 2021-12-14 Small image multi-object detection method based on super-resolution

Country Status (2)

Country Link
CN (1) CN113920013B (en)
WO (1) WO2023060746A1 (en)


Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114820398B (en) * 2022-07-01 2022-11-04 北京汉仪创新科技股份有限公司 Image font replacing method, system, equipment and medium based on diffusion model
CN115471398B (en) * 2022-08-31 2023-08-15 北京科技大学 Image super-resolution method, system, terminal equipment and storage medium
CN117078510B (en) * 2022-11-16 2024-04-30 电子科技大学 Single image super-resolution reconstruction method of potential features
CN116012296B (en) * 2022-12-01 2023-10-24 浙江大学 Prefabricated part detection method based on super-resolution and semi-supervised learning
CN116469047A (en) * 2023-03-20 2023-07-21 南通锡鼎智能科技有限公司 Small target detection method and detection device for laboratory teaching
CN117746171B (en) * 2024-02-20 2024-04-23 成都信息工程大学 Unsupervised weather downscaling method based on dual learning and auxiliary information

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107133916A (en) * 2017-04-21 2017-09-05 西安科技大学 Image-scaling method
CN107492070A (en) * 2017-07-10 2017-12-19 North China Electric Power University Single-image super-resolution computation method based on a dual-channel convolutional neural network
CN111784624A (en) * 2019-04-02 2020-10-16 北京沃东天骏信息技术有限公司 Target detection method, device, equipment and computer readable storage medium
US20210136394A1 (en) * 2019-11-05 2021-05-06 Canon Kabushiki Kaisha Encoding apparatus and encoding method, and decoding apparatus and decoding method
CN113014927A (en) * 2021-03-02 2021-06-22 三星(中国)半导体有限公司 Image compression method and image compression device
CN113139896A (en) * 2020-01-17 2021-07-20 波音公司 Target detection system and method based on super-resolution reconstruction
CN113298718A (en) * 2021-06-22 2021-08-24 云南大学 Single image super-resolution reconstruction method and system

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103136734B (en) * 2013-02-27 2016-01-13 Beijing University of Technology Method for suppressing the edge halo effect in projection-onto-convex-sets super-resolution image reconstruction
CN106981046B (en) * 2017-03-21 2019-10-11 四川大学 Single image super resolution ratio reconstruction method based on multi-gradient constrained regression
US11232541B2 (en) * 2018-10-08 2022-01-25 Rensselaer Polytechnic Institute CT super-resolution GAN constrained by the identical, residual and cycle learning ensemble (GAN-circle)
CN110428378B (en) * 2019-07-26 2022-02-08 北京小米移动软件有限公司 Image processing method, device and storage medium
CN111062872B (en) * 2019-12-17 2021-02-05 暨南大学 Image super-resolution reconstruction method and system based on edge detection
WO2021121108A1 (en) * 2019-12-20 2021-06-24 北京金山云网络技术有限公司 Image super-resolution and model training method and apparatus, electronic device, and medium
CN111369440B (en) * 2020-03-03 2024-01-30 网易(杭州)网络有限公司 Model training and image super-resolution processing method, device, terminal and storage medium
CN113496465A (en) * 2020-03-20 2021-10-12 微软技术许可有限责任公司 Image scaling
CN111353940B (en) * 2020-03-31 2021-04-02 成都信息工程大学 Image super-resolution reconstruction method based on deep learning iterative up-down sampling
CN113177882B (en) * 2021-04-29 2022-08-05 浙江大学 Single-frame image super-resolution processing method based on diffusion model


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
CHITWAN SAHARIA; JONATHAN HO; WILLIAM CHAN; TIM SALIMANS; DAVID J. FLEET; MOHAMMAD NOROUZI: "Image Super-Resolution via Iterative Refinement", ARXIV.ORG, CORNELL UNIVERSITY LIBRARY, 201 OLIN LIBRARY CORNELL UNIVERSITY ITHACA, NY 14853, 30 June 2021 (2021-06-30), 201 Olin Library Cornell University Ithaca, NY 14853 , XP081980663 *
JINGYUN LIANG; ANDREAS LUGMAYR; KAI ZHANG; MARTIN DANELLJAN; LUC VAN GOOL; RADU TIMOFTE: "Hierarchical Conditional Flow: A Unified Framework for Image Super-Resolution and Image Rescaling", ARXIV.ORG, CORNELL UNIVERSITY LIBRARY, 201 OLIN LIBRARY CORNELL UNIVERSITY ITHACA, NY 14853, 11 August 2021 (2021-08-11), 201 Olin Library Cornell University Ithaca, NY 14853, XP091032867 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116777906A (en) * 2023-08-17 2023-09-19 常州微亿智造科技有限公司 Abnormality detection method and abnormality detection device in industrial detection
CN116777906B (en) * 2023-08-17 2023-11-14 常州微亿智造科技有限公司 Abnormality detection method and abnormality detection device in industrial detection
CN117409192A (en) * 2023-12-14 2024-01-16 武汉大学 Data enhancement-based infrared small target detection method and device
CN117409192B (en) * 2023-12-14 2024-03-08 武汉大学 Data enhancement-based infrared small target detection method and device
CN117830800A (en) * 2024-03-04 2024-04-05 广州市仪美医用家具科技股份有限公司 Clothing detection and recovery method, system, medium and equipment based on YOLO algorithm

Also Published As

Publication number Publication date
CN113920013A (en) 2022-01-11
CN113920013B (en) 2023-06-16

Similar Documents

Publication Publication Date Title
WO2023060746A1 (en) Small image multi-object detection method based on super-resolution
US11763466B2 (en) Determining structure and motion in images using neural networks
Shivakumar et al. Dfusenet: Deep fusion of rgb and sparse depth information for image guided dense depth completion
US11734847B2 (en) Image depth prediction neural networks
US10671855B2 (en) Video object segmentation by reference-guided mask propagation
US11481869B2 (en) Cross-domain image translation
CN113066017B (en) Image enhancement method, model training method and equipment
Zhou et al. Scale adaptive image cropping for UAV object detection
WO2021018106A1 (en) Pedestrian detection method, apparatus, computer-readable storage medium and chip
KR20220005432A (en) Scene representation using image processing
Liu et al. Effective image super resolution via hierarchical convolutional neural network
KR20220148274A (en) Self-supervised representation learning using bootstrapped latent representations
CN111832393A (en) Video target detection method and device based on deep learning
JP2024026745A (en) Using imager with on-purpose controlled distortion for inference or training of artificial intelligence neural network
Zhang et al. Unsupervised depth estimation from monocular videos with hybrid geometric-refined loss and contextual attention
Huang et al. Learning optical flow with R-CNN for visual odometry
US20230053618A1 (en) Recurrent unit for generating or processing a sequence of images
WO2023123873A1 (en) Dense optical flow calculation method employing attention mechanism
Kato et al. Visual language modeling on cnn image representations
CN117409375B (en) Dual-attention-guided crowd counting method, apparatus and computer storage medium
Pal et al. MAML-SR: Self-adaptive super-resolution networks via multi-scale optimized attention-aware meta-learning
US20220171959A1 (en) Method and apparatus with image processing
US20230394699A1 (en) Method of estimating a three-dimensional position of an object
Dronova et al. FlyNeRF: NeRF-Based Aerial Mapping for High-Quality 3D Scene Reconstruction
Bi et al. Image deblurring method based on feature fusion SRN

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21960471

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE