WO2020156361A1 - Training sample obtaining method and apparatus, electronic device and storage medium - Google Patents

Training sample obtaining method and apparatus, electronic device and storage medium Download PDF

Info

Publication number
WO2020156361A1
WO2020156361A1 (application PCT/CN2020/073396)
Authority
WO
WIPO (PCT)
Prior art keywords
video
frame
scene
feature information
target
Prior art date
Application number
PCT/CN2020/073396
Other languages
French (fr)
Chinese (zh)
Inventor
徐青松
李青
Original Assignee
杭州睿琪软件有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 杭州睿琪软件有限公司
Publication of WO2020156361A1 publication Critical patent/WO2020156361A1/en

Links

Images

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N99/00 Subject matter not provided for in other groups of this subclass
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features

Definitions

  • the present invention relates to the field of machine learning technology, and in particular to a method, device, electronic equipment and computer-readable storage medium for obtaining training samples.
  • the purpose of the present invention is to provide a method, a device, an electronic device and a computer-readable storage medium for obtaining training samples to solve the problems of low efficiency and high cost of obtaining image training samples in the prior art.
  • the present invention provides a method for obtaining training samples, including: obtaining a scene segment in a video; selecting a video frame containing a target object in the scene segment as an initial frame, and marking the target area in the initial frame where the target object is located; extracting the feature information of the marked target area in the initial frame; using the initial frame as a reference, performing a feature search on the forward and/or backward video frames in the scene segment, determining the area in each searched frame whose feature information matches the feature information of the target area, and automatically marking the determined area in each searched frame; and
  • extracting the image of each marked video frame in the scene segment as a training sample.
  • obtaining the scene segment in the video includes:
  • if the video is a single-scene video, using the video as one scene segment;
  • if the video is a multi-scene video, using scene switching detection technology to divide the video into multiple scene segments.
  • the scene switching detection technology includes: a pixel domain-based detection algorithm and/or a compressed domain-based detection algorithm.
  • before extracting the feature information of the target area marked in the initial frame, the method further includes:
  • Image preprocessing is performed on the initial frame to make the feature information of the target region in the initial frame more obvious.
  • the feature information of the target area includes one or more of color features, texture features, and shape features.
  • the step of performing feature search on the forward and/or backward video frames in the scene segment includes:
  • using a mean shift algorithm, a Kalman filter algorithm, or a particle filter algorithm, feature search is performed on the forward and/or backward video frames in the scene segment.
  • the method further includes: if there is no area in a searched frame whose feature information matches the feature information of the target area, acquiring target feature information, determining the area in that searched frame whose feature information matches the target feature information, and automatically marking the determined area;
  • wherein the target feature information is the feature information of the already-marked areas in a preset number of frames adjacent to the searched frame.
  • the present invention also provides a training sample obtaining device, including:
  • an obtaining module, used to obtain scene segments in the video;
  • a first labeling module configured to select a video frame containing a target object in the scene fragment as an initial frame, and label the target area in the initial frame;
  • a first extraction module configured to extract feature information of the target area marked in the initial frame
  • a second labeling module, used to perform feature search on the forward and/or backward video frames in the scene segment with the initial frame as a reference, determine the area in each searched frame whose feature information matches the feature information of the target area, and automatically mark the determined area in each searched frame;
  • a second extraction module, used to extract the images of the marked video frames in the scene segment as training samples.
  • the obtaining module obtaining the scene segments in the video includes:
  • if the video is a single-scene video, using the video as one scene segment;
  • if the video is a multi-scene video, using scene switching detection technology to divide the video into multiple scene segments.
  • the scene switching detection technology includes: a pixel domain-based detection algorithm and/or a compressed domain-based detection algorithm.
  • the device further includes:
  • a preprocessing module, configured to perform image preprocessing on the initial frame before the first extraction module extracts the feature information of the target area marked in the initial frame, so that the feature information of the target area in the initial frame is more obvious.
  • the feature information of the target area includes one or more of color features, texture features, and shape features.
  • the second extraction module performs feature search on the forward and/or backward video frames in the scene segment, including:
  • using a mean shift algorithm, a Kalman filter algorithm, or a particle filter algorithm, feature search is performed on the forward and/or backward video frames in the scene segment.
  • the second extraction module is further used for: if there is no area in a searched frame whose feature information matches the feature information of the target area, acquiring target feature information, determining the area in that searched frame whose feature information matches the target feature information, and automatically marking the determined area;
  • wherein the target feature information is the feature information of the already-marked areas in a preset number of frames adjacent to the searched frame.
  • the present invention also provides an electronic device, including a processor, a communication interface, a memory, and a communication bus, wherein the processor, the communication interface, and the memory communicate with each other through the communication bus; wherein,
  • the memory is used to store computer programs
  • the processor is used to implement the training sample obtaining method described in any one of the above when executing the computer program stored in the memory.
  • the present invention also provides a computer-readable storage medium having a computer program stored in the computer-readable storage medium, and when the computer program is executed by a processor, the training sample obtaining method described in any one of the above is implemented.
  • the solution provided by the present invention first annotates an initial frame in a scene segment of a video, and then uses target tracking technology to automatically annotate the other video frames in the entire scene segment, thereby obtaining a large number of annotated images as training samples for subsequently building a target recognition model.
  • in the prior art, manual annotation is performed on a large number of individually acquired pictures, so the cost of image acquisition and annotation is relatively high.
  • in contrast, the present invention only requires shooting a video, so the acquisition of annotation material is convenient; a large number of automatically marked samples can then be collected from the video, which reduces the cost of sample labeling and improves the efficiency of the labeling process.
  • FIG. 1 is a schematic flowchart of a method for obtaining training samples according to an embodiment of the present invention
  • FIG. 2 is a schematic structural diagram of a training sample obtaining apparatus according to an embodiment of the present invention.
  • FIG. 3 is a structural block diagram of an electronic device provided by an embodiment of the present invention.
  • the embodiments of the present invention provide a method, device, electronic device, and computer-readable storage medium for obtaining training samples.
  • the training sample obtaining method of the embodiment of the present invention can be applied to the training sample obtaining device of the embodiment of the present invention, and the training sample obtaining device can be configured on an electronic device.
  • the electronic device may be a personal computer, a mobile terminal, etc.
  • the mobile terminal may be a hardware device with various operating systems such as a mobile phone or a tablet computer.
  • Fig. 1 is a schematic flowchart of a method for obtaining training samples according to an embodiment of the present invention. Please refer to Fig. 1.
  • a method for obtaining training samples may include the following steps:
  • a video is generally composed of one or more scene segments, and a scene is composed of multiple video frames.
  • the video on which the present invention operates can be a single-scene video or a multi-scene video. If the video is a single-scene video, since it contains only one scene segment, the video can be used directly as the obtained scene segment, and the subsequent processing steps are executed.
  • if the video is a multi-scene video, scene switching detection technology can be used to divide the video into multiple scene segments. After the division, a single scene segment may be used, and by performing the subsequent processing steps the images of the marked video frames in that segment are obtained as training samples; alternatively, the subsequent processing steps can be performed for every scene segment, which further increases the number of training samples obtained.
  • scene switching detection refers to finding the frames and frame positions in a video where a scene switch occurs. The obtained frame positions can be used for fast and accurate video editing or further processing, and the resulting frame sequence can be used to roughly describe the entire video content.
  • at present, traditional video scene-switch detection methods generally rely on hand-crafted features, for example computing the color-histogram similarity of adjacent frames, directly computing the frame difference, or detecting scene switches from the degree of change of the high-frequency subband coefficients of each frame in the video (the VH feature); computing high-frequency subband coefficients requires algorithms such as the three-dimensional wavelet transform.
  • these techniques compute a feature value and compare it with a threshold; a frame whose value crosses the threshold is judged to be a switching frame.
  • there are also adaptive-threshold algorithms based on the above techniques, such as a video scene change detection method based on adaptive thresholds, but the sliding-window size and preset thresholds in such methods still need to be set manually.
  • the scene switching detection technology can adopt a pixel domain-based detection algorithm or a compression domain-based detection algorithm, and set corresponding scene switching thresholds according to different scenes, which can improve the speed and accuracy of scene switching detection.
  • the detection algorithm based on the pixel domain or the compressed domain can be referred to the prior art, which will not be repeated here.
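For illustration, here is a minimal pixel-domain sketch in Python, assuming OpenCV is available: it flags a scene switch whenever the HSV-histogram similarity of adjacent frames falls below a threshold. The function name and the threshold value 0.6 are illustrative, not taken from the patent; as the text notes, the threshold would in practice be tuned per scene.

```python
import cv2

def detect_scene_cuts(video_path, threshold=0.6):
    """Return indices of frames where a scene switch is detected.

    A cut is declared when the correlation between the HSV histograms
    of adjacent frames drops below `threshold` (illustrative value).
    """
    cap = cv2.VideoCapture(video_path)
    cuts, prev_hist, idx = [], None, 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
        hist = cv2.calcHist([hsv], [0, 1], None, [50, 60], [0, 180, 0, 256])
        cv2.normalize(hist, hist)
        if prev_hist is not None:
            similarity = cv2.compareHist(prev_hist, hist, cv2.HISTCMP_CORREL)
            if similarity < threshold:
                cuts.append(idx)  # idx is the first frame of the new scene
        prev_hist, idx = hist, idx + 1
    cap.release()
    return cuts
```

Consecutive cut indices then delimit the scene segments processed in the subsequent steps.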
  • S102: select a video frame containing a target object in the scene segment as an initial frame, and mark the target area where the target object is located in the initial frame.
  • the target object may be an object of interest.
  • for each scene segment, recognition can be performed on the video frames it contains, and a video frame containing the target object is selected as the initial frame for labeling.
  • the first frame in which the target object appears can be selected as the initial frame; if the features of the target object are not obvious in the first frame, a later video frame in which the target object's features are more obvious is chosen as the initial frame instead. The requirements of this step are not strict; it suffices to choose a reasonably good video frame as the initial frame.
  • its purpose is to mark the target area where the target object is located so that the feature information of the target area can be extracted, allowing subsequent processing to automatically mark the feature-matching areas in earlier or later video frames through feature search.
  • further, before the feature information is extracted in step S103, image preprocessing such as image denoising and contrast enhancement can be performed on the initial frame, so that the feature information of the target area in the initial frame is more obvious.
  • the feature information of the target area may include one or more of color features, texture features, and shape features.
  • the color feature is a global feature that describes the surface properties of the scene corresponding to an image or image area. Color features are generally pixel-based, so every pixel belonging to the image or image area makes its own contribution. The color histogram is the most commonly used way to express color features: it simply describes the global distribution of colors in an image, that is, the proportion of each color in the whole image. It is especially suitable for describing images that are difficult to segment automatically and images in which the spatial position of objects need not be considered; it is unaffected by image rotation and translation and, with normalization, also unaffected by changes in image scale. The most commonly used color spaces are the RGB and HSV color spaces. Color-histogram feature matching methods include the histogram intersection method, distance method, center-distance method, reference color table method, and cumulative color histogram method.
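As a sketch of the histogram matching mentioned above, the histogram intersection method can be realized as follows, assuming OpenCV; the bin counts are illustrative choices.

```python
import cv2

def color_histogram(region_bgr, bins=(8, 8, 8)):
    """Normalized 3-D BGR color histogram of an image region."""
    hist = cv2.calcHist([region_bgr], [0, 1, 2], None, list(bins),
                        [0, 256, 0, 256, 0, 256])
    return cv2.normalize(hist, hist).flatten()

def intersection_score(hist_a, hist_b):
    """Histogram intersection: higher means more similar color content."""
    return cv2.compareHist(hist_a, hist_b, cv2.HISTCMP_INTERSECT)
```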
  • the texture feature is also a global feature; it too describes the surface properties of the scene corresponding to the image or image area.
  • however, because texture is only a property of an object's surface and cannot fully reflect the object's essential attributes, high-level image content cannot be obtained from texture features alone.
  • unlike color features, texture features are not pixel-based; they require statistical calculation over a region containing multiple pixels. In pattern matching this regional character is a significant advantage, since matching will not fail because of local deviations.
  • texture features often have rotation invariance and have strong resistance to noise.
  • the description methods for texture features include statistical methods, geometric methods, model methods, and signal processing methods.
  • the typical representative of the statistical methods is a texture feature analysis method called the gray-level co-occurrence matrix.
  • based on a study of the various statistical features of the co-occurrence matrix, Gotlieb and Kreyszig et al. experimentally identified four key features of the gray-level co-occurrence matrix: energy, inertia, entropy, and correlation;
  • another typical statistical method extracts texture features from the image's autocorrelation function (that is, the image's energy spectrum function): characteristic parameters such as texture coarseness and directionality are computed from the energy spectrum function.
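A sketch of those four key features, assuming scikit-image 0.19 or later (which provides graycomatrix and graycoprops; earlier versions spell them greycomatrix/greycoprops). Entropy is computed directly from the normalized matrix, since graycoprops does not offer it, and "inertia" corresponds to what the library calls contrast.

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops

def glcm_features(gray_image, distances=(1,), angles=(0,)):
    """Energy, inertia (contrast), entropy and correlation of the GLCM.

    gray_image: 2-D uint8 array (gray levels 0..255).
    """
    glcm = graycomatrix(gray_image, distances=list(distances),
                        angles=list(angles), levels=256,
                        symmetric=True, normed=True)
    p = glcm[:, :, 0, 0]                      # first distance/angle pair
    entropy = -np.sum(p[p > 0] * np.log2(p[p > 0]))
    return {
        "energy": graycoprops(glcm, "energy")[0, 0],
        "inertia": graycoprops(glcm, "contrast")[0, 0],  # inertia == contrast
        "entropy": entropy,
        "correlation": graycoprops(glcm, "correlation")[0, 0],
    }
```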
  • the geometric method is a texture feature analysis method based on the theory of texture primitives (basic texture elements).
  • the texture primitive theory believes that a complex texture can be composed of a number of simple texture primitives in a certain regular form.
  • in the geometric method, two comparatively influential algorithms are the Voronoi checkerboard feature method and the structural method.
  • the model method is based on the structural model of the image, and uses the parameters of the model as texture features.
  • Typical methods are random field model methods, such as Markov random field (MRF) model method and Gibbs random field model method.
  • in the signal processing methods, texture feature extraction and matching mainly involve the gray-level co-occurrence matrix, Tamura texture features, the autoregressive texture model, and the wavelet transform.
  • feature extraction and matching with the gray-level co-occurrence matrix rely mainly on four parameters: energy, inertia, entropy, and correlation.
  • Tamura texture features, based on psychological research into human visual perception of texture, comprise six attributes: coarseness, contrast, directionality, line-likeness, regularity, and roughness.
  • the simultaneous auto-regressive (SAR) texture model is an application instance of the Markov random field (MRF) model.
  • the characteristic of shape features is that the various shape-based retrieval methods can all make effective use of the target of interest in an image for retrieval.
  • generally, there are two types of representation methods for shape features: contour features and regional features.
  • contour features of an image concern mainly the outer boundary of an object, while regional features relate to the entire shape area.
  • boundary feature method: this method obtains the shape parameters of the image by describing boundary features.
  • the Hough-transform method for detecting parallel straight lines and the boundary direction histogram method are classic examples.
  • the Hough transform connects edge pixels into a closed region boundary by exploiting the global characteristics of the image; its basic idea is point-line duality. The boundary direction histogram method first differentiates the image to obtain its edges and then builds a histogram of edge magnitude and direction; the usual approach is to construct the image's gray-gradient direction matrix, as sketched below.
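A minimal sketch of the boundary direction histogram idea, assuming OpenCV and NumPy: the gradient is taken with Sobel operators, and the magnitude cutoff is an illustrative value.

```python
import cv2
import numpy as np

def edge_direction_histogram(gray, bins=36, magnitude_cutoff=50.0):
    """Histogram of gradient directions over strong edge pixels."""
    gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0)
    gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1)
    magnitude = np.hypot(gx, gy)
    direction = np.degrees(np.arctan2(gy, gx)) % 360.0
    strong = magnitude > magnitude_cutoff        # keep clear edges only
    hist, _ = np.histogram(direction[strong], bins=bins, range=(0.0, 360.0))
    return hist / max(hist.sum(), 1)             # normalize
```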
  • the basic idea of the Fourier shape descriptor method is to use the Fourier transform of the object boundary as the shape description, and use the closedness and periodicity of the region boundary to transform a two-dimensional problem into a one-dimensional problem.
  • Three shape expressions are derived from boundary points, which are curvature function, centroid distance, and complex coordinate function.
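A sketch of a Fourier descriptor built on the centroid-distance expression, assuming OpenCV 4 for contour extraction; dropping the DC term and normalizing the magnitudes by the first harmonic is a common choice for scale invariance, not something the patent prescribes.

```python
import cv2
import numpy as np

def fourier_descriptor(binary_mask, n_coeffs=16):
    """Fourier descriptor of the largest contour via centroid distance."""
    contours, _ = cv2.findContours(binary_mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_NONE)
    boundary = max(contours, key=cv2.contourArea)
    boundary = boundary.reshape(-1, 2).astype(np.float64)
    centroid = boundary.mean(axis=0)
    dist = np.linalg.norm(boundary - centroid, axis=1)  # centroid distance
    spectrum = np.abs(np.fft.fft(dist))
    # drop the DC term; normalize by the first harmonic for scale invariance
    return spectrum[1:n_coeffs + 1] / (spectrum[1] + 1e-12)
```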
  • the geometric parameter method is a simpler region-feature description method for shape expression and matching, for example the shape factor method, which uses quantitative shape measures such as moments, area, and perimeter.
  • in the QBIC system (a content-based image retrieval system), geometric parameters such as roundness, eccentricity, principal-axis direction, and algebraic invariant moments are used for image retrieval based on shape features.
  • the shape invariant moment method uses the moment of the area occupied by the target as the shape description parameter.
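For example, a sketch using Hu's seven invariant moments (available in OpenCV) as shape description parameters; the log scaling is a common practical choice, not something the patent specifies.

```python
import cv2
import numpy as np

def hu_moment_signature(binary_mask):
    """Log-scaled Hu invariant moments of the region in a uint8 binary mask."""
    moments = cv2.moments(binary_mask, binaryImage=True)
    hu = cv2.HuMoments(moments).flatten()
    # log scaling compresses the large dynamic range of the raw moments
    return -np.sign(hu) * np.log10(np.abs(hu) + 1e-30)
```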
  • representation and matching of shape features also include methods such as the finite element method (FEM), the turning function, and the wavelet descriptor.
  • there is also a method based on wavelets and relative moments: it first uses wavelet-transform modulus maxima to obtain multi-scale edge images, then calculates 7 invariant moments at each scale and converts them into 10 relative moments, and finally uses the relative moments at all scales as the image feature vector, so that regions and both closed and unclosed structures are handled uniformly.
  • S104: using the initial frame as a reference, perform feature search on the forward and/or backward video frames in the scene segment, determine the area in each searched frame whose feature information matches the feature information of the target area, and automatically mark the determined area in each searched frame.
  • each searched video frame can also be pre-processed, such as image denoising, contrast enhancement, etc., to make the feature information of the matching area in each searched frame more obvious.
  • the mean shift algorithm is a non-parametric method based on density gradient rise, which finds the target position through iterative calculations to achieve target tracking.
  • the so-called tracking is to find the position of the target in the next frame through the known position of the target in the image frame.
  • a significant advantage of the mean shift algorithm is its small amount of calculation; it is simple, easy to implement, and well suited to real-time tracking. Experiments using a kernel histogram to model the target distribution have shown that the mean shift algorithm has good real-time performance.
  • Mean shift has a wide range of applications in clustering, image smoothing, segmentation and tracking.
  • the mean shift algorithm locks onto a local maximum of the probability function iteratively. For example, given a rectangular window framing part of an image, the principle is to find the center of gravity (the weighted average) of the data points in the predefined window; the algorithm moves the center of the window to that center of gravity and repeats the process until the window's center of gravity converges to a stable point. The quality of the iteration result therefore depends on the input probability map and the initial position of the predefined window.
  • the overall tracking procedure of the mean shift algorithm is: set the initial tracking target, that is, frame the target to be tracked; obtain the histogram of the H (hue) channel of the target region in HSV space; normalize the histogram; for each newly arriving frame, back-project the histogram onto the frame image; then apply the mean shift and update the tracking position.
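These steps map almost directly onto OpenCV's mean shift API. A minimal sketch follows, assuming an initial box (x, y, w, h) around the target in the first frame of the segment; the helper name is hypothetical.

```python
import cv2

def mean_shift_track(video_path, init_box):
    """Track init_box = (x, y, w, h) through a video with mean shift."""
    cap = cv2.VideoCapture(video_path)
    ok, frame = cap.read()
    x, y, w, h = init_box
    hsv_roi = cv2.cvtColor(frame[y:y + h, x:x + w], cv2.COLOR_BGR2HSV)
    # histogram of the H (hue) channel of the target region, normalized
    roi_hist = cv2.calcHist([hsv_roi], [0], None, [180], [0, 180])
    cv2.normalize(roi_hist, roi_hist, 0, 255, cv2.NORM_MINMAX)
    term = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 1)
    boxes = [init_box]
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
        back_proj = cv2.calcBackProject([hsv], [0], roi_hist, [0, 180], 1)
        _, track_window = cv2.meanShift(back_proj, (x, y, w, h), term)
        x, y, w, h = track_window
        boxes.append(track_window)   # automatic label for this frame
    cap.release()
    return boxes
```

The returned boxes are exactly the automatically marked areas described above, one per searched frame.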
  • the Kalman filter overcomes the shortcomings of the Wiener filter, which needs an infinite amount of past data and has difficulty guaranteeing real-time performance. The filtered result can never be made exactly equal to the true result, only approximated; Kalman filtering takes the minimum mean square error as its criterion and introduces a state-space model for recursive estimation. Kalman filters are often used in navigation, radar, surveillance, and other fields involving target tracking. The basic process is to adopt state-space models of the signal and the noise, proceed recursively in the order of prediction, actual measurement, and correction, estimate the state variable at the current moment from the information of the previous moment, and adjust the model of the previous moment using the real observations.
  • a typical application of the Kalman filter is to predict the state of the target at the next moment from a limited set of observations that include the target position and noise.
  • target tracking is the process of selecting, from the multiple foreground blocks detected in the current frame, the one corresponding to the determined target, thereby obtaining the target's trajectory.
  • in Kalman-filter target tracking, the Kalman filter predicts the change of the position of the target center, and the target is then located precisely through multi-feature matching.
  • tracking a target with a Kalman filter is divided into four main steps: first, compute feature points such as the target center, SIFT features, and the color histogram from the target detection result; second, set a prediction area around the Kalman-predicted position in the next frame and match the eligible candidate targets in this area one by one; third, define similarity functions over features such as SIFT features, color histograms, and target centers, and select the best-matching target; fourth, optimize the Kalman filter parameters according to the target state (such as normal tracking, tracking loss, fusion and splitting, and targets entering or leaving).
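A minimal sketch of the predict-then-correct loop with a constant-velocity model, assuming OpenCV's cv2.KalmanFilter; the detector that supplies measured centers (the feature matching described above) is outside the sketch, and the noise covariances are illustrative.

```python
import cv2
import numpy as np

def make_center_tracker():
    """2-D constant-velocity Kalman filter over the target center (x, y)."""
    kf = cv2.KalmanFilter(4, 2)   # state: x, y, vx, vy; measurement: x, y
    kf.transitionMatrix = np.array([[1, 0, 1, 0],
                                    [0, 1, 0, 1],
                                    [0, 0, 1, 0],
                                    [0, 0, 0, 1]], np.float32)
    kf.measurementMatrix = np.array([[1, 0, 0, 0],
                                     [0, 1, 0, 0]], np.float32)
    kf.processNoiseCov = np.eye(4, dtype=np.float32) * 1e-2
    kf.measurementNoiseCov = np.eye(2, dtype=np.float32) * 1e-1
    return kf

def track_step(kf, measured_center):
    """One predict-correct cycle; returns the current (x, y) estimate."""
    prediction = kf.predict()           # prior estimate for this frame
    if measured_center is not None:     # correct only when a match was found
        kf.correct(np.array(measured_center, np.float32).reshape(2, 1))
        return kf.statePost[:2].ravel()
    return prediction[:2].ravel()
```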
  • Particle filtering is a non-parametric Monte Carlo simulation method to achieve recursive Bayesian filtering. It is suitable for any nonlinear system that can be described by a state space model, and its accuracy can approach the optimal estimation.
  • Particle filters are simple and easy to implement. They provide an effective solution for analyzing nonlinear dynamic systems, and are widely used in target tracking, signal processing, and automatic control.
  • the core idea of the particle filter algorithm is to approximate the posterior probability density function by the weighted sum of a set of random samples, replacing the integral operation with summation. The algorithm derives from the Monte Carlo idea: the frequency of an event is used to stand in for its probability.
  • prediction stage: the particle filter first generates a large number of samples from the state transition function; these samples are called particles, and their weighted sum is used to approximate the posterior probability density;
  • correction stage: as observations arrive in sequence, an importance weight is calculated for each particle; this weight represents the probability of obtaining the observation when the predicted pose takes the value of the i-th particle. All particles are evaluated in this way: the more likely a particle is to have produced the observation, the higher the weight it receives;
  • resampling stage: the particles are redrawn in proportion to their weights. Since the number of particles approximating the continuous distribution is limited, this step is very important. In the next round of filtering, the resampled particle set is fed into the state transition equation to obtain new predicted particles;
  • map estimation: for each sampled particle, the corresponding map estimate is calculated from the sampled trajectory and the observations.
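A compact NumPy sketch of one prediction / correction / resampling cycle for a 2-D target position; the Gaussian motion and observation models and their standard deviations are illustrative assumptions, not the patent's specification.

```python
import numpy as np

rng = np.random.default_rng(0)

def particle_filter_step(particles, weights, observation,
                         motion_std=5.0, obs_std=10.0):
    """One cycle of prediction, importance weighting and resampling.

    particles: (N, 2) array of candidate target positions.
    observation: measured (x, y) position for the current frame.
    """
    n = len(particles)
    # prediction: diffuse particles with the (assumed) motion model
    particles = particles + rng.normal(0.0, motion_std, particles.shape)
    # correction: weight each particle by the observation likelihood
    d2 = np.sum((particles - observation) ** 2, axis=1)
    weights = weights * np.exp(-0.5 * d2 / obs_std ** 2)
    weights = weights / (weights.sum() + 1e-300)
    # resampling: redraw particles in proportion to their weights
    idx = rng.choice(n, size=n, p=weights)
    particles = particles[idx]
    weights = np.full(n, 1.0 / n)
    estimate = particles.mean(axis=0)   # posterior mean as the state estimate
    return particles, weights, estimate
```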
  • if a searched frame contains no area whose feature information matches that of the target area, target feature information is acquired from nearby frames, the area in the searched frame whose feature information matches the target feature information is determined, and that area is automatically marked.
  • concretely, if a searched frame does not match the features extracted from the initial frame, the feature change of the target object in the current frame (that is, the searched frame) exceeds the threshold and cannot be matched. In that case, a frame that successfully matched the features of the initial frame can be selected from the previous frame or the previous few frames of the current frame, and feature matching and automatic labeling of the current frame are performed again according to the feature information of the marked area in the selected frame. If the feature information of the marked areas in the preceding frames still cannot be matched with the current frame, a frame that successfully matched the features of the initial frame can be selected from the following frame or frames of the current frame, and the current frame is feature-matched and automatically labeled again.
  • if the current frame is the last frame of the current scene segment, the video frames of the next scene segment can be used for feature matching; if the current frame is the first frame of the current scene segment, feature matching can be performed in the previous scene segment. If a matching feature is still not found, the median of the feature-point coordinates of the frames before and after the current frame can be taken as the feature-point coordinates of the current frame, and the area in the current frame is then adjusted and labeled manually.
  • if several consecutive frames cannot be matched, the feature-point coordinates of the middle frame of these consecutive frames can be estimated first, and then the coordinates of the frames lying between already-estimated frames and the matched frames before and after them are estimated in turn, taking medians, until all frames are estimated; the areas in these consecutive frames are then adjusted and labeled manually. Alternatively, after the feature-point coordinates of the middle frame have been estimated, the area in the middle frame can be manually adjusted and labeled, the feature information of the newly labeled area extracted from the middle frame, and the preceding and following frames then automatically matched and labeled from it.
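A sketch of the median fallback just described, assuming per-frame feature-point arrays of equal shape; the names and the surrounding bookkeeping are hypothetical.

```python
import numpy as np

def fill_unmatched_frames(points_by_frame):
    """Fill None entries with the median of the nearest matched neighbors.

    points_by_frame: list where entry i is an array of (x, y) feature-point
    coordinates for frame i (same shape in every frame), or None if
    matching failed on that frame.
    """
    filled = list(points_by_frame)
    for i, pts in enumerate(filled):
        if pts is not None:
            continue
        prev_pts = next((filled[j] for j in range(i - 1, -1, -1)
                         if filled[j] is not None), None)
        next_pts = next((filled[j] for j in range(i + 1, len(filled))
                         if filled[j] is not None), None)
        neighbors = [p for p in (prev_pts, next_pts) if p is not None]
        if neighbors:
            # median of the surrounding matched frames' coordinates
            filled[i] = np.median(np.stack(neighbors), axis=0)
    return filled
```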
  • finally, the image of each marked video frame can be extracted as a training sample. Since a scene segment contains a large number of video frames, a large number of labeled image training samples can be obtained from each scene segment.
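To make the extraction step concrete, here is a sketch that writes each frame and its tracked box to disk as a labeled training pair; the file layout and naming scheme are hypothetical, and `boxes` would come from a tracking loop such as the mean shift sketch above.

```python
import csv
import cv2

def export_training_samples(video_path, boxes, out_prefix="sample"):
    """Save each frame with its tracked box (x, y, w, h) as a training pair."""
    cap = cv2.VideoCapture(video_path)
    with open(f"{out_prefix}_labels.csv", "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["image", "x", "y", "w", "h"])
        for i, box in enumerate(boxes):
            ok, frame = cap.read()
            if not ok:
                break
            name = f"{out_prefix}_{i:06d}.png"
            cv2.imwrite(name, frame)        # the sample image
            writer.writerow([name, *box])   # its automatic annotation
    cap.release()
```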
  • in summary, the solution provided by the present invention first annotates an initial frame in a scene segment of the video, and then uses target tracking technology to automatically annotate the other video frames in the entire scene segment, thereby obtaining a large number of annotated images as training samples for subsequently building a target recognition model.
  • in the prior art, manual annotation is performed on a large number of individually acquired pictures, and the cost of image acquisition and annotation is relatively high.
  • the present invention only requires shooting a video, so the acquisition of annotation material is convenient; a large number of automatically marked samples can then be collected from the video, which reduces the cost of sample labeling and improves the efficiency of the labeling process.
  • the present invention also provides a device for obtaining training samples.
  • the device includes:
  • the obtaining module 201 is used to obtain scene fragments in the video
  • the first labeling module 202 is configured to select a video frame containing a target object in the scene fragment as an initial frame, and label the target area where the target object is located in the initial frame;
  • the first extraction module 203 is configured to extract feature information of the target area marked in the initial frame
  • the second labeling module 204 is configured to perform feature search on the forward and/or backward video frames in the scene segment with the initial frame as a reference, determine the area in each searched frame whose feature information matches the feature information of the target area, and automatically mark the determined area in each searched frame;
  • the second extraction module 205 is configured to extract the marked images of each video frame in the scene segment as training samples.
  • the obtaining module 201 is specifically used for:
  • if the video is a single-scene video, using the video as one scene segment;
  • if the video is a multi-scene video, using scene switching detection technology to divide the video into multiple scene segments.
  • the scene switching detection technology includes: a detection algorithm based on a pixel domain and a detection algorithm based on a compressed domain.
  • the device further includes:
  • the preprocessing module is configured to perform image preprocessing on the initial frame before the first extraction module 203 extracts the feature information of the target area marked in the initial frame, so that the feature information of the target area in the initial frame is more obvious.
  • the feature information of the target area includes one or more of color features, texture features, and shape features.
  • the second labeling module 204 performs feature search on the forward and/or backward video frames in the scene segment, specifically by:
  • using a mean shift algorithm, a Kalman filter algorithm, or a particle filter algorithm to perform feature search on the forward and/or backward video frames in the scene segment.
  • the second labeling module 204 is further configured to: if there is no area in a searched frame whose feature information matches the feature information of the target area, acquire target feature information, determine the area in that searched frame whose feature information matches the target feature information, and automatically mark the determined area;
  • wherein the target feature information is the feature information of the already-marked areas in a preset number of frames adjacent to the searched frame.
  • the present invention also provides an electronic device, as shown in FIG. 3, including a processor 301, a communication interface 302, a memory 303, and a communication bus 304.
  • the processor 301, the communication interface 302, and the memory 303 communicate with each other through the communication bus 304;
  • the memory 303 is used to store computer programs
  • the processor 301 is configured to implement the following steps when executing the program stored in the memory 303: obtaining a scene segment in a video; selecting a video frame containing a target object in the scene segment as an initial frame and marking the target area where the target object is located; extracting the feature information of the marked target area; performing a feature search on the forward and/or backward video frames in the scene segment with the initial frame as a reference, determining the area in each searched frame whose feature information matches that of the target area, and automatically marking the determined area; and
  • extracting the image of each marked video frame in the scene segment as a training sample.
  • the communication bus mentioned in the above electronic device may be a Peripheral Component Interconnect (PCI) bus or an Extended Industry Standard Architecture (EISA) bus.
  • the communication bus can be divided into an address bus, a data bus, a control bus, and so on. For ease of representation, only one thick line is drawn in the figure, but this does not mean that there is only one bus or one type of bus.
  • the communication interface is used for communication between the aforementioned electronic device and other devices.
  • the memory may include random access memory (RAM), and may also include non-volatile memory (NVM), such as at least one disk storage.
  • the memory may also be at least one storage device located far away from the foregoing processor.
  • the foregoing processor may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), etc.; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • the present invention also provides a computer-readable storage medium in which a computer program is stored, and when the computer program is executed by a processor, the method steps of the above-mentioned training sample obtaining method are realized.

Abstract

The present invention provides a training sample obtaining method and apparatus, an electronic device and a storage medium. The method comprises: obtaining a scene segment in a video; selecting a video frame comprising a target object from the scene segment as an initial frame, and marking a target area where the target object in the initial frame is located; extracting feature information of the marked target area in the initial frame; performing feature search on forward and/or backward video frames in the scene segment by taking the initial frame as a reference, determining the area in each searched frame of which the feature information matches the feature information of the target area, and automatically marking the determined area in each searched frame; and extracting the image of each marked video frame in the scene segment as a training sample. The present invention can solve the problems in the prior art of low efficiency and high cost of image training sample acquisition.

Description

Method, apparatus, electronic device and storage medium for obtaining training samples

Technical field

The present invention relates to the field of machine learning technology, and in particular to a method, apparatus, electronic device and computer-readable storage medium for obtaining training samples.

Background art

Building an artificial intelligence recognition model requires a large number of training samples, generally in picture format. However, to meet the training requirements it is usually necessary to acquire a large number of pictures as training samples, and when labeling, the targets in each picture must be annotated individually, which is inefficient and costly.

Summary of the invention

The purpose of the present invention is to provide a method, an apparatus, an electronic device and a computer-readable storage medium for obtaining training samples, to solve the problems of low efficiency and high cost of obtaining image training samples in the prior art.
To solve the above technical problems, the present invention provides a method for obtaining training samples, including:
obtaining a scene segment in a video;
selecting a video frame containing a target object in the scene segment as an initial frame, and marking the target area in the initial frame where the target object is located;
extracting the feature information of the marked target area in the initial frame;
using the initial frame as a reference, performing a feature search on the forward and/or backward video frames in the scene segment, determining the area in each searched frame whose feature information matches the feature information of the target area, and automatically marking the determined area in each searched frame;
extracting the image of each marked video frame in the scene segment as a training sample.
Optionally, obtaining the scene segment in the video includes:
if the video is a single-scene video, using the video as one scene segment;
if the video is a multi-scene video, using scene switching detection technology to divide the video into multiple scene segments.
Optionally, the scene switching detection technology includes a pixel-domain-based detection algorithm and/or a compressed-domain-based detection algorithm.
Optionally, before extracting the feature information of the target area marked in the initial frame, the method further includes:
performing image preprocessing on the initial frame to make the feature information of the target area in the initial frame more obvious.
Optionally, the feature information of the target area includes one or more of color features, texture features, and shape features.
Optionally, the step of performing feature search on the forward and/or backward video frames in the scene segment includes:
using a mean shift algorithm, a Kalman filter algorithm, or a particle filter algorithm to perform feature search on the forward and/or backward video frames in the scene segment.
Optionally, the method further includes:
if there is no area in a searched frame whose feature information matches the feature information of the target area, acquiring target feature information, determining the area in the searched frame whose feature information matches the target feature information, and automatically marking the determined area in the searched frame;
wherein the target feature information is the feature information of the already-marked areas in a preset number of frames adjacent to the searched frame.
The present invention also provides an apparatus for obtaining training samples, including:
an obtaining module, used to obtain scene segments in a video;
a first labeling module, used to select a video frame containing a target object in the scene segment as an initial frame and to label the target area in the initial frame;
a first extraction module, used to extract the feature information of the target area marked in the initial frame;
a second labeling module, used to perform feature search on the forward and/or backward video frames in the scene segment with the initial frame as a reference, determine the area in each searched frame whose feature information matches the feature information of the target area, and automatically mark the determined area in each searched frame;
a second extraction module, used to extract the images of the marked video frames in the scene segment as training samples.
Optionally, the obtaining module obtaining the scene segments in the video includes:
if the video is a single-scene video, using the video as one scene segment;
if the video is a multi-scene video, using scene switching detection technology to divide the video into multiple scene segments.
Optionally, the scene switching detection technology includes a pixel-domain-based detection algorithm and/or a compressed-domain-based detection algorithm.
Optionally, the apparatus further includes:
a preprocessing module, used to perform image preprocessing on the initial frame before the first extraction module extracts the feature information of the target area marked in the initial frame, so that the feature information of the target area in the initial frame is more obvious.
Optionally, the feature information of the target area includes one or more of color features, texture features, and shape features.
Optionally, the second extraction module performing feature search on the forward and/or backward video frames in the scene segment includes:
using a mean shift algorithm, a Kalman filter algorithm, or a particle filter algorithm to perform feature search on the forward and/or backward video frames in the scene segment.
Optionally, the second extraction module is further used for:
if there is no area in a searched frame whose feature information matches the feature information of the target area, acquiring target feature information, determining the area in the searched frame whose feature information matches the target feature information, and automatically marking the determined area in the searched frame;
wherein the target feature information is the feature information of the already-marked areas in a preset number of frames adjacent to the searched frame.
The present invention also provides an electronic device, including a processor, a communication interface, a memory, and a communication bus, wherein the processor, the communication interface, and the memory communicate with each other through the communication bus; wherein,
the memory is used to store a computer program;
the processor is used to implement the training sample obtaining method described in any one of the above when executing the computer program stored in the memory.
The present invention also provides a computer-readable storage medium in which a computer program is stored; when the computer program is executed by a processor, the training sample obtaining method described in any one of the above is implemented.
In the solution provided by the present invention, the initial frame in a scene segment of the video is annotated first, and target tracking technology is then used to automatically annotate the other video frames in the entire scene segment, thereby obtaining a large number of annotated images as training samples for subsequently building a target recognition model. In the prior art, manual annotation is performed on a large number of individually acquired pictures, so the cost of image acquisition and annotation is relatively high. In contrast, the present invention only requires shooting a video, so the acquisition of annotation material is convenient; a large number of automatically marked samples can then be collected from the video, which reduces the cost of sample labeling and improves the efficiency of the labeling process.
Description of the drawings

FIG. 1 is a schematic flowchart of a method for obtaining training samples according to an embodiment of the present invention;
FIG. 2 is a schematic structural diagram of an apparatus for obtaining training samples according to an embodiment of the present invention;
FIG. 3 is a structural block diagram of an electronic device provided by an embodiment of the present invention.
Detailed description

The method, apparatus, electronic device, and computer-readable storage medium for obtaining training samples proposed by the present invention are described in further detail below in conjunction with the accompanying drawings and specific embodiments. The advantages and features of the present invention will become clearer from the claims and the following description.
To solve the problems of the prior art, the embodiments of the present invention provide a method, apparatus, electronic device, and computer-readable storage medium for obtaining training samples.
It should be noted that the training sample obtaining method of the embodiments of the present invention can be applied to the training sample obtaining apparatus of the embodiments of the present invention, and that apparatus can be configured on an electronic device. The electronic device may be a personal computer, a mobile terminal, and the like; the mobile terminal may be a hardware device with any of various operating systems, such as a mobile phone or a tablet computer.
FIG. 1 is a schematic flowchart of a method for obtaining training samples according to an embodiment of the present invention. Referring to FIG. 1, the method may include the following steps.
S101: obtain a scene segment in a video.
A video is generally composed of one or more scene segments, and a scene is composed of multiple video frames. The video on which the present invention operates can be a single-scene video or a multi-scene video. If the video is a single-scene video, since it contains only one scene segment, the video can be used directly as the obtained scene segment and the subsequent processing steps executed.
If the video is a multi-scene video, scene switching detection technology can be used to divide it into multiple scene segments. After the division, a single scene segment may be used, and by performing the subsequent processing steps the images of the marked video frames in that segment are obtained as training samples; alternatively, the subsequent processing steps can be performed for every scene segment, which further increases the number of training samples obtained.
Scene switching detection refers to finding the frames and frame positions in a video where a scene switch occurs. The obtained frame positions can be used for fast and accurate video editing or further processing, and the resulting frame sequence can be used to roughly describe the entire video content.
At present, traditional video scene-switch detection methods generally rely on hand-crafted features, for example computing the color-histogram similarity of adjacent frames, directly computing the frame difference, or detecting scene switches from the degree of change of the high-frequency subband coefficients of each frame in the video (the VH feature); computing high-frequency subband coefficients requires algorithms such as the three-dimensional wavelet transform. These techniques compute a feature value and compare it with a threshold, and a frame whose value crosses the threshold is judged to be a switching frame. There are also adaptive-threshold algorithms based on the above techniques, such as a video scene change detection method based on adaptive thresholds, but the sliding-window size and preset thresholds in such methods still need to be set manually.
In the present invention, the scene switching detection technology can adopt a pixel-domain-based detection algorithm or a compressed-domain-based detection algorithm, with the scene switching threshold set according to the scene, which can improve the speed and accuracy of scene switching detection. For the pixel-domain and compressed-domain detection algorithms, reference may be made to the prior art; they are not repeated here.
S102: select a video frame containing the target object in the scene segment as an initial frame, and mark the target area where the target object is located in the initial frame.
The target object may be any object of interest. For each scene segment, recognition can be performed on the video frames it contains, and a video frame containing the target object is selected as the initial frame for labeling. The first frame in which the target object appears can be selected as the initial frame; if the features of the target object are not obvious in the first frame, a later video frame in which the target object's features are more obvious is chosen instead. The requirements of this step are not strict; it suffices to choose a reasonably good video frame as the initial frame. Its purpose is to mark the target area where the target object is located, so that the feature information of the target area can be extracted and subsequent processing can automatically mark the feature-matching areas in earlier or later video frames through feature search.
Further, before the feature information of the marked target area is extracted in step S103, image preprocessing such as denoising and contrast enhancement can be performed on the initial frame to make the feature information of the target area in the initial frame more obvious.
S103: extract the feature information of the target area marked in the initial frame.
The feature information of the target area may include one or more of color features, texture features, and shape features.
The color feature is a global feature that describes the surface properties of the scene corresponding to an image or image area. Color features are generally pixel-based, so every pixel belonging to the image or image area makes its own contribution. The color histogram is the most commonly used way to express color features: it simply describes the global distribution of colors in an image, that is, the proportion of each color in the whole image. It is especially suitable for describing images that are difficult to segment automatically and images in which the spatial position of objects need not be considered; it is unaffected by image rotation and translation and, with normalization, also unaffected by changes in image scale. The most commonly used color spaces are the RGB and HSV color spaces. Color-histogram feature matching methods include the histogram intersection method, distance method, center-distance method, reference color table method, and cumulative color histogram method.
Texture features are also global features; they too describe the surface properties of the scene corresponding to an image or image area. However, because texture is only a property of an object's surface and cannot fully reflect the object's essential attributes, high-level image content cannot be obtained from texture features alone. Unlike color features, texture features are not pixel-based; they require statistical calculation over a region containing multiple pixels. In pattern matching this regional character is a significant advantage, since matching will not fail because of local deviations. As statistical features, texture features often have rotation invariance and strong resistance to noise.
The description methods for texture features include statistical methods, geometric methods, model methods, and signal processing methods. The typical representative of the statistical methods is a texture feature analysis method called the gray-level co-occurrence matrix. Based on a study of the various statistical features of the co-occurrence matrix, Gotlieb and Kreyszig et al. experimentally identified four key features of the gray-level co-occurrence matrix: energy, inertia, entropy, and correlation. Another typical statistical method extracts texture features from the image's autocorrelation function (that is, the image's energy spectrum function): characteristic parameters such as texture coarseness and directionality are computed from the energy spectrum function.
The geometric method is a texture feature analysis method built on the theory of texture primitives (basic texture elements). Texture primitive theory holds that a complex texture can be composed of a number of simple texture primitives repeated in some regular arrangement. In the geometric method, two comparatively influential algorithms are the Voronoi checkerboard feature method and the structural method.
The model method is based on a structural model of the image and uses the model's parameters as texture features. Typical methods are random-field model methods, such as the Markov random field (MRF) model method and the Gibbs random field model method.
In the signal processing methods, texture feature extraction and matching mainly involve the gray-level co-occurrence matrix, Tamura texture features, the autoregressive texture model, and the wavelet transform. Feature extraction and matching with the gray-level co-occurrence matrix rely mainly on four parameters: energy, inertia, entropy, and correlation. Tamura texture features, based on psychological research into human visual perception of texture, comprise six attributes: coarseness, contrast, directionality, line-likeness, regularity, and roughness. The simultaneous auto-regressive (SAR) texture model is an application instance of the Markov random field (MRF) model.
形状特征的特点是:各种基于形状特征的检索方法都可以比较有效地利用图像中感兴趣的目标来进行检索。通常情况下,形状特征有两类表示方法,一类是轮廓特征,另一类是区域特征。图像的轮廓特征主要针对物体的外边界,而图像的区域特征则关系到整个形状区域。The characteristic of the shape feature is that various retrieval methods based on the shape feature can effectively use the target of interest in the image for retrieval. Generally, there are two types of representation methods for shape features, one is contour features, and the other is regional features. The contour feature of the image is mainly for the outer boundary of the object, while the regional feature of the image is related to the entire shape area.
首先,几种典型的形状特征描述方法有:边界特征法、傅里叶形状描述符法、几何参数法、形状不变矩法。First of all, several typical shape feature description methods are: boundary feature method, Fourier shape descriptor method, geometric parameter method, and shape invariant moment method.
边界特征法,该方法通过对边界特征的描述来获取图像的形状参数。其中Hough变换检测平行直线方法和边界方向直方图方法是经典方法。Hough 变换是利用图像全局特性而将边缘像素连接起来组成区域封闭边界的一种方法,其基本思想是点—线的对偶性;边界方向直方图法首先微分图像求得图像边缘,然后,做出关于边缘大小和方向的直方图,通常的方法是构造图像灰度梯度方向矩阵。Boundary feature method, this method obtains the shape parameters of the image by describing the boundary feature. The Hough transform method for detecting parallel lines and the histogram method for boundary directions are classic methods. Hough transform is a method that uses the global characteristics of the image to connect edge pixels to form a closed boundary of a region. The basic idea is the point-line duality; the boundary direction histogram method first differentiates the image to obtain the edge of the image, and then makes Regarding the histogram of the edge size and direction, the usual method is to construct an image gray gradient direction matrix.
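By way of illustration only, a minimal sketch of the boundary direction histogram described above, built from Sobel gradients in OpenCV; the bin count and the magnitude threshold are assumptions made for this example.

```python
# Illustrative sketch: histogram of edge directions, weighted by edge
# magnitude, over the pixels whose gradient magnitude passes a threshold.
import cv2
import numpy as np

def boundary_direction_histogram(gray, bins=18, mag_thresh=50.0):
    gx = cv2.Sobel(gray, cv2.CV_64F, 1, 0, ksize=3)
    gy = cv2.Sobel(gray, cv2.CV_64F, 0, 1, ksize=3)
    magnitude = np.hypot(gx, gy)
    direction = np.degrees(np.arctan2(gy, gx)) % 180.0  # orientation in [0, 180)
    edge = magnitude > mag_thresh                        # keep strong edges only
    hist, _ = np.histogram(direction[edge], bins=bins, range=(0.0, 180.0),
                           weights=magnitude[edge])
    return hist / max(hist.sum(), 1e-12)                 # normalized histogram
```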
The basic idea of the Fourier shape descriptor method is to use the Fourier transform of the object boundary as the shape description, exploiting the closedness and periodicity of the region boundary to convert a two-dimensional problem into a one-dimensional one. Three shape representations are derived from the boundary points: the curvature function, the centroid distance, and the complex coordinate function.
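By way of illustration only, a minimal sketch of Fourier descriptors computed from the complex coordinate function of a closed boundary; the number of retained coefficients is an assumption made for this example.

```python
# Illustrative sketch: Fourier descriptors from the complex coordinate
# function of a closed contour.
import numpy as np

def fourier_descriptors(contour_xy, n_coeffs=16):
    """contour_xy: (N, 2) array of boundary points of a closed contour."""
    z = contour_xy[:, 0] + 1j * contour_xy[:, 1]   # complex coordinate function
    coeffs = np.fft.fft(z)
    coeffs[0] = 0.0                     # drop the DC term: translation invariance
    coeffs = coeffs / np.abs(coeffs[1])  # scale by the first harmonic: scale invariance
    # Keeping only the magnitudes gives invariance to rotation and to the
    # choice of starting point on the boundary.
    return np.abs(coeffs[1:n_coeffs + 1])
```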
The geometric parameter method is a simpler region-feature description method for shape representation and matching, for example the shape factor method based on quantitative measures of shape (such as moments, area, and perimeter). In the QBIC system (a content-based image retrieval system), geometric parameters such as circularity, eccentricity, principal-axis direction, and algebraic invariant moments are used to perform image retrieval based on shape features. It should be noted that the extraction of shape parameters presupposes image processing and image segmentation; the accuracy of the parameters is inevitably affected by the segmentation quality, and for poorly segmented images the shape parameters may not be extractable at all.
The shape invariant moment method uses the moments of the region occupied by the target as the shape description parameters.
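By way of illustration only, a minimal sketch computing the seven Hu invariant moments of a binary target region with OpenCV; the threshold value is an assumption made for this example.

```python
# Illustrative sketch: Hu invariant moments of a thresholded target region.
import cv2
import numpy as np

def hu_moments(gray):
    _, binary = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)
    moments = cv2.moments(binary, binaryImage=True)
    hu = cv2.HuMoments(moments).flatten()
    # Log-scale the moments, which otherwise span many orders of magnitude.
    return -np.sign(hu) * np.log10(np.abs(hu) + 1e-30)
```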
In addition, for the representation and matching of shape features there are also methods such as the finite element method (FEM), the turning function, and the wavelet descriptor.
Second, there is a shape feature extraction and matching method based on wavelets and relative moments: multi-scale edge images are first obtained from the modulus maxima of the wavelet transform, seven invariant moments are then computed at each scale and converted into ten relative moments, and the relative moments over all scales are taken as the image feature vector, thereby handling regions and both closed and non-closed structures in a unified way.
S104: Using the initial frame as a reference, a feature search is performed on the forward and/or backward video frames in the scene segment, the region whose feature information matches the feature information of the target region is determined in each searched frame, and the region so determined in each searched frame is automatically labeled.
That is, based on the feature information extracted from the initial frame, a forward and/or backward feature search is performed on the video frames in the scene segment; in each searched frame, the region that can match the feature information extracted from the initial frame is determined, and the matched region is then automatically labeled, which achieves target tracking and automatic labeling within the scene segment. In addition, before the feature search, each searched video frame may also be preprocessed, for example by image denoising or contrast enhancement, so that the feature information of the matching region in each searched frame becomes more distinct.
In practical applications, algorithms such as mean shift, Kalman filtering, and particle filtering can be used for the feature search.
The mean shift algorithm is a non-parametric method based on density gradient ascent; it finds the target position through iterative computation and thereby achieves target tracking. Tracking here means finding the position of the target in the next frame from its known position in the current image frame. A significant advantage of the mean shift algorithm is its small computational load: it is simple, easy to implement, and well suited to real-time tracking. Experiments in which kernel histograms are used to model the target distribution have shown that the mean shift algorithm has good real-time characteristics. Mean shift is widely applied in clustering, image smoothing, segmentation, and tracking.
The mean shift algorithm locks onto a local maximum of a probability function iteratively. Suppose a rectangular window frames some part of an image: the principle is to find the centroid, or weighted mean, of the data points within the predefined window. The algorithm moves the window center to the centroid of the data points and repeats this process until the window centroid converges to a stable point. The quality of the iterated result therefore depends on the input probability map (the predefined window above) and on its initial position.
The complete tracking procedure of the mean shift algorithm comprises: setting the initial tracking target, that is, framing the target to be tracked; obtaining the histogram of the hue (H) channel image, in HSV space, of the target to be tracked; normalizing this histogram; back-projecting the histogram onto each new frame image; and applying mean shift to update the tracked position.
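By way of illustration only, the steps just listed map onto the classic OpenCV back-projection plus meanShift loop sketched below; the video path and the initial window coordinates are assumptions made for this example.

```python
# Illustrative sketch: hue-histogram back-projection plus cv2.meanShift,
# following the tracking steps listed above.
import cv2

cap = cv2.VideoCapture("scene_clip.mp4")        # assumed video path
ok, frame = cap.read()
x, y, w, h = 200, 150, 80, 80                    # assumed initial target window
roi = frame[y:y + h, x:x + w]

hsv_roi = cv2.cvtColor(roi, cv2.COLOR_BGR2HSV)
roi_hist = cv2.calcHist([hsv_roi], [0], None, [180], [0, 180])
cv2.normalize(roi_hist, roi_hist, 0, 255, cv2.NORM_MINMAX)

criteria = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 1.0)
window = (x, y, w, h)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    back_proj = cv2.calcBackProject([hsv], [0], roi_hist, [0, 180], 1)
    _, window = cv2.meanShift(back_proj, window, criteria)  # updated position
    print("tracked window:", window)
cap.release()
```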
Kalman filtering overcomes the drawback of the Wiener filter, which requires an unbounded amount of past data and therefore struggles to guarantee real-time performance. It is impossible for the filtered result to equal the true result exactly; only an approximation is possible. Kalman filtering takes the minimum mean-square error as its criterion and introduces a state-space model for recursive estimation. Kalman filters are frequently used in fields involving target tracking, such as navigation, radar, and surveillance. The basic process is as follows: a state-space model of signal and noise is adopted, and recursion proceeds in a "predict-measure-correct" order, using the information of the previous instant to estimate the state variables at the current instant and using the true observations at the current instant to adjust the model from the previous instant. A typical application of the Kalman filter is predicting the state of a target at the next instant from a limited set of noisy observations of its position. In surveillance video, target tracking is the process of selecting, among the multiple foreground blobs detected in the current frame, the one that corresponds to an already determined target, thereby obtaining the target's motion trajectory. In this process, the Kalman filter is used to predict changes in the position and the target center, after which the target is precisely located through multi-feature matching; this is Kalman-filter target tracking. In general, tracking a target with a Kalman filter involves four main steps: first, based on the target detection result, compute feature points such as the target center, SIFT features, and the color histogram; second, set a prediction region around the Kalman-predicted position in the next frame and match qualifying candidate targets within this region one by one; third, define similarity functions for the SIFT features, the color histogram, the target center, and other features, and select the best match; fourth, optimize the Kalman filter parameters according to the target's state (for example, normal tracking, tracking loss, merging and splitting, or target entry and exit).
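As a hedged illustration of the predict-measure-correct recursion described above, the following sketch implements a constant-velocity Kalman filter for a target center in NumPy; the noise covariances Q and R are assumptions chosen for this example.

```python
# Illustrative sketch: one predict-correct cycle of a constant-velocity
# Kalman filter tracking a target center (x, y).
import numpy as np

dt = 1.0                                   # one frame per step
F = np.array([[1, 0, dt, 0],               # state transition: [x, y, vx, vy]
              [0, 1, 0, dt],
              [0, 0, 1, 0],
              [0, 0, 0, 1]], dtype=float)
H = np.array([[1, 0, 0, 0],                # only the center position is observed
              [0, 1, 0, 0]], dtype=float)
Q = np.eye(4) * 1e-2                       # process noise (assumed)
R = np.eye(2) * 1.0                        # measurement noise (assumed)

x = np.zeros(4)                            # initial state
P = np.eye(4) * 10.0                       # initial uncertainty

def kalman_step(z):
    """One predict-correct cycle for a measured center z = (cx, cy)."""
    global x, P
    x = F @ x                              # predict
    P = F @ P @ F.T + Q
    y = z - H @ x                          # innovation (measurement residual)
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)         # Kalman gain
    x = x + K @ y                          # correct
    P = (np.eye(4) - K @ H) @ P
    return x[:2]                           # filtered center estimate
```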
Particle filtering implements recursive Bayesian filtering through non-parametric Monte Carlo simulation; it is applicable to any nonlinear system that can be described by a state-space model, and its accuracy can approach that of the optimal estimate. Particle filters are simple and easy to implement; they provide an effective solution for analyzing nonlinear dynamic systems and are therefore widely used in target tracking, signal processing, and automatic control. The core idea of the particle filter algorithm is to approximate the posterior probability density function by a weighted sum of a set of random samples, replacing the integration operation by a summation. The algorithm derives from the Monte Carlo idea of using the frequency of an event to stand for its probability. Hence, wherever probabilities are needed in the filtering process, the variables are sampled, and a large number of samples with their corresponding weights approximate the probability density function. The most common particle filter algorithm is the SIR (Sampling Importance Resampling) filter, which proceeds in the following four steps (a minimal sketch follows the list):
1) Prediction stage: the particle filter first generates a large number of samples according to the state-transition function; these samples are called particles, and their weighted sum is used to approximate the posterior probability density.
2) Correction stage: as the observations arrive in sequence, a corresponding importance weight is computed for each particle; this weight represents the probability of obtaining the observation when the predicted pose takes the value of the i-th particle. In this way all particles are evaluated, and the more likely a particle is to have produced the observation, the higher the weight it receives.
3) Resampling stage: the sampled particles are redistributed in proportion to their weights. Because only a finite number of particles approximate the continuous distribution, this step is very important. In the next round of filtering, feeding the resampled particle set into the state-transition equation yields new predicted particles.
4) Map estimation: for each sampled particle, the corresponding map estimate is computed from its sampled trajectory and the observations.
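By way of illustration only, the sketch below implements steps 1) to 3) of the SIR filter above for a one-dimensional state in NumPy (step 4, map estimation, is omitted); the particle count and the noise levels are assumptions chosen for this example.

```python
# Illustrative sketch: a one-dimensional SIR particle filter with the
# predict / correct / resample stages described above.
import numpy as np

rng = np.random.default_rng(0)
n_particles = 500
particles = rng.normal(0.0, 1.0, n_particles)   # initial particle set
weights = np.full(n_particles, 1.0 / n_particles)

def sir_step(observation, motion_std=0.5, obs_std=1.0):
    global particles, weights
    # 1) Prediction: propagate each particle through the state-transition model.
    particles = particles + rng.normal(0.0, motion_std, n_particles)
    # 2) Correction: weight each particle by the observation likelihood.
    weights = np.exp(-0.5 * ((observation - particles) / obs_std) ** 2)
    weights /= weights.sum()
    # 3) Resampling: redraw particles in proportion to their weights.
    idx = rng.choice(n_particles, size=n_particles, p=weights)
    particles = particles[idx]
    weights.fill(1.0 / n_particles)
    # State estimate: the mean of the (now uniformly weighted) particles.
    return particles.mean()
```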
Further, if a given searched frame contains no region whose feature information matches the feature information of the target region, target feature information is obtained, a region in the searched frame whose feature information matches the target feature information is determined, and the region so determined in the searched frame is automatically labeled; here, the target feature information is the feature information of the already labeled regions in a preset number of frames adjacent to the searched frame. In other words, if a searched frame fails to match the feature information extracted from the initial frame, the feature information of the successfully matched regions in several adjacent frames is used to perform feature matching and labeling on that searched frame again.
It can be understood that when a searched frame fails to match the features extracted from the initial frame, the target object's feature change in the current frame (i.e., the searched frame) has exceeded the threshold, so matching is impossible. In that case, a frame whose features were successfully matched to those of the initial frame can be selected from the frame or frames immediately preceding the current frame, and feature matching and automatic labeling can be performed on the current frame again, based on the feature information of the labeled region in the selected frame. If the feature information of the labeled regions in the preceding frames still cannot be matched to the current frame, a frame successfully matched to the initial frame can be selected from the frame or frames following the current frame, and feature matching and automatic labeling performed on the current frame again. In addition, if the current frame is the last frame of the current scene segment, the video frames of the next scene segment can be used for feature matching; similarly, if the current frame is the first frame of the current scene segment, feature matching can be performed in the previous scene segment. If a matching feature still cannot be found, the median of the feature-point coordinates of the frames before and after the current frame can be taken as the feature-point coordinates of the current frame, and the labeled region in the current frame can then be adjusted by manual processing.
If several consecutive frames all fail to match the features extracted from the initial frame, the feature-point coordinates of the middle frame of these consecutive frames can be estimated first, and then the median frame coordinates between the preceding/following frames and the middle frame are estimated in turn until all frames have been estimated, after which the regions in these consecutive frames are adjusted and labeled by manual processing. Alternatively, after the feature-point coordinates of the middle frame have been estimated, the region in the middle frame can first be adjusted and labeled manually, the feature information of the newly labeled region in the middle frame can be extracted, and automatic target matching and labeling can then be performed on the preceding and following frames. A minimal sketch of this median fallback follows.
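By way of illustration only, the sketch below estimates a missing labeled region from the regions of its neighboring frames; the (x, y, w, h) box format and the helper name are assumptions made for this example, not structures prescribed by the embodiment.

```python
# Illustrative sketch: the median fallback described above, estimating an
# unmatched frame's bounding box from the boxes of neighboring frames.
import numpy as np

def median_box(prev_box, next_box):
    """Per-coordinate median (here, the midpoint) of the neighbors' boxes."""
    stacked = np.stack([np.asarray(prev_box, float), np.asarray(next_box, float)])
    return np.median(stacked, axis=0)

# e.g. the box for an unmatched frame between frames t-1 and t+1:
print(median_box((100, 80, 40, 40), (108, 84, 40, 40)))  # -> [104. 82. 40. 40.]
```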
S105: The image of each labeled video frame in the scene segment is extracted as a training sample.
After each video frame in the scene segment has been labeled, the images of the labeled video frames can be extracted as training samples. Because a scene segment contains a large number of video frames, a large number of labeled image training samples can be obtained from each scene segment.
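By way of illustration only, a minimal sketch of exporting the labeled frames as training samples; the annotation structure and the output layout are assumptions made for this example, not prescribed by the embodiment.

```python
# Illustrative sketch: writing each labeled frame plus its labeled region
# to disk as an image/label pair.
import cv2
import os

def export_samples(video_path, annotations, out_dir):
    """annotations: dict mapping frame index -> (x, y, w, h) labeled region."""
    os.makedirs(out_dir, exist_ok=True)
    cap = cv2.VideoCapture(video_path)
    index = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if index in annotations:
            x, y, w, h = annotations[index]
            cv2.imwrite(os.path.join(out_dir, f"sample_{index:06d}.png"), frame)
            with open(os.path.join(out_dir, f"sample_{index:06d}.txt"), "w") as f:
                f.write(f"{x} {y} {w} {h}\n")   # one labeled target region
        index += 1
    cap.release()
```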
In summary, in the solution provided by the present invention, the initial frame within a scene segment of a video is labeled first, and target-tracking technology is then used to automatically label the other video frames in the entire scene segment, thereby obtaining a large number of labeled images to serve as training samples for subsequently building a target recognition model. In the prior art, a large number of pictures are acquired and labeled manually, so the cost of acquiring and labeling the pictures is high; by contrast, the present invention can make use of a captured video, so that material is convenient and easy to obtain, and a large number of automatically labeled samples can then be collected from the video, which reduces the cost of sample labeling and improves the efficiency of the labeling process.
Corresponding to the above method for obtaining training samples, the present invention further provides a device for obtaining training samples. As shown in FIG. 2, the device includes:
an obtaining module 201, configured to obtain scene segments in a video;
a first labeling module 202, configured to select a video frame containing a target object in the scene segment as an initial frame, and to label the target region where the target object is located in the initial frame;
a first extraction module 203, configured to extract feature information of the target region labeled in the initial frame;
a second labeling module 204, configured to perform, using the initial frame as a reference, a feature search on the forward and/or backward video frames in the scene segment, determine in each searched frame the region whose feature information matches the feature information of the target region, and automatically label the region determined in each searched frame;
a second extraction module 205, configured to extract the images of the labeled video frames in the scene segment as training samples.
Optionally, the obtaining module 201 is specifically configured to:
if the video is a single-scene video, treat the video as one scene segment;
if the video is a multi-scene video, divide the video into multiple scene segments using scene-change detection technology.
Optionally, the scene-change detection technology includes a pixel-domain-based detection algorithm and/or a compressed-domain-based detection algorithm.
Optionally, the device further includes:
a preprocessing module, configured to perform image preprocessing on the initial frame before the first extraction module 203 extracts the feature information of the target region labeled in the initial frame, so as to make the feature information of the target region in the initial frame more distinct.
Optionally, the feature information of the target region includes one or more of a color feature, a texture feature, and a shape feature.
Optionally, the second labeling module 204 performs the feature search on the forward and/or backward video frames in the scene segment specifically by:
using a mean shift algorithm, a Kalman filtering algorithm, or a particle filtering algorithm to perform the feature search on the forward and/or backward video frames in the scene segment.
Optionally, the second labeling module 204 is further configured to:
if a given searched frame contains no region whose feature information matches the feature information of the target region, obtain target feature information, determine a region in the searched frame whose feature information matches the target feature information, and automatically label the region determined in the searched frame;
where the target feature information is the feature information of the already labeled regions in a preset number of frames adjacent to the searched frame.
The present invention further provides an electronic device. As shown in FIG. 3, the electronic device includes a processor 301, a communication interface 302, a memory 303, and a communication bus 304, where the processor 301, the communication interface 302, and the memory 303 communicate with one another via the communication bus 304;
the memory 303 is configured to store a computer program;
the processor 301 is configured to implement the following steps when executing the program stored in the memory 303:
obtaining scene segments in a video;
selecting a video frame containing a target object in the scene segment as an initial frame, and labeling the target region where the target object is located in the initial frame;
extracting feature information of the target region labeled in the initial frame;
using the initial frame as a reference, performing a feature search on the forward and/or backward video frames in the scene segment, determining in each searched frame the region whose feature information matches the feature information of the target region, and automatically labeling the region determined in each searched frame;
extracting the images of the labeled video frames in the scene segment as training samples.
For the specific implementation of the steps of this method and the related explanations, reference may be made to the method embodiment shown in FIG. 1 above, which is not repeated here.
In addition, the other implementations of the training sample obtaining method realized by the processor 301 executing the program stored in the memory 303 are the same as the implementations mentioned in the foregoing method embodiments and are likewise not repeated here.
The communication bus of the above electronic device may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus can be divided into an address bus, a data bus, a control bus, and so on. For ease of representation, only one thick line is used in the figure, but this does not mean that there is only one bus or one type of bus.
The communication interface is used for communication between the above electronic device and other devices.
The memory may include a random access memory (RAM) and may also include a non-volatile memory (NVM), for example at least one disk memory. Optionally, the memory may also be at least one storage device located away from the aforementioned processor.
The above processor may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), and the like; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
The present invention further provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the method steps of the above method for obtaining training samples.
It should be noted that, herein, relational terms such as first and second are used only to distinguish one entity or operation from another and do not necessarily require or imply any such actual relationship or order between those entities or operations. Moreover, the terms "comprise", "include", or any other variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device comprising a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article, or device comprising that element.
The foregoing description is merely a description of preferred embodiments of the present invention and does not limit the scope of the present invention in any way; any changes or modifications made by a person of ordinary skill in the art on the basis of the above disclosure fall within the protection scope of the claims.

Claims (16)

  1. A method for obtaining training samples, comprising:
    obtaining scene segments in a video;
    selecting a video frame containing a target object in the scene segment as an initial frame, and labeling the target region where the target object is located in the initial frame;
    extracting feature information of the target region labeled in the initial frame;
    using the initial frame as a reference, performing a feature search on the forward and/or backward video frames in the scene segment, determining in each searched frame the region whose feature information matches the feature information of the target region, and automatically labeling the region determined in each searched frame;
    extracting the images of the labeled video frames in the scene segment as training samples.
  2. The method for obtaining training samples according to claim 1, wherein obtaining scene segments in a video comprises:
    if the video is a single-scene video, treating the video as one scene segment;
    if the video is a multi-scene video, dividing the video into multiple scene segments using scene-change detection technology.
  3. The method for obtaining training samples according to claim 2, wherein the scene-change detection technology comprises: a pixel-domain-based detection algorithm and/or a compressed-domain-based detection algorithm.
  4. The method for obtaining training samples according to claim 1, further comprising, before extracting the feature information of the target region labeled in the initial frame:
    performing image preprocessing on the initial frame, so as to make the feature information of the target region in the initial frame more distinct.
  5. The method for obtaining training samples according to claim 1, wherein the feature information of the target region comprises one or more of a color feature, a texture feature, and a shape feature.
  6. The method for obtaining training samples according to claim 1, wherein the step of performing a feature search on the forward and/or backward video frames in the scene segment comprises:
    performing the feature search on the forward and/or backward video frames in the scene segment using a mean shift algorithm, a Kalman filtering algorithm, or a particle filtering algorithm.
  7. The method for obtaining training samples according to claim 1, further comprising:
    if a given searched frame contains no region whose feature information matches the feature information of the target region, obtaining target feature information, determining a region in the searched frame whose feature information matches the target feature information, and automatically labeling the region determined in the searched frame;
    wherein the target feature information is: feature information of the already labeled regions in a preset number of frames adjacent to the searched frame.
  8. A device for obtaining training samples, comprising:
    an obtaining module, configured to obtain scene segments in a video;
    a first labeling module, configured to select a video frame containing a target object in the scene segment as an initial frame, and to label the target region where the target object is located in the initial frame;
    a first extraction module, configured to extract feature information of the target region labeled in the initial frame;
    a second labeling module, configured to perform, using the initial frame as a reference, a feature search on the forward and/or backward video frames in the scene segment, determine in each searched frame the region whose feature information matches the feature information of the target region, and automatically label the region determined in each searched frame;
    a second extraction module, configured to extract the images of the labeled video frames in the scene segment as training samples.
  9. The device for obtaining training samples according to claim 8, wherein the obtaining module obtains the scene segments in the video by:
    if the video is a single-scene video, treating the video as one scene segment;
    if the video is a multi-scene video, dividing the video into multiple scene segments using scene-change detection technology.
  10. The device for obtaining training samples according to claim 9, wherein the scene-change detection technology comprises: a pixel-domain-based detection algorithm and/or a compressed-domain-based detection algorithm.
  11. The device for obtaining training samples according to claim 8, further comprising:
    a preprocessing module, configured to perform image preprocessing on the initial frame before the first extraction module extracts the feature information of the target region labeled in the initial frame, so as to make the feature information of the target region in the initial frame more distinct.
  12. The device for obtaining training samples according to claim 8, wherein the feature information of the target region comprises one or more of a color feature, a texture feature, and a shape feature.
  13. The device for obtaining training samples according to claim 8, wherein the second labeling module performs the feature search on the forward and/or backward video frames in the scene segment by:
    performing the feature search on the forward and/or backward video frames in the scene segment using a mean shift algorithm, a Kalman filtering algorithm, or a particle filtering algorithm.
  14. The device for obtaining training samples according to claim 8, wherein the second labeling module is further configured to:
    if a given searched frame contains no region whose feature information matches the feature information of the target region, obtain target feature information, determine a region in the searched frame whose feature information matches the target feature information, and automatically label the region determined in the searched frame;
    wherein the target feature information is: feature information of the already labeled regions in a preset number of frames adjacent to the searched frame.
  15. An electronic device, comprising a processor, a communication interface, a memory, and a communication bus, wherein the processor, the communication interface, and the memory communicate with one another via the communication bus; wherein
    the memory is configured to store a computer program;
    the processor is configured to implement the method according to any one of claims 1-7 when executing the computer program stored in the memory.
  16. A computer-readable storage medium, wherein the computer-readable storage medium stores a computer program which, when executed, implements the method according to any one of claims 1-7.
PCT/CN2020/073396 2019-02-02 2020-01-21 Training sample obtaining method and apparatus, electronic device and storage medium WO2020156361A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910107568.6A CN109753975B (en) 2019-02-02 2019-02-02 Training sample obtaining method and device, electronic equipment and storage medium
CN201910107568.6 2019-02-02

Publications (1)

Publication Number Publication Date
WO2020156361A1

Family

ID=66407340

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/073396 WO2020156361A1 (en) 2019-02-02 2020-01-21 Training sample obtaining method and apparatus, electronic device and storage medium

Country Status (2)

Country Link
CN (1) CN109753975B (en)
WO (1) WO2020156361A1 (en)

Families Citing this family (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109753975B (en) * 2019-02-02 2021-03-09 杭州睿琪软件有限公司 Training sample obtaining method and device, electronic equipment and storage medium
CN110503074B (en) * 2019-08-29 2022-04-15 腾讯科技(深圳)有限公司 Information labeling method, device and equipment of video frame and storage medium
CN110796041B (en) * 2019-10-16 2023-08-18 Oppo广东移动通信有限公司 Principal identification method and apparatus, electronic device, and computer-readable storage medium
CN110796098B (en) * 2019-10-31 2021-07-27 广州市网星信息技术有限公司 Method, device, equipment and storage medium for training and auditing content auditing model
CN110826509A (en) * 2019-11-12 2020-02-21 云南农业大学 Grassland fence information extraction system and method based on high-resolution remote sensing image
CN111191708A (en) * 2019-12-25 2020-05-22 浙江省北大信息技术高等研究院 Automatic sample key point marking method, device and system
CN111428589B (en) * 2020-03-11 2023-05-30 新华智云科技有限公司 Gradual transition identification method and system
CN111497847B (en) * 2020-04-23 2021-11-16 江苏黑麦数据科技有限公司 Vehicle control method and device
CN112307908B (en) * 2020-10-15 2022-07-26 武汉科技大学城市学院 Video semantic extraction method and device
CN112784750B (en) * 2021-01-22 2022-08-09 清华大学 Fast video object segmentation method and device based on pixel and region feature matching
CN113225461A (en) * 2021-02-04 2021-08-06 江西方兴科技有限公司 System and method for detecting video monitoring scene switching
CN115482426A (en) * 2021-06-16 2022-12-16 华为云计算技术有限公司 Video annotation method, device, computing equipment and computer-readable storage medium
CN113378958A (en) * 2021-06-24 2021-09-10 北京百度网讯科技有限公司 Automatic labeling method, device, equipment, storage medium and computer program product
CN113610030A (en) * 2021-08-13 2021-11-05 北京地平线信息技术有限公司 Behavior recognition method and behavior recognition device
CN113762286A (en) * 2021-09-16 2021-12-07 平安国际智慧城市科技股份有限公司 Data model training method, device, equipment and medium
CN114697702B (en) * 2022-03-23 2024-01-30 咪咕文化科技有限公司 Audio and video marking method, device, equipment and storage medium

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103218603B (en) * 2013-04-03 2016-06-01 哈尔滨工业大学深圳研究生院 A kind of face automatic marking method and system
CN103559237B (en) * 2013-10-25 2017-02-15 南京大学 Semi-automatic image annotation sample generating method based on target tracking
CN103970906B (en) * 2014-05-27 2017-07-04 百度在线网络技术(北京)有限公司 The method for building up and device of video tab, the display methods of video content and device
CN108229285B (en) * 2017-05-27 2021-04-23 北京市商汤科技开发有限公司 Object classification method, object classifier training method and device and electronic equipment
CN108520218A (en) * 2018-03-29 2018-09-11 深圳市芯汉感知技术有限公司 A kind of naval vessel sample collection method based on target tracking algorism
CN108596958B (en) * 2018-05-10 2021-06-04 安徽大学 Target tracking method based on difficult positive sample generation
CN108986134B (en) * 2018-08-17 2021-06-18 浙江捷尚视觉科技股份有限公司 Video target semi-automatic labeling method based on related filtering tracking

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100202660A1 (en) * 2005-12-29 2010-08-12 Industrial Technology Research Institute Object tracking systems and methods
CN107886105A (en) * 2016-09-30 2018-04-06 法乐第(北京)网络科技有限公司 A kind of annotation equipment of image
CN107886104A (en) * 2016-09-30 2018-04-06 法乐第(北京)网络科技有限公司 A kind of mask method of image
CN109753975A (en) * 2019-02-02 2019-05-14 杭州睿琪软件有限公司 Training sample obtaining method and device, electronic equipment and storage medium

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112233171A (en) * 2020-09-03 2021-01-15 上海眼控科技股份有限公司 Target labeling quality inspection method and device, computer equipment and storage medium
CN112257659A (en) * 2020-11-11 2021-01-22 四川云从天府人工智能科技有限公司 Detection tracking method, apparatus and medium
CN112257659B (en) * 2020-11-11 2024-04-05 四川云从天府人工智能科技有限公司 Detection tracking method, device and medium
CN112801940A (en) * 2020-12-31 2021-05-14 深圳市联影高端医疗装备创新研究院 Model evaluation method, device, equipment and medium
CN113254703A (en) * 2021-05-12 2021-08-13 北京百度网讯科技有限公司 Video matching method, video processing device, electronic equipment and medium
CN114347030A (en) * 2022-01-13 2022-04-15 中通服创立信息科技有限责任公司 Robot vision following method and vision following robot
CN115499666B (en) * 2022-11-18 2023-03-24 腾讯科技(深圳)有限公司 Video compression method, video decompression method, video compression device, video decompression device, and storage medium
CN115620210A (en) * 2022-11-29 2023-01-17 广东祥利科技有限公司 Method and system for determining performance of electronic wire based on image processing
CN115620210B (en) * 2022-11-29 2023-03-21 广东祥利科技有限公司 Method and system for determining performance of electronic wire material based on image processing
CN117237418A (en) * 2023-11-15 2023-12-15 成都航空职业技术学院 Moving object detection method and system based on deep learning

Also Published As

Publication number Publication date
CN109753975A (en) 2019-05-14
CN109753975B (en) 2021-03-09

Similar Documents

Publication Publication Date Title
WO2020156361A1 (en) Training sample obtaining method and apparatus, electronic device and storage medium
CN110400332B (en) Target detection tracking method and device and computer equipment
CN105512683B (en) Object localization method and device based on convolutional neural networks
Lu et al. Robust and efficient saliency modeling from image co-occurrence histograms
CN110807473B (en) Target detection method, device and computer storage medium
CN110334762B (en) Feature matching method based on quad tree combined with ORB and SIFT
US20160307057A1 (en) Fully Automatic Tattoo Image Processing And Retrieval
WO2019071976A1 (en) Panoramic image saliency detection method based on regional growth and eye movement model
Thalji et al. Iris Recognition using robust algorithm for eyelid, eyelash and shadow avoiding
KR20190082593A (en) System and Method for Reidentificating Object in Image Processing
Jung et al. Eye detection under varying illumination using the retinex theory
Meher et al. Efficient method of moving shadow detection and vehicle classification
Song et al. Feature extraction and target recognition of moving image sequences
CN108765463B (en) Moving target detection method combining region extraction and improved textural features
Yaru et al. Algorithm of fingerprint extraction and implementation based on OpenCV
Elashry et al. Feature matching enhancement using the graph neural network (gnn-ransac)
CN111768436B (en) Improved image feature block registration method based on fast-RCNN
Yan et al. Saliency detection based on superpixel correlation and cosine window filtering
CN114119952A (en) Image matching method and device based on edge information
Dey et al. An efficient approach for pupil detection in iris images
Zhang et al. RGB-D saliency detection with multi-feature-fused optimization
Kerdvibulvech Hybrid model of human hand motion for cybernetics application
Zhang et al. Oil tank detection based on linear clustering saliency analysis for synthetic aperture radar images
Wang et al. Image saliency detection for multiple objects
Makandar et al. Comparison and Analysis of Different Feature Extraction Methods versus Noisy Images

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20749758

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20749758

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 02.03.2022)
