CN112801105B - Two-stage zero sample image semantic segmentation method - Google Patents

Two-stage zero sample image semantic segmentation method

Info

Publication number
CN112801105B
CN112801105B (application CN202110093474.5A)
Authority
CN
China
Prior art keywords
image
semantic
segmentation
stage
edge
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110093474.5A
Other languages
Chinese (zh)
Other versions
CN112801105A (en)
Inventor
刘亚洁 (Liu Yajie)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Lab
Original Assignee
Zhejiang Lab
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Lab filed Critical Zhejiang Lab
Priority to CN202110093474.5A priority Critical patent/CN112801105B/en
Publication of CN112801105A publication Critical patent/CN112801105A/en
Application granted granted Critical
Publication of CN112801105B publication Critical patent/CN112801105B/en
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/20 Image preprocessing
    • G06V 10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V 10/267 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a two-stage zero-sample image semantic segmentation method comprising a category-independent foreground-background image segmentation module and a zero-sample target classification module. The category-independent foreground-background segmentation adopts a two-stage image segmentation framework based on Mask R-CNN, assisted by inner and outer edge discriminators and an edge self-supervision module that improve the accuracy of foreground-background segmentation. The zero-sample target classification module is based on the CADA-VAE algorithm and uses DeepInversion to reversely generate visual features, which reduces the domain distance between visual features and semantic features and improves zero-sample target classification accuracy. After training on known targets, the method obtains good image segmentation performance on unknown targets, greatly reduces the need for samples and tedious manual labeling, lowers labeling costs in professional fields such as medicine, and greatly improves the performance of image semantic segmentation in zero-sample and few-sample scenarios.

Description

Two-stage zero sample image semantic segmentation method
Technical Field
The invention relates to the field of deep learning image segmentation, in particular to a two-stage zero-sample (zero-shot) image semantic segmentation (ZSS) method.
Background
With the development of computer vision and image technology, deep learning, by virtue of its high performance, has been widely applied in fields such as image classification, image detection, and image segmentation, and has rapidly reached the state of the art in each of them. Image semantic segmentation, one of the basic computer vision problems (alongside image classification and object recognition/detection), is widely used in autonomous driving, medical imaging, industrial inspection, and other fields. However, current fully supervised semantic segmentation methods rely heavily on dense pixel-level semantic labels. Pixel-level semantic labels are expensive in labor and time to acquire, and in professional fields such as medical imaging, where the labeling threshold is high, the labeling cost can be prohibitive. To reduce labeling cost, algorithms based on weak labels (such as image-level labels and bounding-box labels) and few labels (such as few-shot learning) have attracted extensive attention and research. The zero-sample segmentation problem, which is both more practically significant and more challenging, has not yet received extensive attention and research.
Current zero-sample target segmentation methods are one-stage, DeepLab-style methods that predict class semantic information at the pixel level. Such methods have two major problems: 1) the holistic information of the target is not used, so different parts of one object may be predicted as different categories; 2) pixel-level prediction introduces noise into the prediction mask, i.e., irregular noise regions may be predicted on the background.
Disclosure of Invention
In order to overcome the defects of the prior art and improve the performance of zero-sample target segmentation, the invention adopts the following technical scheme:
a two-stage zero sample image semantic segmentation method comprises the following steps:
s1, based on Mask-RCNN two-stage irrelevant foreground and background image segmentation, based on Mask-RCNN two-stage image segmentation frame, the classification branch of the second stage is changed into only distinguishing the two classes of the front and background, after the image passes through RPN, the image is sent to the second stage to be classified into the front and background, fine adjustment of the detection frame and segmentation of the foreground, after the image passes through Mask-RCNN, the foreground detection frame and the foreground Mask of an object irrelevant to the class are obtained, and because the classification branch does not distinguish the object class, the method can be ensured to obtain the detection frame and the foreground Mask of an unknown class when the method is tested after training on a known class;
s2, zero sample target classification is carried out based on CADA-VAE, automatic coding and decoding of a visual characteristic domain and a semantic characteristic domain are respectively carried out by adopting a variational self-encoder method, the visual characteristic and the semantic characteristic are converted into a common hidden variable characteristic space, high reconstruction accuracy of the visual characteristic and the semantic characteristic is guaranteed, a hidden variable characteristic with strong characterization capability is obtained, cross-domain alignment of the visual characteristic domain and the semantic characteristic domain is guaranteed, the domain distance between the visual characteristic domain and the semantic characteristic domain is reduced by adding cross-domain coding and decoding supervision, an unknown class can be connected with the visual characteristic through the semantic characteristic at high accuracy, then a classifier is trained based on the hidden variable characteristic converted by the unknown class semantic characteristic, an encoder E and a decoder D are given, and the loss of cross-alignment is as follows:
$\mathcal{L}_{CA} = \sum_{i}\sum_{j \neq i} \big| x_j - D_j\big(E_i(x_i)\big) \big|$
where x represents the visual or semantic features of the input and i, j represent different domains.
Further, an edge self-supervision module and an inner/outer edge discriminator module are added to the Mask R-CNN image segmentation branch in step S1 to assist image foreground segmentation.
Furthermore, the edge self-supervision module is embodied as an equivariance constraint: the input image is affine-transformed and fed into the foreground-background classification network to obtain a segmentation result, and this result should be the same as the result obtained by applying the same affine transformation to the segmentation of the original input image. This module can effectively suppress noise in the segmentation result and guarantee its consistency. Given the foreground-background classification network $F_\theta$ and an affine transformation matrix $A$, the constraint is $F_\theta(Ax) \approx A\,F_\theta(x)$, and the edge self-supervision loss is defined as:

$\mathcal{L}_{e} = \big\lVert w' \odot \big(F_\theta(Ax) - A\,F_\theta(x)\big) \big\rVert_1$

where $x$ denotes the input picture to be segmented and $w'$ denotes the weight matrix of $F_\theta(Ax)$.
Further, the inner/outer edge discriminator module is divided into an inner edge discriminator and an outer edge discriminator: the inner edge discriminator judges whether the edge lies inside the object, and the outer edge discriminator judges whether the segmented edge contains image background. During training, the annotation mask is dilated to obtain a simulated outer edge and eroded to obtain a simulated inner edge, and the inner/outer edge discriminators judge whether a given edge is an inner or outer edge. A generative-adversarial training scheme between generator and discriminators assists the generator to produce higher-precision edges, which assists image foreground segmentation and yields higher segmentation accuracy.
Further, the discriminator adopts a multilayer perceptron.
Further, in step S2, DeepInversion is used to reversely generate visual features that assist zero-sample target classification: DeepInversion reversely generates a visual feature map from a trained model, and this feature map is added as a visual feature to the CADA-VAE zero-sample target classification method, so as to align the semantic features of unknown classes with visual features, reduce the domain distance between the visual features and the semantic features of unknown classes, and improve classification accuracy.
Further, DeepInversion incorporates a teacher network and a student network, i.e., knowledge distillation, and supervises a KL-divergence loss on the obtained features, which increases the diversity of the generated images.
Further, supervision of the running mean and running variance of each BN layer in the trained model is added, together with supervision of the generated image, i.e., the two-norm and variance of the visual feature map reversely generated from the open-source model, which increases the realism of the generated image. Let $l$ index the layers of the network and let $\mu_l$ and $\sigma_l^2$ denote the mean and variance of layer $l$; the BN regularization is:

$R_{\mathrm{feature}}(\hat{x}) = \sum_l \big\lVert \mu_l(\hat{x}) - \mathbb{E}\big[\mu_l(x)\mid X\big] \big\rVert_2 + \sum_l \big\lVert \sigma_l^2(\hat{x}) - \mathbb{E}\big[\sigma_l^2(x)\mid X\big] \big\rVert_2$

where $\mathbb{E}$ denotes expectation, $X$ denotes the data distribution, $x$ denotes an image before synthesis, and $\hat{x}$ denotes the synthesized image.
Furthermore, the first stage simultaneously obtains the circumscribed rectangle (bounding box) of the object, and the visual features are obtained by passing the content of the bounding box through a network layer.
Furthermore, the semantic features are semantic word vectors or attribute vectors: the semantic word vectors are obtained from trained NLP models such as BERT, and the attribute vectors are provided by existing data sets.
The invention has the advantages and beneficial effects that:
the method completely avoids expensive manpower and time cost consumed by sample labeling in the fully supervised semantic segmentation method, can be quickly applied to various fields, and particularly promotes the related methods in professional fields to be quickly improved.
Drawings
Fig. 1 is a framework diagram of the present invention.
Detailed Description
The following describes in detail embodiments of the present invention with reference to the drawings. It should be understood that the detailed description and specific examples, while indicating the present invention, are given by way of illustration and explanation only, not limitation.
The invention largely avoids and suppresses the problems of current methods, ultimately improves the performance of zero-sample target segmentation, promotes research progress in this field, and accelerates applications in scientific research and engineering. A brand-new two-stage segmentation algorithm is adopted to transfer knowledge from semantic segmentation of known classes (classes for which semantic labels are available) to semantic segmentation of unknown classes (classes for which semantic labels are unavailable). Mask R-CNN works in two stages: the first stage scans the image and generates proposals (i.e., regions likely to contain an object), and the second stage classifies the proposals and generates detection boxes and masks.
A two-stage zero sample image semantic segmentation method comprises the following steps:
1) Mask R-CNN-based two-stage category-independent foreground-background image segmentation
The two-stage image segmentation framework based on Mask R-CNN changes the second-stage classification branch so that it distinguishes only foreground and background; after passing through the RPN (Region Proposal Network), the image is sent to the second stage for foreground/background classification, detection-box refinement, and foreground segmentation. After passing through Mask R-CNN, a category-independent foreground detection box and foreground mask of the object are obtained; because the classification branch does not distinguish object categories, the method is guaranteed to obtain detection boxes and foreground masks for unknown categories when tested after training on known categories.
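As an illustration of this first stage, the following minimal sketch (not the patent's code; the torchvision model and predictor names are assumptions of this example) configures a Mask R-CNN whose box and mask heads predict only two categories, background and a generic class-agnostic foreground:

```python
# Hypothetical sketch: class-agnostic Mask R-CNN whose second-stage heads
# distinguish only background (0) from generic foreground (1).
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor
from torchvision.models.detection.mask_rcnn import MaskRCNNPredictor

def build_class_agnostic_mask_rcnn():
    model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")
    num_classes = 2  # background + class-agnostic foreground
    in_features = model.roi_heads.box_predictor.cls_score.in_features
    model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)
    in_channels_mask = model.roi_heads.mask_predictor.conv5_mask.in_channels
    model.roi_heads.mask_predictor = MaskRCNNPredictor(in_channels_mask, 256, num_classes)
    return model
```

Trained this way on known classes, the detector outputs foreground boxes and masks without committing to any object category, which is what allows it to generalize to unknown classes at test time.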
2) Edge self-supervision and inner and outer edge discriminator assisted image foreground segmentation
An edge self-supervision module and an inner/outer edge discriminator module are added to the Mask R-CNN image segmentation branch to assist image foreground segmentation.
The edge self-supervision module is embodied as an equivariance constraint. The input image is affine-transformed and fed into the foreground-background classification network to obtain a segmentation result, which should be the same as the result obtained by applying the same affine transformation to the segmentation of the original input image. This module can effectively suppress noise in the segmentation result and guarantee its consistency. Given a segmentation network $F_\theta$ and a set of affine transformation matrices $A$, the constraint is $F_\theta(Ax) \approx A\,F_\theta(x)$, and the edge self-supervision loss is defined as:

$\mathcal{L}_{e} = \big\lVert w' \odot \big(F_\theta(Ax) - A\,F_\theta(x)\big) \big\rVert_1$

where $x$ denotes the input picture to be segmented and $w'$ denotes the weight matrix of $F_\theta(Ax)$.
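A minimal sketch of this equivariance term is given below, assuming a foreground-probability network; the exact weighting scheme $w'$ is not specified in the text, so deriving the per-pixel weights from the prediction on the transformed image is an assumption of this example:

```python
# Hedged sketch of the edge self-supervision (equivariance) loss.
import torch
import torch.nn.functional as F

def edge_equivariance_loss(seg_net, image, theta):
    """image: (B, 3, H, W); theta: (B, 2, 3) affine matrices."""
    grid = F.affine_grid(theta, image.size(), align_corners=False)
    image_aug = F.grid_sample(image, grid, align_corners=False)

    pred = seg_net(image)          # (B, 1, H, W) foreground probability of original image
    pred_aug = seg_net(image_aug)  # segmentation of the affine-transformed image

    # Apply the same affine transform to the prediction on the original image.
    grid_pred = F.affine_grid(theta, pred.size(), align_corners=False)
    pred_warped = F.grid_sample(pred, grid_pred, align_corners=False)

    # Assumed per-pixel weight matrix w', taken from the transformed-image prediction.
    w_prime = pred_aug.detach().clamp(min=1e-3)
    return (w_prime * (pred_aug - pred_warped).abs()).mean()
```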
The inner/outer edge discriminator module is divided into an inner edge discriminator and an outer edge discriminator. The inner edge discriminator mainly judges whether the object edge lies inside the object, and the outer edge discriminator mainly judges whether the segmented edge contains image background. During training, the annotation mask is dilated to obtain a simulated outer edge and eroded to obtain a simulated inner edge, and the inner/outer edge discriminators judge whether a given edge is an inner or outer edge. A generative-adversarial training scheme assists the generator to produce higher-precision edges, which assists image foreground segmentation and yields higher segmentation accuracy. The discriminators can be multilayer perceptrons.
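The simulated inner and outer edges can be produced by simple morphology on the ground-truth mask, and the discriminator can be a small MLP; a hedged sketch follows (all module names and the pooling-based morphology are illustrative choices, not the patent's exact implementation):

```python
# Hedged sketch: simulate inner/outer edge bands from a ground-truth mask and
# score them with a small MLP discriminator (inner vs. outer).
import torch
import torch.nn as nn
import torch.nn.functional as F

def simulate_edges(mask, k=5):
    """mask: (B, 1, H, W) binary ground-truth mask, float tensor."""
    pad = k // 2
    dilated = F.max_pool2d(mask, k, stride=1, padding=pad)   # morphological dilation
    eroded = -F.max_pool2d(-mask, k, stride=1, padding=pad)  # morphological erosion
    outer_edge = dilated - mask   # band just outside the object (contains background)
    inner_edge = mask - eroded    # band just inside the object
    return inner_edge, outer_edge

class EdgeDiscriminator(nn.Module):
    """MLP that classifies a flattened edge band as inner or outer."""
    def __init__(self, in_dim, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1))

    def forward(self, edge_band):
        return self.net(edge_band.flatten(1))  # raw logit for adversarial training
```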
3) CADA-VAE-based zero-sample target classification method
For CADA-VAE-based zero-sample target classification, a variational autoencoder is used to encode and decode the visual feature domain and the semantic feature domain respectively, converting visual features and semantic features into a common latent-variable feature space; both visual and semantic features can thus be reconstructed with high accuracy, and latent features with strong representation capability are obtained.
In the first stage, the circumscribed rectangle (bounding box) of the object can be obtained at the same time, and visual features are obtained by passing the content of the bounding box through a network layer. The semantic features are semantic word vectors or attribute vectors: semantic word vectors can be obtained from trained NLP models such as BERT, and attribute vectors can also be provided directly by existing data sets.
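For concreteness, a hedged sketch of how such visual and semantic features could be obtained is shown below; the backbone, BERT checkpoint, input size, and pooling choices are assumptions of this example, not specified by the patent:

```python
# Hedged sketch: visual features from the detected-box content via a CNN backbone,
# semantic word vectors for class names via a BERT-style encoder.
import torch
import torchvision
from transformers import AutoTokenizer, AutoModel

backbone = torchvision.models.resnet50(weights="DEFAULT")
backbone.fc = torch.nn.Identity()   # expose the 2048-d pooled feature
backbone.eval()

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
bert = AutoModel.from_pretrained("bert-base-uncased").eval()

@torch.no_grad()
def visual_feature(image, box):
    """image: (3, H, W) tensor in [0, 1]; box: (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = [int(v) for v in box]
    crop = image[:, y1:y2, x1:x2].unsqueeze(0)
    crop = torch.nn.functional.interpolate(crop, size=(224, 224), mode="bilinear")
    return backbone(crop).squeeze(0)        # (2048,) visual feature

@torch.no_grad()
def semantic_feature(class_name):
    tokens = tokenizer(class_name, return_tensors="pt")
    return bert(**tokens).last_hidden_state.mean(dim=1).squeeze(0)   # (768,) word vector
```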
Then, to guarantee cross-domain alignment of the visual and semantic feature domains, cross-domain encode-decode supervision is added to reduce the domain distance between them, so that an unknown class can be linked to visual features through its semantic features with high accuracy; a classifier is then trained on the latent features converted from unknown-class semantic features. Given an encoder E and a decoder D, the cross-alignment loss is:
$\mathcal{L}_{CA} = \sum_{i}\sum_{j \neq i} \big| x_j - D_j\big(E_i(x_i)\big) \big|$
where x represents the visual or semantic features of the input and i, j represent different domains.
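A minimal sketch of this cross-alignment term for two domains (visual and semantic), assuming each encoder returns the latent mean and log-variance as in CADA-VAE, could look as follows:

```python
# Hedged sketch of the cross-alignment loss across feature domains.
import torch

def cross_alignment_loss(enc, dec, x):
    """enc/dec: dicts of per-domain encoder/decoder modules; x: dict of feature batches.
    Each encoder is assumed to return (mu, logvar); the latent mean is cross-decoded."""
    domains = list(x.keys())   # e.g. ["visual", "semantic"]
    loss = 0.0
    for i in domains:
        mu_i, _logvar_i = enc[i](x[i])
        for j in domains:
            if j == i:
                continue
            # Reconstruct domain j's features from domain i's latent code (L1 error).
            loss = loss + (x[j] - dec[j](mu_i)).abs().sum(dim=1).mean()
    return loss
```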
4) Zero-sample object classification assisted by DeepInversion reverse generation of visual features
DeepInversion uses an open-source model trained on ImageNet to reversely generate a visual feature map. Adopting the idea of knowledge distillation, it adds supervision on the running mean and running variance of each Batch Norm (BN) layer of the trained model, together with supervision on the two-norm and variance of the generated image (i.e., the visual feature map reversely generated from the open-source model), which increases the realism of the generated image. Let $l$ index the layers of the network and let $\mu_l$ and $\sigma_l^2$ denote the mean and variance of layer $l$; the BN regularization is:

$R_{\mathrm{feature}}(\hat{x}) = \sum_l \big\lVert \mu_l(\hat{x}) - \mathbb{E}\big[\mu_l(x)\mid X\big] \big\rVert_2 + \sum_l \big\lVert \sigma_l^2(\hat{x}) - \mathbb{E}\big[\sigma_l^2(x)\mid X\big] \big\rVert_2$

where $\mathbb{E}$ denotes expectation, $X$ denotes the data distribution, $x$ denotes an image before synthesis, and $\hat{x}$ denotes the synthesized image.
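A hedged sketch of this BN-statistics term is shown below (the hook mechanics and layer selection are assumptions of this example); it matches the batch statistics of the synthesized images against each BN layer's stored running statistics:

```python
# Hedged sketch of the DeepInversion-style BN feature regularizer.
import torch
import torch.nn as nn

def bn_feature_loss(model, synthesized):
    losses, handles = [], []

    def hook(module, inputs, output):
        feat = inputs[0]
        mean = feat.mean(dim=(0, 2, 3))
        var = feat.var(dim=(0, 2, 3), unbiased=False)
        losses.append(torch.norm(mean - module.running_mean, 2)
                      + torch.norm(var - module.running_var, 2))

    for m in model.modules():
        if isinstance(m, nn.BatchNorm2d):
            handles.append(m.register_forward_hook(hook))

    model(synthesized)          # forward pass populates `losses` via the hooks
    for h in handles:
        h.remove()
    return sum(losses)
```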
DeepInversion also incorporates a teacher network and a student network, i.e., knowledge distillation, and supervises a KL-divergence loss on the obtained features to increase the diversity of the generated images. The visual feature map reversely generated by DeepInversion is then added as a visual feature to the CADA-VAE zero-sample target classification method, aligning the semantic features of unknown classes with visual features and reducing the domain distance between the visual features and semantic features of unknown classes, thereby improving classification accuracy.
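As a hedged illustration of the teacher-student term, a disagreement loss on synthesized images can be written as follows; the exact divergence and sign convention are assumptions of this example:

```python
# Hedged sketch: teacher-student disagreement term used when synthesizing images,
# expressed as a negative KL divergence so that minimizing it encourages diversity.
import torch.nn.functional as F

def diversity_loss(teacher, student, synthesized):
    t_prob = F.softmax(teacher(synthesized), dim=1)
    s_logprob = F.log_softmax(student(synthesized), dim=1)
    kl = F.kl_div(s_logprob, t_prob, reduction="batchmean")
    return -kl   # maximize teacher-student disagreement on the synthesized batch
```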
The above examples are only intended to illustrate the technical solution of the present invention, not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced, and such modifications or substitutions do not depart in essence from the scope of the embodiments of the present invention.

Claims (10)

1. A two-stage zero sample image semantic segmentation method is characterized by comprising the following steps:
s1, based on two-stage classification irrelevant foreground and background image segmentation of Mask-RCNN, based on two-stage image segmentation frame of Mask-RCNN, the classification branch of the second stage is changed into only distinguishing two types of the foreground and the background, after the image passes through RPN, the image is sent to the second stage to be classified into the foreground and the background, fine adjustment of a detection frame and segmentation of the foreground, and after the image passes through the Mask-RCNN, a foreground detection frame and a foreground Mask of an object irrelevant to the classification are obtained;
s2, performing zero sample target classification based on CADA-VAE, firstly, respectively performing automatic coding and decoding of a visual characteristic domain and a semantic characteristic domain, converting the visual characteristic and the semantic characteristic into a common hidden variable characteristic space, then reducing the domain distance between the visual characteristic domain and the semantic characteristic domain by adding cross-domain coding and decoding supervision, then training a classifier based on the hidden variable characteristic converted by the unknown semantic characteristic, and giving a coder E and a decoder D, wherein the loss of cross alignment is as follows:
$\mathcal{L}_{CA} = \sum_{i}\sum_{j \neq i} \big| x_j - D_j\big(E_i(x_i)\big) \big|$
where x represents the visual or semantic features of the input and i, j represent different domains.
2. The method according to claim 1, wherein an edge self-supervision module and an inner/outer edge discriminator module are added to the Mask R-CNN image segmentation branch in step S1.
3. The two-stage zero-sample image semantic segmentation method as claimed in claim 2, wherein the edge self-supervision module is embodied as an equivariance constraint, i.e., the input image is affine-transformed and sent into the foreground-background classification network to obtain an image segmentation result that is the same as the result obtained by applying the same affine transformation to the segmentation of the original input image; given the foreground-background classification network $F_\theta$, the affine transformation matrix $A$, and the prediction $F_\theta(x)$, the edge self-supervision loss is defined as:

$\mathcal{L}_{e} = \big\lVert w' \odot \big(F_\theta(Ax) - A\,F_\theta(x)\big) \big\rVert_1$

wherein $x$ denotes the input picture to be segmented and $w'$ denotes the weight matrix of $F_\theta(Ax)$.
4. The two-stage zero-sample image semantic segmentation method as claimed in claim 2, wherein the inner/outer edge discriminator module is divided into an inner edge discriminator and an outer edge discriminator; the inner edge discriminator judges whether the object edge lies inside the object, and the outer edge discriminator judges whether the segmented edge contains image background; during training, the annotation mask is dilated to obtain a simulated outer edge and eroded to obtain a simulated inner edge; the inner/outer edge discriminators judge whether a given edge is an inner or outer edge, and a generation-discrimination adversarial training scheme assists the generator to produce higher-precision edges.
5. The method as claimed in claim 4, wherein the discriminator employs a multi-layer perceptron.
6. The two-stage zero-sample image semantic segmentation method as claimed in claim 1, wherein in step S2, DeepInversion-generated visual features are adopted to assist zero-sample object classification; the visual feature map reversely generated by DeepInversion from a trained model is added to the CADA-VAE zero-sample object classification method as the visual feature.
7. The two-stage zero-sample image semantic segmentation method as claimed in claim 6, wherein DeepInversion incorporates a teacher network and a student network, i.e., knowledge distillation, to supervise a KL-divergence loss on the obtained features.
8. The method as claimed in claim 6, wherein supervision of the running mean and running variance of the BN layers in the trained model is added, together with supervision of the generated image, i.e., the two-norm and variance of the visual feature map reversely generated from the open-source model; let $l$ index the layers of the network and let $\mu_l$ and $\sigma_l^2$ denote the mean and variance respectively; the BN regularization is:

$R_{\mathrm{feature}}(\hat{x}) = \sum_l \big\lVert \mu_l(\hat{x}) - \mathbb{E}\big[\mu_l(x)\mid X\big] \big\rVert_2 + \sum_l \big\lVert \sigma_l^2(\hat{x}) - \mathbb{E}\big[\sigma_l^2(x)\mid X\big] \big\rVert_2$

wherein $\mathbb{E}$ denotes expectation, $X$ denotes the data distribution, $x$ denotes an image before synthesis, and $\hat{x}$ denotes the synthesized image.
9. The two-stage zero-sample image semantic segmentation method as claimed in claim 1, wherein the first stage simultaneously obtains a circumscribed rectangle frame of the object, and the content of the circumscribed rectangle frame passes through a network layer to obtain the visual features.
10. The two-stage zero-sample image semantic segmentation method as claimed in claim 1, wherein the semantic features are semantic word vectors or attribute vectors, the semantic word vectors are obtained by training of an NLP model, and the attribute vectors are obtained by an existing data set.
CN202110093474.5A 2021-01-22 2021-01-22 Two-stage zero sample image semantic segmentation method Active CN112801105B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110093474.5A CN112801105B (en) 2021-01-22 2021-01-22 Two-stage zero sample image semantic segmentation method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110093474.5A CN112801105B (en) 2021-01-22 2021-01-22 Two-stage zero sample image semantic segmentation method

Publications (2)

Publication Number Publication Date
CN112801105A CN112801105A (en) 2021-05-14
CN112801105B (en) 2022-07-08

Family

ID=75811523

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110093474.5A Active CN112801105B (en) 2021-01-22 2021-01-22 Two-stage zero sample image semantic segmentation method

Country Status (1)

Country Link
CN (1) CN112801105B (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113177612B (en) * 2021-05-24 2022-09-13 同济大学 Agricultural pest image identification method based on CNN few samples
CN113255829B (en) * 2021-06-17 2021-12-07 中国科学院自动化研究所 Zero sample image target detection method and device based on deep learning
CN113610173B (en) * 2021-08-13 2022-10-04 天津大学 Knowledge distillation-based multi-span domain few-sample classification method
CN114580425B (en) * 2022-05-06 2022-09-09 阿里巴巴(中国)有限公司 Named entity recognition method and device, electronic equipment and storage medium
CN117036790B (en) * 2023-07-25 2024-03-22 中国科学院空天信息创新研究院 Instance segmentation multi-classification method under small sample condition
CN116977796B (en) * 2023-09-25 2024-02-23 中国科学技术大学 Zero sample image recognition method, system, equipment and storage medium
CN117541882B (en) * 2024-01-05 2024-04-19 南京信息工程大学 Instance-based multi-view vision fusion transduction type zero sample classification method

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109993197B (en) * 2018-12-07 2023-04-28 天津大学 Zero sample multi-label classification method based on depth end-to-end example differentiation
CN112017182B (en) * 2020-10-22 2021-01-19 北京中鼎高科自动化技术有限公司 Industrial-grade intelligent surface defect detection method
CN112115951B (en) * 2020-11-19 2021-03-09 之江实验室 RGB-D image semantic segmentation method based on spatial relationship

Also Published As

Publication number Publication date
CN112801105A (en) 2021-05-14

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant