CN111079671B - Method and device for detecting abnormal articles in scene - Google Patents

Publication number
CN111079671B
Authority
CN
China
Prior art keywords
scene
feature
graph
map
sub
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911329567.2A
Other languages
Chinese (zh)
Other versions
CN111079671A (en)
Inventor
黄泽元
孙楠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Zhichuang Digital Technology Service Co ltd
Original Assignee
Shenzhen Jizhi Digital Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Jizhi Digital Technology Co Ltd
Priority to CN201911329567.2A
Publication of CN111079671A
Application granted
Publication of CN111079671B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/35 Categorising the entire scene, e.g. birthday party or wedding scene
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/25 Determination of region of interest [ROI] or a volume of interest [VOI]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/46 Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/462 Salient features, e.g. scale invariant feature transforms [SIFT]

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The embodiments of the invention disclose a method and a device for detecting abnormal articles in a scene. The method comprises: inputting a scene detection map and a scene standard map into corresponding sub-networks of a twin (Siamese) network to obtain a feature map of the scene detection map and a feature map of the scene standard map; fusing the two feature maps to obtain a common feature map; applying a plurality of hole (dilated) convolutions to the common feature map to obtain sub-feature maps of the multiple types of articles in the scene enclosed by target frames; and screening out, through a fully connected network, the sub-feature maps that contain abnormal articles from the sub-feature maps of the multiple types of articles, calculating the coordinates of those sub-feature maps, and enclosing the abnormal articles with target frames in the scene detection map. The method detects abnormal articles that should not appear in a scene, which greatly reduces manual labor and improves people's quality of life.

Description

Method and device for detecting abnormal articles in scene
Technical Field
The invention relates to the field of abnormal article detection, in particular to a method and a device for detecting abnormal articles in a scene.
Background
With the continuing urbanization of China, large comprehensive building complexes such as shopping malls, residential quarters and supermarkets keep emerging, and human activity is increasingly concentrated in them. However, abnormal articles such as sundries and garbage are often placed in fire passages, intentionally or unintentionally. This obstructs people's movement and may injure children and the elderly; in the event of a fire, a fire passage blocked by sundries can prevent people from escaping in time.
Traditionally, abnormal articles are detected manually: for example, cleaners clear sundries from passages, and dedicated staff check whether fire passages are blocked.
Therefore, a method for detecting abnormal articles is needed to replace manual inspection.
Disclosure of Invention
In view of this, the present application provides a method and a device for detecting abnormal articles in a scene, which detect abnormal articles that should not appear in the scene, greatly reduce manual labor and improve people's quality of life.
In a first aspect of the present application, a method for detecting abnormal articles in a scene is provided, the method including:
inputting a scene detection map and a scene standard map into corresponding sub-networks of a twin network to obtain a feature map of the scene detection map and a feature map of the scene standard map, respectively; wherein the sub-networks of the twin network have the same architecture, and the scene detection map and the scene standard map have the same scene identification;
fusing the feature map of the scene detection map and the feature map of the scene standard map to obtain a common feature map; wherein the common feature map has all the features of the scene detection map and the scene standard map;
applying a plurality of hole convolutions to the common feature map to obtain sub-feature maps of the multiple types of articles in the scene enclosed by target frames; wherein the hole convolution has a two-layer structure, the first layer being used to calculate the center point of the target frame and the second layer being used to calculate the width and height of the target frame;
and screening out, through a fully connected network, the sub-feature maps that contain abnormal articles from the sub-feature maps of the multiple types of articles in the scene, calculating the coordinates of the sub-feature maps that contain abnormal articles, and enclosing the abnormal articles with target frames in the scene detection map.
Optionally, the twin network has two sub-networks, and inputting the scene detection map and the scene standard map into the corresponding sub-networks of the twin network to obtain their respective feature maps includes:
inputting the scene detection map into a first sub-network of the twin network to obtain the feature map of the scene detection map, and inputting the scene standard map into a second sub-network of the twin network to obtain the feature map of the scene standard map.
Optionally, the twin network has three sub-networks, and inputting the scene detection map and two scene standard maps into the corresponding sub-networks of the twin network to obtain their respective feature maps includes:
inputting the scene detection map into a first sub-network of the twin network to obtain the feature map of the scene detection map; inputting a first scene standard map into a second sub-network of the twin network to obtain the feature map of the first scene standard map; and inputting a second scene standard map into a third sub-network of the twin network to obtain the feature map of the second scene standard map.
Optionally, the multiple types of articles in the scene include:
in a mall passage scene, people, doors, fire extinguishers, warning boards and abnormal articles; wherein an abnormal article is any article other than a person, a door, a fire extinguisher or a warning board, and articles that are physically adjacent and not separated are treated as a single abnormal article.
Optionally, the method further includes collecting the multiple types of articles by:
collecting a scene standard map;
placing a new article in the scene, collecting the corresponding scene detection map, and marking the new article as an abnormal article; wherein the scene is a scene other than the mall passage.
Optionally, each sub-network in the twin network uses a backbone network composed of a residual network and a feature pyramid network.
Optionally, fusing the feature map of the scene detection map and the feature map of the scene standard map to obtain a common feature map includes:
stacking the feature map of the scene detection map and the feature map of the scene standard map along the channel dimension;
convolving the stacked feature map with convolution kernels to obtain the common feature map; wherein the kernel size is 3x3, the stride is 1, and the number of kernels is half the number of channels of the stacked feature map.
Optionally, applying the plurality of hole convolutions to the common feature map to obtain the sub-feature maps of the multiple types of articles in the scene enclosed by target frames includes:
calculating three pre-target frames by applying three hole convolutions with expansion rates of 0, 1 and 2, respectively, to the common feature map;
stacking the three pre-target frames and compressing them to obtain the target frame;
and obtaining the sub-feature maps of the multiple types of articles in the scene enclosed by the target frame.
Optionally, the method further includes:
adjusting the first layer of the hole convolution using the coordinates of the sub-feature map that contains the abnormal article as a feedback signal.
In a second aspect of the present application, a device for detecting abnormal articles in a scene is provided, the device including:
a twin network feature extraction unit, a twin network feature fusion unit, a hole convolution calculation unit and an abnormal article detection unit;
the twin network feature extraction unit is configured to input a scene detection map and a scene standard map into corresponding sub-networks of a twin network to obtain a feature map of the scene detection map and a feature map of the scene standard map, respectively; wherein the sub-networks of the twin network have the same architecture, and the scene detection map and the scene standard map have the same scene identification;
the twin network feature fusion unit is configured to fuse the feature map of the scene detection map and the feature map of the scene standard map to obtain a common feature map; wherein the common feature map has all the features of the scene detection map and the scene standard map;
the hole convolution calculation unit is configured to apply a plurality of hole convolutions to the common feature map to obtain sub-feature maps of the multiple types of articles in the scene enclosed by target frames; wherein the hole convolution has a two-layer structure, the first layer being used to calculate the center point of the target frame and the second layer being used to calculate the width and height of the target frame;
the abnormal article detection unit is configured to screen out, through a fully connected network, the sub-feature maps that contain abnormal articles from the sub-feature maps of the multiple types of articles in the scene, calculate the coordinates of the sub-feature maps that contain abnormal articles, and enclose the abnormal articles with target frames in the scene detection map.
Compared with the prior art, the technical solution of the present application has the following advantages:
In the method provided by the present application, a scene detection map and a scene standard map are input into corresponding sub-networks of a twin network to obtain their respective feature maps; the two feature maps are fused to obtain a common feature map; a plurality of hole convolutions are applied to the common feature map to obtain sub-feature maps of the multiple types of articles in the scene enclosed by target frames; and finally the sub-feature maps that contain abnormal articles are screened out through a fully connected network, their coordinates are calculated, and the abnormal articles are enclosed with target frames in the scene detection map. Because the scene detection map and the scene standard map are input into corresponding sub-networks of the twin network, their feature maps are obtained through shared computation, and fusing them gives a common feature map that combines the features of both. The target frames obtained through hole convolution are sparse and representative. The sub-feature maps that contain abnormal articles are screened out through the fully connected network, their coordinates are calculated and mapped to the scene detection map, and the abnormal articles are enclosed with target frames, which gives the positions of the abnormal articles in the detection picture.
The method thus detects abnormal articles that should not appear in the scene, which greatly reduces manual labor and improves people's quality of life.
Drawings
To illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present application; those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of a method for detecting abnormal articles in a scene provided by the present application;
Fig. 2 is a flowchart of another method for detecting abnormal articles in a scene provided by the present application;
Fig. 3 is a schematic structural diagram of a device for detecting abnormal articles in a scene provided by the present application.
Detailed Description
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to fig. 1, fig. 1 is a flowchart of a method for detecting an abnormal object in a scene provided by the present application, where the method may include the following steps 101-104.
Step 101: inputting a scene detection map and a scene standard map into corresponding sub-networks of a twin network to obtain a feature map of the scene detection map and a feature map of the scene standard map, respectively; wherein the sub-networks of the twin network have the same architecture, and the scene detection map and the scene standard map have the same scene identification.
The scene detection map and the scene standard map have the same scene identification, which confirms that the two pictures show the same scene. The scene standard map is a picture of the scene in its normal state, that is, it contains no abnormal article that should not appear in the scene. For example, if the scene is a mall passage, the scene standard map may contain people, doors, fire extinguishers and warning boards, and any article other than these is an abnormal article. The scene detection map is the picture to be checked for abnormal articles.
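The pairing by scene identification described above can be sketched as follows. This is only an illustrative sketch: the `SceneImage` record, the `pair_with_standard` helper, the identifier strings and the file paths are all hypothetical and do not come from the application.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SceneImage:
    scene_id: str  # hypothetical scene identification, e.g. a camera or location ID
    path: str      # hypothetical file path of the picture


def pair_with_standard(detections, standards):
    """Pair each scene detection map with the standard map of the same scene."""
    by_scene = {s.scene_id: s for s in standards}
    pairs = []
    for d in detections:
        standard = by_scene.get(d.scene_id)
        if standard is None:
            # Without a standard map of the same scene there is nothing to compare with.
            raise KeyError(f"no scene standard map for scene {d.scene_id!r}")
        pairs.append((d, standard))
    return pairs


standards = [SceneImage("mall-passage-01", "std/passage01.png")]
detections = [SceneImage("mall-passage-01", "det/passage01_0930.png")]
pairs = pair_with_standard(detections, standards)
```

Each resulting pair holds two pictures with the same scene identification, which is the precondition for feeding them to the sub-networks of the twin network.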
An abnormal article is an article that should not appear in the scene standard map; it is a complex, irregular structure that differs from the background of the scene standard map.
It should be noted that the twin network has a plurality of sub-networks with the same architecture. For example, each sub-network may use a backbone network composed of a residual network and a feature pyramid network connected in the classic manner: the four stages of the residual network yield four levels of output, and these four levels pass through upsampling, 3x3 convolution, 1x1 convolution and pooling to produce the five output levels of the classic feature pyramid network.
In a possible embodiment, the twin network has three sub-networks. In this case, the scene detection map is input into a first sub-network of the twin network to obtain the feature map of the scene detection map; a first scene standard map is input into a second sub-network of the twin network to obtain the feature map of the first scene standard map; and a second scene standard map is input into a third sub-network of the twin network to obtain the feature map of the second scene standard map.
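As a minimal sketch of the three-sub-network arrangement, the toy code below builds three feature extractors of identical architecture. It assumes that the sub-networks also share their parameters, which is typical of twin (Siamese) networks but is an assumption here; the extractor itself (a two-tap horizontal filter) and the weight values are hypothetical stand-ins for the real backbone.

```python
def make_subnetwork(weights):
    """Return a toy feature extractor. Sub-networks built from the same
    `weights` have the same architecture and the same parameters."""
    w0, w1 = weights

    def extract(image):
        # Toy "feature map": weighted sum of each pixel and its right neighbour.
        return [[w0 * row[i] + w1 * row[i + 1] for i in range(len(row) - 1)]
                for row in image]

    return extract


shared_weights = [0.5, -0.5]  # hypothetical parameters
subnets = [make_subnetwork(shared_weights) for _ in range(3)]

detection = [[1, 1, 4], [1, 1, 1]]   # contains an extra bright "article"
standard_1 = [[1, 1, 1], [1, 1, 1]]
standard_2 = [[1, 1, 1], [1, 1, 1]]

features = [net(img) for net, img in
            zip(subnets, (detection, standard_1, standard_2))]
```

Because all three extractors behave identically, any difference between `features[0]` and the other two feature maps comes from the pictures themselves rather than from the networks.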
Step 102: fusing the feature map of the scene detection map and the feature map of the scene standard map to obtain a common feature map; wherein the common feature map has all the features of the scene detection map and the scene standard map.
The feature map extracted from the scene detection map by its sub-network carries the features of the scene detection map, and the feature map extracted from the scene standard map by its sub-network carries the features of the scene standard map. For example, if the scene standard map contains a person and a door, its feature map carries the features of these two types of articles. If the scene detection map additionally contains a table, its feature map carries the features of the table as well as those of the person and the door. After fusion, the common feature map carries the features of the person, the door and the table, and the table can be identified as an abnormal article relative to the scene standard map.
Step 103: applying a plurality of hole convolutions to the common feature map to obtain sub-feature maps of the multiple types of articles in the scene enclosed by target frames; wherein the hole convolution has a two-layer structure, the first layer being used to calculate the center point of the target frame and the second layer being used to calculate the width and height of the target frame.
Conventional methods often set target frames with a sliding window. Manually configured target frames depend on subjective experience, and they are tiled densely over the whole picture, so the subsequent computation is very large. The embodiment of the present application uses hole (dilated) convolution instead, which enlarges the receptive field without the information loss caused by pooling, so that each convolution output covers a larger range of information. This handles well the cases in which an image requires global information or depends on long-range information. With hole convolution, the neural network learns the position and size of the target frame by itself. Applying several hole convolutions separately yields target frames with different receptive fields, which makes the target frames general and representative; because the target frames are of high quality and few in number, subsequent computation is more accurate and faster. To predict the target frame, two branches can be set up. One branch applies hole convolution to the detection picture and, with the position of the target-frame center point as the supervisory signal, predicts whether each point is the center of a target frame, producing a feature map of depth 1. The other branch applies hole convolution to the detection picture and, with the height and width offsets of the target frame as the supervisory signal, predicts the height and width offsets of the target frame, producing a feature map of depth 2.
The hole convolutions of the two branches yield the center point and the width and height of the target frame, which determine its position on the picture, so the sub-feature maps of the multiple types of articles enclosed by target frames can be located on the picture. Target frames obtained in this way are sparse and of high quality. The multiple types of articles may be, for example, people, doors and tables in the picture.
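To illustrate how hole convolution enlarges the receptive field without pooling, the sketch below implements a single-channel, zero-padded hole convolution in plain Python. It is a sketch under stated assumptions rather than the implementation of the application: the `dilation` parameter follows the common convention in which 1 means an ordinary convolution, and the mapping of the expansion rates 0, 1 and 2 mentioned in this application to dilations 1, 2 and 3 is an assumption. The function names are hypothetical.

```python
def effective_kernel_size(kernel_size, dilation):
    """Side length of the area covered by a kernel with gaps between its taps."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def dilated_conv2d(feature, kernel, dilation=1):
    """Single-channel hole convolution, stride 1, zero padding that keeps the size.

    feature: H x W list of lists; kernel: k x k list of lists with odd k.
    """
    k = len(kernel)
    pad = dilation * (k // 2)
    h, w = len(feature), len(feature[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for i in range(k):
                for j in range(k):
                    yy = y + i * dilation - pad
                    xx = x + j * dilation - pad
                    if 0 <= yy < h and 0 <= xx < w:  # outside the map counts as zero
                        acc += kernel[i][j] * feature[yy][xx]
            out[y][x] = acc
    return out


# A 7 x 7 map that is zero everywhere except for one impulse at the center.
feature = [[0.0] * 7 for _ in range(7)]
feature[3][3] = 1.0
kernel = [[1.0] * 3 for _ in range(3)]

# Assumed mapping: expansion rates 0, 1, 2 correspond to dilations 1, 2, 3.
sizes = [effective_kernel_size(3, r + 1) for r in (0, 1, 2)]
corner_response = [dilated_conv2d(feature, kernel, d)[0][0] for d in (1, 2, 3)]
```

With the same nine weights, the kernel covers areas of side 3, 5 and 7. Only the widest of the three reaches the impulse from the corner of the map, which is the sense in which each output "covers a larger range of information".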
In addition, since the size of the original feature map may differ from that of the computed feature map, the feature map may be transformed once by a 1 × 1 self-deformation convolution.
Step 104: screening out, through a fully connected network, the sub-feature maps that contain abnormal articles from the sub-feature maps of the multiple types of articles in the scene, calculating the coordinates of the sub-feature maps that contain abnormal articles, and enclosing the abnormal articles with target frames in the scene detection map.
The fully connected network finds the sub-feature maps that contain abnormal articles among the obtained sub-feature maps. For example, suppose step 103 yields three pictures: the target frame in picture 1 encloses a person and a door, the target frame in picture 2 encloses a door and a table, and the target frame in picture 3 encloses a door. Since the table is an abnormal article, the fully connected network screens out picture 2. The coordinates in the scene detection map of the sub-feature map that contains the table in picture 2 are then calculated, and the abnormal article is enclosed with the target frame in the scene detection map.
In the embodiment provided by the present application, a scene detection map and a scene standard map are input into corresponding sub-networks of a twin network to obtain a feature map of the scene detection map and a feature map of the scene standard map, respectively, wherein the sub-networks of the twin network have the same architecture and the two maps have the same scene identification. The two feature maps are then fused to obtain a common feature map that has all the features of the scene detection map and the scene standard map. A plurality of hole convolutions are then applied to the common feature map to obtain sub-feature maps of the multiple types of articles in the scene enclosed by target frames, wherein the hole convolution has a two-layer structure, the first layer calculating the center point of the target frame and the second layer calculating its width and height. Finally, the sub-feature maps that contain abnormal articles are screened out through a fully connected network from the sub-feature maps of the multiple types of articles in the scene, their coordinates are calculated, and the abnormal articles are enclosed with target frames in the scene detection map.
Because the scene detection map and the scene standard map are input into corresponding sub-networks of the twin network, their feature maps are obtained through shared computation, and fusing them gives a common feature map that combines the features of both. The target frames obtained through hole convolution are sparse and representative. The sub-feature maps that contain abnormal articles are screened out through the fully connected network, their coordinates are calculated and mapped to the scene detection map, and the abnormal articles are enclosed with target frames, which gives the positions of the abnormal articles in the detection picture. The method thus detects abnormal articles that should not appear in the scene, which greatly reduces manual labor and improves people's quality of life.
To make the technical solution provided by the embodiment of the present invention clearer, the method for detecting abnormal articles is described below using a mall passage as an example, with reference to Fig. 2.
Suppose sundries in a mall passage scene are to be detected. A model must be trained before detection. Because little data is available, various monotonous backgrounds (such as desktops, lawns, floors of various colors, roofs and roads) can be used as generalized scenes. Specifically, standard maps of the generalized scenes are collected; a new article is then placed in a generalized scene, for example a football on a lawn, the corresponding scene detection map is collected, and the new article (the football) is marked as an abnormal article. More training data can be collected in this way, which makes the model more robust. Observation shows that people, doors, fire extinguishers and warning boards are categories that commonly appear in a mall passage scene. These categories are marked as non-sundry categories to distinguish them from sundries, and they must also be labeled and trained on, so that there are 5 label categories in total, including sundries. It should be noted that, during labeling, articles that are physically adjacent with no separation between them are labeled as one sundry; for example, a lamp placed on a table is labeled as one sundry, not two.
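The labeling rule for adjacent articles can be approximated as follows. This is a rough sketch, not the labeling procedure of the application: it assumes annotations are axis-aligned boxes `(x1, y1, x2, y2)` in image coordinates and treats two boxes that overlap or share an edge as "adjacent with no separation", which is only a coarse stand-in for physical adjacency. The function names and the coordinates are hypothetical.

```python
def touches(a, b):
    """True if two boxes (x1, y1, x2, y2) overlap or share an edge."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def merge_adjacent(boxes):
    """Merge every group of touching boxes into one enclosing box."""
    merged = []
    for box in boxes:
        box = tuple(box)
        changed = True
        while changed:  # the grown box may now touch further boxes
            changed = False
            for other in merged:
                if touches(box, other):
                    merged.remove(other)
                    box = (min(box[0], other[0]), min(box[1], other[1]),
                           max(box[2], other[2]), max(box[3], other[3]))
                    changed = True
                    break
        merged.append(box)
    return merged


table = (10, 50, 60, 90)
lamp = (25, 30, 35, 50)          # stands on the table: its bottom edge meets the table top
football = (200, 200, 220, 220)  # far away from both

sundries = merge_adjacent([table, lamp, football])
```

Here the lamp and the table end up as one sundry label and the football as another, matching the "one sundry, not two" rule in the example above.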
Step 201: inputting the mall passage scene detection map and the mall passage scene standard map into corresponding sub-networks of the twin network to obtain 5 feature maps of the scene detection map and 5 feature maps of the scene standard map, respectively.
In an embodiment of the present application, the twin network has two sub-networks. The mall passage scene detection map is input into a first sub-network of the twin network to obtain 5 feature maps of the mall passage scene detection map, and the mall passage scene standard map is input into a second sub-network of the twin network to obtain 5 feature maps of the mall passage scene standard map.
Step 202: fusing the 5 feature maps of the mall passage scene detection map and the 5 feature maps of the mall passage scene standard map to obtain a common feature map.
Specifically, the fusion works as follows. The mall passage scene detection map and the mall passage scene standard map each have 5 feature maps with 256 channels, of sizes (256, 256, 256), (128, 128, 256), (64, 64, 256), (32, 32, 256) and (16, 16, 256). Feature maps of the same size are stacked along the channel dimension, giving stacked feature maps with 512 channels. The stacked feature maps are then convolved with 256 convolution kernels of size 3x3 and stride 1, so the resulting common feature map has 256 channels.
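The stack-then-convolve fusion can be sketched in plain Python at a reduced scale: 2 + 2 channels stacked to 4 and convolved back to 2, instead of 256 + 256 stacked to 512 and convolved back to 256. The zero padding of 1 (so that the spatial size is preserved) is an assumption, and the hand-written `difference_kernel` merely stands in for kernels that would be learned during training; neither comes from the application.

```python
def stack_channels(a, b):
    """Stack two feature maps, each a list of H x W channels, along the channel axis."""
    return a + b


def conv3x3(x, kernels):
    """3x3 convolution, stride 1, zero padding 1 (padding is an assumption).

    x: [C_in][H][W]; kernels: [C_out][C_in][3][3]; returns [C_out][H][W].
    """
    c_in, h, w = len(x), len(x[0]), len(x[0][0])
    out = []
    for kernel in kernels:
        plane = [[0.0] * w for _ in range(h)]
        for y in range(h):
            for col in range(w):
                acc = 0.0
                for c in range(c_in):
                    for i in range(3):
                        for j in range(3):
                            yy, xx = y + i - 1, col + j - 1
                            if 0 <= yy < h and 0 <= xx < w:
                                acc += kernel[c][i][j] * x[c][yy][xx]
                plane[y][col] = acc
        out.append(plane)
    return out


def difference_kernel(out_c, half):
    """Hypothetical kernel: detection channel out_c minus standard channel out_c."""
    kernel = [[[0.0] * 3 for _ in range(3)] for _ in range(2 * half)]
    kernel[out_c][1][1] = 1.0
    kernel[out_c + half][1][1] = -1.0
    return kernel


half = 2  # channels per input feature map (256 in the embodiment)
det = [[[1.0] * 4 for _ in range(4)] for _ in range(half)]
std = [[[1.0] * 4 for _ in range(4)] for _ in range(half)]
det[0][1][2] = 5.0  # a feature present only in the detection map

stacked = stack_channels(det, std)                       # 4 channels
kernels = [difference_kernel(c, half) for c in range(half)]
common = conv3x3(stacked, kernels)                       # back to 2 channels
```

Using half as many kernels as stacked channels restores the original channel count, and with these illustrative weights the only non-zero response in `common` sits where the detection map differs from the standard map.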
Step 203: applying hole convolutions with expansion rates of 0, 1 and 2, respectively, to the common feature map to obtain sub-feature maps of the multiple types of articles in the scene enclosed by target frames.
In the embodiment of the present application, three hole convolutions with expansion rates of 0, 1 and 2 are constructed and computed in parallel. For a feature map of height H and width W, each of the 3 hole convolution paths generates an H x W x 1 feature map that predicts whether each point is a target-frame center, and an H x W x 2 feature map that predicts the height and width offsets of each target frame. The feature maps with the same function are fused across the paths, and the channels are finally combined by a 1x1 convolution.
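One way to read the combination of the three paths and the decoding of target frames is sketched below. It is a sketch under assumptions: the 1x1 convolution over three stacked one-channel maps is written as a per-pixel weighted sum, the H x W x 2 map is taken to hold a (height, width) pair per point, and the combination weights, the 0.5 threshold and all sample values are hypothetical.

```python
def fuse_1x1(maps, weights):
    """1x1 convolution over stacked one-channel maps: a per-pixel weighted sum."""
    h, w = len(maps[0]), len(maps[0][0])
    return [[sum(wt * m[y][x] for wt, m in zip(weights, maps))
             for x in range(w)] for y in range(h)]


def decode_boxes(center_map, size_map, threshold=0.5):
    """Turn an H x W center map and an H x W x 2 (height, width) map into frames.

    Returns a list of (x1, y1, x2, y2, score) tuples, one per confident center.
    """
    boxes = []
    for y, row in enumerate(center_map):
        for x, score in enumerate(row):
            if score >= threshold:
                box_h, box_w = size_map[y][x]
                boxes.append((x - box_w / 2, y - box_h / 2,
                              x + box_w / 2, y + box_h / 2, score))
    return boxes


def center_at(score):
    """3 x 3 center map with a single response at the middle point."""
    grid = [[0.0] * 3 for _ in range(3)]
    grid[1][1] = score
    return grid


# Center maps from the three hole convolution paths (hypothetical values).
branch_centers = [center_at(1.0), center_at(0.5), center_at(0.5)]
center_map = fuse_1x1(branch_centers, weights=[0.5, 0.25, 0.25])

# Fused (height, width) prediction per point (hypothetical values).
size_map = [[(0.0, 0.0)] * 3 for _ in range(3)]
size_map[1][1] = (2.0, 1.0)

boxes = decode_boxes(center_map, size_map)
```

Only the points whose fused center score passes the threshold produce a target frame, which is why the resulting frames are sparse compared with frames tiled over the whole picture by a sliding window.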
Step 204: screen out, through a fully connected network, the sub-feature map containing an abnormal article from the sub-feature maps of the 5 types of articles in the mall channel scene, calculate the coordinates of that sub-feature map, and enclose the abnormal article with the target frame in the scene detection map.
Specifically, in the embodiment of the present application, 5 feature maps are obtained in step 202 and a series of target frames is generated in step 203. Each target frame can be mapped to a feature vector in the mall channel scene detection picture. Two fully connected layers are then applied to the feature vector to obtain the width, height and center point position of the target frame for the corresponding article type, and classification yields the probability of each article type, for example 10 percent for a person, 10 percent for a door, 30 percent for a fire extinguisher, 10 percent for a warning sign and 40 percent for a trash can. The trash can is therefore taken to be the abnormal article in the mall channel scene. The position of the target frame enclosing the trash can is calculated, for example (0, 0), and the trash can is enclosed by the target frame in the mall channel scene detection picture. This shows that, relative to the mall channel scene standard picture, a trash can has been added as an abnormal article in the detection picture, and that the trash can is located in the lower left corner of the detection picture.
The coordinates of the sub-feature map containing the abnormal article are used as a feedback signal to adjust the first-layer structure of the hole convolution.
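The screening logic in this example can be sketched as follows, as an illustration only. The sketch takes the class probabilities as given (it does not model the two fully connected layers), treats any top-scoring class outside the four expected article types as abnormal, and converts a center point plus width and height into corner coordinates. The names `NORMAL_CLASSES`, `classify`, `box_corners` and `detect_abnormal`, as well as the box values in the usage note, are hypothetical.

```python
# Article types expected in a mall channel scene; anything else counts as abnormal here.
NORMAL_CLASSES = {"person", "door", "fire extinguisher", "warning sign"}


def classify(probabilities):
    """Return the label with the highest probability."""
    return max(probabilities, key=probabilities.get)


def box_corners(cx, cy, width, height):
    """Convert a center point and size into (x_min, y_min, x_max, y_max)."""
    return (cx - width / 2, cy - height / 2, cx + width / 2, cy + height / 2)


def detect_abnormal(probabilities, cx, cy, width, height):
    """Return (label, corners) when the top class is not an expected article, else None."""
    label = classify(probabilities)
    if label in NORMAL_CLASSES:
        return None
    return label, box_corners(cx, cy, width, height)
```

With the probabilities from the example above (0.10, 0.10, 0.30, 0.10 and 0.40) and a hypothetical frame centered at (20, 30) with width 10 and height 20, `detect_abnormal` returns the trash can together with the corners (15.0, 20.0, 25.0, 40.0).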
Corresponding to the method for detecting an abnormal article in a scene described above, an embodiment of the present invention also provides a device for detecting an abnormal article, as shown in Fig. 3, including:
a twin network feature extraction unit 310, a twin network feature fusion unit 320, a hole convolution calculation unit 330 and an abnormal article detection unit 340;
the twin network feature extraction unit 310 may be configured to input the scene detection map and the scene standard map into corresponding sub-networks of the twin network, and obtain a feature map of the scene detection map and a feature map of the scene standard map, respectively; wherein the subnetworks in the twin network are of the same architecture; the scene detection graph and the scene standard graph have the same scene identification.
The twin network feature fusion unit 320 may be configured to fuse the feature map of the scene detection map and the feature map of the scene standard map to obtain a common feature map; wherein the common feature map has all features of the scene detection map and the scene standard map.
The hole convolution calculating unit 330 may be configured to calculate, by using a plurality of hole convolutions and the common feature map, sub-feature maps of multiple types of objects in a scene enclosed by the target frame; the hole convolution has a two-layer structure, and the first layer of hole convolution is used for calculating to obtain the central point of the target frame; and the second layer of hole convolution is used for calculating and obtaining the width and height values of the target frame.
The abnormal object detection unit 340 may be configured to screen a sub-feature map with an abnormal object from sub-feature maps of multiple types of objects in the scene through a full-connection network, calculate coordinates of the sub-feature map with the abnormal object, and enclose the abnormal object with the target frame in the scene detection map.
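The way the four units hand data to one another can be sketched as a simple composition. This is an illustrative structure only: the class name `AbnormalArticleDetector` and its field names are hypothetical, and each unit is represented by an arbitrary callable rather than by an actual network.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class AbnormalArticleDetector:
    extract: Callable[[Any, Any], tuple]  # stands in for feature extraction unit 310
    fuse: Callable[[Any, Any], Any]       # stands in for feature fusion unit 320
    propose: Callable[[Any], Any]         # stands in for hole convolution calculation unit 330
    screen: Callable[[Any], Any]          # stands in for abnormal article detection unit 340

    def run(self, detection_map, standard_map):
        """Pass the two input maps through the four units in order."""
        det_feats, std_feats = self.extract(detection_map, standard_map)
        common = self.fuse(det_feats, std_feats)
        candidates = self.propose(common)
        return self.screen(candidates)
```

Keeping the units as separate callables reflects the statement that the units may or may not be physically separate: any of them can be replaced without changing the others.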
The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for the apparatus embodiment, since it is substantially similar to the method embodiment, it is relatively simple to describe, and reference may be made to some descriptions of the method embodiment for relevant points. The above-described apparatus embodiments are merely illustrative, and the units and modules described as separate components may or may not be physically separate. In addition, some or all of the units and modules may be selected according to actual needs to achieve the purpose of the solution of the embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
The foregoing is directed to embodiments of the present invention, and it is understood that various modifications and improvements can be made by those skilled in the art without departing from the spirit of the invention.

Claims (10)

1. A method for detecting an anomalous object in a scene, the method comprising:
respectively inputting the scene detection graph and the scene standard graph into corresponding sub-networks of the twin network, and respectively obtaining a feature graph of the scene detection graph and a feature graph of the scene standard graph; wherein the subnetworks in the twin network are of the same architecture; the scene detection graph and the scene standard graph have the same scene identification;
fusing the feature map of the scene detection map and the feature map of the scene standard map to obtain a common feature map; wherein the common feature map has all features of the scene detection map and the scene standard map;
calculating by adopting a plurality of void convolutions and the common feature graph respectively to obtain sub-feature graphs of the objects in the scene enclosed by the target frame; the hole convolution has a two-layer structure, and the first layer of hole convolution is used for calculating to obtain the central point of the target frame; the second layer of hole convolution is used for calculating and obtaining the width and height values of the target frame;
and screening out the sub-feature maps with abnormal articles from the sub-feature maps of the various articles in the scene through a full-connection network, calculating the coordinates of the sub-feature maps with the abnormal articles, and using the target frame to circle the abnormal articles in the scene detection map.
2. The method of claim 1, wherein the twin network has two sub-networks, and the inputting the scene detection map and the scene standard map into the corresponding sub-networks of the twin network respectively obtains the feature map of the scene detection map and the feature map of the scene standard map respectively comprises:
inputting a scene detection graph into a first sub-network of the twin network to obtain a feature map of the scene detection graph, and inputting a scene standard graph into a second sub-network of the twin network to obtain a feature map of the scene standard graph.
3. The method according to claim 1, wherein the twin network has three sub-networks, and the inputting the scene detection map and the two scene standard maps into the corresponding sub-networks of the twin network respectively to obtain the feature map of the scene detection map and the feature maps of the two scene standard maps respectively comprises:
inputting a scene detection graph into a first sub-network of the twin network to obtain a feature map of the scene detection graph, and inputting a first scene standard graph into a second sub-network of the twin network to obtain a feature map of the first scene standard graph; and inputting a second scene standard diagram into a third sub-network of the twin network to obtain a characteristic diagram of the second scene standard diagram.
4. The method of claim 1, wherein the plurality of classes of items in the scene comprise:
in a market passage scene, the multiple types of articles comprise people, doors, fire extinguishers, warning boards and abnormal articles; wherein the abnormal articles are articles except people, doors, fire extinguishers and warning boards which are adjacent in a physical space and are not separated.
5. The method according to claim 4, wherein the method for collecting the plurality of categories of objects further comprises:
collecting a scene standard graph;
putting a new article in the scene, collecting a corresponding scene detection graph, and marking the new article as an abnormal article; and the scene is a scene except the market channel.
6. The method of claim 1, wherein the sub-networks in the twin network comprise a backbone network consisting of a residual network and a feature pyramid network.
7. The method of claim 1, wherein the fusing the feature map of the scene detection map and the feature map of the scene standard map to obtain a common feature map comprises:
stacking the feature map of the scene detection map and the feature map of the scene standard map along the number of channels;
calculating the stacked characteristic graph and a convolution kernel to obtain the common characteristic graph; the size of the convolution kernel is 3x3, the step length is 1, and the number of the convolution kernels is half of the number of channels.
8. The method according to claim 1, wherein the calculating the plurality of hole convolutions and the common feature map to obtain the sub-feature maps of the plurality of classes of objects in the scene enclosed by the target frame comprises:
calculating three pre-target frames by respectively adopting three hole convolutions with expansion rates of 0, 1 and 2 and the common characteristic diagram;
stacking the three pre-target frames, and compressing to obtain a target frame;
and obtaining the sub-feature maps of the multiple types of objects in the scene enclosed by the target frame.
9. The method according to any one of claims 1-8, further comprising:
and adjusting the first layer structure in the cavity convolution by using the coordinates of the sub-feature graph with the abnormal object as a feedback signal.
10. An apparatus for detecting anomalous objects in a scene, said apparatus comprising:
the system comprises a twin network feature extraction unit, a twin network feature fusion unit, a cavity convolution calculation unit and an abnormal article detection unit;
the twin network feature extraction unit is used for respectively inputting the scene detection graph and the scene standard graph into corresponding sub-networks of the twin network to respectively obtain the feature graph of the scene detection graph and the feature graph of the scene standard graph; wherein the subnetworks in the twin network are of the same architecture; the scene detection graph and the scene standard graph have the same scene identification;
the twin network feature fusion unit is used for fusing the feature map of the scene detection map and the feature map of the scene standard map to obtain a common feature map; wherein the common feature map has all features of the scene detection map and the scene standard map;
the hole convolution calculating unit is used for calculating with the common feature map by adopting a plurality of hole convolutions to obtain sub-feature maps of various objects in a scene enclosed by the target frame; the hole convolution has a two-layer structure, and the first layer of hole convolution is used for calculating to obtain the central point of the target frame; the second layer of hole convolution is used for calculating and obtaining the width and height values of the target frame;
the abnormal article detection unit is used for screening out the sub-feature graphs with the abnormal articles from the sub-feature graphs of the multiple types of articles in the scene through a full-connection network, calculating the coordinates of the sub-feature graphs with the abnormal articles, and enclosing the abnormal articles in the scene detection graph by the target frame.
CN201911329567.2A 2019-12-20 2019-12-20 Method and device for detecting abnormal articles in scene Active CN111079671B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911329567.2A CN111079671B (en) 2019-12-20 2019-12-20 Method and device for detecting abnormal articles in scene

Publications (2)

Publication Number Publication Date
CN111079671A (en) 2020-04-28
CN111079671B (en) 2020-11-03

Family

ID=70316485

Country Status (1)

Country Link
CN (1) CN111079671B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11114153B2 (en) 2019-12-30 2021-09-07 Taiwan Semiconductor Manufacturing Co., Ltd. SRAM devices with reduced coupling capacitance

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108932508A (en) * 2018-08-13 2018-12-04 杭州大拿科技股份有限公司 A kind of topic intelligent recognition, the method and system corrected

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10621586B2 (en) * 2017-01-31 2020-04-14 Paypal, Inc. Fraud prediction based on partial usage data
US10078790B2 (en) * 2017-02-16 2018-09-18 Honda Motor Co., Ltd. Systems for generating parking maps and methods thereof
CN108416780B (en) * 2018-03-27 2021-08-31 福州大学 Object detection and matching method based on twin-region-of-interest pooling model
US10846593B2 (en) * 2018-04-27 2020-11-24 Qualcomm Technologies Inc. System and method for siamese instance search tracker with a recurrent neural network
CN110163197B (en) * 2018-08-24 2023-03-10 腾讯科技(深圳)有限公司 Target detection method, target detection device, computer-readable storage medium and computer equipment
CN109409263B (en) * 2018-10-12 2021-05-04 武汉大学 Method for detecting urban ground feature change of remote sensing image based on Siamese convolutional network
CN109492618A (en) * 2018-12-06 2019-03-19 复旦大学 Object detection method and device based on grouping expansion convolutional neural networks model
CN110288589B (en) * 2019-06-28 2021-07-02 四川大学 Hematoma expansion prediction method and device
CN110532886A (en) * 2019-07-31 2019-12-03 国网江苏省电力有限公司 A kind of algorithm of target detection based on twin neural network

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
"Object Detection Model Based on Deep Dilated Convolutional Networks by Fusing Transfer Learning";Yu Quan et al.;《IEEE》;20191210;全文 *
"基于深度神经网络的道路目标检测研究";陈佳鹏;《中国优秀硕士学位论文全文数据库 信息科技辑》;20180915(第09期);全文 *
"面向车辆检测的扩张全卷积神经网络";程雅慧 等;《计算机系统应用》;20181226;第28卷(第1期);全文 *

Also Published As

Publication number Publication date
CN111079671A (en) 2020-04-28

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20221207

Address after: 100012 5314, Floor 5, Building 6, No. 8, Beiyuanxiao Street, Chaoyang District, Beijing

Patentee after: Beijing Zhichuang Digital Technology Service Co.,Ltd.

Address before: No.103, no.1003, Nanxin Road, Nanshan community, Nanshan street, Nanshan District, Shenzhen City, Guangdong Province

Patentee before: Shenzhen Jizhi Digital Technology Co.,Ltd.