CN114419503A - Video data-based unattended agent vendor analysis method - Google Patents


Info

Publication number
CN114419503A
Authority
CN
China
Prior art keywords: preset, image, feature, network, monitoring
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210041074.4A
Other languages
Chinese (zh)
Inventor
战凯
项一东
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Shanghai Wentian Technology Development Co ltd
Original Assignee
Beijing Shanghai Wentian Technology Development Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Shanghai Wentian Technology Development Co ltd
Priority to CN202210041074.4A
Publication of CN114419503A
Legal status: Pending

Classifications

    • G PHYSICS › G06 COMPUTING; CALCULATING OR COUNTING › G06F ELECTRIC DIGITAL DATA PROCESSING › G06F18/00 Pattern recognition › G06F18/20 Analysing › G06F18/23 Clustering techniques
    • G PHYSICS › G06 COMPUTING; CALCULATING OR COUNTING › G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS › G06N3/00 Computing arrangements based on biological models › G06N3/02 Neural networks › G06N3/04 Architecture, e.g. interconnection topology › G06N3/045 Combinations of networks
    • G PHYSICS › G06 COMPUTING; CALCULATING OR COUNTING › G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS › G06N3/00 Computing arrangements based on biological models › G06N3/02 Neural networks › G06N3/08 Learning methods

Abstract

The invention provides a video-data-based unattended vendor analysis method, which comprises the following steps: acquiring a monitoring video through a preset monitoring device, and extracting frames from the monitoring video to obtain monitoring images; identifying and labeling the monitoring images based on a preset labelImg tool, and obtaining monitoring images containing roving vendors; and inputting the monitoring images into a preset detection network model for preprocessing, obtaining a prediction heat map, and identifying roving vendors in outdoor scenes through the prediction heat map.

Description

Video data-based unattended agent vendor analysis method
Technical Field
The invention relates to the technical field of video data analysis, and in particular to a video-data-based analysis method for unattended roving vendors.
Background
Deep learning is currently developing rapidly and is widely applied in image classification, natural language processing, and face recognition. Convolutional neural networks (CNNs) perform particularly well in computer vision and image recognition: a CNN can extract deep image information, learn complex high-level semantic information, and overcome noise interference in images of complex scenes. Commonly used CNN architectures include AlexNet, VGG, ResNet, and GoogLeNet; these networks extract image features that are then applied to image classification and detection tasks. Existing object detection and recognition methods, however, mostly detect and classify objects with fixed appearance features, such as mobile phones and televisions, and general-purpose object detection does not recognize roving vendors in outdoor scenes well. At present there is no analysis method specifically targeting roving vendors, which would help make urban management more standardized.
Disclosure of Invention
The invention provides a video-data-based unattended vendor analysis method to address the above problems.
The invention provides a video-data-based unattended vendor analysis method, which comprises the following steps:
acquiring a monitoring video through a monitoring device pre-installed in a monitoring area, and extracting frames from the monitoring video to obtain monitoring images;
identifying and labeling the monitoring images based on a preset labelImg tool, and obtaining monitoring images containing roving vendors;
and inputting the monitoring images into a preset detection network model for preprocessing, obtaining a prediction heat map, transmitting the prediction heat map to a preset big data processing center for model training, and identifying the vendor type.
As an embodiment of the present invention, before the identifying and labeling of the monitoring images based on a preset labelImg tool to acquire monitoring images containing roving vendors, the method further includes:
performing area identification and positioning on the corresponding monitoring area based on a preset monitoring device and the monitoring images, and meanwhile screening out monitoring areas where street vending is prohibited to determine screened monitoring areas;
acquiring the monitoring images corresponding to the screened monitoring areas, and determining screened images;
and acquiring the correspondence between each screened monitoring area and its screened images, and attaching a corresponding area label to each screened image according to the correspondence.
As an embodiment of the present technical solution, the identifying and labeling of the monitoring images based on a preset labelImg tool to obtain monitoring images containing roving vendors further includes:
collecting a labeling type set preset in the labelImg tool; wherein
the labeling type set at least comprises time labels, violation-type labels, and monitoring-area-type labels;
identifying the area-labeled monitoring images according to the labeling types to obtain identified images;
labeling the regions of roving vendors and pedestrians in the identified images based on the preset labelImg tool, retrieving the labeling type set, and generating corresponding category labels;
and adding the category labels to the corresponding identified images as watermarks to generate the monitoring images.
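labelImg stores its annotations as Pascal-VOC-style XML files. As an illustrative sketch (the class names, image size, and box coordinates below are hypothetical; the patent does not specify the annotation format beyond the use of labelImg), such an annotation can be generated as follows:

```python
import xml.etree.ElementTree as ET

def voc_annotation(filename, width, height, boxes):
    """Build a Pascal-VOC-style XML annotation (the format labelImg saves).

    `boxes` is a list of (label, xmin, ymin, xmax, ymax) tuples.
    """
    root = ET.Element("annotation")
    ET.SubElement(root, "filename").text = filename
    size = ET.SubElement(root, "size")
    ET.SubElement(size, "width").text = str(width)
    ET.SubElement(size, "height").text = str(height)
    for label, xmin, ymin, xmax, ymax in boxes:
        obj = ET.SubElement(root, "object")
        ET.SubElement(obj, "name").text = label
        box = ET.SubElement(obj, "bndbox")
        for tag, val in zip(("xmin", "ymin", "xmax", "ymax"),
                            (xmin, ymin, xmax, ymax)):
            ET.SubElement(box, tag).text = str(val)
    return ET.tostring(root, encoding="unicode")

# Hypothetical frame with one vendor region and one pedestrian region:
xml_text = voc_annotation("frame_0001.jpg", 1920, 1080,
                          [("roving_vendor", 200, 350, 640, 900),
                           ("pedestrian", 900, 300, 1100, 880)])
```

The category labels retrieved from the labeling type set would map onto the `name` field of each object.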
As an embodiment of the present technical solution, inputting the monitoring images into a preset detection network model for preprocessing includes:
step 1: scaling the monitoring image of the target monitoring area to a preset size to generate a scaled image;
step 2: inputting the scaled image into a preset CenterNet detection network model for normalization;
step 3: transmitting the normalized scaled image to the CenterNet detection network model for structural feature extraction, and determining a feature image;
step 4: transmitting the feature image to a prediction branch network preset in the CenterNet detection network model for prediction to obtain a prediction heat map.
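Steps 1–2 can be sketched as follows. The 512 x 512 input size, the nearest-neighbor resize, and the ImageNet mean/std constants are assumptions for illustration; the patent specifies only a preset size and a normalization step:

```python
import numpy as np

# ImageNet mean/std are an assumption; the patent only says "normalization".
MEAN = np.array([0.485, 0.456, 0.406])
STD = np.array([0.229, 0.224, 0.225])

def preprocess(frame, size=512):
    """Steps 1-2: scale an H x W x 3 uint8 frame to `size` x `size`
    (nearest-neighbor, for brevity) and normalize channel-wise."""
    h, w = frame.shape[:2]
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    scaled = frame[rows][:, cols]                  # nearest-neighbor resize
    x = scaled.astype(np.float32) / 255.0          # to [0, 1]
    return (x - MEAN) / STD                        # channel-wise normalize

frame = np.random.randint(0, 256, (1080, 1920, 3), dtype=np.uint8)
inp = preprocess(frame)
```

A production pipeline would typically use a proper interpolating resize, but the shape and normalization logic is the same.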
As an embodiment of the present technical solution, step 3 includes:
step 30: transmitting the normalized scaled image to a ResNet50 network structure preset in the CenterNet detection network model for feature extraction, and determining a ResNet50 feature image;
step 31: transmitting the ResNet50 feature image, with a preset stride, to a convolution network layer preset in the ResNet50 network structure for a max-pooling operation, and determining a first feature image; wherein
the convolution network layer comprises a first convolution network layer and a second convolution network layer, the convolution network layer is provided with a 5-layer network structure, and each layer of the network structure passes through one block;
step 32: and transmitting the first feature image to the CenterNet detection network model for three deconvolution (deconv) upsampling operations to generate the feature image.
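As a rough illustration of the shape arithmetic behind steps 30–32 (assuming, as in common CenterNet-with-ResNet50 configurations, five stride-2 downsampling stages followed by the three stride-2 deconv layers; the patent does not state the strides explicitly):

```python
def feature_map_size(input_size, down_stages=5, deconv_stages=3):
    """Spatial size after the backbone's stride-2 stages and the
    three stride-2 deconv upsampling operations of step 32."""
    size = input_size
    for _ in range(down_stages):      # conv/pool stages, each halving
        size //= 2
    for _ in range(deconv_stages):    # deconv upsampling, each doubling
        size *= 2
    return size

# A 512 x 512 input gives a 128 x 128 feature map (output stride 4),
# matching common CenterNet-with-ResNet50 configurations.
out = feature_map_size(512)
```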
As an embodiment of the present technical solution, step 4 includes:
step 40: transmitting the feature image to the prediction branch networks preset in the CenterNet detection network model, and calculating the predicted length-width size; wherein
the prediction branch networks at least comprise a first prediction branch network, a second prediction branch network, and a third prediction branch network;
the first prediction branch network is used for shallow-feature prediction analysis of the feature image;
the second prediction branch network is used for deep-feature prediction analysis of the feature image;
the third prediction branch network is used for fused prediction analysis of the shallow and deep features of the feature image;
step 41: calculating the predicted center-point offset based on the prediction branch networks preset in the CenterNet detection network model;
step 42: and sequentially transmitting the feature images to the prediction branch networks preset in the CenterNet detection network model, and generating the prediction heat map based on the predicted length-width size and the predicted center-point offset.
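A minimal sketch of how the predicted heat map, length-width size, and center-point offset might be decoded into detections; the 3 x 3 local-maximum test, the confidence threshold, and the output stride of 4 are illustrative assumptions, not taken from the patent:

```python
import numpy as np

def decode_heatmap(heat, wh, offset, stride=4, thresh=0.3):
    """Turn CenterNet-style predictions into boxes: `heat` is an H x W
    center-point heat map, `wh` an H x W x 2 size map, and `offset` an
    H x W x 2 sub-pixel offset map."""
    boxes = []
    H, W = heat.shape
    for y in range(H):
        for x in range(W):
            v = heat[y, x]
            if v < thresh:
                continue
            # keep only local maxima (a simple 3 x 3 peak test)
            patch = heat[max(0, y - 1):y + 2, max(0, x - 1):x + 2]
            if v < patch.max():
                continue
            cx = (x + offset[y, x, 0]) * stride   # predicted center point
            cy = (y + offset[y, x, 1]) * stride
            w, h = wh[y, x] * stride              # predicted length-width
            boxes.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2, v))
    return boxes

# Synthetic 128 x 128 prediction with one confident center at cell (40, 60):
heat = np.zeros((128, 128)); heat[60, 40] = 0.9
wh = np.ones((128, 128, 2)) * 8
off = np.zeros((128, 128, 2))
boxes = decode_heatmap(heat, wh, off)
```

In a real pipeline the `heat`, `wh`, and `offset` arrays would come from the three prediction branch heads of the trained model.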
As an embodiment of the present technical solution, step 31 further includes:
step 310: clustering the ResNet50 feature images based on an SOM (self-organizing map) network clustering algorithm preset in the ResNet50 network structure, and determining a clustering feature map;
step 311: inputting the clustering feature map to an SOM network input layer preset in the convolution network layer of the ResNet50 network structure to generate corresponding input nodes; wherein
the SOM network input layer corresponds to the high-dimensional feature input vectors of the clustering feature map;
step 312: based on a preset step length, retrieving the output layer channel in the SOM network output layer closest to the input node, updating the historical winning output layer channel with this shortest-distance channel, and determining the winning output layer channel;
step 313: updating the weights of the output layer channels in the neighborhood within a preset range of the winning output layer channel, and collecting the updated output nodes on the winning output layer; wherein
the output nodes preserve the topological characteristics of the input vectors, and the input nodes are connected to the output nodes through weight vectors;
step 314: and decompressing and reading the output nodes to determine the first feature image.
As an embodiment of the present technical solution, the step 312 includes:
step 3120: initializing all output layer units, performing weight assignment on each input node in the initialized output layer units, and determining an assignment node;
step 3121: randomly selecting an input vector of the input second feature map, and setting an input channel of a feature space based on the input vector;
x = {x_i : i = 1, ..., D}
wherein x is the input pattern set, x_i represents the i-th input channel, and D represents the total number of input channels;
step 3122: acquiring the connection weights between the input channels and the preset neurons, and computing the squared Euclidean distance between the input channel and each neuron's connection weights:
d_j(x) = Σ_{i=1}^{D} (x_i − w_{ji})²
wherein d_j(x) represents the squared Euclidean distance, w_{ji} represents the connection weight between input channel x_i and neuron j, j = 1, 2, ..., N, and N represents the total number of neurons;
step 3123: screening the weight vector closest to the input channel based on the preset step length and the squared Euclidean distance, locating the corresponding neuron, and determining the winning neuron;
step 3124: acquiring the competition information of the winning neuron, mapping it to the corresponding input channel, updating the weight of the historical winning output layer channel, and determining the corresponding discrete output channel; wherein
the weight of the historical winning output layer channel is updated as Δw_{ji} = β(t) · T_{j,I(x)} · (x_i − w_{ji})
wherein I(x) represents the lookup index of the input channel for the winning neuron, T_{j,I(x)} represents the topological neighborhood of neuron j and the lookup index of the corresponding input channel, Δw_{ji} represents the weight update of the historical winning output layer channel, t represents the contraction time of the topological neighborhood, and β(t) represents a weight adjustment factor dependent on the contraction time;
step 3125: fitting the discrete output channels to generate a winning space;
step 3126: and adjusting the regional weights of the neighborhood within a preset range of the winning space, bringing the regional weights and the input channel weights within a preset threshold of each other, and determining the winning output layer channel.
As an embodiment of the present technical solution, the neurons are mapped through a topological neighborhood in a preset self-organizing feature map (SOM) network, where the topological neighborhood is:
T_{j,I(x)} = exp(−S_{j,I(x)}² / (2σ(t)²))
wherein j = 1, 2, ..., N, N represents the total number of neurons, I(x) represents the lookup index of the input channel for the winning neuron, S_{j,I(x)} represents the lateral distance between neuron j and the lookup index of the corresponding input channel, and T_{j,I(x)} represents the topological neighborhood of neuron j and the lookup index of the corresponding input channel;
the topological neighborhood satisfies a decaying time-dependent function:
σ(t) = σ_0 · exp(−t / τ_0)
wherein t represents the contraction time of the topological neighborhood, σ(t) represents the decaying time-dependent function of the contraction time t, τ_0 represents the decay range of the topological neighborhood, and σ_0 represents the initial degree of decay of the topological neighborhood.
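The formulas of steps 3122–3126 and the decaying topological neighborhood can be sketched in one competitive-learning update. The grid size, the learning-rate schedule β(t), and the constants beta0, sigma0, tau0 below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def som_step(weights, grid, x, t, beta0=0.5, sigma0=2.0, tau0=10.0):
    """One competitive-learning step implementing the formulas above:
    squared Euclidean distance d_j(x), winner selection I(x),
    neighborhood T_{j,I(x)} with decaying width sigma(t), and the
    update dW_ji = beta(t) * T * (x_i - w_ji)."""
    d = ((x - weights) ** 2).sum(axis=1)              # d_j(x)
    winner = int(d.argmin())                          # winning neuron I(x)
    S = np.linalg.norm(grid - grid[winner], axis=1)   # lateral distance S
    sigma = sigma0 * np.exp(-t / tau0)                # sigma(t)
    T = np.exp(-S ** 2 / (2 * sigma ** 2))            # topological neighborhood
    beta = beta0 * np.exp(-t / tau0)                  # decaying learning rate
    weights += beta * T[:, None] * (x - weights)      # dW_ji
    return winner

# A hypothetical 4 x 4 output layer clustering 3-D feature vectors:
grid = np.array([(i, j) for i in range(4) for j in range(4)], dtype=float)
weights = rng.random((16, 3))
data = rng.random((200, 3))
for t, x in enumerate(data):
    som_step(weights, grid, x, t)
```

After the loop, the weight vectors preserve the topology of the input space, so nearby grid units respond to similar feature vectors.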
As an embodiment of the present invention, the transmitting of the prediction heat map to a preset big data processing center for model training and the identifying of the vendor type include:
performing model training on the prediction heat map based on the preset big data processing center, extracting features from vendor data and vendor-free data respectively, and determining first-class features and second-class features;
transmitting the extracted first-class and second-class features respectively to a preset SOM (self-organizing map) network for cluster analysis and recognition, and determining a clustering feature map;
searching a preset feature library with the clustering feature map, performing similarity calculation between the clustering feature map and the feature maps prestored in the feature library, and determining similarities;
sorting the similarities, taking the similarities within a preset ranking range, retrieving the corresponding feature maps, and identifying the feature categories of these feature maps;
and determining the identified vendor type based on the feature categories.
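A minimal sketch of the retrieval described above. Cosine similarity, the library keys, and the feature vectors are illustrative assumptions, since the patent specifies only a similarity calculation followed by ranking:

```python
import numpy as np

def rank_library(query, library, top_k=3):
    """Retrieve the `top_k` most similar prestored feature maps.
    Cosine similarity is an assumption; the patent only specifies a
    'similarity calculation' against the feature library."""
    q = query / np.linalg.norm(query)
    sims = {}
    for name, feat in library.items():
        sims[name] = float(q @ (feat / np.linalg.norm(feat)))
    # sort by similarity and keep the preset ranking range (top_k)
    return sorted(sims.items(), key=lambda kv: kv[1], reverse=True)[:top_k]

# Hypothetical feature library keyed by vendor category:
library = {
    "food_cart": np.array([0.9, 0.1, 0.2]),
    "goods_stall": np.array([0.1, 0.9, 0.3]),
    "no_vendor": np.array([0.2, 0.2, 0.9]),
}
ranked = rank_library(np.array([0.85, 0.15, 0.25]), library, top_k=2)
vendor_type = ranked[0][0]
```

The feature category of the top-ranked match then determines the identified vendor type.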
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and drawings.
The technical solution of the present invention is further described in detail by the accompanying drawings and embodiments.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the principles of the invention and not to limit the invention. In the drawings:
fig. 1 is a flowchart of a video-data-based unattended vendor analysis method in an embodiment of the present invention;
fig. 2 is a flowchart of a video-data-based unattended vendor analysis method in an embodiment of the present invention;
fig. 3 is a flowchart of a video-data-based unattended vendor analysis method in an embodiment of the present invention.
Detailed Description
The preferred embodiments of the present invention will be described in conjunction with the accompanying drawings, and it will be understood that they are described herein for the purpose of illustration and explanation and not limitation.
It will be understood that when an element is referred to as being "secured to" or "disposed on" another element, it can be directly on the other element or be indirectly on the other element. When an element is referred to as being "connected to" another element, it can be directly or indirectly connected to the other element.
It will be understood that the terms "length," "width," "upper," "lower," "front," "rear," "left," "right," "vertical," "horizontal," "top," "bottom," "inner," "outer," and the like, as used herein, refer to an orientation or positional relationship indicated in the drawings that is solely for the purpose of facilitating the description and simplifying the description, and do not indicate or imply that the device or element being referred to must have a particular orientation, be constructed and operated in a particular orientation, and is therefore not to be construed as limiting the invention.
Moreover, it is noted that, in this document, relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions, and "a plurality" means two or more unless specifically limited otherwise. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus.
Although embodiments of the present invention have been shown and described, it will be appreciated by those skilled in the art that changes, modifications, substitutions and alterations can be made in these embodiments without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.
Example 1:
as shown in fig. 1, an embodiment of the present invention provides a method for analyzing unattended vendors based on video data, including:
acquiring a monitoring video through a monitoring device pre-installed in a monitoring area, and extracting frames from the monitoring video to obtain monitoring images;
identifying and labeling the monitoring images based on a preset labelImg tool, and obtaining monitoring images containing roving vendors;
and inputting the monitoring images into a preset detection network model for preprocessing, obtaining a prediction heat map, transmitting the prediction heat map to a preset big data processing center for model training, and identifying the vendor type.
The working principle and the beneficial effects of the technical scheme are as follows:
the embodiment of the invention provides a video-data-based unattended vendor analysis method. A monitoring video is collected through a preset monitoring device and extracted frame by frame to obtain monitoring images, which determine the imagery within the monitoring range and provide raw data for the later identification of roving vendors. The monitoring images are identified and labeled with a preset labelImg tool to obtain monitoring images containing roving vendors. These images are input into a preset detection network model for preprocessing to obtain a prediction heat map, which is transmitted to a preset big data processing center for model training to identify the vendor type. For videos or images of roving vendors in outdoor scenes, the target area of the object is detected, the corresponding feature points are extracted from the current area, a vendor feature image sequence is established on an AI (artificial intelligence) big data platform, and retrieval and comparison are performed between the extracted feature points and the feature image sequence, thereby realizing the identification of roving vendors in outdoor scenes.
Example 2:
as shown in fig. 2, the present technical solution provides an embodiment. Before the identifying and labeling of the monitoring images based on a preset labelImg tool to acquire monitoring images containing roving vendors, the method further includes:
performing area identification and positioning on the corresponding monitoring area based on a preset monitoring device and the monitoring images, and meanwhile screening out monitoring areas where street vending is prohibited to determine screened monitoring areas;
acquiring the monitoring images corresponding to the screened monitoring areas, and determining screened images;
and acquiring the correspondence between each screened monitoring area and its screened images, and attaching a corresponding area label to each screened image according to the correspondence.
The working principle and the beneficial effects of the technical scheme are as follows:
in this technical scheme, before the monitoring images are identified and labeled with the preset labelImg tool to obtain monitoring images containing roving vendors, the monitoring area is positioned based on the preset monitoring device and the monitoring images. For fixed areas such as hospital or community gates, monitoring areas where street stalls are forbidden are screened out, since selling goods along the street is not allowed in designated areas, and the screened monitoring areas are thereby determined. The monitoring images corresponding to the screened monitoring areas are acquired, the screened images are determined, the correspondence between the screened images and their monitoring areas is collected, and area labels are set based on this correspondence.
Example 3:
this technical solution provides an embodiment. The identifying and labeling of the monitoring images based on the preset labelImg tool to obtain monitoring images containing roving vendors further includes:
collecting a labeling type set preset in the labelImg tool; wherein
the labeling type set at least comprises time labels, violation-type labels, and monitoring-area-type labels;
identifying the area-labeled monitoring images according to the labeling types to obtain identified images;
labeling the regions of roving vendors and pedestrians in the identified images based on the preset labelImg tool, retrieving the labeling type set, and generating corresponding category labels;
and adding the category labels to the corresponding identified images as watermarks to generate the monitoring images.
The working principle and the beneficial effects of the technical scheme are as follows:
in this technical scheme, the monitoring images are identified and labeled based on the preset labelImg tool to obtain monitoring images containing roving vendors. A labeling type set preset in the labelImg tool is collected; the labelImg tool can annotate images, and the labeling type set at least comprises time labels, violation-type labels, and monitoring-area-type labels. The area-labeled monitoring images are identified by labeling type to obtain identified images; that is, the regions of roving vendors in the identified images are labeled with the preset labelImg tool, the labeling type set is retrieved, and corresponding category labels are generated. The labeling types are stamped onto the identified images as watermarks to generate the monitoring images, which improves vendor identification efficiency.
Example 4:
the technical scheme provides an embodiment in which inputting the monitoring images into a preset detection network model for preprocessing comprises:
step 1: scaling the monitoring image of the target monitoring area to a preset size to generate a scaled image;
step 2: inputting the scaled image into a preset CenterNet detection network model for normalization;
step 3: transmitting the normalized scaled image to the CenterNet detection network model for structural feature extraction, and determining a feature image;
step 4: transmitting the feature image to a prediction branch network preset in the CenterNet detection network model for prediction to obtain a prediction heat map.
The working principle and the beneficial effects of the technical scheme are as follows:
in this technical scheme, the monitoring image is input into a preset detection network model for preprocessing. The monitoring image is scaled to a preset size to determine a scaled image, which is input into a preset CenterNet detection network model for normalization; normalization unifies the format of the images and reduces the processing burden of the detection network model. The normalized scaled image is then transmitted to the CenterNet detection network model for structural feature extraction, and feature images are determined; the image features are recognized, the feature images comprising a first feature image and a second feature image. The second feature image is transmitted to the prediction branch network preset in the CenterNet detection network model for prediction, and a prediction heat map is obtained. Scaling the image effectively saves memory and increases the image recognition speed, and the images selected for key analysis can be resolved individually at high resolution, thereby improving image processing efficiency.
Example 5:
this technical solution provides an embodiment of step 3, including:
step 30: transmitting the normalized scaled image to a ResNet50 network structure preset in the CenterNet detection network model for feature extraction, and determining a ResNet50 feature image;
step 31: transmitting the ResNet50 feature image, with a preset stride, to a convolution network layer preset in the ResNet50 network structure for a max-pooling operation, and determining a first feature image; wherein
the convolution network layer comprises a first convolution network layer and a second convolution network layer, the convolution network layer is provided with a 5-layer network structure, and each layer of the network structure passes through one block;
step 32: and transmitting the first feature image to the CenterNet detection network model for three deconvolution (deconv) upsampling operations to generate the feature image.
The working principle and the beneficial effects of the technical scheme are as follows:
in this technical scheme, the normalized scaled image is transmitted to the ResNet50 network structure preset in the CenterNet detection network model for feature extraction, and a ResNet50 feature image is determined. The ResNet50 feature image is transmitted, with a preset stride, to the convolution network layer preset in the ResNet50 network structure for max pooling, and a first feature image is determined; the convolution network layer comprises a first and a second convolution network layer, has a 5-layer network structure, and each layer passes through one block. The first feature image is transmitted to the CenterNet detection network model for three deconvolution (deconv) upsampling operations to generate a second feature image. Through repeated convolution and recognition, the image is finely identified and the feature images are recognized. The ResNet50 network structure provides a trained, lightweight network that accelerates the processing of feature images in a unified format, and the repeated convolution sampling achieves high-precision recognition on top of this efficiency.
Example 6:
this technical solution provides an embodiment of step 4, including:
step 40: transmitting the feature image to the prediction branch networks preset in the CenterNet detection network model, and calculating the predicted length-width size; wherein
the prediction branch networks at least comprise a first prediction branch network, a second prediction branch network, and a third prediction branch network;
the first prediction branch network is used for shallow-feature prediction analysis of the feature image;
the second prediction branch network is used for deep-feature prediction analysis of the feature image;
the third prediction branch network is used for fused prediction analysis of the shallow and deep features of the feature image;
step 41: calculating the predicted center-point offset based on the prediction branch networks preset in the CenterNet detection network model;
step 42: and sequentially transmitting the feature images to the prediction branch networks preset in the CenterNet detection network model, and generating the prediction heat map based on the predicted length-width size and the predicted center-point offset.
The working principle and the beneficial effects of the technical scheme are as follows:
in this technical scheme, the predicted length-width size is calculated based on the prediction branch networks preset in the CenterNet detection network model. The prediction branch networks at least comprise a first, a second, and a third prediction branch network, and the three branches can extract feature images of different feature types. Based on the prediction branch networks preset in the CenterNet detection network model, the predicted center-point offset is calculated, the second feature images are sequentially transmitted to the prediction branch networks, and a prediction heat map is generated from the predicted length-width size and the predicted center-point offset. The prediction heat map marks key points of specific parts of the image with colors or lines, so image features can be effectively distinguished and the key-point regions of the image identified.
Example 7:
This technical solution provides an embodiment in which step 31 further includes:
step 310: clustering the ResNet50 feature images based on the SOM network clustering algorithm preset in the ResNet50 network structure mechanism, and determining a clustering feature map;
step 311: inputting the clustering feature map to the SOM network input layer preset in the convolution network layer of the preset ResNet50 network structure mechanism, and generating corresponding input nodes; wherein,
the SOM network input layer corresponds to the high-dimensional feature input vector of the clustering feature map;
step 312: based on a preset step length, retrieving the output layer channel in the SOM network output layer that is closest to the input node, updating the historical winning output layer channel with it, and determining the winning output layer channel;
step 313: updating the weights of the output layer channels in the neighborhood within a preset range of the winning output layer channel, and collecting the updated output nodes on the winning output layer; wherein,
the output nodes preserve the topological features of the input vectors, and the input nodes are connected to the output nodes through weight vectors;
step 314: decompressing and reading the output nodes to determine the first feature image.
The working principle and the beneficial effects of the technical scheme are as follows:
In this technical scheme, the ResNet50 feature images are clustered by the SOM network clustering algorithm preset in the ResNet50 network structure mechanism, and a clustering feature map is determined. The clustering feature map is input to the SOM network input layer preset in the convolution network layer of the ResNet50 network structure mechanism to generate corresponding input nodes, where the input layer corresponds to a high-dimensional input vector. Based on a preset step length, the output layer channel closest to the input node is retrieved, the historical winning output layer channel is updated, and the winning output layer channel is determined. The weights of the output layer channels in the neighborhood within the preset range of the winning channel are updated to obtain the output nodes, which preserve the topological features of the input vectors and are connected to the input nodes through weight vectors. Finally, the output nodes are decompressed and read to determine the first feature image. Clustering the images with larger feature weights improves the recognition accuracy of the feature image.
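Steps 310-314 can be pictured with a minimal self-organizing-map sketch that clusters feature vectors and reads back each sample's winning node. This is an illustration only; the one-dimensional output layer, the node count, the epoch count and the exponential decay schedules are assumptions not taken from the embodiment:

```python
import numpy as np

def som_cluster(features, n_nodes=4, epochs=20, seed=0):
    """Cluster feature vectors with a tiny 1-D SOM (steps 310-314, sketched).

    features : (M, D) array of feature vectors from the ResNet50 stage
    Returns (assignments, weights): winning node per sample, final weights.
    """
    rng = np.random.default_rng(seed)
    M, D = features.shape
    W = rng.random((n_nodes, D))                      # initialized output layer
    for t in range(epochs):
        sigma = n_nodes / 2 * np.exp(-t / epochs)     # shrinking neighborhood
        lr = 0.5 * np.exp(-t / epochs)                # shrinking learning rate
        for x in features[rng.permutation(M)]:
            # winning node: smallest squared distance to the input
            winner = int(np.argmin(((x - W) ** 2).sum(axis=1)))
            S = np.abs(np.arange(n_nodes) - winner)   # lateral index distance
            T = np.exp(-S ** 2 / (2 * sigma ** 2))    # topological neighborhood
            W += lr * T[:, None] * (x - W)            # pull nodes toward x
    assignments = np.array([int(np.argmin(((x - W) ** 2).sum(axis=1)))
                            for x in features])
    return assignments, W
```

The per-sample winning-node indices play the role of the "clustering feature map": samples with similar features land on the same or neighboring output nodes.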
Example 8:
This technical solution provides an embodiment in which step 312 includes:
step 3120: initializing all output layer units, performing weight assignment on each input node in the initialized output layer units, and determining an assignment node;
step 3121: randomly selecting an input vector of the input second feature map, and setting an input channel of a feature space based on the input vector;
x = \{ x_i : i = 1, \ldots, D \}
where x is the set of input patterns, x_i denotes the i-th input channel, and D denotes the total number of input channels;
step 3122: acquiring connection weight between an input channel and a preset neuron, and judging the squared Euclidean distance between the input channel and the connection weight of each neuron;
d_j(x) = \sum_{i=1}^{D} (x_i - w_{ji})^2
where d_j(x) denotes the squared Euclidean distance, w_{ji} denotes the connection weight between input channel x_i and neuron j, j = 1, 2, \ldots, N, and N denotes the total number of neurons;
step 3123: screening a weight vector closest to an input channel based on a preset step length and the squared Euclidean distance, searching corresponding neurons, and determining a winning neuron;
step 3124: acquiring competition information of a winning neuron, mapping the competition information to a corresponding input channel, updating the weight of a historical winning output layer channel, and determining a corresponding discrete output channel; wherein
The weight of the historical winning output layer channel is updated as \Delta w_{ji} = \beta(t) \cdot T_{j,I(x)} \cdot (x_i - w_{ji})
where I(x) denotes the lookup index of the input channel for the winning neuron, T_{j,I(x)} denotes the topological neighborhood of neuron j and the lookup index of the corresponding input channel, \Delta w_{ji} denotes the weight update of the historical winning output layer channel, t denotes the contraction time of the topological neighborhood, and \beta(t) denotes the weight adjustment domain related to the contraction time;
step 3125: fitting the discrete output channels to generate a winning space;
step 3126: and adjusting the regional weight of the adjacent region of the preset range in the winning space, and closing the regional weight and the weight of the input channel within a preset threshold value to determine a winning output layer channel.
The working principle and the beneficial effects of the technical scheme are as follows:
In this embodiment, all output layer units are initialized, weight assignment is performed on each input node in the initialized output layer units, and assignment nodes are determined. An input vector of the input second feature map is randomly selected, and an input channel x of the feature space is set based on the input vector. The connection weights between the input channels and the preset neurons are acquired, and the squared Euclidean distance d_j(x) between each input channel and the connection weight of each neuron is computed. The discrete output channels are fitted to generate a winning space. The regional weights of the neighborhood within the preset range of the winning space are adjusted so that they come within a preset threshold of the input channel weights, and the winning output layer channel is determined. This speeds up the training of the topological neighborhood search, makes the search for the winning space more accurate, and provides accurate raw data for image retrieval and recognition.
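A single competition-and-update step built directly from the formulas of this embodiment (d_j(x) and \Delta w_{ji}) can be sketched as follows. The one-dimensional lateral distance S and the exponential form of the learning rate \beta(t) are assumptions for the example:

```python
import numpy as np

def som_step(x, W, t, sigma0=2.0, tau0=10.0, beta0=0.5):
    """One SOM competition/update step, following the patent's notation.

    x : (D,) input vector, one input channel per component x_i
    W : (N, D) connection weights w_ji between input channels and N neurons
    Returns (index of the winning neuron, updated weights).
    """
    # d_j(x) = sum_i (x_i - w_ji)^2  -- squared Euclidean distance per neuron
    d = ((x - W) ** 2).sum(axis=1)
    winner = int(np.argmin(d))                        # winning neuron I(x)

    # shrinking neighborhood width: sigma(t) = sigma0 * exp(-t / tau0)
    sigma_t = sigma0 * np.exp(-t / tau0)
    # lateral distance S_{j,I(x)}, taken here as index distance on a 1-D map
    S = np.abs(np.arange(len(W)) - winner)
    # topological neighborhood T_{j,I(x)} = exp(-S^2 / (2 sigma(t)^2))
    T = np.exp(-S ** 2 / (2 * sigma_t ** 2))

    beta_t = beta0 * np.exp(-t / tau0)                # assumed decay of beta(t)
    # delta w_ji = beta(t) * T_{j,I(x)} * (x_i - w_ji)
    W = W + beta_t * T[:, None] * (x - W)
    return winner, W
```

The winner's weight vector moves toward the input by a fraction \beta(t), and neighboring neurons move by smaller amounts weighted by T_{j,I(x)}.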
Example 9:
This technical solution provides an embodiment in which the neuron maps the topological neighborhood in the preset self-organizing feature map (SOM) network, the topological neighborhood being as follows:
T_{j,I(x)} = \exp\!\left( -\frac{S_{j,I(x)}^2}{2\sigma(t)^2} \right)
where j = 1, 2, \ldots, N, N denotes the total number of neurons, I(x) denotes the lookup index of the input channel for the winning neuron, S_{j,I(x)} denotes the lateral distance between neuron j and the lookup index of the corresponding input channel, and T_{j,I(x)} denotes the topological neighborhood of neuron j and the lookup index of the corresponding input channel;
the topological neighborhood satisfies the decay time dependent function:
\sigma(t) = \sigma_0 \exp\!\left( -\frac{t}{\tau_0} \right)
where t denotes the contraction time of the topological neighborhood, \sigma(t) denotes the decaying time-dependent function of the contraction time t, \tau_0 denotes the decay range of the topological neighborhood, and \sigma_0 denotes the degree of decay of the topological neighborhood.
The working principle and the beneficial effects of the technical scheme are as follows:
In this embodiment, the neuron maps the topological neighborhood T_{j,I(x)} in the preset self-organizing feature map network, and the topological neighborhood satisfies the decay time-dependent function \sigma(t). This improves the data training speed, provides a robust neural training network, and supports multi-neuron wake-up data queries.
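The neighborhood and its decay can be computed directly from the two formulas above; the \sigma_0 and \tau_0 values here are arbitrary example constants, not values from the embodiment:

```python
import numpy as np

def neighborhood(S, t, sigma0=2.0, tau0=10.0):
    """Topological neighborhood T_{j,I(x)} with shrinking width sigma(t).

    S : lateral distance S_{j,I(x)} between neuron j and the winner I(x)
    t : contraction time
    """
    sigma_t = sigma0 * np.exp(-t / tau0)   # sigma(t) = sigma0 * exp(-t / tau0)
    return np.exp(-S ** 2 / (2 * sigma_t ** 2))
```

At S = 0 and t = 0 the neighborhood is exactly 1; it falls off both with lateral distance from the winner and with contraction time, which is what shrinks the update region as training proceeds.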
Example 10:
This technical solution provides an embodiment of transmitting the prediction heat map to the preset big data processing center for model training and identifying the vendor type, comprising:
performing model training on the prediction heat map based on the preset big data processing center, performing feature extraction on the vendor data and the vendor-free data respectively, and determining first-class features and second-class features;
respectively transmitting the extracted first-class features and second-class features to the preset SOM (self-organizing map) network for cluster analysis and recognition, and determining a clustering feature map;
searching the clustering feature map in a preset feature library, performing similarity calculation between the clustering feature map and the feature maps pre-stored in the feature library, and determining the similarity;
sorting the similarity, determining the similarity in a preset sorting range, searching a corresponding feature map, and identifying the feature category of the feature map;
based on the feature class, an identification type of a vendor is determined.
The working principle and the beneficial effects of the technical scheme are as follows:
In this technical scheme, the prediction heat map is transmitted to the preset big data processing center for model training, and the vendor type is identified. Model training is performed on the prediction heat map based on the preset big data processing center, feature extraction is performed on the vendor data and the vendor-free data respectively, and first-class and second-class features are determined. The extracted first-class and second-class features are respectively transmitted to the preset SOM (self-organizing map) network for cluster analysis and recognition, and a clustering feature map is determined. The clustering feature map is searched in the preset feature library, similarity calculation is performed between it and the pre-stored feature maps, and the similarity is determined. The similarities are sorted, those within a preset sorting range are selected, the corresponding feature maps are retrieved, and their feature categories are identified. Based on the feature categories, the identification type of the vendor is determined. Vendor data under different scenes are collected in advance, for example, data around hospitals and communities, and images conforming to real scenes are selected. The selected images are input into a pre-trained model to extract feature maps, and a feature library is established and stored with the extracted feature maps. During feature search, the distance between the query and the established feature library is calculated, the value with the minimum distance is found, and the corresponding result is output.
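The minimum-distance lookup against the pre-stored feature library described above can be sketched as follows; the category labels and the Euclidean distance metric are assumptions for the example:

```python
import numpy as np

def search_feature_library(query, library, top_k=3):
    """Rank pre-stored feature maps by distance to a query feature.

    query   : (D,) feature vector extracted from the prediction heat map
    library : dict mapping category label -> (D,) stored feature vector
    Returns the top_k (label, distance) pairs, nearest first.
    """
    dists = {label: float(np.linalg.norm(query - feat))
             for label, feat in library.items()}
    # sort by distance and keep the closest top_k entries
    return sorted(dists.items(), key=lambda kv: kv[1])[:top_k]
```

The label of the closest entry is the identified type; returning the top_k nearest entries mirrors the "preset sorting range" of similarities in the scheme above.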
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.

Claims (10)

1. An analysis method for unattended vendor transaction based on video data, comprising:
acquiring a monitoring video through a monitoring device pre-installed in a monitoring area, and intercepting the monitoring video according to frames to obtain a monitoring image;
based on a preset labelImg tool, identifying and marking the monitoring image, and acquiring a monitoring image containing a vendor;
and inputting the monitoring image into a preset detection network model for preprocessing, acquiring a prediction heat map, transmitting the prediction heat map to a preset big data processing center for model training, and identifying the type of the vendor.
2. The video data-based unattended vendor analysis method according to claim 1, wherein before identifying and marking the monitoring image based on the preset labelImg tool and acquiring the monitoring image containing a vendor, the method further comprises:
based on the preset monitoring device and the monitoring image, performing area identification and positioning on the corresponding monitoring area, and meanwhile screening the monitoring areas where vendors are prohibited from peddling, to determine screened monitoring areas;
acquiring a monitoring image corresponding to the screened monitoring area, and determining a screened image;
and acquiring the corresponding relation between the screening monitoring area and the corresponding screening image, and setting a corresponding area label on the screening image according to the corresponding relation.
3. The method as claimed in claim 1, wherein the step of identifying and marking the monitoring images based on the preset labelImg tool to obtain the monitoring images containing vendors comprises:
collecting a marking type set preset in the labelImg tool; wherein,
the marking type set at least comprises time marking, violation type marking and monitoring area type marking;
identifying the monitoring image of the area label according to the marking type to obtain an identification image;
labeling the vendor and pedestrian areas in the identified image based on the preset labelImg tool, retrieving the marking type set, and generating corresponding category labels;
and the class labels are added to the corresponding identification images in a watermark mode to generate monitoring images.
4. The method of claim 1, wherein the inputting the monitoring image into a predetermined detection network model for preprocessing comprises:
step 1: zooming the monitoring image of the target monitoring area by a preset size to generate a zoomed image;
step 2: inputting the scaled image into a preset CenterNet detection network model for normalization processing;
and step 3: transmitting the scaled image after normalization processing to a CenterNet detection network model for structural feature extraction, and determining a feature image;
and 4, step 4: and transmitting the characteristic image to a prediction branch network preset in a CenterNet detection network model for prediction to obtain a prediction heat map.
5. The method of claim 4, wherein the step 3 comprises:
step 30: the scaled image after normalization processing is transmitted to a ResNet50 network structure mechanism preset in a CenterNet detection network model for feature extraction, and a ResNet50 feature image is determined;
step 31: transmitting the ResNet50 feature image, with a preset step length, to the convolution network layer preset in the ResNet50 network structure mechanism for a maximum pooling operation, and determining a first feature image; wherein,
the convolution network layer comprises a first convolution network layer and a second convolution network layer; the convolution network layer is provided with a 5-layer network structure, and each layer of the network structure passes through a block area;
step 32: and transmitting the first characteristic image to a CenterNet detection network model for deconv deconvolution three times of sampling to generate a characteristic image.
6. The method of claim 4, wherein the step 4 comprises:
step 40: transmitting the feature image to the prediction branch network preset in the CenterNet detection network model, and calculating the predicted length and width; wherein,
the predicted branch networks include at least a first predicted branch network, a second predicted branch network, and a third predicted branch network;
the first prediction branch network is used to perform shallow-feature prediction analysis on the feature image;
the second prediction branch network is used to perform deep-feature prediction analysis on the feature image;
the third prediction branch network is used to fuse the shallow-feature and deep-feature prediction analyses of the feature image;
step 41: calculating the offset of the predicted center point based on the prediction branch network preset in the CenterNet detection network model;
step 42: sequentially transmitting the feature images to the prediction branch network preset in the CenterNet detection network model, and generating a prediction heat map based on the predicted length and width, the predicted size, and the predicted center-point offset.
7. The method of claim 5, wherein the step 31 further comprises:
step 310: clustering the ResNet50 feature images based on the SOM network clustering algorithm preset in the ResNet50 network structure mechanism, and determining a clustering feature map;
step 311: inputting the clustering feature map to the SOM network input layer preset in the convolution network layer of the preset ResNet50 network structure mechanism, and generating corresponding input nodes; wherein,
the SOM network input layer corresponds to the high-dimensional feature input vector of the clustering feature map;
step 312: based on a preset step length, retrieving the output layer channel in the SOM network output layer that is closest to the input node, updating the historical winning output layer channel with it, and determining the winning output layer channel;
step 313: updating the weights of the output layer channels in the neighborhood within a preset range of the winning output layer channel, and collecting the updated output nodes on the winning output layer; wherein,
the output nodes preserve the topological features of the input vectors, and the input nodes are connected to the output nodes through weight vectors;
step 314: and decompressing and reading the output node to determine a first characteristic image.
8. The method of claim 7, wherein the step 312 comprises:
step 3120: initializing all output layer units, performing weight assignment on each input node in the initialized output layer units, and determining an assignment node;
step 3121: randomly selecting an input vector of the input second feature map, and setting an input channel of a feature space based on the input vector;
x = \{ x_i : i = 1, \ldots, D \}
where x is the set of input patterns, x_i denotes the i-th input channel, and D denotes the total number of input channels;
step 3122: acquiring connection weight between an input channel and a preset neuron, and judging the squared Euclidean distance between the input channel and the connection weight of each neuron;
d_j(x) = \sum_{i=1}^{D} (x_i - w_{ji})^2
where d_j(x) denotes the squared Euclidean distance, w_{ji} denotes the connection weight between input channel x_i and neuron j, j = 1, 2, \ldots, N, and N denotes the total number of neurons;
step 3123: screening a weight vector closest to an input channel based on a preset step length and the squared Euclidean distance, searching corresponding neurons, and determining a winning neuron;
step 3124: acquiring competition information of a winning neuron, mapping the competition information to a corresponding input channel, updating the weight of a historical winning output layer channel, and determining a corresponding discrete output channel; wherein
The weight of the historical winning output layer channel is updated as \Delta w_{ji} = \beta(t) \cdot T_{j,I(x)} \cdot (x_i - w_{ji})
where I(x) denotes the lookup index of the input channel for the winning neuron, T_{j,I(x)} denotes the topological neighborhood of neuron j and the lookup index of the corresponding input channel, \Delta w_{ji} denotes the weight update of the historical winning output layer channel, t denotes the contraction time of the topological neighborhood, and \beta(t) denotes the weight adjustment domain related to the contraction time;
step 3125: fitting the discrete output channels to generate a winning space;
step 3126: adjusting the regional weights of the neighborhood within a preset range of the winning space, bringing the regional weights and the input channel weights within a preset threshold of each other, and determining the winning output layer channel.
9. The method of claim 8, wherein the neuron maps the topological neighborhood in the preset self-organizing feature map (SOM) network, the topological neighborhood being as follows:
T_{j,I(x)} = \exp\!\left( -\frac{S_{j,I(x)}^2}{2\sigma(t)^2} \right)
where j = 1, 2, \ldots, N, N denotes the total number of neurons, I(x) denotes the lookup index of the input channel for the winning neuron, S_{j,I(x)} denotes the lateral distance between neuron j and the lookup index of the corresponding input channel, and T_{j,I(x)} denotes the topological neighborhood of neuron j and the lookup index of the corresponding input channel;
the topological neighborhood satisfies the decay time dependent function:
\sigma(t) = \sigma_0 \exp\!\left( -\frac{t}{\tau_0} \right)
where t denotes the contraction time of the topological neighborhood, \sigma(t) denotes the decaying time-dependent function of the contraction time t, \tau_0 denotes the decay range of the topological neighborhood, and \sigma_0 denotes the degree of decay of the topological neighborhood.
10. The method of claim 1, wherein the transmitting the prediction heatmap to a pre-defined big data processing center for model training to identify a vendor type comprises:
performing model training on the prediction heat map based on the preset big data processing center, performing feature extraction on the vendor data and the vendor-free data respectively, and determining first-class features and second-class features;
respectively transmitting the extracted first class features and the extracted second class features to a preset SOM (self-organizing map) network for cluster analysis and recognition, and determining a cluster feature map;
searching the clustering feature map in a preset feature library, performing similarity calculation between the clustering feature map and the feature maps pre-stored in the feature library, and determining the similarity;
sorting the similarity, determining the similarity in a preset sorting range, searching a corresponding feature map, and identifying the feature category of the feature map;
based on the feature class, an identification type of a vendor is determined.
CN202210041074.4A 2022-01-14 2022-01-14 Video data-based unattended agent vendor analysis method Pending CN114419503A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210041074.4A CN114419503A (en) 2022-01-14 2022-01-14 Video data-based unattended agent vendor analysis method

Publications (1)

Publication Number Publication Date
CN114419503A true CN114419503A (en) 2022-04-29

Family

ID=81272765

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210041074.4A Pending CN114419503A (en) 2022-01-14 2022-01-14 Video data-based unattended agent vendor analysis method

Country Status (1)

Country Link
CN (1) CN114419503A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination