CN112306985A

CN112306985A - A digital retinal multimodal feature joint precise retrieval method

Info

Publication number: CN112306985A
Application number: CN201910701647.XA
Authority: CN
Inventors: 杨长水; 齐峰; 魏勇刚; 贾惠柱
Original assignee: Beijing Boya Huishi Intelligent Technology Research Institute Co ltd
Current assignee: Beijing Boya Huishi Intelligent Technology Research Institute Co ltd
Priority date: 2019-07-31
Filing date: 2019-07-31
Publication date: 2021-02-02

Abstract

A digital retina multi-modal feature joint accurate retrieval method, comprising the following steps. Step 1: build a big data platform network comprising a production cluster and a test cluster, using at least two core switches. For the production cluster, the 10-Gigabit optical interface of a service access switch and the gigabit interface of an out-of-band management access switch are connected in pairs below the aggregation switch, where the 10-Gigabit link carries service data and the gigabit link carries out-of-band management data; for the test cluster, the gigabit interface of a single service access switch is connected below the aggregation switch; the service network uses network cards of 10GE or higher. Step 2: after the cluster and each component have been built, first format the system storage space where the HDFS file system of the entire cluster is located, and then start the cluster. The invention is based on the Hadoop cloud computing framework, which has been proven in extensive practice, and therefore has good stability; the method is simple and easy to implement.

Description

Digital retina multi-modal feature combined accurate retrieval method

Technical Field

The invention relates to the fields of security monitoring and artificial intelligence, and more particularly to a digital retina multi-modal feature joint accurate retrieval method and to a method for accurately retrieving monitored vehicles or persons based on it.

Background

The video surveillance industry is developing rapidly toward networked, high-definition, intelligent and diversified systems. With the deepening application of artificial intelligence, cloud computing, big data and unmanned aerial vehicle technology, diversified intelligent video analysis has become the most distinctive feature of the new generation of video surveillance systems. The intelligent video surveillance platform (the platform for short) is built on the digital retina end-to-end system developed by the National Engineering Laboratory for Digital Video Coding and Decoding at Peking University. It supports condensation and transcoding of surveillance video, image analysis, feature retrieval, application display and other functions. It integrates frontier technologies such as visual content analysis, visual feature retrieval, big data analysis, cloud storage and deep learning, and develops technologies including parallel multi-channel video condensation and transcoding, visual feature extraction for people and vehicles, fast retrieval over massive visual big data, dual-stream remote communication, software-defined camera networks and real-time service application middleware. The platform handles both online cameras and offline video files, is suited to efficient storage, fast retrieval and intelligent application of city-scale surveillance video, and can provide users with a complete solution for the intelligent application of large-scale surveillance video. It can be widely applied to intelligent processing of the public security surveillance video from checkpoints and micro-checkpoints accessed by public security organs, and provides an effective technical means for comprehensive city management and for case investigation by public security organs.

Processing the massive video surveillance data of modern society poses a great challenge to the technical architecture of traditional systems built on relational databases (Oracle, MySQL, SQL Server), which cannot meet user requirements even at the data storage level. Moreover, because intelligent analysis techniques for massive video are lacking, the utilization of this information is extremely low. To make full use of the information and safeguard public security, vehicle identification technology has been applied to intelligent video analysis so that the identity of a suspect vehicle can be confirmed quickly. However, when faced with massive vehicle image data, the search speed of vehicle identification falls far short of the requirements of security departments, and a fast search and comparison method for massive vehicle images is urgently needed.

Disclosure of Invention

The invention aims to provide a method for establishing an efficient index table of vehicle multi-modal feature vector data, which ensures the real-time performance and reliability of the spatial index of a vehicle multi-modal feature search engine.

A digital retina multi-modal feature combined accurate retrieval method comprises the following steps:

Step 1, building a big data platform network comprising a production cluster and a test cluster, and adopting at least two core switches;

for the production cluster, the 10-Gigabit optical interface of a service access switch and the gigabit interface of an out-of-band management access switch are connected in pairs below the aggregation switch, wherein the 10-Gigabit link carries service data and the gigabit link carries out-of-band management data; for the test cluster, the gigabit interface of a single service access switch is connected below the aggregation switch; the service network adopts network cards of 10GE or higher;

Step 2, after the cluster and each component are built, first formatting the system storage space where the HDFS file system of the whole cluster is located, and then starting the cluster;

Step 3, after Hadoop is started, starting the ZooKeeper component, then the HBase component, and then the Flink component;

Step 4, collecting data: acquiring the vehicle body region in a target vehicle image, and then extracting feature information of the target vehicle image from the vehicle body region, wherein the feature information comprises the feature points of the target vehicle image and the scale, main direction and relative position of each feature point; then, according to the feature information of the target vehicle image, querying the feature points of the sample images stored in the feature database, together with their scale, main direction and relative position, and determining the images similar to the target vehicle image;

Step 5, cleaning data;

Step 6, distributed cloud storage of data: unstructured data are stored in HDFS and structured data are stored in HBase;

Step 7, preparing an input picture: a search picture can be selected in the following ways:

(1) importing the target vehicle image from a local file;

(2) picture containing the target → target vehicle image capture;

(3) local video: video stream → picture containing the target → target vehicle image capture;

(4) camera real-time video stream: video stream → picture containing the target → target vehicle image capture;

(5) feature library query import: semantic search → target vehicle image list → multi-angle vehicle images of the selected target;

Step 8, multi-modal feature joint accurate retrieval:

the method first extracts image features in the traditional way to perform a "coarse retrieval" on the image database, and then performs a "fine retrieval" based on an improved V-I deep network model; the original problem of "finding a needle in the ocean" among massive images is thereby reduced to a tractable problem of "finding a needle on a desktop", and the efficiency of image investigation is improved;

Step 9, screening search results: pre-training models for filtering vehicles, including a vehicle type model, a color model, a sub-brand model, a license plate information model and a feature region classification model; the similarity calculation between vehicle images is mainly used to score the images according to the similarity of their attributes or features, and to judge the similarity of the overall image content from the scores;

Step 10, placing the target vehicle under deployment and control:

if a returned picture is not the required vehicle picture, performing a second, third or further retrieval based on semantics or on a retrieved picture, until the picture required for deployment and control is found;

Step 11, raising an alarm for the target vehicle;

Step 12, replaying the target vehicle track.

The invention has the following beneficial effects: building a massive vehicle recognition search engine does not require expensive high-performance workstations; the method is implemented on the Hadoop cloud computing framework, which has been proven in extensive practice, so it has good stability and is simple and easy to implement. The invention also provides a method for building an efficient index table of vehicle feature vector data, which ensures the real-time performance and reliability of the spatial index of the vehicle image recognition search engine.

Drawings

FIG. 1 is a diagram of a hardware framework to which the present invention relates;

FIG. 2 is a flow chart of the present invention;

FIG. 3 is a diagram of a quasi-search concept of the present invention;

FIG. 4 is a comparison of prior art and inventive vehicle search;

FIG. 5 is a vehicle trajectory playback diagram of the present invention;

FIG. 6 is a vehicle position playback diagram of the present invention.

Detailed Description

Building on the prior art, a digital retina multi-modal feature joint accurate retrieval method is provided. The system uses three core technologies, video structuring, vehicle identification and big data processing, to extract features from and attach labels to the "vehicle" elements of operational interest in urban video. It combines the resulting mass of structured data with a big data processing system to give public-service departments such as public security a comprehensive solution that serves practical operations and social management.

In order to achieve the purpose, the invention provides a digital retina multi-modal feature joint accurate retrieval method.

The method is organized in six layers.

First layer, the basic resource layer: pictures for vehicle identification are acquired through integration with monitoring platforms, checkpoint systems and the like, and pictures are supported as the source to be analyzed.

Second layer, the algorithm analysis layer: features of targets in video or checkpoint pictures are extracted by means of deep vehicle recognition technology. Extractable basic vehicle features include license plate, color, model, brand and so on.

Third layer, the big data processing layer: the big data platform includes two core components, HDFS and HBase storage, which can store the structured and unstructured data produced by recognition and can provide distributed computing resources.

Fourth layer, the cloud computing layer: Flink performs real-time computation and supports retrieval, operation and association of the data in storage.

Fifth layer, the intelligent application layer: based on the mass of recognized vehicle passage data, a series of intelligent applications is built according to user requirements, including retrieval, data mining and big data statistics applications.

Sixth layer, the working portal layer: search results are refined iteratively through two search modes, semantic search for people and vehicles and search by image, and deployment and control operations are carried out on the search results.

The method can build a massive vehicle search engine from a group of inexpensive commodity servers and is implemented on the Hadoop, HBase and Flink cloud computing frameworks, which have been proven in extensive practice, so it offers good stability and reliability together with fast retrieval.

The technical scheme of the invention is as follows:

The method is based on a cloud computing framework. The storage layer consists of a distributed person or vehicle identity information data table and stores massive person or vehicle images, the corresponding feature vectors and the corresponding person or vehicle information. The cloud computing layer consists of a person or vehicle feature vector cluster index table and cluster list tables, and is used to establish and maintain the information index table. The outer layer receives tasks, computes vehicle feature vectors and distributes tasks. The system stores the feature vectors of a large number of person or vehicle images, obtained with a person or vehicle feature extraction method, in an unstructured HBase database to obtain the person or vehicle identity information data table. It then applies the K-means clustering algorithm to each dimension of the feature vectors in that table separately and establishes the information index table, which comprises a feature vector cluster index table and a number of cluster list tables.

1. Based on the method, the invention further provides a design method for a person or vehicle multi-modal feature joint accurate retrieval engine based on cloud computing technology, characterized in that the massive vehicle identification process is divided into two stages: massive data organization, and vehicle feature search and comparison. The massive data organization stage establishes an efficient vehicle feature vector data index table. In this stage, the feature vectors of massive vehicle images, computed with a feature extraction method, are stored in an unstructured HBase database to obtain a vehicle identity information data table, and the K-means clustering algorithm is applied to each dimension of the vehicle feature vectors in the table separately to establish the information index table (comprising a vehicle feature vector cluster index table and a number of cluster list tables). In the vehicle feature search and comparison stage, each dimension of the feature vector of the vehicle image to be compared is looked up in the information index table and the lookup results are combined, which greatly reduces the range of vehicle data that has to be compared. The vehicle feature vector comparison is then carried out with parallel computation in the Flink framework, which improves computational efficiency and load balance.
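The two stages described above can be illustrated with a minimal, in-memory sketch. It is not the patented implementation: the plain 1-D K-means routine, the function names (`kmeans_1d`, `build_index`, `candidates`), the choice of two clusters per dimension, and the use of set intersection to "combine" the per-dimension lookups are all assumptions made for illustration, and no HBase or Flink calls are involved.

```python
import random
from collections import defaultdict

def nearest(centroids, value):
    """Index of the centroid closest to value."""
    return min(range(len(centroids)), key=lambda i: abs(centroids[i] - value))

def kmeans_1d(values, k, iters=20, seed=0):
    """Plain 1-D K-means over one feature dimension (illustrative only)."""
    rng = random.Random(seed)
    centroids = sorted(rng.sample(sorted(set(values)), k))
    for _ in range(iters):
        buckets = [[] for _ in centroids]
        for v in values:
            buckets[nearest(centroids, v)].append(v)
        # Keep the old centroid when a bucket is empty.
        centroids = [sum(b) / len(b) if b else c for b, c in zip(buckets, centroids)]
    return centroids

def build_index(vectors, k=2):
    """Stage 1: cluster every dimension separately.

    Returns (centroid_table, cluster_lists):
      centroid_table[d]   -> centroids of dimension d (the cluster index table)
      cluster_lists[d][c] -> ids whose dimension d falls in cluster c (cluster list tables)
    """
    dims = len(next(iter(vectors.values())))
    centroid_table, cluster_lists = [], []
    for d in range(dims):
        centroids = kmeans_1d([vec[d] for vec in vectors.values()], k)
        lists = defaultdict(set)
        for vehicle_id, vec in vectors.items():
            lists[nearest(centroids, vec[d])].add(vehicle_id)
        centroid_table.append(centroids)
        cluster_lists.append(lists)
    return centroid_table, cluster_lists

def candidates(query, centroid_table, cluster_lists):
    """Stage 2: look up each query dimension and intersect the id lists (assumed combination rule)."""
    result = None
    for d, q in enumerate(query):
        ids = cluster_lists[d][nearest(centroid_table[d], q)]
        result = set(ids) if result is None else result & ids
    return result

# Hypothetical 2-D feature vectors for four vehicles.
vectors = {
    "car_a": [0.10, 0.90],
    "car_b": [0.15, 0.85],
    "car_c": [0.90, 0.10],
    "car_d": [0.95, 0.20],
}
centroid_table, cluster_lists = build_index(vectors, k=2)
shortlist = candidates([0.12, 0.88], centroid_table, cluster_lists)
print(sorted(shortlist))
```

Only the vehicles in `shortlist` would then go through the full feature vector comparison, which is the step the text assigns to parallel computation in Flink.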

2. Based on the above method, the invention further provides a method for inputting various kinds of images, which supplies the condition input for joint accurate retrieval in the following five ways:

(1) importing the target vehicle image from a local file;

(2) picture containing the target → target vehicle image capture;

(3) local video: video stream → picture containing the target → target vehicle image capture;

(4) camera real-time video stream: video stream → picture containing the target → target vehicle image capture;

(5) feature library query import: semantic search → target vehicle image list → multi-angle vehicle images of the selected target.

3. Based on the method, the invention further provides a target library method: after each vehicle or person retrieval, the result pictures with high confidence are stored in a target library and used as the condition input for secondary retrieval, which improves the accuracy of the retrieval results.

4. Based on the method, the invention further provides a mass data storage method, namely distributed cloud storage of data: unstructured data are stored in HDFS and structured data are stored in HBase.
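The storage split in item 4 can be pictured as a small routing rule. The sketch below only decides where a record would go; it does not talk to HDFS or HBase. The path layout `/vehicles/raw/...`, the table name `vehicle_features` and the type-based rule (raw bytes are unstructured, dictionaries are structured) are hypothetical choices for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class StoragePlan:
    backend: str   # "hdfs" or "hbase"
    location: str  # HDFS path or "table:row_key" (both hypothetical layouts)

def plan_storage(record_id, payload):
    """Route unstructured payloads to HDFS and structured ones to HBase."""
    if isinstance(payload, (bytes, bytearray)):
        # Raw pictures or video segments: unstructured data.
        return StoragePlan("hdfs", f"/vehicles/raw/{record_id}.bin")
    if isinstance(payload, dict):
        # Recognised attributes and feature vectors: structured data.
        return StoragePlan("hbase", f"vehicle_features:{record_id}")
    raise TypeError(f"unsupported payload type: {type(payload).__name__}")

print(plan_storage("v001", b"\xff\xd8\xff"))
print(plan_storage("v001", {"plate": "A12345", "color": "white"}))
```

Keeping the same `record_id` on both sides lets a structured row in HBase point back to the raw picture stored in HDFS.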

In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described below with reference to the accompanying drawings by referring to specific examples.

The hardware environment for implementation is as follows: the system server side runs in a Hadoop cluster comprising four servers. The cluster adopts a Master-Slave architecture, with one server acting as the Master node and the other three as Slave nodes, and all servers operate in the same local area network.

TABLE 1

The invention uses a FastDFS cluster file system to store picture and video information, an HBase cluster to store structured feature information, a Flink cluster to search for people and vehicles, and ZooKeeper to provide service reliability. The roles are distributed as follows:

TABLE 2

The invention is implemented as follows:

Step 1, networking: the project's big data platform comprises a production cluster and a test cluster, and two core (aggregation) switches are adopted.

For the production cluster, a service access switch (10-Gigabit optical interface) and an out-of-band management access switch (gigabit interface) are connected in pairs below the aggregation switch, where the 10-Gigabit link carries service data and the gigabit link carries out-of-band management data (optional). For the test cluster, a single service access switch (gigabit interface) is connected below the aggregation switch. To prevent the switching bandwidth between nodes from becoming the bottleneck of system performance, the service network adopts 10GE network cards.

Step 2, after the cluster and each component are built, first format the system storage space where the HDFS file system of the whole cluster is located, and then start the cluster.

Step 3, after Hadoop is started, start the ZooKeeper component, then the HBase component, and then the Flink component.
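The ordering in steps 2 and 3 can be written down as a short driver. This is a sketch under assumptions: it presumes the standard launcher scripts shipped with each project (`hdfs namenode -format`, `start-dfs.sh`, `zkServer.sh start`, `start-hbase.sh`, and Flink's `start-cluster.sh`) are on the `PATH`, which depends on the installation, and it defaults to a dry run that only prints the commands. The function and variable names are illustrative.

```python
import subprocess

# Order taken from steps 2 and 3: format HDFS, start Hadoop, then ZooKeeper, HBase, Flink.
STARTUP_STEPS = [
    ("format HDFS", ["hdfs", "namenode", "-format"]),  # first-time setup only: erases HDFS metadata
    ("start Hadoop HDFS", ["start-dfs.sh"]),
    ("start ZooKeeper", ["zkServer.sh", "start"]),
    ("start HBase", ["start-hbase.sh"]),
    ("start Flink", ["start-cluster.sh"]),
]

def dry_run(cmd):
    """Print the command instead of executing it."""
    print("would run:", " ".join(cmd))
    return True

def real_run(cmd):
    """Execute the command; True when it exits with status 0."""
    return subprocess.run(cmd, check=False).returncode == 0

def start_cluster(steps=STARTUP_STEPS, runner=dry_run):
    """Run the steps in order and stop at the first failure."""
    started = []
    for name, cmd in steps:
        if not runner(cmd):
            raise RuntimeError(f"step failed: {name}")
        started.append(name)
    return started

start_cluster()  # dry run; pass runner=real_run on a configured cluster
```

Stopping at the first failure matters here because later components depend on earlier ones: HBase, for example, needs both HDFS and ZooKeeper to be available.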

Step 4, collecting data: the vehicle body region in a target vehicle image is obtained, and feature information of the target vehicle image is then extracted from the vehicle body region. The feature information comprises the feature points of the target vehicle image and the scale, main direction and relative position of each feature point. The feature points of the sample images stored in the feature database, together with their scale, main direction and relative position, are then queried according to the feature information of the target vehicle image, and the images similar to the target vehicle image are determined.
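One way to picture the feature information in step 4 is as a record per feature point plus a matching rule. The sketch below is an assumption-laden illustration, not the method of the invention: the `FeaturePoint` fields, the tolerance values, and the "count matching points" similarity are all hypothetical, and the descriptor is a toy two-element vector.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class FeaturePoint:
    descriptor: tuple   # local appearance descriptor (toy length here)
    scale: float        # scale at which the point was detected, > 0
    orientation: float  # main direction in degrees
    rel_pos: tuple      # (x, y) relative to the vehicle body region, in [0, 1]

def points_match(p, q, desc_tol=0.2, scale_ratio=1.5, angle_tol=20.0, pos_tol=0.1):
    """Two points match when descriptor, scale, direction and position all agree."""
    desc_dist = math.dist(p.descriptor, q.descriptor)
    ratio = max(p.scale, q.scale) / min(p.scale, q.scale)
    angle = abs(p.orientation - q.orientation) % 360.0
    angle = min(angle, 360.0 - angle)  # wrap-around, e.g. 355 vs 10 degrees
    pos_dist = math.dist(p.rel_pos, q.rel_pos)
    return (desc_dist <= desc_tol and ratio <= scale_ratio
            and angle <= angle_tol and pos_dist <= pos_tol)

def match_count(target_points, sample_points):
    """Number of target points that have at least one matching sample point."""
    return sum(any(points_match(p, q) for q in sample_points) for p in target_points)

def most_similar(target_points, database):
    """Name of the sample image with the most matching feature points."""
    return max(database, key=lambda name: match_count(target_points, database[name]))

target = [FeaturePoint((0.10, 0.20), 2.0, 10.0, (0.30, 0.40))]
database = {
    "sample_same": [FeaturePoint((0.10, 0.25), 2.2, 355.0, (0.32, 0.41))],
    "sample_other": [FeaturePoint((0.90, 0.90), 2.0, 10.0, (0.30, 0.40))],
}
print(most_similar(target, database))
```

Using the position relative to the body region, rather than absolute pixel coordinates, keeps the comparison meaningful when the same vehicle appears at different places in the frame.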

Step 5, cleaning data:

(1) importing the target vehicle image from a local file;

(2) picture containing the target → target vehicle image capture

Step 8, multi-modal feature joint accurate retrieval:

the method first extracts image features in the traditional way to perform a "coarse retrieval" on the image database, and then performs a "fine retrieval" based on the improved V-I deep network model. The original problem of "finding a needle in the ocean" among massive images is thereby reduced to a tractable problem of "finding a needle on a desktop", and the efficiency of image investigation is improved.
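The coarse-then-fine idea of step 8 can be sketched as a two-stage ranking. The text does not specify the traditional features or the internals of the V-I deep network model, so both are stand-ins here: the `coarse` vectors play the role of cheap hand-crafted features compared by L1 distance, and the `deep` vectors play the role of embeddings from some deep model compared by cosine similarity. All names and parameters are illustrative.

```python
import math

def l1_distance(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def cosine_similarity(a, b):
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return sum(x * y for x, y in zip(a, b)) / norm if norm else 0.0

def coarse_to_fine(query, database, coarse_k=100, top_n=10):
    """Stage 1 keeps the coarse_k nearest items by cheap features;
    stage 2 re-ranks only that shortlist with the more expensive comparison."""
    shortlist = sorted(
        database, key=lambda item: l1_distance(query["coarse"], item["coarse"])
    )[:coarse_k]
    ranked = sorted(
        shortlist,
        key=lambda item: cosine_similarity(query["deep"], item["deep"]),
        reverse=True,
    )
    return [item["id"] for item in ranked[:top_n]]

# Hypothetical toy database: three images with stand-in feature vectors.
database = [
    {"id": "a", "coarse": [1.0, 0.0, 0.0], "deep": [1.0, 0.0]},
    {"id": "b", "coarse": [1.0, 0.0, 0.1], "deep": [0.0, 1.0]},
    {"id": "c", "coarse": [0.0, 1.0, 0.0], "deep": [1.0, 0.0]},
]
query = {"coarse": [1.0, 0.0, 0.0], "deep": [0.1, 1.0]}
print(coarse_to_fine(query, database, coarse_k=2, top_n=2))
```

The expensive comparison runs on `coarse_k` items instead of the whole database, which is the sense in which the search space shrinks from the ocean to a desktop.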

Step 9, screening search results: models for filtering vehicles are pre-trained, including a vehicle type model, a color model, a sub-brand model, a license plate information model and a feature region classification model; the similarity calculation between vehicle images is mainly used to score the images according to the similarity of their attributes or features, and to judge the similarity of the overall image content from the scores.
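The scoring in step 9 can be illustrated with a weighted attribute comparison. The sketch assumes the five pre-trained models have already produced one label per attribute for every image; the weights, the threshold and the exact-match rule are hypothetical and would need tuning in practice.

```python
# Hypothetical integer weights per attribute model (they sum to 10).
WEIGHTS = {"vehicle_type": 2, "color": 2, "sub_brand": 2, "plate": 3, "region": 1}

def similarity_score(query_attrs, candidate_attrs, weights=WEIGHTS):
    """Share of the total weight carried by attributes on which both images agree."""
    total = sum(weights.values())
    matched = sum(
        weight
        for name, weight in weights.items()
        if name in query_attrs and query_attrs[name] == candidate_attrs.get(name)
    )
    return matched / total

def screen(query_attrs, results, threshold=0.5):
    """Keep results scoring at or above the threshold, best first."""
    scored = [(similarity_score(query_attrs, r["attrs"]), r["id"]) for r in results]
    return [rid for score, rid in sorted(scored, reverse=True) if score >= threshold]

query_attrs = {"vehicle_type": "sedan", "color": "white", "sub_brand": "X1", "plate": "A12345"}
results = [
    {"id": "img_1", "attrs": {"vehicle_type": "sedan", "color": "white",
                              "sub_brand": "X1", "plate": "A12345"}},
    {"id": "img_2", "attrs": {"vehicle_type": "suv", "color": "white",
                              "sub_brand": "Z9", "plate": "B99999"}},
]
print(screen(query_attrs, results))
```

Attributes missing from the query (here `region`) simply contribute nothing, so a partial description still produces a usable score.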

Step 10, placing the target vehicle under deployment and control:

if a returned picture is not the required vehicle picture, a second, third or further retrieval can be carried out based on semantics or on a retrieved picture, until the picture required for deployment and control is found.
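The repeated retrieval of step 10, together with the target library that feeds high-confidence results back as the next query, can be sketched as a loop. The `search` and `is_target` callables are placeholders: in a real system the first would call the retrieval engine and the second would be the operator's confirmation. Taking the top result of each round as the next query is an assumption.

```python
def iterative_retrieval(initial_query, search, is_target, max_rounds=3):
    """Repeat retrieval, feeding the best result of each round back in as the next query.

    Returns (found, rounds_used, target_library); found is None when no round succeeds.
    """
    target_library = []  # high-confidence results kept as inputs for later rounds
    query = initial_query
    rounds_used = 0
    for _ in range(max_rounds):
        rounds_used += 1
        results = search(query)
        for result in results:
            if is_target(result):
                return result, rounds_used, target_library
        if not results:
            break
        best = results[0]  # assumed to be sorted best-first
        target_library.append(best)
        query = best
    return None, rounds_used, target_library

# Toy stand-ins: each query leads to one new picture until the wanted one appears.
chain = {"query_0": ["pic_1"], "pic_1": ["pic_2"], "pic_2": ["wanted"]}
found, rounds, library = iterative_retrieval(
    "query_0",
    search=lambda q: chain.get(q, []),
    is_target=lambda r: r == "wanted",
)
print(found, rounds, library)
```

The `max_rounds` cap keeps the loop from running indefinitely when no round ever returns the required picture.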

Step 11, alarming the target vehicle

Step 12, replaying the target vehicle track

It is noted that the disclosed embodiments are intended to aid in further understanding of the invention, but those skilled in the art will appreciate that: various substitutions and modifications are possible without departing from the spirit and scope of the invention and appended claims. Therefore, the invention should not be limited to the embodiments disclosed, but the scope of the invention is defined by the appended claims.

Claims

1. A digital retinal multimodal feature joint precise retrieval method, comprising the following steps:

Step 1. Set up a big data platform network, including a production cluster and a test cluster, using at least two core switches,

For the production cluster, the 10-Gigabit optical interface of the service access switch and the gigabit interface of the out-of-band management access switch are connected in pairs under the aggregation switch, where the 10-Gigabit link carries service data and the gigabit link carries out-of-band management data; for the test cluster, the gigabit interface of a single service access switch is attached to the aggregation switch; the service network uses 10GE or higher network cards.

Step 2. After the cluster and each component are built, first format the system storage space where the HDFS file system in the entire cluster is located, and then start the cluster;

Step 3. After starting Hadoop, start the ZooKeeper component, finally start the HBase component, and start the Flink component;

Step 4. Collect data: obtain the body area in the target vehicle image, and then extract the feature information of the target vehicle image from the body area, where the feature information includes the feature points of the target vehicle image and the scale, main direction and relative position of the feature points; then query the feature points of the sample images stored in the feature database, together with the scale, main direction and relative position of those feature points, according to the feature information of the target vehicle image, and determine the images similar to the target vehicle image;

Step 5. Clean data:

Step 6. Distributed cloud storage of data: unstructured data is stored in HDFS, and structured data is stored in HBase.

Step 7. Prepare the input picture: a search picture can be selected in the following ways:

(1) Import the target vehicle image from the local file;

(2) Pictures containing the target → target vehicle image capture;

(3) Local video: video stream → picture containing target → target vehicle image capture;

(4) Camera real-time video stream: video stream→picture containing target→target vehicle image capture;

(5) Feature library query import: semantic search → target vehicle image list → multi-angle vehicle image selection of the target;

Step 8. Multimodal feature joint accurate retrieval:

The invention firstly adopts the traditional method to extract image features to perform "rough retrieval" on the image database, and then implements "fine retrieval" based on the improved V-I deep network model;

Step 9. Screen search results: pre-train models for vehicle filtering, including a vehicle type model, a color model, a sub-brand model, a license plate information model, and a feature area classification model; the similarity calculation between vehicle images is mainly used to score images according to the similarity of their attributes or features, and to judge the similarity of the overall content of the images according to the score;

Step 10. Control the target vehicle:

If a returned picture is not the required vehicle picture, a second, third or further retrieval can be performed based on semantics or on a retrieved picture, until the required picture is found;

Step 11. Raise an alarm for the target vehicle;

Step 12. Play back the target vehicle trajectory.