CN110569390B - Efficient retrieval method for massive case data based on storage medium distinguishing - Google Patents

Publication number
CN110569390B
Authority
CN
China
Prior art keywords
target
face
relational database
pedestrian
cyclist
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910771016.5A
Other languages
Chinese (zh)
Other versions
CN110569390A (en
Inventor
严国建
陈斌
王彬
曾璐
何海峰
范玲珑
李健
陈秀峰
王思桐
聂瑜智
陈伟
董骏
魏伟
普应奇
梁瑞凡
李志强
乔熙
陈正义
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
WUHAN DAQIAN INFORMATION TECHNOLOGY CO LTD
Original Assignee
WUHAN DAQIAN INFORMATION TECHNOLOGY CO LTD
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by WUHAN DAQIAN INFORMATION TECHNOLOGY CO LTD
Priority to CN201910771016.5A
Publication of CN110569390A
Application granted
Publication of CN110569390B
Legal status: Active (current)
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70 Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/71 Indexing; Data structures therefor; Storage structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70 Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/78 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/783 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G06F16/7837 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using objects detected or recognised in the video content
    • G06F16/784 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using objects detected or recognised in the video content the detected or recognised objects being people
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention relates to a method for efficiently retrieving massive case data by storing different kinds of data on different storage media. The method comprises: storing the structured semantic information of pedestrians or cyclists in relational database tables and storing face feature data in a non-relational database table; performing pedestrian or cyclist target detection, face detection, and face feature extraction on the picture to be retrieved; and selecting the pedestrian or cyclist type and one or more structured semantic fields, querying the relational database tables for the corresponding target unique identifiers, and using those identifiers together with the pedestrian or cyclist target type to retrieve the matching face feature data from the face feature table in the non-relational database. By combining face feature data with structured human-body semantics, the invention retrieves massive case data quickly and accurately.

Description

Efficient retrieval method for massive case data based on storage medium distinguishing
Technical Field
The invention relates to the storage and retrieval of massive data, and in particular to an efficient retrieval method for massive case data based on distinguishing storage media.
Background
Video tracking is an important means of solving cases. During video tracking, a target object must be retrieved from video data.
At present, video data is stored on different storage media according to its data structure type. Face feature vector data (unstructured data), however, is still stored in files or in relational databases, which occupies a large amount of storage space and gives poor performance.
In the traditional "search cases by case" mode, the case library is searched for cases of the same case type by querying the structured semantic fields of the cases. The results returned are not necessarily accurate. When the case library holds a huge number of cases accumulated over a long time, case handlers often spend a great deal of time and effort searching, and even after a second round of attribute screening the relevant case may still not be found. With massive data this consumes enormous effort and may even cause the opportunity to solve the case to be missed, so the approach is far from intelligent.
Disclosure of Invention
The invention aims to overcome the shortcomings of the prior art by providing an efficient retrieval method for massive case data based on distinguishing storage media.
The technical solution adopted to achieve this aim is an efficient retrieval method for massive case data based on distinguishing storage media, comprising the following steps:
S1, storing the structured semantic information of pedestrians or cyclists in relational database tables, storing face feature data in a non-relational database table, and establishing an association between the relational and non-relational database tables using the pedestrian or cyclist target type, the face features, the structured semantic information of the pedestrian or cyclist, and the target unique identifier of the related case;
S2, performing pedestrian or cyclist target detection, face detection, and face feature extraction on the uploaded picture to be searched, using a deep learning algorithm;
S3, selecting the pedestrian or cyclist type and one or more structured semantic fields, querying the relational database table for the corresponding target unique identifiers using the structured semantic information, and, using those identifiers together with the pedestrian or cyclist target type, retrieving the matching face feature data from the face feature table in the non-relational database;
S4, comparing the face feature data extracted in S2 with the face feature data retrieved in S3 to obtain similarity scores;
S5, looking up the corresponding case unique identifiers in the face case association table of the relational database by target unique identifier, and combining them with the similarity scores obtained to produce a similarity score for each case;
S6, sorting the cases by similarity score from high to low and returning the final result.
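Steps S4 to S6 can be illustrated with a minimal Python sketch. The method does not specify the similarity measure or how target-level scores are combined into a case score, so cosine similarity and taking the highest target score per case are assumptions made here purely for illustration. The names cosine_similarity, rank_cases, and target_to_cases are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_cases(query_feature, candidate_features, target_to_cases):
    """Score and rank cases for one query face.

    query_feature      -- face feature vector extracted from the query picture (S2)
    candidate_features -- {target_id: feature vector} retrieved in S3
    target_to_cases    -- {target_id: [case_id, ...]}, standing in for the
                          face case association table
    """
    case_scores = {}
    for target_id, feature in candidate_features.items():
        score = cosine_similarity(query_feature, feature)    # S4
        for case_id in target_to_cases.get(target_id, []):   # S5
            # Assumption: a case's score is the best score among its targets.
            if case_id not in case_scores or score > case_scores[case_id]:
                case_scores[case_id] = score
    # S6: highest similarity first.
    return sorted(case_scores.items(), key=lambda item: item[1], reverse=True)
```

Other aggregation rules (for example, an average over a case's targets) would fit the same structure; only the update inside the inner loop changes.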
In the above technical solution, step S1 comprises: box-selecting a pedestrian or cyclist in a video snapshot; first performing pedestrian or cyclist target detection; then performing face detection and face feature extraction using a deep learning algorithm; and finally extracting the structured semantic information of the pedestrian or cyclist. After both kinds of data have been extracted, a back-end program stores the structured semantic information in a relational database table and the face feature data in a non-relational database table.
Further, three data tables are established in the relational database: a pedestrian target information table, a cyclist target information table, and a face case association table; and a face feature table is established in the non-relational database, wherein:
the fields of the pedestrian target information table comprise a target unique identifier, a has-face flag, face coordinates, a deletion flag, and the pedestrian structured semantic fields that make up the target pedestrian's structured information;
the fields of the cyclist target information table comprise a target unique identifier, a has-face flag, face coordinates, a deletion flag, and the multiple fields of the target cyclist's structured information;
the fields of the face case association table comprise the following structured information: a primary key unique identifier, a case unique identifier, a target type, and a target unique identifier;
the fields of the face feature table comprise a target unique identifier, a target type, and face feature data, stored as key-value (K-V) pairs;
the video snapshot is manually annotated, and the identified pedestrian or cyclist target type, face features, structured semantic information of the pedestrian or cyclist, and case unique identifier are stored in the four tables in turn, thereby establishing the association between the relational and non-relational database tables.
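The four-table layout can be sketched as follows. This is only an illustration under stated assumptions: SQLite stands in for the relational database, a plain Python dict stands in for the non-relational key-value store, and all table and column names are hypothetical. The structured semantic columns (coat_color, vehicle_type) are placeholders, since the actual semantic fields are not listed in this text. The flag and type codes follow the detailed description (deletion flag 0 = deleted, 1 = not deleted; target type 1 = pedestrian, 4 = cyclist).

```python
import sqlite3

SCHEMA = """
CREATE TABLE pedestrian_target (
    target_id   TEXT PRIMARY KEY,   -- target unique identifier
    has_face    INTEGER NOT NULL,   -- 1 if a face was detected
    face_coords TEXT,               -- face bounding box, e.g. "x,y,w,h"
    delete_flag INTEGER NOT NULL,   -- 0 = deleted, 1 = not deleted
    coat_color  TEXT                -- placeholder structured semantic field
);
CREATE TABLE cyclist_target (
    target_id    TEXT PRIMARY KEY,
    has_face     INTEGER NOT NULL,
    face_coords  TEXT,
    delete_flag  INTEGER NOT NULL,
    vehicle_type TEXT               -- placeholder structured semantic field
);
CREATE TABLE face_case_association (
    id          INTEGER PRIMARY KEY, -- primary key unique identifier
    case_id     TEXT NOT NULL,       -- case unique identifier
    target_type INTEGER NOT NULL,    -- 1 = pedestrian, 4 = cyclist
    target_id   TEXT NOT NULL        -- refers to one of the two target tables
);
"""


def create_relational_schema():
    """Create the three relational tables in an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


# Stand-in for the face feature table in the non-relational store:
# key = (target id, target type) combined, value = face feature data.
face_feature_store = {}
```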
Compared with the prior art, the invention has the following advantages:
(1) In addition to pedestrian target detection, the method supports target detection, face feature detection, and structured semantic extraction for cyclists.
(2) Massive storage and high-performance reads: structured and unstructured data are stored on different storage media according to their data structure, namely a relational database and a non-relational database (NoSQL, NewSQL), so that the strengths of each in storage space, high-performance reading and writing (especially of massive data), and scalability are put to effective use.
(3) Fast face recognition: manual annotation is used instead of a fully automatic mode. The pedestrian, cyclist, or face of interest is box-selected directly in the video snapshot, which skips automatic detection and recognition of every target (pedestrians, cyclists, and vehicles) in the snapshot and saves detection and recognition time.
(4) Fast and accurate case identification: for different business scenarios, massive case data is retrieved quickly and accurately by combining face feature data with structured human-body semantics.
Drawings
Fig. 1 is a flow chart of data storage in the efficient retrieval method of massive case data based on distinguishing storage media of the invention.
Fig. 2 is a case retrieval flow chart in the efficient retrieval method of massive case data based on distinguishing storage media of the present invention.
Detailed Description
The invention will now be described in further detail with reference to the drawings and to specific examples.
The invention discloses an efficient retrieval method for massive case data based on distinguishing storage media. It comprises two parts, data storage and case retrieval. The data storage flow is shown in Fig. 1 and comprises the following step:
S1, through manual annotation, a pedestrian or cyclist is box-selected in a video snapshot. Pedestrian or cyclist target detection is performed first; face detection and face feature extraction are then performed using a deep learning algorithm; finally, the structured semantic information of the pedestrian or cyclist is extracted. After both kinds of data have been extracted, a back-end program stores the structured semantic information (structured data) in a relational database table and the face feature data (unstructured data) in a non-relational database (NoSQL, NewSQL) table. Data storage is described in detail as follows:
(1) Three data tables are established in the relational database: a pedestrian target information table, a cyclist target information table, and a face case association table, as follows:
First, the pedestrian target information table. Its fields comprise structured information such as the target unique identifier, a has-face flag, the face coordinates, a deletion flag (0: deleted; 1: not deleted), and the fields of the target pedestrian's structured information (for example, the pedestrian structured semantic fields disclosed in the panoramic description method and system for pedestrian targets in monitoring scenes of the Chinese patent with publication number 107341445A; see Table 1 below; face feature data is not included). This information is stored in a two-dimensional table of the relational database.
TABLE 1
Second, the cyclist target information table. Its fields comprise the target unique identifier, a has-face flag, the face coordinates, a deletion flag (0: deleted; 1: not deleted), and the multiple fields of the target cyclist's structured information (see Table 2; face feature data is not included). This information is stored in a two-dimensional table of the relational database.
TABLE 2
Third, the face case association table. Its fields comprise structured information such as a primary key unique identifier, the case unique identifier, the target type (1: pedestrian; 4: cyclist), and the target unique identifier (corresponding to the unique identifier in the pedestrian target information table or the cyclist target information table, respectively). This information is stored in a two-dimensional table of the relational database.
(2) A face feature table is established in the non-relational database (NoSQL, NewSQL). Its fields comprise the target unique identifier, the target type, and the face feature data, stored as key-value (K-V) pairs: the target unique identifier and the target type are combined to form the key, and the face feature data is stored as the value.
(3) The video snapshot is manually annotated, and the data identified by the algorithm, such as the pedestrian or cyclist target type, the face features, the structured semantic information of the pedestrian or cyclist, and the case unique identifier, is stored in the four tables in turn (the generated target unique identifier is associated with the target type), thereby establishing the association between the relational and non-relational database tables.
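The key-value layout described in item (2) can be sketched in Python as follows. The text states only that the target unique identifier and target type are combined into the key and that the face feature data is the value; the "type:id" key format, the float32 byte encoding of the feature vector, and the use of a dict as the store are all assumptions for illustration, and the function names are hypothetical.

```python
import struct

PEDESTRIAN, CYCLIST = 1, 4  # target type codes from the face case association table


def make_key(target_id, target_type):
    """Combine target type and target unique identifier into one key (assumed format)."""
    return f"{target_type}:{target_id}"


def pack_feature(vector):
    """Serialize a feature vector as little-endian float32 bytes (assumed encoding)."""
    return struct.pack(f"<{len(vector)}f", *vector)


def unpack_feature(blob):
    """Inverse of pack_feature: bytes back to a list of floats."""
    return list(struct.unpack(f"<{len(blob) // 4}f", blob))


def put_face_feature(store, target_id, target_type, vector):
    """Write one face feature into the key-value store."""
    store[make_key(target_id, target_type)] = pack_feature(vector)


def get_face_feature(store, target_id, target_type):
    """Read one face feature back, or None if the key is absent."""
    blob = store.get(make_key(target_id, target_type))
    return None if blob is None else unpack_feature(blob)
```

Because the target type is part of the key, a pedestrian and a cyclist that happen to share an identifier value would not collide in the store.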
The case retrieval flow is shown in Fig. 2 and comprises:
S2, performing pedestrian or cyclist target detection, face detection, and face feature extraction on the uploaded picture, or on the picture at a specified URL, using a deep learning algorithm;
S3, in the case retrieval interface, manually selecting the pedestrian or cyclist type and one or more structured semantic fields, querying the relational database table for the corresponding target unique identifiers using the structured semantic information, and, using those identifiers together with the pedestrian or cyclist target type, retrieving the matching face feature data from the face feature table in the non-relational database;
S4, comparing the face feature data extracted in S2 with the face feature data retrieved in S3 to obtain similarity scores;
S5, looking up the corresponding case unique identifiers in the face case association table of the relational database by target unique identifier, and combining them with the similarity scores obtained in step S4 to produce a similarity score for each case;
S6, sorting the cases by similarity score from high to low and returning the final result.
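The two-stage lookup of step S3 can be sketched as below: a relational query on the selected structured fields yields target unique identifiers, which are then used with the target type to fetch face features from the key-value store. This is a sketch under stated assumptions: SQLite and a dict stand in for the two databases, the table names, column names, and "type:id" key format are hypothetical, and restricting the query to undeleted targets that have a face is an assumption not stated in the text.

```python
import sqlite3

# Target type codes from the face case association table: 1 = pedestrian, 4 = cyclist.
TABLE_BY_TYPE = {1: "pedestrian_target", 4: "cyclist_target"}  # hypothetical table names
ALLOWED_FIELDS = {"coat_color", "has_backpack"}                # hypothetical semantic fields


def find_candidate_features(conn, kv_store, target_type, filters):
    """Return {target_id: face_feature} for targets matching the selected fields."""
    table = TABLE_BY_TYPE[target_type]
    unknown = set(filters) - ALLOWED_FIELDS
    if unknown:
        # Column names cannot be bound as SQL parameters, so they are whitelisted.
        raise ValueError(f"unsupported fields: {sorted(unknown)}")

    # Stage 1: relational query on the structured semantic fields.
    # Assumption: only undeleted targets (delete_flag = 1) with a face are wanted.
    conditions = ["delete_flag = 1", "has_face = 1"]
    conditions += [f"{field} = ?" for field in filters]
    sql = f"SELECT target_id FROM {table} WHERE " + " AND ".join(conditions)
    rows = conn.execute(sql, list(filters.values())).fetchall()

    # Stage 2: key-value lookups using the combined key (target type + target id).
    candidates = {}
    for (target_id,) in rows:
        feature = kv_store.get(f"{target_type}:{target_id}")
        if feature is not None:
            candidates[target_id] = feature
    return candidates
```

Filtering on structured fields first keeps the number of face feature comparisons in S4 proportional to the number of matching targets rather than to the size of the whole library.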
In the prior art, retrieval uses either structured information alone or face feature vector comparison alone, and the results still have to be manually screened and filtered a second time; with massive data this consumes enormous effort and may even cause the opportunity to solve the case to be missed.

Claims (2)

1. An efficient retrieval method for massive case data based on distinguishing storage media, characterized by comprising the following steps:
S1, storing the structured semantic information of pedestrians or cyclists in relational database tables, storing face feature data in a non-relational database table, and establishing an association between the relational and non-relational database tables using the pedestrian or cyclist target type, the face features, the structured semantic information of the pedestrian or cyclist, and the target unique identifier of the related case; three data tables are established in the relational database: a pedestrian target information table, a cyclist target information table, and a face case association table; a face feature table is established in the non-relational database;
the fields of the pedestrian target information table comprise: a target unique identifier, a has-face flag, face coordinates, a deletion flag, and the pedestrian structured semantic fields that make up the target pedestrian's structured information;
the fields of the cyclist target information table comprise: a target unique identifier, a has-face flag, face coordinates, a deletion flag, and the multiple fields of the target cyclist's structured information;
the fields of the face case association table comprise the following structured information: a primary key unique identifier, a case unique identifier, a target type, and a target unique identifier;
the fields of the face feature table comprise: a target unique identifier, a target type, and face feature data, stored as key-value (K-V) pairs;
the pedestrian or cyclist target type, the face features, the structured semantic information of the pedestrian or cyclist, and the case unique identifier identified from the manually annotated video snapshot are stored in the four tables in turn, so as to establish the association between the relational database tables and the non-relational database table;
S2, performing pedestrian or cyclist target detection, face detection, and face feature extraction on the uploaded picture to be searched, using a deep learning algorithm;
S3, selecting the pedestrian or cyclist type and one or more structured semantic fields, querying the relational database table for the corresponding target unique identifiers using the structured semantic information, and, using those identifiers together with the pedestrian or cyclist target type, retrieving the matching face feature data from the face feature table in the non-relational database;
S4, comparing the face feature data extracted in S2 with the face feature data retrieved in S3 to obtain similarity scores;
S5, looking up the corresponding case unique identifiers in the face case association table of the relational database by target unique identifier, and combining them with the similarity scores obtained to produce a similarity score for each case;
S6, sorting the cases by similarity score from high to low and returning the final result.
2. The efficient retrieval method for massive case data based on distinguishing storage media according to claim 1, wherein step S1 comprises: box-selecting a pedestrian or cyclist in a video snapshot through manual annotation; performing pedestrian or cyclist target detection, face detection, and face feature extraction using a deep learning algorithm; extracting the structured semantic information of the pedestrian or cyclist; and, after both kinds of data have been extracted, storing the structured semantic information of the pedestrian or cyclist in a relational database table through a back-end program and storing the face feature data in a non-relational database table.
Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910771016.5A CN110569390B (en) 2019-08-20 2019-08-20 Efficient retrieval method for massive case data based on storage medium distinguishing

Publications (2)

Publication Number Publication Date
CN110569390A CN110569390A (en) 2019-12-13
CN110569390B true CN110569390B (en) 2024-01-30

Family

ID=68774135

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910771016.5A Active CN110569390B (en) 2019-08-20 2019-08-20 Efficient retrieval method for massive case data based on storage medium distinguishing

Country Status (1)

Country Link
CN (1) CN110569390B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111586354A (en) * 2020-04-28 2020-08-25 上海市保安服务(集团)有限公司 Investigation system
CN112231530A (en) * 2020-11-19 2021-01-15 中国民航信息网络股份有限公司 Airport remote boarding check method based on biological information and related device
CN112487222B (en) * 2020-11-30 2021-11-30 江苏正赫通信息科技有限公司 Method for quickly searching and effectively storing similar human faces

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103235825A (en) * 2013-05-08 2013-08-07 重庆大学 Method used for designing large-quantity face recognition search engine and based on Hadoop cloud computing frame
JP2017076424A (en) * 2010-12-20 2017-04-20 セールスフォース ドット コム インコーポレイティッド Methods and systems for retrieving data
CN108268600A (en) * 2017-12-20 2018-07-10 北京邮电大学 Unstructured Data Management and device based on AI
CN108984799A (en) * 2018-08-21 2018-12-11 北京深瞐科技有限公司 A kind of video data handling procedure and device
CN108984598A (en) * 2018-05-31 2018-12-11 中国地质大学(武汉) A kind of fusion method and system of relationship type geologic database and NoSQL
CN109214315A (en) * 2018-08-21 2019-01-15 北京深瞐科技有限公司 Across the camera tracking method and device of people's vehicle
CN109993102A (en) * 2019-03-28 2019-07-09 北京达佳互联信息技术有限公司 Similar face retrieval method, apparatus and storage medium

Also Published As

Publication number Publication date
CN110569390A (en) 2019-12-13

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant