CN111506772B - Image searching method and system based on image feature extraction - Google Patents

Image searching method and system based on image feature extraction

Info

Publication number
CN111506772B
CN111506772B (application CN201910095245.XA)
Authority
CN
China
Prior art keywords
video
feature
sequence
frame
picture
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910095245.XA
Other languages
Chinese (zh)
Other versions
CN111506772A (en)
Inventor
胡钢 (Hu Gang)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing Xiaoxi Intelligent Technology Co ltd
Original Assignee
Beijing Xiaoxi Intelligent Technology Center LP
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Xiaoxi Intelligent Technology Center LP filed Critical Beijing Xiaoxi Intelligent Technology Center LP
Priority to CN201910095245.XA priority Critical patent/CN111506772B/en
Publication of CN111506772A publication Critical patent/CN111506772A/en
Application granted granted Critical
Publication of CN111506772B publication Critical patent/CN111506772B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/22: Matching criteria, e.g. proximity measures
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/23: Clustering techniques
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a picture searching method and system based on image feature extraction. Source videos are compressed by a feature extraction and comparison module and stored in a compressed video database, where clustering forms a cluster group and yields a cluster sequence set for every source video as well as a cluster sequence for the picture to be queried. The picture searching module compares these sequences, and the indexes stored in the compressed video database are then used to retrieve the actual frame picture and video for further video acquisition. The beneficial effects are: using image feature extraction, the features of every frame of a video are compared against one another and repeated frame feature sequences are eliminated. Memory occupation is low and is reduced further after video compression; compared with the prior art, picture search and retrieval are faster and positioning is accurate.

Description

Image searching method and system based on image feature extraction
Technical Field
The invention relates to the technical field of image searching, in particular to an image searching method and system based on image feature extraction.
Background
With the development of the film and television industry, large numbers of videos are preserved, whether to record human culture and artistic heritage or by for-profit viewing platforms and broadcasting groups. As time accumulates, however, the data remain enormous even after they have been classified and archived.
As usage grows and systems become more intelligent, people have sought more convenient, rapid, and accurate ways to search video. In the prior art, each saved video is usually given textual information such as a keyword-based name or a recording time, and users search by entering keywords or times. Because the data volume is large, keywords are often repeated, and even a narrowed keyword search still returns a great deal of material; users can then only locate what they want by watching the related video content. With so much data, searching is slow and time is wasted.
More recently, it has been proposed to search for videos with pictures, that is, to determine from a huge database which video a randomly acquired picture belongs to. In the prior art, such search still compares every frame of every video with the acquired picture one by one, which takes a long time, places high demands on the server, occupies a large amount of memory, and has low applicability.
There is therefore room for improvement in prior-art picture searching, and a need to propose a picture searching technique that increases search speed.
Disclosure of Invention
To address these problems, the invention provides a picture searching method and system based on image feature extraction, which extract the image features of videos and store the features compactly.
In order to achieve the above purpose, the invention adopts the following specific technical scheme:
the image searching method based on image feature extraction is characterized by comprising the following specific steps:
step 1: extracting and compressing the characteristics of each frame of picture in M videos to obtain M video characteristic point sets;
step 2: sending the M video feature point sets into a compressed video database, and obtaining a clustering cluster and a cluster sequence set of the M video feature points through clustering operation;
step 3: the picture obtaining end obtains a picture to be queried, and a picture feature sequence to be queried is obtained after the picture feature extraction module; sending the image clusters into a compressed video database for clustering to obtain a to-be-queried image cluster sequence;
step 4: comparing the image cluster sequence to be queried with a cluster sequence set of M video feature points to obtain a sequence similarity; extracting a video characteristic point set and a frame characteristic sequence which correspond to the sequence similarity of the picture cluster sequence to be queried, wherein the video characteristic point set corresponds to the highest sequence similarity of the picture cluster sequence to be queried;
Step 5: extracting a frame picture in a corresponding video according to the video feature point set and the index of the frame feature sequence, and comparing the frame picture with a picture to be queried to obtain a picture similarity;
step 6: and if the picture similarity is greater than the picture similarity threshold, outputting the corresponding video or the index of the corresponding video.
With this design, after the M videos undergo feature extraction and compression, M video feature point sets are obtained, and clustering in the compressed video database forms a cluster group together with the cluster sequence sets of the M video feature point sets. When any picture is obtained, feature extraction and clustering produce its cluster sequence; by comparing this sequence with the cluster sequence sets of the M video feature point sets, the frame picture of greatest similarity and the index of the corresponding video are found. Comparing that frame picture with the actual picture to be queried further confirms that the retrieved video is correct and improves search precision. The newly designed module realizes video feature extraction and compression and reduces the memory footprint of each video: exploiting the strong similarity of consecutive frames, frame pictures of high similarity are deleted, which greatly reduces the number of frames retained per video. Because the feature sequence of a picture is unique and the sequences extracted from different pictures differ, the memory required by the retained video feature point sets is reduced still further. Here M is a positive integer.
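By way of illustration only, the following Python sketch shows how the decision logic of steps 4 to 6 might be wired together; the data layout, the sequence-similarity measure, and the 80%/90% thresholds (taken from the embodiment described later) are assumptions, not the prescribed implementation.

```python
from collections import Counter

SEQ_SIM_THRESHOLD = 0.80   # embodiment value for sequence similarity
PIC_SIM_THRESHOLD = 0.90   # embodiment value for picture similarity

def sequence_similarity(query_seq, frame_seq):
    """Fraction of shared cluster labels between two cluster sequences
    (one plausible measure; the patent does not fix a specific one)."""
    q, f = Counter(query_seq), Counter(frame_seq)
    return sum((q & f).values()) / max(len(query_seq), 1)

def search(query_seq, video_sets, verify_picture):
    """video_sets: {video_id: [(frame_no, cluster_seq), ...]} (assumed layout).
    verify_picture(video_id, frame_no) -> picture similarity in [0, 1]."""
    video_id, frame_no, sim = max(
        ((vid, fno, sequence_similarity(query_seq, seq))
         for vid, frames in video_sets.items()
         for fno, seq in frames),
        key=lambda t: t[2])
    if sim < SEQ_SIM_THRESHOLD:
        return None                       # nothing retrieved (step 4 gate)
    if verify_picture(video_id, frame_no) > PIC_SIM_THRESHOLD:
        return video_id, frame_no         # step 6: output the video index
    return None
```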
Further, in step 1, the specific steps of feature extraction and compression for each frame picture of the M videos are as follows:
Step 11: the video acquisition end acquires, in turn, a video Ax from the M videos and stores it in the to-be-extracted storage module;
Step 12: the to-be-extracted storage module obtains the total frame number N of video Ax and sends each frame of video Ax in turn to the feature extraction and comparison module;
Step 13: the feature extraction and comparison module designates a reference frame and, relative to it, preprocesses video Ax and performs image feature extraction, comparison, screening, deletion, and compression to obtain the video feature point set, as follows:
S131: the feature extraction and comparison module acquires the first frame and sets it as the reference frame; after extracting the image features of the first frame, it sorts the extracted features to obtain the reference frame feature sequence and stores it in the feature sequence storage module;
S132: the feature extraction and comparison module obtains the next frame and treats it as the frame to be compared; it extracts the image features of this frame and sorts them to obtain the frame feature sequence to be compared;
S133: comparing the frame feature sequence to be compared with the reference frame feature sequence;
if the similarity is greater than or equal to the set similarity threshold, the frame feature sequence to be compared is discarded and step S134 is entered;
if the similarity is smaller than the set similarity threshold, the frame feature sequence to be compared becomes the new reference frame feature sequence, i.e., the corresponding frame to be compared becomes the reference frame; the new reference frame feature sequence is stored, in order, in the feature sequence storage module, and step S134 is entered;
S134: judging whether the Nth frame of video Ax has been compared; if yes, the feature sequence storage module assembles all stored reference frame feature sequences into a video feature point set and outputs the video feature point set of video Ax; otherwise, return to step S132;
Step 14: judging whether all M videos have been compressed; if yes, step 1 ends; otherwise, return to Step 11.
Since a video is composed of frames arranged in sequence that together form the moving picture, adjacent frames are highly similar. With the above design, to simplify the compressed video database, the image features of every frame are extracted, adjacent frames are compared one by one, repeated frames are deleted, and a video feature point set with a low repetition rate is finally retained. During picture search, once the image features of the picture have been extracted, the corresponding video feature point set can be found quickly. Compared with the prior art, the search cycle is short, which facilitates adoption, and the temporal position of the picture within the video can be located accurately.
Whenever the feature extraction and comparison module obtains a frame picture, the picture is preprocessed. Preprocessing includes a picture size reset, which resizes the picture to a uniform size that can be customized per system, and gray-scale processing, whose gray value is likewise user-defined within the range 0-255.
The similarity threshold is user-defined. After the frame feature sequence to be compared is compared with the reference frame feature sequence, if the similarity exceeds the threshold the two frames are considered alike and only the reference frame needs to be stored; proceeding by analogy through the whole video, only part of the feature sequences are kept. This greatly reduces the storage occupied by a video for picture searching without affecting the search itself, and speeds up retrieval.
Where x in Ax is a positive integer.
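A minimal sketch of the reference-frame compression loop of step 13 (S131-S134) is given below, assuming OpenCV's AKAZE detector (the algorithm named in the embodiment) and a brute-force Hamming matcher; the similarity measure, a ratio of cross-checked descriptor matches, is one plausible choice and is not specified by the patent.

```python
import cv2

SIMILARITY_THRESHOLD = 0.9  # user-defined, as the description states

def similarity(matcher, desc_a, desc_b):
    """Assumed measure: fraction of cross-checked descriptor matches."""
    if desc_a is None or desc_b is None:
        return 0.0
    matches = matcher.match(desc_a, desc_b)
    return len(matches) / max(len(desc_a), len(desc_b))

def compress_video(path):
    """Steps S131-S134: keep a frame's feature sequence only when it differs
    enough from the current reference frame's sequence."""
    akaze = cv2.AKAZE_create()
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    cap = cv2.VideoCapture(path)
    feature_point_set = []              # retained reference frame sequences
    ref_desc = None
    while True:
        ok, frame = cap.read()
        if not ok:
            break                       # Nth frame compared: done (S134)
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        keypoints, desc = akaze.detectAndCompute(gray, None)
        if ref_desc is None or similarity(matcher, desc, ref_desc) < SIMILARITY_THRESHOLD:
            feature_point_set.append((keypoints, desc))  # new reference frame
            ref_desc = desc
        # otherwise the frame is discarded as a repeat of the reference (S133)
    cap.release()
    return feature_point_set
```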
In step 2, the specific steps of obtaining the cluster group by clustering the M video feature point sets are as follows:
Step 21a: randomly selecting k features from the M video feature point sets as the cluster centers of the initial clusters;
Step 21b: calculating the clustering distance from every feature in the M video feature point sets to each of the k cluster centers, and assigning each feature to the cluster at the minimum clustering distance;
Step 21c: for each cluster whose membership has changed, calculating the mean clustering distance of that cluster and determining its new cluster center from the mean;
Step 21d: if no cluster center changes any more, clustering ends and the cluster group formed by the k clusters together with their cluster centers is output; otherwise, return to Step 21b.
In step 2, the specific steps by which the M video feature point sets yield the cluster sequence sets through the clustering operation are as follows: number the k clusters of the cluster group;
calculate, in order, the clustering distance from each feature of the M video feature point sets to each of the k clusters;
sort the k clustering distances of each feature, classify the feature into the cluster at the minimum clustering distance, and record the cluster number for that feature, until every feature of every video feature point set has been classified, yielding M cluster sequence sets.
By classifying the video features with a cluster analysis algorithm, a cluster group unique to the video features is obtained and the features of the M videos are categorized; this provides the basis of comparison for pictures to be queried. After the features of a picture to be queried are extracted, they are placed into the cluster group so that each feature is classified, and the result is compared one by one with the M cluster sequence sets corresponding to the M videos to see whether any of them contains a sequence matching the feature classification of the picture. Clustering the image features in this way facilitates similarity comparison and provides a comparison basis for picture searching.
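The clustering operation of step 2 might be sketched as follows, with scikit-learn's k-means standing in for the MADlib k-means used in the embodiment; the descriptor layout and all names are assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans

def build_cluster_group(video_feature_sets, k):
    """Steps 21a-21d: cluster every feature from the M videos into k clusters.
    video_feature_sets: list of (n_i, d) descriptor arrays, one per video."""
    all_features = np.vstack(video_feature_sets).astype(np.float32)
    return KMeans(n_clusters=k, n_init=10).fit(all_features)

def cluster_sequence(cluster_group, descriptors):
    """Assign each feature to the cluster at the minimum clustering distance,
    producing the cluster sequence (a sequence of cluster numbers)."""
    return cluster_group.predict(np.asarray(descriptors, dtype=np.float32)).tolist()

# One cluster sequence per retained frame, one set per video: M sets in total.
```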
In step 3, the picture obtaining end obtains the picture to be queried and feature extraction yields the feature sequence of the picture to be queried, which is sent into the compressed video database for clustering; the specific steps of obtaining the cluster sequence of the picture to be queried are as follows:
Step 31: after the picture obtaining end obtains the picture to be queried, it sends the picture to the picture feature extraction module;
Step 32: the picture feature extraction module, working with a third-party visual library, extracts the feature sequence of the picture to be queried and sends it to the compressed video database;
Step 33: the clustering distances of all features in the feature sequence of the picture to be queried are calculated, each feature is classified into the cluster at the minimum clustering distance, and the cluster sequence corresponding to the picture's feature sequence is obtained.
With this scheme, the picture to be queried undergoes feature extraction and classification so that it can be compared with the M cluster sequence sets to find the corresponding frame picture and source video.
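On the query side, steps 31 to 33 could look like the sketch below, mirroring the video-side AKAZE extraction; cluster_group is the fitted k-means model from the previous sketch, and all names are assumptions.

```python
import cv2
import numpy as np

def query_cluster_sequence(picture_path, cluster_group):
    """Steps 31-33: extract the picture's feature sequence via the
    third-party visual library, then classify each feature into the
    cluster at the minimum clustering distance."""
    img = cv2.imread(picture_path, cv2.IMREAD_GRAYSCALE)
    _, desc = cv2.AKAZE_create().detectAndCompute(img, None)
    if desc is None:
        return []                        # no features extracted
    return cluster_group.predict(desc.astype(np.float32)).tolist()
```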
Still further, the feature extraction and comparison module and the picture feature extraction module are both connected to a third-party visual library. The feature extraction and comparison module has at least a frame sending unit and a feature sequence receiving unit: the frame sending unit sends the frame to be compared to the third-party visual library for image feature extraction, and the resulting frame feature sequence is fed back to the feature sequence receiving unit. The picture feature extraction module has a picture sending unit and a to-be-queried picture feature sequence receiving unit: the picture sending unit sends the picture to be queried to the third-party visual library for feature extraction, and the resulting picture feature sequence is fed back to the receiving unit.
The third-party visual library is an open-source computer vision library, which may be OpenCV, JavaCV, Torch3Vision, ImLab, CImg, the Generic Image Library (GIL, integrated into Boost), or the like.
In a still further technical scheme, the third-party visual library is the OpenCV visual library.
OpenCV is an open-source computer vision platform. The project was initiated by a Russian team at a research and development center subordinate to Intel Corporation, with the goal of real-time computer vision, and it is a cross-platform vision library. It has developed rapidly since its inception, with strong support and contributions from experts in many companies and fields, and its BSD open-source license allows free use in both scientific research and commercial applications. As a powerful open-source vision library, OpenCV borrows heavily from the implementation details and style of MatLab; from version 2.x onward, more and more functions reproduce MatLab functionality, in many cases under identical names (imread, imshow, imwrite, etc.). This shortens the distance between product development and academic research and greatly improves developer efficiency. In computer memory, digital images are stored and operated on as matrices: in MatLab, reading an image yields a matrix, and the same is true in OpenCV. In the early OpenCV 1.x versions, images were processed through the IplImage structure (the name comes from Intel's other open-source library, the Intel Image Processing Library). Early OpenCV was written in C, so the interface it provided was a C interface and its source code was entirely C in style; the IplImage structure was the basic data structure of OpenCV matrix operations. In OpenCV 2.x, the library adopted object-oriented programming, a large amount of source code was rewritten in C++, and the Mat class (short for Matrix) was introduced as the wrapper class for image processing. Functionally, Mat strengthens what IplImage offered, and the advanced features of C++ greatly improve its extensibility; later versions keep enriching it, and inspection of its definition shows a design comprehensive and specific enough to cover the basic image-processing needs of computer vision.
OpenCV already includes functionality in the following application domains: 2D and 3D feature toolboxes, motion estimation, face recognition, gesture recognition, human-computer interaction, mobile robotics, motion understanding, object identification, segmentation and recognition, stereo vision, motion tracking, and augmented reality (AR).
To support these functions, OpenCV also includes a statistical machine learning library: boosting, decision tree learning, gradient boosting, the EM algorithm (expectation maximization), the KNN algorithm, naive Bayes classification, artificial neural networks, random forests, and support vector machines.
Most OpenCV modules are implemented in C++, with some in C, and the SDKs currently provided support application development in C++, Java, Python, and other languages; newly developed algorithms and module interfaces are now generated from C++. OpenCV supports application development on nearly all mainstream operating systems, including Windows, Mac, Linux, FreeBSD, and OpenBSD, and on mobile platforms including Android, iOS, and BlackBerry. Users can obtain the relevant SDK downloads, development documents, and environment configuration information from the OpenCV site. Since the release of version 1.0, OpenCV has attracted the attention of many companies and is widely used in product development and innovation in many fields, including satellite and electronic map stitching, image noise processing in medicine, object detection, intrusion detection and automatic monitoring and alarm in security surveillance, product quality inspection in manufacturing, camera calibration, UAV flight in the military field, autonomous driving, underwater robots, and many others.
In a still further technical scheme, the feature extraction and comparison module is provided with a feature point extraction tool having a video reading function block, a frame-taking function block, an AKAZE algorithm calling function block, a feature point normalization function block, and a file writing function block;
video feature parameters are also set in the feature extraction and comparison module, including the similarity threshold, a video input path, and a video output path.
With this scheme, the feature extraction and comparison module implements video reading, frame picture sending, feature receiving, and related functions, and locates the source video through its path.
Still further, the unique feature attributes of a frame feature in the frame feature sequence include the feature abscissa, feature ordinate, feature angle, feature size, feature weight, feature extension, and feature pyramid layer. With these unique attributes, the same feature can be extracted under different shooting angles, different magnifications, and similar variations; features are not easily lost, extraction is complete, and different viewing angles are handled conveniently.
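These seven attributes correspond closely to the fields OpenCV exposes on a keypoint; the mapping below is a plausible reading rather than one stated in the patent (in particular, tying the feature extension to class_id is an assumption).

```python
import cv2

def keypoint_record(kp: cv2.KeyPoint):
    """Plausible mapping of the patent's unique feature attributes onto
    cv2.KeyPoint fields (an illustrative assumption)."""
    return {
        "feature_abscissa": kp.pt[0],        # x coordinate
        "feature_ordinate": kp.pt[1],        # y coordinate
        "feature_angle": kp.angle,           # orientation in degrees
        "feature_size": kp.size,             # neighbourhood diameter
        "feature_weight": kp.response,       # detector response ("weight")
        "feature_extension": kp.class_id,    # spare field ("extension")
        "feature_pyramid_layer": kp.octave,  # pyramid layer
    }
```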
Still further, all reference frame feature sequences in a video feature point set are arranged in the order in which they were stored. When a frame feature sequence to be compared has similarity below the set threshold, it is stored in the feature sequence storage module immediately after the previous reference frame feature sequence. With the finally stored video feature point set, the approximate position of any feature sequence of unknown position can be obtained by comparison, enabling positioning.
Still further, every frame feature sequence, and the feature sequence of a picture to be queried, consists of a frame sequence number, a feature sequence start label, the feature sequence content, and a feature sequence end label, connected in that order. The frame number and the start and end labels mark each frame so that frames can be distinguished, with the beginning and end of each frame digitally identified.
Still further, the feature sequence start label consists of an integer of X bytes and the feature sequence end label of an integer of Y bytes; the starting point of a video feature point set is connected to a set start label, and every two reference frame feature sequences are joined by a feature connector.
That is, for video A1 the set start label of the video feature point set is: 'video A1 feature sequence'.
A picture searching system based on image feature extraction comprises a video acquisition end and a picture acquisition end. The video acquisition end is connected to a to-be-extracted storage module, which is connected to the feature extraction and comparison module. Inside the feature extraction and comparison module are a frame sending unit and a feature sequence receiving unit, and the module carries a third-party visual library connection end that links these two units to the third-party visual library. A feature sequence storage module is also connected to the feature extraction and comparison module; the feature sequence storage module is connected to the compressed video database, and the compressed video database is connected to the picture searching module.
The picture acquisition end is connected to the compressed video database through a picture feature extraction module, which carries a third-party visual library connection end used to extract the feature sequence of the picture to be queried; the picture acquisition end is also connected to the picture searching module, and the picture searching module is connected to the source video library.
Through this system, source videos are acquired and stored, feature sequences are generated, and video frames are extracted and pruned so that the most compact video feature point sets are output; the compressed video database forms the cluster group of video features and, with it, the cluster sequence sets of the M video feature point sets. The system also obtains pictures, extracts and classifies their features, compares them against the cluster sequence sets of the M video feature point sets, and ranks the comparison results to obtain the frame pictures of highest similarity; comparing the picture to be queried with the actual frame picture further confirms the search result, after which the source video index in the compressed video database yields the corresponding video or its storage location, realizing fast and accurate video search.
Still further, a clustering unit and a data unit are arranged in the compressed video database;
the clustering unit performs the clustering operation on the M video feature point sets to obtain the cluster group and the cluster sequence sets of the M video feature point sets, and clusters all features in the feature sequence of a picture to be queried to obtain the picture's cluster sequence;
the data unit stores the M video feature point sets, the M cluster sequence sets of the video feature point sets, the source video index of each video, and any feature sequence and cluster sequence of pictures to be queried; the picture searching module is provided with a cluster comparison unit, a picture acquisition unit, a frame picture acquisition unit, and a source video acquisition unit;
the cluster comparison unit compares the cluster sequence of the picture to be queried with the cluster sequence sets of the M video feature point sets to obtain sequence similarities; the picture acquisition unit acquires the picture to be queried; the frame picture acquisition unit acquires the frame picture with the highest sequence similarity;
the source video acquisition unit acquires the source video with the highest sequence similarity.
With this scheme, the clustering unit clusters and classifies every feature of the M videos; the data unit stores the feature data, the feature classification data, and the index of each video within the source videos. During picture search, the cluster comparison unit classifies and compares the acquired picture and finds the most similar sequence among the cluster sequence sets of the M video feature point sets.
The beneficial effects of the invention are as follows: using image feature extraction, the features of every frame of a video are compared against one another and repeated frame feature sequences are eliminated, so memory occupation is low and the data volume shrinks after video compression. During picture search, the acquired picture is classified against the cluster group obtained by the clustering algorithm and similarity is determined from the classification result; the comparison proceeds block by block, and the approximate frame at which the picture appears in the video can be located. Compared with the prior art, the invention speeds up searching and positions results accurately.
Drawings
FIG. 1 is a block diagram of the system architecture of the present invention;
FIG. 2 is a flow chart of the picture-based video search method of the present invention;
FIG. 3 is a flow chart of the equivalent video compression storage method of the present invention;
FIG. 4 is a diagram of the MADlib open-source machine learning library architecture;
FIG. 5 is a diagram of the detected video list;
FIG. 6 is a transient state diagram of the equivalent video compression operation;
FIG. 7 is the picture to be queried;
FIG. 8 is a diagram of the original-frame picture search result;
FIG. 9 is a diagram of the frame picture search result at crop size 1;
FIG. 10 is a diagram of the frame picture search result at crop size 2.
Detailed Description
The following describes the embodiments and working principles of the present invention in further detail with reference to the drawings.
As can be seen from fig. 1 and fig. 2, a picture searching method based on image feature extraction specifically comprises the following steps:
Step 1: extracting and compressing the features of each frame picture in M videos to obtain M video feature point sets;
Step 2: sending the M video feature point sets into a compressed video database, and obtaining a cluster group and the cluster sequence sets of the M video feature point sets through a clustering operation;
Step 3: the picture obtaining end obtains a picture to be queried, and the picture feature extraction module produces the feature sequence of the picture to be queried; the sequence is sent into the compressed video database for clustering to obtain the cluster sequence of the picture to be queried;
Step 4: comparing the cluster sequence of the picture to be queried with the cluster sequence sets of the M video feature point sets to obtain sequence similarities; extracting the video feature point set and frame feature sequence corresponding to the highest sequence similarity;
Step 5: extracting the frame picture in the corresponding video according to the indexes of the video feature point set and the frame feature sequence, and comparing it with the picture to be queried to obtain a picture similarity;
Step 6: if the picture similarity is greater than the picture similarity threshold, outputting the corresponding video or the index of the corresponding video.
In this embodiment, a sequence similarity threshold of 80% is also set: when the highest sequence similarity is below this threshold, the next step is not performed and no related content is deemed retrieved; when it is greater than or equal to the threshold, the video feature point set and frame feature sequence corresponding to the highest sequence similarity are extracted. The picture similarity threshold in this embodiment is 90%. As can be seen from fig. 3, in step 1 the specific steps of feature extraction and compression for each frame picture of the M videos are as follows:
Step 11: the video acquisition end acquires, in turn, a video Ax from the M videos and stores it in the to-be-extracted storage module; Step 12: the to-be-extracted storage module obtains the total frame number N of video Ax and sends each frame of video Ax in turn to the feature extraction and comparison module; Step 13: the feature extraction and comparison module designates a reference frame and, relative to it, preprocesses video Ax and performs image feature extraction, comparison, screening, deletion, and compression to obtain the video feature point set, as follows:
S131: the feature extraction and comparison module acquires the first frame and sets it as the reference frame; after extracting the image features of the first frame, it sorts the extracted features to obtain the reference frame feature sequence and stores it in the feature sequence storage module; S132: the feature extraction and comparison module obtains the next frame and treats it as the frame to be compared; it extracts the image features of this frame and sorts them to obtain the frame feature sequence to be compared;
S133: comparing the frame feature sequence to be compared with the reference frame feature sequence;
if the similarity is greater than or equal to the set similarity threshold, the frame feature sequence to be compared is discarded and step S134 is entered; if the similarity is smaller than the set similarity threshold, the frame feature sequence to be compared becomes the new reference frame feature sequence, i.e., the corresponding frame to be compared becomes the reference frame; the new reference frame feature sequence is stored, in order, in the feature sequence storage module, and step S134 is entered;
S134: judging whether the Nth frame of video Ax has been compared; if yes, the feature sequence storage module assembles all stored reference frame feature sequences into a video feature point set and outputs the video feature point set of video Ax; otherwise, return to step S132;
Step 14: judging whether all M videos have been compressed; if yes, step 1 ends; otherwise, return to Step 11.
In this embodiment, when each frame picture of video Ax is preprocessed, the picture size reset resizes the picture to 400×300 and the gray value is set to 50%. In this embodiment the feature extraction and comparison module is connected to the third-party visual library OpenCV, in which the AKAZE algorithm is set.
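A sketch of this embodiment's preprocessing under the stated 400×300 resize follows; how the 50% gradation value is applied is not defined, so the halving of intensities below is an assumption.

```python
import cv2

def preprocess(frame):
    """Embodiment preprocessing: reset the picture to 400x300, then apply
    gray-scale processing. The '50% gradation value' is interpreted here
    as halving the intensity range -- an assumption, since the description
    does not define it precisely."""
    resized = cv2.resize(frame, (400, 300))
    gray = cv2.cvtColor(resized, cv2.COLOR_BGR2GRAY)
    return (gray * 0.5).astype("uint8")
```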
In this embodiment, in step 2, the specific steps of obtaining the cluster group by clustering the M video feature point sets are as follows:
Step 21a: randomly selecting k features from the M video feature point sets as the cluster centers of the initial clusters; Step 21b: calculating the clustering distance from every feature in the M video feature point sets to each of the k cluster centers, and assigning each feature to the cluster at the minimum clustering distance;
Step 21c: for each cluster whose membership has changed, calculating the mean clustering distance of that cluster and determining its new cluster center from the mean;
Step 21d: if no cluster center changes any more, clustering ends and the cluster group formed by the k clusters together with their cluster centers is output; otherwise, return to Step 21b.
In step 2, the specific steps by which the M video feature point sets yield the cluster sequence sets through the clustering operation are as follows: number the k clusters of the cluster group;
calculate, in order, the clustering distance from each feature of the M video feature point sets to each of the k clusters;
sort the k clustering distances of each feature, classify the feature into the cluster at the minimum clustering distance, and record the cluster number for that feature, until every feature of every video feature point set has been classified, yielding M cluster sequence sets.
In this embodiment, image feature clustering is realized by connecting to the MADlib open-source machine learning library, and the clustering method adopted is the k-means algorithm.
MADlib is an open-source machine learning library developed by Pivotal in cooperation with the University of California, Berkeley. It provides a variety of data conversion, data exploration, statistics, data mining, and machine learning methods with which structured data can be easily analyzed and mined; users can conveniently load MADlib into a database to extend its analytic capability. MADlib entered the Apache Software Foundation incubator in July 2015 and, after two years of development, became a top-level Apache project in August 2017. Its latest version, MADlib 1.15, integrates seamlessly with PostgreSQL, the Greenplum database system, the HAWQ database system, and others. This embodiment employs the plug-in integrated into PostgreSQL.
As can be seen from fig. 4, at the top level of the MADlib architecture is the user interface. As previously described, the user completes data mining simply by calling MADlib-provided functions in SQL query statements, with the SQL syntax matched to the particular database management system. At the bottom is the database management system itself, such as Greenplum or PostgreSQL, which ultimately processes the query requests.
Fig. 4 also shows that, from top to bottom, the MADlib system architecture consists of four main components: driving functions implemented as Python calls over SQL templates, a high-level abstraction layer implemented in Python, core functions implemented in C++, and a low-level database abstraction layer implemented in C++.
The main design ideas driving the MADlib architecture are consistent with Hadoop, namely:
operate on local data inside the database, without unnecessary data movement across multiple runtime environments;
make full use of the database engine while keeping the data mining logic separate from the implementation details of the particular database; exploit the parallelism and scalability provided by MPP shared-nothing technology, such as the Greenplum or HAWQ database systems; and keep maintenance activities open to the Apache community and to ongoing academic research.
The basic idea of the k-means cluster partitioning method is to divide a given set of N data records into K groups, K < N, where each group represents a cluster. The K groups satisfy two conditions: each group contains at least one data record, and each data record belongs to one and only one group.
The algorithm first produces an initial grouping and then improves it by iteration, so that each revised grouping is better than the last, where the standard of "better" is: the closer the objects within the same group (iterating until the within-group assignments barely change, i.e., convergence) and the farther apart the objects in different groups, the better.
The working principle of the k-means algorithm is as follows:
First, K points are randomly selected from the data set, each initially representing a cluster center. The distance from every remaining sample to each center point is calculated and the sample is assigned to the nearest cluster; the mean of each cluster is then recalculated as its new center point. The whole process repeats continuously, and if two successive adjustments produce no obvious change, the clusters have converged. One feature of this algorithm is that every iteration examines whether each sample is classified correctly; if not, the center points are adjusted after all samples have been reassigned, and the next iteration begins. This process repeats until a termination condition is met, which may be any of the following: no object is reassigned to a different cluster; the cluster centers no longer change;
the sum of squared errors is at a local minimum.
The k-means algorithm is a very typical distance-based clustering algorithm: it uses distance as the evaluation index of similarity, i.e., the closer two objects are, the greater their similarity. The algorithm regards clusters as composed of objects that lie close together and therefore takes compact, independent clusters as its final goal. Its inputs are the number of clusters k and n data objects; its outputs are the k clusters that satisfy the error-minimization criterion. The processing flow is: arbitrarily select k of the n data objects as initial centers;
calculate the distance between each object and the center objects, and assign each object according to the minimum distance; recalculate the mean of each changed cluster as its new center;
repeat the assignment and update steps until no cluster changes. The termination condition is typically to minimize the sum of squared distances from the objects to their cluster centers.
in the implementation process, firstly, the films classified by categories are collected, so that the division of the group centers is objective and reasonable as far as possible, and the group centers produced later are divided as far as possible and better distinguished.
The collection category is greater than but not limited to the following categories of television programs: news, variety, movie and television series, event series, landscape series, animal series, foreign major, domestic major, war, and indoor series … …
In this embodiment, in step 3, the picture obtaining end obtains the picture to be queried and feature extraction yields the feature sequence of the picture to be queried, which is sent into the compressed video database for clustering; the specific steps of obtaining the cluster sequence of the picture to be queried are as follows:
Step 31: after the picture obtaining end obtains the picture to be queried, it sends the picture to the picture feature extraction module;
Step 32: the picture feature extraction module, working with the third-party visual library, extracts the feature sequence of the picture to be queried and sends it to the compressed video database;
Step 33: the clustering distances of all features in the feature sequence of the picture to be queried are calculated, each feature is classified into the cluster at the minimum clustering distance, and the cluster sequence corresponding to the picture's feature sequence is obtained.
In a further technical scheme, the feature extraction and comparison module and the picture feature extraction module are both connected to the third-party visual library; the frame sending unit sends the frame to be compared to the third-party visual library for image feature extraction, and the resulting frame feature sequence is fed back to the feature sequence receiving unit;
the picture feature extraction module has a picture sending unit and a to-be-queried picture feature sequence receiving unit: the picture sending unit sends the picture to be queried to the third-party visual library for feature extraction, and the resulting picture feature sequence is fed back to the receiving unit.
In this embodiment, the feature extraction and comparison module is provided with a feature point extraction tool having a video reading function block, a frame-taking function block, an AKAZE algorithm calling function block, a feature point normalization function block, and a file writing function block;
video feature parameters are also set in the feature extraction and comparison module, including the similarity threshold, a video input path, and a video output path. A written shell script sequentially invokes the feature point extraction tool, which performs feature comparison and input/output using a 90% similarity threshold together with the video input and output paths. During extraction, the runtime environment invokes the feature point extraction tool inside a docker container, driven by the written shell script.
In this embodiment, the third-party visual library is the OpenCV visual library, in which the AKAZE algorithm is set. The extracted feature points are output as a csv file in a specified format and then packaged into jar files.
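The specified csv format is not disclosed, so the sketch below assumes one row per feature point with one column per attribute (attribute dicts such as those produced by the assumed keypoint_record helper above).

```python
import csv

FIELDS = ["frame_no", "feature_abscissa", "feature_ordinate", "feature_angle",
          "feature_size", "feature_weight", "feature_extension",
          "feature_pyramid_layer"]

def write_feature_csv(path, frames):
    """Write extracted feature points as csv. frames: iterable of
    (frame_no, [attribute_dict, ...]); the column layout is assumed."""
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        writer.writeheader()
        for frame_no, records in frames:
            for rec in records:
                writer.writerow({"frame_no": frame_no, **rec})
```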
In this embodiment, the unique feature attributes of a frame feature in the frame feature sequence include the feature abscissa, feature ordinate, feature angle, feature size, feature weight, feature extension, and feature pyramid layer;
all reference frame feature sequences in a video feature point set are arranged in the order in which they were stored;
every frame feature sequence, and the feature sequence of a picture to be queried, consists of a frame sequence number, a feature sequence start label, the feature sequence content, and a feature sequence end label, connected in that order;
the feature sequence start label consists of an integer of X bytes and the feature sequence end label of an integer of Y bytes; in this embodiment X = Y = 4.
The starting point of the video feature point set is connected to a set start label, and every two reference frame feature sequences are joined by a feature connector.
The feature sequence formed for video Ax is:
"video Ax feature sequence" + feature connector + frame number of frame 1 + frame feature sequence start label + frame 1 feature sequence content + frame 1 feature sequence end label + feature connector + frame number of frame i1 + frame start label + frame i1 feature sequence content + frame i1 feature sequence end label + feature connector + frame number of frame i2 + frame feature sequence start label + frame i2 feature sequence content + frame i2 feature sequence end label + ..., where i1 and i2 are integers greater than 1 and i1 < i2.
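With the embodiment's X = Y = 4, this byte layout might be serialized as below; the tag values, the connector byte, and the per-feature encoding are all assumptions, while the final gzip step matches the embodiment's gzip-compressed binary storage file.

```python
import gzip
import struct

START_TAG = struct.pack("<i", 1)   # X = 4-byte integer start label (value assumed)
END_TAG = struct.pack("<i", -1)    # Y = 4-byte integer end label (value assumed)
CONNECTOR = b"|"                   # feature connector between sequences (assumed)

def serialize_video(video_name, frames):
    """frames: [(frame_no, feature_bytes), ...] in stored order. Layout follows
    the description: set start label, then per frame the frame number, start
    tag, feature sequence content, and end tag, joined by feature connectors."""
    parts = [("%s feature sequence" % video_name).encode()]   # set start label
    for frame_no, content in frames:
        parts.append(struct.pack("<i", frame_no) + START_TAG + content + END_TAG)
    return CONNECTOR.join(parts)

def write_compressed(path, blob):
    """The video feature point package is stored as a gzip-compressed file."""
    with gzip.open(path, "wb") as f:
        f.write(blob)
```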
In this embodiment, the video feature point compression package is a gzip-compressed binary sequence storage file. As can be seen from fig. 1, a picture searching system based on image feature extraction includes a video acquisition end and a picture acquisition end;
the video acquisition end is connected to a to-be-extracted storage module, which is connected to the feature extraction and comparison module; inside the feature extraction and comparison module are a frame sending unit and a feature sequence receiving unit, and the module carries a third-party visual library connection end that links these two units to the third-party visual library; a feature sequence storage module is also connected to the feature extraction and comparison module, the feature sequence storage module is connected to the compressed video database, and the compressed video database is connected to the picture searching module;
the picture acquisition end is connected to the compressed video database through the picture feature extraction module, which carries a third-party visual library connection end used to extract the feature sequence of the picture to be queried; the picture acquisition end is also connected to the picture searching module, and the picture searching module is connected to the source video library.
As can also be seen in fig. 1, a clustering unit and a data unit are arranged in the compressed video database;
the clustering unit performs the clustering operation on the M video feature point sets to obtain the cluster group and the cluster sequence sets of the M video feature point sets, and clusters all features in the feature sequence of a picture to be queried to obtain the picture's cluster sequence;
the data unit stores the M video feature point sets, the M cluster sequence sets of the video feature point sets, the source video index of each video, and any feature sequence and cluster sequence of pictures to be queried;
the picture searching module is provided with a cluster comparison unit, a picture acquisition unit, a frame picture acquisition unit, and a source video acquisition unit; the cluster comparison unit compares the cluster sequence of the picture to be queried with the cluster sequence sets of the M video feature point sets to obtain sequence similarities;
the picture acquisition unit acquires the picture to be queried;
the frame picture acquisition unit acquires the frame picture with the highest sequence similarity;
the source video acquisition unit acquires the source video with the highest sequence similarity.
To further illustrate this embodiment, 18 videos of the television column "Speak Loudly" plus two randomly acquired short videos are used as the source videos.
The download address of the 18 "Speak Loudly" videos is: http://qjcq.cbg.cn/dsscl/1.shtml.
Part of the videos on the first page of that site were acquired and downloaded with a crawler, 18 in total. Most are about 30 minutes long, one is 45 minutes, and one is 1 hour; the other two videos are roughly 3-5 minutes, making 20 videos in all. The video list is shown in fig. 5.
In connection with fig. 5, the naming is explained as follows: 170101001 and 170101002 are the 3-5 minute short videos, while the remaining names encode the broadcast date and number of an episode of "Speak Loudly". For example, 170524001 is the episode "Suspicion", the first program in natural broadcast order aired on May 24, 2017; the remaining names follow the same rule. The total size of the video files is 2.49 GB.
The equivalent video compression storage system was installed on a host configured with a 2U chassis, 4 cores, 16 threads, and 32 GB of memory. Extraction ran on the CPU only, so computing power was limited, and the average load during computation was about 30%. The running state is shown in fig. 6. After feature point extraction over all the videos, the resulting csv files occupied 1.4 GB of disk space; packaged into tar.gz files, the video feature point compression packages occupied 480 MB.
Fig. 7 shows the picture to be queried. In practice, the infringement characteristics of infringing videos are that most of them crop part of the source frame and even add their own logo information, while cases in which the source video forms only part of the infringing picture remain relatively rare. In this embodiment, the test takes a frame of the video and verifies three cases: the original frame (test01), crop size 1 (test02), and crop size 2 (test05).
As shown in figs. 8-10, the number of returned results decreases as the crop size increases. This is because infringement characteristics were considered when computing feature points: the source videos were themselves cropped before feature extraction. The version used for this demonstration was developed and tested with 3 minutes × 2 segments × 4 scaling scales × 80% of frames (about 40,000 frames in total), and tolerates crops of 10%-20% on each of the top, bottom, left, and right edges.
It should be noted that the above description does not limit the invention, and the invention is not restricted to the above examples; changes, modifications, additions, or substitutions made by those skilled in the art within the spirit and scope of the invention also fall within its protection scope.

Claims (9)

1. A picture searching method based on image feature extraction is characterized by comprising the following specific steps:
step 1: extracting and compressing the characteristics of each frame of picture in M videos to obtain M video characteristic point sets;
step 2: sending the M video feature point sets into a compressed video database, and obtaining a clustering cluster and a cluster sequence set of the M video feature points through clustering operation;
step 3: the picture acquisition end acquires a picture to be queried, and the picture feature sequence to be queried is obtained after passing through the picture feature extraction module; the feature sequence is sent into the compressed video database for clustering to obtain the cluster sequence of the picture to be queried;
step 4: comparing the cluster sequence of the picture to be queried with the cluster sequence sets of the M videos to obtain sequence similarities; extracting the video feature point set and the frame feature sequence corresponding to the highest sequence similarity with the cluster sequence of the picture to be queried;
step 5: extracting a frame picture in a corresponding video according to the video feature point set and the index of the frame feature sequence, and comparing the frame picture with a picture to be queried to obtain a picture similarity;
step 6: if the picture similarity is greater than the picture similarity threshold, outputting the corresponding video or the index of the corresponding video;
In step 1, the specific steps of obtaining M video feature point sets after feature extraction and compression are performed on each frame of picture in M videos are as follows:
step 11: the video acquisition end sequentially acquires a video Ax from the M videos and stores it in the storage module to be extracted;
step 12: the storage module to be extracted obtains the total frame number N of the video Ax, and each frame in the video Ax is sequentially sent to the feature extraction and comparison module;
step 13: the feature extraction and comparison module is used for designating a reference frame, preprocessing the video Ax according to the reference frame, extracting, comparing, screening, deleting and compressing the image features to obtain a video feature point set, wherein the specific content is as follows:
S131: the feature extraction and comparison module acquires the first frame and takes it as the reference frame; after extracting the image features of the first frame, it sorts the extracted features to obtain the reference frame feature sequence and stores the reference frame feature sequence in the feature sequence storage module;
S132: the feature extraction and comparison module acquires the next frame and takes it as the frame to be compared; after extracting the image features of the frame to be compared, it sorts the extracted features to obtain the frame feature sequence to be compared;
S133: comparing the frame characteristic sequence to be compared with the reference frame characteristic sequence;
if the similarity is greater than or equal to the set similarity threshold, discarding the feature sequence of the frame to be compared, and entering step S134;
if the similarity is smaller than the set similarity threshold, the frame feature sequence to be compared becomes the new reference frame feature sequence, that is, the corresponding frame to be compared becomes the new reference frame; the new reference frame feature sequence is stored in order in the feature sequence storage module, and the process enters step S134;
S134: judging whether the Nth frame of the video Ax has been compared; if yes, the feature sequence storage module combines all the stored reference frame feature sequences into a video feature point set and outputs the video feature point set of the video Ax; otherwise, returning to step S132;
step 14: and judging whether all M videos are compressed, if yes, ending the step 1, otherwise, returning to the step 11.
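Read as an algorithm, steps 11 to 14 form a key-frame deduplication loop. Below is a minimal sketch in Python using OpenCV's AKAZE detector (the extractor named in the dependent claims); the claim does not fix how two feature sequences are compared, so the descriptor matching ratio used here is an assumption.

    import cv2

    def extract_video_feature_set(video_path, sim_threshold=0.6):
        akaze = cv2.AKAZE_create()
        matcher = cv2.BFMatcher(cv2.NORM_HAMMING)  # AKAZE descriptors are binary
        cap = cv2.VideoCapture(video_path)
        feature_set = []   # one (keypoints, descriptors) entry per kept reference frame
        ref_desc = None
        while True:
            ok, frame = cap.read()
            if not ok:
                break                                  # S134: all N frames compared
            kps, desc = akaze.detectAndCompute(frame, None)
            if desc is None:
                continue                               # frame with no detectable features
            if ref_desc is None:
                ref_desc = desc                        # S131: first frame is the reference
                feature_set.append((kps, desc))
                continue
            # S133: similarity = fraction of descriptors well matched in the reference
            matches = matcher.knnMatch(desc, ref_desc, k=2)
            good = [m for m in matches
                    if len(m) == 2 and m[0].distance < 0.8 * m[1].distance]
            if len(good) / len(desc) < sim_threshold:  # dissimilar: new reference frame
                ref_desc = desc
                feature_set.append((kps, desc))
        cap.release()
        return feature_set                             # the video feature point set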
2. The image search method based on image feature extraction of claim 1, wherein: in the step 2, the specific step of obtaining a clustering cluster group by clustering M video feature point sets is as follows:
step 21a: randomly selecting k features from the M video feature point sets as the clustering centers of the initial clusters;
step 21b: calculating, for every feature in the M video feature point sets, the clustering distance to each of the k cluster centers, and assigning each feature to the cluster with the minimum clustering distance;
step 21c: for each cluster whose membership has changed, calculating the average clustering distance of that cluster and re-determining its cluster center according to this average;
step 21d: if the cluster center does not change any more, ending the clustering, outputting a cluster group formed by the k cluster groups and the corresponding cluster center, otherwise, returning to the step 21b;
in the step 2, the specific steps of obtaining the cluster sequence set of the M video feature points through clustering operation by the M video feature point sets are as follows:
numbering k cluster groups in the cluster group;
calculating the clustering distance from each feature to k clustering clusters in M video feature point sets in sequence;
and sequencing k clustering distances of each feature, classifying the feature into a cluster corresponding to the minimum clustering distance, and acquiring a cluster number corresponding to the feature until all the features in each video feature point set are classified, so as to obtain M cluster sequence sets.
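Steps 21a to 21d describe a standard k-means iteration, and the second half of the claim is the assignment pass that converts each feature sequence into a cluster sequence. A minimal sketch, assuming real-valued feature vectors under a Euclidean clustering distance (binary AKAZE descriptors would use a Hamming distance instead):

    import numpy as np

    def kmeans(features, k, max_iter=100, seed=0):
        rng = np.random.default_rng(seed)
        centers = features[rng.choice(len(features), size=k, replace=False)]  # step 21a
        for _ in range(max_iter):
            dists = np.linalg.norm(features[:, None, :] - centers[None, :, :], axis=2)
            labels = dists.argmin(axis=1)                                     # step 21b
            new_centers = np.array([features[labels == j].mean(axis=0)       # step 21c
                                    if np.any(labels == j) else centers[j]
                                    for j in range(k)])
            if np.allclose(new_centers, centers):                             # step 21d
                return new_centers
            centers = new_centers
        return centers

    def to_cluster_sequence(feature_seq, centers):
        # Number each feature with the index of its nearest cluster center.
        d = np.linalg.norm(feature_seq[:, None, :] - centers[None, :, :], axis=2)
        return d.argmin(axis=1).tolist()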
3. The picture searching method based on image feature extraction as claimed in claim 2, wherein in step 3, the picture acquisition end acquires the picture to be queried, and the picture feature sequence to be queried is obtained after feature extraction; the feature sequence is sent into the compressed video database for clustering, and the specific steps of obtaining the cluster sequence of the picture to be queried are as follows:
Step 31: after the picture obtaining end obtains the picture to be queried, the picture to be queried is sent to a picture feature extraction module;
step 32: the picture feature extraction module is combined with a third party visual library, extracts a picture feature sequence to be queried and sends the picture feature sequence to the compressed video database;
step 33: calculating the clustering distance of every feature in the picture feature sequence to be queried to each cluster center, classifying each feature into the cluster with the minimum clustering distance, and obtaining the picture cluster sequence to be queried corresponding to the picture feature sequence to be queried.
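On the query side, steps 31 to 33 reuse the same two operations: feature extraction through the third-party visual library and nearest-center assignment. A sketch building on the OpenCV calls and the to_cluster_sequence() helper from the sketch above (the float cast is an illustrative shortcut to match that sketch's real-valued assumption):

    import cv2
    import numpy as np

    def query_cluster_sequence(picture_path, centers):
        img = cv2.imread(picture_path)
        _, desc = cv2.AKAZE_create().detectAndCompute(img, None)
        if desc is None:
            return []   # no detectable features in the query picture
        return to_cluster_sequence(desc.astype(np.float64), centers)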
4. The image search method based on image feature extraction of claim 1, wherein: the feature extraction and comparison module and the picture feature extraction module are connected with a third-party visual library;
the characteristic extraction and comparison module is at least provided with a frame sending unit and a characteristic sequence receiving unit, wherein the frame sending unit is used for sending the frame to be compared to the third-party visual library to extract the characteristic of the image of the frame to be compared, and feeding the obtained characteristic sequence of the frame to be compared back to the characteristic sequence receiving unit;
the image feature extraction module is provided with an image sending unit and an image feature sequence receiving unit to be queried, and the image sending unit is used for sending the image to be queried to the third-party visual library for feature extraction, and feeding the obtained image feature sequence to be queried back to the image feature sequence receiving unit to be queried.
5. The image search method based on image feature extraction of claim 4, wherein: the feature extraction and comparison module is provided with a feature point extraction tool, and the feature point extraction tool is provided with a video reading function block, a frame taking function block, an AKAZE algorithm calling function block, a feature point normalization function block and a file writing function block;
and the feature extraction and comparison module is also internally provided with video feature parameters, wherein the video feature parameters comprise the similarity threshold value, a video input path and a video output path.
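The function blocks listed in this claim map onto a small pipeline. A sketch with an assumed csv column layout and illustrative paths; the claim itself specifies only the parameter names:

    import csv
    import cv2

    VIDEO_FEATURE_PARAMS = {
        "similarity_threshold": 0.6,             # the set similarity threshold
        "video_input_path": "input.mp4",         # illustrative input path
        "feature_output_path": "features.csv",   # illustrative output path
    }

    def run_feature_point_extraction(params=VIDEO_FEATURE_PARAMS):
        akaze = cv2.AKAZE_create()
        cap = cv2.VideoCapture(params["video_input_path"])     # video reading block
        with open(params["feature_output_path"], "w", newline="") as f:
            writer = csv.writer(f)                             # file writing block
            frame_no = 0
            while True:
                ok, frame = cap.read()                         # frame taking block
                if not ok:
                    break
                h, w = frame.shape[:2]
                kps = akaze.detect(frame, None)                # AKAZE calling block
                for kp in kps:
                    # feature point normalization block: coordinates scaled to [0, 1]
                    writer.writerow([frame_no, kp.pt[0] / w, kp.pt[1] / h,
                                     kp.angle, kp.size, kp.response,
                                     kp.class_id, kp.octave])
                frame_no += 1
        cap.release()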
6. The image search method based on image feature extraction of claim 1, wherein: the third party visual library is an OpenCV visual library.
7. The image search method based on image feature extraction of claim 1, wherein: the unique characteristic attributes of the frame characteristics in the frame characteristic sequence comprise a characteristic abscissa, a characteristic ordinate, a characteristic angle, a characteristic size, a characteristic weight, a characteristic expansion and a characteristic pyramid layer;
all the reference frame feature sequences in the video feature point set are sequentially arranged according to the stored sequence;
any frame feature sequence or picture feature sequence to be queried comprises a frame sequence number, a feature sequence start label, feature sequence content and a feature sequence end label which are connected in sequence;
the feature sequence start label consists of an integer of X bytes, and the feature sequence end label consists of an integer of Y bytes;
the start point of the video feature point set is connected with a start mark, and every two adjacent reference frame feature sequences are connected through a feature connector.
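A byte-level sketch of this layout using Python's struct module. The claim leaves the label widths X and Y open, so they are parameters below, and the magic label values and little-endian integer encoding are purely illustrative assumptions:

    import struct

    START_LABEL = 0x0000AAAA   # hypothetical X-byte start label value
    END_LABEL = 0x0000BBBB     # hypothetical Y-byte end label value

    def pack_frame_feature_sequence(frame_no, features, x_bytes=4, y_bytes=4):
        fmt = {2: "<H", 4: "<I", 8: "<Q"}
        out = struct.pack("<I", frame_no)              # frame sequence number
        out += struct.pack(fmt[x_bytes], START_LABEL)  # feature sequence start label
        for feat in features:
            # (x, y, angle, size, weight, expansion, pyramid layer) per claim 7
            out += struct.pack("<7f", *feat)           # feature sequence content
        out += struct.pack(fmt[y_bytes], END_LABEL)    # feature sequence end label
        return out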
8. An image search system based on image feature extraction, characterized in that: the system comprises a video acquisition end and a picture acquisition end;
the video acquisition end is connected with a storage module to be extracted, the storage module to be extracted is connected with a feature extraction and comparison module, a frame sending unit and a feature sequence receiving unit are arranged in the feature extraction and comparison module, a third-party visual library connecting end is arranged on the feature extraction and comparison module and used for respectively connecting the frame sending unit and the feature sequence receiving unit with a third-party visual library, a feature sequence storage module is also connected with the feature extraction and comparison module, the feature sequence storage module is connected with a compressed video database, and the compressed video database is connected with a picture searching module;
the image acquisition end is connected with the compressed video database through an image feature extraction module, a third-party visual library connection end is arranged on the image feature extraction module and used for extracting an image feature sequence to be queried, the image acquisition end is also connected with the image searching module, and the image searching module is connected with a source video library;
The specific working steps of the image search system based on image feature extraction are as follows:
step 1: extracting and compressing the characteristics of each frame of picture in M videos to obtain M video characteristic point sets;
step 2: sending the M video feature point sets into a compressed video database, and obtaining a clustering cluster and a cluster sequence set of the M video feature points through clustering operation;
step 3: the picture acquisition end acquires a picture to be queried, and the picture feature sequence to be queried is obtained after passing through the picture feature extraction module; the feature sequence is sent into the compressed video database for clustering to obtain the cluster sequence of the picture to be queried;
step 4: comparing the cluster sequence of the picture to be queried with the cluster sequence sets of the M videos to obtain sequence similarities; extracting the video feature point set and the frame feature sequence corresponding to the highest sequence similarity with the cluster sequence of the picture to be queried;
step 5: extracting a frame picture in a corresponding video according to the video feature point set and the index of the frame feature sequence, and comparing the frame picture with a picture to be queried to obtain a picture similarity;
step 6: if the picture similarity is greater than the picture similarity threshold, outputting the corresponding video or the index of the corresponding video;
In step 1, the specific steps of obtaining M video feature point sets after feature extraction and compression are performed on each frame of picture in M videos are as follows:
step 11: the video acquisition end sequentially acquires a video Ax from the M videos and stores it in the storage module to be extracted;
step 12: the storage module to be extracted obtains the total frame number N of the video Ax, and each frame in the video Ax is sequentially sent to the feature extraction and comparison module;
step 13: the feature extraction and comparison module is used for designating a reference frame, preprocessing the video Ax according to the reference frame, extracting, comparing, screening, deleting and compressing the image features to obtain a video feature point set, wherein the specific content is as follows:
S131: the feature extraction and comparison module acquires the first frame and takes it as the reference frame; after extracting the image features of the first frame, it sorts the extracted features to obtain the reference frame feature sequence and stores the reference frame feature sequence in the feature sequence storage module;
S132: the feature extraction and comparison module acquires the next frame and takes it as the frame to be compared; after extracting the image features of the frame to be compared, it sorts the extracted features to obtain the frame feature sequence to be compared;
S133: comparing the frame characteristic sequence to be compared with the reference frame characteristic sequence;
if the similarity is greater than or equal to the set similarity threshold, discarding the feature sequence of the frame to be compared, and entering step S134;
if the similarity is smaller than the set similarity threshold, the frame feature sequence to be compared becomes the new reference frame feature sequence, that is, the corresponding frame to be compared becomes the new reference frame; the new reference frame feature sequence is stored in order in the feature sequence storage module, and the process enters step S134;
S134: judging whether the Nth frame of the video Ax has been compared; if yes, the feature sequence storage module combines all the stored reference frame feature sequences into a video feature point set and outputs the video feature point set of the video Ax; otherwise, returning to step S132;
step 14: and judging whether all M videos are compressed, if yes, ending the step 1, otherwise, returning to the step 11.
9. The image feature extraction based image search system of claim 8, wherein: a clustering unit and a data unit are arranged in the compressed video database;
the clustering unit is used for carrying out clustering operation on the M video feature point sets to obtain a clustering cluster and a cluster sequence set of the M video feature points; clustering all the features in the feature sequence of the picture to be queried to obtain a cluster sequence of the picture to be queried;
M video feature point sets, M cluster sequence sets of video feature points, a source video index of each video, any picture feature sequence to be queried and any picture cluster sequence to be queried are stored in the data unit;
the image searching module is provided with a cluster comparison unit, a picture acquisition unit, a frame picture acquisition unit and a source video acquisition unit;
the cluster comparison unit is used for comparing the cluster sequence of the picture to be queried with the cluster sequence sets of the M videos to obtain sequence similarities;
the picture acquisition unit is used for acquiring a picture to be queried;
the frame picture acquisition unit is used for acquiring a frame picture corresponding to the highest sequence similarity;
the source video acquisition unit is used for acquiring a source video corresponding to the highest sequence similarity.
CN201910095245.XA 2019-01-31 2019-01-31 Image searching method and system based on image feature extraction Active CN111506772B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910095245.XA CN111506772B (en) 2019-01-31 2019-01-31 Image searching method and system based on image feature extraction


Publications (2)

Publication Number Publication Date
CN111506772A CN111506772A (en) 2020-08-07
CN111506772B true CN111506772B (en) 2023-05-09

Family

ID=71874021

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910095245.XA Active CN111506772B (en) 2019-01-31 2019-01-31 Image searching method and system based on image feature extraction

Country Status (1)

Country Link
CN (1) CN111506772B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112100412B (en) * 2020-09-01 2024-04-12 深圳市欢太科技有限公司 Picture retrieval method, device, computer equipment and storage medium
CN112861875B (en) * 2021-01-20 2022-10-04 西南林业大学 Method for distinguishing different wood products
CN116128533B (en) * 2023-03-06 2023-07-28 广西螺霸王食品科技有限公司 Food sales data management system
CN116881485A (en) * 2023-06-19 2023-10-13 北京百度网讯科技有限公司 Method and device for generating image retrieval index, electronic equipment and medium

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2010015286A (en) * 2008-07-02 2010-01-21 Olympus Corp Server, terminal device, program, information storage medium, image retrieval method, and information presentation method
CN102821323A (en) * 2012-08-01 2012-12-12 成都理想境界科技有限公司 Video playing method, video playing system and mobile terminal based on augmented reality technique
CN103294813A (en) * 2013-06-07 2013-09-11 北京捷成世纪科技股份有限公司 Sensitive image search method and device
CN105653700A (en) * 2015-03-13 2016-06-08 Tcl集团股份有限公司 Video search method and system
CN105917359A (en) * 2013-10-21 2016-08-31 微软技术许可有限责任公司 Mobile video search
CN106021321A (en) * 2016-05-06 2016-10-12 成都索贝数码科技股份有限公司 Picture based online real-time video search method
CN107220585A (en) * 2017-03-31 2017-09-29 南京邮电大学 A kind of video key frame extracting method based on multiple features fusion clustering shots

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3311077B2 (en) * 1993-05-06 2002-08-05 松下電器産業株式会社 Image retrieval device
FR2825817B1 (en) * 2001-06-07 2003-09-19 Commissariat Energie Atomique IMAGE PROCESSING METHOD FOR THE AUTOMATIC EXTRACTION OF SEMANTIC ELEMENTS
FR2835940B1 (en) * 2002-02-08 2004-04-23 Thomson Licensing Sa METHOD FOR SEARCHING FOR THE NEAREST NEIGHBORS OF A REQUESTED VECTOR
US9201958B2 (en) * 2013-10-24 2015-12-01 TCL Research America Inc. Video object retrieval system and method
GB201405045D0 (en) * 2014-03-21 2014-05-07 Secr Defence Recognition of objects within a video
JP6395506B2 (en) * 2014-08-22 2018-09-26 キヤノン株式会社 Image processing apparatus and method, program, and imaging apparatus
CN105678244B (en) * 2015-12-31 2018-12-18 北京理工大学 A kind of near video search method based on improved edit-distance
CN106503633A (en) * 2016-10-10 2017-03-15 上海电机学院 The method for building up in face characteristic storehouse in a kind of video image
CN107392238B (en) * 2017-07-12 2021-05-04 华中师范大学 Outdoor plant knowledge expansion learning system based on mobile visual search


Also Published As

Publication number Publication date
CN111506772A (en) 2020-08-07

Similar Documents

Publication Publication Date Title
CN111506772B (en) Image searching method and system based on image feature extraction
US10803310B2 (en) System for simplified generation of systems for broad area geospatial object detection
US10636169B2 (en) Synthesizing training data for broad area geospatial object detection
US9053386B2 (en) Method and apparatus of identifying similar images
CN106485196B (en) Visual search method, corresponding system, device and computer program product
US8527519B2 (en) Query system for a hybrid voxel and feature database
US11562179B2 (en) Artificial intelligence system for inspecting image reliability
CN113298197B (en) Data clustering method, device, equipment and readable storage medium
KR20200002332A (en) Terminal apparatus and method for searching image using deep learning
WO2018222775A1 (en) Broad area geospatial object detection
CN108229358B (en) Index establishing method and device, electronic equipment and computer storage medium
CN114372532A (en) Method, device, equipment, medium and product for determining label marking quality
Aved Scene understanding for real time processing of queries over big data streaming video
CN116524357A (en) High-voltage line bird nest detection method, model training method, device and equipment
CN113590854B (en) Data processing method, data processing equipment and computer readable storage medium
CN115146103A (en) Image retrieval method, image retrieval apparatus, computer device, storage medium, and program product
CN115661472A (en) Image duplicate checking method and device, computer equipment and storage medium
CN112069269B (en) Big data and multidimensional feature-based data tracing method and big data cloud server
Gao et al. Data-driven lightweight interest point selection for large-scale visual search
CN110413662B (en) Multichannel economic data input system, acquisition system and method
CN115587297A (en) Method, apparatus, device and medium for constructing image recognition model and image recognition
CN111510724A (en) Equivalent video compression storage method and system based on image feature extraction
PORUMB et al. Techniques Used for Geospatial Big Data Collection, Storage and Analysis
CN114708449B (en) Similar video determination method, and training method and device of example characterization model
Courtney Big data analytics in high-throughput phenotyping

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20231010

Address after: No. 12-4, No. 789 Jingwei Avenue, Yuzhong District, Chongqing, 400000

Patentee after: CHONGQING XIAOXI INTELLIGENT TECHNOLOGY CO.,LTD.

Address before: Room 02B-387, Building B (2nd Floor), No. A28 Information Road, Haidian District, Beijing, 100089

Patentee before: Beijing Xiaoxi Intelligent Technology Center (limited partnership)