CN108416013B - Video matching, retrieving, classifying and recommending methods and devices and electronic equipment - Google Patents

Video matching, retrieving, classifying and recommending methods and devices and electronic equipment

Info

Publication number
CN108416013B
CN108416013B (application CN201810177607.5A)
Authority
CN
China
Prior art keywords
video
core
frame
core scene
characteristic value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810177607.5A
Other languages
Chinese (zh)
Other versions
CN108416013A (en)
Inventor
潘凌越
傅一峰
吴金贵
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing QIYI Century Science and Technology Co Ltd
Original Assignee
Beijing QIYI Century Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing QIYI Century Science and Technology Co Ltd
Priority to CN201810177607.5A
Publication of CN108416013A
Application granted
Publication of CN108416013B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70: Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/78: Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/783: Retrieval characterised by using metadata automatically derived from the content
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/22: Matching criteria, e.g. proximity measures
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/23: Clustering techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Library & Information Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Databases & Information Systems (AREA)
  • Multimedia (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Image Analysis (AREA)

Abstract

The embodiments of the invention provide video matching, retrieval, classification and recommendation methods and apparatuses, and an electronic device. The video matching method comprises the following steps: acquiring a video to be matched; extracting a plurality of key frames from the video to be matched; extracting an image characteristic value of each key frame of the video to be matched; using a density clustering algorithm, selecting, according to the image characteristic value of each key frame, key frames meeting preset conditions from the plurality of key frames of the video to be matched as core scene frames to be matched; judging the similarity between the core scene frame to be matched and the pre-stored core scene frames of each original video; and, if any core scene frame of an original video is similar to the core scene frame to be matched, determining that original video as a target video matching the video to be matched. The embodiments of the invention solve the problem of heavy computation caused by the excessive frame count of long videos and by frequent scene switching.

Description

Video matching, retrieving, classifying and recommending methods and devices and electronic equipment
Technical Field
The invention relates to the technical field of video retrieval, and in particular to video matching, retrieval, classification and recommendation methods and apparatuses, and an electronic device.
Background
With the rapid development of computer, network and multimedia technology, video data is widely used in fields such as entertainment, education and commerce, and producing and distributing video has become ever more convenient and rapid, so the amount of multimedia video content is growing explosively. At the same time, more and more users watch videos such as movies and television series online, and the demand for video retrieval keeps increasing.
Similarity matching between videos is required in video retrieval, classification and recommendation. Take video retrieval as an example; its main steps are as follows: each original video in a video database is segmented in advance into a number of video segments, video features are extracted from each segment, and the features of every segment of every original video are stored. At retrieval time, the video features of the video to be retrieved are extracted and matched for similarity against the stored features of each segment of each original video, and the successfully matched original videos are returned to the user as the retrieval result.
At present, the basic process of video matching includes: dividing the video into a plurality of sub-video segments based on continuous frames of the video, extracting feature information in each sub-video segment to construct a feature vector, and performing similarity matching on the original video in the database and the video to be matched by measuring the distance of the feature vector of each video segment and the like.
However, the inventor finds that the prior art has at least the following problems in the process of implementing the invention:
the existing method works well for matching and retrieving short videos or videos with a single scene, but for long videos or videos that switch scenes frequently, the number of sub-video segments to be cut out grows accordingly, so video segmentation, feature extraction and similarity judgment all become computationally expensive.
Disclosure of Invention
The embodiments of the invention aim to provide video matching, retrieval, classification and recommendation methods and apparatuses, and an electronic device, so as to solve the problem of heavy computation caused by the excessive frame count of long videos and by frequent scene switching. The specific technical scheme is as follows:
in order to achieve the above object, an embodiment of the present invention discloses a method for video matching, where the method includes:
acquiring a video to be matched;
extracting a plurality of key frames from the video to be matched;
extracting an image characteristic value of each key frame of the video to be matched;
selecting key frames meeting preset conditions from a plurality of key frames of the video to be matched as core scene frames to be matched according to the image characteristic value of each key frame by adopting a density clustering algorithm;
carrying out similarity judgment on the core scene frame to be matched and the pre-stored core scene frames of each original video; the core scene frame of the original video is obtained by calculating each original video by adopting the density clustering algorithm in advance;
and if any core scene frame of an original video is similar to the core scene frame to be matched, determining the original video as a target video matched with the video to be matched.
Optionally, the step of extracting a plurality of key frames from the video to be matched includes:
and extracting a plurality of video frames from the video to be matched as key frames according to a preset time interval.
Optionally, the step of extracting an image feature value of each key frame of the video to be matched includes:
and extracting the color distribution characteristic value of each key frame as the image characteristic value of each key frame aiming at each extracted key frame.
Optionally, the selecting, by using a density clustering algorithm, a key frame meeting a preset condition from a plurality of key frames of the video to be matched as a core scene frame to be matched according to an image feature value of each key frame includes:
adopting a density clustering algorithm, taking the color distribution characteristic value of each key frame as a sample, and calculating the Euclidean distance between every two samples;
calculating the local density of each sample according to the Euclidean distance between every two samples and a preset neighborhood radius;
determining a core object of a video to be matched from the samples according to the local density of each sample and the preset minimum neighborhood point number;
and determining the key frame corresponding to the core object of the video to be matched with the maximum local density as a core scene frame to be matched.
Optionally, the step of judging the similarity between the core scene frame to be matched and the pre-stored core scene frames of each original video comprises:
obtaining each core scene frame of an original video and a color distribution characteristic value of each core scene frame;
using a density clustering algorithm, taking the color distribution characteristic value of the core scene frame to be matched and each color distribution characteristic value of the original video as samples, and respectively calculating the Euclidean distances to be judged between the color distribution characteristic value of the core scene frame to be matched and each color distribution characteristic value of the original video;
if one Euclidean distance to be judged is smaller than the preset neighborhood radius, determining that the core scene frame to be matched is similar to a core scene frame of the original video;
or,
if no Euclidean distance to be judged is smaller than the preset neighborhood radius, determining that the core scene frame to be matched is not similar to any core scene frame of the original video; and if the core scene frame to be matched is not similar to any core scene frame of the original video, obtaining each core scene frame of the next original video together with its color distribution characteristic values for similarity judgment.
Optionally, the core scene frame of the original video is obtained by calculating each original video in advance by using the density clustering algorithm, and the method includes:
extracting a plurality of key frames of an original video aiming at the original video;
extracting a color distribution characteristic value of each key frame of the original video;
adopting a density clustering algorithm, taking the color distribution characteristic value of each key frame as a sample, and calculating the Euclidean distance between every two samples;
calculating the local density of each sample according to the Euclidean distance between every two samples and a preset neighborhood radius;
determining core objects of the original video from the samples according to the local density of each sample and preset minimum neighborhood point numbers, and taking all the core objects of the original video as a first core object set;
taking any core object in the first core object set as a seed, finding the samples density-reachable from the seed, and generating a key frame cluster;
for each key frame clustering cluster, selecting any core object in the cluster as a first core object, and taking a core object with local density greater than that of the first core object as a second core object;
calculating the distance between the first core object and each second core object, and determining the distance between the second core object and the first core object, which is the minimum distance, as the distance of the high local density point of the first core object;
determining a core scene frame of each key frame cluster according to the local density and the high local density point distance of each core object in each key frame cluster;
and taking the core scene frame of each determined key frame cluster as the core scene frame of the original video.
Optionally, the step of taking any core object in the first core object set as a seed, finding the samples density-reachable from the seed, and generating a key frame cluster includes:
taking any unclustered core object in the first core object set as a seed, and finding all samples density-reachable from the seed;
generating a first key frame cluster by the key frames corresponding to all the searched samples;
taking core objects except the core objects contained in the generated first key frame cluster in the first core object set as a second core object set;
judging whether the second core object set is an empty set or not;
if the second core object set is not an empty set, taking the second core object set as the first core object set, and returning to the step of taking any core object in the first core object set as a seed and finding all samples density-reachable from the seed;
and if the second core object set is an empty set, the step of generating the key frame cluster is finished.
In order to achieve the above object, an embodiment of the present invention discloses a method for video retrieval, where the method includes:
acquiring a video to be retrieved;
extracting a plurality of key frames from the video to be retrieved;
extracting an image characteristic value of each key frame of the video to be retrieved;
selecting key frames meeting preset conditions from a plurality of key frames of the video to be retrieved as core scene frames to be retrieved by adopting a density clustering algorithm according to the image characteristic value of each key frame;
carrying out similarity judgment on the core scene frame to be retrieved and the core scene frames of the pre-stored original videos; the core scene frame of the original video is obtained by calculating each original video by adopting the density clustering algorithm in advance;
if any core scene frame of an original video is similar to the core scene frame to be retrieved, determining the original video as a target video matched with the video to be retrieved;
and outputting the target video as a retrieval result to a user.
In order to achieve the above object, an embodiment of the present invention discloses a method for video recommendation, where the method includes:
determining a target user;
acquiring a video to be recommended;
extracting a plurality of key frames from the video to be recommended;
extracting an image characteristic value of each key frame of the video to be recommended;
selecting key frames meeting preset conditions from a plurality of key frames of the video to be recommended as core scene frames to be recommended according to the image characteristic value of each key frame by adopting a density clustering algorithm;
performing similarity judgment on the core scene frame to be recommended and the prestored core scene frames of the historical videos watched by the target user; the core scene frame of the historical videos is obtained by calculating each historical video watched by the target user by adopting the density clustering algorithm in advance;
and if the core scene frame to be recommended is similar to any core scene frame of a historical video, recommending the video to be recommended to the target user as a recommended video.
In order to achieve the above object, an embodiment of the present invention discloses a method for video classification, where the method includes:
acquiring a video to be classified;
extracting a plurality of key frames from the video to be classified;
extracting an image characteristic value of each key frame of the video to be classified;
selecting key frames meeting preset conditions from a plurality of key frames of the video to be classified as core scene frames to be classified according to the image characteristic value of each key frame by adopting a density clustering algorithm;
carrying out similarity judgment on the core scene frame to be classified and the pre-stored core scene frames of the original videos; the core scene frame of the original video is obtained by calculating each original video by adopting the density clustering algorithm in advance;
if any core scene frame of an original video is similar to the core scene frame to be classified, determining the original video as a target video matched with the video to be classified;
and outputting the type of the target video to a user as the type of the video to be classified.
In order to achieve the above object, an embodiment of the present invention discloses a video matching apparatus, including:
the first acquisition module is used for acquiring a video to be matched;
the first extraction module is used for extracting a plurality of key frames from the video to be matched;
the second extraction module is used for extracting the image characteristic value of each key frame of the video to be matched;
the first selection module is used for selecting key frames meeting preset conditions from a plurality of key frames of the video to be matched as core scene frames to be matched according to the image characteristic value of each key frame by adopting a density clustering algorithm;
the first judgment module is used for carrying out similarity judgment on the core scene frame to be matched and the core scene frames of the pre-stored original videos; the core scene frame of the original video is obtained by calculating each original video by adopting the density clustering algorithm in advance;
the first determining module is used for determining an original video as a target video matched with the video to be matched if any core scene frame of the original video is similar to the core scene frame to be matched.
In order to achieve the above object, an embodiment of the present invention discloses a video retrieval apparatus, including:
the second acquisition module is used for acquiring a video to be retrieved;
the third extraction module is used for extracting a plurality of key frames from the video to be retrieved;
the fourth extraction module is used for extracting the image characteristic value of each key frame of the video to be retrieved;
the second selection module is used for selecting key frames meeting preset conditions from a plurality of key frames of the video to be retrieved as core scene frames to be retrieved by adopting a density clustering algorithm according to the image characteristic value of each key frame;
the second judgment module is used for carrying out similarity judgment on the core scene frame to be retrieved and the core scene frames of the pre-stored original videos; the core scene frame of the original video is obtained by calculating each original video by adopting the density clustering algorithm in advance;
the second determining module is used for determining an original video as a target video matched with the video to be retrieved if any core scene frame of the original video is similar to the core scene frame to be retrieved;
and the first output module is used for outputting the target video serving as a retrieval result to a user.
In order to achieve the above object, an embodiment of the present invention discloses a video recommendation apparatus, including:
the third determining module is used for determining a target user;
the third acquisition module is used for acquiring a video to be recommended;
the fifth extraction module is used for extracting a plurality of key frames from the video to be recommended;
the sixth extraction module is used for extracting the image characteristic value of each key frame of the video to be recommended;
the third selection module is used for selecting a key frame meeting preset conditions from a plurality of key frames of the video to be recommended as a core scene frame to be recommended according to the image characteristic value of each key frame by adopting a density clustering algorithm;
the third judgment module is used for carrying out similarity judgment on the core scene frame to be recommended and the prestored core scene frames of the historical videos watched by the target user; the core scene frame of the historical videos is obtained by calculating each historical video watched by the target user by adopting the density clustering algorithm in advance;
and the second output module is used for recommending the video to be recommended to the target user as the recommended video if the core scene frame to be recommended is similar to any core scene frame of a historical video.
In order to achieve the above object, an embodiment of the present invention discloses a video classification apparatus, including:
the fourth acquisition module is used for acquiring the video to be classified;
the seventh extraction module is used for extracting a plurality of key frames from the video to be classified;
the eighth extraction module is used for extracting the image characteristic value of each key frame of the video to be classified;
the fourth selection module is used for selecting key frames meeting preset conditions from a plurality of key frames of the video to be classified as core scene frames to be classified according to the image characteristic value of each key frame by adopting a density clustering algorithm;
the fourth judgment module is used for judging the similarity between the core scene frame to be classified and the core scene frames of the pre-stored original videos; the core scene frame of the original video is obtained by calculating each original video by adopting the density clustering algorithm in advance;
a fourth determining module, configured to determine, if any core scene frame of an original video is similar to the core scene frame to be classified, the original video as a target video matching the video to be classified;
and the third output module is used for outputting the type of the target video as the type of the video to be classified to a user.
In another aspect of the present invention, there is also provided an electronic device, including a processor, a communication interface, a memory, and a communication bus, where the processor, the communication interface, and the memory complete communication with each other through the communication bus;
the memory is used for storing a computer program;
the processor is used for realizing the video matching method in the aspect of the above object when executing the program stored in the memory.
In yet another aspect of the present invention, there is also provided a computer-readable storage medium having stored therein instructions, which when run on a computer, cause the computer to perform any one of the above-described methods of video matching.
In yet another aspect of the present invention, the present invention also provides a computer program product containing instructions, which when run on a computer, causes the computer to execute any of the above-mentioned methods for video matching.
In another aspect of the present invention, there is also provided an electronic device, including a processor, a communication interface, a memory, and a communication bus, where the processor, the communication interface, and the memory complete communication with each other through the communication bus;
the memory is used for storing a computer program;
the processor is configured to implement the video retrieval method according to the above object when executing the program stored in the memory.
In yet another aspect of the present invention, there is also provided a computer-readable storage medium having stored therein instructions, which when run on a computer, cause the computer to execute the above-mentioned method for video retrieval.
In another aspect of the present invention, the present invention also provides a computer program product containing instructions, which when run on a computer, causes the computer to execute the above-mentioned video retrieval method.
In another aspect of the present invention, there is also provided an electronic device, including a processor, a communication interface, a memory, and a communication bus, where the processor, the communication interface, and the memory complete communication with each other through the communication bus;
the memory is used for storing a computer program;
the processor is configured to implement the video recommendation method according to the above object when executing the program stored in the memory.
In yet another aspect of the present invention, there is also provided a computer-readable storage medium having stored therein instructions, which when executed on a computer, cause the computer to execute the above-mentioned video recommendation method.
In another aspect of the present invention, the present invention also provides a computer program product containing instructions, which when run on a computer, causes the computer to execute the above-mentioned video recommendation method.
In another aspect of the present invention, there is also provided an electronic device, including a processor, a communication interface, a memory, and a communication bus, where the processor, the communication interface, and the memory complete communication with each other through the communication bus;
the memory is used for storing a computer program;
the processor is configured to implement the video classification method according to the above object when executing the program stored in the memory.
In yet another aspect of the present invention, there is also provided a computer-readable storage medium having stored therein instructions, which when run on a computer, cause the computer to execute the above-mentioned method of video classification.
In another aspect of the present invention, the present invention also provides a computer program product containing instructions, which when run on a computer, causes the computer to execute the above-mentioned video classification method.
The video matching, retrieval, classification and recommendation methods and apparatuses and the electronic device provided by the embodiments of the invention are based on a density clustering algorithm from machine learning. They make full use of the essential characteristic that a video is usually composed of a number of similar scenes, ignore the continuity of the video, extract the video key frames and obtain their image characteristic values, then select from the key frames, according to those image characteristic values, the key frames meeting preset conditions as core scene frames, and finally reduce the continuous video to a small number of core scene frames. The embodiments of the invention extract features based on the video content, overcoming problems such as the large information content of video and the complexity of feature extraction, and solving the heavy computation during video segmentation, feature extraction and similarity judgment caused by the excessive frame count of long videos and frequent scene switching. Of course, implementing any product or method of the invention does not necessarily require all of the above advantages to be achieved at the same time.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below.
Fig. 1 is a flowchart of a video matching method according to an embodiment of the present invention;
FIG. 2 is a flowchart of a method for selecting a core scene frame to be matched according to an embodiment of the present invention;
fig. 3 is a flowchart of a method for determining similarity between a core scene frame to be matched and a core scene frame of a pre-stored original video according to an embodiment of the present invention;
FIG. 4 is a flowchart illustrating a method for computing an original video to obtain a core scene frame of the original video according to an embodiment of the present invention;
FIG. 5 is a flowchart of a method for generating a key frame cluster according to an embodiment of the present invention;
fig. 6 is a flowchart of a method for video retrieval according to an embodiment of the present invention;
fig. 7 is a flowchart of a method for video recommendation according to an embodiment of the present invention;
fig. 8 is a flowchart of a video classification method according to an embodiment of the present invention;
fig. 9 is a schematic structural diagram of a video matching apparatus according to an embodiment of the present invention;
FIG. 10 is a block diagram of a first selection module according to an embodiment of the present disclosure;
FIG. 11 is a schematic structural diagram of a first determining module according to an embodiment of the present disclosure;
FIG. 12 is a block diagram of an original video core scene frame determination module according to an embodiment of the present invention;
FIG. 13 is a schematic structural diagram of a first generation submodule in an embodiment of the present invention;
fig. 14 is a schematic structural diagram of a video retrieval apparatus according to an embodiment of the present invention;
fig. 15 is a schematic structural diagram of a video recommendation apparatus according to an embodiment of the present invention;
fig. 16 is a schematic structural diagram of a video classification apparatus according to an embodiment of the present invention;
fig. 17 is a schematic structural diagram of a first electronic device according to an embodiment of the present invention;
fig. 18 is a schematic structural diagram of a second electronic device according to an embodiment of the invention;
fig. 19 is a schematic structural diagram of a third electronic device according to an embodiment of the invention;
fig. 20 is a schematic structural diagram of a fourth electronic device according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be described below with reference to the drawings in the embodiments of the present invention.
To solve the prior-art problem that, for a long video or a video with a high scene switching frequency, the number of sub-video segments to be cut out increases and the computation during video segmentation, feature extraction and similarity judgment becomes heavy, the embodiments of the invention provide video matching, retrieval, classification and recommendation methods and apparatuses and an electronic device.
The video matching, retrieving, classifying and recommending method provided by the embodiment of the invention is realized based on a density clustering algorithm.
For clarity, the density clustering algorithm and the related definitions are introduced:
the density clustering algorithm is used for observing the direct continuity of the samples from the perspective of the sample density, and continuously expanding the cluster based on the connectable samples to obtain the final cluster result. In the embodiment of the present invention, the Density Clustering algorithm is used as an example of a DBSCAN (Density-Based Spatial Clustering of Applications with Noise, Density-Based Clustering method) algorithm, and the Density Clustering algorithm described herein is equivalent to the DBSCAN algorithm, but the Density Clustering algorithm that can be used in practical Applications is not limited thereto.
The definitions used by the density clustering algorithm in the embodiments of the invention mainly include the following.
Assume the sample set is $D = (x_1, x_2, \ldots, x_m)$. Then:
$\epsilon$-neighborhood: for $x_j \in D$, its $\epsilon$-neighborhood contains the samples of $D$ whose distance from $x_j$ is at most $\epsilon$, i.e. $N_\epsilon(x_j) = \{x_i \in D \mid \mathrm{dist}(x_i, x_j) \le \epsilon\}$, and the size of this subsample set is written $|N_\epsilon(x_j)|$;
Core object: any sample $x_j \in D$ is a core object if its $\epsilon$-neighborhood contains at least MinPts samples, i.e. if $|N_\epsilon(x_j)| \ge \mathrm{MinPts}$, where MinPts is called the minimum neighborhood point number;
Directly density-reachable: if $x_j$ lies in the $\epsilon$-neighborhood of $x_i$ and $x_i$ is a core object, then $x_j$ is said to be directly density-reachable from $x_i$;
Density-reachable: for $x_i$ and $x_j$, if there is a sample sequence $p_1, p_2, \ldots, p_n$ satisfying $p_1 = x_i$, $p_n = x_j$, and $p_{i+1}$ directly density-reachable from $p_i$, then $x_j$ is said to be density-reachable from $x_i$;
Density-connected: for $x_i$ and $x_j$, if there is a sample $x_k$ such that both $x_i$ and $x_j$ are density-reachable from $x_k$, then $x_i$ and $x_j$ are said to be density-connected.
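The two base definitions translate directly into code. The following is a minimal sketch, assuming NumPy and Euclidean distance; the names eps_neighborhood, is_core_object, eps and min_pts are illustrative and not from the patent.

import numpy as np

def eps_neighborhood(D: np.ndarray, j: int, eps: float) -> np.ndarray:
    """Indices of the samples of D lying within distance eps of sample x_j."""
    return np.flatnonzero(np.linalg.norm(D - D[j], axis=1) <= eps)

def is_core_object(D: np.ndarray, j: int, eps: float, min_pts: int) -> bool:
    """x_j is a core object iff |N_eps(x_j)| >= MinPts."""
    return eps_neighborhood(D, j, eps).size >= min_pts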
The following describes in detail a video matching method, a video retrieval method, a video classification method, a video recommendation device, and an electronic device, which are provided by the embodiments of the present invention.
Referring to fig. 1, fig. 1 is a method for video matching according to an embodiment of the present invention, where the method may include:
and S110, acquiring the video to be matched.
And S120, extracting a plurality of key frames from the video to be matched.
In the embodiment of the invention, key frames need to be extracted from the acquired video to be matched. One optional implementation is: extracting a plurality of video frames from the acquired video to be matched as key frames at a preset time interval. The preset time interval may be 1 second, 3 seconds, 5 seconds or the like; the specific interval may be chosen by those skilled in the art according to the duration of the video to be matched or other requirements, and the embodiment of the present invention is not limited herein.
In the embodiment of the present invention, a plurality of video frames may be extracted from the acquired video to be matched as key frames at the preset time interval, or I-frames or other frames of the video to be matched may be selected as key frames; those skilled in the art may choose according to the actual application, which is not limited herein.
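As one hypothetical realization of the fixed-interval variant of step S120, the sketch below keeps one frame every interval_s seconds. It assumes OpenCV; extract_key_frames and interval_s are names introduced here, not the patent's own.

import cv2

def extract_key_frames(video_path: str, interval_s: float = 3.0):
    """Keep one frame every interval_s seconds of the input video."""
    cap = cv2.VideoCapture(video_path)
    fps = cap.get(cv2.CAP_PROP_FPS) or 25.0   # fall back if FPS metadata is missing
    step = max(1, int(round(fps * interval_s)))
    frames, idx = [], 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % step == 0:
            frames.append(frame)              # this frame becomes a key frame
        idx += 1
    cap.release()
    return frames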
S130, extracting the image characteristic value of each key frame of the video to be matched.
In the embodiment of the invention, after a plurality of key frames of a video to be matched are extracted, the image characteristic value of each key frame needs to be acquired. An optional implementation method for extracting an image feature value of each key frame of a video to be matched may be:
and extracting the color distribution characteristic value of each key frame as the image characteristic value of each key frame aiming at each extracted key frame.
In the embodiment of the invention, for each key frame of the extracted video to be matched, the color information of the video image corresponding to the key frame is acquired, the color information of the video image corresponding to each key frame is expressed by using a tensor as the color distribution characteristic value of the key frame, and the color distribution characteristic value of the key frame is used as the image characteristic value of the key frame. Specifically, the method for acquiring and representing the color information of the video image and the like in the art can refer to the prior art, and will not be described herein again.
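The patent only states that the color information of a key frame is represented as a tensor and used as its color distribution characteristic value; the histogram below is one plausible concrete choice, assuming OpenCV, and color_feature and the HSV bin counts are assumptions made here.

import cv2
import numpy as np

def color_feature(frame: np.ndarray, bins=(8, 8, 8)) -> np.ndarray:
    """Flattened, normalized HSV color histogram as the color distribution value."""
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    hist = cv2.calcHist([hsv], [0, 1, 2], None, list(bins),
                        [0, 180, 0, 256, 0, 256])
    return cv2.normalize(hist, hist).flatten()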
And S140, selecting key frames meeting preset conditions from a plurality of key frames of the video to be matched as core scene frames to be matched according to the image characteristic value of each key frame by adopting a density clustering algorithm.
In the embodiment of the invention, after the image characteristic value of each key frame of the video to be matched is obtained, a density clustering algorithm is adopted to select the key frame meeting the preset condition from the obtained multiple key frames as the core scene frame to be matched.
Specifically, a method for selecting a key frame meeting a preset condition from a plurality of key frames of a video to be matched as a core scene frame to be matched will be described in detail in the following text.
S150, performing similarity judgment on the core scene frame to be matched and the pre-stored core scene frames of each original video; the core scene frame of the original video is obtained by calculating each original video in advance by adopting the density clustering algorithm.
In the embodiment of the invention, after the core scene frame to be matched is obtained, the core scene frame to be matched and the core scene frame of the pre-stored original video are subjected to similarity judgment. Specifically, a method for determining similarity between a core scene frame to be matched and a core scene frame of a pre-stored original video, and a method for acquiring the core scene frame of the original video will be described in detail in the following text.
And S160, if any core scene frame of an original video is similar to the core scene frame to be matched, determining the original video as a target video matched with the video to be matched.
The video matching method provided by the embodiment of the invention is based on a density clustering algorithm from machine learning. It makes full use of the essential characteristic that a video is usually composed of a number of similar scenes, ignores the continuity of the video, extracts video key frames and obtains their image characteristic values, selects from them the key frames meeting preset conditions as core scene frames, and finally reduces the continuous video to a small number of core scene frames; the similarity between the core scene frames of the video to be matched and those of the original videos is then judged to decide whether an original video matches the video to be matched. The embodiment of the invention extracts features based on the video content, overcoming problems such as the large information content of video and the complexity of feature extraction, and solving the heavy computation during video segmentation, feature extraction and similarity judgment caused by the excessive frame count of long videos and frequent scene switching.
In the embodiment of the present invention, an implementation manner of selecting a key frame meeting a preset condition as a core scene frame to be matched from a plurality of key frames of a video to be matched in step S140 may be as shown in fig. 2, where fig. 2 is a method for selecting a core scene frame to be matched in the embodiment of the present invention, the method may include:
S210, adopting a density clustering algorithm, taking the color distribution characteristic value of each key frame as a sample, and calculating the Euclidean distance between every two samples.
The embodiment of the invention is implemented on the basis of a density clustering algorithm. First, the color distribution characteristic value of each key frame of the acquired video to be matched is taken as a sample; then the Euclidean distance between every two samples, i.e. between every two tensor-valued color distribution characteristic values, is calculated. For example, the Euclidean distance between sample $i$ and sample $j$ can be written $d_{ij}$. In practical applications, those skilled in the art can choose the inter-sample distance according to actual requirements and are not limited to the Euclidean distance used in the embodiments of the present invention.
And S220, calculating the local density of each sample according to the Euclidean distance between every two samples and a preset neighborhood radius.
In the embodiment of the invention, the local density of each sample can be calculated from the computed Euclidean distance between every two samples and a preset neighborhood radius. In practical applications, the value of the neighborhood radius can be set by those skilled in the art according to actual requirements, and the invention is not limited herein.
In the embodiments of the present invention, with the neighborhood radius written as $d_c$, the local density can be calculated as:
$$\rho_i = \sum_{j \neq i} \chi(d_{ij} - d_c)$$
where $i$ and $j$ denote the $i$-th and $j$-th samples respectively, $\rho_i$ denotes the local density of sample $i$, and $d_{ij}$ is the Euclidean distance between sample $i$ and sample $j$. The function $\chi(d_{ij} - d_c)$ may be computed as:
$$\chi(x) = \begin{cases} 1, & x < 0 \\ 0, & x \ge 0 \end{cases}$$
where $x$ stands for $d_{ij} - d_c$. As those skilled in the art will understand, the local density of sample $i$ is the number of samples within distance $d_c$ of $i$, centered at $i$.
And S230, determining a core object of the video to be matched from the samples according to the local density of each sample and the preset minimum neighborhood point number.
In the embodiment of the invention, the core object can be determined from the samples according to the local density of each sample obtained by calculation and the preset minimum neighborhood point number. In practical applications, the preset value of the minimum neighborhood point number MinPts may be set by a person skilled in the art according to actual requirements, and the embodiment of the present invention is not limited herein.
In the embodiment of the invention, according to the computed local density of each sample and the preset minimum neighborhood point number MinPts, the samples whose local density is greater than or equal to MinPts are determined as core objects; that is, among the samples obtained from the video to be matched, those with local density at least MinPts are determined as the core objects of the video to be matched.
S240, determining the key frame corresponding to the core object of the video to be matched with the maximum local density as a core scene frame to be matched.
In the embodiment of the invention there may be several core objects of the video to be matched, and the key frame corresponding to the core object with the maximum local density is determined as the core scene frame to be matched. Of course, the above is only one implementation; the way the core scene frame to be matched is determined in practical applications is not limited to it.
In the embodiment of the present invention, for a longer video to be matched or a video to be matched whose picture changes frequently, there may be a plurality of determined core scene frames to be matched, and the specific determination method in the embodiment of the present invention is not limited herein.
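Putting steps S210 to S240 together, a minimal sketch under the formula above might look as follows; NumPy, the function names, and the parameters d_c and min_pts are assumptions introduced for illustration.

import numpy as np

def pairwise_distances(samples: np.ndarray) -> np.ndarray:
    """samples: one flattened color distribution characteristic value per row."""
    diff = samples[:, None, :] - samples[None, :, :]
    return np.linalg.norm(diff, axis=-1)      # entry [i, j] is d_ij

def local_densities(d: np.ndarray, d_c: float) -> np.ndarray:
    """rho_i = number of other samples within distance d_c of sample i."""
    chi = (d < d_c).astype(int)
    np.fill_diagonal(chi, 0)                  # the sum excludes j == i
    return chi.sum(axis=1)

def core_scene_frame_to_match(samples: np.ndarray, d_c: float, min_pts: int) -> int:
    """Index of the key frame chosen as the core scene frame to be matched."""
    rho = local_densities(pairwise_distances(samples), d_c)
    core = np.flatnonzero(rho >= min_pts)     # core objects of the video (S230)
    return int(core[np.argmax(rho[core])])    # densest core object (S240)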
The video matching method provided by the embodiment of the invention adopts a density clustering algorithm from machine learning and makes full use of the essential characteristic that a video usually consists of a number of similar scenes: the video can be clustered by scene while its continuity is ignored. Video key frames are extracted and their image characteristic values obtained; the key frames meeting preset conditions are then selected as core scene frames according to those image characteristic values, so the continuous video is finally reduced to a small number of core scene frames, and the similarity between the core scene frames of the video to be matched and those of the original video is judged to decide whether they match. Extracting features based on the video content overcomes problems such as the large information content of video and the complexity of feature extraction, and solves the heavy computation during video segmentation, feature extraction and similarity judgment caused by the excessive frame count of long videos and frequent scene switching.
In this embodiment of the present invention, an implementation manner of determining the similarity between the core scene frame to be matched and the core scene frame of the pre-stored original video in step S150 may be shown in fig. 3, where fig. 3 is a method for determining the similarity between the core scene frame to be matched and the core scene frame of the pre-stored original video in this embodiment of the present invention, the method may include:
S310, obtaining each core scene frame of an original video and the color distribution characteristic value of each core scene frame.
In the embodiment of the invention, before the similarity judgment, the objects of the judgment must be obtained: the core scene frame to be matched together with its color distribution characteristic value, and each core scene frame of an original video together with its color distribution characteristic values. The core scene frame to be matched and its color distribution characteristic value (used as a sample) are already available from the preceding steps, so at judgment time each core scene frame of an original video and its color distribution characteristic values are obtained. Specifically, the core scene frames of each original video and their color distribution characteristic values may be computed in advance and the results stored in a database.
And S320, adopting a density clustering algorithm, taking the color distribution characteristic value of the core scene frame to be matched and each color distribution characteristic value of the original video as samples, and respectively calculating each Euclidean distance to be judged between the color distribution characteristic value of the core scene frame to be matched and each color distribution characteristic value of the original video.
In the embodiment of the invention, based on a density clustering algorithm, the color distribution characteristic value of the core scene frame to be matched and each color distribution characteristic value of the acquired original video are used as samples, and then, the Euclidean distances to be judged between the color distribution characteristic value of the core scene frame to be matched and each color distribution characteristic value of the acquired original video are respectively calculated, namely, the Euclidean distances between the sample of the core scene frame to be matched and each sample of the original video are respectively calculated and used as the Euclidean distances to be judged.
S330, if one Euclidean distance to be judged is smaller than the preset neighborhood radius, determining that the core scene frame to be matched is similar to a core scene frame of the original video.
In the embodiment of the invention, each core scene frame of the original video is selected from a generated cluster, and no sample of one cluster is density-reachable from, or density-connected with, a sample of another cluster. Therefore, when a computed Euclidean distance to be judged is smaller than the preset neighborhood radius, the core scene frame to be matched is determined to be similar to that core scene frame of the original video; the preset neighborhood radius takes the same value throughout this application.
S340, if none of the Euclidean distances to be judged is smaller than the preset neighborhood radius, determining that the core scene frame to be matched is not similar to any core scene frame of the original video; and in that case, obtaining each core scene frame of the next original video together with its color distribution characteristic values for similarity judgment.
In the embodiment of the invention, when none of the Euclidean distances to be judged is smaller than the preset neighborhood radius, the core scene frame to be matched is not similar to any core scene frame of the original video; each core scene frame of the next original video stored in the video database, together with its color distribution characteristic values, can then be obtained for the next round of similarity judgment.
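A hedged sketch of this matching test (steps S310 to S340) follows: a query core scene frame matches an original video as soon as one feature distance falls below the neighborhood radius. The database layout and all names here are assumptions, not the patent's own API.

import numpy as np

def matches_original(query_feat: np.ndarray,
                     original_feats: list[np.ndarray],
                     d_c: float) -> bool:
    """Similar as soon as one Euclidean distance to be judged is below d_c."""
    return any(float(np.linalg.norm(query_feat - f)) < d_c for f in original_feats)

def find_target_videos(query_feat: np.ndarray,
                       database: dict[str, list[np.ndarray]],
                       d_c: float) -> list[str]:
    """database maps an original video id to its stored core scene frame features."""
    return [vid for vid, feats in database.items()
            if matches_original(query_feat, feats, d_c)]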
In the embodiment of the present invention, the core scene frame of the original video in step S150 is obtained by calculating each original video in advance by using a density clustering algorithm, a specific obtaining manner may be shown in fig. 4, where fig. 4 is a method for obtaining the core scene frame of the original video by calculating the original video in the embodiment of the present invention, and the method may include:
S400, aiming at an original video, extracting a plurality of key frames of the original video.
S410, extracting a color distribution characteristic value of each key frame of the original video;
in the embodiment of the present invention, for the original video, the method for extracting a plurality of key frames of the original video and extracting the color distribution characteristic value of each key frame may refer to the method for extracting a plurality of key frames of the video to be matched and extracting the color distribution characteristic value of each key frame in steps S120 and S130, and details are not repeated here.
And S420, adopting a density clustering algorithm, taking the color distribution characteristic value of each key frame as a sample, and calculating the Euclidean distance between every two samples.
And S430, calculating the local density of each sample according to the Euclidean distance between every two samples and the preset neighborhood radius.
In the embodiment of the present invention, for the original video, the method for calculating the euclidean distance between every two samples and the local density of each sample may refer to the methods in steps S210 and S220, and details are not repeated here.
S440, according to the local density of each sample and the number of preset minimum neighborhood points, determining the core objects of the original video from the samples, and taking all the core objects of the original video as a first core object set.
In the embodiment of the present invention, for the original video, the method for determining the core object of the original video may refer to step S230, which is not described herein again. In the embodiment of the invention, after the core objects of the original video are determined, all the core objects of the original video are used as the first core object set.
S450, taking any core object in the first core object set as a seed, finding the samples density-reachable from the seed, and generating a key frame cluster.
In the embodiment of the present invention, any core object in the acquired first core object set is taken as a seed, and the samples density-reachable from the seed are found to generate a key frame cluster; a concrete implementation of generating key frame clusters is described in detail below.
S460, for each key frame cluster, selecting any core object in the cluster as a first core object, and taking the core object with the local density larger than that of the first core object as a second core object.
In the embodiment of the invention, for each generated key frame cluster, when at least two core objects exist in the key frame cluster, any core object in the cluster is selected as a first core object, and a core object with local density larger than that of the first core object in the cluster is taken as a second core object.
S470, calculating a distance between the first core object and each second core object, and determining a distance between the second core object and the first core object, which is the minimum distance, as the distance of the high local density point of the first core object.
In the embodiment of the present invention, first, the distance between the first core object and each second core object is calculated, then, the calculated distances may be sorted or compared, and the minimum distance is determined as the distance of the high local density point of the first core object.
In the embodiment of the present invention, the formula for calculating the high local density point distance of a core object may be:
$$\delta_i = \min_{j:\, \rho_j > \rho_i} D_{ij}$$
where $i$ and $j$ denote the $i$-th and $j$-th core objects respectively, $\delta_i$ denotes the high local density point distance of core object $i$, $\rho_i$ and $\rho_j$ denote the local densities of core objects $i$ and $j$, and $D_{ij}$ denotes the distance between core object $i$ and core object $j$.
And S480, determining the core scene frame of each key frame cluster according to the local density and the high local density point distance of each core object in each key frame cluster.
In the embodiment of the present invention, when the obtained key frame cluster contains at least two core objects, once the local density and high local density point distance of every core object in each key frame cluster are available, one implementation is to determine the key frame corresponding to the core object whose local density and high local density point distance are both comparatively large as the core scene frame of that cluster. As those skilled in the art will understand, the larger the local density and the high local density point distance of a core object in a key frame cluster, the denser its surroundings and the farther it lies from the nearest denser point, so that core object can be regarded as the center of the cluster. The criterion for judging that both the local density and the high local density point distance are comparatively large can combine the two quantities in many ways, all of which fall within the scope of protection of this application and are not enumerated here one by one.
In the embodiment of the invention, when a key frame cluster contains only one core object, the key frame corresponding to that core object is determined as the core scene frame of the cluster.
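A sketch of steps S460 to S480 for one key frame cluster follows. It computes $\delta_i$ exactly as in the formula above and then picks the member where both $\rho$ and $\delta$ are large; combining them as the product $\rho_i \delta_i$ is one common heuristic and an assumption beyond the patent text, as are all names.

import numpy as np

def high_density_distances(feats: np.ndarray, rho: np.ndarray) -> np.ndarray:
    """delta_i = distance from core object i to its nearest core object of higher local density."""
    d = np.linalg.norm(feats[:, None, :] - feats[None, :, :], axis=-1)
    delta = np.full(len(rho), np.inf)
    for i in range(len(rho)):
        higher = np.flatnonzero(rho > rho[i])
        if higher.size:
            delta[i] = d[i, higher].min()
    delta[np.isinf(delta)] = d.max()          # convention for the densest object
    return delta

def cluster_core_scene_frame(feats: np.ndarray, rho: np.ndarray) -> int:
    """Cluster member with both large rho and large delta, via the rho*delta score."""
    return int(np.argmax(rho * high_density_distances(feats, rho)))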
And S490, taking the core scene frame of each determined key frame cluster as the core scene frame of the original video.
After the core scene frame is determined for each original video, the core scene frame of each original video and the color distribution characteristic value of each core scene frame can be stored in a database for use in video matching.
In the embodiment of the present invention, an implementation manner of generating the key frame cluster in step S450 may refer to fig. 5, where fig. 5 is a method for generating a key frame cluster in the embodiment of the present invention, and the method may include:
S510, taking any non-clustered core object in the first core object set as a seed, and finding all samples that are density-reachable from the seed.
In the embodiment of the present invention, any non-clustered core object in the obtained first core object set is used as a seed, and all samples that are density-reachable from the seed are found according to the definition of density-reachability described above.
In this embodiment of the present invention, another optional implementation manner is to use any non-clustered core object in the obtained first core object set as a seed, find all samples that are density-reachable from the seed according to the definition of density-reachability, and further find the samples that are density-connected to them, so as to obtain the maximal set of samples that are density-reachable from the seed and density-connected to one another.
S520, generating a first key frame cluster by the searched key frames corresponding to all the samples.
In the embodiment of the present invention, a key frame cluster is generated from the key frames corresponding to all the samples found in step S510. When the found samples form the maximal set of samples that are density-reachable from the seed and density-connected to one another, the generated key frame cluster is the maximal cluster corresponding to the seed.
S530, regarding the core objects in the first core object set except the core objects included in the generated first key frame cluster as a second core object set.
In the embodiment of the present invention, the generated key frame cluster may include one core object or may include a plurality of core objects. And after a key frame cluster is generated, taking core objects except the core objects contained in the key frame cluster generated in the first core object set as a second core object set.
S540, judging whether the second core object set is an empty set.
S550, if the second core object set is not an empty set, the second core object set is used as the first core object set, and the process returns to step S510.
And S560, if the second core object set is an empty set, the step of generating the key frame cluster is finished.
In the embodiment of the present invention, if the second core object set is not an empty set, the second core object set is used as the first core object set, and the step S510 is returned to, and this process is repeated to generate the key frame cluster. If the second core object set is an empty set, the key frame cluster is generated by all the core objects in the core object set, and the step of generating the key frame cluster is finished.
One implementation of generating the key frame clusters in the embodiment of the present invention may be as follows. Assume the neighborhood parameters are set to ε = 0.68 and MinPts = 5, and the determined core object set is Ω = {x3, x5, x6, x8, x9, x13, x14, x18, x19, x29}. A core object is randomly selected from Ω as a seed, all samples density-reachable from it are found, and these form the first key frame cluster. Assuming the core object x8 is selected as the seed, the first generated cluster is C1 = {x6, x7, x8, x10, x12, x20, x23}. The core objects contained in C1 are then removed from Ω, giving Ω = Ω \ C1 = {x3, x5, x9, x13, x14, x18, x19, x29}, and a core object is randomly selected from the updated set Ω as a seed to generate the next key frame cluster.
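The loop of S510 to S560 can be sketched as follows, under assumptions stated in the comments: Euclidean distance over the color distribution characteristic values, the usual DBSCAN reading of density-reachability, and the parameter names eps (domain radius) and min_pts (minimum neighborhood point number). In this simplified sketch a non-core border sample stays in the first cluster that claims it.

```python
import numpy as np

def generate_key_frame_clusters(X: np.ndarray, eps: float, min_pts: int):
    """X: one color distribution characteristic value (sample) per key frame."""
    n = len(X)
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)    # pairwise distances
    neighbors = [np.flatnonzero(D[i] <= eps) for i in range(n)]
    core = {i for i in range(n) if len(neighbors[i]) >= min_pts}  # first core object set
    unclustered, assigned, clusters = set(core), set(), []
    while unclustered:                       # S540/S550: loop until the set is empty
        seed = unclustered.pop()             # S510: any non-clustered core object
        cluster, frontier = {seed}, [seed]
        while frontier:                      # expand to all density-reachable samples
            q = frontier.pop()
            if q in core:                    # only core objects propagate density
                for p in neighbors[q]:
                    if p not in cluster and p not in assigned:
                        cluster.add(p)
                        frontier.append(p)
        clusters.append(sorted(cluster))     # S520: key frames of one cluster
        assigned |= cluster
        unclustered -= cluster               # S530: remaining second core object set
    return clusters
```

Run with eps = 0.68 and min_pts = 5, this should reproduce the behavior of the example above: each pass consumes the seed's maximal density-connected set and shrinks Ω until it is empty.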
The video matching method provided by the embodiment of the invention is based on a density clustering algorithm in machine learning. It takes full account of the essential characteristic that a video is usually composed of a plurality of similar scenes: ignoring the continuity of the video, it extracts video key frames, obtains the image characteristic value of each key frame, and selects, according to those image characteristic values, the key frames meeting preset conditions as core scene frames, thereby reducing a continuous video to a plurality of core scene frames. The similarity between the core scene frames of the video to be matched and those of each original video is then judged, and from this it is determined whether the original video matches the video to be matched. Because feature extraction is based on the video content, the embodiment of the invention overcomes problems such as the large amount of information in video content and the complexity of feature extraction, and relieves the heavy computation in video segmentation, feature extraction and similarity judgment caused by the large frame count and frequent scene switching of long videos.
Fig. 6 is a method for video retrieval according to an embodiment of the present invention, where the method may include:
S610, obtaining the video to be retrieved.
S620, extracting a plurality of key frames from the video to be retrieved.
S630, extracting the image characteristic value of each key frame of the video to be retrieved.
And S640, selecting a key frame meeting preset conditions from a plurality of key frames of the video to be retrieved as a core scene frame to be retrieved according to the image characteristic value of each key frame by adopting a density clustering algorithm.
S650, carrying out similarity judgment on the core scene frame to be retrieved and the core scene frames of the pre-stored original videos; the core scene frame of the original video is obtained by calculating each original video in advance by adopting the density clustering algorithm.
And S660, if any core scene frame of an original video is similar to the core scene frame to be retrieved, determining the original video as a target video matched with the video to be retrieved.
In the embodiment of the present invention, specific implementation manners of S610 to S660 for operations of a video to be retrieved and an original video may refer to implementation manners of steps S110 to S160 in fig. 1, which are not described herein again.
And S670, outputting the target video serving as a retrieval result to a user.
In the embodiment of the present invention, the manner of outputting the target video to the user as the retrieval result may refer to any specific manner of outputting retrieval results to users in the prior art, and details are not repeated here.
The video retrieval method provided by the embodiment of the invention is based on a density clustering algorithm in machine learning. It takes full account of the essential characteristic that a video is usually composed of a plurality of similar scenes: ignoring the continuity of the video, it extracts video key frames, obtains the image characteristic value of each key frame, and selects, according to those image characteristic values, the key frames meeting preset conditions as core scene frames, thereby reducing a continuous video to a plurality of core scene frames. The similarity between the core scene frames of the video to be retrieved and those of each original video is then judged to determine whether an original video matches the video to be retrieved, and a matched original video is output to the user as the retrieval result. Because feature extraction is based on the video content, the embodiment of the invention overcomes problems such as the large amount of information in video content and the complexity of feature extraction, and relieves the heavy computation in video segmentation, feature extraction and similarity judgment caused by the large frame count and frequent scene switching of long videos.
Fig. 7 is a method for video recommendation according to an embodiment of the present invention, where the method may include:
and S710, determining a target user.
And S720, acquiring the video to be recommended.
S730, extracting a plurality of key frames from the video to be recommended.
And S740, extracting the image characteristic value of each key frame of the video to be recommended.
And S750, selecting a key frame meeting preset conditions from a plurality of key frames of the video to be recommended as a core scene frame to be recommended according to the image characteristic value of each key frame by adopting a density clustering algorithm.
S760, performing similarity judgment on the core scene frame to be recommended and the prestored core scene frames of the historical videos watched by the target user; the core scene frame of the historical videos is obtained by calculating each historical video watched by the target user by adopting the density clustering algorithm in advance.
In the embodiment of the present invention, specific implementation manners of S720 to S760 for operations on a to-be-recommended video and a history video viewed by a target user may refer to implementation manners of steps S110 to S150 in fig. 1, which are not described herein again.
S770, if the core scene frame to be recommended is similar to any core scene frame of a historical video, recommending the video to be recommended to the target user as a recommended video.
In the embodiment of the present invention, the manner of recommending the video to be recommended to the target user may refer to any specific implementation of recommending videos to users in the prior art, and details are not repeated here.
The video recommendation method provided by the embodiment of the invention is based on a density clustering algorithm in machine learning. It takes full account of the essential characteristic that a video is usually composed of a plurality of similar scenes: ignoring the continuity of the video, it extracts video key frames, obtains the image characteristic value of each key frame, and selects, according to those image characteristic values, the key frames meeting preset conditions as core scene frames, thereby reducing a continuous video to a plurality of core scene frames. The similarity between the core scene frames of the video to be recommended and those of the historical videos watched by the target user is then judged, and on that basis the video to be recommended is recommended to the target user as a recommended video. Because feature extraction is based on the video content, the embodiment of the invention overcomes problems such as the large amount of information in video content and the complexity of feature extraction, and relieves the heavy computation in video segmentation, feature extraction and similarity judgment caused by the large frame count and frequent scene switching of long videos.
Fig. 8 is a method for video classification according to an embodiment of the present invention, where the method may include:
and S810, acquiring the video to be classified.
S820, extracting a plurality of key frames from the video to be classified.
S830, extracting the image characteristic value of each key frame of the video to be classified.
And S840, selecting a key frame meeting preset conditions from a plurality of key frames of the video to be classified as a core scene frame to be classified according to the image characteristic value of each key frame by adopting a density clustering algorithm.
S850, performing similarity judgment on the core scene frame to be classified and the pre-stored core scene frames of the original videos; the core scene frame of the original video is obtained by calculating each original video in advance by adopting the density clustering algorithm.
S860, if any core scene frame of an original video is similar to the core scene frame to be classified, determining the original video as a target video matched with the video to be classified.
In the embodiment of the present invention, specific implementation manners of S810 to S860 for the operation of the video to be classified and the original video may refer to implementation manners of steps S110 to S160 in fig. 1, which are not described herein again.
And S870, outputting the type of the target video as the type of the video to be classified to a user.
In the embodiment of the present invention, an implementation manner for outputting the type of the target video to the user as the type of the video to be classified may refer to a specific implementation manner for outputting the type of the video to be classified for the user in the prior art, which is not described herein again.
The video classification method provided by the embodiment of the invention is based on a density clustering algorithm in machine learning. It takes full account of the essential characteristic that a video is usually composed of a plurality of similar scenes: ignoring the continuity of the video, it extracts video key frames, obtains the image characteristic value of each key frame, and selects, according to those image characteristic values, the key frames meeting preset conditions as core scene frames, thereby reducing a continuous video to a plurality of core scene frames. The similarity between the core scene frames of the video to be classified and those of each original video is then judged to determine whether an original video matches the video to be classified, and the type of the matched original video is output to the user as the type of the video to be classified. Because feature extraction is based on the video content, the embodiment of the invention overcomes problems such as the large amount of information in video content and the complexity of feature extraction, and relieves the heavy computation in video segmentation, feature extraction and similarity judgment caused by the large frame count and frequent scene switching of long videos.
Corresponding to the foregoing video matching method, in yet another aspect of the present invention, a video matching apparatus is also provided. Fig. 9 is a schematic structural diagram of a video matching apparatus according to an embodiment of the present invention, where the apparatus includes:
a first obtaining module 910, configured to obtain a video to be matched;
a first extraction module 920, configured to extract a plurality of key frames from the video to be matched;
a second extraction module 930, configured to extract an image feature value of each key frame of the video to be matched;
a first selecting module 940, configured to select, by using a density clustering algorithm, a key frame that meets a preset condition from a plurality of key frames of the video to be matched as a core scene frame to be matched according to an image feature value of each key frame;
a first judging module 950, configured to perform similarity judgment on the core scene frame to be matched and a core scene frame of each pre-stored original video; the core scene frame of the original video is obtained by calculating each original video by adopting the density clustering algorithm in advance;
the first determining module 960 is configured to determine an original video as a target video matching with the to-be-matched video if any core scene frame of the original video is similar to the to-be-matched core scene frame.
The video matching device provided by the embodiment of the invention is based on a density clustering algorithm in machine learning. It takes full account of the essential characteristic that a video is usually composed of a plurality of similar scenes: ignoring the continuity of the video, it extracts video key frames, obtains the image characteristic value of each key frame, and selects, according to those image characteristic values, the key frames meeting preset conditions as core scene frames, thereby reducing a continuous video to a plurality of core scene frames. The similarity between the core scene frames of the video to be matched and those of each original video is then judged, and from this it is determined whether the original video matches the video to be matched. Because feature extraction is based on the video content, the embodiment of the invention overcomes problems such as the large amount of information in video content and the complexity of feature extraction, and relieves the heavy computation in video segmentation, feature extraction and similarity judgment caused by the large frame count and frequent scene switching of long videos.
Fig. 10 is a schematic structural diagram of a first selecting module in an embodiment of the present invention, which corresponds to the method for selecting a core scene frame to be matched in fig. 2, and the apparatus includes:
the first distance calculation submodule 101 is configured to calculate an euclidean distance between every two samples by using a density clustering algorithm and using the color distribution characteristic value of each key frame as a sample;
the first density calculation submodule 102 is configured to calculate the local density of each sample according to the euclidean distance between every two samples and a preset domain radius;
the first determining submodule 103 is configured to determine a core object of a video to be matched from the samples according to the local density of each sample and a preset minimum neighborhood point number;
and the second determining submodule 104 is configured to determine the key frame corresponding to the core object of the video to be matched, which has the largest local density, as the core scene frame to be matched.
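Taken together, the four submodules amount to the following sketch, where eps and min_pts stand for the preset domain radius and minimum neighborhood point number; the function name, the array layout, and the convention that a sample's ε-neighborhood includes the sample itself are illustrative assumptions.

```python
import numpy as np

def select_core_scene_frame(features: np.ndarray, eps: float, min_pts: int) -> int:
    """features: one color distribution characteristic value per key frame."""
    D = np.linalg.norm(features[:, None, :] - features[None, :, :], axis=-1)
    rho = (D <= eps).sum(axis=1)           # local density (neighborhood incl. self)
    core = np.flatnonzero(rho >= min_pts)  # core objects of the video to be matched
    if core.size == 0:
        raise ValueError("no core object found; relax eps or min_pts")
    return int(core[rho[core].argmax()])   # key frame with the largest local density
```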
Fig. 11 is a schematic structural diagram of a first determining module in an embodiment of the present invention, which corresponds to the aforementioned method for determining similarity between the core scene frame to be matched and the core scene frame of the pre-stored original video in fig. 3, the apparatus includes:
the first obtaining submodule 111 is configured to obtain each core scene frame of an original video and a color distribution characteristic value of each core scene frame;
a second distance calculating submodule 112, configured to use a density clustering algorithm, take the color distribution characteristic value of the core scene frame to be matched and each color distribution characteristic value of the original video as samples, and calculate each euclidean distance to be determined between the color distribution characteristic value of the core scene frame to be matched and each color distribution characteristic value of the original video, respectively;
the first judging submodule 113 is configured to determine that the core scene frame to be matched is similar to a core scene frame of the original video if one euclidean distance to be judged is smaller than the preset domain radius; if the Euclidean distance to be judged is not smaller than the preset domain radius, determining that the core scene frame to be matched is not similar to each core scene frame of the original video; and if the core scene frame to be matched is not similar to all the core scene frames of the original video, obtaining each core scene frame of the next original video and the color distribution characteristic value of each core scene frame to carry out similarity judgment.
Fig. 12 is a schematic structural diagram of a core scene frame determining module of an original video in an embodiment of the present invention, which corresponds to the foregoing method for obtaining a core scene frame of an original video by calculating an original video in fig. 4, where the apparatus includes:
a first extraction sub-module 121, configured to extract, for an original video, a plurality of key frames of the original video;
a second extraction submodule 122, configured to extract a color distribution characteristic value of each key frame of the original video;
the third distance calculating submodule 123 is configured to calculate an euclidean distance between every two samples by using a density clustering algorithm and using the color distribution characteristic value of each key frame as a sample;
the second density calculation submodule 124 is configured to calculate the local density of each sample according to the euclidean distance between every two samples and a preset domain radius;
a third determining submodule 125, configured to determine, according to the local density of each sample and a preset minimum neighborhood point number, a core object of the original video from the samples, and use all the core objects of the original video as a first core object set;
the first generation submodule 126 is configured to search for a sample with a reachable seed density by using any one of the first core object set as a seed, and generate a key frame cluster; for each key frame clustering cluster, selecting any core object in the cluster as a first core object, and taking a core object with local density greater than that of the first core object as a second core object;
a fourth distance calculating submodule 127, configured to calculate a distance between the first core object and each second core object, and determine a distance between the second core object and the first core object, which is the smallest distance, as a high local density point distance of the first core object;
a fourth determining submodule 128, configured to determine a core scene frame of each key frame cluster according to the local density and the high local density point distance of each core object in each key frame cluster; and taking the core scene frame of each determined key frame cluster as the core scene frame of the original video.
Fig. 13 is a schematic structural diagram of a first generation submodule in an embodiment of the present invention, which corresponds to the method for generating a key frame cluster in fig. 5, where the apparatus includes:
the searching submodule 131 is configured to take any non-clustered core object in the first core object set as a seed and find all samples that are density-reachable from the seed;
a cluster generation submodule 132, configured to generate a first key frame cluster from the key frames corresponding to all the searched samples; taking core objects except the core objects contained in the generated first key frame cluster in the first core object set as a second core object set;
a second determining submodule 133, configured to determine whether the second core object set is an empty set; if the second core object set is not an empty set, the second core object set is used as the first core object set, and the search submodule 131 is triggered; and if the second core object set is an empty set, the step of generating the key frame cluster is finished.
It should be noted that the apparatus according to the embodiment of the present invention is an apparatus corresponding to the video matching method shown in fig. 1, and all embodiments of the video matching method shown in fig. 1 are applicable to the apparatus and can achieve the same or similar beneficial effects.
Corresponding to the method for video retrieval, in another aspect of the implementation of the present invention, a device for video retrieval is also provided. Fig. 14 is a schematic structural diagram of an apparatus for video retrieval according to an embodiment of the present invention, where the apparatus includes:
the second obtaining module 141 is configured to obtain a video to be retrieved;
a third extraction module 142, configured to extract a plurality of key frames from the video to be retrieved;
a fourth extraction module 143, configured to extract an image feature value of each key frame of the video to be retrieved;
a second selecting module 144, configured to select, by using a density clustering algorithm, a key frame that meets a preset condition from multiple key frames of the video to be retrieved as a core scene frame to be retrieved according to an image feature value of each key frame;
the second judging module 145 is configured to perform similarity judgment on the core scene frame to be retrieved and core scene frames of pre-stored original videos; the core scene frame of the original video is obtained by calculating each original video by adopting the density clustering algorithm in advance;
a second determining module 146, configured to determine, if any core scene frame of an original video is similar to the core scene frame to be retrieved, that the original video is a target video matching the video to be retrieved;
and a first output module 147, configured to output the target video to a user as a retrieval result.
The video retrieval device provided by the embodiment of the invention is based on a density clustering algorithm in machine learning. It takes full account of the essential characteristic that a video is usually composed of a plurality of similar scenes: ignoring the continuity of the video, it extracts video key frames, obtains the image characteristic value of each key frame, and selects, according to those image characteristic values, the key frames meeting preset conditions as core scene frames, thereby reducing a continuous video to a plurality of core scene frames. The similarity between the core scene frames of the video to be retrieved and those of each original video is then judged to determine whether an original video matches the video to be retrieved, and a matched original video is output to the user as the retrieval result. Because feature extraction is based on the video content, the embodiment of the invention overcomes problems such as the large amount of information in video content and the complexity of feature extraction, and relieves the heavy computation in video segmentation, feature extraction and similarity judgment caused by the large frame count and frequent scene switching of long videos.
It should be noted that the apparatus according to the embodiment of the present invention is an apparatus corresponding to the video retrieval method shown in fig. 6, and all embodiments of the video retrieval method shown in fig. 6 are applicable to the apparatus and can achieve the same or similar beneficial effects.
Corresponding to the method for video recommendation, in another aspect of the implementation of the present invention, a device for video recommendation is also provided. Fig. 15 is a schematic structural diagram of an apparatus for video recommendation according to an embodiment of the present invention, where the apparatus includes:
a third determining module 151, configured to determine a target user;
a third obtaining module 152, configured to obtain a video to be recommended;
a fifth extraction module 153, configured to extract a plurality of key frames from the video to be recommended;
a sixth extraction module 154, configured to extract an image feature value of each key frame of the video to be recommended;
the third selecting module 155 is configured to select, by using a density clustering algorithm, a key frame meeting a preset condition from a plurality of key frames of the video to be recommended as a core scene frame to be recommended according to an image feature value of each key frame;
a third determining module 156, configured to perform similarity determination on the core scene frame to be recommended and a core scene frame of each pre-stored historical video watched by the target user; the core scene frame of the historical videos is obtained by calculating each historical video watched by the target user by adopting the density clustering algorithm in advance;
a second output module 157, configured to recommend the video to be recommended as a recommended video to the target user if the core scene frame to be recommended is similar to any core scene frame of a historical video.
The video recommendation device provided by the embodiment of the invention is based on a density clustering algorithm in machine learning. It takes full account of the essential characteristic that a video is usually composed of a plurality of similar scenes: ignoring the continuity of the video, it extracts video key frames, obtains the image characteristic value of each key frame, and selects, according to those image characteristic values, the key frames meeting preset conditions as core scene frames, thereby reducing a continuous video to a plurality of core scene frames. The similarity between the core scene frames of the video to be recommended and those of the historical videos watched by the target user is then judged, and on that basis the video to be recommended is recommended to the target user as a recommended video. Because feature extraction is based on the video content, the embodiment of the invention overcomes problems such as the large amount of information in video content and the complexity of feature extraction, and relieves the heavy computation in video segmentation, feature extraction and similarity judgment caused by the large frame count and frequent scene switching of long videos.
It should be noted that the apparatus according to the embodiment of the present invention is an apparatus corresponding to the video recommendation method shown in fig. 7, and all embodiments of the video recommendation method shown in fig. 7 are applicable to the apparatus and can achieve the same or similar beneficial effects.
Corresponding to the method for video classification, in another aspect of the implementation of the present invention, a device for video classification is also provided. Fig. 16 is a schematic structural diagram of a video classification apparatus according to an embodiment of the present invention, where the apparatus includes:
a fourth obtaining module 161, configured to obtain a video to be classified;
a seventh extraction module 162, configured to extract a plurality of key frames from the video to be classified;
an eighth extracting module 163, configured to extract an image feature value of each key frame of the video to be classified;
a fourth selecting module 164, configured to select, by using a density clustering algorithm, a key frame that meets a preset condition from a plurality of key frames of the video to be classified as a core scene frame to be classified according to an image feature value of each key frame;
a fourth determining module 165, configured to perform similarity determination on the core scene frame to be classified and a core scene frame of each pre-stored original video; the core scene frame of the original video is obtained by calculating each original video by adopting the density clustering algorithm in advance;
a third determining module 166, configured to determine, if any core scene frame of an original video is similar to the core scene frame to be classified, the original video as a target video matched with the video to be classified;
and a third output module 167, configured to output the type of the target video to the user as the type of the video to be classified.
The video classification device provided by the embodiment of the invention is based on a density clustering algorithm in machine learning. It takes full account of the essential characteristic that a video is usually composed of a plurality of similar scenes: ignoring the continuity of the video, it extracts video key frames, obtains the image characteristic value of each key frame, and selects, according to those image characteristic values, the key frames meeting preset conditions as core scene frames, thereby reducing a continuous video to a plurality of core scene frames. The similarity between the core scene frames of the video to be classified and those of each original video is then judged to determine whether an original video matches the video to be classified, and the type of the matched original video is output to the user as the type of the video to be classified. Because feature extraction is based on the video content, the embodiment of the invention overcomes problems such as the large amount of information in video content and the complexity of feature extraction, and relieves the heavy computation in video segmentation, feature extraction and similarity judgment caused by the large frame count and frequent scene switching of long videos.
It should be noted that the apparatus according to the embodiment of the present invention is an apparatus corresponding to the video classification method shown in fig. 8, and all embodiments of the video classification method shown in fig. 8 are applicable to the apparatus and can achieve the same or similar beneficial effects.
Fig. 17 is a schematic structural diagram of the first electronic device provided in the embodiment of the present invention, and includes a processor 171, a communication interface 172, a memory 173, and a communication bus 174, where the processor 171, the communication interface 172, and the memory 173 complete communication with each other through the communication bus 174;
a memory 173 for storing computer programs;
the processor 171, when executing the program stored in the memory 173, implements the following steps:
acquiring a video to be matched;
extracting a plurality of key frames from the video to be matched;
extracting an image characteristic value of each key frame of the video to be matched;
selecting key frames meeting preset conditions from a plurality of key frames of the video to be matched as core scene frames to be matched according to the image characteristic value of each key frame by adopting a density clustering algorithm;
carrying out similarity judgment on the core scene frame to be matched and the pre-stored core scene frames of each original video; the core scene frame of the original video is obtained by calculating each original video by adopting the density clustering algorithm in advance;
and if any core scene frame of an original video is similar to the core scene frame to be matched, determining the original video as a target video matched with the video to be matched.
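As a concrete, non-authoritative illustration of the first two of these steps, the sketch below samples key frames at a fixed interval with OpenCV and uses a coarse RGB histogram as the image characteristic value; the library, the sampling interval, and the histogram binning are assumptions made here, since the embodiment does not prescribe them.

```python
import cv2
import numpy as np

def extract_key_frames(path: str, every_n: int = 25):
    """Keep one frame every `every_n` decoded frames (about 1 s at 25 fps)."""
    cap = cv2.VideoCapture(path)
    frames, idx = [], 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % every_n == 0:
            frames.append(frame)
        idx += 1
    cap.release()
    return frames

def color_feature(frame) -> np.ndarray:
    """An 8x8x8-bin RGB histogram as one possible color distribution value."""
    hist = cv2.calcHist([frame], [0, 1, 2], None, [8, 8, 8],
                        [0, 256, 0, 256, 0, 256])
    return cv2.normalize(hist, hist).flatten()
```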
The first electronic device provided by the embodiment of the invention performs feature extraction based on video content, overcomes the problems of large information amount of video content, complex feature extraction and the like, and solves the problem of large calculation amount during video segmentation, feature extraction and similarity judgment due to excessive frame number of long video and frequent scene switching.
Fig. 18 is a schematic structural diagram of the second electronic device provided in the embodiment of the present invention, and includes a processor 181, a communication interface 182, a memory 183, and a communication bus 184, where the processor 181, the communication interface 182, and the memory 183 complete mutual communication through the communication bus 184;
a memory 183 for storing a computer program;
the processor 181 is configured to implement the following steps when executing the program stored in the memory 183:
acquiring a video to be retrieved;
extracting a plurality of key frames from the video to be retrieved;
extracting an image characteristic value of each key frame of the video to be retrieved;
selecting key frames meeting preset conditions from a plurality of key frames of the video to be retrieved as core scene frames to be retrieved by adopting a density clustering algorithm according to the image characteristic value of each key frame;
carrying out similarity judgment on the core scene frame to be retrieved and the core scene frames of the pre-stored original videos; the core scene frame of the original video is obtained by calculating each original video by adopting the density clustering algorithm in advance;
if any core scene frame of an original video is similar to the core scene frame to be retrieved, determining the original video as a target video matched with the video to be retrieved;
and outputting the target video as a retrieval result to a user.
The second electronic device provided by the embodiment of the invention performs feature extraction based on video content, overcomes the problems of large information amount of video content, complex feature extraction and the like, and solves the problem of large calculation amount during video segmentation, feature extraction and similarity judgment due to excessive frame number of long video and frequent scene switching.
Fig. 19 is a schematic structural diagram of the third electronic device according to the embodiment of the present invention, which includes a processor 191, a communication interface 192, a memory 193, and a communication bus 194, where the processor 191, the communication interface 192, and the memory 193 complete communication with each other through the communication bus 194;
a memory 193 for storing a computer program;
the processor 191, when executing the program stored in the memory 193, implements the following steps:
determining a target user;
acquiring a video to be recommended;
extracting a plurality of key frames from the video to be recommended;
extracting an image characteristic value of each key frame of the video to be recommended;
selecting key frames meeting preset conditions from a plurality of key frames of the video to be recommended as core scene frames to be recommended according to the image characteristic value of each key frame by adopting a density clustering algorithm;
performing similarity judgment on the core scene frame to be recommended and the prestored core scene frames of the historical videos watched by the target user; the core scene frame of the historical videos is obtained by calculating each historical video watched by the target user by adopting the density clustering algorithm in advance;
and if the core scene frame to be recommended is similar to any core scene frame of a historical video, recommending the video to be recommended to the target user as a recommended video.
The third electronic device provided by the embodiment of the invention performs feature extraction based on video content, overcomes the problems of large information amount of video content, complex feature extraction and the like, and solves the problem of large calculation amount during video segmentation, feature extraction and similarity judgment due to excessive frame number of long video and frequent scene switching.
Fig. 20 is a schematic structural diagram of the fourth electronic device provided in the embodiment of the present invention, and includes a processor 201, a communication interface 202, a memory 203, and a communication bus 204, where the processor 201, the communication interface 202, and the memory 203 complete communication with each other through the communication bus 204;
a memory 203 for storing a computer program;
the processor 201 is configured to implement the following steps when executing the program stored in the memory 203:
acquiring a video to be classified;
extracting a plurality of key frames from the video to be classified;
extracting an image characteristic value of each key frame of the video to be classified;
selecting key frames meeting preset conditions from a plurality of key frames of the video to be classified as core scene frames to be classified according to the image characteristic value of each key frame by adopting a density clustering algorithm;
carrying out similarity judgment on the core scene frame to be classified and the pre-stored core scene frames of the original videos; the core scene frame of the original video is obtained by calculating each original video by adopting the density clustering algorithm in advance;
if any core scene frame of an original video is similar to the core scene frame to be classified, determining the original video as a target video matched with the video to be classified;
and outputting the type of the target video to a user as the type of the video to be classified.
The fourth electronic device provided by the embodiment of the invention performs feature extraction based on video content, overcomes the problems of large information amount of video content, complex feature extraction and the like, and solves the problem of large calculation amount during video segmentation, feature extraction and similarity judgment due to excessive frame number of long video and frequent scene switching.
The communication bus mentioned in the electronic device may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown, but this does not mean that there is only one bus or one type of bus.
The communication interface is used for communication between the electronic equipment and other equipment.
The memory may include a Random Access Memory (RAM) or a non-volatile memory, such as at least one disk memory. Optionally, the memory may also be at least one storage device located remotely from the processor.
The processor may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or discrete hardware components.
In another embodiment of the present invention, there is also provided a computer-readable storage medium, having stored therein instructions, which when run on a computer, cause the computer to execute the video matching method described in any of the above embodiments to obtain the same technical effect.
In another embodiment of the present invention, a computer-readable storage medium is further provided, which stores instructions that, when executed on a computer, cause the computer to execute the video retrieval method described in the above embodiments to obtain the same technical effect.
In yet another embodiment of the present invention, a computer-readable storage medium is further provided, which stores instructions that, when executed on a computer, cause the computer to execute the video recommendation method described in the above embodiments to achieve the same technical effect.
In yet another embodiment provided by the present invention, a computer-readable storage medium is further provided, which stores instructions that, when executed on a computer, cause the computer to execute the video classification method described in the above embodiments to achieve the same technical effect.
In yet another embodiment of the present invention, there is also provided a computer program product containing instructions which, when run on a computer, cause the computer to perform the video matching method of any of the above embodiments to achieve the same technical effect.
In yet another embodiment of the present invention, there is also provided a computer program product containing instructions which, when run on a computer, cause the computer to perform the video retrieval method described in the above embodiments to achieve the same technical effect.
In yet another embodiment of the present invention, there is also provided a computer program product containing instructions which, when run on a computer, cause the computer to perform the video recommendation method described in the above embodiments to achieve the same technical effect.
In yet another embodiment provided by the present invention, there is also provided a computer program product containing instructions which, when run on a computer, cause the computer to perform the video classification method described in the above embodiments to achieve the same technical effect.
In the above embodiments, the implementation may be realized wholly or partially by software, hardware, firmware, or any combination thereof. When implemented in software, it may be realized wholly or partially in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions described in the embodiments of the invention are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, they may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired (e.g., coaxial cable, optical fiber, Digital Subscriber Line (DSL)) or wireless (e.g., infrared, radio, microwave) manner. The computer-readable storage medium may be any available medium that a computer can access, or a data storage device such as a server or data center integrating one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., Solid State Disk (SSD)), among others.
It is noted that, herein, relational terms such as first and second are used only to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between the entities or operations. Moreover, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, so that a process, method, article, or apparatus comprising a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. An element defined by the statement "comprising a ..." does not, without more constraints, preclude the existence of additional identical elements in the process, method, article, or apparatus that comprises the element.
All the embodiments in the present specification are described in a related manner, and the same and similar parts among the embodiments may be referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, as for the apparatus embodiment, since it is substantially similar to the method embodiment, the description is relatively simple, and for the relevant points, reference may be made to the partial description of the method embodiment.

Claims (20)

1. A method of video matching, comprising:
acquiring a video to be matched;
extracting a plurality of key frames from the video to be matched;
extracting an image characteristic value of each key frame of the video to be matched;
selecting key frames meeting preset conditions from a plurality of key frames of the video to be matched as core scene frames to be matched according to the image characteristic value of each key frame by adopting a density clustering algorithm;
carrying out similarity judgment on the core scene frame to be matched and the pre-stored core scene frames of each original video; the core scene frame of the original video is obtained by calculating each original video by adopting the density clustering algorithm in advance;
if any core scene frame of an original video is similar to the core scene frame to be matched, determining the original video as a target video matched with the video to be matched;
the similarity judgment of the core scene frame to be matched and the core scene frames of the pre-stored original videos includes:
obtaining each core scene frame of an original video and a color distribution characteristic value of each core scene frame;
respectively calculating the Euclidean distances to be judged between the color distribution characteristic value of the core scene frame to be matched and each color distribution characteristic value of the original video by adopting a density clustering algorithm and taking the color distribution characteristic value of the core scene frame to be matched and each color distribution characteristic value of the original video as samples;
if one Euclidean distance to be judged is smaller than a preset domain radius, determining that the core scene frame to be matched is similar to one core scene frame of the original video;
or the like, or, alternatively,
if none of the Euclidean distances to be judged is smaller than the preset domain radius, determining that the core scene frame to be matched is not similar to any core scene frame of the original video; and if the core scene frame to be matched is not similar to any core scene frame of the original video, obtaining each core scene frame of the next original video and the color distribution characteristic value of each core scene frame to carry out similarity judgment;
the selecting, by using a density clustering algorithm, a key frame meeting a preset condition from a plurality of key frames of the video to be matched as a core scene frame to be matched according to an image feature value of each key frame includes:
adopting a density clustering algorithm, taking the color distribution characteristic value of each key frame as a sample, and calculating the Euclidean distance between every two samples;
calculating the local density of each sample according to the Euclidean distance between every two samples and a preset domain radius;
determining a core object of a video to be matched from the samples according to the local density of each sample and the preset minimum neighborhood point number;
and determining the key frame corresponding to the core object of the video to be matched with the maximum local density as a core scene frame to be matched.
2. The method according to claim 1, wherein the step of extracting a plurality of key frames from the video to be matched comprises:
and extracting a plurality of video frames from the video to be matched as key frames according to a preset time interval.
3. The method according to claim 1, wherein the step of extracting the image feature value of each key frame of the video to be matched comprises:
and extracting the color distribution characteristic value of each key frame as the image characteristic value of each key frame aiming at each extracted key frame.
4. The method according to claim 1, wherein the core scene frames of the original videos are obtained by performing calculation on each original video in advance by using the density clustering algorithm, and the method comprises the following steps:
extracting a plurality of key frames of an original video aiming at the original video;
extracting a color distribution characteristic value of each key frame of the original video;
adopting a density clustering algorithm, taking the color distribution characteristic value of each key frame as a sample, and calculating the Euclidean distance between every two samples;
calculating the local density of each sample according to the Euclidean distance between every two samples and a preset domain radius;
determining core objects of the original video from the samples according to the local density of each sample and preset minimum neighborhood point numbers, and taking all the core objects of the original video as a first core object set;
taking any core object in the first core object set as a seed, finding the samples that are density-reachable from the seed, and generating a key frame cluster;
for each key frame clustering cluster, selecting any core object in the cluster as a first core object, and taking a core object with local density greater than that of the first core object as a second core object;
calculating the distance between the first core object and each second core object, and determining the distance between the second core object and the first core object, which is the minimum distance, as the distance of the high local density point of the first core object;
determining a core scene frame of each key frame cluster according to the local density and the high local density point distance of each core object in each key frame cluster;
and taking the core scene frame of each determined key frame cluster as the core scene frame of the original video.
5. The method according to claim 4, wherein the step of taking any core object in the first core object set as a seed, finding the samples that are density-reachable from the seed, and generating a key frame cluster comprises:
taking any non-clustered core object in the first core object set as a seed, and finding all samples that are density-reachable from the seed;
generating a first key frame cluster by the key frames corresponding to all the searched samples;
taking core objects except the core objects contained in the generated first key frame cluster in the first core object set as a second core object set;
judging whether the second core object set is an empty set or not;
if the second core object set is not an empty set, taking the second core object set as the first core object set, and returning to the step of taking any non-clustered core object in the first core object set as a seed and finding all samples that are density-reachable from the seed;
and if the second core object set is an empty set, the step of generating the key frame cluster is finished.
6. A method of video retrieval, comprising:
acquiring a video to be retrieved;
extracting a plurality of key frames from the video to be retrieved;
extracting an image characteristic value of each key frame of the video to be retrieved;
selecting key frames meeting preset conditions from a plurality of key frames of the video to be retrieved as core scene frames to be retrieved by adopting a density clustering algorithm according to the image characteristic value of each key frame;
carrying out similarity judgment on the core scene frame to be retrieved and the core scene frames of the pre-stored original videos; the core scene frame of the original video is obtained by calculating each original video by adopting the density clustering algorithm in advance;
if any core scene frame of an original video is similar to the core scene frame to be retrieved, determining the original video as a target video matched with the video to be retrieved;
outputting the target video serving as a retrieval result to a user;
wherein, the similarity judgment of the core scene frame to be retrieved and the core scene frames of the pre-stored original videos comprises the following steps:
obtaining each core scene frame of an original video and a color distribution characteristic value of each core scene frame;
respectively calculating the Euclidean distances to be judged between the color distribution characteristic value of the core scene frame to be retrieved and each color distribution characteristic value of the original video by adopting a density clustering algorithm and taking the color distribution characteristic value of the core scene frame to be retrieved and each color distribution characteristic value of the original video as samples;
if one Euclidean distance to be judged is smaller than a preset neighborhood radius, determining that the core scene frame to be retrieved is similar to the corresponding core scene frame of the original video;
or, alternatively,
if none of the Euclidean distances to be judged is smaller than the preset neighborhood radius, determining that the core scene frame to be retrieved is not similar to any core scene frame of the original video; if the core scene frame to be retrieved is not similar to any core scene frame of the original video, obtaining each core scene frame of the next original video and the color distribution characteristic value of each core scene frame for similarity judgment;
the method for selecting the key frames meeting the preset conditions from the plurality of key frames of the video to be retrieved as the core scene frames to be retrieved by adopting the density clustering algorithm and according to the image characteristic value of each key frame comprises the following steps:
adopting a density clustering algorithm, taking the color distribution characteristic value of each key frame as a sample, and calculating the Euclidean distance between every two samples;
calculating the local density of each sample according to the Euclidean distance between every two samples and a preset neighborhood radius;
determining a core object of a video to be retrieved from the samples according to the local density of each sample and the preset minimum neighborhood point number;
and determining the key frame corresponding to the core object of the video to be retrieved with the maximum local density as a core scene frame to be retrieved.
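On the query side, the claim stops short of full clustering: it only needs the single densest core object among the key frames of the video to be retrieved. A minimal sketch, assuming color-distribution feature vectors as samples; the parameter names `eps` (preset neighborhood radius) and `min_pts` (preset minimum neighborhood point number) are illustrative.

```python
import numpy as np

def query_core_scene_frame(features, eps, min_pts):
    """Return the index of the core scene frame of a query video.

    features: (n_frames, d) array of color-distribution feature values
    """
    X = np.asarray(features, dtype=float)
    # Euclidean distance between every two samples
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    # local density: number of neighbors within the neighborhood radius
    density = (dist <= eps).sum(axis=1) - 1        # exclude the sample itself
    core = np.flatnonzero(density >= min_pts)      # core objects
    if core.size == 0:
        return None                                # no frame is dense enough
    return int(core[np.argmax(density[core])])     # densest core object
```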
7. A method for video recommendation, comprising:
determining a target user;
acquiring a video to be recommended;
extracting a plurality of key frames from the video to be recommended;
extracting an image characteristic value of each key frame of the video to be recommended;
selecting key frames meeting preset conditions from a plurality of key frames of the video to be recommended as core scene frames to be recommended according to the image characteristic value of each key frame by adopting a density clustering algorithm;
performing similarity judgment on the core scene frame to be recommended and the pre-stored core scene frames of the historical videos watched by the target user, wherein the core scene frames of the historical videos are obtained in advance by applying the density clustering algorithm to each historical video watched by the target user;
if the core scene frame to be recommended is similar to any core scene frame of a historical video, recommending the video to be recommended to the target user as a recommended video;
the similarity judgment of the core scene frame to be recommended and the pre-stored core scene frames of the historical videos watched by the target user includes:
obtaining each core scene frame of a historical video and a color distribution characteristic value of each core scene frame;
respectively calculating the Euclidean distances to be judged between the color distribution characteristic value of the core scene frame to be recommended and each color distribution characteristic value of the historical video by adopting a density clustering algorithm and taking the color distribution characteristic value of the core scene frame to be recommended and each color distribution characteristic value of the historical video as samples;
if one Euclidean distance to be judged is smaller than a preset neighborhood radius, determining that the core scene frame to be recommended is similar to the corresponding core scene frame of the historical video;
or, alternatively,
if none of the Euclidean distances to be judged is smaller than the preset neighborhood radius, determining that the core scene frame to be recommended is not similar to any core scene frame of the historical video; if the core scene frame to be recommended is not similar to any core scene frame of the historical video, obtaining each core scene frame of the next historical video and the color distribution characteristic value of each core scene frame for similarity judgment;
the selecting, by using a density clustering algorithm, a key frame meeting a preset condition from a plurality of key frames of the video to be recommended as a core scene frame to be recommended according to an image feature value of each key frame includes:
adopting a density clustering algorithm, taking the color distribution characteristic value of each key frame as a sample, and calculating the Euclidean distance between every two samples;
calculating the local density of each sample according to the Euclidean distance between every two samples and a preset neighborhood radius;
determining a core object of a video to be recommended from the samples according to the local density of each sample and a preset minimum neighborhood point number;
and determining the key frame corresponding to the core object of the video to be recommended with the maximum local density as the core scene frame to be recommended.
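Claims 6 through 8 share a single similarity primitive: two core scene frames are similar when the Euclidean distance between their color-distribution feature values falls below the preset neighborhood radius. The sketch below shows that test plus the recommendation loop over a user's watched history; all function and variable names are assumptions.

```python
import numpy as np

def similar(f1, f2, eps):
    """Core scene frames are similar if their color-distribution
    feature values are closer than the preset neighborhood radius."""
    return np.linalg.norm(np.asarray(f1, float) - np.asarray(f2, float)) < eps

def should_recommend(candidate_frames, history, eps):
    """candidate_frames: feature values of the to-be-recommended video's
    core scene frames; history: one feature list per watched video."""
    for video_frames in history:                 # one historical video at a time
        for qf in candidate_frames:
            if any(similar(qf, hf, eps) for hf in video_frames):
                return True                      # one similar frame is enough
    return False                                 # nothing in the history matched
```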
8. A method of video classification, comprising:
acquiring a video to be classified;
extracting a plurality of key frames from the video to be classified;
extracting an image characteristic value of each key frame of the video to be classified;
selecting key frames meeting preset conditions from a plurality of key frames of the video to be classified as core scene frames to be classified according to the image characteristic value of each key frame by adopting a density clustering algorithm;
carrying out similarity judgment on the core scene frame to be classified and the pre-stored core scene frames of the original videos, wherein the core scene frames of the original videos are obtained in advance by applying the density clustering algorithm to each original video;
if any core scene frame of an original video is similar to the core scene frame to be classified, determining the original video as a target video matched with the video to be classified;
outputting the type of the target video to a user as the type of the video to be classified;
the similarity judgment of the core scene frame to be classified and the pre-stored core scene frames of the original videos includes:
obtaining each core scene frame of an original video and a color distribution characteristic value of each core scene frame;
respectively calculating the Euclidean distances to be judged between the color distribution characteristic value of the core scene frame to be classified and each color distribution characteristic value of the original video by adopting a density clustering algorithm and taking the color distribution characteristic value of the core scene frame to be classified and each color distribution characteristic value of the original video as samples;
if one Euclidean distance to be judged is smaller than a preset neighborhood radius, determining that the core scene frame to be classified is similar to the corresponding core scene frame of the original video;
or, alternatively,
if none of the Euclidean distances to be judged is smaller than the preset neighborhood radius, determining that the core scene frame to be classified is not similar to any core scene frame of the original video; if the core scene frame to be classified is not similar to any core scene frame of the original video, obtaining each core scene frame of the next original video and the color distribution characteristic value of each core scene frame for similarity judgment;
the method for selecting the key frames meeting the preset conditions from the plurality of key frames of the video to be classified as the core scene frames to be classified by adopting the density clustering algorithm and according to the image characteristic value of each key frame comprises the following steps:
adopting a density clustering algorithm, taking the color distribution characteristic value of each key frame as a sample, and calculating the Euclidean distance between every two samples;
calculating the local density of each sample according to the Euclidean distance between every two samples and a preset neighborhood radius;
determining a core object of a video to be classified from the samples according to the local density of each sample and a preset minimum neighborhood point number;
and determining the key frame corresponding to the core object of the video to be classified with the maximum local density as the core scene frame to be classified.
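Classification reuses the same matching loop but returns the matched original video's type rather than the video itself. A short sketch, assuming the `similar` test from the previous block and a hypothetical store of `(feature_list, video_type)` pairs:

```python
def classify(candidate_frames, store, eps):
    """store: iterable of (core_scene_feature_list, video_type) pairs."""
    for frames, video_type in store:             # next original video each round
        if any(similar(qf, f, eps)
               for qf in candidate_frames for f in frames):
            return video_type                    # type of the matching original video
    return None                                  # no stored video matched
```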
9. An apparatus for video matching, comprising:
the first acquisition module is used for acquiring a video to be matched;
the first extraction module is used for extracting a plurality of key frames from the video to be matched;
the second extraction module is used for extracting the image characteristic value of each key frame of the video to be matched;
the first selection module is used for selecting key frames meeting preset conditions from a plurality of key frames of the video to be matched as core scene frames to be matched according to the image characteristic value of each key frame by adopting a density clustering algorithm;
the first judgment module is used for carrying out similarity judgment on the core scene frame to be matched and the core scene frames of the pre-stored original videos, wherein the core scene frames of the original videos are obtained in advance by applying the density clustering algorithm to each original video;
the first determining module is used for determining an original video as a target video matched with the video to be matched if any core scene frame of the original video is similar to the core scene frame to be matched;
wherein, the first judging module comprises:
the first obtaining submodule is used for obtaining each core scene frame of an original video and a color distribution characteristic value of each core scene frame;
the second distance calculation submodule is used for respectively calculating the Euclidean distances to be judged between the color distribution characteristic value of the core scene frame to be matched and each color distribution characteristic value of the original video by adopting a density clustering algorithm and taking the color distribution characteristic value of the core scene frame to be matched and each color distribution characteristic value of the original video as samples;
the first judgment submodule is used for determining that the core scene frame to be matched is similar to a core scene frame of the original video if one Euclidean distance to be judged is smaller than a preset neighborhood radius; if none of the Euclidean distances to be judged is smaller than the preset neighborhood radius, determining that the core scene frame to be matched is not similar to any core scene frame of the original video; and if the core scene frame to be matched is not similar to any core scene frame of the original video, obtaining each core scene frame of the next original video and the color distribution characteristic value of each core scene frame for similarity judgment;
the first selecting module comprises:
the eighth distance calculation submodule is used for calculating the Euclidean distance between every two samples by using the color distribution characteristic value of each key frame as a sample by adopting a density clustering algorithm;
the first density calculation submodule is used for calculating the local density of each sample according to the Euclidean distance between every two samples and a preset neighborhood radius;
the first determining submodule is used for determining a core object of a video to be matched from the samples according to the local density of each sample and the number of preset minimum neighborhood points;
and the second determining submodule is used for determining the key frame corresponding to the core object of the video to be matched with the maximum local density as the core scene frame to be matched.
10. The apparatus of claim 9, wherein the first extraction module is specifically configured to:
and extracting a plurality of video frames from the video to be matched as key frames according to a preset time interval.
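Fixed-interval sampling, as in the claim above, is straightforward to realize with OpenCV. A minimal sketch, assuming the interval is given in seconds; `cv2.VideoCapture` and the `CAP_PROP_FPS` property are real OpenCV APIs, while the function and parameter names are illustrative.

```python
import cv2

def extract_key_frames(path, interval_s=2.0):
    """Grab one frame every `interval_s` seconds as a key frame."""
    cap = cv2.VideoCapture(path)
    fps = cap.get(cv2.CAP_PROP_FPS) or 25.0      # fall back if FPS is unreadable
    step = max(int(round(fps * interval_s)), 1)  # frames per sampling interval
    frames, i = [], 0
    while True:
        ok, frame = cap.read()
        if not ok:                               # end of stream or read error
            break
        if i % step == 0:
            frames.append(frame)
        i += 1
    cap.release()
    return frames
```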
11. The apparatus according to claim 9, wherein the second extraction module is specifically configured to:
and, for each extracted key frame, extracting the color distribution characteristic value of the key frame as its image characteristic value.
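The patent does not pin down the exact color-distribution descriptor, so one plausible reading is a normalized color histogram per key frame. A hedged sketch using OpenCV's `calcHist` (a real API); the bin counts and names are assumptions.

```python
import cv2
import numpy as np

def color_feature(frame, bins=(8, 8, 8)):
    """Flattened, L1-normalized BGR histogram as the image feature value."""
    hist = cv2.calcHist([frame], [0, 1, 2], None, list(bins),
                        [0, 256, 0, 256, 0, 256])
    hist = hist.flatten()
    return hist / (hist.sum() + 1e-9)   # normalize so frames are comparable
```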
12. The apparatus of claim 9, further comprising: an original video core scene frame determining module;
an original video core scene frame determination module, comprising:
the first extraction submodule is used for extracting a plurality of key frames of an original video aiming at the original video;
the second extraction submodule is used for extracting the color distribution characteristic value of each key frame of the original video;
the third distance calculation submodule is used for calculating the Euclidean distance between every two samples by using the color distribution characteristic value of each key frame as a sample by adopting a density clustering algorithm;
the second density calculation submodule is used for calculating the local density of each sample according to the Euclidean distance between every two samples and a preset neighborhood radius;
a third determining submodule, configured to determine a core object of the original video from the samples according to the local density of each sample and a preset minimum neighborhood point number, and use all the core objects of the original video as a first core object set;
the first generation submodule is used for generating key frame clusters by taking any core object in the first core object set as a seed and searching for samples that are density-reachable from the seed; and, for each key frame cluster, selecting any core object in the cluster as a first core object, and taking each core object whose local density is greater than that of the first core object as a second core object;
a fourth distance calculating submodule, configured to calculate the distance between the first core object and each second core object, and determine the minimum of these distances as the high-local-density-point distance of the first core object;
a fourth determining submodule, configured to determine a core scene frame of each key frame cluster according to the local density and the high-local-density-point distance of each core object in the cluster, and take the determined core scene frame of each key frame cluster as a core scene frame of the original video.
13. The apparatus of claim 12, wherein the first generation submodule comprises:
the searching submodule is used for searching for all samples that are density-reachable from the seed, taking any non-clustered core object in the first core object set as the seed;
the cluster generation submodule is used for generating a first key frame cluster from the key frames corresponding to all the found samples, and taking the core objects in the first core object set other than those contained in the generated first key frame cluster as a second core object set;
a second judgment submodule, configured to judge whether the second core object set is an empty set; if the second core object set is not an empty set, take the second core object set as the first core object set and trigger the searching submodule; and if the second core object set is an empty set, end the step of generating key frame clusters.
14. An apparatus for video retrieval, comprising:
the second acquisition module is used for acquiring a video to be retrieved;
the third extraction module is used for extracting a plurality of key frames from the video to be retrieved;
the fourth extraction module is used for extracting the image characteristic value of each key frame of the video to be retrieved;
the second selection module is used for selecting key frames meeting preset conditions from a plurality of key frames of the video to be retrieved as core scene frames to be retrieved by adopting a density clustering algorithm according to the image characteristic value of each key frame;
the second judgment module is used for carrying out similarity judgment on the core scene frame to be retrieved and the core scene frames of the pre-stored original videos, wherein the core scene frames of the original videos are obtained in advance by applying the density clustering algorithm to each original video;
the second determining module is used for determining an original video as a target video matched with the video to be retrieved if any core scene frame of the original video is similar to the core scene frame to be retrieved;
the first output module is used for outputting the target video serving as a retrieval result to a user;
wherein, the second judging module comprises:
the second obtaining submodule is used for obtaining each core scene frame of an original video and the color distribution characteristic value of each core scene frame;
a fifth distance calculation submodule, configured to use a density clustering algorithm, take the color distribution characteristic value of the core scene frame to be retrieved and each color distribution characteristic value of the original video as samples, and calculate each euclidean distance to be determined between the color distribution characteristic value of the core scene frame to be retrieved and each color distribution characteristic value of the original video, respectively;
a third judging submodule, configured to determine that the core scene frame to be retrieved is similar to a core scene frame of the original video if one Euclidean distance to be judged is smaller than a preset neighborhood radius; if none of the Euclidean distances to be judged is smaller than the preset neighborhood radius, determine that the core scene frame to be retrieved is not similar to any core scene frame of the original video; and if the core scene frame to be retrieved is not similar to any core scene frame of the original video, obtain each core scene frame of the next original video and the color distribution characteristic value of each core scene frame for similarity judgment;
a second selection module comprising:
the ninth distance calculation submodule is used for calculating the Euclidean distance between every two samples by using the color distribution characteristic value of each key frame as a sample by adopting a density clustering algorithm;
the third density calculation submodule is used for calculating the local density of each sample according to the Euclidean distance between every two samples and a preset neighborhood radius;
a fifth determining submodule, configured to determine a core object of the video to be retrieved from the samples according to the local density of each sample and a preset minimum neighborhood point number;
and the sixth determining submodule is used for determining the key frame corresponding to the core object of the video to be retrieved with the maximum local density as the core scene frame to be retrieved.
15. An apparatus for video recommendation, comprising:
the third determining module is used for determining a target user;
the third acquisition module is used for acquiring a video to be recommended;
the fifth extraction module is used for extracting a plurality of key frames from the video to be recommended;
the sixth extraction module is used for extracting the image characteristic value of each key frame of the video to be recommended;
the third selection module is used for selecting a key frame meeting preset conditions from a plurality of key frames of the video to be recommended as a core scene frame to be recommended according to the image characteristic value of each key frame by adopting a density clustering algorithm;
the third judgment module is used for carrying out similarity judgment on the core scene frame to be recommended and the pre-stored core scene frames of the historical videos watched by the target user, wherein the core scene frames of the historical videos are obtained in advance by applying the density clustering algorithm to each historical video watched by the target user;
the second output module is used for recommending the video to be recommended to the target user as a recommended video if the core scene frame to be recommended is similar to any core scene frame of a historical video;
wherein, the third judging module comprises:
the third acquisition submodule is used for acquiring each core scene frame of a historical video and the color distribution characteristic value of each core scene frame;
a sixth distance calculation submodule, configured to use a density clustering algorithm, take the color distribution characteristic value of the core scene frame to be recommended and each color distribution characteristic value of the historical video as samples, and calculate each euclidean distance to be determined between the color distribution characteristic value of the core scene frame to be recommended and each color distribution characteristic value of the historical video, respectively;
the fourth judgment submodule is used for determining that the core scene frame to be recommended is similar to a core scene frame of the historical video if one Euclidean distance to be judged is smaller than a preset neighborhood radius; if none of the Euclidean distances to be judged is smaller than the preset neighborhood radius, determining that the core scene frame to be recommended is not similar to any core scene frame of the historical video; and if the core scene frame to be recommended is not similar to any core scene frame of the historical video, obtaining each core scene frame of the next historical video and the color distribution characteristic value of each core scene frame for similarity judgment;
a third selecting module, comprising:
the tenth distance calculation submodule is used for calculating the Euclidean distance between every two samples by using the color distribution characteristic value of each key frame as a sample by adopting a density clustering algorithm;
the fourth density calculation submodule is used for calculating the local density of each sample according to the Euclidean distance between every two samples and a preset neighborhood radius;
a seventh determining submodule, configured to determine a core object of the video to be recommended from the samples according to the local density of each sample and a preset minimum neighborhood point number;
and the eighth determining submodule is used for determining the key frame corresponding to the core object of the video to be recommended with the maximum local density as the core scene frame to be recommended.
16. An apparatus for video classification, comprising:
the fourth acquisition module is used for acquiring the video to be classified;
the seventh extraction module is used for extracting a plurality of key frames from the video to be classified;
the eighth extraction module is used for extracting the image characteristic value of each key frame of the video to be classified;
the fourth selection module is used for selecting key frames meeting preset conditions from a plurality of key frames of the video to be classified as core scene frames to be classified according to the image characteristic value of each key frame by adopting a density clustering algorithm;
the fourth judgment module is used for carrying out similarity judgment on the core scene frame to be classified and the pre-stored core scene frames of the original videos, wherein the core scene frames of the original videos are obtained in advance by applying the density clustering algorithm to each original video;
a third determining module, configured to determine, if any core scene frame of an original video is similar to the core scene frame to be classified, the original video as a target video matching the video to be classified;
the third output module is used for outputting the type of the target video as the type of the video to be classified to a user;
wherein, the fourth judging module includes:
the fourth obtaining submodule is used for obtaining each core scene frame of an original video and the color distribution characteristic value of each core scene frame;
a seventh distance calculating submodule, configured to use a density clustering algorithm, take the color distribution characteristic value of the core scene frame to be classified and each color distribution characteristic value of the original video as samples, and calculate each euclidean distance to be determined between the color distribution characteristic value of the core scene frame to be classified and each color distribution characteristic value of the original video, respectively;
a fifth judging submodule, configured to determine that the core scene frame to be classified is similar to a core scene frame of the original video if one Euclidean distance to be judged is smaller than a preset neighborhood radius; if none of the Euclidean distances to be judged is smaller than the preset neighborhood radius, determine that the core scene frame to be classified is not similar to any core scene frame of the original video; and if the core scene frame to be classified is not similar to any core scene frame of the original video, obtain each core scene frame of the next original video and the color distribution characteristic value of each core scene frame for similarity judgment;
a fourth selecting module, comprising:
the eleventh distance calculation submodule is used for calculating the Euclidean distance between every two samples by using the color distribution characteristic value of each key frame as a sample by adopting a density clustering algorithm;
the fifth density calculation submodule is used for calculating the local density of each sample according to the Euclidean distance between every two samples and a preset neighborhood radius;
a ninth determining submodule, configured to determine a core object of a video to be classified from the samples according to the local density of each sample and a preset minimum neighborhood point number;
and the tenth determining submodule is used for determining the key frame corresponding to the core object of the video to be classified with the maximum local density as the core scene frame to be classified.
17. An electronic device, comprising a processor, a communication interface, a memory, and a communication bus, wherein the processor, the communication interface, and the memory communicate with one another through the communication bus;
a memory for storing a computer program;
a processor for implementing the method steps of any one of claims 1 to 5 when executing a program stored in the memory.
18. An electronic device, comprising a processor, a communication interface, a memory, and a communication bus, wherein the processor, the communication interface, and the memory communicate with one another through the communication bus;
a memory for storing a computer program;
a processor for implementing the method steps of claim 6 when executing a program stored in the memory.
19. An electronic device, comprising a processor, a communication interface, a memory, and a communication bus, wherein the processor, the communication interface, and the memory communicate with one another through the communication bus;
a memory for storing a computer program;
a processor for implementing the method steps of claim 7 when executing a program stored in the memory.
20. An electronic device, comprising a processor, a communication interface, a memory, and a communication bus, wherein the processor, the communication interface, and the memory communicate with one another through the communication bus;
a memory for storing a computer program;
a processor for implementing the method steps of claim 8 when executing a program stored in the memory.
CN201810177607.5A 2018-03-02 2018-03-02 Video matching, retrieving, classifying and recommending methods and devices and electronic equipment Active CN108416013B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810177607.5A CN108416013B (en) 2018-03-02 2018-03-02 Video matching, retrieving, classifying and recommending methods and devices and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810177607.5A CN108416013B (en) 2018-03-02 2018-03-02 Video matching, retrieving, classifying and recommending methods and devices and electronic equipment

Publications (2)

Publication Number Publication Date
CN108416013A CN108416013A (en) 2018-08-17
CN108416013B true CN108416013B (en) 2020-12-18

Family

ID=63129742

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810177607.5A Active CN108416013B (en) 2018-03-02 2018-03-02 Video matching, retrieving, classifying and recommending methods and devices and electronic equipment

Country Status (1)

Country Link
CN (1) CN108416013B (en)

Families Citing this family (34)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110891191B (en) * 2018-09-07 2022-06-07 阿里巴巴(中国)有限公司 Material selection method, device and storage medium
CN109308463B (en) * 2018-09-12 2021-08-13 北京奇艺世纪科技有限公司 Video target identification method, device and equipment
CN109410198B (en) * 2018-10-25 2022-04-22 北京奇艺世纪科技有限公司 Time sequence action detection method, device and equipment
CN109508408B (en) * 2018-10-25 2021-07-30 北京陌上花科技有限公司 Video retrieval method based on frame density and computer readable storage medium
CN109547814B (en) * 2018-12-13 2021-07-16 北京达佳互联信息技术有限公司 Video recommendation method and device, server and storage medium
CN111382305B (en) * 2018-12-29 2023-05-12 广州市百果园信息技术有限公司 Video deduplication method, video deduplication device, computer equipment and storage medium
CN109726762B (en) * 2018-12-29 2021-08-27 北京达佳互联信息技术有限公司 Video type determination method and device, electronic equipment and storage medium
CN109934142B (en) * 2019-03-04 2021-07-06 北京字节跳动网络技术有限公司 Method and apparatus for generating feature vectors of video
CN110019950A (en) * 2019-03-22 2019-07-16 广州新视展投资咨询有限公司 Video recommendation method and device
CN109982106B (en) * 2019-04-29 2021-11-26 百度在线网络技术(北京)有限公司 Video recommendation method, server, client and electronic equipment
CN111949819A (en) * 2019-05-15 2020-11-17 北京字节跳动网络技术有限公司 Method and device for pushing video
CN110175267B (en) * 2019-06-04 2020-07-07 黑龙江省七星农场 Agricultural Internet of things control processing method based on unmanned aerial vehicle remote sensing technology
CN110290419B (en) * 2019-06-25 2021-11-26 北京奇艺世纪科技有限公司 Video playing method and device and electronic equipment
CN112131423A (en) * 2019-06-25 2020-12-25 杭州海康威视数字技术股份有限公司 Picture acquisition method, device and system
CN110267097A (en) * 2019-06-26 2019-09-20 北京字节跳动网络技术有限公司 Video pushing method, device and electronic equipment based on characteristic of division
CN110688524B (en) * 2019-09-24 2023-04-14 深圳市网心科技有限公司 Video retrieval method and device, electronic equipment and storage medium
CN110879967B (en) * 2019-10-16 2023-02-17 厦门美柚股份有限公司 Video content repetition judgment method and device
CN110796088B (en) * 2019-10-30 2023-07-04 行吟信息科技(上海)有限公司 Video similarity judging method and device
CN110929771B (en) * 2019-11-15 2020-11-20 北京达佳互联信息技术有限公司 Image sample classification method and device, electronic equipment and readable storage medium
CN111298433B (en) * 2020-02-10 2022-07-29 腾讯科技(深圳)有限公司 Animation video processing method and device, electronic equipment and storage medium
CN111428590B (en) * 2020-03-11 2023-05-09 新华智云科技有限公司 Video clustering segmentation method and system
CN111831615B (en) * 2020-05-28 2024-03-12 北京达佳互联信息技术有限公司 Method, device and system for generating video file
CN111782874B (en) * 2020-06-30 2023-01-17 科大讯飞股份有限公司 Video retrieval method, video retrieval device, electronic equipment and storage medium
CN111949827B (en) * 2020-07-29 2023-10-24 深圳神目信息技术有限公司 Video plagiarism detection method, device, equipment and medium
CN112199582B (en) * 2020-09-21 2023-07-18 聚好看科技股份有限公司 Content recommendation method, device, equipment and medium
CN112364835B (en) * 2020-12-09 2023-08-11 武汉轻工大学 Video information frame taking method, device, equipment and storage medium
CN112839257B (en) * 2020-12-31 2023-05-09 四川金熊猫新媒体有限公司 Video content detection method, device, server and storage medium
CN113069769B (en) * 2021-04-20 2022-07-29 腾讯科技(深圳)有限公司 Cloud game interface display method and device, electronic equipment and storage medium
CN113254703A (en) * 2021-05-12 2021-08-13 北京百度网讯科技有限公司 Video matching method, video processing device, electronic equipment and medium
CN113453070B (en) * 2021-06-18 2023-01-03 北京灵汐科技有限公司 Video key frame compression method and device, storage medium and electronic equipment
CN113965806B (en) * 2021-10-28 2022-05-06 腾讯科技(深圳)有限公司 Video recommendation method and device and computer-readable storage medium
CN116137671A (en) * 2021-11-17 2023-05-19 北京字跳网络技术有限公司 Cover generation method, device, equipment and medium
CN115858855B (en) * 2023-02-28 2023-05-05 江西师范大学 Video data query method based on scene characteristics
CN117714712B (en) * 2024-02-01 2024-05-07 浙江华创视讯科技有限公司 Data steganography method, equipment and storage medium for video conference


Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN100545856C (en) * 2006-10-11 2009-09-30 北京新岸线网络技术有限公司 Video content analysis system
WO2013116779A1 (en) * 2012-02-01 2013-08-08 Futurewei Technologies, Inc. System and method for organizing multimedia content
CN104376003B (en) * 2013-08-13 2019-07-05 深圳市腾讯计算机系统有限公司 A kind of video retrieval method and device
CN104239566B (en) * 2014-09-28 2019-02-12 小米科技有限责任公司 The method and device of video search
CN105468781A (en) * 2015-12-21 2016-04-06 小米科技有限责任公司 Video query method and device
US11144761B2 (en) * 2016-04-04 2021-10-12 Xerox Corporation Deep data association for online multi-class multi-object tracking
CN107590420A (en) * 2016-07-07 2018-01-16 北京新岸线网络技术有限公司 Scene extraction method of key frame and device in video analysis
CN107153670B (en) * 2017-01-23 2020-08-14 合肥麟图信息科技有限公司 Video retrieval method and system based on multi-image fusion

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107590150A (en) * 2016-07-07 2018-01-16 北京新岸线网络技术有限公司 Video analysis implementation method and device based on key frame
CN106503112A (en) * 2016-10-18 2017-03-15 大唐软件技术股份有限公司 Video retrieval method and device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Multimodal video scene segmentation algorithm based on genetic algorithm; Zhao Jiexue et al.; Journal of Wuhan University of Technology (Information & Management Engineering Edition); 2015-12-15; Vol. 37, No. 6; pp. 841-845 *

Also Published As

Publication number Publication date
CN108416013A (en) 2018-08-17

Similar Documents

Publication Publication Date Title
CN108416013B (en) Video matching, retrieving, classifying and recommending methods and devices and electronic equipment
CN110297848B (en) Recommendation model training method, terminal and storage medium based on federal learning
CN111666450B (en) Video recall method, device, electronic equipment and computer readable storage medium
US9740775B2 (en) Video retrieval based on optimized selected fingerprints
CN108776676B (en) Information recommendation method and device, computer readable medium and electronic device
JP5229744B2 (en) Image classification device and image classification program
US8515933B2 (en) Video search method, video search system, and method thereof for establishing video database
CN105635824B (en) Personalized channel recommendation method and system
CN108154086B (en) Image extraction method and device and electronic equipment
CN104113784B (en) Intelligent television system and its method
EP2234024A1 (en) Context based video finder
JP2011505086A (en) System and method for processing digital media
CN105653572A (en) Resource processing method and apparatus
JP2008217428A (en) Image-retrieving program, method, and device
CN112507163B (en) Duration prediction model training method, recommendation method, device, equipment and medium
CN112132208B (en) Image conversion model generation method and device, electronic equipment and storage medium
US20120042041A1 (en) Information processing apparatus, information processing system, information processing method, and program
CN107592572B (en) Video recommendation method, device and equipment
CN112100513A (en) Knowledge graph-based recommendation method, device, equipment and computer readable medium
CN106407268B (en) Content retrieval method and system based on coverage optimization method
US20210182566A1 (en) Image pre-processing method, apparatus, and computer program
JP2008123210A (en) Information retrieval device and information retrieval method
CN111291217A (en) Content recommendation method and device, electronic equipment and computer readable medium
CN110309361B (en) Video scoring determination method, recommendation method and device and electronic equipment
JP5116811B2 (en) Program recommendation device, method and program

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant