CN111444137A

CN111444137A - Multimedia file identity recognition method based on feature codes

Info

Publication number: CN111444137A
Application number: CN202010223841.4A
Authority: CN
Inventors: 罗尉
Original assignee: Hunan Seud Network Science & Technology Co ltd
Current assignee: Hunan Seud Network Science & Technology Co ltd
Priority date: 2020-03-26
Filing date: 2020-03-26
Publication date: 2020-07-24

Abstract

The invention discloses a multimedia file identity recognition method based on feature codes, which comprises the following steps: firstly, a client acquires an original multimedia file and initiates an identification request; secondly, a processor receives the multimedia file identification request from the client and identifies the format of the multimedia file; thirdly, the feature points of the picture, audio or video file are detected and extracted; fourthly, the client acquires a multimedia file to be matched and sends a request to the processor; fifthly, the processor receives the request and performs the second and third steps on the multimedia file to be matched; and sixthly, the processor compares the feature points of the original multimedia file with those of the multimedia file to be matched, calculates the matching degree, and feeds the result back to the client. The method is efficient: it reduces complex multimedia comparison to the comparison of simple, stable feature codes and shortens manual or machine comparison time.

Description

Multimedia file identity recognition method based on feature codes

Technical Field

The invention relates to the field of multimedia file identification, in particular to a multimedia file identity identification method based on feature codes.

Background

Internet multimedia files, including pictures, text, sound and video, are copied and spread easily and quickly because they exist on network carriers in digitally encoded form. More and more infringement disputes arise as these files propagate, so how to identify infringing behavior rapidly, and thereby prevent infringement cases, is a problem that remains to be solved on the basis of the prior art.

As is well known, neither instant messaging tools (such as WeChat) nor information-sharing websites (such as microblogs) can fundamentally stop picture infringement at its source. The data volume of multimedia files is huge, and the files cannot effectively be compared one by one for infringement risk before an infringement happens.

Traditional multimedia comparison requires a large amount of storage space and is costly, and because the files are scattered across all corners of the network, the comparison takes a very long time.

Disclosure of Invention

The technical problem to be solved by the invention is to provide a multimedia file identity recognition method based on feature codes that is efficient, reduces the comparison of complex multimedia files to the comparison of simple, stable feature codes, and shortens manual or machine comparison time.

In order to solve the technical problems, the invention adopts the following technical scheme:

the invention provides a multimedia file identity recognition method based on a feature code, which comprises the following steps:

firstly, a client acquires an original multimedia file and initiates an identification request;

secondly, a processor receives the multimedia file identification request from the client and identifies the format of the multimedia file;

(1) if the identified multimedia file format is a picture file, performing the following operations;

if the multimedia picture file is a color image, carrying out graying processing on the multimedia picture file;

carrying out image filtering processing on the multimedia picture file subjected to the graying processing;

carrying out file structure analysis and shape description on the multimedia picture file;

detecting and extracting picture characteristic points of the multimedia picture file;

(2) if the identified multimedia file format is an audio file, the following operations are performed:

carrying out audio filtering processing on the multimedia audio file;

detecting and extracting audio characteristic points of the multimedia audio file;

(3) if the identified multimedia file format is a video file, the following operations are performed:

extracting the pictures of the multimedia video file;

extracting corresponding picture files and audio files in the multimedia video files;

executing the step (1) on the extracted picture file;

performing the step (2) on the extracted audio file;

integrating the picture characteristic points and the audio characteristic points extracted in the above steps;

thirdly, the client collects a multimedia file to be matched and sends a request to the processor;

fourthly, the processor receives the request and performs the second step on the multimedia file to be matched;

fifthly, the processor compares the characteristic points of the original multimedia file with those of the multimedia file to be matched, calculates the matching degree, and feeds the result back to the client;

the detection and extraction of the picture feature points in the step (1) comprise the following steps:

a1, identifying, through a difference function, extreme points that are invariant to illumination and scale changes, wherein the extreme points are candidate feature points;

a2, filtering out candidate characteristic points with poor stability on the basis of the candidate characteristic points through fitting judgment;

a3, allocating a plurality of vector directions to the selected characteristic points;

a4, calculating the rotation invariant characteristic in the neighborhood of each determined feature point according to the distributed vector direction;

the audio characteristic point detection and extraction in the step (2) comprises the following steps:

b1, obtaining the amplitude with which the energy value of the sound changes over time, and then squaring the amplitude to obtain the short-time energy as the first characteristic of the audio file;

b2, calculating the times of the audio signal passing through zero value in each frame, namely, the short-time zero-crossing rate as the second characteristic of the audio file;

b3, calculating the degree of correlation of the signal, namely a short-time autocorrelation function as a third characteristic of the audio file.

Furthermore, in the second step, the multimedia file identification identifies the file format by collecting file headers, and the picture files comprise file headers such as jpg, png, tif and bmp; the audio files comprise file headers such as wav, flac, ape, alac, cda, mp3 and aac; the video file comprises file headers such as rm, rmvb, mpeg1-4, mov, mtv, dat, wmv and avi.
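
The header-based identification described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `identify_format` and the table `SIGNATURES` are hypothetical, and only a handful of widely documented file signatures are listed rather than every format the text names.

```python
# Hypothetical signature table: (leading bytes, media kind, format name).
# Only a few well-known signatures are included for illustration.
SIGNATURES = [
    (b"\xff\xd8\xff", "picture", "jpg"),
    (b"\x89PNG\r\n\x1a\n", "picture", "png"),
    (b"II*\x00", "picture", "tif"),   # little-endian TIFF
    (b"MM\x00*", "picture", "tif"),   # big-endian TIFF
    (b"BM", "picture", "bmp"),
    (b"fLaC", "audio", "flac"),
    (b"ID3", "audio", "mp3"),         # MP3 file that begins with an ID3v2 tag
    (b".RMF", "video", "rm"),
]


def identify_format(header: bytes):
    """Return (kind, format) for the leading bytes of a file, or None if unknown."""
    # RIFF containers (wav, avi) carry their subtype in bytes 8..12.
    if header[:4] == b"RIFF" and len(header) >= 12:
        subtype = header[8:12]
        if subtype == b"WAVE":
            return ("audio", "wav")
        if subtype == b"AVI ":
            return ("video", "avi")
    for magic, kind, fmt in SIGNATURES:
        if header.startswith(magic):
            return (kind, fmt)
    return None
```

In use, the first few dozen bytes of the file would be read and passed to `identify_format`; the returned kind then selects the picture, audio or video branch of the second step.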

Further, in the step (1), the picture graying processing sets the R, G and B components of each pixel of the color image to the same value, between a minimum of 0 and a maximum of 255, so that the color image is converted into a gray image.
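
A minimal sketch of the graying step follows. The text does not say how the common value of the three components is chosen, so taking the integer mean of R, G and B is an assumption here, and the function names are hypothetical.

```python
def to_gray(rgb_image):
    """Convert rows of (R, G, B) tuples into rows of gray values in 0..255.

    Assumption: the common value is the integer mean of the three components.
    """
    return [[(r + g + b) // 3 for (r, g, b) in row] for row in rgb_image]


def as_equal_rgb(gray_image):
    """Expand gray values back to pixels whose R, G and B components are equal."""
    return [[(v, v, v) for v in row] for row in gray_image]
```

A weighted luminance formula could be substituted for the mean without changing the rest of the pipeline.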

Further, the image filtering processing in the step (1) adopts nonlinear filtering to enhance the image; the filter is a compromise that combines the spatial proximity and the pixel value similarity of the image, taking both spatial information and gray level similarity into account.
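
A filter that weights each neighbor by both spatial proximity and gray level similarity is commonly called a bilateral filter. The text does not name the filter, so that identification is an assumption, as are the function name and the default parameters in the sketch below.

```python
import math


def bilateral_filter(img, radius=1, sigma_space=1.0, sigma_range=25.0):
    """Smooth a 2D list of gray values while preserving edges.

    Each neighbor is weighted by spatial closeness (sigma_space) and by
    similarity of its gray value to the center pixel (sigma_range).
    """
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            center = img[y][x]
            total = 0.0
            weight_sum = 0.0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yy, xx = y + dy, x + dx
                    if not (0 <= yy < h and 0 <= xx < w):
                        continue  # neighbors outside the image are ignored
                    diff = img[yy][xx] - center
                    w_space = math.exp(-(dx * dx + dy * dy) / (2 * sigma_space ** 2))
                    w_range = math.exp(-(diff * diff) / (2 * sigma_range ** 2))
                    total += w_space * w_range * img[yy][xx]
                    weight_sum += w_space * w_range
            out[y][x] = total / weight_sum
    return out
```

With a small `sigma_range`, pixels on the far side of a sharp edge receive almost no weight, which is why edges survive the smoothing.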

Further, in the step (1), the picture file structure analysis and the shape description retrieve contours from a binary image, in which non-zero pixels are treated as 1 and zero pixels are kept unchanged.
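
The binarization rule above, followed by a very simple notion of contour, can be sketched as below. The text does not specify the contour retrieval algorithm; treating a contour pixel as a foreground pixel with at least one background or out-of-image 4-neighbor is an assumption, and both function names are hypothetical.

```python
def binarize(gray_image):
    """Treat every non-zero pixel as 1 and keep zero pixels unchanged."""
    return [[1 if v != 0 else 0 for v in row] for row in gray_image]


def boundary_pixels(binary):
    """Return (x, y) of foreground pixels that touch the background or the border."""
    h, w = len(binary), len(binary[0])
    points = []
    for y in range(h):
        for x in range(w):
            if binary[y][x] != 1:
                continue
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                yy, xx = y + dy, x + dx
                outside = not (0 <= yy < h and 0 <= xx < w)
                if outside or binary[yy][xx] == 0:
                    points.append((x, y))
                    break
    return points
```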

Further, the audio filtering process in the step (2) adopts a linear phase characteristic filter design; the phase shift that the filter imposes on sine waves of different frequencies is a straight-line function of frequency, so that after a signal passes through the filter, every component within the pass band is retained without distortion, apart from a time delay determined by the slope of the phase shift characteristic.
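
One common way to obtain a linear phase characteristic is an FIR filter with symmetric coefficients, whose delay is (N - 1) / 2 samples for N taps. The sketch below designs such a filter by the windowed-sinc method; the low-pass shape, the Hamming window, the tap count and the cutoff are all assumptions, since the text specifies only the linear phase property.

```python
import math


def lowpass_fir(num_taps=21, cutoff=0.2):
    """Windowed-sinc low-pass taps; cutoff is in cycles per sample (0 to 0.5).

    The taps are symmetric about the center, which gives linear phase.
    """
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        t = n - mid
        if t == 0:
            ideal = 2 * cutoff
        else:
            ideal = math.sin(2 * math.pi * cutoff * t) / (math.pi * t)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))  # Hamming
        taps.append(ideal * window)
    total = sum(taps)
    return [c / total for c in taps]  # normalize to unity gain at zero frequency


def fir_filter(signal, taps):
    """Direct-form convolution; samples before the start are taken as zero."""
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k, c in enumerate(taps):
            if n - k >= 0:
                acc += c * signal[n - k]
        out.append(acc)
    return out
```

Because every pass-band component is delayed by the same number of samples, the waveform shape inside the pass band is preserved, which matches the behavior the paragraph describes.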

Further, the step a1 specifically includes the following steps:

c1, representing multiple scales by a Gaussian scale space, wherein the Gaussian scale space of an image is obtained by convolving the image with Gaussians of different scales:

L(x，y，σ)＝G(x，y，σ)*I(x，y)

where x and y are the coordinates of a sample point, σ is the scale space parameter, L(x, y, σ) is the Gaussian scale space, and G(x, y, σ) is the Gaussian kernel function:

the scale space parameter is obtained from the standard deviation of the Gaussian normal distribution function, and the larger its value, the larger the resulting scale; to reduce the amount of computation, the system computes a difference of Gaussians, selected for large images according to the size of the image:

D(x，y，σ)＝L(x，y，kσ)-L(x，y，σ)

the Gaussian calculation is performed on each layer of the image, and once it is complete the layers are arranged from large to small, from bottom to top;

c2, subtracting adjacent arrangements to obtain a response image;

c3, traversing each point in the response image space, comparing the point with its neighboring points, and determining the point to be a characteristic point only when its gray value is greater than, or less than, the gray values of all points in its neighborhood;
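
Steps c1 to c3 can be sketched in plain Python as follows. Several details are assumptions rather than statements from the text: the Gaussian kernel is taken to be the usual normalized Gaussian with standard deviation σ, the scale factor k defaults to the square root of 2, edges are clamped, and the neighborhood in c3 is taken to be the 26 surrounding points in the same and the two adjacent response layers.

```python
import math


def gaussian_blur(img, sigma):
    """Separable Gaussian blur of a 2D list, i.e. L = G * I, with clamped edges."""
    radius = max(1, int(3 * sigma))
    kernel = [math.exp(-(i * i) / (2 * sigma * sigma)) for i in range(-radius, radius + 1)]
    norm = sum(kernel)
    kernel = [k / norm for k in kernel]
    h, w = len(img), len(img[0])

    def clamp(v, high):
        return min(max(v, 0), high)

    rows = [[sum(kernel[i + radius] * img[y][clamp(x + i, w - 1)]
                 for i in range(-radius, radius + 1)) for x in range(w)]
            for y in range(h)]
    return [[sum(kernel[i + radius] * rows[clamp(y + i, h - 1)][x]
                 for i in range(-radius, radius + 1)) for x in range(w)]
            for y in range(h)]


def difference_of_gaussians(img, sigma, k=math.sqrt(2), levels=4):
    """Response images D = L(k * s) - L(s) for successive scales s."""
    blurred = [gaussian_blur(img, sigma * k ** i) for i in range(levels)]
    return [[[b[y][x] - a[y][x] for x in range(len(img[0]))]
             for y in range(len(img))]
            for a, b in zip(blurred, blurred[1:])]


def local_extrema(dog):
    """Return (x, y, layer) of points strictly above or below all 26 neighbors."""
    found = []
    for s in range(1, len(dog) - 1):
        for y in range(1, len(dog[0]) - 1):
            for x in range(1, len(dog[0][0]) - 1):
                v = dog[s][y][x]
                neighbors = [dog[s + ds][y + dy][x + dx]
                             for ds in (-1, 0, 1)
                             for dy in (-1, 0, 1)
                             for dx in (-1, 0, 1)
                             if (ds, dy, dx) != (0, 0, 0)]
                if v > max(neighbors) or v < min(neighbors):
                    found.append((x, y, s))
    return found
```

The points returned here are only candidates; the stability filtering of step a2 and the orientation assignment of steps a3 and a4 are not covered by this sketch.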

Further, the short-time energy in step b1 is calculated as:

where n denotes a point, x (n) is an audio signal, w (n) is a window function, and STE is a result value of the short-time energy calculation.
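
The formula itself is not reproduced in this text. As a hedged illustration, the sketch below uses the common textbook definition, in which the energy of a frame is the sum of the squared windowed samples x(n)w(n). The framing parameters `frame_len` and `hop`, and the default rectangular window, are assumptions.

```python
def short_time_energy(signal, frame_len, hop, window=None):
    """Return one energy value per frame: the sum of (x(n) * w(n)) ** 2.

    Assumption: a rectangular window (all ones) is used when none is given.
    """
    if window is None:
        window = [1.0] * frame_len
    energies = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        energies.append(sum((x * w) ** 2 for x, w in zip(frame, window)))
    return energies
```

Loud frames yield large values and silent frames values near zero, which is what makes the quantity usable as a first audio characteristic.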

Further, the short-time zero-crossing rate in step b2 is calculated as:

where x (n) is the audio signal, sgn [ ] is the sign function, and ZCR is the result of the short-time zero-crossing rate calculation.
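
The formula is likewise missing here. A minimal sketch under the usual textbook definition counts half the sum of |sgn[x(n)] - sgn[x(n-1)]| over each frame; treating sgn as 1 for non-negative samples and -1 otherwise is an assumption, as are the function names.

```python
def sgn(value):
    """Sign function; zero is treated as positive (an assumption)."""
    return 1 if value >= 0 else -1


def zero_crossings(frame):
    """Number of sign changes between consecutive samples of one frame."""
    return sum(abs(sgn(frame[n]) - sgn(frame[n - 1])) for n in range(1, len(frame))) // 2


def short_time_zcr(signal, frame_len, hop):
    """Zero-crossing count for each frame of the signal."""
    return [zero_crossings(signal[start:start + frame_len])
            for start in range(0, len(signal) - frame_len + 1, hop)]
```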

Further, the calculation of the short-time autocorrelation function in step b3 is:

where x_i(n) denotes the ith frame of the audio signal, L denotes the length of each frame after the audio signal is framed, K is the delay amount, and STAF is the result of the short-time autocorrelation function calculation.
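
With the formula absent from this text, the sketch below assumes the standard form in which the autocorrelation of a frame at delay k is the sum of x_i(n) * x_i(n + k) for n from 0 to L - 1 - k. The function name and the `max_lag` parameter are hypothetical.

```python
def short_time_autocorrelation(frame, max_lag):
    """Autocorrelation of one frame for delays 0..max_lag.

    For delay k, the products frame[n] * frame[n + k] are summed over
    every n for which both samples lie inside the frame.
    """
    length = len(frame)
    return [sum(frame[n] * frame[n + k] for n in range(length - k))
            for k in range(max_lag + 1)]
```

For a periodic frame the values peak again at delays equal to multiples of the period, which is why the function serves as a measure of how strongly the signal correlates with itself.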

The invention has the beneficial effects that:

1. the method has high efficiency, simplifies the complex multimedia comparison into simple stable feature code comparison, and shortens the manual or machine comparison time.

2. The method is widely applicable: the comparison of stable feature codes can be applied not only to multimedia infringement but also to the healthy development of the Internet and to national defense security, for example in identifying reactionary pictures, videos and speech.

Drawings

In order to illustrate the embodiments of the present invention or the technical solutions of the prior art more clearly, the drawings used in describing them are briefly introduced below. The drawings described below show only some embodiments of the present invention, and those skilled in the art can derive other drawings from them without creative effort.

Fig. 1 is a flowchart of a multimedia file identification method according to an embodiment of the present invention.

Detailed Description

In order to better illustrate the content of the invention, the invention is further verified by the following specific examples. It should be noted that the examples are given for the purpose of describing the invention more directly and are only a part of the present invention, which should not be construed as limiting the invention in any way.

As shown in fig. 1, an embodiment of the present invention provides a multimedia file identity recognition method based on feature codes, including the following steps:

acquiring file headers to identify file formats, wherein the picture files comprise file headers such as jpg, png, tif and bmp; the audio files comprise file headers such as wav, flac, ape, alac, cda, mp3 and aac; the video file comprises file headers such as rm, rmvb, mpeg1-4, mov, mtv, dat, wmv and avi.

if the multimedia picture file is a color image, carrying out graying processing on the multimedia picture file;

the R, G and B components of each pixel of the color image are set to the same value, between a minimum of 0 and a maximum of 255, so that the color image is converted into a gray image.

the image is enhanced by nonlinear filtering, which is a compromise that combines the spatial proximity and the pixel value similarity of the image, taking both spatial information and gray level similarity into account.

contours are retrieved from a binary image, in which non-zero pixels are treated as 1 and zero pixels are kept unchanged.

The method for detecting and extracting the picture characteristic points of the multimedia picture file comprises the following steps:

L(x，y，σ)＝G(x，y，σ)*I(x，y)

D(x，y，σ)＝L(x，y，kσ)-L(x，y，σ)

c2, subtracting adjacent arrangements to obtain a response image;

c4, assigning values in combination with the orientation, so as to extract rotation-invariant characteristics of the image.

(2) The identified multimedia file format is an audio file;

carrying out audio filtering processing on the multimedia audio file;

a linear phase characteristic filter design is adopted; the phase shift that the filter imposes on sine waves of different frequencies is a straight-line function of frequency, so that after a signal passes through the filter, every component within the pass band is retained without distortion, apart from a time delay determined by the slope of the phase shift characteristic.

b1, obtaining the amplitude of the sound energy value changing with time, then squaring the amplitude to obtain the short-time energy as a characteristic of the audio file, and calculating the short-time energy as follows:

b2, calculating the number of times that the audio signal passes through zero value in each frame, namely, the short-time zero-crossing rate is taken as the second characteristic of the audio file, and the calculation of the short-time zero-crossing rate is as follows:

b3, calculating the degree of correlation of the signal, namely, calculating a short-time autocorrelation function as a third characteristic of the audio file, wherein the short-time autocorrelation function is calculated as follows:

(3) The identified multimedia file format is a video file;

extracting the pictures of the multimedia video file;

executing the step (1) on the extracted picture file;

performing the step (2) on the extracted audio file;

The specific embodiments described herein are merely illustrative of the spirit of the invention. Those skilled in the art may modify or supplement the described embodiments, or replace them with alternatives, without departing from the spirit of the invention or the scope defined in the appended claims.

Claims

1. A multimedia file identity recognition method based on feature codes is characterized by comprising the following steps:

carrying out audio filtering processing on the multimedia audio file;

extracting the pictures of the multimedia video file;

executing the step (1) on the extracted picture file;

performing the step (2) on the extracted audio file;

a3, distributing a plurality of vector directions to the selected feature points;

2. The method as claimed in claim 1, wherein the multimedia file identification in step two is performed by collecting a file header to identify a file format.

3. The method for identifying the identity of the multimedia file based on the feature code according to claim 1, wherein the graying of the picture in step (1) sets the R, G and B components of each pixel of the color image to the same value, between a minimum of 0 and a maximum of 255, thereby converting the color image into a gray image.

4. The method for identifying the identity of the multimedia file based on the feature code according to claim 1, wherein the image filtering processing in the step (1) adopts nonlinear filtering to enhance the image, and the compromise processing is performed by combining the spatial proximity and the pixel value similarity of the image, and simultaneously the spatial information and the gray level similarity are considered.

5. The method for identifying the identity of the multimedia file based on the feature code according to claim 1, wherein the picture file structure analysis and the shape description in step (1) retrieve contours from a binary image, in which non-zero pixels are treated as 1 and zero pixels are kept unchanged.

6. The method for identifying the identity of the multimedia file based on the feature code of claim 1, wherein the audio filtering process in the step (2) adopts a linear phase characteristic filter design; the phase shift that the filter imposes on sine waves of different frequencies is a straight-line function of frequency, so that after a signal passes through the filter, every component within the pass band is retained without distortion, apart from a time delay determined by the slope of the phase shift characteristic.

7. The method for identifying the identity of the multimedia file based on the feature code according to claim 1, wherein the step a1 specifically comprises the following steps:

L(x，y，σ)＝G(x，y，σ)*I(x，y)

where x and y are sample points, σ is a scale space parameter, L (x, y, σ) is a gaussian scale space, and G (x, y, σ) is a gaussian kernel function:

D(x，y，σ)＝L(x，y，kσ)-L(x，y，σ)

c2, subtracting adjacent arrangements to obtain a response image;

c3, traversing each point in the response image space, comparing the point with its neighboring points, and determining the point to be a characteristic point only when its gray value is greater than, or less than, the gray values of all points in its neighborhood.

8. The method for identifying the identity of the multimedia file based on the feature code of claim 1, wherein the short-time energy in the step b1 is calculated by using the following formula:

9. The method for identifying the identity of the multimedia file based on the feature code of claim 1, wherein the short-time zero-crossing rate in the step b2 is calculated by using the following formula:

10. The method for identifying the identity of the multimedia file based on the feature code of claim 1, wherein the short-time autocorrelation function in the step b3 is calculated by using the following formula: