CN111785296A - Music segmentation boundary identification method based on repeated melody - Google Patents

Music segmentation boundary identification method based on repeated melody

Info

Publication number
CN111785296A
Authority
CN
China
Prior art keywords
frame
point
music
graph
points
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010459989.8A
Other languages
Chinese (zh)
Other versions
CN111785296B (en)
Inventor
张克俊
朱凯丽
殷叶航
叶雨晴
伍文棋
王昊阳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University ZJU
Priority to CN202010459989.8A
Publication of CN111785296A
Application granted
Publication of CN111785296B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2210/00 Aspects or methods of musical processing having intrinsic musical character, i.e. involving musical theory or musical parameters or relying on musical knowledge, as applied in electrophonic musical tools or instruments
    • G10H2210/031 Musical analysis, i.e. isolation, extraction or identification of musical elements or musical parameters from a raw acoustic signal or from an encoded audio signal
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2210/00 Aspects or methods of musical processing having intrinsic musical character, i.e. involving musical theory or musical parameters or relying on musical knowledge, as applied in electrophonic musical tools or instruments
    • G10H2210/031 Musical analysis, i.e. isolation, extraction or identification of musical elements or musical parameters from a raw acoustic signal or from an encoded audio signal
    • G10H2210/061 Musical analysis, i.e. isolation, extraction or identification of musical elements or musical parameters from a raw acoustic signal or from an encoded audio signal for extraction of musical phrases, isolation of musically relevant segments, e.g. musical thumbnail generation, or for temporal structure analysis of a musical piece, e.g. determination of the movement sequence of a musical work

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Auxiliary Devices For Music (AREA)

Abstract

The invention relates to a music segmentation boundary identification method based on repeated melody, belonging to the technical field of audio signal processing. The method comprises the following steps: 1) extracting chroma features from the audio, zero-padding the head and tail of the resulting sequence, aggregating every N adjacent frames into a new frame vector, and forming a new frame feature vector sequence from all the frame vectors; 2) calculating the Euclidean distance between each frame vector and every other frame vector in the frame feature sequence to obtain a self-similarity matrix S; 3) based on the self-similarity matrix S, obtaining the set $N_i$ of nearest-neighbor frames of the i-th frame vector, and obtaining a recurrence plot R of the self-similarity matrix S; 4) applying time-delay processing to the recurrence plot R to obtain a time-delay matrix L; 5) performing line-segment normalization and denoising on L, and then applying inverse time-delay processing to obtain a recurrence plot R'; 6) detecting all line segments, clustering them, and processing the clusters in order, starting from the cluster with the most line segments, to obtain a music segmentation boundary point set B. The method improves the recognition of repeated melodies in music and allows music to be segmented at a shorter time scale.

Description

Music segmentation boundary identification method based on repeated melody
Technical Field
The invention relates to the technical field of audio signal processing, in particular to a music segmentation boundary identification method based on repeated melody.
Background
Information is often organized into a structure or hierarchy to make it easier to disseminate and understand. Humans are usually very good at perceiving such structure, often unconsciously, and this ability lets us analyze and fully grasp the meaning of the information we are given. In the big-data era, however, there is a growing need for computers to support information processing, so automatically recovering the structure of information has become a key task for today's content processing systems. Music is a typical example among the wide range of multimedia content.
Important applications of music segmentation boundary identification include navigation within a piece, automatic generation of excerpts and mashups in players, identification of different versions of the same work, and large-scale musicology research. The spread and development of networks and digital entertainment products have made music one of the most important kinds of digital media content.
At present, music plays an important role both as a soundtrack in film and television works and as a stand-alone entertainment product. For music as a stand-alone product, segmentation is an important basic step in music analysis, and in scenarios that analyze a particular body of musical works, the sheer number of works makes automatic segmentation all the more important. For soundtracks, excerpts are used in practice far more often than complete pieces, so automatic segmentation can greatly improve the efficiency of extracting soundtrack material. Research on music segmentation boundary identification algorithms therefore has broad market prospects.
In 2000, Foote first applied the self-similarity matrix to music segmentation in order to find repeating melodies in music. In 2006, Bruderer et al. noted that certain cues, such as timbre changes, repetition, and pauses, are strongly related to how humans perceive musical structure. In 2010, Paulus et al. pointed out that three principles underlie the inference of musical structure: novelty, homogeneity, and repetition. The music segmentation algorithm proposed by Serra et al. in 2014 takes all of these principles into account and introduces a computation based on the recurrence plot; it greatly improved segmentation accuracy and advanced the development of automatic music segmentation algorithms.
However, current music segmentation algorithms still have many shortcomings. Unsupervised methods segment at a coarse granularity, have difficulty capturing the short sections of some music, make little use of music theory, and rely too heavily on purely mathematical techniques. Deep learning methods cannot fully account for repetition when segmenting, and they suffer from dependence on data, high training cost, and difficulty in incorporating musical knowledge.
Disclosure of Invention
The invention aims to provide a music segmentation boundary identification method based on repeated melody, so as to improve the recognition of repeated melodies in music and to segment music at a shorter time scale.
To achieve this object, the music segmentation boundary identification method based on repeated melody provided by the invention comprises the following steps:
1) extracting chroma features from the audio to obtain a feature vector sequence of M frames in total; zero-padding the head and tail of the feature vector sequence, aggregating every N adjacent frames into a new frame vector, and forming a new frame feature vector sequence from all the frame vectors;
2) calculating the Euclidean distance between each frame vector and every other frame vector in the frame feature sequence to obtain a self-similarity matrix S;
3) based on the self-similarity matrix S, obtaining the set $N_i$ of nearest-neighbor frames of the i-th frame vector, i = 1, 2, …, M, and thereby obtaining a recurrence plot R of the self-similarity matrix S;
4) applying time-delay processing to the recurrence plot R to obtain a time-delay matrix L;
5) performing line-segment normalization and denoising on the time-delay matrix L, and then applying inverse time-delay processing to obtain a normalized and denoised recurrence plot R';
6) detecting all line segments in the recurrence plot R', clustering the line segments, and processing the clusters in order, starting from the cluster with the most line segments, to obtain a music segmentation boundary point set B.
In the above technical solution, in order to capture the repeated sections of the music, Pitch Class Profile (PCP) features, also called chroma features, are extracted frame by frame. These features fold the frequencies in a given range into 12 pitch classes, which highlights the melody of the music.
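As a minimal illustration of how frequencies fold into 12 pitch classes, the following Python sketch maps a frequency to a pitch class with the standard equal-temperament relation (MIDI note 69 = A4 = 440 Hz) and sums spectral magnitudes per class. The function names `pitch_class` and `chroma_from_spectrum`, the 440 Hz reference, and the 55 to 1760 Hz range are assumptions made for illustration; the patent does not state how its chroma features are computed.

```python
import math
from typing import List, Sequence


def pitch_class(freq_hz: float, a4_hz: float = 440.0) -> int:
    """Return the pitch class (0 = C, 9 = A, 11 = B) nearest to freq_hz."""
    # Equal temperament: MIDI note number 69 corresponds to A4.
    midi = round(69 + 12 * math.log2(freq_hz / a4_hz))
    return midi % 12


def chroma_from_spectrum(
    freqs_hz: Sequence[float],
    magnitudes: Sequence[float],
    fmin: float = 55.0,     # assumed lower bound of the "given range"
    fmax: float = 1760.0,   # assumed upper bound of the "given range"
) -> List[float]:
    """Fold one frame's magnitude spectrum into a 12-dimensional chroma vector."""
    chroma = [0.0] * 12
    for f, mag in zip(freqs_hz, magnitudes):
        if fmin <= f <= fmax:
            chroma[pitch_class(f)] += mag
    return chroma
```

Octave-related frequencies (for example 220 Hz and 440 Hz) land in the same bin, which is why the representation emphasizes melodic and harmonic content over register.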
Optionally, in one embodiment, in step 3), the k elements of the set $N_i$ are the k frame vectors most similar to the i-th frame vector among all frame vectors, and k is 0.01 times the total number of frame vectors. For each point $R_{i,j}$ in the recurrence plot R, if $i \in N_j$ and $j \in N_i$, then $R_{i,j} = 1$; otherwise $R_{i,j} = 0$. This yields the recurrence plot R of the self-similarity matrix S.
Optionally, in one embodiment, in step 4), let $L_{i,j} = R_{i,\,(i+j) \bmod (M-1)}$ for i = 1, 2, …, M and j = 1, 2, …, M. This yields the time-delay matrix L of the recurrence plot R; that is, the main-diagonal direction in R is converted into the horizontal direction.
Optionally, in one embodiment, step 5) comprises:
5-1) traversing the time-delay matrix L, where an entry with value 1 is defined as a point; each time a point is found, determining all points connected to it by breadth-first search, two points being considered connected if their step distance is less than 3;
5-2) counting, among the connected points, the number of points at each vertical coordinate; if the vertical coordinate with the most points has more than 5 points, keeping the points at that vertical coordinate and setting all other points to 0; otherwise, setting all of the points to 0;
5-3) letting $R'_{i,\,(i+j) \bmod (M-1)} = L_{i,j}$ for i = 1, 2, …, M and j = 1, 2, …, M, to obtain the normalized and denoised recurrence plot R'.
Optionally, in an embodiment, in step 6), the clustering of line segments includes:
traversing the recurrence plot R' with the stride set to 3, finding all line segments in the plot, and representing each line segment in the normalized form $\{x_1, x_2, y_1, y_2\}$, where $x_1$ and $x_2$ are the horizontal coordinates of the start and end points, and $y_1$ and $y_2$ are the vertical coordinates of the start and end points;
taking a line segment, traversing the other line segments, and clustering it together with all line segments that correspond to the same melody section; two line segments are judged to correspond to the same melody section if the common part of their $[x_1, x_2]$ ranges accounts for more than 80% of each range.
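The line-segment detection step can be sketched as follows. This is an illustrative reading, not the patent's stated procedure: it assumes the plot is stored as `r[x][y]` with x the horizontal coordinate, that line segments run parallel to the main diagonal, that only the part with y > x needs scanning, and that "stride 3" means points fewer than 3 steps apart along a diagonal belong to the same segment. The name `find_segments` is hypothetical.

```python
from typing import List, Sequence, Tuple

Segment = Tuple[int, int, int, int]  # (x1, x2, y1, y2)


def find_segments(r: Sequence[Sequence[int]], stride: int = 3) -> List[Segment]:
    """Collect diagonal runs of 1s in a binary recurrence plot r[x][y]."""
    m = len(r)
    segments: List[Segment] = []
    for lag in range(1, m):              # each diagonal y = x + lag
        start = last = None
        for x in range(m - lag):
            if r[x][x + lag] != 1:
                continue
            # A gap of `stride` or more closes the current segment.
            if start is not None and x - last >= stride:
                segments.append((start, last, start + lag, last + lag))
                start = None
            if start is None:
                start = x
            last = x
        if start is not None:
            segments.append((start, last, start + lag, last + lag))
    return segments
```

Under these assumptions a diagonal run covering x = 1 to 9 at lag 9 is reported as (1, 9, 10, 18), matching the normalized form used in the text.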
Optionally, in an embodiment, in step 6), after the line segments are clustered, the cluster with the most line segments is taken first, and all of its $x_1$ and $x_2$ values are averaged to obtain $\bar{x}_1$ and $\bar{x}_2$. Then, for each line segment in the cluster, $y_1$ and $y_2$ are corrected according to $x_1$ and $\bar{x}_1$, and $x_2$ and $\bar{x}_2$, respectively, giving $y'_1$ and $y'_2$. Each of $\bar{x}_1$, $\bar{x}_2$ and all of the $y'_1$ and $y'_2$ is then treated as a time point x and processed as follows: check whether the music segmentation boundary point set B already contains a boundary point less than n frames away from x, and if not, add x to B.
Compared with the prior art, the invention has the advantages that:
The matrix denoising method draws on music theory and practical experience, takes into account the main causes of noise in music segmentation, and can therefore reduce noise-induced errors more thoroughly and efficiently. The method of obtaining segmentation points by line-segment clustering gives priority to melody sections that repeat many times, and using averaged values as segmentation points further reduces error and improves generalization.
Drawings
FIG. 1 is a flowchart illustrating a method for recognizing boundaries of music segments based on repeated melodies according to an embodiment of the present invention;
FIG. 2 is a diagram of a recurrence plot R in an embodiment of the present invention;
FIG. 3 is a diagram of a delay matrix L according to an embodiment of the present invention;
FIG. 4 is a diagram of the recurrence plot R' after normalization and denoising in an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be further described with reference to the following embodiments and accompanying drawings. It is to be understood that the embodiments described are only a few embodiments of the present invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the described embodiments without any inventive step, are within the scope of protection of the invention.
Unless defined otherwise, technical or scientific terms used herein shall have the ordinary meaning as understood by one of ordinary skill in the art to which this invention belongs. The use of the word "comprising" or "comprises", and the like, in the context of this application, is intended to imply that the elements or steps preceding the word comprise those elements or steps listed below, but not the exclusion of other elements or steps.
Examples
In the music segmentation boundary identification method based on the repeated melody, a music segmentation algorithm based on a self-similarity matrix is constructed, and automatic identification of music structure segmentation points is realized. The method can replace manual labeling, is used for generating the music structure sequence, and can be further applied to music analysis, automatic fragment generation and the like. Referring to fig. 1, the specific process is as follows:
S100, extracting chroma features from the audio to obtain a feature vector sequence of M frames in total; zero-padding the head and tail of the feature vector sequence, aggregating every N adjacent frames into a new frame vector, and forming a new frame feature vector sequence from all the frame vectors;
The feature sequence of the sample music is a sequence of 12-dimensional vectors of length 1344. Zero-padding the head and tail gives a sequence of length 1350, and aggregating every 7 adjacent frames into a new frame gives a sequence of (12 × 7)-dimensional vectors, again of length 1344 (1350 − 7 + 1 = 1344).
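The padding and aggregation in S100 can be sketched as below. The function name `stack_frames` is hypothetical, and the even split of the N − 1 padding frames between head and tail is an assumption; the text only states that the head and tail are zero-padded and that the length grows from 1344 to 1350 for N = 7.

```python
from typing import List, Sequence


def stack_frames(chroma: Sequence[Sequence[float]], n: int) -> List[List[float]]:
    """Zero-pad a chroma sequence and concatenate every n adjacent frames."""
    dim = len(chroma[0])
    pad_total = n - 1               # 6 padded frames when n = 7
    pad_head = pad_total // 2       # assumption: padding split evenly
    pad_tail = pad_total - pad_head
    zero = [0.0] * dim
    padded = [zero] * pad_head + [list(f) for f in chroma] + [zero] * pad_tail
    # One sliding window of n frames per original frame, flattened to dim * n values.
    return [
        [value for frame in padded[i:i + n] for value in frame]
        for i in range(len(chroma))
    ]
```

With 1344 input frames of dimension 12 and n = 7, the result has 1344 vectors of dimension 84, matching the sizes given for the sample music.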
S200, calculating Euclidean distance between each frame vector in the frame feature sequence and other frame vectors to obtain a self-similarity matrix S of 1344x 1344.
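A minimal sketch of S200, assuming the frame vectors are plain Python lists; the name `self_similarity` is hypothetical. Note that the entries are distances, so smaller values mean more similar frames.

```python
import math
from typing import List, Sequence


def self_similarity(frames: Sequence[Sequence[float]]) -> List[List[float]]:
    """Return the symmetric matrix S of pairwise Euclidean distances."""
    m = len(frames)
    s = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for j in range(i + 1, m):
            d = math.dist(frames[i], frames[j])
            s[i][j] = d
            s[j][i] = d   # Euclidean distance is symmetric
    return s
```

For the sample music this loop produces the 1344 × 1344 matrix described above; a vectorized array library would be the practical choice at that size, and the pure-Python form is used here only to keep the sketch dependency-free.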
S300, based on the self-similarity matrix S, obtaining the set $N_i$ of nearest-neighbor frames of the i-th frame, i = 1, 2, …, M, and thereby obtaining a recurrence plot R of the self-similarity matrix S, see fig. 2;
The k elements of the set $N_i$ are the k frames most similar to the i-th frame among all frames. For each point $R_{i,j}$ in the recurrence plot, if $i \in N_j$ and $j \in N_i$, then $R_{i,j} = 1$; otherwise $R_{i,j} = 0$. This gives a recurrence plot R of size 1344 × 1344. The value of k is 0.01 times the total number of frames, which is 13 in this embodiment.
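The mutual nearest-neighbor rule of S300 can be sketched as follows. The name `recurrence_plot` is hypothetical, and two details are assumptions not stated in the text: a frame is excluded from its own neighbor set, and ties in distance are broken by frame index.

```python
from typing import List, Sequence


def recurrence_plot(s: Sequence[Sequence[float]], k: int) -> List[List[int]]:
    """Binary plot R with R[i][j] = 1 iff i and j are mutual k-nearest neighbors."""
    m = len(s)
    neighbors = []
    for i in range(m):
        # Smaller distance in S means more similar; skip the frame itself.
        order = sorted((j for j in range(m) if j != i), key=lambda j: s[i][j])
        neighbors.append(set(order[:k]))
    return [
        [1 if (i in neighbors[j] and j in neighbors[i]) else 0 for j in range(m)]
        for i in range(m)
    ]
```

For the sample music, k could be computed as `int(0.01 * 1344)`, which gives 13 as in the embodiment. Requiring the relation in both directions keeps R symmetric and discards one-sided matches.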
S400, carrying out time delay processing on the recursion graph R to obtain a time delay matrix L, which is shown in FIG. 3;
Let $L_{i,j} = R_{i,\,(i+j) \bmod (M-1)}$ to obtain the time-delay matrix L of the recurrence plot R. This converts the main-diagonal direction in R into the horizontal direction, which improves computational efficiency.
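The time-delay transform and its inverse can be sketched as below. One deliberate deviation is an assumption of this sketch: it uses 0-based indices and the modulus M, so that each row of R maps one-to-one onto a row of L, whereas the text writes the modulus as M − 1 with 1-based indices. The names `to_time_delay` and `from_time_delay` are hypothetical.

```python
from typing import List, Sequence


def to_time_delay(r: Sequence[Sequence[int]]) -> List[List[int]]:
    """L[i][j] = R[i][(i + j) mod M]; i is time (horizontal), j is lag (vertical)."""
    m = len(r)
    return [[r[i][(i + j) % m] for j in range(m)] for i in range(m)]


def from_time_delay(l: Sequence[Sequence[int]]) -> List[List[int]]:
    """Inverse transform: R'[i][(i + j) mod M] = L[i][j]."""
    m = len(l)
    r = [[0] * m for _ in range(m)]
    for i in range(m):
        for j in range(m):
            r[i][(i + j) % m] = l[i][j]
    return r
```

A repetition at a fixed lag d appears in R as a line parallel to the main diagonal; in L all of its points share the same lag coordinate j = d, so it can be processed as an axis-aligned line.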
S500, conducting line segment normalization and denoising on the time delay matrix L, and then conducting reverse time delay processing to obtain a normalized and denoised recursion graph R', see figure 4.
First, the time-delay matrix L is traversed, where an entry with value 1 is defined as a point. Each time a point is found, all points connected to it are determined by breadth-first search; two points are considered connected if their step distance is less than 3. Among the connected points, the number of points at each vertical coordinate is counted. If the vertical coordinate with the most points has more than 5 points, the points at that vertical coordinate are kept and all other points are set to 0; otherwise, all of the points are set to 0. For example, given the connected points {(1,1), (2,1), (3,1), (4,1), (5,1), (6,1), (2,2), (3,2), (4,2)}, the 6 points with vertical coordinate 1 form the largest group and are kept, while the points with vertical coordinate 2 are erased. Then, letting $R'_{i,\,(i+j) \bmod (M-1)} = L_{i,j}$ gives the normalized and denoised recurrence plot R'.
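The denoising rule can be sketched as follows, with the matrix stored as `l[x][y]` (x horizontal, y vertical). The name `denoise` is hypothetical, and "step distance less than 3" is interpreted here as both coordinate differences being at most 2; that interpretation is an assumption.

```python
from collections import Counter, deque
from typing import List, Sequence


def denoise(l: Sequence[Sequence[int]], step: int = 3, min_points: int = 5) -> List[List[int]]:
    """Keep, per connected group, only its dominant row if it has > min_points points."""
    w = len(l)
    h = len(l[0]) if w else 0
    out = [[0] * h for _ in range(w)]
    seen = [[False] * h for _ in range(w)]
    for x0 in range(w):
        for y0 in range(h):
            if l[x0][y0] != 1 or seen[x0][y0]:
                continue
            # Breadth-first search for every point connected to (x0, y0).
            seen[x0][y0] = True
            queue = deque([(x0, y0)])
            group = []
            while queue:
                x, y = queue.popleft()
                group.append((x, y))
                for dx in range(-step + 1, step):
                    for dy in range(-step + 1, step):
                        nx, ny = x + dx, y + dy
                        if 0 <= nx < w and 0 <= ny < h and l[nx][ny] == 1 and not seen[nx][ny]:
                            seen[nx][ny] = True
                            queue.append((nx, ny))
            # Count points per vertical coordinate and keep only the dominant one.
            best_y, best_count = Counter(y for _, y in group).most_common(1)[0]
            if best_count > min_points:
                for x, y in group:
                    if y == best_y:
                        out[x][y] = 1
    return out
```

Applied to the example group in the text, the six points with vertical coordinate 1 survive and the three with vertical coordinate 2 are cleared; a group whose dominant row has 5 or fewer points is removed entirely.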
S600, based on the recursion graph R', all line segments are detected and clustered, and the cluster with the most line segments is processed in sequence to obtain a music segmentation boundary point set B.
First, all line segments in the recurrence plot R' are found and given a normalized representation: R' is traversed with the stride set to 3, and each line segment found is written as $\{x_1, x_2, y_1, y_2\}$, where $x_1$ and $x_2$ are the horizontal coordinates of the start and end points, and $y_1$ and $y_2$ are the vertical coordinates of the start and end points. For example, the line segment {1, 9, 10, 18} indicates that frames 10 to 18 are similar to frames 1 to 9. Next, line segments whose $[x_1, x_2]$ ranges share a common part that accounts for more than 80% of each range are placed in the same cluster, for example {1, 9, 10, 18}, {2, 9, 20, 27} and {2, 9, 31, 38}. After clustering, the $x_1$ values are averaged and the corresponding $y_1$ values are marked. Here the average of $x_1$ is 2 (5/3 rounded to the nearest frame), and the 3 corresponding $y_1$ values are taken as 11, 20 and 31; the first is shifted from 10 to 11 because its $x_1$ lies one frame before the average. For each of these points, the boundary point set B is checked for an existing point within 20 frames of it (a gap tied to the required segment duration); if there is none, the point is added to B. This yields the segmentation result for the sample music.
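The clustering and boundary selection in S600 can be sketched as below. Several choices are assumptions rather than statements of the patent: segment length is measured as x2 − x1, a segment joins the first cluster whose first member it matches, averages are rounded to the nearest frame, and the correction is y' = y + (x̄ − x). The names `same_melody`, `cluster_segments`, `corrected_cluster`, and `boundary_points` are hypothetical.

```python
from typing import List, Sequence, Tuple

Segment = Tuple[int, int, int, int]  # (x1, x2, y1, y2)


def same_melody(a: Segment, b: Segment, ratio: float = 0.8) -> bool:
    """True if the overlap of the [x1, x2] ranges exceeds `ratio` of each range."""
    overlap = min(a[1], b[1]) - max(a[0], b[0])
    if overlap <= 0:
        return False
    return overlap > ratio * (a[1] - a[0]) and overlap > ratio * (b[1] - b[0])


def cluster_segments(segments: Sequence[Segment], ratio: float = 0.8) -> List[List[Segment]]:
    """Group segments by melody section; the largest cluster comes first."""
    clusters: List[List[Segment]] = []
    for seg in segments:
        for cluster in clusters:
            if same_melody(cluster[0], seg, ratio):
                cluster.append(seg)
                break
        else:
            clusters.append([seg])
    return sorted(clusters, key=len, reverse=True)


def corrected_cluster(cluster: Sequence[Segment]):
    """Return (mean x1, mean x2, [(y1', y2'), ...]) for one cluster."""
    mean_x1 = round(sum(s[0] for s in cluster) / len(cluster))
    mean_x2 = round(sum(s[1] for s in cluster) / len(cluster))
    corrected = [(y1 + mean_x1 - x1, y2 + mean_x2 - x2) for x1, x2, y1, y2 in cluster]
    return mean_x1, mean_x2, corrected


def boundary_points(clusters: Sequence[Sequence[Segment]], min_gap: int) -> List[int]:
    """Add each candidate time point unless a boundary lies closer than min_gap frames."""
    boundaries: List[int] = []

    def add(t: int) -> None:
        if all(abs(t - b) >= min_gap for b in boundaries):
            boundaries.append(t)

    for cluster in clusters:
        mean_x1, mean_x2, corrected = corrected_cluster(cluster)
        add(mean_x1)
        add(mean_x2)
        for y1c, y2c in corrected:
            add(y1c)
            add(y2c)
    return sorted(boundaries)
```

On the three example segments this reproduces the averaged start of 2 and the corrected starts 11, 20 and 31 from the text. Which candidates finally enter B depends on the order in which they are tested and on the minimum gap, so the result of `boundary_points` should be read as one possible outcome of the rule.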

Claims (8)

1. A music segmentation boundary identification method based on repeated melody is characterized by comprising the following steps:
1) extracting chroma features from the audio to obtain a feature vector sequence of M frames in total; zero-padding the head and tail of the feature vector sequence, aggregating every N adjacent frames into a new frame vector, and forming a new frame feature vector sequence from all the frame vectors;
2) calculating the Euclidean distance between each frame vector and every other frame vector in the frame feature sequence to obtain a self-similarity matrix S;
3) based on the self-similarity matrix S, obtaining the set $N_i$ of nearest-neighbor frames of the i-th frame vector, i = 1, 2, …, M, and thereby obtaining a recurrence plot R of the self-similarity matrix S;
4) applying time-delay processing to the recurrence plot R to obtain a time-delay matrix L;
5) performing line-segment normalization and denoising on the time-delay matrix L, and then applying inverse time-delay processing to obtain a normalized and denoised recurrence plot R';
6) detecting all line segments in the recurrence plot R', clustering the line segments, and processing the clusters in order, starting from the cluster with the most line segments, to obtain a music segmentation boundary point set B.
2. The method as claimed in claim 1, wherein in step 3), the k elements of the set $N_i$ are the k frame vectors most similar to the i-th frame vector among all frame vectors, and k is 0.01 times the total number of frame vectors.
3. The method as claimed in claim 1, wherein in step 3), for each point $R_{i,j}$ in the recurrence plot R, if $i \in N_j$ and $j \in N_i$, then $R_{i,j} = 1$; otherwise $R_{i,j} = 0$, thus obtaining the recurrence plot R of the self-similarity matrix S.
4. The method as claimed in claim 1, wherein in step 4), $L_{i,j} = R_{i,\,(i+j) \bmod (M-1)}$ for i = 1, 2, …, M and j = 1, 2, …, M, thus obtaining the time-delay matrix L of the recurrence plot R, that is, converting the main-diagonal direction in the recurrence plot R into the horizontal direction.
5. The method as claimed in claim 4, wherein the step 5) comprises:
5-1) traversing the time-delay matrix L, where an entry with value 1 is defined as a point; each time a point is found, determining all points connected to it by breadth-first search, two points being considered connected if their step distance is less than 3;
5-2) counting, among the connected points, the number of points at each vertical coordinate; if the vertical coordinate with the most points has more than 5 points, keeping the points at that vertical coordinate and setting all other points to 0; otherwise, setting all of the points to 0;
5-3) letting $R'_{i,\,(i+j) \bmod (M-1)} = L_{i,j}$ for i = 1, 2, …, M and j = 1, 2, …, M, to obtain the normalized and denoised recurrence plot R'.
6. The method as claimed in claim 1, wherein the clustering of line segments in step 6) comprises:
traversing the recurrence plot R', finding all line segments in the plot, and representing each line segment in the normalized form $\{x_1, x_2, y_1, y_2\}$, where $x_1$ and $x_2$ are the horizontal coordinates of the start and end points, and $y_1$ and $y_2$ are the vertical coordinates of the start and end points;
taking a line segment, traversing the other line segments, and clustering it together with all line segments that correspond to the same melody section; two line segments are judged to correspond to the same melody section if the common part of their $[x_1, x_2]$ ranges accounts for more than 80% of each range.
7. The method of claim 6, wherein the stride is set to 3 when traversing the recurrence plot R'.
8. The method as claimed in claim 6, wherein in step 6), after the line segments are clustered, the cluster with the most line segments is taken first, and all of its $x_1$ and $x_2$ values are averaged to obtain $\bar{x}_1$ and $\bar{x}_2$; then, for each line segment in the cluster, $y_1$ and $y_2$ are corrected according to $x_1$ and $\bar{x}_1$, and $x_2$ and $\bar{x}_2$, respectively, giving $y'_1$ and $y'_2$; each of $\bar{x}_1$, $\bar{x}_2$ and all of the $y'_1$ and $y'_2$ is treated as a time point x and processed as follows: checking whether the music segmentation boundary point set B already contains a boundary point less than n frames away from x, and if not, adding x to B.
CN202010459989.8A 2020-05-26 2020-05-26 Music segmentation boundary identification method based on repeated melody Active CN111785296B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010459989.8A CN111785296B (en) 2020-05-26 2020-05-26 Music segmentation boundary identification method based on repeated melody

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010459989.8A CN111785296B (en) 2020-05-26 2020-05-26 Music segmentation boundary identification method based on repeated melody

Publications (2)

Publication Number Publication Date
CN111785296A (en) 2020-10-16
CN111785296B (en) 2022-06-10

Family

ID=72753490

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010459989.8A Active CN111785296B (en) 2020-05-26 2020-05-26 Music segmentation boundary identification method based on repeated melody

Country Status (1)

Country Link
CN (1) CN111785296B (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6278972B1 (en) * 1999-01-04 2001-08-21 Qualcomm Incorporated System and method for segmentation and recognition of speech signals
WO2005010865A2 (en) * 2003-07-31 2005-02-03 The Registrar, Indian Institute Of Science Method of music information retrieval and classification using continuity information
US20060065106A1 (en) * 2004-09-28 2006-03-30 Pinxteren Markus V Apparatus and method for changing a segmentation of an audio piece
US20070291958A1 (en) * 2006-06-15 2007-12-20 Tristan Jehan Creating Music by Listening
CN103116646A (en) * 2013-02-26 2013-05-22 浙江大学 Cloud gene expression programming based music emotion recognition method
CN103854661A (en) * 2014-03-20 2014-06-11 北京百度网讯科技有限公司 Method and device for extracting music characteristics
US20140205103A1 (en) * 2011-08-19 2014-07-24 Dolby Laboratories Licensing Corporation Measuring content coherence and measuring similarity
US20170148424A1 (en) * 2015-11-23 2017-05-25 Adobe Systems Incorporated Intuitive music visualization using efficient structural segmentation
CN108665903A (en) * 2018-05-11 2018-10-16 复旦大学 A kind of automatic testing method and its system of audio signal similarity degree

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
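Li Wei et al.: "Understanding Digital Music: A Survey of Music Information Retrieval Technology", Journal of Fudan University (Natural Science) *
Xiao Chuan et al.: "A Survey of Research on Multi-Version Music Identification Technology", Journal of Chinese Computer Systems *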

Also Published As

Publication number Publication date
CN111785296B (en) 2022-06-10

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant