CN115238127A - Video consistency comparison method based on labels - Google Patents
Video consistency comparison method based on labels
- Publication number
- CN115238127A (application number CN202210812641.1A)
- Authority
- CN
- China
- Prior art keywords
- video
- label
- time
- consistency
- hash value
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/70—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F16/78—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/7867—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using information manually generated, e.g. tags, keywords, comments, title and artist information, manually generated time, location and usage information, user ratings
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/70—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F16/71—Indexing; Data structures therefor; Storage structures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/70—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F16/78—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/783—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
- G06F16/7834—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using audio features
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Multimedia (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Library & Information Science (AREA)
- Software Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention relates to video analysis technology and discloses a label-based video consistency comparison method, comprising label extraction, in which sound is extracted from the video to form labels, and label comparison, in which the hash values of the extracted labels are compared, thereby determining the consistency of the labeled videos. The invention only needs to process the sound data, not the video frames, which greatly improves processing speed and concurrency; ASR and keyword extraction are mature technologies, so the technical difficulty is low; only hash values and time information need to be recorded, so the storage requirement is low; comparison is performed on hash values, which is fast and gives excellent performance; and when the comparison algorithm is updated, the videos do not need to be re-extracted, making it convenient to adjust and upgrade the comparison algorithm in light of results and business needs.
Description
Technical Field
The invention relates to video analysis technology, and in particular to a label-based video consistency comparison method.
Background
Before the internet era, people enjoyed video and audio through traditional players such as radios and televisions, and could not do so freely at any time. With the advent of the internet era, digital technology greatly reduced the cost of publishing, copying, storing, and distributing text, images, audio, and video content, so that information can be propagated, shared, and used almost freely. However, the copyright protection system has suffered an unprecedented impact: digital content spread on the internet, especially video and audio, faces widespread piracy and infringement. This phenomenon draws increasing attention from authors and distributors; internet copyright protection has become a hot topic of the internet era, and internet video and audio copyright protection technology has likewise become a focus of research.
Protecting internet video and audio copyright requires both technical means and legal measures to prevent and combat infringement such as illegal copying, plagiarism, theft, tampering, and distribution.
Existing mainstream internet video and audio copyright protection technologies include watermarking. A visible watermark can be used to audit videos, but it degrades the viewing experience. A hidden watermark is embedded as digital data into audio, pictures, or video and cannot be seen under normal conditions, but it suffers from poor robustness, requires special detection equipment, and is complex to upgrade.
Another prior-art approach extracts features of the pictures or sounds in a video and converts them into information strings to form a "digital gene"; consistency is then checked by matching the target video against samples in the digital gene library. This approach currently has problems such as high technical difficulty and the need to regenerate the genes whenever the technology is upgraded.
For example, prior art CN202210454687.0 suffers from high technical difficulty and complex technical upgrading.
Disclosure of Invention
The invention provides a label-based video consistency comparison method, aiming to solve the prior art's problems of high technical difficulty and complex technical upgrading.
In order to solve these technical problems, the invention adopts the following technical scheme:
the label-based video consistency comparison method comprises the following steps:
label extraction: extracting sound from the video to form labels;
label comparison: comparing the hash values of the extracted labels, thereby determining the consistency of the labeled videos.
Preferably, the label extraction method comprises the following steps:
step 1, acquiring a video: obtaining the video to be extracted by uploading or by local scanning;
step 2, decoding the video: decoding the acquired video through a decoder to obtain video decoding data, which comprises the sound data and the video time point information corresponding to the sound data;
step 3, recognizing the sound data: applying automatic speech recognition (ASR) to the decoded sound data to output time offset information and structured data of the recognized text result; the time offset information includes a start time and an end time, and the structured data of the text result comprises the absolute start time offset ST and the absolute end time offset ET of the text relative to the video;
step 4, segmenting the recognized text: splitting the text recognition result into segments;
step 5, extracting keywords: extracting keywords from each segmented recognized text through the TF-IDF algorithm, with at most N keywords extracted;
step 6, merging the string data: concatenating the extracted keywords into a single string;
step 7, calculating the paragraph hash value: applying the MD5 hash algorithm to the merged string to obtain the paragraph hash value;
step 8, forming a record of the paragraph hash value together with the segment's time point information (start time offset ST and end time offset ET), and storing the record in the database;
and step 9, repeating steps 4 to 7, so that the database finally contains, for the video, a plurality of label extraction results with their time and hash value records.
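The per-segment part of the extraction above (segmenting, TF-IDF keywords, and MD5 hashing) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the ASR stage is assumed to have already produced the recognized text, the TF-IDF here is computed only over the video's own segments, and all function names are illustrative.

```python
import hashlib
import math
import re
from collections import Counter

def segment_text(text):
    """Step 4: split the recognized text on periods, question marks and exclamation marks."""
    parts = re.split(r"[.?!]", text)
    return [p.strip() for p in parts if p.strip()]

def top_keywords(segments, n=5):
    """Step 5: TF-IDF scored over this video's own segments; returns the top-N words per segment."""
    docs = [seg.lower().split() for seg in segments]
    df = Counter(w for doc in docs for w in set(doc))  # document frequency per word
    result = []
    for doc in docs:
        tf = Counter(doc)
        scored = {w: (tf[w] / len(doc)) * math.log(len(docs) / df[w] + 1) for w in tf}
        # ties broken alphabetically so the output is deterministic
        result.append([w for w, _ in sorted(scored.items(), key=lambda kv: (-kv[1], kv[0]))[:n]])
    return result

def paragraph_hash(keywords):
    """Steps 6-7: concatenate the extracted keywords and hash the merged string with MD5."""
    merged = "".join(keywords)
    return hashlib.md5(merged.encode("utf-8")).hexdigest()
```

Because only the keyword string is hashed, small ASR differences outside the top-N keywords do not change the paragraph hash, which is what makes exact hash comparison workable later.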
Preferably, the label comparison method comprises:
S1, acquiring the label result information: querying from the database the pieces of label result information corresponding to the first video and the second video to be compared, each piece comprising a time offset and a hash value;
S2, sorting the label results: sorting the label result information in descending order by the start time in the time information;
and S3, judging the consistency of the matching result: comparing the hash values of the first video with those of the second video to determine the consistency of the matching result.
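As an illustration of S1 and S2, a label record could be modeled as below; the field names and record layout are assumptions, not the patent's database schema. The sketch sorts chronologically; for the exact-subsequence matching in S3, any direction works as long as both videos' records are sorted the same way.

```python
from dataclasses import dataclass

@dataclass
class LabelRecord:
    start: float   # ST: absolute start offset of the segment, in seconds
    end: float     # ET: absolute end offset of the segment, in seconds
    digest: str    # MD5 paragraph hash from the extraction stage

def ordered_hashes(records):
    """S2: sort the label results by start time and expose the hash sequence for S3."""
    ordered = sorted(records, key=lambda r: r.start)
    return [r.digest for r in ordered]
```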
Preferably, in step 4 the recognized text is segmented by periods, question marks, and exclamation marks.
Preferably, judging the consistency of the matching results comprises:
searching the two videos' hash value sequences for the start position and end position of each subsequence that matches exactly between the first video and the second video;
acquiring the start time of the start position and the end time of the end position;
and calculating the duration of the matching result from the start time of the start position and the end time of the end position; if the duration of the matching result is greater than the deviation threshold, the matching result is valid, and all valid matching results are the consistent positions.
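A sketch of this consistency judgment, assuming each record is a (start_seconds, end_seconds, hash) tuple sorted chronologically by start time; the function name and the simple quadratic scan are illustrative, not the patent's algorithm.

```python
def matched_spans(seq_a, seq_b, min_duration=5.0):
    """Find maximal exactly-matching runs of the two hash sequences and keep only
    runs whose covered duration (in seq_a's timeline) exceeds the threshold."""
    results = []
    for i in range(len(seq_a)):
        for j in range(len(seq_b)):
            if seq_a[i][2] != seq_b[j][2]:
                continue
            # only consider positions that start a maximal run
            if i > 0 and j > 0 and seq_a[i - 1][2] == seq_b[j - 1][2]:
                continue
            k = 0
            while (i + k < len(seq_a) and j + k < len(seq_b)
                   and seq_a[i + k][2] == seq_b[j + k][2]):
                k += 1
            start, end = seq_a[i][0], seq_a[i + k - 1][1]
            if end - start > min_duration:  # preferred threshold: 5 seconds
                results.append((start, end))
    return results
```

With the 5-second threshold, isolated single-segment hash collisions are discarded while longer shared passages survive as consistent positions.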
Preferably, the duration-deviation threshold of the matching result is not less than 5 seconds.
In order to solve the above technical problems, the invention further provides a storage medium implementing the label-based video consistency comparison method.
In order to solve the above technical problems, the invention further provides an electronic device implementing the label-based video consistency comparison method.
Owing to the adoption of this technical scheme, the invention has notable technical effects:
the invention only needs to process the sound data, not the video frames, which greatly improves processing speed and concurrency;
the invention adopts ASR and keyword extraction, both mature technologies, so the technical difficulty is low;
the invention only needs to record hash values and time information, so the storage requirement is low; comparison is performed on hash values, which is fast and gives excellent performance;
when the comparison algorithm is updated, the invention does not need to re-extract the videos, making it convenient to adjust and upgrade the comparison algorithm in light of results and business needs.
Drawings
FIG. 1 is a flow chart of the present invention;
FIG. 2 is a flow chart of the label extraction of the present invention;
FIG. 3 is a flow chart of the label comparison of the present invention.
Detailed Description
The invention is described in further detail below with reference to the figures and examples.
Example 1
The label-based video consistency comparison method comprises the following steps:
label extraction: extracting sound from the video to form labels;
label comparison: comparing the hash values of the extracted labels, thereby determining the consistency of the labeled videos.
The label extraction method comprises the following steps:
step 1, acquiring a video: obtaining the video to be extracted by uploading or by local scanning;
step 2, decoding the video: decoding the acquired video through a decoder to obtain video decoding data, which comprises the sound data and the video time point information corresponding to the sound data;
step 3, recognizing the sound data: applying automatic speech recognition (ASR) to the decoded sound data to output time offset information and structured data of the recognized text result; the time offset information includes a start time and an end time, and the structured data of the text result comprises the absolute start time offset ST and the absolute end time offset ET of the text relative to the video;
step 4, segmenting the recognized text: splitting the text recognition result into segments;
step 5, extracting keywords: extracting keywords from each segmented recognized text through the TF-IDF algorithm, with at most N keywords extracted;
step 6, merging the string data: concatenating the extracted keywords into a single string;
step 7, calculating the paragraph hash value: applying the MD5 hash algorithm to the merged string to obtain the paragraph hash value;
step 8, forming a record of the paragraph hash value together with the segment's time point information (start time offset ST and end time offset ET), and storing the record in the database;
and step 9, repeating steps 4 to 7, so that the database finally contains, for the video, a plurality of label extraction results with their time and hash value records.
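The net output of the extraction stage is one (ST, ET, hash) record per segment. A tiny end-to-end sketch of that record shape, with the TF-IDF keyword selection elided (the segment text is hashed directly here) and all names hypothetical:

```python
import hashlib
import re

def extract_records(asr_text, seg_times):
    """Turn a recognized transcript plus per-segment (ST, ET) offsets into
    (ST, ET, md5) label records; seg_times is assumed aligned with the segments."""
    segments = [s.strip() for s in re.split(r"[.?!]", asr_text) if s.strip()]
    records = []
    for seg, (st, et) in zip(segments, seg_times):
        digest = hashlib.md5(seg.encode("utf-8")).hexdigest()
        records.append((st, et, digest))
    return records
```

These records are exactly what the comparison stage later queries from the database, sorts by start time, and matches by hash.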
The label comparison method comprises the following steps:
S1, acquiring the label result information: querying from the database the pieces of label result information corresponding to the first video and the second video to be compared, each piece comprising a time offset and a hash value;
S2, sorting the label results: sorting the label result information in descending order by the start time in the time information;
and S3, judging the consistency of the matching result: comparing the hash values of the first video with those of the second video to determine the consistency of the matching result.
In step 4, the recognized text is segmented by periods, question marks, and exclamation marks.
Judging the consistency of the matching results comprises the following steps:
searching the two videos' hash value sequences for the start position and end position of each subsequence that matches exactly between the first video and the second video;
acquiring the start time of the start position and the end time of the end position;
and calculating the duration of the matching result from the start time of the start position and the end time of the end position; if the duration of the matching result is greater than the deviation threshold, the matching result is valid, and all valid matching results are the consistent positions.
The duration-deviation threshold of the matching result is not less than 5 seconds.
Example 2
On the basis of embodiment 1, the duration-deviation threshold of the matching result in this embodiment is 5 seconds.
Example 3
On the basis of embodiment 1, this embodiment is a storage medium implementing the method of embodiment 1.
Example 4
On the basis of embodiment 1, this embodiment is an electronic device implementing the method of embodiment 1.
Claims (8)
1. A label-based video consistency comparison method, comprising the following steps:
label extraction: extracting sound from the video to form labels;
label comparison: comparing the hash values of the extracted labels, thereby determining the consistency of the labeled videos.
2. The label-based video consistency comparison method according to claim 1, wherein the label extraction method comprises:
step 1, acquiring a video: obtaining the video to be extracted by uploading or by local scanning;
step 2, decoding the video: decoding the acquired video through a decoder to obtain video decoding data, which comprises the sound data and the video time point information corresponding to the sound data;
step 3, recognizing the sound data: applying automatic speech recognition (ASR) to the decoded sound data to output time offset information and structured data of the recognized text result; the time offset information includes a start time and an end time, and the structured data of the text result comprises the absolute start time offset ST and the absolute end time offset ET of the text relative to the video;
step 4, segmenting the recognized text: splitting the text recognition result into segments;
step 5, extracting keywords: extracting keywords from each segmented recognized text through the TF-IDF algorithm, with at most N keywords extracted;
step 6, merging the string data: concatenating the extracted keywords into a single string;
step 7, calculating the paragraph hash value: applying the MD5 hash algorithm to the merged string to obtain the paragraph hash value;
step 8, forming a record of the paragraph hash value together with the segment's time point information (start time offset ST and end time offset ET), and storing the record in the database;
and step 9, repeating steps 4 to 7, so that the database finally contains, for the video, a plurality of label extraction results with their time and hash value records.
3. The label-based video consistency comparison method according to claim 1, wherein the label comparison method comprises:
S1, acquiring the label result information: querying from the database the pieces of label result information corresponding to the first video and the second video to be compared, each piece comprising a time offset and a hash value;
S2, sorting the label results: sorting the label result information in descending order by the start time in the time information;
and S3, judging the consistency of the matching result: comparing the hash values of the first video with those of the second video to determine the consistency of the matching result.
4. The label-based video consistency comparison method according to claim 2, wherein in step 4 the recognized text is segmented by periods, question marks, and exclamation marks.
5. The label-based video consistency comparison method according to claim 3, wherein judging the consistency of the matching results comprises:
searching the two videos' hash value sequences for the start position and end position of each subsequence that matches exactly between the first video and the second video;
acquiring the start time of the start position and the end time of the end position;
and calculating the duration of the matching result from the start time of the start position and the end time of the end position; if the duration of the matching result is greater than the deviation threshold, the matching result is valid, and all valid matching results are the consistent positions.
6. The label-based video consistency comparison method according to claim 5, wherein the duration-deviation threshold of the matching result is not less than 5 seconds.
7. A storage medium implementing the label-based video consistency comparison method according to any one of claims 1 to 6.
8. An electronic device implementing the label-based video consistency comparison method according to any one of claims 1 to 6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210812641.1A CN115238127A (en) | 2022-07-11 | 2022-07-11 | Video consistency comparison method based on labels |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210812641.1A CN115238127A (en) | 2022-07-11 | 2022-07-11 | Video consistency comparison method based on labels |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115238127A true CN115238127A (en) | 2022-10-25 |
Family
ID=83670507
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210812641.1A Pending CN115238127A (en) | 2022-07-11 | 2022-07-11 | Video consistency comparison method based on labels |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115238127A (en) |
- 2022-07-11: application CN202210812641.1A filed in CN; patent CN115238127A status Pending
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |