CN115019235B - Scene division and content detection method and system - Google Patents
- Publication number
- CN115019235B (application CN202210685018.4A)
- Authority
- CN
- China
- Prior art keywords
- features
- semantic
- multimedia data
- vector matrix
- content detection
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/41—Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/762—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using clustering, e.g. of similar faces in social networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/46—Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The invention provides a scene division and content detection method and system. Multiple kinds of features are extracted from multimedia data to generate a first vector matrix; the first vector matrix is input into a state chain model to obtain an explicit feature distribution area, from which the semantic feature set of the required implicit feature distribution area is further determined; the first vector matrix and the semantic feature set are then input into a calculation function, the probability density parameters of the state chain model are introduced, and the dividing lines between different scene divisions are calculated, so that accurate per-segment content detection is achieved.
Description
Technical Field
The present application relates to the field of network multimedia, and in particular, to a method and system for scene division and content detection.
Background
Today's networks carry a large amount of scene information and very rich video data, and a single video often splices together several entirely different scenes. Detecting whether the video content is legal requires invoking a different detection algorithm for each scene, which places a heavy burden on the processing pipeline and increases the amount of computation. Meanwhile, accurately locating the boundary lines between different scenes is also key to improving detection accuracy.
There is therefore an urgent need for a targeted scene division and content detection method and system.
Disclosure of Invention
The invention aims to provide a method and a system for scene division and content detection. Multiple kinds of features are extracted from multimedia data to generate a first vector matrix; the first vector matrix is input into a state chain model to obtain an explicit feature distribution area, from which the semantic feature set of the required implicit feature distribution area is further determined; the first vector matrix and the semantic feature set are then input into a calculation function, the probability density parameters of the state chain model are introduced, and the dividing lines between different scene divisions are calculated, so that accurate per-segment content detection is achieved.
In a first aspect, the present application provides a method of scene division and content detection, the method comprising:
receiving multimedia data sent by an acquisition terminal, extracting visual features, sound features and text features from the multimedia data, and generating a first vector matrix according to preset rules by the visual features, the sound features and the text features;
inputting the first vector matrix into a state chain model, determining an explicit feature distribution area corresponding to the multimedia data according to a preset probability density function, obtaining a possible implicit feature distribution area, extracting a plurality of second vector matrices in the possible implicit feature distribution area, and decomposing the second vector matrices to obtain implicit features;
semantically analyzing the implicit features to obtain a plurality of candidate semantic features, calculating the correlation among the candidate semantic features, removing candidate semantic features whose correlation is lower than a threshold value, and determining a semantic feature set corresponding to the multimedia data;
inputting the first vector matrix and the semantic feature set into a calculation function, introducing probability density parameters of a state chain model to obtain a conditional probability formula from the second vector matrix to the first vector matrix, calculating the conditional probability formula through a neural network model, and calculating to obtain an optimal second vector matrix;
determining dividing lines of different scene divisions according to the distribution condition among the optimal second vector matrixes, dividing the multimedia data into different scene sections according to the dividing lines, and sequentially carrying out semantic analysis to obtain semantic tags corresponding to the different scene sections;
and invoking different content detection algorithms according to the semantic tags, and performing content detection on the scene segments corresponding to the semantic tags.
With reference to the first aspect, in a first possible implementation manner of the first aspect, the semantic analysis further includes a clustering operation, and scene segments of the same class are analyzed together.
With reference to the first aspect, in a second possible implementation manner of the first aspect, receiving the multimedia data stream sent by the acquisition terminal includes encoding and decoding the multimedia data stream.
With reference to the first aspect, in a third possible implementation manner of the first aspect, the semantic analysis uses a neural network model.
In a second aspect, the present application provides a system for scene division and content detection, the system comprising a processor and a memory:
the memory is used for storing program codes and transmitting the program codes to the processor;
the processor is configured to perform, according to instructions in the program code, the method according to the first aspect or any one of its possible implementation manners.
In a third aspect, the present application provides a computer readable storage medium for storing program code for performing the method of the first aspect or any one of its possible implementation manners.
Advantageous effects
The invention provides a scene division and content detection method and system. The required semantic feature set is determined through a state chain model and input, together with the first vector matrix, into a calculation function; the probability density parameters of the state chain model are introduced, and the dividing lines between different scene divisions are calculated. Accurate per-segment content detection can thus be achieved, with each scene segment invoking its own content detection algorithm, which improves detection precision and reduces the amount of computation.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the embodiments will be briefly described below, and it will be obvious to those skilled in the art that other drawings can be obtained from these drawings without inventive effort.
FIG. 1 is a flow chart of the method of the present invention.
Detailed Description
The preferred embodiments of the present invention are described in detail below with reference to the accompanying drawings, so that the advantages and features of the present invention can be more easily understood by those skilled in the art and the scope of the present invention is thereby clearly defined.
Fig. 1 is a flowchart of a method for scene division and content detection provided in the present application, including:
receiving multimedia data sent by an acquisition terminal, extracting visual features, sound features and text features from the multimedia data, and generating a first vector matrix according to preset rules by the visual features, the sound features and the text features;
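The patent does not disclose its "preset rules" for combining the modalities; as an illustrative sketch only, the visual, sound, and text features could be padded to a common width and concatenated per time step, with all names and dimensions below being hypothetical:

```python
import numpy as np

def build_first_vector_matrix(visual, sound, text):
    """Stack per-time-step visual, sound, and text feature vectors into one
    matrix (one row per time step). A hypothetical 'preset rule': zero-pad
    each modality to a common width, then concatenate along the feature axis."""
    dim = max(visual.shape[1], sound.shape[1], text.shape[1])
    padded = [np.pad(m, ((0, 0), (0, dim - m.shape[1])))
              for m in (visual, sound, text)]
    return np.concatenate(padded, axis=1)

# Toy example: 4 time steps, modality dims 5 / 3 / 2.
visual = np.random.rand(4, 5)
sound = np.random.rand(4, 3)
text = np.random.rand(4, 2)
first_matrix = build_first_vector_matrix(visual, sound, text)
print(first_matrix.shape)  # (4, 15)
```

Any alignment rule that yields one row per time step would serve here; zero-padding is simply the least committal choice.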
inputting the first vector matrix into a state chain model, determining an explicit feature distribution area corresponding to the multimedia data according to a preset probability density function, obtaining a possible implicit feature distribution area, extracting a plurality of second vector matrices in the possible implicit feature distribution area, and decomposing the second vector matrices to obtain implicit features;
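The state chain model and its probability density function are not specified in the patent. As a hedged stand-in, a single diagonal Gaussian fitted to the matrix rows can split time steps into a high-density "explicit" region and a low-density candidate "implicit" region:

```python
import numpy as np

def split_regions(first_matrix, threshold=None):
    """Score each row under a diagonal Gaussian fitted to all rows (a
    stand-in for the patent's probability density function), then label
    high-density rows as the explicit feature distribution area and the
    remaining rows as the possible implicit feature distribution area."""
    mu = first_matrix.mean(axis=0)
    var = first_matrix.var(axis=0) + 1e-8  # avoid division by zero
    log_density = -0.5 * (((first_matrix - mu) ** 2) / var
                          + np.log(2 * np.pi * var)).sum(axis=1)
    if threshold is None:
        threshold = np.median(log_density)
    explicit = np.where(log_density >= threshold)[0]
    implicit = np.where(log_density < threshold)[0]
    return explicit, implicit

# 6 "typical" rows and 2 outlier rows: the outliers fall in the implicit region.
rows = np.vstack([np.zeros((6, 3)), np.ones((2, 3)) * 5.0])
explicit_idx, implicit_idx = split_regions(rows)
print(len(explicit_idx), len(implicit_idx))  # 6 2
```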
semantically analyzing the implicit features to obtain a plurality of candidate semantic features, calculating the correlation among the candidate semantic features, removing candidate semantic features whose correlation is lower than a threshold value, and determining a semantic feature set corresponding to the multimedia data;
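One plausible reading of the correlation-pruning step, sketched with mean pairwise cosine similarity (the patent's actual correlation measure and threshold are not disclosed):

```python
import numpy as np

def filter_by_correlation(candidates, threshold=0.4):
    """Keep a candidate semantic feature only if its mean cosine similarity
    to the other candidates reaches the threshold; low-correlation outliers
    are removed from the semantic feature set."""
    normed = candidates / np.linalg.norm(candidates, axis=1, keepdims=True)
    sim = normed @ normed.T
    n = len(candidates)
    # Mean similarity to the *other* candidates (exclude self-similarity of 1).
    mean_sim = (sim.sum(axis=1) - 1.0) / (n - 1)
    return [i for i in range(n) if mean_sim[i] >= threshold]

# Two mutually similar features and one orthogonal outlier.
features = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]])
kept = filter_by_correlation(features)
print(kept)  # [0, 1]
```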
inputting the first vector matrix and the semantic feature set into a calculation function, introducing probability density parameters of a state chain model to obtain a conditional probability formula from the second vector matrix to the first vector matrix, calculating the conditional probability formula through a neural network model, and calculating to obtain an optimal second vector matrix;
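The conditional probability from the second vector matrix to the first is not given in closed form in the patent; the sketch below substitutes a simple Gaussian likelihood for the neural-network scoring and picks the highest-scoring candidate as the "optimal" second matrix:

```python
import numpy as np

def best_candidate(first_matrix, candidates, sigma=1.0):
    """Score each candidate second matrix by a Gaussian conditional
    likelihood p(first | candidate) ∝ exp(-||first - candidate||² / 2σ²)
    and return the index of the highest-scoring candidate. σ plays the
    role of the state chain model's probability density parameter."""
    scores = [np.exp(-np.sum((first_matrix - c) ** 2) / (2 * sigma ** 2))
              for c in candidates]
    return int(np.argmax(scores))

first = np.ones((2, 3))
cands = [np.zeros((2, 3)), np.ones((2, 3)) * 0.9, np.ones((2, 3)) * 2.0]
print(best_candidate(first, cands))  # 1 — the candidate closest to `first`
```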
determining dividing lines of different scene divisions according to the distribution condition among the optimal second vector matrixes, dividing the multimedia data into different scene sections according to the dividing lines, and sequentially carrying out semantic analysis to obtain semantic tags corresponding to the different scene sections;
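Reading dividing lines off the distribution of the optimal second vector matrices could, in the simplest case, reduce to thresholding the jump between consecutive per-frame vectors; this is an assumption for illustration, not the patented procedure:

```python
import numpy as np

def dividing_lines(frame_vectors, jump=1.0):
    """Place a scene boundary wherever the Euclidean distance between
    consecutive frame vectors exceeds `jump`. Returns the indices of the
    first frame of each new scene segment."""
    dists = np.linalg.norm(np.diff(frame_vectors, axis=0), axis=1)
    return [i + 1 for i, d in enumerate(dists) if d > jump]

# Three flat runs separated by two large jumps.
vecs = np.array([[0.0], [0.1], [5.0], [5.1], [9.0]])
print(dividing_lines(vecs))  # [2, 4]
```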
and invoking different content detection algorithms according to the semantic tags, and performing content detection on the scene segments corresponding to the semantic tags.
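The tag-driven selection of detection algorithms is naturally a lookup table: each semantic tag maps to one detector, so only one algorithm runs per scene segment instead of all of them. The detector names and tags below are placeholders, not algorithms named by the patent:

```python
def detect_violence(segment):
    """Placeholder detector for 'action'-tagged segments."""
    return "violence-checked"

def detect_text_spam(segment):
    """Placeholder detector for 'dialogue'-tagged segments."""
    return "spam-checked"

# Tag-to-detector dispatch table (hypothetical tags).
DETECTORS = {"action": detect_violence, "dialogue": detect_text_spam}

def detect(segment, tag, default=lambda s: "skipped"):
    """Invoke only the detector registered for this segment's semantic tag."""
    return DETECTORS.get(tag, default)(segment)

print(detect("clip-A", "action"))   # violence-checked
print(detect("clip-B", "unknown"))  # skipped
```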
In some preferred embodiments, the semantic analysis further includes a clustering operation that analyzes scene segments of the same class together.
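Analyzing same-class scene segments together can be sketched as grouping segments by cluster label and invoking the analyzer once per group; the labels and analyzer here are hypothetical:

```python
from collections import defaultdict

def analyze_by_class(segments, labels, analyzer):
    """Group scene segments by cluster label, then run the analyzer once
    per group, so same-class segments are processed in one batch rather
    than one by one."""
    groups = defaultdict(list)
    for seg, lab in zip(segments, labels):
        groups[lab].append(seg)
    return {lab: analyzer(group) for lab, group in groups.items()}

# Toy analyzer: just count segments per class.
result = analyze_by_class(["s1", "s2", "s3"], ["news", "sport", "news"], len)
print(result)  # {'news': 2, 'sport': 1}
```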
In some preferred embodiments, the receiving the multimedia data stream sent by the acquisition terminal includes encoding and decoding the multimedia data stream.
In some preferred embodiments, the semantic analysis employs a neural network model.
The application provides a system for scene division and content detection, the system comprising a processor and a memory:
the memory is used for storing program codes and transmitting the program codes to the processor;
the processor is configured to perform the method according to any of the embodiments of the first aspect according to instructions in the program code.
The present application provides a computer readable storage medium for storing program code for performing the method of any one of the embodiments of the first aspect.
In a specific implementation, the present invention also provides a computer storage medium, where the computer storage medium may store a program, and the program, when executed, may perform some or all of the steps of the various embodiments of the present invention. The storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM), or a random access memory (RAM).
It will be apparent to those skilled in the art that the techniques of the embodiments of the present invention may be implemented by software plus a necessary general-purpose hardware platform. Based on this understanding, the technical solutions of the embodiments of the present invention, in essence or in the part contributing to the prior art, may be embodied in the form of a software product. The software product may be stored in a storage medium such as a ROM/RAM, a magnetic disk, or an optical disk, and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute the methods described in all or some of the embodiments of the present invention.
Identical or similar parts of the various embodiments in this description may be cross-referenced. In particular, since the system embodiments are substantially similar to the method embodiments, their description is relatively brief; for relevant details, refer to the description of the method embodiments.
The embodiments of the present invention described above do not limit the scope of the present invention.
Claims (6)
1. A method of scene division and content detection, the method comprising:
receiving multimedia data sent by an acquisition terminal, extracting visual features, sound features and text features from the multimedia data, and generating a first vector matrix according to preset rules by the visual features, the sound features and the text features;
inputting the first vector matrix into a state chain model, determining an explicit feature distribution area corresponding to the multimedia data according to a preset probability density function, obtaining a possible implicit feature distribution area, extracting a plurality of second vector matrices in the possible implicit feature distribution area, and decomposing the second vector matrices to obtain implicit features;
semantically analyzing the implicit features to obtain a plurality of candidate semantic features, calculating the correlation among the candidate semantic features, removing candidate semantic features whose correlation is lower than a threshold value, and determining a semantic feature set corresponding to the multimedia data;
inputting the first vector matrix and the semantic feature set into a calculation function, introducing probability density parameters of a state chain model to obtain a conditional probability formula from the second vector matrix to the first vector matrix, calculating the conditional probability formula through a neural network model, and calculating to obtain an optimal second vector matrix;
determining dividing lines of different scene divisions according to the distribution condition among the optimal second vector matrixes, dividing the multimedia data into different scene sections according to the dividing lines, and sequentially carrying out semantic analysis to obtain semantic tags corresponding to the different scene sections;
and invoking different content detection algorithms according to the semantic tags, and performing content detection on the scene segments corresponding to the semantic tags.
2. The method according to claim 1, characterized in that: the semantic analysis further comprises a clustering operation, and scene segments of the same class are analyzed together.
3. The method according to claim 2, characterized in that: the receiving the multimedia data stream sent by the acquisition terminal comprises encoding and decoding the multimedia data stream.
4. A method according to claim 3, characterized in that: the semantic analysis adopts a neural network model.
5. A system for scene division and content detection, the system comprising a processor and a memory:
the memory is used for storing program codes and transmitting the program codes to the processor;
the processor is configured to perform the method according to any of the claims 1-4 according to instructions in the program code.
6. A computer readable storage medium, characterized in that the computer readable storage medium is used for storing program code for performing the method of any one of claims 1-4.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210685018.4A CN115019235B (en) | 2022-06-15 | 2022-06-15 | Scene division and content detection method and system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115019235A (en) | 2022-09-06 |
CN115019235B (en) | 2023-06-27 |
Family
ID=83075176
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210685018.4A Active CN115019235B (en) | 2022-06-15 | 2022-06-15 | Scene division and content detection method and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115019235B (en) |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111598710A (en) * | 2020-05-11 | 2020-08-28 | 北京邮电大学 | Method and device for detecting social network events |
Family Cites Families (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6185534B1 (en) * | 1998-03-23 | 2001-02-06 | Microsoft Corporation | Modeling emotion and personality in a computer user interface |
US7382933B2 (en) * | 2005-08-24 | 2008-06-03 | International Business Machines Corporation | System and method for semantic video segmentation based on joint audiovisual and text analysis |
CN109859741A (en) * | 2019-01-31 | 2019-06-07 | 成都终身成长科技有限公司 | Voice assessment method, device, electronic equipment and storage medium |
GB2581808B (en) * | 2019-02-26 | 2022-08-10 | Imperial College Innovations Ltd | Scene representation using image processing |
CN111241849A (en) * | 2020-01-21 | 2020-06-05 | 重庆理工大学 | Text semantic analysis method and system |
CN112488116B (en) * | 2020-11-27 | 2024-02-02 | 杭州电子科技大学 | Scene understanding semantic generation method based on multi-mode embedding |
CN114490926A (en) * | 2021-12-30 | 2022-05-13 | 特斯联科技集团有限公司 | Method and device for determining similar problems, storage medium and terminal |
Also Published As
Publication number | Publication date |
---|---|
CN115019235A (en) | 2022-09-06 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110781960B (en) | Training method, classification method, device and equipment of video classification model | |
CN110210038B (en) | Core entity determining method, system, server and computer readable medium thereof | |
CN112528637A (en) | Text processing model training method and device, computer equipment and storage medium | |
CN117409419A (en) | Image detection method, device and storage medium | |
CN112711944B (en) | Word segmentation method and system, and word segmentation device generation method and system | |
CN111723182B (en) | Key information extraction method and device for vulnerability text | |
CN115019235B (en) | Scene division and content detection method and system | |
CN115314268B (en) | Malicious encryption traffic detection method and system based on traffic fingerprint and behavior | |
CN115203206A (en) | Data content searching method and device, computer equipment and readable storage medium | |
CN115186647A (en) | Text similarity detection method and device, electronic equipment and storage medium | |
CN112287663B (en) | Text parsing method, equipment, terminal and storage medium | |
CN114302227A (en) | Method and system for collecting and analyzing network video based on container collection | |
CN114172705A (en) | Network big data analysis method and system based on pattern recognition | |
CN113420127A (en) | Threat information processing method, device, computing equipment and storage medium | |
CN115550684B (en) | Improved video content filtering method and system | |
CN115019234A (en) | Improved scene content detection method and system | |
CN113139187B (en) | Method and device for generating and detecting pre-training language model | |
CN114519357B (en) | Natural language processing method and system based on machine learning | |
CN116866211B (en) | Improved depth synthesis detection method and system | |
CN115526179A (en) | Semantic analysis and identification method and system based on weak supervision network | |
CN116431773A (en) | Dialogue flow extraction method and device, computer readable storage medium and terminal | |
CN114519828A (en) | Video detection method and system based on semantic analysis | |
CN112632229A (en) | Text clustering method and device | |
CN114155461A (en) | Method and system for filtering and purifying tiny video content | |
CN114691824A (en) | Theme extraction method, device and equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||