CN115019235B - Scene division and content detection method and system - Google Patents

Publication number
CN115019235B
Authority
CN
China
Prior art keywords
features
semantic
multimedia data
vector matrix
content detection
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210685018.4A
Other languages
Chinese (zh)
Other versions
CN115019235A (en)
Inventor
孙涛
孙中民
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin Guorui Digital Safety System Co ltd
Original Assignee
Tianjin Guorui Digital Safety System Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin Guorui Digital Safety System Co ltd
Priority to CN202210685018.4A
Publication of CN115019235A
Application granted
Publication of CN115019235B
Current legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G06V20/41 Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/762 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using clustering, e.g. of similar faces in social networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G06V20/46 Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention provides a method and system for scene division and content detection. Multiple kinds of features are extracted from multimedia data to generate a first vector matrix. The first vector matrix is input into a state chain model to obtain an explicit feature distribution area, from which the semantic feature set of the required implicit feature distribution area is determined. The first vector matrix and the semantic feature set are then input into a calculation function, together with the probability density parameters of the state chain model, to calculate and determine the dividing lines between different scenes, so that content detection can be carried out accurately on each segment.

Description

Scene division and content detection method and system
Technical Field
The present application relates to the field of network multimedia, and in particular, to a method and system for scene division and content detection.
Background
Existing networks carry a large amount of scene information and very rich video data. A single video is often edited together from several completely different scenes, and checking whether the video content is legal requires calling a different detection algorithm for each scene, which places a heavy burden on the processing pipeline and increases the amount of computation. Whether the boundaries between different scenes can be divided accurately is also key to improving detection accuracy.
Thus, there is an urgent need for a method and system for targeted scene division and content detection.
Disclosure of Invention
The invention aims to provide a method and system for scene division and content detection. Multiple kinds of features are extracted from multimedia data to generate a first vector matrix; the first vector matrix is input into a state chain model to obtain an explicit feature distribution area, from which the semantic feature set of the required implicit feature distribution area is determined; and the first vector matrix and the semantic feature set are input into a calculation function, together with the probability density parameters of the state chain model, to calculate and determine the dividing lines between different scenes, so that content detection can be carried out accurately on each segment.
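As an illustration only, the generation of the first vector matrix might be sketched as follows. The text does not specify the preset rules, so this sketch assumes the simplest possible rule: the visual, sound and text feature vectors of each time step are concatenated into one row. The function name `build_first_vector_matrix` and the sample values are hypothetical.

```python
from typing import List, Sequence


def build_first_vector_matrix(
    visual: Sequence[Sequence[float]],
    sound: Sequence[Sequence[float]],
    text: Sequence[Sequence[float]],
) -> List[List[float]]:
    """Concatenate the per-time-step feature vectors of the three modalities.

    Assumption: the "preset rule" is plain concatenation, one row per time step.
    """
    if not (len(visual) == len(sound) == len(text)):
        raise ValueError("all modalities must cover the same number of time steps")
    return [list(v) + list(s) + list(t) for v, s, t in zip(visual, sound, text)]


# Two time steps: 2 visual values, 1 sound value and 2 text values each.
matrix = build_first_vector_matrix(
    visual=[[0.1, 0.2], [0.3, 0.4]],
    sound=[[1.0], [2.0]],
    text=[[0.0, 1.0], [1.0, 0.0]],
)
```

Each row of `matrix` then describes one time step of the multimedia data across all three modalities; any weighting or normalisation the preset rules may require is left out of this sketch.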
In a first aspect, the present application provides a method of scene division and content detection, the method comprising:
receiving multimedia data sent by an acquisition terminal, extracting visual features, sound features and text features from the multimedia data, and generating a first vector matrix from the visual features, the sound features and the text features according to preset rules;
inputting the first vector matrix into a state chain model, determining an explicit feature distribution area corresponding to the multimedia data according to a preset probability density function, obtaining a possible implicit feature distribution area, extracting a plurality of second vector matrices from the possible implicit feature distribution area, and decomposing the second vector matrices to obtain implicit features;
semantically analyzing the implicit features to obtain a plurality of undetermined semantic features, calculating the correlation degree among the undetermined semantic features, removing the undetermined semantic features whose correlation degree is lower than a threshold value, and determining a semantic feature set corresponding to the multimedia data;
inputting the first vector matrix and the semantic feature set into a calculation function, introducing probability density parameters of the state chain model to obtain a conditional probability formula from the second vector matrix to the first vector matrix, and evaluating the conditional probability formula through a neural network model to obtain an optimal second vector matrix;
determining the dividing lines between different scenes according to the distribution among the optimal second vector matrices, dividing the multimedia data into different scene segments according to the dividing lines, and sequentially carrying out semantic analysis to obtain semantic tags corresponding to the different scene segments;
and calling different content detection algorithms according to the semantic tags, and carrying out content detection on the scene segments corresponding to the semantic tags.
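The correlation-based filtering of undetermined semantic features in the steps above can be sketched as follows. This is a minimal sketch under stated assumptions, not the method of the invention: the text does not define the correlation degree, so it is taken here to be the mean absolute Pearson correlation between one undetermined feature and all the others. The names `pearson` and `filter_by_correlation`, the feature labels and the threshold of 0.6 are all hypothetical.

```python
import math
from typing import Dict, List


def pearson(a: List[float], b: List[float]) -> float:
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a constant vector is treated as uncorrelated
    return cov / math.sqrt(var_a * var_b)


def filter_by_correlation(candidates: Dict[str, List[float]], threshold: float) -> List[str]:
    """Keep the features whose assumed correlation degree reaches the threshold."""
    kept = []
    for name, vec in candidates.items():
        others = [v for other, v in candidates.items() if other != name]
        if not others:
            kept.append(name)  # a single feature has nothing to be compared with
            continue
        # Assumed definition: mean absolute correlation with all other features.
        degree = sum(abs(pearson(vec, o)) for o in others) / len(others)
        if degree >= threshold:
            kept.append(name)
    return kept


# Hypothetical undetermined semantic features, each represented as a vector.
candidates = {
    "street": [1.0, 2.0, 3.0, 4.0],
    "traffic": [2.0, 4.0, 6.0, 8.5],
    "noise": [1.0, -1.0, 1.0, -1.0],
}
semantic_feature_set = filter_by_correlation(candidates, threshold=0.6)
```

With these sample values, "street" and "traffic" correlate strongly with each other and are kept, while "noise" falls below the threshold and is removed.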
With reference to the first aspect, in a first possible implementation manner of the first aspect, the semantic analysis further includes a clustering operation, and the scene segments of the same class are analyzed in a centralized manner.
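One simple reading of this implementation manner is sketched below. It assumes that "the same class" means an identical semantic tag, so the clustering reduces to grouping scene segments by tag; a real clustering operation could equally work on feature vectors. The `Segment` layout and the function name `group_segments_by_tag` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical segment layout: (start_frame, end_frame, semantic_tag).
Segment = Tuple[int, int, str]


def group_segments_by_tag(segments: List[Segment]) -> Dict[str, List[Tuple[int, int]]]:
    """Collect the frame ranges of all scene segments that share a tag."""
    groups: Dict[str, List[Tuple[int, int]]] = defaultdict(list)
    for start, end, tag in segments:
        groups[tag].append((start, end))
    return dict(groups)


segments = [(0, 120, "news"), (120, 300, "sports"), (300, 420, "news")]
grouped = group_segments_by_tag(segments)
```

Here the two "news" segments end up in one group, so they can be analyzed together in a single pass.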
With reference to the first aspect, in a second possible implementation manner of the first aspect, receiving the multimedia data stream sent by the acquisition terminal includes encoding and decoding the multimedia data stream.
With reference to the first aspect, in a third possible implementation manner of the first aspect, the semantic analysis uses a neural network model.
In a second aspect, the present application provides a system for scene division and content detection, the system comprising a processor and a memory:
the memory is used for storing program codes and transmitting the program codes to the processor;
the processor is configured to perform, according to instructions in the program code, the method of the first aspect or of any one of its possible implementation manners.
In a third aspect, the present application provides a computer readable storage medium for storing program code for performing the method of the first aspect or of any one of its possible implementation manners.
Advantageous effects
The invention provides a method and system for scene division and content detection. The required semantic feature set is determined through a state chain model and input into a calculation function, the probability density parameters of the state chain model are introduced, and the dividing lines between different scenes are calculated and determined. This enables accurate segment-by-segment content detection: each scene segment invokes its own content detection algorithm, which improves detection precision and reduces the amount of computation.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the embodiments will be briefly described below, and it will be obvious to those skilled in the art that other drawings can be obtained from these drawings without inventive effort.
FIG. 1 is a flow chart of the method of the present invention.
Detailed Description
The preferred embodiments of the present invention are described in detail below with reference to the accompanying drawings, so that the advantages and features of the present invention can be more easily understood by those skilled in the art and the scope of the present invention is more clearly defined.
Fig. 1 is a flowchart of a method for scene division and content detection provided in the present application, including:
receiving multimedia data sent by an acquisition terminal, extracting visual features, sound features and text features from the multimedia data, and generating a first vector matrix from the visual features, the sound features and the text features according to preset rules;
inputting the first vector matrix into a state chain model, determining an explicit feature distribution area corresponding to the multimedia data according to a preset probability density function, obtaining a possible implicit feature distribution area, extracting a plurality of second vector matrices from the possible implicit feature distribution area, and decomposing the second vector matrices to obtain implicit features;
semantically analyzing the implicit features to obtain a plurality of undetermined semantic features, calculating the correlation degree among the undetermined semantic features, removing the undetermined semantic features whose correlation degree is lower than a threshold value, and determining a semantic feature set corresponding to the multimedia data;
inputting the first vector matrix and the semantic feature set into a calculation function, introducing probability density parameters of the state chain model to obtain a conditional probability formula from the second vector matrix to the first vector matrix, and evaluating the conditional probability formula through a neural network model to obtain an optimal second vector matrix;
determining the dividing lines between different scenes according to the distribution among the optimal second vector matrices, dividing the multimedia data into different scene segments according to the dividing lines, and sequentially carrying out semantic analysis to obtain semantic tags corresponding to the different scene segments;
and calling different content detection algorithms according to the semantic tags, and carrying out content detection on the scene segments corresponding to the semantic tags.
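The last two steps of the flow, dividing the data at the dividing lines and calling a detection algorithm per semantic tag, might look like the sketch below. It assumes the dividing lines have already been calculated and are given as frame indices; how they are derived from the optimal second vector matrices is not shown. The detector functions, the `DETECTORS` table and the tag names are hypothetical placeholders for real content detection algorithms.

```python
from typing import Callable, Dict, List, Tuple

Segment = Tuple[int, int]  # half-open frame range [start, end)


def split_by_dividing_lines(n_frames: int, dividing_lines: List[int]) -> List[Segment]:
    """Cut the frame range [0, n_frames) at each dividing line."""
    # Drop duplicates and lines outside the data, then sort.
    cuts = sorted({d for d in dividing_lines if 0 < d < n_frames})
    bounds = [0] + cuts + [n_frames]
    return [(bounds[i], bounds[i + 1]) for i in range(len(bounds) - 1)]


# Placeholder detectors: each one only reports which range it was called on.
def detect_news(segment: Segment) -> str:
    return f"news:{segment[0]}-{segment[1]}"


def detect_sports(segment: Segment) -> str:
    return f"sports:{segment[0]}-{segment[1]}"


def detect_generic(segment: Segment) -> str:
    return f"generic:{segment[0]}-{segment[1]}"


# Hypothetical mapping from semantic tag to content detection algorithm.
DETECTORS: Dict[str, Callable[[Segment], str]] = {
    "news": detect_news,
    "sports": detect_sports,
}


def run_content_detection(segments: List[Segment], tags: List[str]) -> List[str]:
    """Call the detector registered for each segment's tag, else the generic one."""
    if len(segments) != len(tags):
        raise ValueError("one semantic tag is required per scene segment")
    return [DETECTORS.get(tag, detect_generic)(seg) for seg, tag in zip(segments, tags)]


segments = split_by_dividing_lines(300, [210, 120])
tags = ["news", "sports", "unknown"]  # assumed output of the semantic analysis
results = run_content_detection(segments, tags)
```

Keeping the tag-to-detector mapping in a table means a new scene type only needs a new entry, and a segment with an unrecognised tag still receives a fallback check instead of being skipped.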
In some preferred embodiments, the semantic analysis further includes a clustering operation that centrally analyzes scene segments of the same class.
In some preferred embodiments, receiving the multimedia data stream sent by the acquisition terminal includes encoding and decoding the multimedia data stream.
In some preferred embodiments, the semantic analysis employs a neural network model.
The present application provides a system for scene division and content detection, the system comprising a processor and a memory:
the memory is used for storing program codes and transmitting the program codes to the processor;
the processor is configured to perform the method according to any of the embodiments of the first aspect according to instructions in the program code.
The present application provides a computer readable storage medium for storing program code for performing the method of any one of the embodiments of the first aspect.
In a specific implementation, the present invention also provides a computer storage medium that may store a program which, when executed, may perform some or all of the steps in the various embodiments of the present invention. The storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM) or a random access memory (RAM).
It will be apparent to those skilled in the art that the techniques of the embodiments of the present invention may be implemented in software plus a necessary general purpose hardware platform. Based on such understanding, the technical solutions in the embodiments of the present invention, in essence or in the part that contributes over the prior art, may be embodied in the form of a software product. The software product may be stored in a storage medium, such as a ROM/RAM, a magnetic disk or an optical disk, and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, etc.) to execute the method described in the embodiments, or in some parts of the embodiments, of the present invention.
For the same or similar parts, the embodiments of this description may be referred to one another. In particular, since the system and storage medium embodiments are substantially similar to the method embodiments, they are described relatively briefly; refer to the description of the method embodiments for the relevant details.
The embodiments of the present invention described above do not limit the scope of the present invention.

Claims (6)

1. A method of scene division and content detection, the method comprising:
receiving multimedia data sent by an acquisition terminal, extracting visual features, sound features and text features from the multimedia data, and generating a first vector matrix from the visual features, the sound features and the text features according to preset rules;
inputting the first vector matrix into a state chain model, determining an explicit feature distribution area corresponding to the multimedia data according to a preset probability density function, obtaining a possible implicit feature distribution area, extracting a plurality of second vector matrices from the possible implicit feature distribution area, and decomposing the second vector matrices to obtain implicit features;
semantically analyzing the implicit features to obtain a plurality of undetermined semantic features, calculating the correlation degree among the undetermined semantic features, removing the undetermined semantic features whose correlation degree is lower than a threshold value, and determining a semantic feature set corresponding to the multimedia data;
inputting the first vector matrix and the semantic feature set into a calculation function, introducing probability density parameters of the state chain model to obtain a conditional probability formula from the second vector matrix to the first vector matrix, and evaluating the conditional probability formula through a neural network model to obtain an optimal second vector matrix;
determining the dividing lines between different scenes according to the distribution among the optimal second vector matrices, dividing the multimedia data into different scene segments according to the dividing lines, and sequentially carrying out semantic analysis to obtain semantic tags corresponding to the different scene segments;
and calling different content detection algorithms according to the semantic tags, and carrying out content detection on the scene segments corresponding to the semantic tags.
2. The method according to claim 1, characterized in that: the semantic analysis further comprises a clustering operation, and the scene segments of the same class are analyzed in a centralized manner.
3. The method according to claim 2, characterized in that: receiving the multimedia data stream sent by the acquisition terminal comprises encoding and decoding the multimedia data stream.
4. A method according to claim 3, characterized in that: the semantic analysis adopts a neural network model.
5. A system for scene division and content detection, the system comprising a processor and a memory:
the memory is used for storing program codes and transmitting the program codes to the processor;
the processor is configured to perform the method according to any of the claims 1-4 according to instructions in the program code.
6. A computer readable storage medium, characterized in that the computer readable storage medium is for storing program code for performing the method of any one of claims 1-4.
CN202210685018.4A 2022-06-15 2022-06-15 Scene division and content detection method and system Active CN115019235B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210685018.4A CN115019235B (en) 2022-06-15 2022-06-15 Scene division and content detection method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210685018.4A CN115019235B (en) 2022-06-15 2022-06-15 Scene division and content detection method and system

Publications (2)

Publication Number Publication Date
CN115019235A CN115019235A (en) 2022-09-06
CN115019235B CN115019235B (en) 2023-06-27

Family

ID=83075176

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210685018.4A Active CN115019235B (en) 2022-06-15 2022-06-15 Scene division and content detection method and system

Country Status (1)

Country Link
CN (1) CN115019235B (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111598710A (en) * 2020-05-11 2020-08-28 北京邮电大学 Method and device for detecting social network events

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6185534B1 (en) * 1998-03-23 2001-02-06 Microsoft Corporation Modeling emotion and personality in a computer user interface
US7382933B2 (en) * 2005-08-24 2008-06-03 International Business Machines Corporation System and method for semantic video segmentation based on joint audiovisual and text analysis
CN109859741A (en) * 2019-01-31 2019-06-07 成都终身成长科技有限公司 Voice assessment method, device, electronic equipment and storage medium
GB2581808B (en) * 2019-02-26 2022-08-10 Imperial College Innovations Ltd Scene representation using image processing
CN111241849A (en) * 2020-01-21 2020-06-05 重庆理工大学 Text semantic analysis method and system
CN112488116B (en) * 2020-11-27 2024-02-02 杭州电子科技大学 Scene understanding semantic generation method based on multi-mode embedding
CN114490926A (en) * 2021-12-30 2022-05-13 特斯联科技集团有限公司 Method and device for determining similar problems, storage medium and terminal

Also Published As

Publication number Publication date
CN115019235A (en) 2022-09-06

Similar Documents

Publication Publication Date Title
CN110781960B (en) Training method, classification method, device and equipment of video classification model
CN110210038B (en) Core entity determining method, system, server and computer readable medium thereof
CN112528637A (en) Text processing model training method and device, computer equipment and storage medium
CN117409419A (en) Image detection method, device and storage medium
CN112711944B (en) Word segmentation method and system, and word segmentation device generation method and system
CN111723182B (en) Key information extraction method and device for vulnerability text
CN115019235B (en) Scene division and content detection method and system
CN115314268B (en) Malicious encryption traffic detection method and system based on traffic fingerprint and behavior
CN115203206A (en) Data content searching method and device, computer equipment and readable storage medium
CN115186647A (en) Text similarity detection method and device, electronic equipment and storage medium
CN112287663B (en) Text parsing method, equipment, terminal and storage medium
CN114302227A (en) Method and system for collecting and analyzing network video based on container collection
CN114172705A (en) Network big data analysis method and system based on pattern recognition
CN113420127A (en) Threat information processing method, device, computing equipment and storage medium
CN115550684B (en) Improved video content filtering method and system
CN115019234A (en) Improved scene content detection method and system
CN113139187B (en) Method and device for generating and detecting pre-training language model
CN114519357B (en) Natural language processing method and system based on machine learning
CN116866211B (en) Improved depth synthesis detection method and system
CN115526179A (en) Semantic analysis and identification method and system based on weak supervision network
CN116431773A (en) Dialogue flow extraction method and device, computer readable storage medium and terminal
CN114519828A (en) Video detection method and system based on semantic analysis
CN112632229A (en) Text clustering method and device
CN114155461A (en) Method and system for filtering and purifying tiny video content
CN114691824A (en) Theme extraction method, device and equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant