CN110097026A - Paragraph association rule evaluation method based on multi-dimensional element video segmentation - Google Patents

Paragraph association rule evaluation method based on multi-dimensional element video segmentation

Info

Publication number
CN110097026A
CN110097026A
Authority
CN
China
Prior art keywords
video
frame
segmentation
audio
key frame
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910395119.6A
Other languages
Chinese (zh)
Other versions
CN110097026B (en)
Inventor
胡燕祝
田雯嘉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Posts and Telecommunications
Original Assignee
Beijing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Posts and Telecommunications filed Critical Beijing University of Posts and Telecommunications
Priority to CN201910395119.6A priority Critical patent/CN110097026B/en
Publication of CN110097026A publication Critical patent/CN110097026A/en
Application granted granted Critical
Publication of CN110097026B publication Critical patent/CN110097026B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00: Scenes; Scene-specific elements
    • G06V20/40: Scenes; Scene-specific elements in video content
    • G06V20/46: Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • G06V20/49: Segmenting video sequences, i.e. computational techniques such as parsing or cutting the sequence, low-level clustering or determining units such as shots or scenes

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Computing Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a paragraph association rule evaluation method based on multi-dimensional element video segmentation. The method comprises: step 1, video parsing; step 2, key frame extraction for scene segmentation; step 3, scene segmentation based on key frames; step 4, audio segmentation of the video; step 5, semantic segmentation of the video; step 6, paragraph association rule evaluation of the segmented video with a GNN network; step 7, construction of an association network. After segmenting the same video along multiple dimensions, the invention matches the corresponding multi-dimensional elements by constructing paragraph association rules. Compared with other paragraph association rule evaluation methods for video segmentation, the invention combines the variation of pixels over time with the correlation between adjacent frames in the image sequence, achieves a good segmentation of the video in the image dimension, preserves the key information of the video, and can therefore provide an effective paragraph association rule evaluation method for multi-dimensional element video segmentation.

Description

Paragraph association rule evaluation method based on multi-dimensional element video segmentation
Technical field
The invention relates to paragraph association rule evaluation methods, and more particularly to a paragraph association rule evaluation method based on multi-dimensional element video segmentation.
Background technique
Current work on video structuring mostly segments video along the single dimension of the image; little of it studies video structuring techniques based on multi-dimensional segmentation. In practice, however, the audio information, text information and so on contained in a video also play an important role in video surveillance work. In addition, when key frames are extracted to segment the moving objects in a video, a single frame is often simply taken as the key frame for the sake of computational efficiency, which overlooks important information contained in the video; alternatively, key frames are chosen by comparing the visual features of successive video frames against a preset threshold. Neither approach considers the variation of pixels over time in the image sequence, the correlation between adjacent frames, or the correspondence between the previous frame and the current frame. Meanwhile, after the same video has been segmented in the three dimensions of scene, sound and text, video segments covering different time periods are obtained. The segments produced in these three dimensions cannot be aligned perfectly, and overlaps occur. It is therefore necessary to establish a paragraph association rule evaluation method that can fully match the three-dimensional elements of scene, sound and text.
Video structuring is now widely applied, for example in fire-fighting facility monitoring systems in public places, in public safety, and in safe-city applications. With the large-scale deployment of urban video surveillance systems, video surveillance has reached every corner of the city, and industries such as intelligent transportation, government regulation and enterprise operation generate large amounts of surveillance video data. As edge computing, cloud computing and big-data technologies continue to mature, the problems of huge data volume, difficult storage and inconvenient retrieval become increasingly prominent. For large-scale real-time surveillance video, image-processing tasks such as real-time spatio-temporal annotation, character extraction, feature extraction, object classification and structured labelling must be performed on the video stream, and the results quickly transferred to a computing centre for processing. A paragraph association rule evaluation method for multi-dimensional element video segmentation that can quickly and accurately match scene, sound and text therefore needs to be constructed, providing government and enterprise operations with a real-time and efficient means of surveillance.
Summary of the invention
To address the above problems in the prior art, the invention provides a paragraph association rule evaluation method based on multi-dimensional element video segmentation; the overall procedure is shown in Figure 1.
The technical solution is implemented in the following steps:
Step 1: video parsing.
The first step of video parsing is data reception: the video must be demultiplexed and decomposed into a picture track, an audio track and a subtitle track.
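As an illustration of this demultiplexing step, the following is a minimal sketch using the ffmpeg command-line tool from Python; the file names, stream indices and the presence of a text subtitle stream are assumptions made purely for illustration, since the invention does not prescribe a particular tool.
```python
# A minimal demultiplexing sketch: split a container into separate
# video, audio and subtitle tracks with ffmpeg. File names and the
# existence of a subtitle stream are illustrative assumptions.
import subprocess

def demux(path: str) -> None:
    subprocess.run([
        "ffmpeg", "-i", path,
        "-map", "0:v:0", "-c", "copy", "video_track.mp4",   # picture track
        "-map", "0:a:0", "-c", "copy", "audio_track.aac",   # audio track
        "-map", "0:s:0", "subtitle_track.srt",              # subtitle track
    ], check=True)

demux("surveillance.mp4")
```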
Step 2: key frame extraction for scene segmentation.
Key frame extraction methods fall broadly into five classes; the specific methods are shown in Figure 2.
(1) Boundary-based key frame extraction. This method directly takes the first and last frames, or the middle frame, of each shot as key frames. Its computational cost is low, which suits shots whose content moves little or remains unchanged.
(2) Visual-feature-based key frame extraction. This method first selects the first frame as the most recent key frame; subsequent frames are then compared with it in turn on visual features including colour, motion, edges, shape and spatial relationships. As soon as the difference between the current frame and the most recent key frame exceeds a predetermined threshold, the current frame is chosen as a new key frame.
(3) Cluster-based key frame extraction. Methods of this kind apply clustering: all frames of a shot are clustered, the key clusters are chosen from these clusters according to some criterion, such as the number of frames in each cluster, and the frame with the smallest clustering parameter in each key cluster is then chosen as a key frame (a sketch of this approach follows the list).
(4) Multi-mode key frame extraction. Methods of this kind mainly imitate human perception to simplify the analysis of video content, usually combining video, audio, text and so on. In videos such as films and sports broadcasts, for example, a scene switch often changes the video and audio content simultaneously, so a multi-mode extraction method is needed: when the audio and video features at a shot boundary both change greatly, that boundary marks a new scene.
(5) Compressed-domain key frame extraction. Compressed-domain methods extract key frames directly from an MPEG compressed video stream without decompressing it, or with only partial decompression, which reduces computational complexity.
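The following is a minimal sketch of the cluster-based approach (3), assuming OpenCV colour histograms as frame features and k-means as the clustering technique; the invention fixes neither choice, and taking the frame nearest each cluster centre stands in for the "smallest clustering parameter" criterion.
```python
# Cluster-based key-frame extraction sketch: describe every frame by a
# normalised colour histogram, cluster with k-means, and keep the frame
# closest to each cluster centre as a key frame. OpenCV and
# scikit-learn are assumptions; the patent names no library.
import cv2
import numpy as np
from sklearn.cluster import KMeans

def key_frames(video_path: str, k: int = 5) -> list[int]:
    cap = cv2.VideoCapture(video_path)
    hists = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        h = cv2.calcHist([frame], [0, 1, 2], None, [8, 8, 8], [0, 256] * 3)
        hists.append(cv2.normalize(h, h).flatten())
    cap.release()
    X = np.array(hists)
    km = KMeans(n_clusters=k, n_init=10).fit(X)
    # index of the frame nearest each cluster centre
    return [int(np.argmin(np.linalg.norm(X - c, axis=1)))
            for c in km.cluster_centers_]
```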
Step 3: scene segmentation based on key frames.
This mainly involves the following three detection approaches (a sketch of the first follows the list):
(1) Detection based on inter-frame difference. The frame difference method obtains the contour of a moving object by computing the difference between two adjacent frames in the video image sequence; it is well suited to scenes that contain multiple moving objects and a moving camera.
(2) Detection based on background difference. Background subtraction is a general method for motion segmentation of static scenes: the currently acquired image frame is differenced against a background image to obtain a grey-scale map of the moving target region, which is thresholded to extract the moving region. To avoid the influence of ambient lighting changes, the background image is updated according to the currently acquired frame. Details are shown in Figure 3.
(3) Detection based on optical flow. The optical flow method uses the variation of pixels over time in the image sequence and the correlation between adjacent frames to compute, from the correspondence between the previous frame and the current frame, the motion information of objects between adjacent frames.
(4) The segmented video can be represented as x1, …, xi, where x denotes a time period of the segmented video and i the number of video segments.
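The following is a minimal sketch of inter-frame difference detection (1), assuming OpenCV and a mean-absolute-difference threshold chosen purely for illustration.
```python
# Frame-difference boundary detector sketch: the mean absolute
# difference between consecutive grey-scale frames is thresholded to
# flag candidate segment boundaries x_i. The threshold is illustrative.
import cv2

def scene_boundaries(video_path: str, thresh: float = 30.0) -> list[int]:
    cap = cv2.VideoCapture(video_path)
    boundaries, prev, idx = [], None, 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        if prev is not None and cv2.absdiff(gray, prev).mean() > thresh:
            boundaries.append(idx)      # start of a new segment
        prev, idx = gray, idx + 1
    cap.release()
    return boundaries
```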
Step 4: audio segmentation of the video.
The EMD-based audio segmentation method proceeds as follows (a sketch of one sifting iteration follows the list):
(1) For the original audio data sequence X(t), determine all local maxima and fit them with a cubic spline function to form the upper envelope of the original data.
(2) Find all local minima and fit them with a cubic spline function to form the lower envelope of the data.
(3) Denote the mean of the upper and lower envelopes by m1, and subtract the mean envelope m1 from the original data sequence X(t) to obtain a new audio data sequence h1, as shown by the equation:
h1 = X(t) - m1
(4) Cluster and segment the audio data obtained from the EMD decomposition.
(5) The segmented audio can be represented as y1, …, yj, where y denotes a time period of the segmented audio and j the number of audio segments.
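The following is a sketch of one EMD sifting iteration covering steps (1) to (3), assuming SciPy's cubic spline and extremum routines; the cluster segmentation of steps (4) and (5) is omitted.
```python
# One EMD sifting iteration: cubic-spline upper and lower envelopes
# through the local extrema, their mean m1, and the residual
# h1 = X(t) - m1. SciPy is an assumption; the patent names no library.
import numpy as np
from scipy.interpolate import CubicSpline
from scipy.signal import argrelextrema

def sift_once(x: np.ndarray) -> np.ndarray:
    t = np.arange(len(x))
    maxima = argrelextrema(x, np.greater)[0]    # all local maxima
    minima = argrelextrema(x, np.less)[0]       # all local minima
    upper = CubicSpline(maxima, x[maxima])(t)   # upper envelope
    lower = CubicSpline(minima, x[minima])(t)   # lower envelope
    m1 = (upper + lower) / 2.0                  # mean envelope
    return x - m1                               # h1 = X(t) - m1
```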
Step 5: semantic segmentation of the video.
The semantic segmentation of paragraphs mainly comprises the following aspects (a toy sketch follows the list):
(1) Define semantic chunks. A semantic chunk divides a sentence into several relatively independent semantic units whose length lies between the word sense and the sentence sense; it is a preprocessing device that combines syntax, semantics and pragmatics. Semantic chunks are mutually non-recursive, non-nested and non-overlapping.
(2) Divide the sentence sense. Natural language processing usually requires analysis at three levels: syntax, semantics and context. Statistical processing for text word segmentation and part-of-speech tagging is therefore performed first; after the words have been classified they are rapidly annotated, semantic recombination is then carried out on the words, and finally the sentence sense is segmented according to the semantic chunks defined above.
(3) The segmented paragraphs can be represented as z1, …, zk, where z denotes a time period of the segmented text and k the number of text segments.
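The following is a toy sketch of the word segmentation and part-of-speech tagging that precede semantic-chunk division, assuming the jieba library and a naive rule that closes a chunk at punctuation; the invention defines semantic chunks but prescribes no concrete algorithm.
```python
# Toy chunking sketch: Chinese word segmentation plus POS tagging with
# jieba, with chunks closed at punctuation (jieba flags punctuation
# as 'x'). The chunking rule is an illustrative assumption.
import jieba.posseg as pseg

def semantic_chunks(sentence: str) -> list[list[tuple[str, str]]]:
    chunks, current = [], []
    for token in pseg.cut(sentence):        # yields (word, POS flag) pairs
        if token.flag == "x":
            if current:
                chunks.append(current)
                current = []
        else:
            current.append((token.word, token.flag))
    if current:
        chunks.append(current)
    return chunks

print(semantic_chunks("行人止步，车辆拥堵现象严重。"))
```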
Step 6: paragraph association rule evaluation of the segmented video with a GNN network.
A graph neural network (GNN, Graph Neural Network) can effectively model the relationships or interactions between objects. After the same video has been segmented in the three dimensions of scene, sound and paragraph as above, video segments covering different time periods are obtained; the segments produced in the three dimensions cannot be aligned perfectly, and overlaps occur. The invention therefore uses a GNN to evaluate the association between the video paragraphs obtained from the above segmentation. Let t denote each second of video; GNN(t|x), GNN(t|y) and GNN(t|z) denote the feature vectors currently extracted from the segments in each dimension.
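The following is an illustrative sketch of this evaluation, assuming the simplest possible GNN update (one round of mean-neighbour message passing over segment feature vectors, with edges between temporally overlapping segments) and cosine similarity as the association score; the invention names a GNN but fixes no architecture, so everything below is an assumption.
```python
# Minimal GNN-style association sketch: segment feature vectors are
# graph nodes, adj marks temporally overlapping segments, one round of
# mean-aggregation message passing refines each embedding, and cosine
# similarity scores pairwise association. Purely illustrative.
import numpy as np

def message_pass(feats: np.ndarray, adj: np.ndarray) -> np.ndarray:
    """One mean-aggregation layer: h_i' = mean of neighbour h_j."""
    deg = adj.sum(axis=1, keepdims=True).clip(min=1)
    return (adj @ feats) / deg

def association(h: np.ndarray, i: int, j: int) -> float:
    """Cosine similarity between two updated segment embeddings."""
    return float(h[i] @ h[j] /
                 (np.linalg.norm(h[i]) * np.linalg.norm(h[j])))
```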
Step 7: construct the association network.
Constructing the association network is divided into two steps (see the sketch after this list):
(1) Within each single dimension, construct the network association rules within each video segment according to Euclidean distance or Hamming distance, including the strength and direction between nodes.
(2) Combine the association networks of the three dimensions with each other to form a new directed association network.
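The following is a sketch of these two steps, assuming Euclidean distance, an illustrative threshold, and the networkx library; edge direction follows the temporal order of segments, and edge weight stands in for association strength.
```python
# Association-network sketch: within one dimension, link segments whose
# feature vectors are within a Euclidean threshold (weight = strength,
# direction = temporal order); then merge the three per-dimension
# graphs into one directed network. Threshold and library are
# illustrative assumptions.
import itertools
import networkx as nx
import numpy as np

def dimension_graph(feats: np.ndarray, label: str,
                    eps: float = 1.0) -> nx.DiGraph:
    g = nx.DiGraph()
    for i, j in itertools.combinations(range(len(feats)), 2):
        d = float(np.linalg.norm(feats[i] - feats[j]))
        if d < eps:  # earlier segment points to the later one
            g.add_edge(f"{label}{i+1}", f"{label}{j+1}",
                       weight=1.0 - d / eps)
    return g

def merged_network(graphs: list[nx.DiGraph]) -> nx.DiGraph:
    return nx.compose_all(graphs)   # one directed association network

# e.g. merged_network([dimension_graph(fx, "x"), dimension_graph(fy, "y"),
#                      dimension_graph(fz, "z")])
```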
Compared with the prior art, the invention has the following advantages:
(1) The invention combines the variation of pixels over time in the image sequence, the correlation between adjacent frames, and the correspondence between the previous frame and the current frame, achieving a good segmentation of the video in the image dimension and preserving the key information of the video.
(2) After segmenting the same video in the three dimensions of scene, sound and text, the invention matches the corresponding scene, sound and text by constructing paragraph association rules.
Brief description of the drawings
For a better understanding of the invention, it is further described below with reference to the accompanying drawings.
Fig. 1 is a flow chart of the steps for establishing the paragraph association rule evaluation method based on multi-dimensional element video segmentation;
Fig. 2 is a schematic diagram of the key frame extraction methods;
Fig. 3 is a schematic diagram of the background difference detection method.
Specific embodiment
The invention is described in further detail below through an implementation case.
The technical solution is implemented in the following steps:
Step 1: video parsing.
The first step of video parsing is data reception: the video must be demultiplexed and decomposed into a picture track, an audio track and a subtitle track.
A traffic surveillance video from a location in Beijing, 1 minute 50 seconds long, is demultiplexed and decomposed into an image track, an audio track and a subtitle track; after decomposition, the audio track and the subtitle track are each 1 minute 50 seconds long.
Step 2: key frame extraction for scene segmentation.
Key frame extraction methods fall broadly into five classes; the specific methods are shown in Figure 2.
(1) Boundary-based key frame extraction. This method directly takes the first and last frames, or the middle frame, of each shot as key frames. Its computational cost is low, which suits shots whose content moves little or remains unchanged.
(2) Visual-feature-based key frame extraction. This method first selects the first frame as the most recent key frame; subsequent frames are then compared with it in turn on visual features including colour, motion, edges, shape and spatial relationships. As soon as the difference between the current frame and the most recent key frame exceeds a predetermined threshold, the current frame is chosen as a new key frame.
(3) Cluster-based key frame extraction. Methods of this kind apply clustering: all frames of a shot are clustered, the key clusters are chosen from these clusters according to some criterion, such as the number of frames in each cluster, and the frame with the smallest clustering parameter in each key cluster is then chosen as a key frame.
(4) Multi-mode key frame extraction. Methods of this kind mainly imitate human perception to simplify the analysis of video content, usually combining video, audio, text and so on. In videos such as films and sports broadcasts, for example, a scene switch often changes the video and audio content simultaneously, so a multi-mode extraction method is needed: when the audio and video features at a shot boundary both change greatly, that boundary marks a new scene.
(5) Compressed-domain key frame extraction. Compressed-domain methods extract key frames directly from an MPEG compressed video stream without decompressing it, or with only partial decompression, which reduces computational complexity.
In this example, the video is processed with the cluster-based key frame extraction method, and the key frames are clustered into 5 classes.
Step 3: scene segmentation based on key frames.
This mainly involves the following three detection approaches:
(1) Detection based on inter-frame difference. The frame difference method obtains the contour of a moving object by computing the difference between two adjacent frames in the video image sequence; it is well suited to scenes that contain multiple moving objects and a moving camera.
(2) Detection based on background difference. Background subtraction is a general method for motion segmentation of static scenes: the currently acquired image frame is differenced against a background image to obtain a grey-scale map of the moving target region, which is thresholded to extract the moving region. To avoid the influence of ambient lighting changes, the background image is updated according to the currently acquired frame. Details are shown in Figure 3.
(3) Detection based on optical flow. The optical flow method uses the variation of pixels over time in the image sequence and the correlation between adjacent frames to compute, from the correspondence between the previous frame and the current frame, the motion information of objects between adjacent frames.
(4) The segmented video can be represented as x1, …, xi, where x denotes a time period of the segmented video and i the number of video segments.
After key frame extraction, the video is segmented with the optical flow detection technique; the segmented video comprises 25 segments in total, namely x1, x2, …, x25.
Step 4: audio segmentation of the video.
The EMD-based audio segmentation method proceeds as follows:
(1) For the original audio data sequence X(t), determine all local maxima and fit them with a cubic spline function to form the upper envelope of the original data.
(2) Find all local minima and fit them with a cubic spline function to form the lower envelope of the data.
(3) Denote the mean of the upper and lower envelopes by m1, and subtract the mean envelope m1 from the original data sequence X(t) to obtain a new audio data sequence h1, as shown by the equation:
h1 = X(t) - m1
(4) Cluster and segment the audio data obtained from the EMD decomposition.
(5) The segmented audio can be represented as y1, …, yj, where y denotes a time period of the segmented audio and j the number of audio segments.
The local maxima contained in the original audio data sequence X(t) are 2.3, 2.1, 2, 1.9, 1.8, 1.7, 0.9 and 0.8; the local minima are -1.9, -2.1, -2.6, -3.0, 0, -1.0 and -0.5. The computed mean of the upper envelope is 1.6875 and the mean of the lower envelope is -1.586 (verified in the short check below). The number of audio segments after segmentation is 25, namely y1, y2, …, y25.
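As a quick check of the quoted means, assuming they are taken directly over the listed extrema values:
```python
# Verify the envelope means quoted above from the listed extrema.
maxima = [2.3, 2.1, 2, 1.9, 1.8, 1.7, 0.9, 0.8]
minima = [-1.9, -2.1, -2.6, -3.0, 0, -1.0, -0.5]
print(sum(maxima) / len(maxima))              # 1.6875
print(round(sum(minima) / len(minima), 3))    # -1.586
```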
Step 5: semantic segmentation of the video.
The semantic segmentation of paragraphs mainly comprises the following aspects:
(1) Define semantic chunks. A semantic chunk divides a sentence into several relatively independent semantic units whose length lies between the word sense and the sentence sense; it is a preprocessing device that combines syntax, semantics and pragmatics. Semantic chunks are mutually non-recursive, non-nested and non-overlapping.
(2) Divide the sentence sense. Natural language processing usually requires analysis at three levels: syntax, semantics and context. Statistical processing for text word segmentation and part-of-speech tagging is therefore performed first; after the words have been classified they are rapidly annotated, semantic recombination is then carried out on the words, and finally the sentence sense is segmented according to the semantic chunks defined above.
(3) The segmented paragraphs can be represented as z1, …, zk, where z denotes a time period of the segmented text and k the number of text segments.
The number of text segments after segmentation is 25, namely z1, z2, …, z25, with contents such as "turn right at the crossroad", "pedestrians halt" and "serious vehicle congestion".
Step 6: paragraph association rule evaluation of the segmented video with a GNN network.
A graph neural network (GNN, Graph Neural Network) can effectively model the relationships or interactions between objects. After the same video has been segmented in the three dimensions of scene, sound and paragraph as above, video segments covering different time periods are obtained; the segments produced in the three dimensions cannot be aligned perfectly, and overlaps occur. The invention therefore uses a GNN to evaluate the association between the video paragraphs obtained from the above segmentation. Let t denote each second of video; GNN(t|x), GNN(t|y) and GNN(t|z) denote the feature vectors currently extracted from the segments in each dimension.
Taking the feature vectors of the segments in each dimension at the 5 s mark, the scene feature vector is GNN(5|x1, x2, …, x25), the sound feature vector is GNN(5|y1, y2, …, y25), and the paragraph feature vector is GNN(5|z1, z2, …, z25).
Step 7: construct the association network.
Constructing the association network is divided into two steps.
(1) Within each single dimension, construct the network association rules within each video segment according to Euclidean distance or Hamming distance, including the strength and direction between nodes.
(2) Combine the association networks of the three dimensions with each other to form a new directed association network.

Claims (1)

1. A paragraph association rule evaluation method based on multi-dimensional element video segmentation, characterized by the following steps:
Step 1: video parsing.
The first step of video parsing is data reception: the video is demultiplexed and decomposed into a picture track, an audio track and a subtitle track.
Step 2: key frame extraction for scene segmentation.
Key frame extraction methods fall broadly into five classes; the specific methods are shown in Figure 2.
(1) Boundary-based key frame extraction. This method directly takes the first and last frames, or the middle frame, of each shot as key frames. Its computational cost is low, which suits shots whose content moves little or remains unchanged.
(2) Visual-feature-based key frame extraction. This method first selects the first frame as the most recent key frame; subsequent frames are then compared with it in turn on visual features including colour, motion, edges, shape and spatial relationships. As soon as the difference between the current frame and the most recent key frame exceeds a predetermined threshold, the current frame is chosen as a new key frame.
(3) Cluster-based key frame extraction. Methods of this kind apply clustering: all frames of a shot are clustered, the key clusters are chosen from these clusters according to some criterion, such as the number of frames in each cluster, and the frame with the smallest clustering parameter in each key cluster is then chosen as a key frame.
(4) Multi-mode key frame extraction. Methods of this kind mainly imitate human perception to simplify the analysis of video content, usually combining video, audio, text and so on. In videos such as films and sports broadcasts, for example, a scene switch often changes the video and audio content simultaneously, so a multi-mode extraction method is needed: when the audio and video features at a shot boundary both change greatly, that boundary marks a new scene.
(5) Compressed-domain key frame extraction. Compressed-domain methods extract key frames directly from an MPEG compressed video stream without decompressing it, or with only partial decompression, which reduces computational complexity.
Step 3: scene segmentation based on key frames.
This mainly involves the following three detection approaches:
(1) Detection based on inter-frame difference. The frame difference method obtains the contour of a moving object by computing the difference between two adjacent frames in the video image sequence; it is well suited to scenes that contain multiple moving objects and a moving camera.
(2) Detection based on background difference. Background subtraction is a general method for motion segmentation of static scenes: the currently acquired image frame is differenced against a background image to obtain a grey-scale map of the moving target region, which is thresholded to extract the moving region. To avoid the influence of ambient lighting changes, the background image is updated according to the currently acquired frame. Details are shown in Figure 3.
(3) Detection based on optical flow. The optical flow method uses the variation of pixels over time in the image sequence and the correlation between adjacent frames to compute, from the correspondence between the previous frame and the current frame, the motion information of objects between adjacent frames.
(4) The segmented video can be represented as x1, …, xi, where x denotes a time period of the segmented video and i the number of video segments.
Step 4: audio segmentation of the video.
The EMD-based audio segmentation method proceeds as follows:
(1) For the original audio data sequence X(t), determine all local maxima and fit them with a cubic spline function to form the upper envelope of the original data.
(2) Find all local minima and fit them with a cubic spline function to form the lower envelope of the data.
(3) Denote the mean of the upper and lower envelopes by m1, and subtract the mean envelope m1 from the original data sequence X(t) to obtain a new audio data sequence h1, as shown by the equation:
h1 = X(t) - m1
(4) Cluster and segment the audio data obtained from the EMD decomposition.
(5) The segmented audio can be represented as y1, …, yj, where y denotes a time period of the segmented audio and j the number of audio segments.
Step 5: semantic segmentation of the video.
The semantic segmentation of paragraphs mainly comprises the following aspects:
(1) Define semantic chunks. A semantic chunk divides a sentence into several relatively independent semantic units whose length lies between the word sense and the sentence sense; it is a preprocessing device that combines syntax, semantics and pragmatics. Semantic chunks are mutually non-recursive, non-nested and non-overlapping.
(2) Divide the sentence sense. Natural language processing usually requires analysis at three levels: syntax, semantics and context. Statistical processing for text word segmentation and part-of-speech tagging is therefore performed first; after the words have been classified they are rapidly annotated, semantic recombination is then carried out on the words, and finally the sentence sense is segmented according to the semantic chunks defined above.
(3) The segmented paragraphs can be represented as z1, …, zk, where z denotes a time period of the segmented text and k the number of text segments.
Step 6: paragraph association rule evaluation of the segmented video with a GNN network.
A graph neural network (GNN, Graph Neural Network) can effectively model the relationships or interactions between objects. After the same video has been segmented in the three dimensions of scene, sound and paragraph as above, video segments covering different time periods are obtained; the segments produced in the three dimensions cannot be aligned perfectly, and overlaps occur. A GNN is therefore used to evaluate the association between the video paragraphs obtained from the above segmentation. Let t denote each second of video; GNN(t|x), GNN(t|y) and GNN(t|z) denote the feature vectors currently extracted from the segments in each dimension.
Step 7: construct the association network.
Constructing the association network is divided into two steps.
(1) Within each single dimension, construct the network association rules within each video segment according to Euclidean distance or Hamming distance, including the strength and direction between nodes.
(2) Combine the association networks of the three dimensions with each other to form a new directed association network.
CN201910395119.6A 2019-05-13 2019-05-13 Paragraph association rule evaluation method based on multi-dimensional element video segmentation Active CN110097026B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910395119.6A CN110097026B (en) 2019-05-13 2019-05-13 Paragraph association rule evaluation method based on multi-dimensional element video segmentation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910395119.6A CN110097026B (en) 2019-05-13 2019-05-13 Paragraph association rule evaluation method based on multi-dimensional element video segmentation

Publications (2)

Publication Number Publication Date
CN110097026A true CN110097026A (en) 2019-08-06
CN110097026B CN110097026B (en) 2021-04-27

Family

ID=67447957

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910395119.6A Active CN110097026B (en) 2019-05-13 2019-05-13 Paragraph association rule evaluation method based on multi-dimensional element video segmentation

Country Status (1)

Country Link
CN (1) CN110097026B (en)


Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080310731A1 (en) * 2007-06-18 2008-12-18 Zeitera, Llc Methods and Apparatus for Providing a Scalable Identification of Digital Video Sequences
CN106780503A (en) * 2016-12-30 2017-05-31 北京师范大学 Remote sensing images optimum segmentation yardstick based on posterior probability information entropy determines method
CN109344780A (en) * 2018-10-11 2019-02-15 上海极链网络科技有限公司 Multi-modal video scene segmentation method based on sound and vision
CN109711379A (en) * 2019-01-02 2019-05-03 电子科技大学 Candidate region extraction and recognition method for traffic lights in complex environments

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111126262A (en) * 2019-12-24 2020-05-08 中国科学院自动化研究所 Video highlight detection method and device based on graph neural network
CN111126262B (en) * 2019-12-24 2023-04-28 中国科学院自动化研究所 Video highlight detection method and device based on graph neural network
CN111586494A (en) * 2020-04-30 2020-08-25 杭州慧川智能科技有限公司 Intelligent strip splitting method based on audio and video separation
CN111586494B (en) * 2020-04-30 2022-03-11 腾讯科技(深圳)有限公司 Intelligent strip splitting method based on audio and video separation
CN113810782A (en) * 2020-06-12 2021-12-17 阿里巴巴集团控股有限公司 Video processing method and device, server and electronic device
CN111914118A (en) * 2020-07-22 2020-11-10 珠海大横琴科技发展有限公司 Video analysis method, device and equipment based on big data and storage medium
CN113470048A (en) * 2021-07-06 2021-10-01 北京深睿博联科技有限责任公司 Scene segmentation method, device, equipment and computer readable storage medium
CN115665359A (en) * 2022-10-09 2023-01-31 西华县环境监察大队 Intelligent compression method for environmental monitoring data
CN115905584A (en) * 2023-01-09 2023-04-04 共道网络科技有限公司 Video splitting method and device

Also Published As

Publication number Publication date
CN110097026B (en) 2021-04-27

Similar Documents

Publication Publication Date Title
CN110097026A (en) A kind of paragraph correlation rule evaluation method based on multidimensional element Video segmentation
CN110197135A (en) A kind of video structural method based on multidimensional segmentation
Zellers et al. Neural motifs: Scene graph parsing with global context
CN109389055B (en) Video classification method based on mixed convolution and attention mechanism
CN107358195B (en) Non-specific abnormal event detection and positioning method based on reconstruction error and computer
CN107704862A (en) A kind of video picture segmentation method based on semantic instance partitioning algorithm
CN103929685A (en) Video abstract generating and indexing method
CN112668559A (en) Multi-mode information fusion short video emotion judgment device and method
CN103235944A (en) Crowd flow division and crowd flow abnormal behavior identification method
CN109803112A (en) Video analysis management method based on big data, apparatus and system, storage medium
CN110705412A (en) Video target detection method based on motion history image
CN109948721A (en) A kind of video scene classification method based on video presentation
CN113792606B (en) Low-cost self-supervision pedestrian re-identification model construction method based on multi-target tracking
Avrithis et al. Broadcast news parsing using visual cues: A robust face detection approach
US20160307044A1 (en) Process for generating a video tag cloud representing objects appearing in a video content
Wang et al. Intermediate fused network with multiple timescales for anomaly detection
Ul Haq et al. An effective video summarization framework based on the object of interest using deep learning
Xu et al. Violent video classification based on spatial-temporal cues using deep learning
Hu et al. AVMSN: An audio-visual two stream crowd counting framework under low-quality conditions
Tao et al. CENet: A channel-enhanced spatiotemporal network with sufficient supervision information for recognizing industrial smoke emissions
CN112989950A (en) Violent video recognition system oriented to multi-mode feature semantic correlation features
CN111488813A (en) Video emotion marking method and device, electronic equipment and storage medium
Nandini et al. Automatic traffic control system using PCA based approach
Boufares et al. Moving object detection system based on the modified temporal difference and otsu algorithm
CN109977891A (en) A kind of object detection and recognition method neural network based

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant