CN103077236A - System and method for realizing video knowledge acquisition and marking function of portable-type device - Google Patents

System and method for realizing video knowledge acquisition and marking function of portable-type device

Info

Publication number
CN103077236A
Authority
CN
China
Prior art keywords
video
file
module
picture
mark
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2013100072912A
Other languages
Chinese (zh)
Other versions
CN103077236B (en)
Inventor
李逸
胡传平
梅林
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Third Research Institute of the Ministry of Public Security
Original Assignee
Third Research Institute of the Ministry of Public Security
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Third Research Institute of the Ministry of Public Security
Priority to CN201310007291.2A
Publication of CN103077236A
Application granted
Publication of CN103077236B
Legal status: Active
Anticipated expiration

Landscapes

  • Television Signal Processing For Recording (AREA)
  • Image Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to a system and a method for realizing a video knowledge acquisition and marking function on a portable device, and belongs to the technical field of computer applications. The system comprises a resource input module, a structured description module, a marking device module, a bidirectional cutting module and a central control module. In the method, the structured description module generates structured description data; the marking device module applies coarse-grained marks to the video and picture files according to the structured description data; the video and picture files are divided, according to the coarse-grained marks, into a plurality of video clips with different semantics or different regions within a picture; and fine-grained marking is then performed on those clips and regions. The system and method reduce the subjective influence of manual marking standards on video marking, greatly improve the precision of video marking, and make it easier to expand the video marking training set automatically. They are also simple to apply, inexpensive to implement, and widely applicable.

Description

System and method for realizing a video knowledge acquisition and marking function on a portable device
Technical field
The present invention relates to the field of computer application technology, in particular to the field of video image processing, and specifically to a system and method for realizing a video knowledge acquisition and marking function based on a portable device.
Background
With the rapid development of computer technology and the internet, large numbers of video devices can easily transfer digitized video information in bulk to portable devices over the internet. The development and spread of portable video playback devices (iPad, iPod, iPhone, digital video cameras) lets users conveniently search for, share and use all kinds of network resources, such as films, surveillance video, video conferences and video chat, anytime and anywhere.
The large-scale retrieval and sharing of video on portable devices has prompted research into how portable devices can be used to collect and mark video and image resources. Many experts now propose using the content information of a video directly for machine recognition and automatic marking. Such machine recognition and automatic marking processes the video frame and image information from the low level up to the high level, and marks the content obtained during analysis and understanding so that subsequent video retrieval can be carried out. The low-level marked content consists of feature vectors such as the color, texture and motion of the video. Although this content expresses the information in the video well, this way of marking differs too much from what subsequent retrieval requires of video marks. For example, ordinary network users retrieve videos or images through their text information or metadata descriptions, and do not pay attention to aspects such as color, texture, shape or spatial relationships. Image marking methods based mainly on low-level visual and image features are simple to compute and stable in performance, but these features currently have certain limitations.
A search of the prior art found Chinese patent document CN102508923, which discloses an "automatic video annotation method based on automatic classification and keywords". This technique first extracts the global and local features of manually marked videos, uses the global features to classify the videos by category, then establishes the correspondence between the multidimensional features of a video and the marking keywords, and finally processes unmarked videos automatically on the basis of the established training set.
Chinese patent document CN101650728 discloses a "video high-level feature retrieval system and realization thereof". This technique extracts the low-level features (color, shape, texture, etc.) of video key frame images, classifies the extracted features with a support vector machine (SVM), and then extracts the corresponding video semantics.
The above research proposes some good methods for video semantic extraction, but it also has the following defects:
1. A large set of manually marked videos is needed before an accurate and complete training sample set can be formed.
2. Manually marked video content is highly subjective: different people will not mark the features of the same video in exactly the same way, so a unified manual marking standard is hard to form.
3. Manually marked video content scales poorly. For a video that is not in the training sample set and has never been marked, the original library must still be searched, the keywords of the closest features extracted, and the link between features and keywords established before the video can be marked.
Summary of the invention
The object of the present invention is to overcome the above shortcomings of the prior art and to provide a system and method for realizing a video knowledge acquisition and marking function based on a portable device. The invention extracts a coarse-grained structured description of the video data; on the basis of that coarse-grained description it sets cut points between video frames and segmentation region blocks within frames; and it then performs fine-grained marking within the cut points and blocks that have been set. This reduces the subjective influence of manual marking standards on video marking, improves the precision of video marking, and facilitates automatic expansion of the video marking training set. At the same time, the system and method are simple to apply, inexpensive to implement, and widely applicable.
To achieve the above object, the system of the present invention for realizing a video knowledge acquisition and marking function based on a portable device is constituted as follows:
The system comprises a resource input module, a structured description module, a marking device module, a bidirectional cutting module and a central control module. The resource input module connects to devices external to the system and obtains the video file and the picture files of captured video frames from the connected external devices. The structured description module extracts the structured description content of the video file and the picture files of captured video frames and produces structured description data. The marking device module is the portable device, and marks the video file and the picture files of captured video frames according to the structured description data. The bidirectional cutting module sets a plurality of different semantic segmentation points or regions according to the marks on the video file and the picture files of captured video frames, and divides them, according to those semantic segmentation points or regions, into a plurality of video clips with different semantics or different regions within a picture. The central control module is connected to the resource input module, the structured description module, the marking device module and the bidirectional cutting module; it sends task instructions to each module and schedules the operation of each module.
In this system for realizing a video knowledge acquisition and marking function based on a portable device, the structured description module comprises a semantic relation unit, a spatio-temporal segmentation unit, a feature extraction unit and an object recognition unit connected in sequence, which produce hierarchical structured description information about videos and images that both people and computer systems can recognize.
In this system for realizing a video knowledge acquisition and marking function based on a portable device, the bidirectional cutting module comprises a cut-point setting unit and a cutting unit. The cut-point setting unit sets a plurality of different semantic segmentation points or regions in the video file or the picture files of captured video frames according to the marks; the cutting unit divides the video file and the picture files of captured video frames, according to the semantic segmentation points or regions, into a plurality of video clips with different semantics or different regions within a picture.
In this system for realizing a video knowledge acquisition and marking function based on a portable device, the marking device module comprises a coarse-grained marking processing unit and a fine-grained marking processing unit. The coarse-grained marking processing unit performs coarse-grained marking on the video file and the picture files of captured video frames according to the structured description data; the fine-grained marking processing unit performs, according to the structured description data, fine-grained marking on the video clips with different semantics or the different regions within a picture produced by the bidirectional cutting module.
In this system for realizing a video knowledge acquisition and marking function based on a portable device, the portable device includes tablet computers, smartphones and digital video cameras.
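As an illustration of how a central control module might send task instructions to the other modules and schedule them, the following is a minimal Python sketch. The patent does not specify any programming interface, so the class `CentralControl`, its `register` and `dispatch` methods, the two stand-in module functions and the file name are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class CentralControl:
    """Hypothetical central control module: registers modules and dispatches tasks."""
    modules: Dict[str, Callable[[dict], dict]] = field(default_factory=dict)
    log: List[str] = field(default_factory=list)

    def register(self, name: str, handler: Callable[[dict], dict]) -> None:
        self.modules[name] = handler

    def dispatch(self, name: str, payload: dict) -> dict:
        # Send a task instruction to one module and record the scheduling order.
        self.log.append(name)
        return self.modules[name](payload)


def resource_input(payload: dict) -> dict:
    # Stand-in for obtaining a video file and pictures of captured frames.
    return {**payload, "frames": ["frame0", "frame1"]}


def structured_description(payload: dict) -> dict:
    # Stand-in for producing structured description data for each frame.
    return {**payload, "description": {f: "unknown" for f in payload["frames"]}}


control = CentralControl()
control.register("resource_input", resource_input)
control.register("structured_description", structured_description)

state = control.dispatch("resource_input", {"video": "a.mp4"})
state = control.dispatch("structured_description", state)
```

Routing every call through one dispatcher keeps the scheduling order in a single place, which is one plausible reading of the central control module's role.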
The invention also discloses a method of using the above system to realize video knowledge acquisition and marking processing on a portable device. The method comprises the following steps:
(201) the resource input module obtains the video file and the picture files of captured video frames from a connected internet, local area network or video database;
(202) the structured description module extracts the structured description content of the video file and the picture files of captured video frames and produces structured description data;
(203) the structured description module extracts the video image features and semantic data of the video file and the picture files of captured video frames, and enters the image features and semantic data into the structured description data;
(204) the marking device module performs coarse-grained marking on the video file and the picture files of captured video frames according to the structured description data;
(205) the bidirectional cutting module sets a plurality of different semantic segmentation points or regions in the video file and the picture files of captured video frames according to the coarse-grained marks;
(206) the bidirectional cutting module divides the video file and the picture files of captured video frames, according to the semantic segmentation points or regions, into a plurality of video clips with different semantics or different regions within a picture;
(207) the marking device module performs fine-grained marking on the video clips with different semantics or the different regions within a picture.
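Steps (204) to (207) can be sketched for the temporal direction as follows. This is a simplified illustration under stated assumptions, not the patented implementation: it assumes the structured description already yields one coarse label per frame, it treats a change of label as a semantic segmentation point, and the function names and the per-clip summary used as a "fine-grained mark" are hypothetical.

```python
from typing import List, Tuple

Frame = Tuple[int, str]  # (frame index, coarse label); an assumed representation


def coarse_mark(labels: List[str]) -> List[Frame]:
    """Step 204 (sketch): attach a coarse-grained label to every frame."""
    return list(enumerate(labels))


def set_cut_points(marked: List[Frame]) -> List[int]:
    """Step 205 (sketch): a cut point is a frame whose label differs from the previous one."""
    return [i for (i, lab), (_, prev) in zip(marked[1:], marked[:-1]) if lab != prev]


def cut(marked: List[Frame], points: List[int]) -> List[List[Frame]]:
    """Step 206 (sketch): split the frame sequence into clips at the cut points."""
    bounds = [0, *points, len(marked)]
    return [marked[a:b] for a, b in zip(bounds[:-1], bounds[1:])]


def fine_mark(clip: List[Frame]) -> dict:
    """Step 207 (placeholder): summarize one clip; a real system would add detail here."""
    return {"label": clip[0][1], "start": clip[0][0], "end": clip[-1][0]}


marked = coarse_mark(["road", "road", "car", "car", "car", "person"])
points = set_cut_points(marked)
clips = [fine_mark(c) for c in cut(marked, points)]
```

Because fine-grained marking only ever sees one clip at a time, the marker's attention is confined to a span that already has a single coarse meaning.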
In this method for realizing video knowledge acquisition and marking processing on a portable device, the structured description module comprises a semantic relation unit, a spatio-temporal segmentation unit, a feature extraction unit and an object recognition unit connected in sequence, and step (202) specifically comprises the following steps:
(301) the semantic relation unit determines the meaning of the objects in the video file and the picture files of captured video frames;
(302) the spatio-temporal segmentation unit separates meaningful objects from the video file and the picture files of captured video frames according to the object meanings, the meaningful objects comprising semantic-level object shape information and object texture information;
(303) the feature extraction unit extracts features according to the color features and spatial features of the video file and the picture files of captured video frames;
(304) the object recognition unit recognizes and classifies unknown objects according to a set of known training objects;
(305) the hierarchical structured description data about videos and images is generated from the results of the above spatio-temporal segmentation, feature extraction and object recognition.
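One way to picture the hierarchical structured description data produced by steps (301) to (305) is as nested records. The sketch below is an assumption for illustration only: the field names, the use of a mean color as the feature, and the nearest-neighbor rule standing in for object recognition are not taken from the patent.

```python
from dataclasses import dataclass, field, asdict
from typing import Dict, List, Tuple

Color = Tuple[float, float, float]  # assumed RGB mean in [0, 1]


@dataclass
class ObjectDescription:
    meaning: str        # step 301: semantic meaning of the object
    shape: str          # step 302: semantic-level shape information
    texture: str        # step 302: semantic-level texture information
    mean_color: Color   # step 303: a simple color feature
    category: str = ""  # step 304: filled in by recognition


@dataclass
class FrameDescription:
    index: int
    objects: List[ObjectDescription] = field(default_factory=list)


def classify(color: Color, training: Dict[str, Color]) -> str:
    """Step 304 stand-in: pick the known class with the nearest mean color."""
    return min(training, key=lambda k: sum((a - b) ** 2 for a, b in zip(color, training[k])))


# Hypothetical known training set: one reference color per class.
training = {"car": (0.8, 0.1, 0.1), "tree": (0.1, 0.7, 0.2)}

obj = ObjectDescription("moving vehicle", "box", "smooth", (0.7, 0.2, 0.1))
obj.category = classify(obj.mean_color, training)

# Step 305 (sketch): assemble the hierarchical description as plain nested data.
description = asdict(FrameDescription(index=0, objects=[obj]))
```

Plain nested data of this kind can be read by a person and parsed by a program, which matches the stated goal of a description recognizable to both.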
In this method for realizing video knowledge acquisition and marking processing on a portable device, the bidirectional cutting module comprises a cut-point setting unit and a cutting unit, and the step in which the bidirectional cutting module sets a plurality of different semantic segmentation points or regions in the video file and the picture files of captured video frames according to the coarse-grained marks is specifically:
the cut-point setting unit sets a plurality of different semantic segmentation points or regions in the video file or the picture files of captured video frames according to the marks;
and the step in which the bidirectional cutting module divides the video file and the picture files of captured video frames, according to the semantic segmentation points or regions, into a plurality of video clips with different semantics or different regions within a picture is specifically:
the cutting unit divides the video file and the picture files of captured video frames, according to the semantic segmentation points or regions, into a plurality of video clips with different semantics or different regions within a picture.
In this method for realizing video knowledge acquisition and marking processing on a portable device, the marking device module comprises a coarse-grained marking processing unit and a fine-grained marking processing unit, and the step in which the marking device module performs coarse-grained marking on the video file and the picture files of captured video frames according to the structured description data is specifically:
the coarse-grained marking processing unit performs coarse-grained marking on the video file and the picture files of captured video frames according to the structured description data;
and the step in which the marking device module performs fine-grained marking on the video clips with different semantics or the different regions within a picture is specifically:
the fine-grained marking processing unit performs, according to the structured description data, fine-grained marking on the video clips with different semantics or the different regions within a picture produced by the bidirectional cutting module.
In this method for realizing video knowledge acquisition and marking processing on a portable device, step (203) specifically comprises the following steps:
(203-1) the structured description module divides the video file and the picture files of captured video frames into a number of video image fragments, key frames and key sub-regions;
(203-2) the structured description module performs feature extraction and semantic analysis on the video image fragments, key frames and key sub-regions to obtain video image features and semantic data;
(203-3) the structured description module enters the image features and semantic data into the structured description data.
In this method for realizing video knowledge acquisition and marking processing on a portable device, the step in which the bidirectional cutting module sets a plurality of different semantic segmentation points or regions in the video file and the picture files of captured video frames according to the coarse-grained marks is specifically: the bidirectional cutting module sets a plurality of different semantic segmentation points or regions in the key frames and key sub-regions according to the coarse-grained marks;
and the step in which the bidirectional cutting module divides the video file and the picture files of captured video frames, according to the semantic segmentation points or regions, into a plurality of video clips with different semantics or different regions within a picture is specifically: the bidirectional cutting module divides the key frames and key sub-regions, according to the semantic segmentation points or regions, into a plurality of video clips with different semantics or different regions within a picture.
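The spatial direction of the cutting, dividing a key frame into regions, can be sketched as simple cropping. This assumes that coarse-grained marking yields one rectangular box per labeled key sub-region; the patent does not state the region shape, so the `Box` layout, the function `cut_regions` and the toy image are hypothetical.

```python
from typing import Dict, List, Tuple

Image = List[List[int]]          # a key frame as rows of pixel values
Box = Tuple[int, int, int, int]  # (top, left, bottom, right), bottom/right exclusive


def cut_regions(key_frame: Image, regions: Dict[str, Box]) -> Dict[str, Image]:
    """Crop each coarsely marked key sub-region out of a key frame."""
    return {
        label: [row[left:right] for row in key_frame[top:bottom]]
        for label, (top, left, bottom, right) in regions.items()
    }


key_frame = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 2, 2],
]
# Hypothetical coarse-grained marks: one box per labeled sub-region.
regions = {"car": (0, 0, 2, 2), "person": (2, 2, 3, 4)}
crops = cut_regions(key_frame, regions)
```

Each crop can then be handed to fine-grained marking on its own, in the same way as a video clip in the temporal direction.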
With the system and method of the present invention for realizing a video knowledge acquisition and marking function based on a portable device, the system comprises a resource input module, a structured description module, a marking device module, a bidirectional cutting module and a central control module. In the method, the structured description module extracts the structured description content of the video file and the picture files of captured video frames and produces structured description data; the marking device module first performs coarse-grained marking on the video file and the picture files of captured video frames according to the structured description data; the video file and the picture files of captured video frames are then divided, according to the semantic segmentation points or regions set from the coarse-grained marks, into a plurality of video clips with different semantics or different regions within a picture; and fine-grained marking is then performed on those video clips or regions. The system and method of the present invention thus effectively reduce the subjective influence of manual marking standards on video marking, significantly improve the precision of video marking, and facilitate automatic expansion of the video marking training set. They are also simple to apply, inexpensive to implement, and widely applicable.
Description of drawings
Fig. 1 is a structural schematic of the system of the present invention for realizing a video knowledge acquisition and marking function based on a portable device.
Fig. 2 is a flow chart of the steps of the method of the present invention for realizing video knowledge acquisition and marking processing on a portable device.
Fig. 3 is a schematic flow diagram of the video structured description in the method of the present invention for realizing video knowledge acquisition and marking processing on a portable device.
Fig. 4 is a sequence diagram of an example in which knowledge acquisition and marking are performed on video A and image B using the method of the present invention.
Embodiments
For a clearer understanding of the technical content of the present invention, the following embodiments are described in detail.
Refer to Fig. 1, which is a structural schematic of the system of the present invention for realizing a video knowledge acquisition and marking function based on a portable device.
In one embodiment, the system for realizing a video knowledge acquisition and marking function based on a portable device comprises a resource input module, a structured description module, a marking device module, a bidirectional cutting module and a central control module.
The resource input module connects to devices external to the system and obtains the video file and the picture files of captured video frames from the connected external devices. The structured description module extracts the structured description content of the video file and the picture files of captured video frames and produces structured description data. The marking device module is a portable device such as a tablet computer, smartphone or digital video camera, and marks the video file and the picture files of captured video frames according to the structured description data. The bidirectional cutting module sets a plurality of different semantic segmentation points or regions according to the marks on the video file and the picture files of captured video frames, and divides them, according to those semantic segmentation points or regions, into a plurality of video clips with different semantics or different regions within a picture. The central control module is connected to the resource input module, the structured description module, the marking device module and the bidirectional cutting module; it sends task instructions to each module and schedules the operation of each module.
The method of using the system of this embodiment to realize video knowledge acquisition and marking processing on a portable device, as shown in Fig. 2, comprises the following steps:
(201) the resource input module obtains the video file and the picture files of captured video frames from a connected internet, local area network or video database;
(202) the structured description module extracts the structured description content of the video file and the picture files of captured video frames and produces structured description data;
(203) the structured description module extracts the video image features and semantic data of the video file and the picture files of captured video frames, and enters the image features and semantic data into the structured description data;
(204) the marking device module performs coarse-grained marking on the video file and the picture files of captured video frames according to the structured description data;
(205) the bidirectional cutting module sets a plurality of different semantic segmentation points or regions in the video file and the picture files of captured video frames according to the coarse-grained marks;
(206) the bidirectional cutting module divides the video file and the picture files of captured video frames, according to the semantic segmentation points or regions, into a plurality of video clips with different semantics or different regions within a picture;
(207) the marking device module performs fine-grained marking on the video clips with different semantics or the different regions within a picture.
In a more preferred embodiment, the structured description module comprises a semantic relation unit, a spatio-temporal segmentation unit, a feature extraction unit and an object recognition unit connected in sequence, which produce hierarchical structured description information about videos and images that both people and computer systems can recognize.
In the method of using the system of this more preferred embodiment to realize video knowledge acquisition and marking processing on a portable device, step (202) specifically comprises the following steps:
(301) the semantic relation unit determines the meaning of the objects in the video file and the picture files of captured video frames;
(302) the spatio-temporal segmentation unit separates meaningful objects from the video file and the picture files of captured video frames according to the object meanings, the meaningful objects comprising semantic-level object shape information and object texture information;
(303) the feature extraction unit extracts features according to the color features and spatial features of the video file and the picture files of captured video frames;
(304) the object recognition unit recognizes and classifies unknown objects according to a set of known training objects;
(305) the hierarchical structured description data about videos and images is generated from the results of the above spatio-temporal segmentation, feature extraction and object recognition.
In another more preferred embodiment, the bidirectional cutting module comprises a cut-point setting unit and a cutting unit. The cut-point setting unit sets a plurality of different semantic segmentation points or regions in the video file or the picture files of captured video frames according to the marks; the cutting unit divides the video file and the picture files of captured video frames, according to the semantic segmentation points or regions, into a plurality of video clips with different semantics or different regions within a picture.
In the method of using the system of this more preferred embodiment to realize video knowledge acquisition and marking processing on a portable device, step (205), in which the bidirectional cutting module sets a plurality of different semantic segmentation points or regions in the video file and the picture files of captured video frames according to the coarse-grained marks, is specifically: the cut-point setting unit sets a plurality of different semantic segmentation points or regions in the video file or the picture files of captured video frames according to the marks.
Step (206), in which the bidirectional cutting module divides the video file and the picture files of captured video frames, according to the semantic segmentation points or regions, into a plurality of video clips with different semantics or different regions within a picture, is specifically: the cutting unit divides the video file and the picture files of captured video frames, according to the semantic segmentation points or regions, into a plurality of video clips with different semantics or different regions within a picture.
In a further preferred embodiment, the marking device module comprises a coarse-grained marking processing unit and a fine-grained marking processing unit. The coarse-grained marking processing unit performs coarse-grained marking on the video file and the picture files of captured video frames according to the structured description data; the fine-grained marking processing unit performs, according to the structured description data, fine-grained marking on the video clips with different semantics or the different regions within a picture produced by the bidirectional cutting module.
In the method of using the system of this further preferred embodiment to realize video knowledge acquisition and marking processing on a portable device, step (204), in which the marking device module performs coarse-grained marking on the video file and the picture files of captured video frames according to the structured description data, is specifically: the coarse-grained marking processing unit performs coarse-grained marking on the video file and the picture files of captured video frames according to the structured description data.
Step (207), in which the marking device module performs fine-grained marking on the video clips with different semantics or the different regions within a picture, is specifically: the fine-grained marking processing unit performs, according to the structured description data, fine-grained marking on the video clips with different semantics or the different regions within a picture produced by the bidirectional cutting module.
In a further preferred embodiment, step (203) specifically comprises the following steps:
(203-1) the structured description module divides the video file and the picture files of captured video frames into a number of video image fragments, key frames and key sub-regions;
(203-2) the structured description module performs feature extraction and semantic analysis on the video image fragments, key frames and key sub-regions to obtain video image features and semantic data;
(203-3) the structured description module enters the image features and semantic data into the structured description data.
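The patent does not say how key frames are chosen in step (203-1). As one commonly used possibility, the sketch below selects a key frame whenever the mean absolute pixel difference from the previous key frame exceeds a threshold. The frame representation, the function names and the threshold value are assumptions made purely for illustration.

```python
from typing import List

Frame = List[int]  # a frame flattened to grayscale pixel values (assumed)


def frame_distance(a: Frame, b: Frame) -> float:
    """Mean absolute pixel difference between two equally sized frames."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def select_key_frames(frames: List[Frame], threshold: float) -> List[int]:
    """Keep frame 0, then every frame that differs enough from the last key frame."""
    keys = [0]
    for i in range(1, len(frames)):
        if frame_distance(frames[i], frames[keys[-1]]) > threshold:
            keys.append(i)
    return keys


frames = [
    [0, 0, 0, 0],
    [0, 0, 0, 1],
    [9, 9, 9, 9],
    [9, 9, 8, 9],
    [0, 0, 0, 0],
]
keys = select_key_frames(frames, threshold=2.0)
```

Comparing against the last key frame rather than the immediately preceding frame avoids missing slow drifts that never produce a large frame-to-frame jump.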
In a preferred embodiment, step (205), in which the bidirectional cutting module sets a plurality of different semantic segmentation points or regions in the video file and the picture files captured from video frames according to the coarse-grained annotation, is specifically: the bidirectional cutting module sets a plurality of different semantic segmentation points or regions in the key frames and key subregions according to the coarse-grained annotation.
Step (206), in which the bidirectional cutting module divides the video file and the picture files captured from video frames, according to the semantic segmentation points or regions, into a plurality of video segments having different semantics or different regions in a picture, is specifically: the bidirectional cutting module divides the key frames and key subregions, according to the semantic segmentation points or regions, into a plurality of video segments having different semantics or different regions in a picture.
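A minimal sketch of the two cutting directions follows, under the assumption that coarse-grained annotation is available as one label per frame (for cutting along time) and one label per grid cell of a key frame (for cutting within a picture). The function names and both input representations are hypothetical; the patent does not prescribe them.

```python
def temporal_split_points(frame_labels):
    """Semantic segmentation points: frame indices where the coarse label changes."""
    return [i for i in range(1, len(frame_labels))
            if frame_labels[i] != frame_labels[i - 1]]


def cut_segments(frame_labels):
    """Cut along time: return (start, end_exclusive, label) for each segment."""
    points = [0] + temporal_split_points(frame_labels) + [len(frame_labels)]
    return [(s, e, frame_labels[s]) for s, e in zip(points, points[1:]) if e > s]


def cut_regions(label_grid):
    """Cut within a picture: bounding box (x_min, y_min, x_max, y_max) per label."""
    boxes = {}
    for y, row in enumerate(label_grid):
        for x, label in enumerate(row):
            if label not in boxes:
                boxes[label] = [x, y, x, y]
            else:
                box = boxes[label]
                box[0] = min(box[0], x)
                box[1] = min(box[1], y)
                box[2] = max(box[2], x)
                box[3] = max(box[3], y)
    return {label: tuple(box) for label, box in boxes.items()}


segments = cut_segments(["road", "road", "car", "car", "car", "road"])
regions = cut_regions([["sky", "sky"],
                       ["road", "car"]])
```

Each resulting segment or region then serves as a unit for fine-grained annotation in step (207).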
In practical applications, the system of the present invention for video knowledge acquisition and annotation on a portable device comprises five modules: an annotation resource input module, a video structured description module, a video bidirectional cutting module, an annotation information management center module and an annotation device module.
The annotation resource input module supplies the annotation system with the various resources to be annotated. In terms of content, annotation resources mainly comprise video files and pictures captured from video frames. Video files come from content saved by video surveillance and camera equipment. Pictures are captured from the video files; the capture can be done manually, or the device can automatically capture key frames from the video and save them as pictures. In terms of source, annotation resources are divided into traffic annotation resources and social annotation resources. Traffic annotation source resources are the video and image resources collected by the various video surveillance networks and devices deployed by public security departments; social annotation source resources are the video and image resources collected by the various video surveillance networks and devices deployed by social organizations or individuals.
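The patent does not say how the device picks key frames automatically. One common approach, shown here purely as an assumption, is to keep a frame whenever it differs enough from the last key frame. Frames are modeled as 2D lists of grayscale values, and the function names and `threshold` value are hypothetical.

```python
def mean_abs_diff(frame_a, frame_b):
    """Mean absolute per-pixel difference between two equally sized grayscale frames."""
    total = 0
    count = 0
    for row_a, row_b in zip(frame_a, frame_b):
        for pixel_a, pixel_b in zip(row_a, row_b):
            total += abs(pixel_a - pixel_b)
            count += 1
    return total / count if count else 0.0


def select_key_frames(frames, threshold=20.0):
    """Return indices of frames that differ from the previous key frame by more than threshold."""
    if not frames:
        return []
    keys = [0]  # the first frame is always kept
    for index in range(1, len(frames)):
        if mean_abs_diff(frames[keys[-1]], frames[index]) > threshold:
            keys.append(index)
    return keys


frames = [
    [[10, 10], [10, 10]],
    [[12, 11], [10, 9]],      # nearly identical to frame 0
    [[200, 200], [200, 200]],  # scene change
    [[198, 201], [200, 200]],  # nearly identical to frame 2
]
key_indices = select_key_frames(frames)
```

The selected frames would then be saved as picture files and passed on as annotation resources alongside the video file.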
The video structured description module extracts the structured description content of a video. Video structured description is a largely machine-automated method of video information processing and analysis that uses processing means such as semantic relations, space-time segmentation, feature extraction and object recognition to organize video content into text information that both computers and people can understand. It relies on image processing, pattern recognition and semantic understanding techniques to extract features from video data at multiple levels and in multiple dimensions. The video is first sent to the description device, which analyzes and interprets the content much as a human brain would and produces multi-level structured output; the description data are then transmitted and stored, while the description device also outputs a copy of the original video for the system to store. In terms of data processing flow, video structured description converts surveillance video into information that people and machines can understand, turning video data into key information.
The video bidirectional cutting module downloads or receives the video to be processed from the Internet or a local area network, sets different semantic segmentation points or regions according to the annotation of the video, and divides the video into multiple video segments having different semantics or into different regions within the same video image.
The annotation information management center module queries the structured description result of the video, queries the structured annotation result of the video, and adjusts the parameter settings of the supporting training set.
The annotation device module is the portable device, for example a tablet computer, a smartphone or a digital video camera. The user annotates the annotation resources through the annotation device, and the behavior of the annotation device is subject to management and authorization by the annotation information management center.
Building on portable acquisition and annotation technology and a non-relational database, the present invention provides a portable system for collecting, annotating, uploading and querying video and images, which can convert video and image data into annotation knowledge on the spot.
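The patent names no specific non-relational database. As a rough illustration of the upload and query operations, the sketch below uses an in-memory dictionary as a stand-in for a document-oriented store; the class name, method names and document fields are all hypothetical.

```python
class AnnotationStore:
    """In-memory stand-in for a document-style (non-relational) annotation store."""

    def __init__(self):
        self._documents = {}

    def upload(self, resource_id, document):
        """Store or overwrite the annotation document for one video or image."""
        self._documents[resource_id] = document

    def query(self, label):
        """Return the ids of resources whose coarse or fine annotations contain label."""
        matches = []
        for resource_id, document in self._documents.items():
            fine_labels = [
                fine_label
                for segment in document.get("segments", [])
                for fine_label in segment.get("labels", [])
            ]
            if label in document.get("coarse", []) or label in fine_labels:
                matches.append(resource_id)
        return sorted(matches)


store = AnnotationStore()
store.upload("video_A", {
    "coarse": ["street"],
    "segments": [{"start": 0, "end": 120, "labels": ["vehicle", "red"]}],
})
store.upload("image_B", {"coarse": ["vehicle"]})
```

Documents with different shapes (a video with segments, an image without) sit side by side without a fixed schema, which is the usual reason for choosing a non-relational store for heterogeneous annotation records.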
The method using the above system comprises the following steps:
Collecting, from the Internet or a local area network, the video file to be annotated and the picture files from the video;
Taking the structured description result of the source video as the starting point, performing feature extraction for structured description on the video file and the picture files;
Forming the corresponding coarse-grained semantic annotation of the video according to the structured description results of the video file and the picture files;
Setting different semantic segmentation points according to the coarse-grained annotation of the video, and dividing the video into multiple video segments having different semantic levels or into geometric regions within a video frame;
Finally, taking each coarse-grained video segment as an annotation unit, performing fine-grained semantic annotation on the video content.
Specifically, the flow of the embodiment of the invention mainly comprises the following steps:
Step 201, the video filtering system accepts a structured video data transmission request sent from the Internet, a local area network or a video database;
Step 202, the video and image samples to be annotated are collected and the digital video signal is input to the video structured description module; if the input is an analog signal, analog-to-digital conversion is performed first;
Step 203, the structured description result of the video is extracted: the video signal passed in by the video input module undergoes intelligent analysis and is divided into a number of video image fragments, key frames and subregions; feature extraction and high-level semantic analysis are performed on the video image fragments, key frames and subregions to obtain the features and high-level semantic data of the video images, and the data are entered into the video structured description database;
Step 204, coarse-grained annotation of the video is performed according to the structured description result of the video;
Step 205, the semantic segmentation points of the video are set; in this step the key frames and key subregions of the video are segmented and annotated according to the coarse-grained annotation result above;
Step 206, fine-grained annotation of the video is performed according to the semantic segments and segmentation points of the video.
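The coarse-to-fine flow of steps 204 to 206 can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the structured description result is reduced to a list of detected object names per frame, the coarse label is taken to be the most frequent object in a frame, and all function names are hypothetical.

```python
from collections import Counter


def coarse_annotate(frame_objects):
    """Coarse-grained annotation: one label per frame (its most frequent object)."""
    labels = []
    for objects in frame_objects:
        labels.append(Counter(objects).most_common(1)[0][0] if objects else "empty")
    return labels


def cut_by_label(labels):
    """Place a segmentation point wherever the coarse label changes."""
    points = [0] + [i for i in range(1, len(labels)) if labels[i] != labels[i - 1]]
    points.append(len(labels))
    return [(start, end) for start, end in zip(points, points[1:]) if end > start]


def fine_annotate(frame_objects, start, end):
    """Fine-grained annotation: count every object seen inside one segment."""
    counts = Counter()
    for objects in frame_objects[start:end]:
        counts.update(objects)
    return dict(counts)


def annotate(frame_objects):
    """Run coarse annotation, cutting, then fine annotation per segment."""
    labels = coarse_annotate(frame_objects)
    return [
        {"start": start, "end": end, "coarse": labels[start],
         "fine": fine_annotate(frame_objects, start, end)}
        for start, end in cut_by_label(labels)
    ]


frames = [["car", "car", "person"], ["car"],
          ["person", "person"], ["person", "bag", "person"]]
result = annotate(frames)
```

Because the fine-grained pass only looks inside segments that the coarse pass has already separated, each segment is annotated in the context of a single coarse label.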
Fig. 3 is a flow chart of an embodiment of video structured description in the present invention, which mainly comprises the following steps:
Step 301, the system accepts a structured video data transmission request sent from the Internet, a local area network or a video database.
Step 302, space-time segmentation: the system separates meaningful objects from the video sequence, and each video object plane contains the shape and texture information of a semantic-level video object. According to the segmentation method, segmentation algorithms fall into two kinds: spatial segmentation algorithms and time-domain segmentation algorithms. Spatial segmentation uses the watershed algorithm to obtain the boundaries of different regions; time-domain segmentation uses temporal change detection to separate video objects, with the position and shape of moving objects obtained by frame differencing and background subtraction.
Step 303, feature extraction: features representing the video frame are extracted according to color features and spatial features.
Step 304, object recognition: following the method of statistical pattern recognition, recognition and classification algorithms are designed on the basis of a known set of training objects, so that unknown objects can be identified and classified.
Step 305, the hierarchical structured description result of the video is obtained from the space-time segmentation, feature extraction and object recognition above.
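Steps 302 to 304 can be illustrated end to end with a small sketch. It covers only time-domain segmentation by frame differencing (watershed-based spatial segmentation is not shown), uses mean intensity and bounding-box size as stand-ins for color and spatial features, and uses a nearest-centroid rule as one simple instance of statistical pattern recognition. All names, thresholds and training values are hypothetical.

```python
def motion_mask(previous, current, threshold=25):
    """Frame differencing: mark pixels whose grayscale value changed by more than threshold."""
    return [[abs(c - p) > threshold for p, c in zip(prev_row, cur_row)]
            for prev_row, cur_row in zip(previous, current)]


def bounding_box(mask):
    """Position and extent of the moving object as (x_min, y_min, x_max, y_max), or None."""
    xs = [x for row in mask for x, on in enumerate(row) if on]
    ys = [y for y, row in enumerate(mask) if any(row)]
    if not xs:
        return None
    return (min(xs), min(ys), max(xs), max(ys))


def region_features(frame, box):
    """Feature vector: (mean intensity, width, height) of the boxed region."""
    x0, y0, x1, y1 = box
    pixels = [frame[y][x] for y in range(y0, y1 + 1) for x in range(x0, x1 + 1)]
    return (sum(pixels) / len(pixels), float(x1 - x0 + 1), float(y1 - y0 + 1))


def train_centroids(samples):
    """Average the training feature vectors of each known object class."""
    centroids = {}
    for label, vectors in samples.items():
        size = len(vectors)
        centroids[label] = tuple(sum(v[i] for v in vectors) / size
                                 for i in range(len(vectors[0])))
    return centroids


def classify(features, centroids):
    """Assign the class whose centroid is nearest in squared Euclidean distance."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: squared_distance(features, centroids[label]))


previous = [[0] * 4 for _ in range(4)]
current = [[0, 0, 0, 0],
           [0, 200, 200, 0],
           [0, 200, 200, 0],
           [0, 0, 0, 0]]
box = bounding_box(motion_mask(previous, current))
features = region_features(current, box)
centroids = train_centroids({
    "vehicle": [(190.0, 2.0, 2.0), (210.0, 3.0, 2.0)],
    "pedestrian": [(90.0, 1.0, 3.0), (110.0, 1.0, 2.0)],
})
label = classify(features, centroids)
```

The box, feature vector and class label together correspond to one entry of the hierarchical description assembled in step 305.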
Fig. 4 is a sequence diagram of video A and image B being annotated by the knowledge acquisition and annotation system in the embodiment of the invention:
Step 401, after receiving the transmission request for video A and image B, the knowledge acquisition and annotation system based on the portable device establishes a transmission connection through the network transport mechanism and receives the data of video A and image B.
Step 402, the received video and image data are preprocessed; if the input is an analog signal, analog-to-digital conversion is performed;
Step 403, the video and image signal passed in by the video input module undergoes intelligent analysis and is divided into a number of video image fragments, key frames and subregions; feature extraction and high-level semantic analysis are performed on the video image fragments, key frames and subregions to obtain the features and high-level semantic data of the video and image, and the data are entered into the video structured description database;
Step 404, the coarse-grained description result of image B generated above is returned.
Step 405, the video is cut bidirectionally, either along the video time frames or according to the semantic description of different regions within a video frame.
Step 406, fine-grained video feature descriptions are extracted again according to the bidirectional cutting points above.
Step 407, the fine-grained description result of video A generated above is returned.
The system and method of this invention for realizing video knowledge acquisition and annotation functions based on a portable device have been described above. The system comprises a resource input module, a structured description module, an annotation device module, a bidirectional cutting module and a central control module. In the method, the structured description module extracts the structured description content of the video file and the picture files captured from video frames and produces structured description data; the annotation device module first performs coarse-grained annotation on the video file and the picture files captured from video frames according to the structured description data; according to the coarse-grained annotation, the video file and the picture files captured from video frames are then divided, at semantic segmentation points or regions, into a plurality of video segments having different semantics or different regions in a picture; fine-grained annotation is then performed on those video segments or regions. The system and method of the present invention thereby reduce the subjective influence of manual annotation standards on video annotation, improve the accuracy of video annotation, and facilitate automatic expansion of the video annotation training set. In addition, the system and method are simple to apply, inexpensive to implement and applicable in a wide range of settings.
In this specification, the invention has been described with reference to specific embodiments thereof. It is nevertheless evident that various modifications and changes can be made without departing from the spirit and scope of the invention. The specification and drawings are accordingly to be regarded as illustrative rather than restrictive.

Claims (11)

1. A system for realizing video knowledge acquisition and annotation functions based on a portable device, characterized in that the system comprises:
A resource input module, connected to devices external to the system, for obtaining a video file and picture files captured from video frames from the connected external devices;
A structured description module, for extracting the structured description content of the video file and the picture files captured from video frames and producing structured description data;
An annotation device module, which is the portable device, for annotating the video file and the picture files captured from video frames according to the structured description data;
A bidirectional cutting module, which sets a plurality of different semantic segmentation points or regions according to the annotation of the video file and the picture files captured from video frames, and divides the video file and the picture files captured from video frames, according to the semantic segmentation points or regions, into a plurality of video segments having different semantics or different regions in a picture;
A central control module, connected to the resource input module, the structured description module, the annotation device module and the bidirectional cutting module, for sending task instructions to each module and scheduling the operation of each module.
2. The system for realizing video knowledge acquisition and annotation functions based on a portable device according to claim 1, characterized in that the structured description module comprises a semantic relation unit, a space-time segmentation unit, a feature extraction unit and an object recognition unit connected in sequence, for producing hierarchical structured description information about video and images that can be recognized by people and computer systems.
3. The system for realizing video knowledge acquisition and annotation functions based on a portable device according to claim 1, characterized in that the bidirectional cutting module comprises a segmentation point setting unit and a cutting unit; the segmentation point setting unit sets a plurality of different semantic segmentation points or regions in the video file or the picture files captured from video frames according to the annotation; the cutting unit divides the video file and the picture files captured from video frames, according to the semantic segmentation points or regions, into a plurality of video segments having different semantics or different regions in a picture.
4. The system for realizing video knowledge acquisition and annotation functions based on a portable device according to claim 3, characterized in that the annotation device module comprises a coarse-grained annotation processing unit and a fine-grained annotation processing unit; the coarse-grained annotation processing unit performs coarse-grained annotation on the video file and the picture files captured from video frames according to the structured description data; the fine-grained annotation processing unit performs, according to the structured description data, fine-grained annotation on the video segments having different semantics or the different regions in a picture that were produced by the bidirectional cutting module.
5. The system for realizing video knowledge acquisition and annotation functions based on a portable device according to any one of claims 1 to 4, characterized in that the portable device comprises a tablet computer, a smartphone and a digital video camera.
6. A method for video knowledge acquisition and annotation on a portable device using the system according to claim 1, characterized in that the method comprises the following steps:
(201) the resource input module obtains a video file and picture files captured from video frames from the connected Internet, local area network or video database;
(202) the structured description module extracts the structured description content of the video file and the picture files captured from video frames and produces structured description data;
(203) the structured description module extracts the video image features and semantic data of the video file and the picture files captured from video frames, and writes the image features and semantic data into the structured description data;
(204) the annotation device module performs coarse-grained annotation on the video file and the picture files captured from video frames according to the structured description data;
(205) the bidirectional cutting module sets a plurality of different semantic segmentation points or regions in the video file and the picture files captured from video frames according to the coarse-grained annotation;
(206) the bidirectional cutting module divides the video file and the picture files captured from video frames, according to the semantic segmentation points or regions, into a plurality of video segments having different semantics or different regions in a picture;
(207) the annotation device module performs fine-grained annotation on the video segments having different semantics or the different regions in a picture.
7. The method for video knowledge acquisition and annotation on a portable device according to claim 6, characterized in that the structured description module comprises a semantic relation unit, a space-time segmentation unit, a feature extraction unit and an object recognition unit connected in sequence, and step (202) specifically comprises the following steps:
(301) the semantic relation unit determines the meaning of the objects in the video file and the picture files captured from video frames;
(302) the space-time segmentation unit separates meaningful objects from the video file and the picture files captured from video frames according to the object meanings, the meaningful objects comprising semantic-level object shape information and object texture information;
(303) the feature extraction unit extracts features according to the color features and spatial features of the video file and the picture files captured from video frames;
(304) the object recognition unit identifies and classifies unknown objects according to a known set of training objects;
(305) the hierarchical structured description data about the video and images are generated from the results of the space-time segmentation, feature extraction and object recognition above.
8. The method for video knowledge acquisition and annotation on a portable device according to claim 6, characterized in that the bidirectional cutting module comprises a segmentation point setting unit and a cutting unit, and the step in which the bidirectional cutting module sets a plurality of different semantic segmentation points or regions in the video file and the picture files captured from video frames according to the coarse-grained annotation is specifically:
The segmentation point setting unit sets a plurality of different semantic segmentation points or regions in the video file or the picture files captured from video frames according to the annotation;
The step in which the bidirectional cutting module divides the video file and the picture files captured from video frames, according to the semantic segmentation points or regions, into a plurality of video segments having different semantics or different regions in a picture is specifically:
The cutting unit divides the video file and the picture files captured from video frames, according to the semantic segmentation points or regions, into a plurality of video segments having different semantics or different regions in a picture.
9. The method for video knowledge acquisition and annotation on a portable device according to claim 6, characterized in that the annotation device module comprises a coarse-grained annotation processing unit and a fine-grained annotation processing unit, and the step in which the annotation device module performs coarse-grained annotation on the video file and the picture files captured from video frames according to the structured description data is specifically:
The coarse-grained annotation processing unit performs coarse-grained annotation on the video file and the picture files captured from video frames according to the structured description data;
The step in which the annotation device module performs fine-grained annotation on the video segments having different semantics or the different regions in a picture is specifically:
The fine-grained annotation processing unit performs, according to the structured description data, fine-grained annotation on the video segments having different semantics or the different regions in a picture that were produced by the bidirectional cutting module.
10. The method for video knowledge acquisition and annotation on a portable device according to any one of claims 6 to 9, characterized in that step (203) specifically comprises the following steps:
(203-1) the structured description module divides the video file and the picture files captured from video frames into a number of video image fragments, key frames and key subregions;
(203-2) the structured description module performs feature extraction and semantic analysis on the video image fragments, key frames and key subregions to obtain video image features and semantic data;
(203-3) the structured description module writes the image features and semantic data into the structured description data.
11. The method for video knowledge acquisition and annotation on a portable device according to claim 10, characterized in that the step in which the bidirectional cutting module sets a plurality of different semantic segmentation points or regions in the video file and the picture files captured from video frames according to the coarse-grained annotation is specifically:
The bidirectional cutting module sets a plurality of different semantic segmentation points or regions in the key frames and key subregions according to the coarse-grained annotation;
The step in which the bidirectional cutting module divides the video file and the picture files captured from video frames, according to the semantic segmentation points or regions, into a plurality of video segments having different semantics or different regions in a picture is specifically:
The bidirectional cutting module divides the key frames and key subregions, according to the semantic segmentation points or regions, into a plurality of video segments having different semantics or different regions in a picture.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310007291.2A CN103077236B (en) 2013-01-09 2013-01-09 System and method for realizing video knowledge acquisition and marking function of portable-type device

Publications (2)

Publication Number Publication Date
CN103077236A true CN103077236A (en) 2013-05-01
CN103077236B CN103077236B (en) 2015-11-18

Family

ID=48153766

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310007291.2A Active CN103077236B (en) 2013-01-09 2013-01-09 System and method for realizing video knowledge acquisition and marking function of portable-type device

Country Status (1)

Country Link
CN (1) CN103077236B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101211460A (en) * 2006-12-30 2008-07-02 中国科学院计算技术研究所 Method and device for automatically dividing and classifying sports vision frequency shot
CN101650958A (en) * 2009-07-23 2010-02-17 中国科学院声学研究所 Extraction method and index establishment method of movie video scene clip
WO2012099773A1 (en) * 2011-01-17 2012-07-26 Qualcomm Incorporated Pressure-based video recording

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Zheng Quan (郑烇): "MPEG-7 video semantic retrieval system" (MPEG-7的视频语义检索系统), Computer Applications and Software (《计算机应用与软件》) *

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104185088B (en) * 2014-03-03 2017-05-31 无锡天脉聚源传媒科技有限公司 A kind of method for processing video frequency and device
CN104185088A (en) * 2014-03-03 2014-12-03 无锡天脉聚源传媒科技有限公司 Video processing method and device
CN105450978B (en) * 2014-06-24 2018-12-04 杭州海康威视数字技术股份有限公司 Method and apparatus for realizing structural description in video monitoring system
CN105450978A (en) * 2014-06-24 2016-03-30 杭州海康威视数字技术股份有限公司 Method and device for achieving structural description in video monitoring system
CN104866538A (en) * 2015-04-30 2015-08-26 北京海尔广科数字技术有限公司 Method, network and system of dynamic update semantic alarm database
CN104965814A (en) * 2015-06-30 2015-10-07 北京航空航天大学 Source data labeling extension method for technical publications of civil aircraft
CN104965814B (en) * 2015-06-30 2018-01-16 北京航空航天大学 A kind of source data mark extended method of civil aircraft technical publications
CN109348161A (en) * 2018-09-21 2019-02-15 联想(北京)有限公司 Show markup information method and electronic equipment
CN109348161B (en) * 2018-09-21 2021-05-18 联想(北京)有限公司 Method for displaying annotation information and electronic equipment
CN111160380A (en) * 2018-11-07 2020-05-15 华为技术有限公司 Method for generating video analysis model and video analysis system
CN110826101A (en) * 2019-11-05 2020-02-21 安徽数据堂科技有限公司 Privatization deployment data processing method for enterprise
CN110826101B (en) * 2019-11-05 2021-01-05 安徽数据堂科技有限公司 Privatization deployment data processing method for enterprise
CN113076942A (en) * 2020-01-03 2021-07-06 上海依图网络科技有限公司 Method, device, chip and computer readable storage medium for detecting preset mark
CN113408261A (en) * 2021-08-10 2021-09-17 广东新瑞智安科技有限公司 Method and system for generating job requisition

Also Published As

Publication number Publication date
CN103077236B (en) 2015-11-18

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant