CN103077236B - System and method for video knowledge acquisition and annotation on a portable device - Google Patents

Info

Publication number
CN103077236B
CN103077236B CN201310007291.2A CN201310007291A CN103077236B CN 103077236 B CN103077236 B CN 103077236B CN 201310007291 A CN201310007291 A CN 201310007291A CN 103077236 B CN103077236 B CN 103077236B
Authority
CN
China
Prior art keywords
video
file
module
picture
semantic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201310007291.2A
Other languages
Chinese (zh)
Other versions
CN103077236A (en
Inventor
李逸
胡传平
梅林
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Third Research Institute of the Ministry of Public Security
Original Assignee
Third Research Institute of the Ministry of Public Security
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Third Research Institute of the Ministry of Public Security filed Critical Third Research Institute of the Ministry of Public Security
Priority to CN201310007291.2A priority Critical patent/CN103077236B/en
Publication of CN103077236A publication Critical patent/CN103077236A/en
Application granted granted Critical
Publication of CN103077236B publication Critical patent/CN103077236B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Television Signal Processing For Recording (AREA)
  • Image Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention relates to a system and method for video knowledge acquisition and annotation based on a portable device, and belongs to the field of computer application technology. The system comprises a resource input module, a structural description module, a tagging device module, a bidirectional cutting module and a central control module. In the method, the structural description module produces structural description data; the tagging device module first performs coarse-grained annotation of the video and picture files according to the structural description data; according to the coarse-grained annotation, the video and picture files are then divided into multiple video segments with different semantics or multiple different regions within a picture; fine-grained annotation is then performed on these segments and regions. This effectively reduces the subjective influence of human annotation standards on video annotation, significantly improves annotation accuracy, and facilitates automatic expansion of the video annotation training set. The system and method are also simple to apply, inexpensive to implement and widely applicable.

Description

System and method for video knowledge acquisition and annotation on a portable device
Technical field
The present invention relates to the field of computer application technology, in particular to the field of video image processing, and specifically to a system and method for video knowledge acquisition and annotation based on a portable device.
Background art
With the rapid development of computer technology and the Internet, large numbers of video devices can conveniently transmit digitized video information in bulk to portable devices over the Internet. The development and spread of portable video playback devices (iPad, iPod, iPhone, digital video cameras) lets users conveniently search, share and use all kinds of network resources anytime and anywhere, such as video films, surveillance video, video conferences and video chat.
The large-scale retrieval and sharing of video on portable devices has prompted research into how portable devices can be used to collect and annotate video and image resources. Many experts now propose using the content information of the video directly for machine recognition and automatic annotation. Such machine recognition and automatic annotation analyze and understand the video frame and image information from the low level up to the high level, and annotate the content during that process so that subsequent video retrieval can be carried out. The low-level annotated content consists of feature vectors of the video such as color, texture and motion. Although this content expresses the information in the video fairly well, this annotation approach differs greatly from what subsequent retrieval requires of video annotation. For example, ordinary network users retrieving video or images rely on text information or metadata descriptions of the video, and do not pay attention to its color, texture or shape, or to the spatial relationships built on them. Annotation methods based on the low-level visual and image features of an image are simple to compute and stable in performance, but these features currently have certain limitations.
A search of the prior art found Chinese patent document CN102508923, which discloses an "automatic video annotation method based on automatic classification and keywords". This technique first extracts the global and local features of manually annotated videos, uses the global features to classify the video category, then establishes the correspondence between the multidimensional features of a video and annotation keywords, and finally, on the basis of the established training set, automatically processes unannotated videos.
Chinese patent document CN101650728 discloses a "video high-level feature retrieval system and its implementation". This technique extracts low-level features (color, shape, texture, etc.) from video key-frame images, classifies the extracted features with a support vector machine (SVM), and then extracts the corresponding video semantics.
The above research proposes some good methods for video semantic extraction, but it also has the following defects:
1. A large set of manually annotated videos is needed before an accurate and complete training sample set can be formed.
2. Manually annotated video content is highly subjective: different people will not describe and annotate the features of the same video in exactly the same way, so it is difficult to form a unified manual annotation standard.
3. Manually annotated video content extends poorly. For a video that is not in the training sample set and has never been annotated, the original library still has to be searched, the keywords of the closest features extracted, and the relationship between features and keywords established before the video can be annotated.
Summary of the invention
The object of the present invention is to overcome the above shortcomings of the prior art and to provide a system and method for video knowledge acquisition and annotation based on a portable device. The video data is first given a coarse-grained structural description; on the basis of that description, cut points are set in the video frames and region blocks are set within frames; fine-grained annotation is then performed at the set cut points and within the blocks. This reduces the subjective influence of human annotation standards on video annotation, improves annotation accuracy, and facilitates automatic expansion of the video annotation training set. The system and method are also simple to apply, inexpensive to implement and widely applicable.
To achieve the above object, the system for video knowledge acquisition and annotation based on a portable device according to the present invention has the following structure:
The system comprises a resource input module, a structural description module, a tagging device module, a bidirectional cutting module and a central control module. The resource input module connects to devices external to the system and obtains video files, and picture files captured from video frames, from the connected external devices. The structural description module extracts the structural description content of the video files and captured picture files and produces structural description data. The tagging device module is the portable device itself and annotates the video files and captured picture files according to the structural description data. The bidirectional cutting module sets multiple different semantic cut points or regions according to the annotation of the video files and captured picture files, and divides those files, according to the semantic cut points or regions, into multiple video segments with different semantics or multiple different regions within a picture. The central control module connects to the resource input module, the structural description module, the tagging device module and the bidirectional cutting module; it sends task instructions to each module and schedules their operation.
In this system, the structural description module comprises a semantic relation unit, a spatio-temporal segmentation unit, a feature extraction unit and an object recognition unit connected in sequence, which together produce hierarchical structural description information about video and images that both people and computer systems can recognize.
In this system, the bidirectional cutting module comprises a cut-point setting unit and a cutting unit. The cut-point setting unit sets multiple different semantic cut points or regions in the video files or captured picture files according to the annotation. The cutting unit divides the video files and captured picture files, according to the semantic cut points or regions, into multiple video segments with different semantics or multiple different regions within a picture.
In this system, the tagging device module comprises a coarse-grained annotation processing unit and a fine-grained annotation processing unit. The coarse-grained annotation processing unit performs coarse-grained annotation of the video files and captured picture files according to the structural description data. The fine-grained annotation processing unit performs fine-grained annotation, according to the structural description data, of the video segments with different semantics or the different regions within a picture produced by the bidirectional cutting module.
In this system, the portable device may be a tablet computer, a smartphone or a digital video camera.
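The arrangement in which a central control module sends task instructions to the other modules and schedules them can be pictured with a minimal sketch. The class name `CentralControl`, the module names used as dictionary keys, and the dictionary-based task format are illustrative assumptions; the patent does not specify any programming interface.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class CentralControl:
    """Hypothetical stand-in for the central control module."""
    modules: Dict[str, Callable[[dict], dict]] = field(default_factory=dict)
    log: List[str] = field(default_factory=list)

    def register(self, name: str, handler: Callable[[dict], dict]) -> None:
        # Connect a module so that it can receive task instructions.
        self.modules[name] = handler

    def dispatch(self, name: str, task: dict) -> dict:
        # Send a task instruction to one module and record the scheduling order.
        self.log.append(name)
        return self.modules[name](task)


control = CentralControl()
# Placeholder handlers: each module just adds its (made-up) output to the task.
control.register("resource_input", lambda t: {**t, "files": ["a.mp4"]})
control.register("structural_description", lambda t: {**t, "description": {}})

result = control.dispatch(
    "structural_description", control.dispatch("resource_input", {})
)
```

Keeping the modules behind a single dispatcher mirrors the text's point that only the central control module knows the order in which the other modules run.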
The invention also discloses a method for video knowledge acquisition and annotation on a portable device using the described system, comprising the following steps:
(201) the resource input module obtains video files, and picture files captured from video frames, from a connected Internet, local area network or video database;
(202) the structural description module extracts the structural description content of the video files and captured picture files and produces structural description data;
(203) the structural description module extracts video image features and semantic data from the video files and captured picture files, and writes the image features and semantic data into the structural description data;
(204) the tagging device module performs coarse-grained annotation of the video files and captured picture files according to the structural description data;
(205) the bidirectional cutting module sets multiple different semantic cut points or regions in the video files and captured picture files according to the coarse-grained annotation;
(206) the bidirectional cutting module divides the video files and captured picture files, according to the semantic cut points or regions, into multiple video segments with different semantics or multiple different regions within a picture;
(207) the tagging device module performs fine-grained annotation of the video segments with different semantics or the different regions within a picture.
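Steps (204) to (207) can be illustrated with a toy sketch. It assumes, purely for illustration, that the structural description data reduces to one semantic label per frame, that a cut point is placed wherever that label changes, and that the fine-grained annotation is a label plus a frame count; none of these choices comes from the patent, and all function names are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical structural description data: one coarse semantic label per frame.
description: List[str] = ["road", "road", "car", "car", "car", "road"]


def coarse_annotate(desc: List[str]) -> List[str]:
    # Step 204: the coarse-grained annotation simply reuses the per-frame labels.
    return list(desc)


def set_cut_points(coarse: List[str]) -> List[int]:
    # Step 205: place a semantic cut point wherever the coarse label changes.
    return [i for i in range(1, len(coarse)) if coarse[i] != coarse[i - 1]]


def cut(coarse: List[str], points: List[int]) -> List[Tuple[int, int]]:
    # Step 206: split the frame range into half-open [start, end) segments.
    if not coarse:
        return []
    bounds = [0, *points, len(coarse)]
    return list(zip(bounds[:-1], bounds[1:]))


def fine_annotate(coarse: List[str], segments: List[Tuple[int, int]]) -> List[Dict]:
    # Step 207: annotate each segment on its own (here: label and frame count).
    return [
        {"span": seg, "label": coarse[seg[0]], "frames": seg[1] - seg[0]}
        for seg in segments
    ]


coarse = coarse_annotate(description)
points = set_cut_points(coarse)
segments = cut(coarse, points)
fine = fine_annotate(coarse, segments)
```

The point of the two passes is that the fine-grained step only ever sees one semantically uniform segment at a time, which is what the text credits with reducing annotator subjectivity.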
In this method, the structural description module comprises a semantic relation unit, a spatio-temporal segmentation unit, a feature extraction unit and an object recognition unit connected in sequence, and step (202) specifically comprises the following steps:
(301) the semantic relation unit determines the meaning of the objects in the video files and captured picture files;
(302) the spatio-temporal segmentation unit separates meaningful objects from the video files and captured picture files according to the object meanings, a meaningful object comprising semantic-level object shape information and object texture;
(303) the feature extraction unit extracts features from the color characteristics and spatial characteristics of the video files and captured picture files;
(304) the object recognition unit recognizes and classifies unknown objects according to a set of known training objects;
(305) the hierarchical structural description data about the video and images is generated from the results of the above spatio-temporal segmentation, feature extraction and object recognition.
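One way to picture the hierarchical structural description data produced by steps (301) to (305) is as nested records, video then frame then object. The sketch below is an assumption for illustration only: the field names, the single "hue" feature and the nearest-neighbor rule standing in for object recognition are not taken from the patent.

```python
from dataclasses import asdict, dataclass, field
from typing import Dict, List


@dataclass
class ObjectDescription:
    meaning: str                 # step 301: semantic meaning of the object
    shape: str                   # step 302: shape information from segmentation
    features: Dict[str, float]   # step 303: color/spatial features
    category: str = "unknown"    # step 304: class assigned by recognition


@dataclass
class FrameDescription:
    index: int
    objects: List[ObjectDescription] = field(default_factory=list)


@dataclass
class VideoDescription:
    source: str
    frames: List[FrameDescription] = field(default_factory=list)


def classify(obj: ObjectDescription, known: Dict[str, float]) -> str:
    # Step 304 stand-in: nearest known class on a single hypothetical "hue" feature.
    return min(known, key=lambda name: abs(known[name] - obj.features["hue"]))


known_objects = {"car": 0.10, "tree": 0.35}   # made-up training set
obj = ObjectDescription(meaning="plant", shape="blob", features={"hue": 0.30})
obj.category = classify(obj, known_objects)

# Step 305: assemble the hierarchical description and flatten it to plain dicts.
video = VideoDescription(source="video_A", frames=[FrameDescription(0, [obj])])
hierarchy = asdict(video)
```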
In this method, the bidirectional cutting module comprises a cut-point setting unit and a cutting unit. The step in which the bidirectional cutting module sets multiple different semantic cut points or regions in the video files and captured picture files according to the coarse-grained annotation is specifically:
the cut-point setting unit sets multiple different semantic cut points or regions in the video files or captured picture files according to the annotation.
The step in which the bidirectional cutting module divides the video files and captured picture files, according to the semantic cut points or regions, into multiple video segments with different semantics or multiple different regions within a picture is specifically:
the cutting unit divides the video files and captured picture files, according to the semantic cut points or regions, into multiple video segments with different semantics or multiple different regions within a picture.
In this method, the tagging device module comprises a coarse-grained annotation processing unit and a fine-grained annotation processing unit. The step in which the tagging device module performs coarse-grained annotation of the video files and captured picture files according to the structural description data is specifically:
the coarse-grained annotation processing unit performs coarse-grained annotation of the video files and captured picture files according to the structural description data.
The step in which the tagging device module performs fine-grained annotation of the video segments with different semantics or the different regions within a picture is specifically:
the fine-grained annotation processing unit performs fine-grained annotation, according to the structural description data, of the video segments with different semantics or the different regions within a picture produced by the bidirectional cutting module.
In this method, step (203) specifically comprises the following steps:
(203-1) the structural description module divides the video files and captured picture files into a number of video image fragments, key frames and key sub-regions;
(203-2) the structural description module performs feature extraction and semantic analysis on the video image fragments, key frames and key sub-regions to obtain video image features and semantic data;
(203-3) the structural description module writes the image features and semantic data into the structural description data.
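Steps (203-1) to (203-3) can be sketched on toy data. The patent does not say how key frames are chosen or which features are extracted, so the sketch assumes a simple rule (a frame is a key frame when its mean gray level jumps past a threshold) and a coarse gray-level histogram as the feature; the names `select_key_frames`, `histogram` and `describe` are hypothetical.

```python
from typing import Dict, List

Frame = List[int]  # hypothetical frame: flat list of 0-255 gray levels


def mean(frame: Frame) -> float:
    return sum(frame) / len(frame)


def select_key_frames(frames: List[Frame], threshold: float = 50.0) -> List[int]:
    # Step 203-1 stand-in: a frame is "key" when its mean gray level differs
    # by more than `threshold` from the previous key frame.
    if not frames:
        return []
    keys = [0]
    for i in range(1, len(frames)):
        if abs(mean(frames[i]) - mean(frames[keys[-1]])) > threshold:
            keys.append(i)
    return keys


def histogram(frame: Frame, bins: int = 4) -> List[int]:
    # Step 203-2 stand-in: coarse gray-level histogram as the extracted feature.
    counts = [0] * bins
    for value in frame:
        counts[min(value * bins // 256, bins - 1)] += 1
    return counts


def describe(frames: List[Frame], description: Dict[str, dict]) -> Dict[str, dict]:
    # Step 203-3: write the extracted features into the structural description data.
    for i in select_key_frames(frames):
        description[f"frame_{i}"] = {"histogram": histogram(frames[i])}
    return description


frames = [
    [10, 20, 30, 40],
    [12, 22, 28, 38],
    [200, 210, 220, 230],
    [205, 215, 225, 235],
]
data = describe(frames, {})
```

Here frames 0 and 2 are selected as key frames, so only two entries are written into the description; a real system would substitute its own shot-boundary and feature-extraction methods.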
In this method, the step in which the bidirectional cutting module sets multiple different semantic cut points or regions in the video files and captured picture files according to the coarse-grained annotation is specifically: the bidirectional cutting module sets multiple different semantic cut points or regions in the key frames and key sub-regions according to the coarse-grained annotation.
The step in which the bidirectional cutting module divides the video files and captured picture files, according to the semantic cut points or regions, into multiple video segments with different semantics or multiple different regions within a picture is specifically: the bidirectional cutting module divides the key frames and key sub-regions, according to the semantic cut points or regions, into multiple video segments with different semantics or multiple different regions within a picture.
With the system and method for video knowledge acquisition and annotation based on a portable device according to this invention, the system comprises a resource input module, a structural description module, a tagging device module, a bidirectional cutting module and a central control module. In the method, the structural description module extracts the structural description content of the video files and captured picture files and produces structural description data; the tagging device module first performs coarse-grained annotation of the video files and captured picture files according to the structural description data; according to the coarse-grained annotation, the video files and captured picture files are then divided, at semantic cut points or regions, into multiple video segments with different semantics or multiple different regions within a picture; fine-grained annotation is then performed on those segments and regions. The system and method of the present invention thus effectively reduce the subjective influence of human annotation standards on video annotation, significantly improve annotation accuracy, and facilitate automatic expansion of the video annotation training set. They are also simple to apply, inexpensive to implement and widely applicable.
Brief description of the drawings
Fig. 1 is a structural schematic diagram of the system for video knowledge acquisition and annotation based on a portable device according to the present invention.
Fig. 2 is a flow chart of the steps of the method for video knowledge acquisition and annotation on a portable device according to the present invention.
Fig. 3 is a schematic flow diagram of video structural description in the method for video knowledge acquisition and annotation on a portable device according to the present invention.
Fig. 4 is a sequence diagram of an example in which the method of the present invention is used to perform knowledge acquisition and annotation on video A and image B.
Detailed description of the embodiments
To make the technical content of the present invention clearer, the following embodiments are described in detail.
Fig. 1 is a structural schematic diagram of the system for video knowledge acquisition and annotation based on a portable device according to the present invention.
In one embodiment, the system for video knowledge acquisition and annotation based on a portable device comprises a resource input module, a structural description module, a tagging device module, a bidirectional cutting module and a central control module.
The resource input module connects to devices external to the system and obtains video files, and picture files captured from video frames, from the connected external devices. The structural description module extracts the structural description content of the video files and captured picture files and produces structural description data. The tagging device module is a portable device such as a tablet computer, a smartphone or a digital video camera, and annotates the video files and captured picture files according to the structural description data. The bidirectional cutting module sets multiple different semantic cut points or regions according to the annotation of the video files and captured picture files, and divides those files, according to the semantic cut points or regions, into multiple video segments with different semantics or multiple different regions within a picture. The central control module connects to the resource input module, the structural description module, the tagging device module and the bidirectional cutting module; it sends task instructions to each module and schedules their operation.
The method for video knowledge acquisition and annotation on a portable device using the system of this embodiment, as shown in Fig. 2, comprises the following steps:
(201) the resource input module obtains video files, and picture files captured from video frames, from a connected Internet, local area network or video database;
(202) the structural description module extracts the structural description content of the video files and captured picture files and produces structural description data;
(203) the structural description module extracts video image features and semantic data from the video files and captured picture files, and writes the image features and semantic data into the structural description data;
(204) the tagging device module performs coarse-grained annotation of the video files and captured picture files according to the structural description data;
(205) the bidirectional cutting module sets multiple different semantic cut points or regions in the video files and captured picture files according to the coarse-grained annotation;
(206) the bidirectional cutting module divides the video files and captured picture files, according to the semantic cut points or regions, into multiple video segments with different semantics or multiple different regions within a picture;
(207) the tagging device module performs fine-grained annotation of the video segments with different semantics or the different regions within a picture.
In a preferred embodiment, the structural description module comprises a semantic relation unit, a spatio-temporal segmentation unit, a feature extraction unit and an object recognition unit connected in sequence, which together produce hierarchical structural description information about video and images that both people and computer systems can recognize.
In the method for video knowledge acquisition and annotation on a portable device using the system of this preferred embodiment, step (202) specifically comprises the following steps:
(301) the semantic relation unit determines the meaning of the objects in the video files and captured picture files;
(302) the spatio-temporal segmentation unit separates meaningful objects from the video files and captured picture files according to the object meanings, a meaningful object comprising semantic-level object shape information and object texture;
(303) the feature extraction unit extracts features from the color characteristics and spatial characteristics of the video files and captured picture files;
(304) the object recognition unit recognizes and classifies unknown objects according to a set of known training objects;
(305) the hierarchical structural description data about the video and images is generated from the results of the above spatio-temporal segmentation, feature extraction and object recognition.
In another preferred embodiment, the bidirectional cutting module comprises a cut-point setting unit and a cutting unit. The cut-point setting unit sets multiple different semantic cut points or regions in the video files or captured picture files according to the annotation. The cutting unit divides the video files and captured picture files, according to the semantic cut points or regions, into multiple video segments with different semantics or multiple different regions within a picture.
In the method for video knowledge acquisition and annotation on a portable device using the system of this preferred embodiment, step (205), in which the bidirectional cutting module sets multiple different semantic cut points or regions in the video files and captured picture files according to the coarse-grained annotation, is specifically: the cut-point setting unit sets multiple different semantic cut points or regions in the video files or captured picture files according to the annotation.
Step (206), in which the bidirectional cutting module divides the video files and captured picture files, according to the semantic cut points or regions, into multiple video segments with different semantics or multiple different regions within a picture, is specifically: the cutting unit divides the video files and captured picture files, according to the semantic cut points or regions, into multiple video segments with different semantics or multiple different regions within a picture.
In a further preferred embodiment, the tagging device module comprises a coarse-grained annotation processing unit and a fine-grained annotation processing unit. The coarse-grained annotation processing unit performs coarse-grained annotation of the video files and captured picture files according to the structural description data. The fine-grained annotation processing unit performs fine-grained annotation, according to the structural description data, of the video segments with different semantics or the different regions within a picture produced by the bidirectional cutting module.
In the method for video knowledge acquisition and annotation on a portable device using the system of this further preferred embodiment, step (204), in which the tagging device module performs coarse-grained annotation of the video files and captured picture files according to the structural description data, is specifically: the coarse-grained annotation processing unit performs coarse-grained annotation of the video files and captured picture files according to the structural description data.
Step (207), in which the tagging device module performs fine-grained annotation of the video segments with different semantics or the different regions within a picture, is specifically: the fine-grained annotation processing unit performs fine-grained annotation, according to the structural description data, of the video segments with different semantics or the different regions within a picture produced by the bidirectional cutting module.
In a still further preferred embodiment, step (203) specifically comprises the following steps:
(203-1) the structural description module divides the video files and captured picture files into a number of video image fragments, key frames and key sub-regions;
(203-2) the structural description module performs feature extraction and semantic analysis on the video image fragments, key frames and key sub-regions to obtain video image features and semantic data;
(203-3) the structural description module writes the image features and semantic data into the structural description data.
In a more preferred embodiment, step (205), in which the bidirectional cutting module sets multiple different semantic cut points or regions in the video files and captured picture files according to the coarse-grained annotation, is specifically: the bidirectional cutting module sets multiple different semantic cut points or regions in the key frames and key sub-regions according to the coarse-grained annotation.
Step (206), in which the bidirectional cutting module divides the video files and captured picture files, according to the semantic cut points or regions, into multiple video segments with different semantics or multiple different regions within a picture, is specifically: the bidirectional cutting module divides the key frames and key sub-regions, according to the semantic cut points or regions, into multiple video segments with different semantics or multiple different regions within a picture.
In actual applications, the system realizing video knowledge acquisition mark based on portable set of the present invention comprises five modules such as mark resource input module, video structural description module, video Double Directional Cutting module, markup information management center module and tagging equipment module etc.
The annotation resource input module provides the annotation system with the material to be annotated. In terms of content, annotation resources mainly comprise video files and pictures captured from video frames. Video files come from content saved by video surveillance and camera equipment. Pictures are captured from video files; the capture can be done manually, or a device can automatically extract key frames from the video and save them as pictures. In terms of source, annotation resources are divided into traffic annotation resources and social annotation resources. Traffic annotation resources are the video and image resources collected by the video surveillance networks and devices deployed by public security departments; social annotation resources are the video and image resources collected by the video surveillance networks and devices deployed by social organizations or individuals.
The video structured description module extracts the structured description content of a video. Video structured description is a machine-automated method of video information processing and analysis: using techniques such as semantic relation analysis, spatio-temporal segmentation, feature extraction, and object recognition, it organizes video content into text information that both computers and people can understand. It relies on image processing, pattern recognition, semantic understanding, and related technologies to extract features from video data at multiple levels and in multiple dimensions. The video is first sent to the description device, which, much like a human brain, analyzes and understands the content and produces a multi-level structured output; the description data are then transmitted and stored, and the description device also outputs the original video for system storage. In terms of data flow, video structured description technology transforms surveillance video into information that people and machines can understand, converting video data into key information.
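As a hedged illustration of feature extraction followed by statistical object recognition, the sketch below computes a coarse color histogram and classifies it with a nearest-centroid rule. The patent does not specify a particular feature or classifier; the histogram, the nearest-centroid rule, and all names and pixel values here are assumptions chosen for brevity.

```python
import math


def color_histogram(pixels, bins=2):
    """Normalized joint RGB histogram; each channel is quantized into `bins` levels."""
    hist = [0.0] * (bins ** 3)
    for r, g, b in pixels:
        idx = ((r * bins // 256) * bins + (g * bins // 256)) * bins + (b * bins // 256)
        hist[idx] += 1.0
    total = len(pixels) or 1
    return [count / total for count in hist]


def train_centroids(samples):
    """Average the feature vectors of each labeled class (the known training set)."""
    sums, counts = {}, {}
    for label, feature in samples:
        acc = sums.setdefault(label, [0.0] * len(feature))
        for i, value in enumerate(feature):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def classify(feature, centroids):
    """Assign the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: math.dist(feature, centroids[label]))


# Hypothetical training patches: lists of (R, G, B) pixels with known labels.
red_patch = [(250, 10, 10)] * 4
blue_patch = [(10, 10, 250)] * 4
centroids = train_centroids([
    ("red_object", color_histogram(red_patch)),
    ("blue_object", color_histogram(blue_patch)),
])

# An unknown, mostly red region is assigned to the nearest class.
unknown = [(230, 30, 20), (240, 5, 15), (200, 40, 40), (20, 20, 220)]
predicted = classify(color_histogram(unknown), centroids)
```

The labels produced this way would be one possible input to the text-form structured description; the patent leaves the concrete recognition algorithm open.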
The video bidirectional cutting module obtains the video to be processed by downloading or receiving it from the Internet or a local area network, sets different semantic segmentation points or regions according to the annotations of the video, and divides the video into multiple video segments with different semantics or into different regions within the same video image.
The annotation information management center module is used to query the structured description results of the video, to query the structured annotation results of the video, and to adjust the parameter settings of the training set.
The annotation device module is the portable device, for example a tablet computer, a smartphone, or a digital video camera. The user annotates the annotation resources through the annotation device, and the behavior of the annotation device is managed and authorized by the annotation information management center.
Based on annotation technology and a non-relational database, and supported by portable devices, the present invention constructs a portable system for collecting, annotating, uploading, and querying video and images, so that video and image material can be converted into annotated knowledge immediately.
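Since the description data are hierarchical and are kept in a non-relational database, one natural representation is a nested document. The following sketch is only an assumption about what such a record might look like: the patent does not define a schema, and every field name and value here (`video_id`, `coarse_label`, `fine_label`, `bbox`, and so on) is hypothetical.

```python
import json


def build_description(video_id, segments):
    """Nest the description as video -> segments -> key frames -> sub-regions."""
    return {
        "video_id": video_id,
        "segments": [
            {
                "start_s": seg["start_s"],
                "end_s": seg["end_s"],
                "coarse_label": seg["coarse_label"],
                "key_frames": seg.get("key_frames", []),
            }
            for seg in segments
        ],
    }


record = build_description("video_A", [
    {
        "start_s": 0.0,
        "end_s": 12.5,
        "coarse_label": "vehicle passing",
        "key_frames": [
            {
                "time_s": 3.2,
                "regions": [
                    {"bbox": [10, 20, 110, 80], "fine_label": "white sedan"},
                ],
            },
        ],
    },
])

# Serialize to a text document that both people and programs can read,
# then parse it back to confirm that nothing is lost in the round trip.
document = json.dumps(record, ensure_ascii=False, sort_keys=True)
restored = json.loads(document)
```

A document of this shape could be stored as-is in a document-oriented store and queried by coarse or fine label.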
The method using the above system comprises the following steps:
Collect the video files to be annotated, and the picture files taken from videos, from the Internet or a local area network;
Taking the source of the video, namely the structured description result, as the starting point, perform structured-description feature extraction on the video files and picture files;
Form the corresponding coarse-grained semantic annotation of the video according to the structured description results of the video files and picture files;
Then set different semantic segmentation points according to the coarse-grained annotation of the video, and divide the video into multiple video segments, or geometric regions within video frames, that have different semantic levels;
Finally, taking each coarse-grained video segment as an annotation unit, perform fine-grained semantic annotation of the video content.
Specifically, the flowchart of an embodiment of the present invention mainly comprises the following steps:
Step 201, the video filtering system accepts a video data transmission request sent from the Internet, a local area network, or a video database;
Step 202, collect the video and image samples to be annotated and input the digital video signal into the video structured description module; if the input is an analog signal, perform analog-to-digital conversion;
Step 203, extract the structured description result of the video: perform intelligent analysis on the video signal passed in by the video input module and divide it into a number of video image segments, key frames, and sub-regions; perform feature extraction and high-level semantic analysis on the video image segments, key frames, and sub-regions to obtain the features and high-level semantic data of the video images, and enter the data into the video structured description database;
Step 204, perform coarse-grained annotation of the video according to the structured description result of the video;
Step 205, set the semantic segmentation points of the video: this step segments and annotates the key frames and key sub-regions of the video according to the above coarse-grained annotation result;
Step 206, perform fine-grained annotation of the video according to its semantic segments and segmentation points.
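The order of steps 203 to 206 can be sketched as a small pipeline. This is a minimal illustration under stated assumptions, not the patented implementation: the three callables (`describe`, `coarse_label`, `fine_label`) stand in for the structured description and annotation stages, and the frame dictionaries with an `objects` field are hypothetical.

```python
def run_pipeline(frames, describe, coarse_label, fine_label):
    """Chain the stages: describe, coarse-annotate, cut, then fine-annotate each segment."""
    descriptions = [describe(f) for f in frames]            # step 203
    coarse = [coarse_label(d) for d in descriptions]        # step 204
    segments, start = [], 0                                 # step 205
    for i in range(1, len(coarse) + 1):
        # Close the current segment at the end of input or when the label changes.
        if i == len(coarse) or coarse[i] != coarse[start]:
            segments.append((start, i, coarse[start]))
            start = i
    return [                                                # step 206
        {"start": s, "end": e, "coarse": c, "fine": fine_label(c, descriptions[s:e])}
        for s, e, c in segments
    ]


def describe(frame):
    """Stand-in for structured description: the frame already lists its objects."""
    return frame


def coarse_label(description):
    """Stand-in coarse annotation: is anything present in the frame?"""
    return "activity" if description["objects"] else "empty"


def fine_label(coarse, descriptions):
    """Stand-in fine annotation: the distinct objects seen within the segment."""
    return sorted({obj for d in descriptions for obj in d["objects"]})


frames = [
    {"objects": []},
    {"objects": ["car"]},
    {"objects": ["car", "person"]},
    {"objects": []},
]
result = run_pipeline(frames, describe, coarse_label, fine_label)
```

The point of the sketch is the ordering: fine-grained annotation operates only on units that the coarse-grained annotation and the cutting step have already delimited.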
Fig. 3 is a flowchart of an embodiment of video structured description in the present invention, which mainly comprises the following steps:
Step 301, the system accepts a video data transmission request sent from the Internet, a local area network, or a video database.
Step 302, spatio-temporal segmentation means that the system separates meaningful objects from the video sequence; each video object plane contains the shape and texture information of a semantic-level video object. According to the segmentation method, segmentation algorithms fall into two kinds: spatial segmentation and temporal segmentation. Spatial segmentation uses the watershed algorithm to obtain the boundaries of different regions; temporal segmentation uses temporal change detection to separate video objects, with the position and shape of moving objects obtained by frame differencing and background subtraction.
Step 303, feature extraction means extracting the features that represent a video frame according to color features and spatial features.
Step 304, object recognition follows the method of statistical pattern recognition: a recognition and classification algorithm is designed on the basis of a known set of training objects and is then used to classify unknown objects.
Step 305, according to the above spatio-temporal segmentation, feature extraction, and object recognition, the hierarchical structured description result of the video is obtained.
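The temporal segmentation mentioned in step 302 can be sketched with frame differencing and background subtraction on tiny grayscale frames. This is a simplified illustration: the threshold value, the static background model, and the function names are assumptions, and the watershed-based spatial segmentation is not shown.

```python
def frame_difference(prev, curr, threshold=30):
    """Binary mask of pixels whose intensity changed by more than `threshold`."""
    return [[1 if abs(c - p) > threshold else 0 for p, c in zip(prow, crow)]
            for prow, crow in zip(prev, curr)]


def background_subtraction(background, frame, threshold=30):
    """Binary mask of pixels that differ from a static background model."""
    return frame_difference(background, frame, threshold)


def bounding_box(mask):
    """Smallest (top, left, bottom, right) box covering all foreground pixels, or None."""
    coords = [(y, x) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    if not coords:
        return None
    ys = [y for y, _ in coords]
    xs = [x for _, x in coords]
    return (min(ys), min(xs), max(ys), max(xs))


# Hypothetical 4x4 grayscale frames: a bright object moves one pixel to the right.
background = [[10] * 4 for _ in range(4)]
frame1 = [row[:] for row in background]
frame1[1][1] = 200
frame2 = [row[:] for row in background]
frame2[1][2] = 200

motion_mask = frame_difference(frame1, frame2)
foreground_mask = background_subtraction(background, frame2)
motion_box = bounding_box(motion_mask)
object_box = bounding_box(foreground_mask)
```

In this example the frame difference marks both the old and the new position of the object, while background subtraction marks only its current position, which is one reason the two are often combined to recover position and shape.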
Fig. 4 is a sequence diagram of annotating video A and image B through the knowledge acquisition and annotation system in an embodiment of the present invention:
Step 401, after receiving the transmission request for video A and image B, the knowledge acquisition and annotation system based on a portable device establishes a transmission connection through the network transport mechanism and receives the data of video A and image B.
Step 402, the received video/image data are preprocessed; if the input is an analog signal, analog-to-digital conversion is performed;
Step 403, perform intelligent analysis on the video/image signal passed in by the video input module and divide it into a number of video image segments, key frames, and sub-regions; perform feature extraction and high-level semantic analysis on the video image segments, key frames, and sub-regions to obtain the features and high-level semantic data of the video/image, and enter the data into the video structured description database;
Step 404, return the coarse-grained description result generated above for image B.
Step 405, the video can be cut bidirectionally according to the time frames of the video or the semantic descriptions of the different regions within a video frame.
Step 406, according to the above bidirectional cutting points, extract fine-grained video feature descriptions again.
Step 407, return the fine-grained description result generated above for video A.
In the system and method of this invention for realizing video knowledge acquisition and annotation based on a portable device, the system comprises a resource input module, a structured description module, an annotation device module, a bidirectional cutting module, and a central control module. In the method, the structured description module extracts the structured description content of the video file and the picture files captured from video frames and produces structured description data; the annotation device module first performs coarse-grained annotation of the video file and the picture files according to the structured description data; the video file and the picture files are then divided, according to semantic segmentation points or regions set on the basis of the coarse-grained annotation, into multiple video segments with different semantics or different regions within a picture; fine-grained annotation is then performed on those video segments or regions. The system and method of the present invention can therefore effectively reduce the influence of subjective manual annotation standards on video annotation, significantly improve the accuracy of video annotation, and facilitate the automatic expansion of video annotation training sets. The system and method are also simple to apply, inexpensive to implement, and applicable in a fairly wide range of settings.
In this specification, the present invention has been described with reference to specific embodiments. It is evident, however, that various modifications and changes may be made without departing from the spirit and scope of the invention. The specification and drawings are therefore to be regarded as illustrative rather than restrictive.

Claims (11)

1. A system for realizing video knowledge acquisition and annotation functions based on a portable device, characterized in that the system comprises:
a resource input module, connected to devices external to the system, for obtaining a video file and picture files captured from video frames from the connected external devices;
a structured description module, for extracting the structured description content of the video file and the picture files captured from video frames and producing structured description data;
an annotation device module, which is the portable device, for annotating the video file and the picture files captured from video frames according to the structured description data;
a bidirectional cutting module, for setting multiple different semantic segmentation points or regions according to the annotations of the video file and the picture files captured from video frames, and for dividing the video file and the picture files captured from video frames, according to the semantic segmentation points or regions, into multiple video segments with different semantics or different regions within a picture;
a central control module, connected to the resource input module, the structured description module, the annotation device module, and the bidirectional cutting module, for sending task instructions to each module and scheduling the operation of each module.
2. The system for realizing video knowledge acquisition and annotation functions based on a portable device according to claim 1, characterized in that the structured description module comprises a semantic relation unit, a spatio-temporal segmentation unit, a feature extraction unit, and an object recognition unit connected in sequence, for producing hierarchical structured description information about video and images that both people and computer systems can recognize.
3. The system for realizing video knowledge acquisition and annotation functions based on a portable device according to claim 1, characterized in that the bidirectional cutting module comprises a segmentation point setting unit and a cutting unit; the segmentation point setting unit sets multiple different semantic segmentation points or regions in the video file or the picture files captured from video frames according to the annotations; the cutting unit divides the video file and the picture files captured from video frames, according to the semantic segmentation points or regions, into multiple video segments with different semantics or different regions within a picture.
4. The system for realizing video knowledge acquisition and annotation functions based on a portable device according to claim 3, characterized in that the annotation device module comprises a coarse-grained annotation processing unit and a fine-grained annotation processing unit; the coarse-grained annotation processing unit performs coarse-grained annotation of the video file and the picture files captured from video frames according to the structured description data; the fine-grained annotation processing unit performs fine-grained annotation, according to the structured description data, of the video segments with different semantics or the different regions within a picture produced by the bidirectional cutting module.
5. The system for realizing video knowledge acquisition and annotation functions based on a portable device according to any one of claims 1 to 4, characterized in that the portable device comprises a tablet computer, a smartphone, and a digital video camera.
6. A method for realizing video knowledge acquisition and annotation processing on a portable device using the system of claim 1, characterized in that the method comprises the following steps:
(201) the resource input module obtains a video file and picture files captured from video frames from a connected Internet, local area network, or video database;
(202) the structured description module extracts the structured description content of the video file and the picture files captured from video frames and produces structured description data;
(203) the structured description module extracts the video image features and semantic data of the video file and the picture files captured from video frames, and enters the image features and semantic data into the structured description data;
(204) the annotation device module performs coarse-grained annotation of the video file and the picture files captured from video frames according to the structured description data;
(205) the bidirectional cutting module sets multiple different semantic segmentation points or regions in the video file and the picture files captured from video frames according to the coarse-grained annotation;
(206) the bidirectional cutting module divides the video file and the picture files captured from video frames, according to the semantic segmentation points or regions, into multiple video segments with different semantics or different regions within a picture;
(207) the annotation device module performs fine-grained annotation of the video segments with different semantics or the different regions within a picture.
7. The method for realizing video knowledge acquisition and annotation processing on a portable device according to claim 6, characterized in that the structured description module comprises a semantic relation unit, a spatio-temporal segmentation unit, a feature extraction unit, and an object recognition unit connected in sequence, and step (202) specifically comprises the following steps:
(301) the semantic relation unit determines the meaning of the objects in the video file and the picture files captured from video frames;
(302) the spatio-temporal segmentation unit separates meaningful objects from the video file and the picture files captured from video frames according to the object meanings, the meaningful objects comprising semantic-level object shape information and object textures;
(303) the feature extraction unit extracts features according to the color features and spatial features of the video file and the picture files captured from video frames;
(304) the object recognition unit recognizes and classifies unknown objects according to a known set of training objects;
(305) the hierarchical structured description data about video and images are generated according to the results of the above spatio-temporal segmentation, feature extraction, and object recognition.
8. The method for realizing video knowledge acquisition and annotation processing on a portable device according to claim 6, characterized in that the bidirectional cutting module comprises a segmentation point setting unit and a cutting unit, and the setting, by the bidirectional cutting module according to the coarse-grained annotation, of multiple different semantic segmentation points or regions in the video file and the picture files captured from video frames is specifically:
the segmentation point setting unit sets multiple different semantic segmentation points or regions in the video file or the picture files captured from video frames according to the annotation;
the dividing, by the bidirectional cutting module according to the semantic segmentation points or regions, of the video file and the picture files captured from video frames into multiple video segments with different semantics or different regions within a picture is specifically:
the cutting unit divides the video file and the picture files captured from video frames, according to the semantic segmentation points or regions, into multiple video segments with different semantics or different regions within a picture.
9. The method for realizing video knowledge acquisition and annotation processing on a portable device according to claim 6, characterized in that the annotation device module comprises a coarse-grained annotation processing unit and a fine-grained annotation processing unit, and the coarse-grained annotation, by the annotation device module according to the structured description data, of the video file and the picture files captured from video frames is specifically:
the coarse-grained annotation processing unit performs coarse-grained annotation of the video file and the picture files captured from video frames according to the structured description data;
the fine-grained annotation, by the annotation device module, of the video segments with different semantics or the different regions within a picture is specifically:
the fine-grained annotation processing unit performs fine-grained annotation, according to the structured description data, of the video segments with different semantics or the different regions within a picture produced by the bidirectional cutting module.
10. The method for realizing video knowledge acquisition and annotation processing on a portable device according to any one of claims 6 to 9, characterized in that step (203) specifically comprises the following steps:
(203-1) the structured description module divides the video file and the picture files captured from video frames into a number of video image segments, key frames, and key sub-regions;
(203-2) the structured description module performs feature extraction and semantic analysis on the video image segments, key frames, and key sub-regions to obtain video image features and semantic data;
(203-3) the structured description module enters the image features and semantic data into the structured description data.
11. The method for realizing video knowledge acquisition and annotation processing on a portable device according to claim 10, characterized in that the setting, by the bidirectional cutting module according to the coarse-grained annotation, of multiple different semantic segmentation points or regions in the video file and the picture files captured from video frames is specifically:
the bidirectional cutting module sets multiple different semantic segmentation points or regions in the key frames and key sub-regions according to the coarse-grained annotation;
the dividing, by the bidirectional cutting module according to the semantic segmentation points or regions, of the video file and the picture files captured from video frames into multiple video segments with different semantics or different regions within a picture is specifically:
the bidirectional cutting module divides the key frames and key sub-regions, according to the semantic segmentation points or regions, into multiple video segments with different semantics or different regions within a picture.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310007291.2A CN103077236B (en) 2013-01-09 2013-01-09 Portable set realizes the system and method for video knowledge acquisition and marking Function

Publications (2)

Publication Number Publication Date
CN103077236A CN103077236A (en) 2013-05-01
CN103077236B true CN103077236B (en) 2015-11-18

Family

ID=48153766

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310007291.2A Active CN103077236B (en) 2013-01-09 2013-01-09 Portable set realizes the system and method for video knowledge acquisition and marking Function

Country Status (1)

Country Link
CN (1) CN103077236B (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104185088B (en) * 2014-03-03 2017-05-31 无锡天脉聚源传媒科技有限公司 A kind of method for processing video frequency and device
CN105450978B (en) * 2014-06-24 2018-12-04 杭州海康威视数字技术股份有限公司 Method and apparatus for realizing structural description in video monitoring system
CN104866538A (en) * 2015-04-30 2015-08-26 北京海尔广科数字技术有限公司 Method, network and system of dynamic update semantic alarm database
CN104965814B (en) * 2015-06-30 2018-01-16 北京航空航天大学 A kind of source data mark extended method of civil aircraft technical publications
CN109348161B (en) * 2018-09-21 2021-05-18 联想(北京)有限公司 Method for displaying annotation information and electronic equipment
CN111160380A (en) * 2018-11-07 2020-05-15 华为技术有限公司 Method for generating video analysis model and video analysis system
CN110826101B (en) * 2019-11-05 2021-01-05 安徽数据堂科技有限公司 Privatization deployment data processing method for enterprise
CN113076942A (en) * 2020-01-03 2021-07-06 上海依图网络科技有限公司 Method, device, chip and computer readable storage medium for detecting preset mark
CN113408261B (en) * 2021-08-10 2021-12-14 广东新瑞智安科技有限公司 Method and system for generating job requisition

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101211460A (en) * 2006-12-30 2008-07-02 中国科学院计算技术研究所 Method and device for automatically dividing and classifying sports vision frequency shot
CN101650958A (en) * 2009-07-23 2010-02-17 中国科学院声学研究所 Extraction method and index establishment method of movie video scene clip

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120183271A1 (en) * 2011-01-17 2012-07-19 Qualcomm Incorporated Pressure-based video recording

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Video semantic retrieval system based on MPEG-7 (MPEG-7的视频语义检索系统); 郑烇; Computer Applications and Software (《计算机应用与软件》); 20120529; full text *

Also Published As

Publication number Publication date
CN103077236A (en) 2013-05-01

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant