CN106503170A

CN106503170A - Image library construction method based on occlusion dimensions

Info

Publication number: CN106503170A
Application number: CN201610930997.XA
Authority: CN
Inventors: 马惠敏; 高磊; 王弈冬
Original assignee: Tsinghua University
Current assignee: Tsinghua University
Priority date: 2016-10-31
Filing date: 2016-10-31
Publication date: 2017-03-15
Anticipated expiration: 2036-10-31
Also published as: CN106503170B

Abstract

The invention discloses an image library construction method based on occlusion dimensions, belonging to the technical field of image processing. The method includes: collecting images containing different occluded targets and classifying the collected images by occluded target to form a tree-shaped classification structure; labeling each image according to the occlusion dimensions; adding the labeled images to the corresponding tree-shaped classification structure to form an image library; and adding subsequently collected occlusion images to the image library in turn by the same processing method, so that the image library is further updated and improved. The invention has the advantages of being well targeted, well founded, and widely applicable.

Description

An image library construction method based on occlusion dimensions

Technical field

The invention belongs to the technical field of image processing, and in particular relates to an image library construction method based on occlusion dimensions.

Background art

As a medium for disseminating information, images are widely used because they carry intuitive and abundant information. To promote the development of computer vision, and especially research on image segmentation, target detection, and recognition methods, several international evaluation platforms have appeared for comparing and testing the quality of algorithms. In 2005 the European Union established the PASCAL (Pattern Analysis, Statistical Modelling and Computational Learning) data set and launched the VOC (Visual Object Classes) challenge. The VOC image set is divided into 4 major classes: vehicles, indoor objects, animals, and others. These classes contain 20 categories and 11,530 images in total, and the image content is common everyday objects, so that the practicality of algorithms can be better evaluated. In 2010 Stanford University established ImageNet, the largest image library in the world, which provides a data source and an international evaluation platform for image research; its images are for the most part simple images that are easy to recognize. That library is built on the WordNet tree structure and has nearly 15 million images in 17 classes; each class is organized hierarchically, and every image is labeled with attributes such as color, pattern, shape, and texture. In 2014 Microsoft released the COCO (Common Objects in Context) image data set, whose images are highly complex.

Because these image libraries mainly serve the computer vision field in general, none of them was built from the perspective of occlusion. Yet occlusion is a ubiquitous phenomenon in complex scene images, and it is a key problem that practical applications facing varied and complex conditions, such as visual navigation for autonomous driving and public-safety video surveillance, cannot avoid. These image libraries therefore cannot be used directly in occlusion-related image application scenarios and research.

Summary of the invention

The object of the invention is to overcome the shortcomings of the prior art by proposing an image library construction method based on occlusion dimensions, which provides a larger and more accurate training set for image recognition and thereby better serves the various applications built on image recognition.

The image library construction method based on occlusion dimensions proposed by the invention comprises the following steps:

1) Images containing different occluded targets are collected, and the collected images are classified by occluded target to form a tree-shaped classification structure;

2) Each image is labeled according to the occlusion dimensions;

3) The labeled images are added to the corresponding tree-shaped classification structure to form the image library; subsequently collected occlusion images are processed by the methods of steps 1) and 2) and added to the image library in turn, so that the image library is further updated and improved.

The invention has the following advantages:

(1) Well targeted: the image library is built specifically for target detection and recognition under complex occlusion conditions, and is therefore highly targeted;

(2) Well founded: the image library proposes a quantitative standard for the degree of occlusion for the first time and is labeled with occlusion attributes; its construction process is supported by a strong theoretical basis;

(3) Broad application prospects: the image library can be applied not only to target detection and recognition, but can also lay a foundation for analyzing the anti-occlusion performance of existing mainstream algorithms, and it is significant for extracting the laws by which occlusion affects image cognition.

Description of the drawings

Fig. 1 is a flow chart of the method for constructing an image library based on occlusion dimensions according to an embodiment of the invention;

Fig. 2 is a schematic diagram of the tree-shaped classification structure according to an embodiment of the invention.

Detailed description of the embodiments

The invention is further described below with reference to the accompanying drawings and embodiments:

The method for constructing an image library based on occlusion dimensions proposed by the invention, as shown in Fig. 1, specifically includes the following steps:

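1) Images containing different occluded targets are collected, and the collected images are classified by occluded target to form a tree-shaped classification structure;

2) Each image is labeled according to the occlusion dimensions;

3) The labeled images are added to the corresponding tree-shaped classification structure to form the image library; subsequently collected occlusion images are processed by the methods of steps 1) and 2) and added to the image library in turn, so that the image library is further updated and improved.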

Step 1) above, collecting images containing different occluded targets and classifying the collected images by occluded target to form a tree-shaped classification structure, specifically includes:

11) The collected images containing occluded targets are classified manually; for example, according to the target, the images are divided into categories such as aircraft, vehicle, ship, person, and animal, and the images of targets in the same category form a set;

12) The classified images are organized into a tree-shaped classification structure; the structure used in this embodiment is shown in the dashed box of Fig. 2. The structure has two levels. The first level is divided by occluded object, for example into the categories aircraft, vehicle, ship, person, and animal. At the second level, each class of occluded object is subdivided by occluded position; an aircraft, for example, is divided into occlusion elements such as nose, wing, fuselage, and window.
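The two-level structure described above can be sketched as a nested mapping. This is only an illustrative sketch: the function names `make_library` and `add_image` and the file names are hypothetical and do not come from the patent.

```python
from collections import defaultdict


def make_library():
    """Two-level tree: occluded-object category -> occluded part -> image ids."""
    return defaultdict(lambda: defaultdict(list))


def add_image(library, category, part, image_id):
    """File one labeled image under its first-level category and second-level part."""
    library[category][part].append(image_id)


library = make_library()
# Hypothetical image ids, used only to show how the tree fills up.
add_image(library, "aircraft", "wing", "img_0001.jpg")
add_image(library, "aircraft", "nose", "img_0002.jpg")
add_image(library, "vehicle", "wheel", "img_0003.jpg")

for category, parts in library.items():
    print(category, {part: len(images) for part, images in parts.items()})
```

Adding subsequently collected images is then just another call to `add_image`, which matches the incremental update of the library described in step 3).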

Step 2) above labels each image according to the occlusion dimensions.

The occlusion dimensions of this embodiment are: occluded part, occlusion area, occlusion relationship, and occlusion complexity. Images containing occlusion are selected and labeled according to each of these four dimensions. Of these, the occluded part, occlusion area, and occlusion relationship are labeled with the LabelMe tool. LabelMe is an open annotation tool created by the Computer Science and Artificial Intelligence Laboratory of the Massachusetts Institute of Technology. Image annotations are saved as XML files, which can be processed with the MATLAB toolbox. Specifically:

Occluded part: occlusion of different parts affects object recognition to different degrees. The invention uses the LabelMe tool to trace the polygonal outline of a target element by clicking along its edge and then annotates the target element; differently annotated target elements are marked in different colors in the image. The name of each target element and the discrete coordinates of the polygon formed by clicking along its boundary are stored in the XML file of the corresponding image.

A part is a characteristic feature of a target element. For the category "car", for example, the occluded position of the vehicle is subdivided into parts such as front end, headlight, wheel, and window, as shown in Fig. 2. If the wheel is occluded, the occluded part is labeled "wheel" so that it can be queried.

In the specific implementation of object part labeling, the whole scene is divided into modules, and modules belonging to the same part are clustered according to information such as the color and texture of each module, which completes the division of the object into parts. Each part of the occluded object is labeled as missing or not; on this basis, classification is performed according to the degree to which parts are missing and the type of occluded part, that is, the occluding object and the occluded object are classified according to the image content. The categories are then encoded (any form of encoding may be used) for entry into the library.
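The embodiment processes the annotation files with the MATLAB toolbox. As an alternative illustration only, the element names and polygon coordinates could be read back in Python as sketched below. The XML layout (`object`, `name`, `polygon`, `pt`, `x`, `y`) is assumed to follow the usual LabelMe export and should be checked against real files; the sample content and the function name `read_polygons` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation written in the assumed LabelMe-style layout.
SAMPLE_XML = """
<annotation>
  <filename>car_001.jpg</filename>
  <object>
    <name>wheel</name>
    <polygon>
      <pt><x>10</x><y>10</y></pt>
      <pt><x>30</x><y>10</y></pt>
      <pt><x>30</x><y>30</y></pt>
      <pt><x>10</x><y>30</y></pt>
    </polygon>
  </object>
</annotation>
"""


def read_polygons(xml_text):
    """Return a list of (element name, [(x, y), ...]) pairs from one annotation."""
    root = ET.fromstring(xml_text.strip())
    result = []
    for obj in root.iter("object"):
        name = (obj.findtext("name") or "").strip()
        points = [
            (float(pt.findtext("x")), float(pt.findtext("y")))
            for pt in obj.iter("pt")
        ]
        result.append((name, points))
    return result


polygons = read_polygons(SAMPLE_XML)
print(polygons)
```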

Occlusion area: in this embodiment, after the whole scene is divided into modules by a superpixel segmentation method, occlusion is labeled by object completion; that is, parameters such as the shape and size of the occluded portion are predicted from the original image information, and the occlusion area ratio calculated from them is used for labeling. The occlusion area is calculated from the extracted contours of the occluding object and the occluded object. For an occlusion image, the occluding object and the occluded object are fitted by polygon approximation, and the image is classified and labeled by the size of the occlusion area. As shown in Fig. 2, the occlusion area label of this embodiment is subdivided into less than 20%, 20%-50%, 50%-70%, more than 70%, and so on. Although this method is not highly precise, its precision is sufficient to distinguish occlusion images, and an appropriate simplification of the labeling precision makes the calculation convenient and fast.
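The coarse area classes can be expressed as a small lookup. The sketch below is illustrative only: the function name `area_label` is hypothetical, and the handling of the exact boundary values (20%, 50%, 70%) is an assumption, since the text does not say which class they belong to.

```python
def area_label(ratio_percent):
    """Map an occlusion area percentage to one of the four coarse classes."""
    if not 0 <= ratio_percent <= 100:
        raise ValueError("occlusion percentage must lie in [0, 100]")
    if ratio_percent < 20:
        return "<20%"
    if ratio_percent <= 50:   # assumption: 20% and 50% fall in this class
        return "20%-50%"
    if ratio_percent <= 70:   # assumption: 70% falls in this class
        return "50%-70%"
    return ">70%"


for value in (10, 35, 60, 85):
    print(value, area_label(value))
```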

This embodiment performs the calculation with the LabelMe tool. Polygon areas are calculated from the obtained coordinates, and the occluded area is calculated as a percentage of the total area of the occluded object. With 1 denoting the occluding object (S_cover), 2 the occluded object (S_covered), and 3 the total image area (S_whole), the formula is as follows:

\frac{S_{cover} + S_{covered} - S_{whole}}{S_{covered}} \times 100\%
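A minimal sketch of this calculation follows, using the shoelace formula for polygon area and the percentage (S_cover + S_covered - S_whole) / S_covered × 100%. The function names and the rectangle coordinates are hypothetical. In the example, S_whole is taken to be the area jointly covered by the two polygons, which is an interpretation of the "total image area" in the text and not something the patent states explicitly.

```python
def polygon_area(points):
    """Area of a simple polygon from its vertex list (shoelace formula)."""
    twice_area = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


def occlusion_percent(s_cover, s_covered, s_whole):
    """(S_cover + S_covered - S_whole) / S_covered * 100."""
    if s_covered <= 0:
        raise ValueError("area of the occluded object must be positive")
    return (s_cover + s_covered - s_whole) / s_covered * 100.0


# Hypothetical polygons: two 4 x 4 rectangles overlapping in a 2 x 4 strip.
cover = [(0, 0), (4, 0), (4, 4), (0, 4)]      # occluding object
covered = [(2, 0), (6, 0), (6, 4), (2, 4)]    # occluded object after completion
whole = [(0, 0), (6, 0), (6, 4), (0, 4)]      # region covered by both (assumption)

percent = occlusion_percent(
    polygon_area(cover), polygon_area(covered), polygon_area(whole)
)
print(percent)
```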

Occlusion relationship: the occlusion relationship is determined by the occlusion mode, position, and distance of the objects. The occlusion relationship label is subdivided into occlusion between objects of the same class, occlusion between objects of different classes, self-occlusion, and mutual occlusion.
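One way to keep the labels of an image together is a small record type, with the occlusion relationship as an enumeration of the four classes named above. This is a hedged sketch: the class and field names are hypothetical and are not defined by the patent.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class OcclusionRelation(Enum):
    SAME_CLASS = "occlusion between objects of the same class"
    DIFFERENT_CLASS = "occlusion between objects of different classes"
    SELF = "self-occlusion"
    MUTUAL = "mutual occlusion"


@dataclass
class OcclusionAnnotation:
    """Labels of one image along the occlusion dimensions (hypothetical layout)."""
    image_id: str
    occluded_part: str
    area_percent: float
    relation: OcclusionRelation
    complexity: Optional[float] = None  # left empty until it has been measured


record = OcclusionAnnotation(
    image_id="img_0003.jpg",          # hypothetical file name
    occluded_part="wheel",
    area_percent=35.0,
    relation=OcclusionRelation.DIFFERENT_CLASS,
)
print(record)
```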

Occlusion complexity: occlusion complexity is defined using gaze focus detection technology together with an eye tracker (an eye tracker records a person's eye-movement characteristics while processing visual information and is widely used in research on attention, visual perception, reading, and other fields). The implementation in this embodiment comprises the following steps:

21) Eye movement is detected with the eye tracker, and the observer's gaze point coordinate sequence is recorded to obtain the gaze point trajectory; the first 10% and the last 10% of the data points in the gaze point coordinate sequence are discarded to ensure the correctness of the sequence;

22) The gaze point coordinates are arranged in chronological order;

23) Coordinate conversion is performed on the gaze point coordinates. The coordinates extracted by the eye tracker are referenced to the resolution of the computer screen, but the image is not displayed full screen during the test so that it is not distorted; coordinate conversion is therefore needed.

Let the screen resolution be L × H and the displayed image size be l × h, with the image displayed centered; the coordinate transformation formulas are as follows:

x_{new} = \left(x_{original} - \frac{L - l}{2}\right) \times \frac{l}{L}

y_{new} = \left(y_{original} - \frac{H - h}{2}\right) \times \frac{h}{H}

In the formulas, x_{original} and y_{original} are the original coordinates, and x_{new} and y_{new} are the coordinates after conversion;

24) The eye tracker records the transformed gaze focus coordinates at different times, and a gaze trajectory map is drawn; a gaze dwell heat map is obtained by a clustering algorithm; labeling is performed according to the occlusion complexity defined by the occluded quantity (complexity of the trajectory map, number of hotspots in the dwell heat map) and the average dwell time (percentage of average dwell time per hotspot).
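Steps 21) to 23) can be sketched as follows. This is only an illustration under stated assumptions: samples are assumed to be `(t, x, y)` tuples, the function names are hypothetical, and the samples are sorted by time before the first and last 10% are dropped so that the discarded points are the earliest and latest ones. The clustering of step 24) is not shown, because the text does not name a specific clustering algorithm.

```python
def trim_sequence(samples, fraction=0.10):
    """Drop the first and last `fraction` of the samples (step 21)."""
    k = int(len(samples) * fraction)
    return samples[k:len(samples) - k]


def to_image_coords(x, y, screen, image):
    """Apply x_new = (x - (L - l) / 2) * l / L, and likewise for y (step 23)."""
    L, H = screen   # screen resolution
    l, h = image    # displayed image size, centered on the screen
    x_new = (x - (L - l) / 2) * l / L
    y_new = (y - (H - h) / 2) * h / H
    return x_new, y_new


def preprocess(samples, screen, image):
    """Sort by time (step 22), trim both ends, then convert coordinates."""
    ordered = sorted(samples, key=lambda s: s[0])
    kept = trim_sequence(ordered)
    return [(t, *to_image_coords(x, y, screen, image)) for t, x, y in kept]


# Hypothetical recording: ten samples given in reverse time order.
raw = [(t, 480 + 96 * t, 270 + 54 * t) for t in range(10)][::-1]
processed = preprocess(raw, screen=(1920, 1080), image=(960, 540))
print(processed[0], len(processed))
```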

Compared with existing part-segmentation data sets, the data set built by the invention adds labels for occluded parts. Although embodiments of the invention have been shown and described above, it should be understood that they are exemplary and are not to be construed as limiting the invention; those of ordinary skill in the art may change, modify, replace, and vary the above embodiments within the scope of the invention without departing from its principle and purpose.

Claims

1. An image library construction method based on occlusion dimensions, characterized in that the method specifically comprises the following steps:

1) Images containing different occluded targets are collected, and the collected images are classified by occluded target to form a tree-shaped classification structure;

2) Each image is labeled according to the occlusion dimensions;

3) The labeled images are added to the corresponding tree-shaped classification structure to form the image library; subsequently collected occlusion images are processed by the methods of steps 1) and 2) and added to the image library in turn, so that the image library is further updated and improved.

2. The image library construction method based on occlusion dimensions according to claim 1, characterized in that labeling each image according to the occlusion dimensions in step 2) specifically comprises labeling the images containing different occluded targets according to occluded part, occlusion area, occlusion relationship, and occlusion complexity, respectively.

3. The image library construction method based on occlusion dimensions according to claim 2, characterized in that the labeling according to occluded part in step 2) is: the whole scene is divided into modules, and modules belonging to the same part are clustered according to information such as the color and texture of each module, which completes the division of the object into parts; each part of the occluded object is labeled as missing or not; on this basis, classification is performed according to the degree to which parts are missing and the type of occluded part, that is, the occluding object and the occluded object are classified according to the image content; and the categories are encoded for entry into the library.

4. The image library construction method based on occlusion dimensions according to claim 2, characterized in that the occlusion area labeling is: after the whole scene is divided into modules by a superpixel segmentation method, the occluding object and the occluded object are fitted by polygon approximation according to the shape and size parameters of the image, and the occlusion image is classified and labeled according to the size of the occlusion area.

5. The image library construction method based on occlusion dimensions according to claim 4, characterized in that classifying and labeling the occlusion image according to the size of the occlusion area is: polygon areas are calculated from the obtained coordinates, and the occluded area is calculated as a percentage of the total area of the occluded object, with the formula as follows:

\frac{S_{cover} + S_{covered} - S_{whole}}{S_{covered}} \times 100\%

In the formula, S_cover is the occluding object, S_covered is the occluded object, and S_whole is the total image area.

6. The image library construction method based on occlusion dimensions according to claim 2, characterized in that the occlusion relationship labeling is: the occlusion relationship is determined by the occlusion mode, position, and distance of the objects, and the occlusion relationship label is divided into occlusion between objects of the same class, occlusion between objects of different classes, self-occlusion, and mutual occlusion.

7. The image library construction method based on occlusion dimensions according to claim 2, characterized in that the occlusion complexity labeling specifically comprises the following steps:

21) Eye movement is detected with an eye tracker, and the observer's gaze point coordinate sequence is recorded to obtain the gaze point trajectory; the first 10% and the last 10% of the data points in the gaze point coordinate sequence are discarded to ensure the correctness of the sequence;

23) Coordinate conversion is performed on the gaze point coordinates.

Let the screen resolution be L × H and the displayed image size be l × h, with the image displayed centered; the coordinate transformation formulas are as follows:

x_{new} = \left(x_{original} - \frac{L - l}{2}\right) \times \frac{l}{L}

y_{new} = \left(y_{original} - \frac{H - h}{2}\right) \times \frac{h}{H}

24) The eye tracker records the transformed gaze focus coordinates at different times, and a gaze trajectory map is drawn; a gaze dwell heat map is obtained by a clustering algorithm; labeling is performed according to the occlusion complexity defined by the occluded quantity of the trajectory map and the average dwell time.