CN105117738B - Fast implementation method for the Haar detection algorithm based on the OMAP-L138 chip - Google Patents
- Publication number
- CN105117738B (application CN201510462430.XA)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/19—Recognition using electronic means
- G06V30/192—Recognition using electronic means using simultaneous comparisons or correlations of the image signals with a plurality of references
- G06V30/195—Recognition using electronic means using simultaneous comparisons or correlations of the image signals with a plurality of references using a resistor matrix
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Theoretical Computer Science (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses a fast implementation method for the Haar detection algorithm based on the OMAP-L138 chip, comprising the following steps: (1) from the camera calibration result, compute the detection regions of the Haar algorithm once, form a Haar detection region table, and save it in DDR memory; (2) the ARM extracts data from DDR according to the Haar detection region table and the Haar parameter table and arranges it linearly in L2 memory; (3) the DSP reads the data linearly from L2 according to the Haar parameter table, computes the result, and updates the detection result in L2 memory; (4) the ARM organizes the next batch of data according to the detection results saved in L2 memory and passes it to the DSP, which again extracts data according to the Haar parameter table and computes; this repeats until the detection results for all of the image data are stored in L2; (5) the ARM extracts the detection results from L2 to obtain the Haar detection result for the whole image. By exploiting the hardware characteristics of the OMAP-L138 chip, the invention speeds up the implementation of the Haar detection algorithm from several angles and improves its practical value.
Description
Technical field
The present invention relates to a Haar detection algorithm, and specifically to a fast implementation method for the Haar detection algorithm based on the OMAP-L138 chip.
Background art
The Haar detection algorithm detects objects in an image using a pre-trained Haar feature table; its basic flow is shown in Fig. 1. The Haar feature table is made up of Haar features, i.e. the rectangular features in Fig. 1. A rectangular feature value is the difference between the gray-level pixel sums of two or more identically shaped rectangles in the image under test. For example, the rectangular feature value in Fig. 2 is the difference between the pixel sum in the white region and the pixel sum in the black region, and the rectangular feature value in Fig. 3 is the difference between the pixel sum in the white region and twice the pixel sum in the black region.
To detect vehicles quickly, researchers have grouped rectangular features into the following three main classes:
(1) Edge features, as shown in Fig. 2;
(2) Line features, which detect linear direction in the image, as shown in Fig. 3;
(3) Center-surround features, which measure the difference between the center pixels and the surrounding pixels, as shown in Fig. 4.
A detector with a resolution of M × M pixels contains a very large number of rectangles that satisfy the conditions, so evaluating them one by one is complicated and computationally expensive. For this reason a fast and convenient algorithm was devised whose cost is the same for rectangular features of any size: the integral image. In addition, an image contains only a few license plates but a great many scanning windows, so during scanning the vast majority of windows are not license plates. To reduce the time that such unimportant windows spend in the classifier, the cascade classifier was introduced.
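The constant-cost property of the integral image can be illustrated with a minimal Python sketch. The function names, the zero-padded table layout and the toy image below are assumptions made for illustration only; they are not taken from the patent.

```python
def integral_image(img):
    """Build a (H+1) x (W+1) summed-area table with a zero top row and left column."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Pixel sum of the rectangle at top-left (x, y): four table lookups, whatever its size."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def edge_feature(ii, x, y, w, h):
    """Two-rectangle edge feature: left (white) half minus right (black) half."""
    half = w // 2
    white = rect_sum(ii, x, y, half, h)
    black = rect_sum(ii, x + half, y, half, h)
    return white - black


# Toy 3 x 4 image: dark left half, bright right half (illustrative values).
image = [
    [1, 1, 5, 5],
    [1, 1, 5, 5],
    [1, 1, 5, 5],
]
ii = integral_image(image)
total = rect_sum(ii, 0, 0, 4, 3)
feature = edge_feature(ii, 0, 0, 4, 3)
```

Once the table is built, every rectangle sum costs the same four lookups, which is why the feature cost does not depend on the rectangle size.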
A cascade classifier is trained differently from other classifiers: training proceeds stage by stage. The first stage uses all positive and negative samples and produces a weak classifier. The second stage uses all positive samples but not all negative samples: the first weak classifier classifies the negative samples, and only the negatives it misclassifies as positive take part in second-stage training; the others no longer participate. Subsequent stages are trained in the same way. The total number of stages is set in advance by the operator and is generally about 17. This gradual elimination of negative samples lets the classifier strengthen its ability to recognize negatives stage by stage, from the first stage to the last. During training, the first stage leaves its misclassified negatives to the second stage, the second stage leaves its misclassified negatives to the third, and so on. When license plates are actually detected, most false plates are rejected by the early stages, and the later classifiers concentrate on the negatives that are hard to distinguish. The basic flow is shown in Fig. 5.
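The early-rejection behaviour at detection time can be sketched as follows. This is only an illustrative sketch: the stage layout (a list of nodes plus a stage threshold), the node tuple `(feature_fn, node_threshold, left_value, right_value)` and the toy numbers are assumptions, not the patent's data format.

```python
def evaluate_cascade(stages, window):
    """Return (accepted, stages_run).

    A window is rejected at the first stage whose summed node outputs fall
    below that stage's threshold, so most windows leave the cascade early.
    """
    for index, (nodes, stage_threshold) in enumerate(stages):
        score = 0.0
        for feature_fn, node_threshold, left_value, right_value in nodes:
            value = feature_fn(window)
            # Each node contributes its left or right value depending on the feature.
            score += left_value if value < node_threshold else right_value
        if score < stage_threshold:
            return False, index + 1
    return True, len(stages)


# Hypothetical two-stage cascade over toy "windows" given as (a, b) tuples.
stages = [
    ([(lambda w: w[0], 10, 0.0, 1.0)], 0.5),
    ([(lambda w: w[1], 5, 0.0, 1.0), (lambda w: w[0], 20, 0.2, 0.8)], 1.5),
]

rejected_early = evaluate_cascade(stages, (3, 9))
rejected_late = evaluate_cascade(stages, (12, 7))
accepted = evaluate_cascade(stages, (25, 9))
```

Only the windows that survive the cheap early stages ever reach the later, harder ones.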
However, even with the integral image and the cascade classifier, as technology continues to develop the Haar detection algorithm is increasingly unable to meet the practical demands of today's society and urgently needs further improvement.
Summary of the invention
The purpose of the present invention is to provide a fast implementation method for the Haar detection algorithm based on the OMAP-L138 chip, which solves the problem that prior-art Haar detection algorithms struggle to meet practical demands and improves their practical value.
To achieve this purpose, the present invention adopts the following technical solution:
A fast implementation method for the Haar detection algorithm based on the OMAP-L138 chip, comprising the following steps:
(1) From the camera calibration result, compute the detection regions of the Haar algorithm once, form a Haar detection region table, and save it in DDR memory;
(2) The ARM extracts data from DDR according to the Haar detection region table and the Haar parameter table and arranges it linearly in L2 memory;
(3) The DSP reads the data linearly from L2 according to the Haar parameter table, computes the result, and updates the detection result in L2 memory;
(4) The ARM organizes the next batch of data according to the detection results saved in L2 memory and passes it to the DSP, which extracts data according to the Haar parameter table and computes; this repeats until the detection results for all of the image data are stored in L2;
(5) The ARM extracts the detection results from L2 to obtain the Haar detection result for the whole image.
Further, a separate BIT table is established in L2. The DSP's computation results in step (3) are stored in this BIT table, and in step (4) the ARM extracts the detection results from the BIT table to obtain the result of the Haar detection algorithm. Because the ARM and the DSP read and write the same table directly, the data does not have to be copied repeatedly, which effectively improves data-processing efficiency.
Further, when the data extracted from DDR in step (2) is arranged linearly, every group of data has the same size. Linear arrangement makes the data easier to extract, and because every group is the same size, one group can be fetched directly each time without spending time computing data sizes, which improves access efficiency.
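The benefit of equal-size groups can be seen in a small Python sketch. The group size, the zero padding and the helper names are assumptions for illustration; the patent does not specify the record layout.

```python
GROUP_SIZE = 4  # assumed number of values per group; not specified in the patent


def pack_groups(groups, group_size=GROUP_SIZE, pad=0):
    """Lay groups out back to back in one flat buffer, padding short groups (assumed)."""
    flat = []
    for group in groups:
        if len(group) > group_size:
            raise ValueError("group larger than the fixed group size")
        flat.extend(group)
        flat.extend([pad] * (group_size - len(group)))
    return flat


def read_group(flat, index, group_size=GROUP_SIZE):
    """Fetch group `index` directly: its offset is index * group_size, no size lookup."""
    start = index * group_size
    return flat[start:start + group_size]


flat = pack_groups([[1, 2, 3], [4, 5], [6, 7, 8, 9]])
second = read_group(flat, 1)
third = read_group(flat, 2)
```

Because every group occupies the same number of slots, the reader needs only a multiplication to locate a group, which mirrors the "extract one group directly each time" behaviour described above.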
Still further, in step (3) the DSP computes the detection result using pure assembly code. Pure assembly can effectively raise the utilization of the DSP's eight functional units and avoids wasting DSP resources.
Further, while the DSP computes the detection result in step (3), the weight, left node, right node and threshold are each set as constants and assigned to registers for the computation. This effectively avoids the functional-unit overhead added by memory-access Load instructions and saves DSP resources.
In the present invention, the OMAP-L138 chip is a dual-core high-speed processor released by TI that combines a C6748 floating-point DSP core with an ARM9 core. The device integrates image, voice, network and storage functions and is cost-effective. Its C6748 core, clocked at up to 456 MHz, provides floating-point capability as well as higher-performance fixed-point capability. The ARM9 core is highly flexible: developers can run operating systems such as Linux on it, which makes it easy to add a human-machine interface, networking, a touch screen and so on to an application.
The OMAP-L138 chip has abundant memory and peripheral resources, which fully meet the system design requirements of the Gaussian mixture algorithm and also make it convenient to extend and upgrade the system in the future.
Compared with the prior art, the present invention has the following advantages:
The present invention combines the practical needs of the Haar detection algorithm with the core characteristics of the OMAP-L138 chip, matching what the algorithm demands to what the chip supplies and making the fullest use of the resources of the chip's two cores. By designing both the hardware and the software aspects, it greatly increases the speed of the Haar detection algorithm, solves the problem that existing Haar detection implementations are too slow to meet practical demands, and improves the practical value of the Haar detection algorithm.
Description of the drawings
Fig. 1 is a schematic diagram of the basic flow of the Haar detection algorithm in the prior art.
Fig. 2 is a schematic diagram of one rectangular feature in the prior art.
Fig. 3 is a schematic diagram of another rectangular feature in the prior art.
Fig. 4 is a schematic diagram of a further rectangular feature in the prior art.
Fig. 5 is a flow diagram of cascade classifier training in the prior art.
Fig. 6 is a schematic diagram of the linear arrangement of data in the present invention.
Detailed description of the embodiments
The invention is further described below with reference to the accompanying drawings and an embodiment. Embodiments of the present invention include, but are not limited to, the following embodiment.
Embodiment
The fast implementation method for the Haar detection algorithm based on the OMAP-L138 chip disclosed in this embodiment rests on exploiting the chip's dual-core nature: control logic and numerical computation are separated and each is handled by one core, thereby increasing the speed of the Haar detection algorithm.
Specifically, the OMAP-L138 chip integrates two separate cores, an ARM and a DSP, and making full use of the strengths of each is the key to real-time Haar detection. The ARM acts as the controller: using the Haar detection template, it extracts and organizes the Haar computation data from DDR and sends the organized data in batches to the DSP's L2 memory. The DSP reads the data from L2 segment by segment and uses its eight functional units to compute the Haar detection results at high speed. The DSP then hands the detection results to the ARM, which organizes the next batch of data. This repeats until the final Haar detection result is obtained on the ARM side.
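The division of labour described above can be mimicked in a single-process Python sketch. This is an interpretation under stated assumptions, not the patented implementation: the shared result table is modelled as one list with a bit per candidate window, the batch size is arbitrary, and the names `arm_organize_batch`, `dsp_compute` and `detect` are hypothetical.

```python
def arm_organize_batch(bit_table, windows, batch_size, start):
    """ARM role (assumed): gather up to batch_size still-alive windows from `start`."""
    batch = []
    i = start
    while i < len(windows) and len(batch) < batch_size:
        if bit_table[i]:
            batch.append((i, windows[i]))
        i += 1
    return batch, i


def dsp_compute(batch, stage_fn, bit_table):
    """DSP role (assumed): evaluate one stage and clear the bit of each rejected window."""
    for index, window in batch:
        if not stage_fn(window):
            bit_table[index] = 0


def detect(windows, stage_fns, batch_size=4):
    """Alternate ARM batching and DSP computation until every stage has run."""
    bit_table = [1] * len(windows)  # stands in for the shared result table in L2
    for stage_fn in stage_fns:
        start = 0
        while start < len(windows):
            batch, start = arm_organize_batch(bit_table, windows, batch_size, start)
            if batch:
                dsp_compute(batch, stage_fn, bit_table)
    return [i for i, bit in enumerate(bit_table) if bit]


# Toy run: ten numeric "windows" and two illustrative stage tests.
survivors = detect(list(range(10)), [lambda w: w % 2 == 0, lambda w: w > 3])
```

Both roles read and write the same table, so no results are copied between the batching and computing steps.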
The present invention is designed around the structural and process characteristics of Haar detection and the dual-core nature of the OMAP-L138 chip. Its general steps are as follows:
(1) From the camera calibration result, compute the detection regions of the Haar algorithm once, form a Haar detection region table, and save it in DDR memory;
(2) The ARM extracts data from DDR according to the Haar detection region table and the Haar parameter table and arranges it linearly in L2 memory;
(3) The DSP reads the data linearly from L2 according to the Haar parameter table, computes the result, and updates the detection result in L2 memory;
(4) The ARM organizes the next batch of data according to the detection results saved in L2 memory and passes it to the DSP, which extracts data according to the Haar parameter table and computes; this repeats until the detection results for all of the image data are stored in L2;
(5) The ARM extracts the detection results from L2 to obtain the Haar detection result for the whole image.
Specifically, to exploit the DSP's parallel execution during computation and make full use of the computing capability of its eight functional units, the present invention writes the computation routine in pure assembly, assigning instructions to each of the eight units and scheduling the pipeline.
Among the parameters of the Haar detection algorithm, the weight, left node, right node and threshold (T) would each require a Load instruction. To remove this drawback, and given that these values are known in advance, each of these parameters is treated as a constant during computation and assigned to a register, which avoids the functional-unit overhead added by frequent Load instructions.
In addition, it is known in advance that every node in layers 0 to 4 of the cascade classifier has only two rectangular features, and that node evaluations in layers 0-4 account for 77% of all node evaluations required across layers 0-22; the computation in the other layers is comparatively small. An optimization is therefore designed specifically for the structure of layers 0-4:
First, a modulo iteration interval schedule table is drawn up, and then the instructions are assigned to functional units. Each instruction is restricted in which functional units it can use: for example, LDDW and STW can only use the .D units, MPYLI can only use the .M units, CMPLTSP and CMPGTSP can only use the .S units, and ADDSP can only use the .L and .S units.
For example, computing the two features of one node requires the following (instruction * count, followed by the functional units used):
(1)LDDW * 4 .D1 .D2 .D1 .D2
(2)ADD * 4 .D1 .D2 .L1 .L2
(3)SUB * 2 .L1 .L2
(4)INTSP * 2 .L1 .L2
(5)MPYSP * 2 .M1 .M2
(6)CMPLTSP * 2 .S1 .S2
(7)CMPGTSP * 2 .S1 .S2
(8)MPYLI * 4 .M1 .M2 .M1 .M2
(9)ADDSP * 2 .L1 .L2
(10)STW * 2 .D1 .D2
During the design it was found that the store offset of STW is limited to 5 bits, i.e. an offset range of 32*4 = 128 bytes, which is not enough for the stores required, so ADDK instructions must also be added, as follows:
(11)ADDK * 2 .S1 .S2
It can be seen that the most heavily used units are the .D units, with 8 uses in total. Eight uses on the two .D units need at least 4 cycles per node, i.e. 2 cycles for each of its two features, so the minimum iteration interval of the modulo schedule table is determined to be 2 cycles.
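The unit counts behind this bound can be checked with a short Python sketch that tallies the instruction list given above (including the added ADDK). The sketch computes only a resource-based lower bound, assuming each functional unit issues at most one instruction per cycle and ignoring data dependencies and latencies; the variable names are illustrative.

```python
from collections import Counter
from math import ceil

# (instruction, functional units used) for the two features of one node,
# copied from the instruction list in the text.
NODE_INSTRUCTIONS = [
    ("LDDW", [".D1", ".D2", ".D1", ".D2"]),
    ("ADD", [".D1", ".D2", ".L1", ".L2"]),
    ("SUB", [".L1", ".L2"]),
    ("INTSP", [".L1", ".L2"]),
    ("MPYSP", [".M1", ".M2"]),
    ("CMPLTSP", [".S1", ".S2"]),
    ("CMPGTSP", [".S1", ".S2"]),
    ("MPYLI", [".M1", ".M2", ".M1", ".M2"]),
    ("ADDSP", [".L1", ".L2"]),
    ("STW", [".D1", ".D2"]),
    ("ADDK", [".S1", ".S2"]),
]
FEATURES_PER_NODE = 2


def unit_usage(instructions):
    """Count how many instructions are assigned to each individual unit (.D1, .L2, ...)."""
    usage = Counter()
    for _, units in instructions:
        usage.update(units)
    return usage


def usage_by_type(usage):
    """Merge the two sides of each unit type, e.g. .D1 and .D2 into .D."""
    by_type = Counter()
    for unit, count in usage.items():
        by_type[unit[:2]] += count
    return by_type


usage = unit_usage(NODE_INSTRUCTIONS)
by_type = usage_by_type(usage)
# One instruction per unit per cycle: the busiest unit sets the lower bound.
cycles_per_node = max(usage.values())
cycles_per_feature = ceil(cycles_per_node / FEATURES_PER_NODE)
```

With this list the .D and .L units are each used 8 times and the .M and .S units 6 times, which gives a bound of 4 cycles per node and 2 cycles per feature.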
Unlike a function called from C, hand-written assembly code must save its own context. Before the computation, the assembly code pushes A10-A15 and B10-B15 onto the stack, and after the computation completes it restores them from the stack to the registers. It also saves the return address PC and the stack pointer SP.
From the pipeline in the modulo schedule table, 8+8+6+6 = 28 functional-unit slots are used in 4 cycles, i.e. on average 7 of the eight units are busy in each cycle. In the pipelined computation, one feature is computed every 2 cycles; since each node in layers 0-4 has only 2 rectangular features, computing one node in layers 0-4 takes 4 cycles. The result of each rectangular feature is written to memory with an STW instruction. Finally, the values in memory are accumulated, the accumulated value is compared against the threshold of the layer, and the flag is set accordingly.
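The utilization figure and the final accumulate-and-compare step can be written out as a brief Python sketch. The comparison direction (pass when the sum reaches the threshold), the sample values and the function names are assumptions; the text only states that the accumulated value is judged against the layer threshold and a flag is set.

```python
UNIT_COUNT = 8  # functional units available per cycle on the DSP


def average_units_busy(slots_used, cycles):
    """Average number of functional units kept busy per cycle."""
    return slots_used / cycles


def stage_flag(feature_results, stage_threshold):
    """Accumulate stored feature results and set the flag (assumed: 1 when sum >= threshold)."""
    return 1 if sum(feature_results) >= stage_threshold else 0


slots = 8 + 8 + 6 + 6                       # .D, .L, .M and .S slots from the text
busy = average_units_busy(slots, 4)         # over the 4 cycles needed for one node
utilization = busy / UNIT_COUNT

passed = stage_flag([0.25, 0.5, 0.5], 1.0)  # illustrative feature results
failed = stage_flag([0.25, 0.25], 1.0)
```

The arithmetic reproduces the figure in the text: 28 slots over 4 cycles keeps 7 of the 8 units busy on average.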
Through the above improvements to the software design, combined with the hardware characteristics of the OMAP-L138 chip, the implementation of the Haar detection algorithm is sped up from several angles, which greatly improves the efficiency with which the Haar algorithm detects target vehicles and improves its practical value.
The above embodiment is only a preferred embodiment of the present invention and does not limit its scope of protection. Any variation that applies the design principle of the present invention and is made on this basis without creative work shall fall within the scope of protection of the present invention.
Claims (5)
1. A fast implementation method for the Haar detection algorithm based on the OMAP-L138 chip, characterized in that it comprises the following steps:
(1) From the camera calibration result, compute the detection regions of the Haar algorithm once, form a Haar detection region table, and save it in DDR memory;
(2) The ARM extracts data from DDR according to the Haar detection region table and the Haar parameter table and arranges it linearly in L2 memory;
(3) The DSP reads the data linearly from L2 memory according to the Haar parameter table, computes the result, and updates the computation result in L2 memory;
(4) The ARM organizes the next batch of data according to the computation results saved in L2 memory and passes it to the DSP, which extracts data according to the Haar parameter table and computes; this repeats until the computation results for all of the image data are stored in L2 memory;
(5) The ARM extracts the computation results from L2 memory to obtain the Haar computation result for the whole image.
2. The fast implementation method for the Haar detection algorithm based on the OMAP-L138 chip according to claim 1, characterized in that a separate BIT table is established in the L2 memory, the DSP's computation results in step (3) are stored in the BIT table, and in step (4) the ARM extracts the computation results from the BIT table to obtain the result of the Haar detection algorithm.
3. The fast implementation method for the Haar detection algorithm based on the OMAP-L138 chip according to claim 2, characterized in that when the data extracted from DDR in step (2) is arranged linearly, every group of data has the same size.
4. The fast implementation method for the Haar detection algorithm based on the OMAP-L138 chip according to claim 3, characterized in that in step (3) the DSP computes the result using pure assembly code.
5. The fast implementation method for the Haar detection algorithm based on the OMAP-L138 chip according to claim 4, characterized in that while the DSP computes the result in step (3), the weight, left node, right node and threshold are each set as constants and assigned to registers for the computation.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510462430.XA CN105117738B (en) | 2015-07-31 | 2015-07-31 | Fast implementation method for the Haar detection algorithm based on the OMAP-L138 chip |
Publications (2)
Publication Number | Publication Date |
---|---|
CN105117738A CN105117738A (en) | 2015-12-02 |
CN105117738B true CN105117738B (en) | 2018-08-10 |
Family
ID=54665721
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201510462430.XA Active CN105117738B (en) | 2015-07-31 | 2015-07-31 | Fast implementation method for the Haar detection algorithm based on the OMAP-L138 chip |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105117738B (en) |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101526997A (en) * | 2009-04-22 | 2009-09-09 | 无锡名鹰科技发展有限公司 | Embedded infrared face image identifying method and identifying device |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9330336B2 (en) * | 2011-09-16 | 2016-05-03 | Arizona Board of Regents, a body corporate of the State of Arizona, acting for and on behalf of, Arizona State University | Systems, methods, and media for on-line boosting of a classifier |
-
2015
- 2015-07-31 CN CN201510462430.XA patent/CN105117738B/en active Active
Non-Patent Citations (2)
Title |
---|
Night Vehicle Detection Using Variable Haar-Like Feature; Jae-do KIM et al.; Journal of Measurement Science and Instrumentation; 20111231; full text * |
Dual-core communication design for the OMAPL138; Lin Gan et al.; Machine Tool & Hydraulics; 20141130 (No. 22); full text * |
Also Published As
Publication number | Publication date |
---|---|
CN105117738A (en) | 2015-12-02 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||