WO2017088249A1 - Feature extraction method and apparatus (特征提取方法及装置) - Google Patents

Feature extraction method and apparatus

Info

Publication number
WO2017088249A1
WO2017088249A1 (PCT/CN2015/099310)
Authority
WO
WIPO (PCT)
Prior art keywords
image
blocks
hog
frequency domain
cells
Prior art date
Application number
PCT/CN2015/099310
Other languages
English (en)
French (fr)
Inventor
龙飞
陈志军
张涛
Original Assignee
小米科技有限责任公司 (Xiaomi Technology Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 小米科技有限责任公司 (Xiaomi Technology Co., Ltd.)
Priority to JP2017552215A (JP6378453B2)
Priority to KR1020167005590A (KR101754046B1)
Priority to MX2016003738A
Priority to RU2016110721A (RU2632578C2)
Publication of WO2017088249A1


Classifications

    • G06V 10/50: Extraction of image or video features by performing operations within image blocks; by using histograms, e.g. histogram of oriented gradients [HoG]; by summing image-intensity values; projection analysis
    • G06T 5/00: Image enhancement or restoration
    • G06T 5/10: Image enhancement or restoration by non-spatial domain filtering
    • G06T 5/40: Image enhancement or restoration by the use of histogram techniques
    • G06T 7/11: Region-based segmentation
    • G06V 10/431: Frequency domain transformation; autocorrelation
    • G06T 2207/20021: Dividing image into blocks, subimages or windows
    • G06T 2207/20052: Discrete cosine transform [DCT]
    • G06T 2207/20056: Discrete and fast Fourier transform [DFT, FFT]
    • G06T 2207/30024: Cell structures in vitro; tissue sections in vitro

Definitions

  • The present disclosure relates to the field of image processing technologies, and in particular to a feature extraction method and apparatus.
  • Image detection and recognition is an important research area in computer vision.
  • The most common approach in image detection and recognition is to detect and identify an image by extracting certain features from it.
  • For example, an image may be detected and identified by extracting its HOG (Histogram of Oriented Gradients) feature.
  • A typical HOG extraction procedure is as follows: compute the gradient of each pixel in the image; divide the image into a number of cells, each cell containing several pixels, with every n adjacent cells forming one block; compute the histogram of gradients over all pixels in each cell, and obtain the HOG feature of each block from the gradient histograms of the cells it contains; finally, collect the HOG features of all blocks in the image to obtain the HOG feature of the image.
  • the present disclosure provides a feature extraction method and apparatus.
  • the technical solution is as follows:
  • a feature extraction method, comprising: dividing an image into a number of blocks, each block comprising a number of cells; converting each cell from the spatial domain to the frequency domain; and extracting a histogram of oriented gradients (HOG) feature of the image in the frequency domain.
  • optionally, converting each cell from the spatial domain to the frequency domain comprises performing a discrete cosine transform (DCT) on each cell.
  • alternatively, converting each cell from the spatial domain to the frequency domain comprises performing a discrete Fourier transform (DFT) on each cell.
  • optionally, extracting the histogram of oriented gradients (HOG) feature of the image in the frequency domain comprises: calculating the gradient magnitude and gradient direction of each cell in the frequency domain to obtain a descriptor for each cell; collecting the descriptors within each block to obtain the HOG feature of each block; and collecting the HOG features of the blocks in the frequency domain to obtain the HOG feature of the image.
  • optionally, collecting the HOG features of the blocks to obtain the HOG feature of the image comprises: concatenating the HOG features of the blocks in the image into a matrix, each column of which is the HOG feature of one block.
  • optionally, collecting the HOG features of the blocks to obtain the HOG feature of the image comprises: adjusting the HOG feature of each block, and obtaining the HOG feature of the image from the adjusted HOG feature of each block and the corresponding position of each block in the image.
  • optionally, the method further includes: normalizing the image to obtain an image of a predetermined size.
  • a feature extraction apparatus comprising:
  • a partitioning module configured to divide an image into a plurality of blocks, each block comprising a plurality of cells
  • a conversion module configured to convert each cell from a spatial domain to a frequency domain
  • An extraction module configured to extract a direction gradient histogram HOG feature of the image in the frequency domain.
  • the conversion module is configured to perform a discrete cosine transform DCT on each cell.
  • the conversion module is configured to perform a discrete Fourier transform DFT for each cell.
  • the extraction module includes:
  • a calculation submodule configured to calculate a gradient size and a gradient direction of each cell in the frequency domain to obtain a descriptor for each cell
  • a first statistic sub-module configured to collect the descriptors within each block in the frequency domain to obtain the HOG feature of each block
  • a second statistic sub-module configured to collect the HOG features of the blocks of the image in the frequency domain to obtain the HOG feature of the image.
  • the second statistic sub-module is configured to concatenate HOG features of each block in the image into a matrix to obtain HOG features of the image, and each column of the matrix is a HOG feature of one block.
  • optionally, the second statistic sub-module includes: an adjustment sub-module configured to adjust the HOG feature of each block; and a feature extraction sub-module configured to obtain the HOG feature of the image according to the adjusted HOG feature of each block and the corresponding position of each block in the image.
  • the device further includes:
  • the processing module is configured to normalize the image to obtain an image of a predetermined size.
  • a feature extraction apparatus, comprising: a processor; and
  • a memory for storing processor-executable instructions;
  • wherein the processor is configured to: divide an image into a number of blocks, each block comprising a number of cells; convert each cell from the spatial domain to the frequency domain; and extract the histogram of oriented gradients (HOG) feature of the image in the frequency domain.
  • By dividing the image into several blocks, each comprising several cells, converting each cell from the spatial domain to the frequency domain, and extracting the HOG feature of the image in the frequency domain, the present disclosure solves the problem that the HOG feature extraction process computes directly on the spatial domain of the image, which leads to a low detection rate and accuracy in pattern recognition; extracting the HOG feature of the image in the frequency domain improves the detection rate and accuracy in pattern recognition.
  • FIG. 1 is a flowchart of a feature extraction method according to an exemplary embodiment;
  • FIG. 2A is a flowchart of a feature extraction method according to another exemplary embodiment;
  • FIG. 2B is a schematic diagram of image division according to an exemplary embodiment;
  • FIG. 2C is a schematic diagram of image division according to another exemplary embodiment;
  • FIG. 2D is a schematic diagram of collecting the HOG feature within a block, according to an exemplary embodiment;
  • FIG. 3A is a flowchart of a feature extraction method according to an exemplary embodiment;
  • FIG. 3B is a schematic diagram of collecting the HOG feature of an image, according to an exemplary embodiment;
  • FIG. 4 is a block diagram of a feature extraction apparatus according to an exemplary embodiment;
  • FIG. 5 is a block diagram of a feature extraction apparatus according to another exemplary embodiment;
  • FIG. 6 is a block diagram of a sub-module of a feature extraction apparatus, according to an exemplary embodiment;
  • FIG. 7 is a block diagram of a feature extraction apparatus, according to another exemplary embodiment.
  • FIG. 1 is a flowchart of a feature extraction method according to an exemplary embodiment. As shown in FIG. 1, this embodiment is illustrated with the method applied in hardware for pattern recognition. The method may include the following steps.
  • In step 102, the image is divided into a number of blocks, each block comprising a number of cells.
  • In step 104, each cell is converted from the spatial domain to the frequency domain.
  • In step 106, the HOG feature of the image is extracted in the frequency domain.
  • In summary, the feature extraction method provided by this embodiment divides an image into a number of blocks, each comprising a number of cells, converts each cell from the spatial domain to the frequency domain, and extracts the histogram of oriented gradients (HOG) feature of the image in the frequency domain. This solves the problem that the HOG feature extraction process computes directly on the spatial domain of the image, which results in a low detection rate and accuracy in pattern recognition, and improves the detection rate and accuracy by extracting the HOG feature in the frequency domain.
  • FIG. 2A is a flowchart of a feature extraction method according to another exemplary embodiment. As shown in FIG. 2A, this embodiment is illustrated with the method applied in hardware for pattern recognition. The method may include the following steps:
  • In step 201, the image is normalized to obtain an image of a predetermined size.
  • Before extracting features from the image, the terminal first normalizes it, processing images of different sizes into images of a predetermined size, to facilitate unified processing of the image.
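As a concrete illustration, the normalization step can be sketched as follows (a minimal sketch in Python with NumPy; the nearest-neighbour resampling and the 64×128 target size are assumptions for illustration, since the patent specifies neither):

```python
import numpy as np

def normalize_image(img: np.ndarray, size=(64, 128)) -> np.ndarray:
    """Resize a grayscale image to a predetermined size (here via
    nearest-neighbour sampling) so that images of different sizes
    can be processed uniformly."""
    h, w = img.shape
    th, tw = size
    rows = np.arange(th) * h // th   # source row for each target row
    cols = np.arange(tw) * w // tw   # source column for each target column
    return img[rows[:, None], cols]

out = normalize_image(np.random.rand(48, 96), size=(64, 128))
assert out.shape == (64, 128)
```

In a real pipeline a smoother interpolation (e.g. bilinear) would likely be used; nearest-neighbour keeps the sketch dependency-free.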
  • In step 202, the image is divided into a number of blocks, each block comprising a number of cells.
  • The terminal may divide the normalized image by first dividing the image into a number of blocks and then dividing each block into a number of cells.
  • Alternatively, the terminal may first divide the image into a number of cells and then combine adjacent cells into blocks, each block containing several cells; for example, two adjacent cells in each of two adjacent rows (a 2×2 group) may form one block.
  • The order of dividing blocks and dividing cells is not specifically limited: blocks may be divided before cells, or cells may be divided first and then combined into blocks.
  • Whether adjacent blocks of the divided image overlap is likewise not specifically limited: there may or may not be an overlapping area between blocks.
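The division described above can be sketched as follows (a hedged sketch: the 8×8-pixel cells, 2×2 cells per block, and the stride controlling block overlap are illustrative choices, not values fixed by the patent):

```python
import numpy as np

def divide_image(img, cell=8, cells_per_block=2, stride=2):
    """Divide an image into cells, then group adjacent cells into blocks.

    stride == cells_per_block gives non-overlapping blocks;
    stride < cells_per_block makes adjacent blocks overlap.
    Returns an array of shape (n_blocks, cells_per_block**2, cell, cell).
    """
    h, w = img.shape
    gh, gw = h // cell, w // cell                       # size of the cell grid
    cells = (img[:gh * cell, :gw * cell]
             .reshape(gh, cell, gw, cell)
             .transpose(0, 2, 1, 3))                    # (gh, gw, cell, cell)
    blocks = [cells[i:i + cells_per_block, j:j + cells_per_block]
              .reshape(-1, cell, cell)
              for i in range(0, gh - cells_per_block + 1, stride)
              for j in range(0, gw - cells_per_block + 1, stride)]
    return np.stack(blocks)

blocks = divide_image(np.zeros((32, 32)))        # 4x4 cell grid -> 2x2 blocks
assert blocks.shape == (4, 4, 8, 8)              # 4 blocks of 4 cells each
```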
  • In step 203, a discrete cosine transform (DCT) is performed on each cell.
  • The terminal performs a DCT on each cell in the image to convert the image from the spatial domain to the frequency domain.
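The per-cell DCT can be written out directly (a sketch; the orthonormal DCT-II normalization below is one common convention, and the patent does not state which DCT variant is used):

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n)[:, None]           # frequency index
    x = np.arange(n)[None, :]           # spatial index
    C = np.cos(np.pi * (2 * x + 1) * k / (2 * n))
    C[0] /= np.sqrt(n)
    C[1:] *= np.sqrt(2.0 / n)
    return C

def dct2(cell):
    """2-D DCT of one cell: converts the cell from the spatial
    domain to the frequency domain."""
    C = dct_matrix(cell.shape[0])
    R = dct_matrix(cell.shape[1])
    return C @ cell @ R.T

freq = dct2(np.ones((8, 8)))
# A constant cell has all its energy in the DC coefficient:
assert abs(freq[0, 0] - 8.0) < 1e-9
```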
  • Alternatively, the terminal performs a discrete Fourier transform (DFT) on each cell.
  • The terminal performs a DFT on each cell in the image to convert the image from the spatial domain to the frequency domain.
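The DFT alternative can use NumPy's FFT directly (a sketch; taking the magnitude spectrum so that the later gradient computation stays real-valued is an assumption, as the patent only states that each cell is DFT-transformed):

```python
import numpy as np

def dft2(cell):
    """2-D DFT of one cell via the FFT; returns the magnitude spectrum
    with the zero-frequency component shifted to the centre."""
    return np.abs(np.fft.fftshift(np.fft.fft2(cell)))

spec = dft2(np.ones((8, 8)))
# A constant cell has all its energy in the (centred) DC bin:
assert spec[4, 4] == 64.0 and abs(spec.sum() - 64.0) < 1e-9
```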
  • In step 204, the gradient magnitude and gradient direction of each cell in the frequency domain are calculated to obtain a descriptor for each cell.
  • The terminal uses a gradient operator to calculate the lateral (horizontal) gradient and vertical gradient of each pixel in each cell after the DCT or DFT.
  • Exemplary, commonly used gradient operators are shown in Table 1.
  • Any gradient operator in Table 1 may be selected, or another gradient operator may be used; the choice of gradient operator is not specifically limited.
  • With the lateral gradient G_x(x, y) and vertical gradient G_y(x, y) of a pixel (x, y), the gradient magnitude is m(x, y) = √(G_x(x, y)² + G_y(x, y)²) and the gradient direction is θ(x, y) = arctan(G_y(x, y) / G_x(x, y)), where θ(x, y) is the gradient direction of the pixel (x, y) and m(x, y) is its gradient magnitude.
  • The gradient direction θ(x, y) ranges from -90 degrees to 90 degrees. This range is divided equally into z bins; within each cell, every pixel votes for the bin containing its gradient direction, weighted by m(x, y). Each cell thus yields a z-dimensional vector, which is the descriptor of that cell.
  • For example, when the gradient direction is divided equally into 9 bins of 20 degrees each, the pixels of each cell are accumulated into the 9 bins with weights m(x, y), and each cell finally yields a 9-dimensional vector.
  • The number of bins of the gradient direction is not specifically limited.
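The descriptor computation above can be sketched as follows (Python with NumPy; the [-1, 0, 1] gradient operator and z = 9 bins are the illustrative choices from the example, and arctan is used so that θ stays within (-90°, 90°)):

```python
import numpy as np

def cell_descriptor(cell, z=9):
    """Orientation histogram (descriptor) of one frequency-domain cell:
    each pixel votes for the bin containing its gradient direction,
    weighted by its gradient magnitude m(x, y)."""
    gx = np.zeros_like(cell, dtype=float)
    gy = np.zeros_like(cell, dtype=float)
    gx[:, 1:-1] = cell[:, 2:] - cell[:, :-2]          # lateral gradient, [-1, 0, 1]
    gy[1:-1, :] = cell[2:, :] - cell[:-2, :]          # vertical gradient
    m = np.hypot(gx, gy)                              # m(x, y)
    theta = np.degrees(np.arctan(gy / (gx + 1e-12)))  # theta(x, y) in (-90, 90)
    bins = np.clip(((theta + 90.0) * z / 180.0).astype(int), 0, z - 1)
    return np.bincount(bins.ravel(), weights=m.ravel(), minlength=z)

# A pure horizontal ramp: every interior pixel has theta = 0 (the middle bin)
d = cell_descriptor(np.tile(np.arange(8.0), (8, 1)))
assert d.shape == (9,) and d[4] == d.sum() == 96.0
```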
  • In step 205, the descriptors within each block in the frequency domain are collected to obtain the HOG feature of each block.
  • The terminal collects the descriptors computed for the cells contained in each block to obtain the HOG feature of that block.
  • The terminal may concatenate the descriptors of the cells, so that the HOG feature of each block is a vector whose dimension is k times the descriptor dimension, where k is the number of cells contained in the block.
  • For example, if the descriptor of each cell is a 9-dimensional vector and each block contains 4 cells, the four 9-dimensional descriptors are concatenated into a 36-dimensional vector, which is taken as the HOG feature of the corresponding block.
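The concatenation in step 205 is then a one-liner (a sketch; k = 4 cells with 9-dimensional descriptors, as in the example):

```python
import numpy as np

def block_hog(cell_descriptors):
    """HOG feature of one block: the descriptors of its k cells,
    concatenated into a single (k * z)-dimensional vector."""
    return np.concatenate([np.asarray(d).ravel() for d in cell_descriptors])

# 4 cells x 9-dimensional descriptors -> one 36-dimensional block feature
feat = block_hog([np.full(9, i) for i in range(4)])
assert feat.shape == (36,) and feat[9] == 1
```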
  • In step 206, the HOG features of the blocks in the frequency domain are collected to obtain the HOG feature of the image.
  • The terminal collects the HOG features of the blocks to obtain the HOG feature of the image: the HOG features of the blocks in the image are concatenated into a matrix, each column of which is the HOG feature of one block.
  • As shown in FIG. 2D, the HOG features K_1, K_2, ..., K_i of the blocks are concatenated into a matrix 25: K_1 is placed in the first column 26 of the matrix, K_2 in the second column 27, and so on.
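Collecting the block features column by column, as in FIG. 2D, can be sketched as:

```python
import numpy as np

def image_hog(block_features):
    """HOG feature of the whole image: one matrix whose i-th column
    is the HOG feature K_i of the i-th block."""
    return np.stack([np.asarray(k).ravel() for k in block_features], axis=1)

H = image_hog([np.full(36, i) for i in range(5)])   # 5 blocks, 36-D each
assert H.shape == (36, 5) and (H[:, 3] == 3).all()
```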
  • In summary, the feature extraction method provided by this embodiment divides an image into a number of blocks, each comprising a number of cells; performs a DCT or DFT on each cell; calculates the gradient magnitude and direction of each cell in the frequency domain to obtain a descriptor for each cell; collects the descriptors within each block to obtain the HOG feature of each block; and collects the HOG features of the blocks in the frequency domain to obtain the HOG feature of the image. This solves the problem that the HOG feature extraction process computes directly on the spatial domain of the image, resulting in a low detection rate and accuracy in pattern recognition, and improves the detection rate and accuracy by extracting the HOG feature of the image in the frequency domain.
  • In the process of collecting the HOG features of the blocks to obtain the HOG feature of the image in step 206, the features may also be arranged according to the corresponding positions of the blocks in the image. Step 206 can be replaced with the following steps 206a and 206b, as shown in FIG. 3A:
  • In step 206a, the HOG feature of each block, an L×1 vector obtained by concatenating the descriptors of its cells, is adjusted to an M×N matrix. That is, the terminal first adjusts the L×1 vector of each block into a matrix according to the cells it contains, each column of which is the descriptor of one cell; the descriptor entries are then rearranged according to their corresponding pixels, so that each column of the adjusted matrix is the HOG feature corresponding to the pixels of the corresponding column in the block.
  • In step 206b, the HOG feature of the image is obtained from the adjusted HOG feature of each block and the corresponding position of each block in the image.
  • The HOG features at the corresponding pixel locations in the image are obtained from the adjusted HOG features of the blocks and their positions in the image.
  • As shown in FIG. 3B, each HOG feature K_i is adjusted to an M×N matrix; the matrix 31 adjusted from K_1 is placed at the position of the first block 32 in the image, the matrix 33 adjusted from K_2 at the position of the second block 34, and so on, until the last adjusted matrix is placed at the position of the last block in the image.
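Steps 206a and 206b can be sketched as follows (a hedged sketch: the row-major reshape order and the choice to let later blocks overwrite earlier ones where blocks overlap are assumptions, since the patent does not specify how overlaps are merged):

```python
import numpy as np

def place_block_features(block_features, positions, image_shape, m, n):
    """Adjust each L x 1 block feature (L = m * n) to an m x n matrix and
    write it at the block's top-left position in an image-sized feature map."""
    out = np.zeros(image_shape)
    for feat, (r, c) in zip(block_features, positions):
        out[r:r + m, c:c + n] = np.asarray(feat).reshape(m, n)
    return out

# Four 36-dimensional block features placed at their 6x6 block positions
fmap = place_block_features([np.arange(36.0)] * 4,
                            [(0, 0), (0, 6), (6, 0), (6, 6)],
                            image_shape=(12, 12), m=6, n=6)
assert fmap.shape == (12, 12) and fmap[0, 5] == 5.0 and fmap[6, 6] == 0.0
```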
  • In summary, the feature extraction method provided by this embodiment adjusts the HOG feature of each block in the image from the initial L×1 vector to an M×N matrix, where each block contains M×N pixels and L = M×N, and obtains the HOG feature of the image from the adjusted HOG feature of each block and the corresponding position of each block in the image. The extracted HOG feature of the image thus corresponds to the position of each block in the image, which better highlights the features of each block.
  • FIG. 4 is a block diagram of a feature extraction apparatus according to an exemplary embodiment. As shown in FIG. 4, the feature extraction apparatus includes, but is not limited to:
  • a partitioning module 420 is configured to divide the image into a number of blocks, each block comprising a number of cells.
  • a transformation module 440 is configured to convert each cell from a spatial domain to a frequency domain.
  • the transformation module 440 converts each cell so as to convert the image from the spatial domain to the frequency domain.
  • An extraction module 460 is configured to extract a direction gradient histogram HOG feature of the image in the frequency domain.
  • In summary, the feature extraction apparatus provided by this embodiment divides an image into a number of blocks, each comprising a number of cells, converts each cell from the spatial domain to the frequency domain, and extracts the HOG feature of the image in the frequency domain, thereby improving the detection rate and accuracy in pattern recognition.
  • FIG. 5 is a block diagram of a feature extraction apparatus according to another exemplary embodiment. As shown in FIG. 5, the feature extraction apparatus includes, but is not limited to:
  • the processing module 410 is configured to normalize the image to obtain an image of a predetermined size.
  • Before feature extraction is performed on the image, the processing module 410 normalizes the image, processing images of different sizes into images of a predetermined size, to facilitate unified processing of the image.
  • a partitioning module 420 is configured to divide the image into a number of blocks, each block comprising a number of cells.
  • The dividing module 420 may divide the normalized image by first dividing the image into a number of blocks and then dividing each block into a number of cells.
  • Alternatively, the dividing module 420 may first divide the image into a number of cells and then combine adjacent cells into blocks, each block containing several cells; for example, four adjacent cells arranged in a 2×2 (田-shaped) pattern may be combined to form one block.
  • The order in which the dividing module 420 divides blocks and cells is not specifically limited: blocks may be divided before cells, or cells may be divided first and then combined into blocks.
  • Whether the blocks into which the dividing module 420 divides the image overlap is likewise not specifically limited: there may or may not be an overlapping area between blocks.
  • A transformation module 440 is configured to perform a discrete cosine transform (DCT) on each cell.
  • The transformation module 440 performs a DCT on each cell in the image to convert the image from the spatial domain to the frequency domain.
  • Alternatively, the transformation module 440 is configured to perform a discrete Fourier transform (DFT) on each cell.
  • The transformation module 440 performs a DFT on each cell in the image to convert the image from the spatial domain to the frequency domain.
  • An extraction module 460 is configured to extract a direction gradient histogram HOG feature of the image in the frequency domain.
  • the extraction module 460 can include the following sub-modules:
  • the calculation sub-module 461 is configured to calculate the gradient size and gradient direction of each cell in the frequency domain to obtain a descriptor for each cell.
  • the calculation sub-module 461 calculates a lateral gradient and a vertical gradient of each pixel in each cell after DCT transformation or DFT transformation using a gradient operator.
  • the selection of the gradient operator in this embodiment is not specifically limited.
  • Here θ(x, y) is the gradient direction of the pixel (x, y) and m(x, y) is its gradient magnitude, computed from the lateral gradient G_x(x, y) and vertical gradient G_y(x, y) as m(x, y) = √(G_x(x, y)² + G_y(x, y)²) and θ(x, y) = arctan(G_y(x, y) / G_x(x, y)).
  • The gradient direction θ(x, y) ranges from -90 degrees to 90 degrees; it is divided equally into z bins, and all pixels in each cell vote into the bins with weight m(x, y), so that each cell yields a z-dimensional vector, i.e., the descriptor of that cell.
  • the number of divisions of the gradient direction is not specifically limited.
  • The first statistic sub-module 462 is configured to collect the descriptors within each block in the frequency domain to obtain the HOG feature of each block.
  • the first statistic sub-module 462 performs statistics on the descriptors calculated in each cell included in each block to obtain the HOG feature of each block.
  • The first statistic sub-module 462 may concatenate the descriptors of the cells, so that the HOG feature of each block is a vector whose dimension is k times the descriptor dimension, where k is the number of cells contained in the block.
  • the second statistic sub-module 463 is configured to count the HOG features of the respective blocks in the frequency domain of the image to obtain the HOG features of the image.
  • the second statistic sub-module 463 counts the HOG features of each block to obtain the HOG features of the image.
  • the second statistic sub-module 463 is configured to concatenate the HOG features of each block in the image into a matrix to obtain an HOG feature of the image, and each column of the matrix is a HOG feature of one block.
  • In summary, the feature extraction apparatus provided by this embodiment divides an image into a number of blocks, each comprising a number of cells; performs a DCT or DFT on each cell; calculates the gradient magnitude and direction of each cell in the frequency domain to obtain a descriptor for each cell; collects the descriptors within each block to obtain the HOG feature of each block; and collects the HOG features of the blocks in the frequency domain to obtain the HOG feature of the image. This solves the problem that the HOG feature extraction process computes directly on the spatial domain of the image, which leads to a low detection rate and accuracy in pattern recognition, and improves the detection rate and accuracy by extracting the HOG feature of the image in the frequency domain.
  • Optionally, the second statistic sub-module 463 can include the following sub-modules, as shown in FIG. 6:
  • An adjustment sub-module 610 configured to adjust the HOG feature of each block, an L×1 vector obtained by concatenating the descriptors of its cells, to an M×N matrix. That is, the L×1 vector of each block is first adjusted into a matrix according to the cells it contains, each column of which is the descriptor of one cell; the descriptor entries are then rearranged according to their corresponding pixels, so that each column of the adjusted matrix is the HOG feature corresponding to the pixels of the corresponding column in the block.
  • the feature extraction sub-module 620 is configured to obtain an HOG feature of the image according to the adjusted HOG feature of each block and the corresponding position of each block in the image.
  • the feature extraction sub-module 620 obtains the HOG feature of the corresponding pixel location in the image according to the adjusted HOG feature of each block and the corresponding position of each block in the image.
  • In summary, the feature extraction apparatus provided by this embodiment adjusts the HOG feature of each block in the image from the initial L×1 vector to an M×N matrix, where each block contains M×N pixels and L = M×N, and obtains the HOG feature of the image from the adjusted HOG feature of each block and the corresponding position of each block in the image. The extracted HOG feature of the image thus corresponds to the position of each block in the image, which better highlights the features of each block.
  • An exemplary embodiment of the present disclosure provides a feature extraction device capable of implementing the feature extraction method provided by the present disclosure. The feature extraction device includes a processor and a memory for storing processor-executable instructions, wherein the processor is configured to: divide an image into a number of blocks, each block comprising a number of cells; convert each cell from the spatial domain to the frequency domain; and extract the histogram of oriented gradients (HOG) feature of the image in the frequency domain.
  • FIG. 7 is a block diagram of a feature extraction device, according to an exemplary embodiment.
  • device 700 can be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a gaming console, a tablet device, a medical device, a fitness device, a personal digital assistant, and the like.
  • apparatus 700 can include one or more of the following components: processing component 702, memory 704, power component 706, multimedia component 708, audio component 710, input/output (I/O) interface 712, sensor component 714, and Communication component 716.
  • Processing component 702 typically controls the overall operation of device 700, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations.
  • Processing component 702 can include one or more processors 718 to execute instructions to perform all or part of the steps of the methods described above.
  • The processing component 702 can include one or more modules to facilitate interaction between the processing component 702 and other components.
  • processing component 702 can include a multimedia module to facilitate interaction between multimedia component 708 and processing component 702.
  • Memory 704 is configured to store various types of data to support operation at device 700. Examples of such data include instructions for any application or method operating on device 700, contact data, phone book data, messages, pictures, videos, and the like. Memory 704 can be implemented by any type of volatile or non-volatile storage device, or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read only memory (EEPROM), erasable Programmable Read Only Memory (EPROM), Programmable Read Only Memory (PROM), Read Only Memory (ROM), Magnetic Memory, Flash Memory, Disk or Optical Disk.
  • Power component 706 provides power to various components of device 700.
  • Power component 706 can include a power management system, one or more power sources, and other components associated with generating, managing, and distributing power for device 700.
  • The multimedia component 708 includes a screen providing an output interface between the device 700 and the user.
  • the screen can include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen can be implemented as a touch screen to receive input signals from the user.
  • the touch panel includes one or more touch sensors to sense touches, slides, and gestures on the touch panel. The touch sensor can sense not only the boundaries of the touch or sliding action, but also the duration and pressure associated with the touch or slide operation.
  • the multimedia component 708 includes a front camera and/or a rear camera. When the device 700 is in an operation mode, such as a shooting mode or a video mode, the front camera and/or the rear camera can receive external multimedia data. Each front and rear camera can be a fixed optical lens system or have focal length and optical zoom capabilities.
  • the audio component 710 is configured to output and/or input an audio signal.
  • audio component 710 includes a microphone (MIC) that is configured to receive an external audio signal when device 700 is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode.
  • the received audio signal may be further stored in memory 704 or transmitted via communication component 716.
  • audio component 710 also includes a speaker for outputting an audio signal.
  • the I/O interface 712 provides an interface between the processing component 702 and the peripheral interface module, which may be a keyboard, a click wheel, a button, or the like. These buttons may include, but are not limited to, a home button, a volume button, a start button, and a lock button.
  • Sensor assembly 714 includes one or more sensors for providing device 700 with various aspects of status assessment.
  • sensor component 714 can detect the open/closed state of device 700 and the relative positioning of components (for example, the display and keypad of device 700); sensor component 714 can also detect a change in position of device 700 or of a component of device 700, the presence or absence of user contact with device 700, the orientation or acceleration/deceleration of device 700, and changes in the temperature of device 700.
  • Sensor assembly 714 can include a proximity sensor configured to detect the presence of nearby objects without any physical contact.
  • Sensor component 714 can also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications.
  • the sensor component 714 can also include an acceleration sensor, a gyro sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
  • Communication component 716 is configured to facilitate wired or wireless communication between device 700 and other devices.
  • the device 700 can access a wireless network based on a communication standard, such as Wi-Fi, 2G or 3G, or a combination thereof.
  • communication component 716 receives broadcast signals or broadcast associated information from an external broadcast management system via a broadcast channel.
  • communication component 716 also includes a near field communication (NFC) module to facilitate short-range communication.
  • the NFC module can be implemented based on radio frequency identification (RFID) technology, infrared data association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
  • apparatus 700 may be implemented by one or more application specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components, for performing the feature extraction method described above.
  • a non-transitory computer-readable storage medium including instructions, such as the memory 704 including instructions executable by the processor 718 of the apparatus 700 to perform the feature extraction method described above.
  • the non-transitory computer-readable storage medium may be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, or the like.

Abstract

The present disclosure provides a feature extraction method and device, belonging to the technical field of image processing. The feature extraction method includes: dividing an image into a number of blocks, each block including a number of cells; transforming each cell from the spatial domain to the frequency domain; and extracting the histogram of oriented gradient (HOG) feature of the image in the frequency domain. By extracting the HOG feature of the image in the frequency domain, the method solves the problem that, in the related art, the HOG feature is computed directly in the spatial domain of the image, leading to a low detection rate and low accuracy in pattern recognition, and achieves the effect of extracting the HOG feature of the image in the frequency domain and thereby improving the detection rate and accuracy in pattern recognition.

Description

Feature extraction method and device
This application is based on, and claims priority to, Chinese Patent Application No. 201510827886.1, filed on November 25, 2015, the entire contents of which are incorporated herein by reference.
Technical Field
The present disclosure relates to the technical field of image processing, and in particular to a feature extraction method and device.
Background
Image detection and recognition is an important research area in computer vision. The most common approach in image detection and recognition technology is to detect and recognize an image by extracting certain features from the image.
In the related art, an image is detected and recognized by extracting its HOG (Histogram of Oriented Gradient) feature. The HOG feature is extracted as follows: compute the gradient of each pixel in the image; divide the image into a number of cells, each cell including a number of pixels, with every n adjacent cells forming a block; compute a histogram of the gradients of all pixels in each cell, and obtain the HOG feature of each block from the gradient histograms of all cells in the block; collect the HOG features of all blocks in the image to obtain the HOG feature of the image.
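For orientation, the related-art spatial-domain pipeline described above can be sketched in a few lines of NumPy. This is an illustration only, not the code of any cited implementation; the 8-pixel cell size, 9 orientation bins, and unsigned-gradient convention are assumptions, and real implementations add block grouping and normalization:

```python
import numpy as np

def spatial_hog(image, cell=8, bins=9):
    """Minimal related-art HOG sketch: per-pixel gradients -> per-cell
    magnitude-weighted orientation histograms -> concatenation."""
    gy, gx = np.gradient(image.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.degrees(np.arctan2(gy, gx)) % 180.0  # unsigned orientation in [0, 180)
    h, w = image.shape
    feats = []
    for r in range(0, h - cell + 1, cell):
        for c in range(0, w - cell + 1, cell):
            a = ang[r:r + cell, c:c + cell].ravel()
            m = mag[r:r + cell, c:c + cell].ravel()
            hist, _ = np.histogram(a, bins=bins, range=(0.0, 180.0), weights=m)
            feats.append(hist)
    return np.concatenate(feats)

demo = spatial_hog(np.random.default_rng(0).random((32, 32)))
```

A 32×32 input with 8×8 cells yields 16 cells and a 16 × 9 = 144-dimensional feature vector.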
Summary
To solve the problems in the related art, the present disclosure provides a feature extraction method and device. The technical solutions are as follows:
According to a first aspect of the embodiments of the present disclosure, a feature extraction method is provided, the method including:
dividing an image into a number of blocks, each block including a number of cells;
transforming each cell from the spatial domain to the frequency domain;
extracting the histogram of oriented gradient (HOG) feature of the image in the frequency domain.
In an optional embodiment, transforming each cell from the spatial domain to the frequency domain includes: performing a discrete cosine transform (DCT) on each cell.
In an optional embodiment, transforming each cell from the spatial domain to the frequency domain includes: performing a discrete Fourier transform (DFT) on each cell.
In an optional embodiment, extracting the histogram of oriented gradient (HOG) feature of the image in the frequency domain includes:
computing the gradient magnitude and gradient direction of each cell in the frequency domain to obtain a descriptor of each cell;
collecting the descriptors in each block in the frequency domain to obtain the HOG feature of each block;
collecting the HOG features of the blocks of the image in the frequency domain to obtain the HOG feature of the image.
In an optional embodiment, collecting the HOG features of the blocks of the image to obtain the HOG feature of the image includes:
concatenating the HOG features of the blocks of the image into a matrix, each column of which is the HOG feature of one block, to obtain the HOG feature of the image.
In an optional embodiment, collecting the HOG features of the blocks of the image to obtain the HOG feature of the image includes:
adjusting the HOG feature of each block of the image from an initial L*1-dimensional vector to an M*N matrix, where each block includes M*N pixels and L = M*N;
obtaining the HOG feature of the image according to the adjusted HOG feature of each block and the corresponding position of each block in the image.
In an optional embodiment, the method further includes:
normalizing the image to obtain an image of a predetermined size.
According to a second aspect of the embodiments of the present disclosure, a feature extraction device is provided, the device including:
a division module configured to divide an image into a number of blocks, each block including a number of cells;
a transformation module configured to transform each cell from the spatial domain to the frequency domain;
an extraction module configured to extract the histogram of oriented gradient (HOG) feature of the image in the frequency domain.
In an optional embodiment, the transformation module is configured to perform a discrete cosine transform (DCT) on each cell.
In an optional embodiment, the transformation module is configured to perform a discrete Fourier transform (DFT) on each cell.
In an optional embodiment, the extraction module includes:
a computation sub-module configured to compute the gradient magnitude and gradient direction of each cell in the frequency domain to obtain a descriptor of each cell;
a first statistics sub-module configured to collect the descriptors in each block in the frequency domain to obtain the HOG feature of each block;
a second statistics sub-module configured to collect the HOG features of the blocks of the image in the frequency domain to obtain the HOG feature of the image.
In an optional embodiment, the second statistics sub-module is configured to concatenate the HOG features of the blocks of the image into a matrix, each column of which is the HOG feature of one block, to obtain the HOG feature of the image.
In an optional embodiment, the second statistics sub-module includes:
an adjustment sub-module configured to adjust the HOG feature of each block of the image from an initial L*1-dimensional vector to an M*N matrix, where each block includes M*N pixels and L = M*N;
a feature extraction sub-module configured to obtain the HOG feature of the image according to the adjusted HOG feature of each block and the corresponding position of each block in the image.
In an optional embodiment, the device further includes:
a processing module configured to normalize the image to obtain an image of a predetermined size.
According to a third aspect of the embodiments of the present disclosure, a feature extraction device is provided, the device including:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to:
divide an image into a number of blocks, each block including a number of cells;
transform each cell from the spatial domain to the frequency domain;
extract the histogram of oriented gradient (HOG) feature of the image in the frequency domain.
The technical solutions provided by the embodiments of the present disclosure may include the following beneficial effects:
by dividing an image into a number of blocks, each block including a number of cells, transforming each cell from the spatial domain to the frequency domain, and extracting the histogram of oriented gradient (HOG) feature of the image in the frequency domain, the solutions solve the problem that, in the related art, the HOG feature is computed directly in the spatial domain of the image, leading to a low detection rate and low accuracy in pattern recognition, and achieve the effect of extracting the HOG feature of the image in the frequency domain and thereby improving the detection rate and accuracy in pattern recognition.
It should be understood that the above general description and the following detailed description are merely exemplary and do not limit the present disclosure.
Brief Description of the Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the specification, serve to explain the principles of the present disclosure.
Fig. 1 is a flowchart of a feature extraction method according to an exemplary embodiment;
Fig. 2A is a flowchart of a feature extraction method according to another exemplary embodiment;
Fig. 2B is a schematic diagram of an image division according to an exemplary embodiment;
Fig. 2C is a schematic diagram of an image division according to another exemplary embodiment;
Fig. 2D is a schematic diagram of collecting the HOG features within blocks according to an exemplary embodiment;
Fig. 3A is a flowchart of a feature extraction method according to an exemplary embodiment;
Fig. 3B is a schematic diagram of collecting the HOG features of an image according to an exemplary embodiment;
Fig. 4 is a block diagram of a feature extraction device according to an exemplary embodiment;
Fig. 5 is a block diagram of a feature extraction device according to another exemplary embodiment;
Fig. 6 is a block diagram of sub-modules of a feature extraction device according to an exemplary embodiment;
Fig. 7 is a block diagram of a feature extraction device according to another exemplary embodiment.
Detailed Description
Exemplary embodiments will be described in detail here, examples of which are shown in the accompanying drawings. Where the following description refers to the drawings, the same numerals in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present disclosure; rather, they are merely examples of devices and methods consistent with some aspects of the present disclosure as detailed in the appended claims.
Fig. 1 is a flowchart of a feature extraction method according to an exemplary embodiment. As shown in Fig. 1, this embodiment is illustrated with the method applied to pattern recognition hardware. The method may include the following steps.
In step 102, an image is divided into a number of blocks, each block including a number of cells.
In step 104, each cell is transformed from the spatial domain to the frequency domain.
Each cell is transformed, thereby transforming the image from the spatial domain to the frequency domain.
In step 106, the HOG feature of the image in the frequency domain is extracted.
The HOG feature of the image is extracted in the frequency domain.
To sum up, the feature extraction method provided in the embodiments of the present disclosure divides an image into a number of blocks, each block including a number of cells; transforms each cell from the spatial domain to the frequency domain; and extracts the histogram of oriented gradient (HOG) feature of the image in the frequency domain. This solves the problem that, in the related art, the HOG feature is computed directly in the spatial domain of the image, leading to a low detection rate and low accuracy in pattern recognition, and achieves the effect of extracting the HOG feature of the image in the frequency domain and thereby improving the detection rate and accuracy in pattern recognition.
Fig. 2A is a flowchart of a feature extraction method according to another exemplary embodiment. As shown in Fig. 2A, this embodiment is illustrated with the method applied to pattern recognition hardware. The method may include the following steps:
In step 201, an image is normalized to obtain an image of a predetermined size.
Pattern recognition generally involves feature extraction from multiple images.
Before feature extraction, the terminal normalizes the images, converting images of different sizes into images of a predetermined size, so that the images can be processed uniformly.
In step 202, the image is divided into a number of blocks, each block including a number of cells.
Optionally, the terminal divides the normalized image by dividing the image into a number of blocks and then dividing each block into a number of cells.
Optionally, the terminal divides the normalized image by dividing the image into a number of cells and then grouping adjacent cells into blocks, each block containing a number of cells; for example, four pairwise-adjacent cells arranged in a 2×2 square form one block.
In this embodiment, the order of dividing blocks and dividing cells in the image division process is not specifically limited: the blocks may be divided first and then the cells, or the cells may be divided first and then grouped into blocks.
In this embodiment, whether there are overlapping regions between the divided blocks is not specifically limited: overlapping regions between blocks may or may not exist.
For example, a 128-pixel × 128-pixel image 20 is first divided into non-overlapping 16-pixel × 16-pixel blocks 21, and each 16-pixel × 16-pixel block 21 is then divided into 8-pixel × 8-pixel cells 22; the image then contains 8 × 8 = 64 non-overlapping blocks 21, each block containing 2 × 2 = 4 cells, as shown in Fig. 2B.
For example, a 128-pixel × 128-pixel image 20 is first divided into 16-pixel × 16-pixel blocks 23 with overlapping regions, and each 16-pixel × 16-pixel block 23 is then divided into 8-pixel × 8-pixel cells 24; the image then contains 16 × 16 = 256 blocks 23 with overlapping regions, each block containing 2 × 2 = 4 cells, as shown in Fig. 2C.
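The non-overlapping layout of Fig. 2B (a 128×128 image split into 8 × 8 = 64 blocks of 2 × 2 = 4 cells) can be reproduced with a short sketch. The block and cell sizes are hard-coded to the example from the text; an overlapping layout such as Fig. 2C would use a step smaller than the block size:

```python
import numpy as np

def split_into_blocks_and_cells(img, block=16, cell=8):
    """Partition an image into non-overlapping block x block blocks,
    each made of (block // cell) ** 2 cell x cell cells."""
    h, w = img.shape
    blocks = []
    for r in range(0, h, block):
        for c in range(0, w, block):
            b = img[r:r + block, c:c + block]
            cells = [b[i:i + cell, j:j + cell]
                     for i in range(0, block, cell)
                     for j in range(0, block, cell)]
            blocks.append(cells)
    return blocks

blocks = split_into_blocks_and_cells(np.zeros((128, 128)))
```

With these sizes the function returns 64 blocks, each a list of four 8×8 cells, matching Fig. 2B.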
In step 203, a DCT is performed on each cell.
For each cell in the image, assuming the matrix A formed by the pixels of the cell has size M pixels × N pixels, the DCT (Discrete Cosine Transform) coefficients of the matrix A are given by:
$$B_{p,q}=\alpha_p\,\alpha_q\sum_{m=0}^{M-1}\sum_{n=0}^{N-1}A_{m,n}\cos\frac{\pi(2m+1)p}{2M}\cos\frac{\pi(2n+1)q}{2N}$$
where
$$\alpha_p=\begin{cases}1/\sqrt{M}, & p=0\\[2pt]\sqrt{2/M}, & 1\le p\le M-1\end{cases}\qquad\alpha_q=\begin{cases}1/\sqrt{N}, & q=0\\[2pt]\sqrt{2/N}, & 1\le q\le N-1\end{cases}$$
B_{p,q} are the DCT coefficients of the matrix A, with p = 0, 1, 2, …, M−1, m = 0, 1, 2, …, M−1, q = 0, 1, 2, …, N−1, and n = 0, 1, 2, …, N−1.
Every cell in the image is DCT-transformed, thereby transforming the image from the spatial domain to the frequency domain.
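The cell-wise DCT described above is the orthonormally scaled 2-D DCT-II. A direct, standard-library-only sketch of the double sum (an illustration of the transform, not code from the patent) is:

```python
import math

def dct2(A):
    """2-D orthonormal DCT-II of an M x N cell, computed straight from
    the double-sum formula (O(M^2 N^2); fine for 8x8 cells)."""
    M, N = len(A), len(A[0])

    def alpha(k, K):
        # Orthonormal scaling: 1/sqrt(K) for k = 0, sqrt(2/K) otherwise.
        return math.sqrt(1.0 / K) if k == 0 else math.sqrt(2.0 / K)

    B = [[0.0] * N for _ in range(M)]
    for p in range(M):
        for q in range(N):
            s = 0.0
            for m in range(M):
                for n in range(N):
                    s += (A[m][n]
                          * math.cos(math.pi * (2 * m + 1) * p / (2 * M))
                          * math.cos(math.pi * (2 * n + 1) * q / (2 * N)))
            B[p][q] = alpha(p, M) * alpha(q, N) * s
    return B

cell = [[1.0] * 8 for _ in range(8)]  # constant cell: all energy lands in B[0][0]
coeffs = dct2(cell)
```

For a constant 8×8 cell of ones, only the DC coefficient is non-zero (B[0][0] = 64 / 8 = 8), a quick sanity check that the scaling matches the formula.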
Optionally, the terminal performs a DFT (Discrete Fourier Transform) on each cell.
For each cell in the image, assuming each cell has size M pixels × N pixels and the pixels (x, y) form a function f(x, y), the DFT coefficients F(u, v) of the function f(x, y) are given by:
$$F(u,v)=\sum_{x=0}^{M-1}\sum_{y=0}^{N-1}f(x,y)\,e^{-j2\pi\left(\frac{ux}{M}+\frac{vy}{N}\right)}$$
where u = 0, 1, 2, …, M−1 and v = 0, 1, 2, …, N−1.
The terminal performs the DFT on every cell in the image, thereby transforming the image from the spatial domain to the frequency domain.
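The double-sum DFT above matches NumPy's unnormalized `fft2` convention, which a naive implementation can be checked against (illustrative only; in practice one would use the FFT directly):

```python
import numpy as np

def dft2_naive(f):
    """2-D DFT computed straight from the double-sum formula.
    O(M^2 N^2) -- for verification on small cells only."""
    M, N = f.shape
    F = np.zeros((M, N), dtype=complex)
    for u in range(M):
        for v in range(N):
            for x in range(M):
                for y in range(N):
                    F[u, v] += f[x, y] * np.exp(-2j * np.pi * (u * x / M + v * y / N))
    return F

cell = np.random.default_rng(1).random((8, 8))
F_naive = dft2_naive(cell)
F_fast = np.fft.fft2(cell)  # same coefficients, computed via the FFT
```

The agreement between the naive double sum and `np.fft.fft2` confirms the formula as written.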
In step 204, the gradient magnitude and gradient direction of each cell in the frequency domain are computed to obtain a descriptor of each cell.
The terminal uses a gradient operator to compute the horizontal gradient and vertical gradient of each pixel in each DCT- or DFT-transformed cell. Exemplarily, commonly used gradient operators are shown in Table 1:
[Table 1: commonly used gradient operators — reproduced as an image in the original publication; its contents are not recoverable here]
In this embodiment, any gradient operator in Table 1, or any other gradient operator, may be chosen when computing the gradient magnitudes of the pixels in each cell; the choice of gradient operator is not specifically limited in this embodiment.
Assuming the horizontal gradient of the pixel (x, y) is H(x, y) and its vertical gradient is V(x, y), the gradient direction and gradient magnitude of each pixel are computed by formulas (1) and (2), respectively:
$$\theta(x,y)=\tan^{-1}\!\left[\frac{V(x,y)}{H(x,y)}\right]\qquad(1)$$
$$m(x,y)=\left[H(x,y)^2+V(x,y)^2\right]^{1/2}\qquad(2)$$
where θ(x, y) is the gradient direction of the pixel (x, y) and m(x, y) is the gradient magnitude of the pixel (x, y).
The gradient direction θ(x, y) ranges from −90 degrees to 90 degrees. The range of θ(x, y) is divided evenly into z parts, and for each cell all pixels are accumulated, weighted by m(x, y), over each of the z direction parts; each cell finally yields a z-dimensional vector, that is, the descriptor corresponding to the cell.
For example, the gradient direction θ(x, y) is divided evenly into 9 parts, each spanning 20 degrees; all pixels in each cell are accumulated, weighted by m(x, y), into the 20-degree parts, finally yielding a 9-dimensional vector for each cell.
The number of parts into which the gradient direction is divided is not specifically limited in this embodiment.
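Step 204 can be sketched as follows. The central-difference gradient operator and z = 9 are assumptions for illustration, since the text leaves both the operator (its Table 1 is an image not reproduced here) and the bin count open; the direction follows the arctan(V/H) convention of formula (1), so it lies in (−90°, 90°]:

```python
import numpy as np

def cell_descriptor(cell, z=9):
    """Magnitude-weighted histogram of gradient directions for one cell
    (formulas (1) and (2)), yielding a z-dimensional descriptor."""
    cell = cell.astype(float)
    H = np.zeros_like(cell)
    V = np.zeros_like(cell)
    H[:, 1:-1] = cell[:, 2:] - cell[:, :-2]   # horizontal central difference
    V[1:-1, :] = cell[2:, :] - cell[:-2, :]   # vertical central difference
    m = np.sqrt(H ** 2 + V ** 2)              # formula (2): gradient magnitude
    with np.errstate(divide="ignore", invalid="ignore"):
        theta = np.degrees(np.arctan(V / H))  # formula (1): direction in (-90, 90]
    theta = np.nan_to_num(theta)              # 0/0 pixels -> 0 degrees (m = 0 there)
    hist, _ = np.histogram(theta, bins=z, range=(-90.0, 90.0), weights=m)
    return hist

desc = cell_descriptor(np.random.default_rng(2).random((8, 8)))
```

Each cell thus yields a z-dimensional (here 9-dimensional) non-negative vector, its descriptor.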
In step 205, the descriptors in each block in the frequency domain are collected to obtain the HOG feature of each block.
The terminal collects the descriptors computed for the cells contained in each block to obtain the HOG feature of the block.
When collecting the descriptors computed for the cells, the terminal may concatenate the descriptors of the cells, so that the HOG feature of each block is a vector whose dimension is k times the dimension of a cell descriptor, k being the number of cells contained in the block.
For example, if the descriptor of each cell is a 9-dimensional vector and each block contains 4 cells, the 9-dimensional descriptors of the 4 cells are concatenated into a 36-dimensional vector, which serves as the HOG feature of the corresponding block.
In step 206, the HOG features of the blocks of the image in the frequency domain are collected to obtain the HOG feature of the image.
The terminal collects the HOG features of the blocks to obtain the HOG feature of the image: the HOG features of the blocks of the image are concatenated into a matrix, each column of which is the HOG feature of one block, to obtain the HOG feature of the image.
For example, if the image contains K blocks and the HOG feature of the i-th block is K_i, the K HOG features are concatenated into a matrix 25: K_1 is placed in the first column 26 of the concatenated matrix, K_2 in the second column 27, and so on, as shown in Fig. 2D.
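The per-block concatenation of step 205 is a one-liner; the 2×2-cell, 9-bin sizes below mirror the example in the text:

```python
import numpy as np

def block_feature(cell_descriptors):
    """Concatenate the z-dimensional descriptors of a block's cells into
    one k*z-dimensional HOG vector for the block."""
    return np.concatenate(cell_descriptors)

four_cells = [np.arange(9, dtype=float) for _ in range(4)]  # 4 cells, z = 9
feat = block_feature(four_cells)
```

Four 9-dimensional cell descriptors produce a 36-dimensional block feature, as in the example above.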
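The column-wise arrangement of Fig. 2D can be sketched directly (illustrative; the 36-dimensional features and K = 64 blocks follow the running example):

```python
import numpy as np

def image_hog_matrix(block_features):
    """Stack each block's HOG vector as one column, as in Fig. 2D:
    column i of the result is the feature of block K_(i+1)."""
    return np.stack(block_features, axis=1)

K = 64  # e.g. the 64 non-overlapping blocks of the 128x128 example
feats = [np.full(36, i, dtype=float) for i in range(K)]
Mmat = image_hog_matrix(feats)
```

The result is a 36 × K matrix whose i-th column is the i-th block's HOG feature.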
To sum up, the feature extraction method provided in the embodiments of the present disclosure divides an image into a number of blocks, each block including a number of cells; performs a DCT or a DFT on each cell; computes the gradient magnitude and gradient direction of each cell in the frequency domain to obtain a descriptor of each cell; collects the descriptors in each block in the frequency domain to obtain the HOG feature of each block; and collects the HOG features of the blocks of the image in the frequency domain to obtain the HOG feature of the image. This solves the problem that, in the related art, the HOG feature is computed directly in the spatial domain of the image, leading to a low detection rate and low accuracy in pattern recognition, and achieves the effect of extracting the HOG feature of the image in the frequency domain and thereby improving the detection rate and accuracy in pattern recognition.
In an optional embodiment based on Fig. 2A, when collecting the HOG features of the blocks of the image to obtain the HOG feature of the image, the features may be arranged according to their corresponding positions in the image. Step 206 may be replaced by the following steps 206a and 206b, as shown in Fig. 3A:
In step 206a, the HOG feature of each block of the image is adjusted from an initial L*1-dimensional vector to an M*N matrix, where each block includes M*N pixels and L = M*N.
The HOG feature of each block is an L*1-dimensional vector obtained by concatenating the descriptors of the cells. The terminal adjusts the L*1-dimensional vector into an M*N matrix; that is, the L*1-dimensional vector of each block is first rearranged, according to the cells the block contains, into a corresponding matrix whose columns are the descriptors of the individual cells; the descriptor of each cell is then rearranged according to the corresponding pixels, so that each column of the resulting matrix is the HOG feature corresponding to the pixels of the corresponding column in the block.
In step 206b, the HOG feature of the image is obtained according to the adjusted HOG feature of each block and the corresponding position of each block in the image.
According to the adjusted HOG feature of each block and the corresponding position of each block in the image, the HOG features at the corresponding pixel positions of the image are obtained.
For example, if the image contains K blocks and the HOG feature of the i-th block is K_i, the K HOG features are each adjusted to an M*N matrix: the adjusted matrix 31 of K_1 is placed at the corresponding position of the first block 32 in the image, the adjusted matrix 33 of K_2 at the corresponding position of the second block 34, and so on, with the last matrix placed at the corresponding position of the last block in the image, as shown in Fig. 3B.
To sum up, in the feature extraction method provided in this embodiment, the HOG feature of each block of the image is adjusted from an initial L*1-dimensional vector to an M*N matrix, where each block includes M*N pixels and L = M*N, and the HOG feature of the image is obtained according to the adjusted HOG feature of each block and the corresponding position of each block in the image, so that the extracted HOG feature of the image corresponds to the position of each block in the image and the features of the individual blocks of the image can be better highlighted.
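Steps 206a and 206b can be sketched as follows. The 16×16 blocks on a 128×128 image (so L = 256) follow the running example; the column-major reshape is one consistent reading of the description that each column of the adjusted matrix corresponds to one column of the block, and the exact ordering is an assumption:

```python
import numpy as np

def positional_hog(block_feats, grid, M=16, N=16):
    """Step 206a: reshape each block's L*1 HOG vector (L = M*N) to an
    M x N matrix. Step 206b: place it at the block's position in the
    image grid, yielding an image-sized feature map."""
    rows, cols = grid
    out = np.zeros((rows * M, cols * N))
    for idx, v in enumerate(block_feats):
        r, c = divmod(idx, cols)
        # Column-major reshape: column j of the matrix is entries v[j*M : (j+1)*M].
        out[r * M:(r + 1) * M, c * N:(c + 1) * N] = np.asarray(v).reshape(M, N, order="F")
    return out

feats = [np.arange(256, dtype=float) for _ in range(64)]  # 64 blocks, L = 256
hog_map = positional_hog(feats, grid=(8, 8))
```

Each 256-dimensional block feature becomes a 16×16 tile placed at its block's position, so the result has the image's 128×128 shape.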
The following are device embodiments of the present disclosure, which may be used to perform the method embodiments of the present disclosure. For details not disclosed in the device embodiments of the present disclosure, refer to the method embodiments of the present disclosure.
Fig. 4 is a block diagram of a feature extraction device according to an exemplary embodiment. As shown in Fig. 4, the feature extraction device includes, but is not limited to:
a division module 420, configured to divide an image into a number of blocks, each block including a number of cells;
a transformation module 440, configured to transform each cell from the spatial domain to the frequency domain;
the transformation module 440 transforms each cell, thereby transforming the image from the spatial domain to the frequency domain;
an extraction module 460, configured to extract the histogram of oriented gradient (HOG) feature of the image in the frequency domain.
To sum up, the feature extraction device provided in the embodiments of the present disclosure divides an image into a number of blocks, each block including a number of cells; transforms each cell from the spatial domain to the frequency domain; and extracts the histogram of oriented gradient (HOG) feature of the image in the frequency domain. This solves the problem that, in the related art, the HOG feature is computed directly in the spatial domain of the image, leading to a low detection rate and low accuracy in pattern recognition, and achieves the effect of extracting the HOG feature of the image in the frequency domain and thereby improving the detection rate and accuracy in pattern recognition.
Fig. 5 is a block diagram of a feature extraction device according to another exemplary embodiment. As shown in Fig. 5, the feature extraction device includes, but is not limited to:
a processing module 410, configured to normalize an image to obtain an image of a predetermined size.
Pattern recognition generally involves feature extraction from multiple images.
Before feature extraction, the processing module 410 normalizes the images, converting images of different sizes into images of a predetermined size, so that the images can be processed uniformly.
A division module 420 is configured to divide the image into a number of blocks, each block including a number of cells.
Optionally, the division module 420 divides the normalized image by dividing the image into a number of blocks and then dividing each block into a number of cells.
Optionally, the division module 420 divides the normalized image by dividing the image into a number of cells and then grouping adjacent cells into blocks, each block containing a number of cells; for example, four pairwise-adjacent cells arranged in a 2×2 square form one block.
In this embodiment, the order in which the division module 420 divides blocks and cells is not specifically limited: the blocks may be divided first and then the cells, or the cells may be divided first and then grouped into blocks.
In this embodiment, whether there are overlapping regions between the blocks divided by the division module 420 is not specifically limited: overlapping regions between blocks may or may not exist.
A transformation module 440 is configured to perform a discrete cosine transform (DCT) on each cell.
For each cell in the image, assuming the matrix A formed by the pixels of the cell has size M pixels × N pixels, the DCT (Discrete Cosine Transform) coefficients of the matrix A are given by:
$$B_{p,q}=\alpha_p\,\alpha_q\sum_{m=0}^{M-1}\sum_{n=0}^{N-1}A_{m,n}\cos\frac{\pi(2m+1)p}{2M}\cos\frac{\pi(2n+1)q}{2N}$$
where
$$\alpha_p=\begin{cases}1/\sqrt{M}, & p=0\\[2pt]\sqrt{2/M}, & 1\le p\le M-1\end{cases}\qquad\alpha_q=\begin{cases}1/\sqrt{N}, & q=0\\[2pt]\sqrt{2/N}, & 1\le q\le N-1\end{cases}$$
B_{p,q} are the DCT coefficients of the matrix A, with p = 0, 1, 2, …, M−1, m = 0, 1, 2, …, M−1, q = 0, 1, 2, …, N−1, and n = 0, 1, 2, …, N−1.
The transformation module 440 performs the DCT on every cell in the image, thereby transforming the image from the spatial domain to the frequency domain.
Optionally, the transformation module 440 is configured to perform a DFT (Discrete Fourier Transform) on each cell.
For each cell in the image, assuming each cell has size M pixels × N pixels and the pixels form a function f(x, y), the DFT coefficients F(u, v) of the function f(x, y) are given by:
$$F(u,v)=\sum_{x=0}^{M-1}\sum_{y=0}^{N-1}f(x,y)\,e^{-j2\pi\left(\frac{ux}{M}+\frac{vy}{N}\right)}$$
where u = 0, 1, 2, …, M−1, v = 0, 1, 2, …, N−1, and (x, y) is the position of a pixel.
The transformation module 440 performs the DFT on every cell in the image, thereby transforming the image from the spatial domain to the frequency domain.
An extraction module 460 is configured to extract the histogram of oriented gradient (HOG) feature of the image in the frequency domain.
In this embodiment, the extraction module 460 may include the following sub-modules:
a computation sub-module 461, configured to compute the gradient magnitude and gradient direction of each cell in the frequency domain to obtain a descriptor of each cell.
The computation sub-module 461 uses a gradient operator to compute the horizontal gradient and vertical gradient of each pixel in each DCT- or DFT-transformed cell.
The choice of gradient operator is not specifically limited in this embodiment.
Assuming the horizontal gradient of a pixel is H(x, y) and its vertical gradient is V(x, y), the gradient direction and gradient magnitude of each pixel are computed by formulas (1) and (2), respectively:
$$\theta(x,y)=\tan^{-1}\!\left[\frac{V(x,y)}{H(x,y)}\right]\qquad(1)$$
$$m(x,y)=\left[H(x,y)^2+V(x,y)^2\right]^{1/2}\qquad(2)$$
where θ(x, y) is the gradient direction of the pixel (x, y) and m(x, y) is the gradient magnitude of the pixel (x, y).
The gradient direction θ(x, y) ranges from −90 degrees to 90 degrees. The range of θ(x, y) is divided evenly into z parts, and for each cell all pixels are accumulated, weighted by m(x, y), over each of the z direction parts; each cell finally yields a z-dimensional vector, that is, the descriptor corresponding to the cell.
The number of parts into which the gradient direction is divided is not specifically limited in this embodiment.
A first statistics sub-module 462 is configured to collect the descriptors in each block in the frequency domain to obtain the HOG feature of each block.
The first statistics sub-module 462 collects the descriptors computed for the cells contained in each block to obtain the HOG feature of the block.
When collecting the descriptors computed for the cells, the first statistics sub-module 462 may concatenate the descriptors of the cells, so that the HOG feature of each block is a vector whose dimension is k times the dimension of a cell descriptor, k being the number of cells contained in the block.
A second statistics sub-module 463 is configured to collect the HOG features of the blocks of the image in the frequency domain to obtain the HOG feature of the image.
The second statistics sub-module 463 collects the HOG features of the blocks to obtain the HOG feature of the image.
Optionally, the second statistics sub-module 463 is configured to concatenate the HOG features of the blocks of the image into a matrix, each column of which is the HOG feature of one block, to obtain the HOG feature of the image.
To sum up, the feature extraction device provided in the embodiments of the present disclosure divides an image into a number of blocks, each block including a number of cells; performs a DCT or a DFT on each cell; computes the gradient magnitude and gradient direction of each cell in the frequency domain to obtain a descriptor of each cell; collects the descriptors in each block in the frequency domain to obtain the HOG feature of each block; and collects the HOG features of the blocks of the image in the frequency domain to obtain the HOG feature of the image. This solves the problem that, in the related art, the HOG feature is computed directly in the spatial domain of the image, leading to a low detection rate and low accuracy in pattern recognition, and achieves the effect of extracting the HOG feature of the image in the frequency domain and thereby improving the detection rate and accuracy in pattern recognition.
In an optional embodiment based on Fig. 5, the second statistics sub-module 463 may include the following sub-modules, as shown in Fig. 6:
an adjustment sub-module 610, configured to adjust the HOG feature of each block of the image from an initial L*1-dimensional vector to an M*N matrix, where each block includes M*N pixels and L = M*N.
The HOG feature of each block is an L*1-dimensional vector obtained by concatenating the descriptors of the cells. The adjustment sub-module 610 adjusts the L*1-dimensional vector into an M*N matrix; that is, the L*1-dimensional vector of each block is first rearranged, according to the cells the block contains, into a corresponding matrix whose columns are the descriptors of the individual cells; the descriptor of each cell is then rearranged according to the corresponding pixels, so that each column of the resulting matrix is the HOG feature corresponding to the pixels of the corresponding column in the block.
A feature extraction sub-module 620 is configured to obtain the HOG feature of the image according to the adjusted HOG feature of each block and the corresponding position of each block in the image.
The feature extraction sub-module 620 obtains the HOG features at the corresponding pixel positions of the image according to the adjusted HOG feature of each block and the corresponding position of each block in the image.
To sum up, in the feature extraction device provided in this embodiment, the HOG feature of each block of the image is adjusted from an initial L*1-dimensional vector to an M*N matrix, where each block includes M*N pixels and L = M*N, and the HOG feature of the image is obtained according to the adjusted HOG feature of each block and the corresponding position of each block in the image, so that the extracted HOG feature of the image corresponds to the position of each block in the image and the features of the individual blocks of the image can be better highlighted.
With regard to the devices in the above embodiments, the specific manner in which each module performs operations has been described in detail in the embodiments of the related method, and will not be elaborated here.
An exemplary embodiment of the present disclosure provides a feature extraction device capable of implementing the feature extraction method provided by the present disclosure. The feature extraction device includes: a processor, and a memory for storing processor-executable instructions;
wherein the processor is configured to:
divide an image into a number of blocks, each block including a number of cells;
transform each cell from the spatial domain to the frequency domain;
extract the histogram of oriented gradient (HOG) feature of the image in the frequency domain.
Fig. 7 is a block diagram of a feature extraction device according to an exemplary embodiment. For example, the device 700 may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, fitness equipment, a personal digital assistant, or the like.
Referring to Fig. 7, the device 700 may include one or more of the following components: a processing component 702, a memory 704, a power component 706, a multimedia component 708, an audio component 710, an input/output (I/O) interface 712, a sensor component 714, and a communication component 716.
The processing component 702 generally controls the overall operation of the device 700, such as operations associated with display, telephone calls, data communication, camera operation, and recording operations. The processing component 702 may include one or more processors 718 to execute instructions to complete all or part of the steps of the above method. In addition, the processing component 702 may include one or more modules to facilitate interaction between the processing component 702 and other components. For example, the processing component 702 may include a multimedia module to facilitate interaction between the multimedia component 708 and the processing component 702.
The memory 704 is configured to store various types of data to support operation at the device 700. Examples of such data include instructions for any application or method operating on the device 700, contact data, phone book data, messages, pictures, videos, and so on. The memory 704 may be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, a magnetic disk, or an optical disk.
The power component 706 provides power to the various components of the device 700. The power component 706 may include a power management system, one or more power sources, and other components associated with generating, managing, and distributing power for the device 700.
The multimedia component 708 includes a screen that provides an output interface between the device 700 and the user. In some embodiments, the screen may include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touches, swipes, and gestures on the touch panel. The touch sensors may sense not only the boundary of a touch or swipe action, but also the duration and pressure associated with the touch or swipe operation. In some embodiments, the multimedia component 708 includes a front camera and/or a rear camera. When the device 700 is in an operation mode, such as a shooting mode or a video mode, the front camera and/or the rear camera can receive external multimedia data. Each front or rear camera may be a fixed optical lens system or have focus and optical zoom capability.
The audio component 710 is configured to output and/or input audio signals. For example, the audio component 710 includes a microphone (MIC) configured to receive external audio signals when the device 700 is in an operation mode, such as a call mode, a recording mode, or a voice recognition mode. The received audio signals may be further stored in the memory 704 or transmitted via the communication component 716. In some embodiments, the audio component 710 further includes a speaker for outputting audio signals.
The I/O interface 712 provides an interface between the processing component 702 and peripheral interface modules, which may be a keyboard, a click wheel, buttons, and the like. These buttons may include, but are not limited to, a home button, volume buttons, a start button, and a lock button.
The sensor component 714 includes one or more sensors for providing status assessments of various aspects of the device 700. For example, the sensor component 714 may detect the open/closed state of the device 700 and the relative positioning of components (for example, the display and keypad of the device 700); the sensor component 714 may also detect a change in position of the device 700 or of a component of the device 700, the presence or absence of user contact with the device 700, the orientation or acceleration/deceleration of the device 700, and changes in the temperature of the device 700. The sensor component 714 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact. The sensor component 714 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor component 714 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 716 is configured to facilitate wired or wireless communication between the device 700 and other devices. The device 700 may access a wireless network based on a communication standard, such as Wi-Fi, 2G, or 3G, or a combination thereof. In one exemplary embodiment, the communication component 716 receives broadcast signals or broadcast-related information from an external broadcast management system via a broadcast channel. In one exemplary embodiment, the communication component 716 further includes a near field communication (NFC) module to facilitate short-range communication. For example, the NFC module may be implemented based on radio frequency identification (RFID) technology, Infrared Data Association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the device 700 may be implemented by one or more application specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components, for performing the above feature extraction method.
In an exemplary embodiment, there is also provided a non-transitory computer-readable storage medium including instructions, such as the memory 704 including instructions executable by the processor 718 of the device 700 to perform the above feature extraction method. For example, the non-transitory computer-readable storage medium may be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, or the like.
Those skilled in the art will easily conceive of other embodiments of the present disclosure after considering the specification and practicing the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the present disclosure that follow the general principles of the present disclosure and include common knowledge or customary technical means in the technical field not disclosed by the present disclosure. The specification and embodiments are to be regarded as exemplary only, with the true scope and spirit of the present disclosure being indicated by the following claims.
It should be understood that the present disclosure is not limited to the precise structures described above and shown in the accompanying drawings, and various modifications and changes may be made without departing from its scope. The scope of the present disclosure is limited only by the appended claims.

Claims (15)

  1. A feature extraction method, characterized in that the method comprises:
    dividing an image into a number of blocks, each of the blocks comprising a number of cells;
    transforming each of the cells from the spatial domain to the frequency domain;
    extracting the histogram of oriented gradient (HOG) feature of the image in the frequency domain.
  2. The method according to claim 1, characterized in that the transforming each of the cells from the spatial domain to the frequency domain comprises: performing a discrete cosine transform (DCT) on each of the cells.
  3. The method according to claim 1, characterized in that the transforming each of the cells from the spatial domain to the frequency domain comprises: performing a discrete Fourier transform (DFT) on each of the cells.
  4. The method according to claim 2, characterized in that the extracting the histogram of oriented gradient (HOG) feature of the image in the frequency domain comprises:
    computing the gradient magnitude and gradient direction of each of the cells in the frequency domain to obtain a descriptor of each of the cells;
    collecting the descriptors in each of the blocks in the frequency domain to obtain the HOG feature of each of the blocks;
    collecting the HOG features of the blocks of the image in the frequency domain to obtain the HOG feature of the image.
  5. The method according to claim 4, characterized in that the collecting the HOG features of the blocks of the image to obtain the HOG feature of the image comprises:
    concatenating the HOG features of the blocks of the image into a matrix, each column of which is the HOG feature of one of the blocks, to obtain the HOG feature of the image.
  6. The method according to claim 4, characterized in that the collecting the HOG features of the blocks of the image to obtain the HOG feature of the image comprises:
    adjusting the HOG feature of each of the blocks of the image from an initial L*1-dimensional vector to an M*N matrix, each of the blocks comprising M*N pixels, L = M*N;
    obtaining the HOG feature of the image according to the adjusted HOG feature of each of the blocks and the corresponding position of each of the blocks in the image.
  7. The method according to any one of claims 1 to 6, characterized in that the method further comprises:
    normalizing the image to obtain the image of a predetermined size.
  8. A feature extraction device, characterized in that the device comprises:
    a division module configured to divide an image into a number of blocks, each of the blocks comprising a number of cells;
    a transformation module configured to transform each of the cells from the spatial domain to the frequency domain;
    an extraction module configured to extract the histogram of oriented gradient (HOG) feature of the image in the frequency domain.
  9. The device according to claim 8, characterized in that the transformation module is configured to perform a discrete cosine transform (DCT) on each of the cells.
  10. The device according to claim 8, characterized in that the transformation module is configured to perform a discrete Fourier transform (DFT) on each of the cells.
  11. The device according to claim 9, characterized in that the extraction module comprises:
    a computation sub-module configured to compute the gradient magnitude and gradient direction of each of the cells in the frequency domain to obtain a descriptor of each of the cells;
    a first statistics sub-module configured to collect the descriptors in each of the blocks in the frequency domain to obtain the HOG feature of each of the blocks;
    a second statistics sub-module configured to collect the HOG features of the blocks of the image in the frequency domain to obtain the HOG feature of the image.
  12. The device according to claim 11, characterized in that the second statistics sub-module is configured to concatenate the HOG features of the blocks of the image into a matrix, each column of which is the HOG feature of one of the blocks, to obtain the HOG feature of the image.
  13. The device according to claim 11, characterized in that the second statistics sub-module comprises:
    an adjustment sub-module configured to adjust the HOG feature of each of the blocks of the image from an initial L*1-dimensional vector to an M*N matrix, each of the blocks comprising M*N pixels, L = M*N;
    a feature extraction sub-module configured to obtain the HOG feature of the image according to the adjusted HOG feature of each of the blocks and the corresponding position of each of the blocks in the image.
  14. The device according to any one of claims 8 to 13, characterized in that the device further comprises:
    a processing module configured to normalize the image to obtain the image of a predetermined size.
  15. A feature extraction device, characterized in that the device comprises:
    a processor;
    a memory for storing instructions executable by the processor;
    wherein the processor is configured to:
    divide an image into a number of blocks, each of the blocks comprising a number of cells;
    transform each of the cells from the spatial domain to the frequency domain;
    extract the histogram of oriented gradient (HOG) feature of the image in the frequency domain.
PCT/CN2015/099310 2015-11-25 2015-12-29 Feature extraction method and device WO2017088249A1 (zh)

Priority Applications (4)

Application Number Priority Date Filing Date Title
JP2017552215A JP6378453B2 (ja) 2015-11-25 2015-12-29 特徴抽出方法及び装置
KR1020167005590A KR101754046B1 (ko) 2015-11-25 2015-12-29 특징 추출 방법 및 장치
MX2016003738A MX2016003738A (es) 2015-11-25 2015-12-29 Metodo y dispositivo para extraer una caracteristica.
RU2016110721A RU2632578C2 (ru) 2015-11-25 2015-12-29 Способ и устройство выделения характеристики

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201510827886.1A CN105654093B (zh) 2015-11-25 2015-11-25 特征提取方法及装置
CN201510827886.1 2015-11-25

Publications (1)

Publication Number Publication Date
WO2017088249A1 true WO2017088249A1 (zh) 2017-06-01

Family

ID=56482145

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2015/099310 WO2017088249A1 (zh) 2015-11-25 2015-12-29 特征提取方法及装置

Country Status (8)

Country Link
US (1) US10297015B2 (zh)
EP (1) EP3173976A1 (zh)
JP (1) JP6378453B2 (zh)
KR (1) KR101754046B1 (zh)
CN (1) CN105654093B (zh)
MX (1) MX2016003738A (zh)
RU (1) RU2632578C2 (zh)
WO (1) WO2017088249A1 (zh)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105654093B (zh) 2015-11-25 2018-09-25 小米科技有限责任公司 特征提取方法及装置
CN107451583A (zh) * 2017-08-03 2017-12-08 四川长虹电器股份有限公司 票据图像特征提取的方法
CN107633226B (zh) * 2017-09-19 2021-12-24 北京师范大学珠海分校 一种人体动作跟踪特征处理方法
CN107832667A (zh) * 2017-10-11 2018-03-23 哈尔滨理工大学 一种基于深度学习的人脸识别方法
CN107516094A (zh) * 2017-10-12 2017-12-26 北京军秀咨询有限公司 一种基于人脸图像处理的人才测评方法
CN111243088A (zh) * 2020-01-08 2020-06-05 长春工程学院 工程地质勘察中的真三维航空遥感地质解译方法及系统

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102254185A (zh) * 2011-07-21 2011-11-23 西安电子科技大学 基于对比度敏感函数的背景杂波量化方法
US20120099790A1 (en) * 2010-10-20 2012-04-26 Electronics And Telecommunications Research Institute Object detection device and system
US20140169663A1 (en) * 2012-12-19 2014-06-19 Futurewei Technologies, Inc. System and Method for Video Detection and Tracking
CN103903238A (zh) * 2014-03-21 2014-07-02 西安理工大学 图像特征的显著结构和相关结构融合方法
CN105046224A (zh) * 2015-07-16 2015-11-11 东华大学 基于分块自适应加权梯度方向直方图特征的人脸识别方法

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH0757077A (ja) 1993-08-13 1995-03-03 Ricoh Co Ltd 画像処理装置
JPH1056570A (ja) * 1996-08-12 1998-02-24 Fuji Xerox Co Ltd 画像処理装置
US8625861B2 (en) * 2008-05-15 2014-01-07 International Business Machines Corporation Fingerprint representation using gradient histograms
RU2461017C1 (ru) * 2011-04-15 2012-09-10 Государственное образовательное учреждение высшего профессионального образования "Военная академия войсковой противовоздушной обороны Вооруженных Сил Российской Федерации" имени Маршала Советского Союза А.М. Василевского Способ обнаружения точечных тепловых объектов на сложном атмосферном фоне
ITVI20120041A1 (it) * 2012-02-22 2013-08-23 St Microelectronics Srl Rilevazione di caratteristiche di un'immagine
AU2013312495B2 (en) 2012-09-05 2019-03-21 Element, Inc. Biometric authentication in connection with camera-equipped devices
KR101407070B1 (ko) 2012-09-28 2014-06-12 한국전자통신연구원 영상기반 사람 검출을 위한 특징 추출 방법 및 장치
TWI475495B (zh) 2013-02-04 2015-03-01 Wistron Corp 圖像的識別方法、電子裝置與電腦程式產品
CN104268528B (zh) * 2014-09-28 2017-10-17 中智科创机器人有限公司 一种人群聚集区域检测方法和装置
CN104866865B (zh) * 2015-05-11 2018-03-16 西南交通大学 一种基于dhog和离散余弦变换的接触网平衡线故障检测方法
CN105654093B (zh) * 2015-11-25 2018-09-25 小米科技有限责任公司 特征提取方法及装置

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120099790A1 (en) * 2010-10-20 2012-04-26 Electronics And Telecommunications Research Institute Object detection device and system
CN102254185A (zh) * 2011-07-21 2011-11-23 西安电子科技大学 基于对比度敏感函数的背景杂波量化方法
US20140169663A1 (en) * 2012-12-19 2014-06-19 Futurewei Technologies, Inc. System and Method for Video Detection and Tracking
CN103903238A (zh) * 2014-03-21 2014-07-02 西安理工大学 图像特征的显著结构和相关结构融合方法
CN105046224A (zh) * 2015-07-16 2015-11-11 东华大学 基于分块自适应加权梯度方向直方图特征的人脸识别方法

Also Published As

Publication number Publication date
MX2016003738A (es) 2018-06-22
RU2016110721A (ru) 2017-09-28
US10297015B2 (en) 2019-05-21
RU2632578C2 (ru) 2017-10-06
CN105654093B (zh) 2018-09-25
KR101754046B1 (ko) 2017-07-04
US20170148147A1 (en) 2017-05-25
JP2018504729A (ja) 2018-02-15
JP6378453B2 (ja) 2018-08-22
CN105654093A (zh) 2016-06-08
KR20170074214A (ko) 2017-06-29
EP3173976A1 (en) 2017-05-31

Similar Documents

Publication Publication Date Title
WO2017088249A1 (zh) 特征提取方法及装置
WO2017088266A1 (zh) 图片处理方法及装置
EP3032821B1 (en) Method and device for shooting a picture
US9959484B2 (en) Method and apparatus for generating image filter
WO2017088250A1 (zh) 特征提取方法及装置
WO2021051949A1 (zh) 一种图像处理方法及装置、电子设备和存储介质
TWI702544B (zh) 圖像處理方法、電子設備和電腦可讀儲存介質
WO2017215224A1 (zh) 指纹录入提示方法和装置
WO2017020476A1 (zh) 关联用户的确定方法及装置
WO2017143776A1 (zh) 图片类型的识别方法及装置
WO2015196715A1 (zh) 图像重定位方法、装置及终端
WO2017000491A1 (zh) 获取虹膜图像的方法、装置及红膜识别设备
CN109034150B (zh) 图像处理方法及装置
WO2017088248A1 (zh) 特征提取方法及装置
WO2017088259A1 (zh) 屏幕保护方法及装置
US9665925B2 (en) Method and terminal device for retargeting images
EP3088941A1 (en) Liquid crystal display, method and device for measuring pressure
CN111985280B (zh) 图像处理方法及装置
US20140301651A1 (en) Method and device for performing spatial filtering process on image

Legal Events

Date Code Title Description
ENP Entry into the national phase

Ref document number: 2017552215

Country of ref document: JP

Kind code of ref document: A

ENP Entry into the national phase

Ref document number: 20167005590

Country of ref document: KR

Kind code of ref document: A

WWE Wipo information: entry into national phase

Ref document number: MX/A/2016/003738

Country of ref document: MX

ENP Entry into the national phase

Ref document number: 2016110721

Country of ref document: RU

Kind code of ref document: A

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15909160

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 15909160

Country of ref document: EP

Kind code of ref document: A1