CN106355199A - Accelerator and method of K-nearest neighbor - Google Patents

Accelerator and method of K-nearest neighbor

Publication number
CN106355199A
Authority
CN
China
Prior art keywords
module
nearest neighbor
address
distance
neighbor algorithm
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610716367.2A
Other languages
Chinese (zh)
Inventor
朱亚涛
张志敏
范东睿
王达
张�浩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Smartcore Beijing Co ltd
Institute of Computing Technology of CAS
Original Assignee
Smartcore Beijing Co ltd
Institute of Computing Technology of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Smartcore Beijing Co ltd, Institute of Computing Technology of CAS
Priority to CN201610716367.2A
Publication of CN106355199A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses an accelerator and method for the K-nearest neighbor (KNN) algorithm, relating to the fields of information retrieval, data mining and computer architecture. The accelerator comprises: a control module, used to control each module in the accelerator; an address calculation module, used to obtain the address of a training sample; a Euclidean distance calculation module, used to calculate distances during the K-nearest neighbor search; a result module, used to store and transmit the distance; and a sorting module, used to perform K-nearest neighbor sorting on the distances. The result module transmits the distance to the sorting module and sends a next-address signal to the address calculation module; the address calculation module receives the signal and calculates the address of the next training sample; and the control module clears the contents of the result module. The accelerator and method adapt to KNN algorithms on samples of different dimensions by adding or removing components, and meet different requirements for samples of the same dimension by adjusting the degree of parallelism.

Description

An accelerator and method for the k-nearest neighbor algorithm
Technical field
The present invention relates to the fields of information retrieval, data mining and computer architecture, and in particular to an accelerator and method for the k-nearest neighbor algorithm.
Background
The kNN algorithm (k-nearest neighbor) is a non-parametric classification algorithm. It achieves high classification accuracy on data of unknown distribution and a low error rate on non-normally distributed data, and is therefore widely used in fields such as machine learning, stochastic programming and pattern recognition. Because kNN classification must compute the distance between the feature vector to be classified and every sample in the labeled training set, the computational complexity is high when the sample dimension and the number of samples are large. A software implementation is slow and has poor real-time performance, which limits its use in practical systems. Much research has therefore focused on speeding up kNN, for example by reducing computational complexity and raising search speed through fast kNN methods. These include terminating comparisons early through partial distance search, and transforming samples to decide in advance whether they can belong to the k nearest neighbors of the sample to be classified. Specific work of this kind includes kNN algorithms using tree structures, fast kNN algorithms based on the wavelet transform, and kNN algorithms based on a pyramid model. Other methods raise speed by shrinking the training set, which reduces the number of comparisons per classification.
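The partial distance search mentioned above can be illustrated with a short Python sketch. It is not taken from the patent or from any of the cited work; the function name `partial_distance` and the `bound` parameter are assumptions made for illustration. The idea is that the running sum of squared differences can only grow, so a candidate can be rejected as soon as the sum exceeds the distance of the current k-th nearest neighbor.

```python
def partial_distance(query, sample, bound):
    """Accumulate squared differences dimension by dimension.

    Returns the full squared distance, or None if the running sum
    exceeds `bound` before all dimensions have been processed.
    `bound` is assumed to be the squared distance of the current
    k-th nearest neighbor (a hypothetical parameter).
    """
    total = 0.0
    for q, s in zip(query, sample):
        total += (q - s) ** 2
        if total > bound:
            return None  # early termination: cannot be among the k nearest
    return total
```

With a generous bound the full squared distance is returned; with a tight bound the comparison stops after the first dimension that pushes the sum over the limit.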
Beyond these improvements to the algorithm itself, much work uses dedicated hardware to further accelerate kNN. The advantage of hardware acceleration is that a dedicated hardware structure is usually better optimized than a general-purpose CPU in performance, power consumption and similar respects. Different applications, however, usually need parameters such as the number of training samples, the sample dimension and the value of k to be adjusted to actual demand, and the computation time must also meet each application's particular requirements. Existing hardware-accelerated kNN research concentrates on raising speed by simplifying the algorithm, and typically implements one specific kNN algorithm at a fixed dimension, which limits adaptability to different applications.
Summary of the invention
To address the above problems, the present invention designs and implements an accelerator and method for the k-nearest neighbor algorithm.
The present invention proposes an accelerator for the k-nearest neighbor algorithm, comprising:
a control module, for controlling each module in the accelerator;
an address calculation module, for obtaining the address of a training sample;
a Euclidean distance calculation module, for calculating distances during the k-nearest neighbor search;
a result module, for storing the distance and transmitting it;
a sorting module, for performing k-nearest neighbor sorting on the distances.
The result module sends the distance to the sorting module and sends the address calculation module a signal to calculate the next address; the address calculation module receives the signal and calculates the address of the next training sample; and the control module clears the contents of the result module.
The control module includes a register group, which holds the degree of parallelism, the sample dimension, the value of k for the nearest neighbors, and system reset information.
The Euclidean distance calculation module is composed of multiple dimension calculation modules, and adds the corresponding number of dimension calculation modules according to the value of the degree of parallelism.
The address calculation module obtains the address of the training sample according to the degree of parallelism and the sample dimension.
When performing k-nearest neighbor sorting, the sorting module directly discards the value ranked (k+1)-th in ascending order, together with its label, whenever the sequence holds more than k values.
The present invention also proposes a k-nearest neighbor acceleration method using the above accelerator, characterized by comprising:
the result module sends the distance to the sorting module and sends the address calculation module a signal to calculate the next address; the address calculation module receives the signal and calculates the address of the next training sample; and the control module clears the contents of the result module.
A register group is provided, which holds the degree of parallelism, the sample dimension, the value of k for the nearest neighbors, and system reset information.
The corresponding number of dimension calculation modules is added according to the value of the degree of parallelism.
The address of the training sample is obtained according to the degree of parallelism and the sample dimension.
When performing k-nearest neighbor sorting, the value ranked (k+1)-th in ascending order is directly discarded, together with its label, whenever the sequence holds more than k values.
From the above scheme, the advantages of the present invention are:
The present invention has a reconfigurable hardware structure. It adapts to kNN algorithms on samples of different dimensions by adding or removing components, and for samples of the same dimension it can meet different requirements on logic resources and processing time by adjusting the degree of parallelism. The aim is to overcome the existing defects that software kNN is slow and hardware kNN is insufficiently flexible.
Brief description of the drawings
Fig. 1 is the overall structure diagram of the present invention;
Fig. 2 is a diagram of the control module;
Fig. 3 is a diagram of the sort module.
Detailed description
The object of the present invention is to propose an accelerator and method for the k-nearest neighbor algorithm. The general idea is to design an accelerator structure with reconfigurable capability that can compute the kNN algorithm quickly: by adding multiple dimension calculation modules, the computation is expanded in parallel according to the scale of the kNN problem, which accelerates the kNN algorithm.
The overall structure of the present invention is shown in Fig. 1, and is implemented in the following steps:
Step 1) Implementation of the signal control modules
Step 11) Implementation of the control module
The control module controls the whole system. As shown in Fig. 2, it has a register group holding system control information such as the degree of parallelism, the sample dimension, the value of k for the nearest neighbors, and system reset; the control system is reset when the search for each sample to be classified ends. When the distance value for the current training sample has been fully accumulated and passed by the result module to the sort module, the sort signal directs the sort module in Fig. 1 to perform k-nearest neighbor sorting.
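The register group described above can be modeled in software as a small record. The following Python sketch is only an illustration: the class name `ControlRegisters`, the field names, and the helper `passes_per_sample` are hypothetical and do not appear in the patent. The helper assumes that when the degree of parallelism is smaller than the sample dimension, a sample is processed in several accumulation passes.

```python
from dataclasses import dataclass


@dataclass
class ControlRegisters:
    """Software stand-in for the control module's register group."""
    parallelism: int      # dimension calculation modules working at once
    sample_dim: int       # number of features per sample
    k: int                # number of nearest neighbors to keep
    reset: bool = False   # system reset flag

    def passes_per_sample(self) -> int:
        # Assumption: ceil(sample_dim / parallelism) accumulation passes
        # are needed to cover every dimension of one training sample.
        return -(-self.sample_dim // self.parallelism)
```

For the configuration used later in the embodiment (parallelism 6, dimension 6, k = 8), one pass covers a whole sample; halving the parallelism trades logic resources for two passes per sample.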
Step 12) Implementation of the addr gen module (address calculation module)
The addr gen module controls the address of the labeled training sample to be fetched, calculating the address from information such as the degree of parallelism and the sample dimension. After the current distance value is obtained, the result module sends a next signal to the addr gen module, which starts the address calculation for the next training sample.
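One way to picture the address calculation is the Python sketch below. The patent does not specify the memory layout, so the sketch assumes that training samples are stored contiguously and that each RAM word holds as many features as the degree of parallelism; `AddrGen`, `words_per_sample` and `on_next` are hypothetical names.

```python
def words_per_sample(sample_dim, parallelism):
    """RAM words occupied by one training sample (ceiling division).

    Assumes one RAM word carries `parallelism` features.
    """
    return -(-sample_dim // parallelism)


class AddrGen:
    """Software model of the address calculation module."""

    def __init__(self, base, sample_dim, parallelism):
        self.addr = base  # first address of the sample space
        self.step = words_per_sample(sample_dim, parallelism)

    def on_next(self):
        # Called when the result module raises the next signal:
        # advance to the first word of the next training sample.
        self.addr += self.step
        return self.addr
```

Under these assumptions, changing the degree of parallelism changes only the address increment, which matches the description that the address increment is recalculated when the bit width changes.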
Step 2) Implementation of the distance calculation hardware
A reconfigurable hardware structure meets the Euclidean distance calculation demands of different degrees of parallelism; the Euclidean distance calculation module is composed of multiple dimension calculation modules. For example, when the degree of parallelism is i, the module groups x1, y1 through xi, yi can compute simultaneously. Extending the degree of parallelism only requires adding the corresponding dimension calculation modules and widening the sample output bit width, so that the data computed in parallel is output simultaneously from the RAM storing the training samples. The degree-of-parallelism value in the control module of Fig. 1 is updated at the same time, and the addr gen module calculates the address increment accordingly, adapting RAM addressing to the changed bit width. While each sample to be classified undergoes the k-nearest neighbor search, training sample data is continuously fed into the Euclidean distance calculation module, where the distance to the corresponding components of the sample to be classified is computed. When the distance calculation for a training sample ends, the distance accumulator is cleared, ready to accumulate the distance for the next training sample. The clear and sort signals are both generated in the control module, and the next-address calculation is completed in the addr gen module. This method allows the degree of parallelism to be scaled flexibly to suit different application demands.
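The accumulation scheme can be sketched in Python as follows. This is a software approximation, not the hardware design: each loop iteration stands for one group of dimension calculation modules working at the same time, and the sketch assumes the squared distance is used without a square root, which preserves the neighbor ordering. The names `dimension_unit` and `squared_distance` are hypothetical.

```python
def dimension_unit(x, y):
    """One dimension calculation module: squared difference of a pair."""
    return (x - y) ** 2


def squared_distance(query, sample, parallelism):
    """Accumulate the squared Euclidean distance, `parallelism` dims at a time."""
    acc = 0  # distance accumulator, starts cleared for each training sample
    for start in range(0, len(query), parallelism):
        xs = query[start:start + parallelism]
        ys = sample[start:start + parallelism]
        # In hardware these units would run simultaneously.
        acc += sum(dimension_unit(x, y) for x, y in zip(xs, ys))
    return acc
```

The result does not depend on the degree of parallelism; only the number of accumulation passes changes, which is the trade-off between logic resources and processing time that the text describes.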
Step 3) Implementation of the result module
The distance calculation result is input to the register of the result module. When the result is judged valid, it is passed to the sort module, and at the same time a next control signal is generated for the addr gen module. The addr gen module calculates the next RAM address to be sampled, and the control module generates a clear signal that empties the contents of the result module.
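The handshake around the result module can be mimicked with a small Python class. This is a behavioral sketch under assumptions: the class `ResultModule`, its methods, and the use of a list and a callback to stand in for the sort module input and the next signal are all hypothetical.

```python
class ResultModule:
    """Holds one distance value until it is forwarded and cleared."""

    def __init__(self):
        self.value = None
        self.valid = False

    def load(self, distance):
        # The distance calculation result arrives in the register.
        self.value = distance
        self.valid = True

    def forward(self, sort_inbox, raise_next):
        # Pass a valid result to the sort module and raise the next
        # signal for the address calculation module at the same time.
        if not self.valid:
            return False
        sort_inbox.append(self.value)
        raise_next()
        return True

    def clear(self):
        # Models the clear signal issued by the control module.
        self.value = None
        self.valid = False
```

Nothing is forwarded while the register is empty, so the next signal is raised exactly once per completed distance.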
Step 4) Implementation of the sort module (sorting module)
The sort module receives the value produced by the result module and its corresponding sample label. After receiving the sort signal from the control module, it sorts all values cached in the sort module and, whenever the sequence holds more than k values, directly discards the value ranked (k+1)-th in ascending order together with its label. After all training samples have been processed, the sort module outputs the final classification result.
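The discard rule can be illustrated with a bounded sorted list in Python. The sketch uses the standard-library `bisect.insort` as a software substitute for the hardware sorter; `SortModule` and its members are hypothetical names, and how the hardware actually orders its cache is not specified in the text.

```python
import bisect


class SortModule:
    """Keeps at most k (distance, label) pairs in ascending order."""

    def __init__(self, k):
        self.k = k
        self.entries = []

    def insert(self, distance, label):
        # Place the new pair at its sorted position.
        bisect.insort(self.entries, (distance, label))
        # If the sequence now holds k + 1 entries, drop the last one,
        # i.e. the value ranked (k+1)-th in ascending order and its label.
        if len(self.entries) > self.k:
            self.entries.pop()
```

Because the list never grows beyond k entries, the storage needed is fixed by k and not by the number of training samples.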
The present invention also proposes a k-nearest neighbor acceleration method using the above accelerator, comprising:
the result module sends the distance to the sorting module and sends the address calculation module a signal to calculate the next address; the address calculation module receives the signal and calculates the address of the next training sample; and the control module clears the contents of the result module.
A register group is provided, which holds the degree of parallelism, the sample dimension, the value of k for the nearest neighbors, and system reset information.
The corresponding number of dimension calculation modules is added according to the value of the degree of parallelism.
The address of the training sample is obtained according to the degree of parallelism and the sample dimension.
When performing k-nearest neighbor sorting, the value ranked (k+1)-th in ascending order is directly discarded, together with its label, whenever the sequence holds more than k values.
The following is an embodiment of the present invention; the overall flow of the method is shown in Fig. 3.
A kNN calculation example (k = 8) with a sample dimension of 6 and a sample space of 20 is now taken as an embodiment of the present invention, assuming that the 6 dimensions are computed in parallel.
Step 1) Signal control modules
Step 11) control module
The register group in the control module records a degree of parallelism of 6, a sample dimension of 6, a sample space of 20 and a nearest-neighbor k value of 8.
Step 12) addr gen module (address calculation module)
Records the first address of the sample space.
Step 2) Distance calculation hardware
The Euclidean distance calculation module is composed of 6 dimension calculation modules. The six module groups x1, y1 through x6, y6 compute simultaneously; their results are summed in the accumulator, and the accumulated result is passed to the register of the result module.
Step 3) result module
The result module checks the calculation result. Provided the result is valid, that is, not less than zero, it passes the result to the sort module and at the same time generates a next signal for the addr gen module. On receiving the next signal, the addr gen module calculates the address of the next training sample. The control module then sends a clear signal to empty the contents of the result module, and sends a sort control signal to the sort module.
Step 4) sort module (sorting module)
The sort module sorts the received distance results, records the current sequence number as 1, and issues the trigger signal for the next sample calculation. Go to step 12). When the whole sample space has been processed, the class of the instance under test is output.
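The embodiment can be traced end to end with the Python sketch below. It is a software walk-through, not the hardware: the training data is synthetic and hypothetical, the function `knn_classify` is an illustrative name, and the final class is chosen by majority vote among the k retained labels, which is an assumption because the text only states that a classification result is output.

```python
from collections import Counter


def knn_classify(query, samples, labels, k, parallelism):
    """Classify `query` by stepping through the sample space once."""
    nearest = []  # ascending (distance, label) pairs, at most k entries
    for sample, label in zip(samples, labels):
        acc = 0  # accumulator cleared for each training sample
        for start in range(0, len(query), parallelism):
            xs = query[start:start + parallelism]
            ys = sample[start:start + parallelism]
            acc += sum((x - y) ** 2 for x, y in zip(xs, ys))
        nearest.append((acc, label))
        nearest.sort()
        del nearest[k:]  # discard anything ranked beyond k
    # Assumed decision rule: majority vote over the k nearest labels.
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Hypothetical sample space: 20 six-dimensional samples in two classes.
samples = [[i] * 6 for i in range(20)]
labels = ["low" if i < 10 else "high" for i in range(20)]
result = knn_classify([2] * 6, samples, labels, k=8, parallelism=6)
```

Here `parallelism=6` mirrors the embodiment, in which all six dimensions are computed at once; passing a smaller value gives the same classification with more accumulation passes per sample.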

Claims (10)

1. An accelerator for the k-nearest neighbor algorithm, characterized by comprising:
a control module, for controlling each module in the accelerator;
an address calculation module, for obtaining the address of a training sample;
a Euclidean distance calculation module, for calculating distances during the k-nearest neighbor search;
a result module, for storing the distance and transmitting it;
a sorting module, for performing k-nearest neighbor sorting on the distances;
wherein the result module sends the distance to the sorting module and sends the address calculation module a signal to calculate the next address, the address calculation module receives the signal and calculates the address of the next training sample, and the control module clears the contents of the result module.
2. The accelerator for the k-nearest neighbor algorithm as claimed in claim 1, characterized in that the control module includes a register group, which holds the degree of parallelism, the sample dimension, the value of k for the nearest neighbors, and system reset information.
3. The accelerator for the k-nearest neighbor algorithm as claimed in claim 1, characterized in that the Euclidean distance calculation module is composed of multiple dimension calculation modules, and the Euclidean distance calculation module adds the corresponding number of dimension calculation modules according to the value of the degree of parallelism.
4. The accelerator for the k-nearest neighbor algorithm as claimed in claim 1, characterized in that the address calculation module obtains the address of the training sample according to the degree of parallelism and the sample dimension.
5. The accelerator for the k-nearest neighbor algorithm as claimed in claim 1, characterized in that, when performing k-nearest neighbor sorting, the sorting module directly discards the value ranked (k+1)-th in ascending order, together with its label, whenever the sequence holds more than k values.
6. A k-nearest neighbor acceleration method using the accelerator for the k-nearest neighbor algorithm as claimed in claim 1, characterized by comprising:
the result module sends the distance to the sorting module and sends the address calculation module a signal to calculate the next address, the address calculation module receives the signal and calculates the address of the next training sample, and the control module clears the contents of the result module.
7. The acceleration method for the k-nearest neighbor algorithm as claimed in claim 6, characterized in that a register group is provided, which holds the degree of parallelism, the sample dimension, the value of k for the nearest neighbors, and system reset information.
8. The acceleration method for the k-nearest neighbor algorithm as claimed in claim 1, characterized in that the corresponding number of dimension calculation modules is added according to the value of the degree of parallelism.
9. The acceleration method for the k-nearest neighbor algorithm as claimed in claim 1, characterized in that the address of the training sample is obtained according to the degree of parallelism and the sample dimension.
10. The acceleration method for the k-nearest neighbor algorithm as claimed in claim 1, characterized in that, when performing k-nearest neighbor sorting, the value ranked (k+1)-th in ascending order is directly discarded, together with its label, whenever the sequence holds more than k values.
CN201610716367.2A 2016-08-24 2016-08-24 Accelerator and method of K-nearest neighbor Pending CN106355199A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610716367.2A CN106355199A (en) 2016-08-24 2016-08-24 Accelerator and method of K-nearest neighbor

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610716367.2A CN106355199A (en) 2016-08-24 2016-08-24 Accelerator and method of K-nearest neighbor

Publications (1)

Publication Number Publication Date
CN106355199A true CN106355199A (en) 2017-01-25

Family

ID=57844133

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610716367.2A Pending CN106355199A (en) 2016-08-24 2016-08-24 Accelerator and method of K-nearest neighbor

Country Status (1)

Country Link
CN (1) CN106355199A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110796193A (en) * 2019-10-29 2020-02-14 南京宁麒智能计算芯片研究院有限公司 Reconfigurable KNN algorithm-based hardware implementation system and method

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101556601A (en) * 2009-03-12 2009-10-14 华为技术有限公司 Method and device for searching k neighbor
CN101866426A (en) * 2010-06-21 2010-10-20 清华大学 Weighting contraction method based on K near neighbor method
CN103020893A (en) * 2012-11-21 2013-04-03 西安电子科技大学 K nearest neighbor classifier based on field programmable gate array (FPGA)

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
柴志雷 等: "一种KNN算法的可重构硬件加速器设计", 《计算机应用研究》 *

Similar Documents

Publication Publication Date Title
CN108694502B (en) Self-adaptive scheduling method for robot manufacturing unit based on XGboost algorithm
CN103020288B (en) Method for classifying data stream under a kind of dynamic data environment
CN109472057B (en) Product processing quality prediction device and method based on cross-process implicit parameter memory
CN107220734A (en) CNC Lathe Turning process Energy Consumption Prediction System based on decision tree
CN113868366B (en) Streaming data-oriented online cross-modal retrieval method and system
CN107992645B (en) Sewage treatment process soft measurement modeling method based on chaos-firework hybrid algorithm
Cao et al. A PSO-based cost-sensitive neural network for imbalanced data classification
JP2023520970A (en) Lithium battery SOC estimation method, apparatus, and computer-readable storage medium
CN110442143A (en) A kind of unmanned plane situation data clustering method based on combination multiple target dove group's optimization
CN111628494A (en) Low-voltage distribution network topology identification method and system based on logistic regression method
CN109919826B (en) Graph data compression method for graph computation accelerator and graph computation accelerator
CN106355199A (en) Accelerator and method of K-nearest neighbor
CN110275868A (en) A kind of multi-modal pretreated method of manufaturing data in intelligent plant
CN104573331B (en) A kind of k nearest neighbor data predication method based on MapReduce
CN113297129A (en) SOC (System on chip) calculation method and system of energy storage system
CN112098869A (en) Self-adaptive electric vehicle SOC estimation method based on big data
CN111626324A (en) Seabed observation network data heterogeneous analysis integration method based on edge calculation
CN112925793B (en) Distributed hybrid storage method and system for multiple structural data
CN111027760A (en) Power load prediction method based on least square vector machine
CN116152644A (en) Long-tail object identification method based on artificial synthetic data and multi-source transfer learning
CN109409594B (en) Short-term wind power prediction method
CN111199307A (en) Production line production state prediction method and system based on decision tree
CN113344085B (en) Balance bias multi-source data collaborative optimization and fusion method and device
CN111222708A (en) Power plant combustion furnace temperature prediction method based on transfer learning dynamic modeling
CN110543724A (en) Satellite structure performance prediction method for overall design

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20170125