CN108985451A - Data processing method and equipment based on AI chip - Google Patents

Data processing method and equipment based on AI chip

Info

Publication number
CN108985451A
Authority
CN
China
Prior art keywords
processor
data
processing
data frame
neural network
Prior art date
Legal status
Granted
Application number
CN201810712195.0A
Other languages
Chinese (zh)
Other versions
CN108985451B (en)
Inventor
王奎澎
寇浩锋
包英泽
付鹏
范彦文
周强
周仁义
胡跃祥
Current Assignee
Jilin Huaqingyun Technology Group Co.,Ltd.
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd filed Critical Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201810712195.0A
Publication of CN108985451A
Application granted
Publication of CN108985451B
Legal status: Active

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/06: Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063: Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons, using electronic means
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/24: Classification techniques
    • G06F18/241: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2413: Classification techniques relating to the classification model, based on distances to training or reference patterns
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00: Scenes; Scene-specific elements
    • G06V20/10: Terrestrial scenes


Abstract

The present invention provides a data processing method and device based on an AI chip. In the method, the data-processing pipeline of the AI chip is divided into three stages: data acquisition and preprocessing, neural network model processing, and neural network model post-processing. The three stages run as a parallel pipeline. The AI chip includes at least a first processor, a second processor and a third processor: the first processor performs data acquisition and preprocessing, the third processor performs neural network model processing, and the second processor performs neural network model post-processing. The first, second and third processors execute the three stages simultaneously, which reduces the time the processors spend waiting for one another, maximizes parallel computation across the processors, and improves the data-processing efficiency of the AI chip, thereby increasing the frame rate of the AI chip.

Description

Data processing method and equipment based on AI chip
Technical field
The present invention relates to the field of AI chip technology, and in particular to a data processing method and device based on an AI chip.
Background technique
Xeye is an artificial-intelligence camera. It includes a sensor for capturing images and an AI chip for performing recognition processing on the images. The AI chip generally includes an embedded Neural-network Processing Unit (NPU) for neural network model computation and at least two CPUs, where the NPU contains multiple cores.
An existing AI chip performs recognition processing on the captured images frame by frame, and the processing of one data frame involves the following four modules: image acquisition, image preprocessing, neural network model processing, and neural network model post-processing with data transmission. The CPUs run the first, second and fourth modules, while the NPU runs the third module.
When an existing AI chip performs recognition processing on an image frame, these four modules run serially: while a CPU runs the first, second or fourth module, the NPU sits idle; while the NPU runs the third module, the CPU sits idle. The CPU and NPU waiting on each other wastes computing resources and makes the AI chip's data processing inefficient.
Summary of the invention
The present invention provides a data processing method and device based on an AI chip, to solve the prior-art problem that the CPU and NPU waiting on each other wastes computing resources and lowers the data-processing efficiency of the AI chip.
One aspect of the present invention provides a data processing method based on an AI chip, comprising:
the AI chip includes at least a first processor, a second processor and a third processor;
the data-processing pipeline of the AI chip is divided into the following three stages: data acquisition and preprocessing, neural network model processing, and neural network model post-processing; the three stages run as a parallel pipeline;
the first processor performs data acquisition and preprocessing, the third processor performs neural network model processing, and the second processor performs neural network model post-processing;
the first processor, the second processor and the third processor execute the three stages simultaneously.
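As a rough illustration only, the three-stage arrangement above can be sketched in Python with two queues connecting three workers that stand in for the first, third and second processors. The stage functions, queue types and frame contents here are invented placeholders, not the patent's implementation:

```python
import queue
import threading

first_queue = queue.Queue()   # stage 1 -> stage 2 hand-off
second_queue = queue.Queue()  # stage 2 -> stage 3 hand-off
results = []

NUM_FRAMES = 4
SENTINEL = None               # signals end of stream

def first_processor():
    # Stage 1: data acquisition and preprocessing (placeholder work).
    for frame_id in range(NUM_FRAMES):
        first_queue.put(f"preprocessed-frame-{frame_id}")
    first_queue.put(SENTINEL)

def third_processor():
    # Stage 2: neural network model processing (placeholder work).
    while True:
        first_data = first_queue.get()
        if first_data is SENTINEL:
            second_queue.put(SENTINEL)
            break
        second_queue.put(first_data + ":inferred")

def second_processor():
    # Stage 3: neural network model post-processing (placeholder work).
    while True:
        second_data = second_queue.get()
        if second_data is SENTINEL:
            break
        results.append(second_data + ":postprocessed")

workers = [threading.Thread(target=f)
           for f in (first_processor, third_processor, second_processor)]
for w in workers:
    w.start()
for w in workers:
    w.join()
```

In this sketch each worker blocks only on its own input queue, so the first stage can begin frame N+1 as soon as frame N's first data is enqueued, which mirrors the parallel pipeline behaviour the method describes.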
Another aspect of the present invention provides an AI chip, including at least: a first processor, a second processor, a third processor, a memory, and a computer program stored on the memory;
the data-processing pipeline of the AI chip is divided into the following three stages: data acquisition and preprocessing, neural network model processing, and neural network model post-processing; the three stages run as a parallel pipeline;
when the first processor, the second processor and the third processor run the computer program, they implement the data processing method based on an AI chip described above.
Another aspect of the present invention provides an intelligent camera, comprising: a sensor and an AI chip;
the AI chip includes at least: a first processor, a second processor, a third processor, a memory, and a computer program stored on the memory;
the data-processing pipeline of the AI chip is divided into the following three stages: data acquisition and preprocessing, neural network model processing, and neural network model post-processing; the three stages run as a parallel pipeline;
when the first processor, the second processor and the third processor run the computer program, they implement the data processing method based on an AI chip described above.
Another aspect of the present invention provides a computer-readable storage medium storing a computer program;
when the computer program is executed by a processor, it implements the data processing method based on an AI chip described above.
In the data processing method and device based on an AI chip provided by the present invention, the data-processing pipeline of the AI chip is divided into three stages: data acquisition and preprocessing, neural network model processing, and neural network model post-processing; the three stages run as a parallel pipeline. The AI chip includes at least a first processor, a second processor and a third processor: the first processor performs data acquisition and preprocessing, the third processor performs neural network model processing, and the second processor performs neural network model post-processing. The three processors execute the three stages simultaneously, which reduces the time the processors spend waiting for one another, maximizes parallel computation across the processors, and improves the data-processing efficiency of the AI chip, thereby increasing the frame rate of the AI chip.
Brief description of the drawings
Fig. 1 is a schematic diagram of data processing in an existing AI chip, provided by Embodiment 1 of the present invention;
Fig. 2 is a schematic diagram of the pipeline structure of AI chip data processing, provided by Embodiment 1 of the present invention;
Fig. 3 is a flowchart of the data processing method based on an AI chip, provided by Embodiment 2 of the present invention;
Fig. 4 is a structural schematic diagram of the AI chip provided by Embodiment 3 of the present invention;
Fig. 5 is a structural schematic diagram of the intelligent camera provided by Embodiment 4 of the present invention.
The above drawings show specific embodiments of the present invention, which are described in more detail below. The drawings and written description are not intended to limit the scope of the inventive concept in any way, but to illustrate the concept of the invention to those skilled in the art by reference to specific embodiments.
Detailed description of the embodiments
Example embodiments are described in detail here, with examples illustrated in the accompanying drawings. When the following description refers to the drawings, the same numbers in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all embodiments consistent with the present invention; rather, they are merely examples of methods, consistent with some aspects of the invention as detailed in the appended claims.
Terms used in the present invention are explained first:
An embedded Neural-network Processing Unit (NPU) adopts a "data-driven parallel computing" architecture and is particularly good at processing massive multimedia data such as video and images.
Frame rate of an AI chip: the number of data frames the AI chip processes per second. The higher the data-processing efficiency of the AI chip, the higher its frame rate.
In addition, the terms "first", "second" and the like are used for description purposes only and are not to be understood as indicating or implying relative importance or implicitly indicating the number of the technical features referred to. In the description of the following embodiments, "plurality" means two or more, unless specifically defined otherwise.
The specific embodiments below may be combined with each other, and the same or similar concepts or processes may not be repeated in some of them. The embodiments of the present invention are described below with reference to the drawings.
Embodiment one
Fig. 1 is a schematic diagram of data processing in an existing AI chip, and Fig. 2 is a schematic diagram of the pipeline structure of AI chip data processing, both provided by Embodiment 1 of the present invention. This embodiment provides a data processing method based on an AI chip, addressing the prior-art problem that the CPU and NPU waiting on each other wastes computing resources and lowers the data-processing efficiency of the AI chip. The method of this embodiment is applied to an AI chip that includes at least a first processor, a second processor and a third processor.
The third processor is a processor for neural network model computation and contains multiple cores. For example, the third processor may be an embedded neural network processor (NPU).
The first processor and the second processor are central processing units (CPUs).
For example, the AI chip may be a movidius2 chip, which includes two CPUs: a sparkv8/TR core and a sparkv8/OS core. The movidius2 chip further includes 12 shave cores for neural network model computation, and these shave cores constitute an NPU. The frame rate of the AI chip depends on the total running time of all stages in the AI chip's data-processing procedure.
Fig. 1 takes the processing of four data frames as an example to illustrate how an AI chip in the prior art processes data frames serially. As shown in Fig. 1, an existing AI chip processes the data frames one after another in sequence.
In this embodiment, the data-processing pipeline of the AI chip is divided into the following three stages: the first stage is data acquisition and preprocessing, the second stage is neural network model processing, and the third stage is neural network model post-processing.
The first processor is responsible for the first stage: data acquisition and preprocessing. The third processor is responsible for the second stage: neural network model processing. The second processor is responsible for the third stage: neural network model post-processing.
The processing of the above three stages forms a parallel pipeline. The first processor, the second processor and the third processor execute the three stages simultaneously.
Fig. 2 takes the processing of four data frames as an example to illustrate the pipeline structure of AI chip data processing in this embodiment. As shown in Fig. 2, after the first processor completes the first-stage processing of data frame 1, the third processor performs the second-stage processing of data frame 1 while the first processor performs the first-stage processing of data frame 2. After the third processor completes the second-stage processing of data frame 1, the second processor performs the third-stage processing of data frame 1; meanwhile, the first processor, having finished the first stage of data frame 2, can process the first stage of data frame 3, and the third processor, having finished data frame 1, can process the second stage of data frame 2. At this point, the third stage of data frame 1, the second stage of data frame 2 and the first stage of data frame 3 are processed simultaneously.
After the pipeline fill time, the first processor, the second processor and the third processor process different stages of three data frames fully in parallel; after this full-parallel period comes the pipeline drain time. During the fill and drain times, the processors are only partially parallel. The fill and drain times are very short: the more data frames the AI chip processes continuously, the higher the proportion of time spent in full parallelism, and the higher the degree of parallelism in the AI chip's processing of data frames.
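The fill, full-parallel and drain phases can be tabulated with a short sketch, assuming for illustration that every stage takes one equal time step (the patent does not state stage latencies):

```python
# Which frame each pipeline stage works on at each time step, for
# 4 frames and 3 stages. stage1/stage2/stage3 correspond to the first,
# third and second processors respectively.
NUM_FRAMES = 4
NUM_STAGES = 3

schedule = []
for t in range(NUM_FRAMES + NUM_STAGES - 1):
    step = {}
    for stage in range(NUM_STAGES):
        frame = t - stage                 # stage s lags the input by s steps
        if 0 <= frame < NUM_FRAMES:
            step[f"stage{stage + 1}"] = f"frame{frame + 1}"
    schedule.append(step)

# Pipelined: 6 time steps for 4 frames; fully serial: 4 * 3 = 12 steps.
serial_steps = NUM_FRAMES * NUM_STAGES
```

Time steps 0 and 1 are the fill period, steps 2 and 3 are fully parallel (all three stages busy), and steps 4 and 5 are the drain period; with more frames, the fully parallel share of the total time grows, as the paragraph above notes.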
In addition, AI chips are commonly used for vision-based image processing, in which case the neural network model handled by the third processor may be a vision-based model, such as a convolutional neural network (CNN) model or one of the various deep neural network (DNN) models. If the AI chip is used for speech recognition on audio data, the neural network model handled by the third processor may be a speech-based model; this embodiment places no specific limitation here.
In this embodiment of the invention, the data-processing pipeline of the AI chip is divided into three stages: data acquisition and preprocessing, neural network model processing, and neural network model post-processing, which run as a parallel pipeline. The first processor performs data acquisition and preprocessing, the third processor performs neural network model processing, and the second processor performs neural network model post-processing. The three processors execute the three stages simultaneously, reducing the time the processors spend waiting for one another, maximizing parallel computation across the processors, and improving the data-processing efficiency of the AI chip, thereby increasing the frame rate of the AI chip.
Embodiment two
Fig. 3 is a flowchart of the data processing method based on an AI chip provided by Embodiment 2 of the present invention. On the basis of Embodiment 1, this embodiment describes in detail how the first processor, the second processor and the third processor execute the three stages simultaneously. As shown in Fig. 3, the specific steps of the method are as follows:
Step S301: the first processor obtains a data frame and preprocesses it to obtain the first data corresponding to the data frame, and stores the first data corresponding to the data frame into a first queue.
In this embodiment, after the first processor preprocesses a data frame, it obtains the first data corresponding to the data frame, which is the result of the first-stage processing of that frame. The first data corresponding to the data frame serves as the input to the neural network model, on which the third processor performs neural network model computation. For example, if the data frame is a picture, the preprocessing may include resizing the picture, enhancing the picture, extracting feature data, and so on. This embodiment places no specific limitation on the content or procedure of the first processor's preprocessing.
The first processor of the AI chip may be connected to a sensor, and the data frame may be captured by the sensor and sent to the first processor. In this embodiment, the first processor obtaining a data frame may specifically be implemented as follows:
the first processor drives the sensor to capture the data frame, so that the sensor sends the captured data frame to the first processor; the first processor receives the data frame sent by the sensor. The sensor may be an image-capture device, a voice-capture device, or the like.
The first processor of the AI chip may also connect to an external device through a communication interface or the like, and the data frame may then be read by the first processor from the external device through the communication interface. The communication interface may be a Universal Serial Bus (USB) interface or the like.
In this embodiment, the first processor stores the first data obtained by preprocessing the data frame into the first queue, so that the second processor can read the first data corresponding to the data frame from the first queue before the second-stage processing of that frame begins. Optionally, the first queue may be a circular queue.
Optionally, after the first processor stores the first data corresponding to the data frame into the first queue, that is, after it stores the first-stage processing result into the first queue, it may also send a first wake-up message to the second processor, informing the second processor that the first-stage result of this data frame has been added to the first queue and that the second-stage processing of the data frame can begin.
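A possible shape for this store-then-wake handoff, sketched with a Python condition variable standing in for the first wake-up message; the circular-buffer capacity and the frame fields are assumptions, not taken from the patent:

```python
import collections
import threading

first_queue = collections.deque(maxlen=8)   # assumed circular-queue capacity
first_queue_cond = threading.Condition()

def publish_first_data(frame_id):
    # Stage 1 stores its result, then "wakes" the consumer.
    first_data = {"frame": frame_id, "tensor": f"pre-{frame_id}"}
    with first_queue_cond:
        first_queue.append(first_data)
        first_queue_cond.notify()           # the first wake-up message
    return first_data

def wait_for_first_data():
    # Consumer side: sleep until stage 1 signals new data.
    with first_queue_cond:
        while not first_queue:
            first_queue_cond.wait()
        return first_queue.popleft()
```

Here wait_for_first_data would run on the consumer side; blocking on the condition rather than polling is one way to realize the wake-up message described above.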
Step S302: the first processor continues to obtain the next data frame and preprocess it.
In this embodiment, after the first processor completes the first-stage processing of a data frame and stores the frame's first-stage result into the first queue, it can continue with the first-stage processing of the next data frame without waiting for the other processors to complete the second and third stages of the current frame.
Step S303: the third processor obtains the first data corresponding to the first data frame in the first queue, performs neural network model processing based on the first data to obtain the second data corresponding to the data frame, and stores the second data corresponding to the data frame into a second queue.
In this embodiment, the first queue stores the first-stage processing results of the data frames, that is, the data awaiting second-stage processing.
After the first processor completes the first-stage processing of a data frame, the first processor can perform the first-stage processing of the next or subsequent data frames while the third processor performs the second-stage processing of that frame.
Each time, the third processor obtains the first data corresponding to the first data frame from the first queue and performs neural network model processing on it to obtain the second data corresponding to that frame; that is, the third processor performs the second-stage processing of the data frame and obtains the frame's second-stage processing result.
In this embodiment, the third processor stores the second data corresponding to the data frame into the second queue, that is, it stores the second-stage processing result of the data frame into the second queue, so that the second processor can read the second data corresponding to the data frame from the second queue before the third-stage processing of that frame begins. Optionally, the second queue may be a circular queue.
Optionally, after the third processor stores the second data corresponding to the data frame into the second queue, that is, after it stores the second-stage processing result into the second queue, it may also send a second wake-up message to the second processor, informing it that the second-stage result of the data frame has been added to the second queue and that the third-stage processing of the data frame can begin.
Optionally, the third processor obtaining the first data corresponding to the first data frame in the first queue may specifically be implemented as follows:
the second processor takes the first data corresponding to the first data frame out of the first queue through a first thread and delivers the first data corresponding to that data frame to the third processor.
The first data of the first data frame that the second processor takes out of the first queue is the data on which the third processor will perform second-stage processing next. After the second processor takes the first data frame out of the first queue, that frame is deleted from the first queue, and the first data of the data frame now at the head of the first queue becomes the next first data to be processed by the third processor.
Since the third processor can start the second-stage processing of a data frame only after it has completed the second-stage processing of the previous data frame, the second processor can use the first thread to detect in real time whether the third processor has finished the neural network model processing of the previous data frame's first data.
When it detects that the third processor has finished the neural network model processing of the previous data frame's first data, the second processor takes the first data corresponding to the first data frame out of the first queue through the first thread and delivers it to the third processor.
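The first thread's behaviour described above might look like the following sketch, where an npu_idle event models "the third processor has finished the previous frame" and submit_to_npu is a stand-in for the actual handover; all names are illustrative:

```python
import queue
import threading

first_queue = queue.Queue()
npu_idle = threading.Event()
npu_idle.set()                    # the NPU starts out idle
submitted = []                    # frames handed to the (simulated) NPU

def submit_to_npu(first_data):
    submitted.append(first_data)  # stand-in for the real handover

def first_thread_step():
    npu_idle.wait()                    # previous frame must be finished
    first_data = first_queue.get()     # head of queue; removed once taken
    npu_idle.clear()                   # NPU is now busy with this frame
    submit_to_npu(first_data)
    return first_data

first_queue.put("first-data-1")
first_queue.put("first-data-2")
head = first_thread_step()             # hands "first-data-1" to the NPU
```

In a full system the NPU completion path would call npu_idle.set() again, letting the first thread dispatch the next head-of-queue frame.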
Step S304: the third processor continues to obtain the first data corresponding to the next data frame after the first data frame, and performs neural network model processing on it.
In this embodiment, after the third processor completes the second-stage processing of the current data frame and stores the frame's second-stage result into the second queue, it can continue with the second-stage processing of the next data frame without waiting for the other processors to complete the third stage of the current frame.
Step S305: the second processor takes the second data corresponding to the first data frame out of the second queue and performs neural network model post-processing based on the second data corresponding to that frame.
In this embodiment, the second queue stores the second-stage processing results of the data frames, that is, the data awaiting third-stage processing.
After the third processor completes the second-stage processing of a data frame, the third processor can perform the second-stage processing of the next or subsequent data frames while the second processor performs the third-stage processing of that frame.
Each time, the second processor obtains the second data corresponding to the first data frame from the second queue and performs neural network model post-processing on it to obtain the frame's final processing result; that is, the second processor performs the third-stage processing of the data frame and obtains the frame's final processing result.
Optionally, the neural network model post-processing that the second processor performs in the third stage may be follow-up processing of the second-stage result (the output of the neural network model) for image recognition or speech recognition, such as face detection, drawing boxes around face regions, data compression, or network transmission of the data (for example, transmitting the result data to the cloud). This embodiment places no specific limitation on the content or procedure of the second processor's neural network model post-processing.
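As one concrete and entirely hypothetical example of such post-processing, a third-stage routine might filter model detections by score and package the face regions for transmission; the field names, threshold and fake model output below are invented for illustration:

```python
def postprocess(second_data, score_threshold=0.5):
    # Keep only confident detections and package them for transmission.
    detections = [d for d in second_data["detections"]
                  if d["score"] >= score_threshold]
    return {
        "frame": second_data["frame"],
        "faces": [d["box"] for d in detections],  # boxed face regions
    }

# Fabricated stage-2 output for one frame.
model_output = {
    "frame": 1,
    "detections": [
        {"box": (10, 10, 50, 50), "score": 0.9},
        {"box": (0, 0, 5, 5), "score": 0.2},      # below threshold
    ],
}
result = postprocess(model_output)
```

The returned payload could then be compressed and sent over the network, for example to the cloud, as the paragraph above suggests.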
Optionally, the second processor takes the second data corresponding to the data frame out of the second queue through a second thread and performs neural network model post-processing.
Step S306: the second processor continues to obtain the second data corresponding to the next data frame after the first data frame, and performs neural network model post-processing on it.
In this embodiment, after completing the third-stage processing of a data frame, the second processor can continue with the third-stage processing of the next data frame.
In this embodiment of the invention, the first processor stores the first-stage processing result of a data frame into the first queue; the third processor obtains the first-stage result from the first queue, performs the second-stage processing of the frame, and stores the second-stage result into the second queue; and the second processor obtains the second-stage result from the second queue and performs the third-stage processing of the frame. Thus, once the first processor has stored a frame's first-stage result into the first queue, it can proceed to the first stage of subsequent frames while the third processor performs the second-stage processing, without waiting for that frame's second and third stages to finish. Likewise, once the third processor has stored a frame's second-stage result into the second queue, it can perform the second-stage processing of subsequent frames while the second processor performs the third-stage processing of that frame. The first processor, the second processor and the third processor therefore execute the three stages simultaneously, maximizing parallel computation across the processors, reducing the time they spend waiting for one another, and improving the data-processing efficiency of the AI chip, thereby increasing the frame rate of the AI chip.
Embodiment three
Fig. 4 is a structural schematic diagram of the AI chip provided by Embodiment 3 of the present invention. The AI chip provided by this embodiment can execute the processing flow of the data processing method embodiments based on an AI chip. As shown in Fig. 4, the AI chip 40 includes: a first processor 401, a second processor 402, a third processor 403, a memory 404, and a computer program stored on the memory.
The data-processing pipeline of the AI chip is divided into the following three stages: data acquisition and preprocessing, neural network model processing, and neural network model post-processing; the three stages run as a parallel pipeline.
When the first processor 401, the second processor 402 and the third processor 403 run the computer program, they implement the data processing method based on an AI chip provided by any of the method embodiments above.
For example, the AI chip may be a movidius2 chip, which includes two CPUs: a sparkv8/TR core and a sparkv8/OS core. The movidius2 chip further includes 12 shave cores for neural network model computation, and these shave cores constitute an NPU.
The frame rate of the AI chip depends on the total running time of all stages in the AI chip's data-processing procedure.
Fig. 4 is used only to describe the connection relationships among the processors and the memory of the AI chip; the positions of the processors and the memory are not limited.
In this embodiment of the invention, the data-processing pipeline of the AI chip is divided into three stages: data acquisition and preprocessing, neural network model processing, and neural network model post-processing, which run as a parallel pipeline. The first processor performs data acquisition and preprocessing, the third processor performs neural network model processing, and the second processor performs neural network model post-processing. The three processors execute the three stages simultaneously, reducing the time the processors spend waiting for one another, maximizing parallel computation across the processors, and improving the data-processing efficiency of the AI chip, thereby increasing the frame rate of the AI chip.
Embodiment 4
Fig. 5 is a structural schematic diagram of the intelligent camera provided by Embodiment 4 of the present invention. As shown in Fig. 5, the intelligent camera 500 includes: a sensor 50 and an AI chip 40.
The AI chip 40 includes: a first processor 401, a second processor 402, a third processor 403, a memory 404, and a computer program stored on the memory.
The data processing pipeline of the AI chip 40 is divided into the following three stages: data acquisition and preprocessing, neural network model processing, and neural network model post-processing; the processing of the three stages is a parallel pipeline structure.
When running the computer program, the first processor 401, the second processor 402, and the third processor 403 implement the data processing method based on the AI chip provided by any of the method embodiments above.
For example, the intelligent camera may be an xeye, whose AI chip is a Movidius2 chip. The Movidius2 chip includes 2 CPUs, which are a sparkv8/TR core and a sparkv8/OS core, respectively. The Movidius2 chip further includes 12 SHAVE cores for performing neural network model calculations; these SHAVE cores constitute an NPU.
The frame rate of the xeye product relates to the speed at which the xeye can acquire pictures. The higher the frame rate of the xeye, the more pictures can be acquired per second, and the more data the neural network model can extract for the back end; improving the frame rate of the xeye is therefore very important.
The frame rate of the xeye product depends on the frame rate of the AI chip, and the frame rate of the AI chip depends on the total time of the operations of each stage in the data processing procedure of the AI chip. Compared with an intelligent camera of the prior art whose AI chip processes data frames serially, the frame rate of the intelligent camera provided by this embodiment is doubled.
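The frame-rate gain of pipelining can be checked with a small back-of-the-envelope calculation. The per-stage times below are purely hypothetical: in a serial design the per-frame time is the sum of the three stage times, while in the pipeline the steady-state per-frame time is bounded by the slowest stage. With perfectly balanced stages the ideal speedup is 3x; the doubling reported in this embodiment is consistent with unevenly balanced stages and handoff overhead in practice.

```python
# Hypothetical per-stage times, in milliseconds
acquire_preprocess = 10.0
model_processing = 10.0
postprocess = 10.0

# Serial: each frame passes through all three stages before the next starts
serial_time_per_frame = acquire_preprocess + model_processing + postprocess
serial_fps = 1000.0 / serial_time_per_frame

# Pipelined: in steady state, a new frame completes every max(stage time) ms
pipelined_time_per_frame = max(acquire_preprocess, model_processing, postprocess)
pipelined_fps = 1000.0 / pipelined_time_per_frame

print(round(serial_fps, 2))     # 33.33
print(pipelined_fps)            # 100.0
print(round(pipelined_fps / serial_fps, 2))  # 3.0 (ideal, balanced stages)
```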
In this embodiment, Fig. 5 is used only to describe the connection relationship between the components of the intelligent camera; it does not limit the relative positions of the components.
In the embodiment of the present invention, the data processing pipeline of the AI chip is divided into the following three stages: data acquisition and preprocessing, neural network model processing, and neural network model post-processing; the processing of the three stages is a parallel pipeline structure. The first processor is used for data acquisition and preprocessing, the third processor is used for neural network model processing, and the second processor is used for neural network model post-processing. The first processor, the second processor, and the third processor carry out the processing of the three stages simultaneously, which reduces the time the processors spend waiting for each other, maximizes the parallel computation of each processor, improves the efficiency of the AI chip's data processing, and improves the frame rate of the AI chip, thereby improving the frame rate of the intelligent camera.
In addition, an embodiment of the present invention further provides a computer-readable storage medium storing a computer program. When the computer program is executed by a processor, it implements the method provided by any of the method embodiments above.
In the several embodiments provided by the present invention, it should be understood that the disclosed device and method may be implemented in other ways. For example, the device embodiments described above are merely illustrative; for example, the division of the units is only a logical functional division, and in actual implementation there may be other division manners, e.g., multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be through some interfaces; the indirect coupling or communication connection of devices or units may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units; they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the various embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit. The integrated unit above may be implemented in the form of hardware, or in the form of hardware plus a software functional unit.
The integrated unit implemented in the form of a software functional unit may be stored in a computer-readable storage medium. The software functional unit is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) or a processor to execute some of the steps of the methods of the various embodiments of the present invention. The storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or optical disk, and other media that can store program code.
Those skilled in the art can clearly understand that, for convenience and brevity of description, the division of the functional modules above is used only as an example. In practical applications, the functions above may be assigned to different functional modules as needed; that is, the internal structure of the device may be divided into different functional modules to complete all or part of the functions described above. For the specific working process of the device described above, reference may be made to the corresponding process in the foregoing method embodiments, which is not repeated here.
Those skilled in the art will readily think of other embodiments of the present invention after considering the specification and practicing the invention disclosed here. The present invention is intended to cover any variations, uses, or adaptive changes of the present invention that follow the general principles of the present invention and include common knowledge or conventional technical means in the art not disclosed in the present invention. The specification and examples are to be regarded as illustrative only, and the true scope and spirit of the present invention are indicated by the following claims.
It should be understood that the present invention is not limited to the precise structures described above and shown in the accompanying drawings, and various modifications and changes may be made without departing from its scope. The scope of the present invention is limited only by the appended claims.

Claims (13)

1. A data processing method based on an AI chip, wherein the AI chip includes at least a first processor, a second processor, and a third processor;
The data processing pipeline of the AI chip is divided into the following three stages: data acquisition and preprocessing, neural network model processing, and neural network model post-processing; the processing of the three stages is a parallel pipeline structure;
The first processor is used for data acquisition and preprocessing, the third processor is used for neural network model processing, and the second processor is used for neural network model post-processing;
The first processor, the second processor, and the third processor carry out the processing of the three stages simultaneously.
2. The method according to claim 1, wherein the first processor, the second processor, and the third processor carrying out the processing of the three stages in parallel comprises:
The first processor obtains a data frame and preprocesses the data frame to obtain first data corresponding to the data frame, and stores the first data corresponding to the data frame into a first queue;
The first processor continues to obtain the next data frame and preprocess the next data frame.
3. The method according to claim 2, wherein the first processor, the second processor, and the third processor carrying out the processing of the three stages in parallel comprises:
The third processor obtains the first data corresponding to the first data frame in the first queue, performs neural network model processing according to the first data corresponding to the data frame to obtain second data corresponding to the data frame, and stores the second data corresponding to the data frame into a second queue;
The third processor continues to obtain the first data corresponding to the data frame following the first data frame, and performs neural network model processing.
4. The method according to claim 3, wherein the third processor obtaining the first data corresponding to the first data frame in the first queue comprises:
The second processor takes out the first data corresponding to the first data frame from the first queue through a first thread, and delivers the first data corresponding to the data frame to the third processor.
5. The method according to claim 4, wherein before the second processor takes out the first data corresponding to the first data frame from the first queue through the first thread and delivers the first data corresponding to the data frame to the third processor, the method further comprises:
The second processor detects in real time, through the first thread, whether the third processor has completed the neural network model processing of the first data corresponding to the previous data frame;
Correspondingly, the second processor taking out the first data corresponding to the first data frame from the first queue through the first thread and delivering the first data corresponding to the data frame to the third processor comprises:
When it is detected that the third processor has completed the neural network model processing of the first data corresponding to the previous data frame, the second processor takes out the first data corresponding to the first data frame from the first queue through the first thread, and delivers the first data corresponding to the data frame to the third processor.
6. The method according to claim 3, wherein the first processor, the second processor, and the third processor carrying out the processing of the three stages in parallel comprises:
The second processor takes out the second data corresponding to the first data frame from the second queue, and performs neural network model post-processing according to the second data corresponding to the first data frame;
The second processor continues to obtain the second data corresponding to the data frame following the first data frame, and performs neural network model post-processing.
7. The method according to claim 6, wherein the second processor takes out the second data corresponding to the data frame from the second queue through a second thread, and performs neural network model post-processing.
8. The method according to claim 2, wherein the first processor obtaining the data frame comprises:
The first processor drives a sensor to acquire the data frame, so that the sensor sends the acquired data frame to the first processor;
The first processor receives the data frame sent by the sensor.
9. The method according to any one of claims 1-8, wherein the first processor and the second processor are central processing units (CPUs), and the third processor is a processor for performing neural network model calculations, the third processor comprising multiple cores.
10. The method according to claim 9, wherein the third processor is an embedded neural-network processing unit (NPU).
11. An AI chip, comprising at least: a first processor, a second processor, a third processor, a memory, and a computer program stored on the memory;
The data processing pipeline of the AI chip is divided into the following three stages: data acquisition and preprocessing, neural network model processing, and neural network model post-processing; the processing of the three stages is a parallel pipeline structure;
When running the computer program, the first processor, the second processor, and the third processor implement the data processing method based on the AI chip according to any one of claims 1-10.
12. An intelligent camera, comprising: a sensor and an AI chip;
The AI chip includes at least: a first processor, a second processor, a third processor, a memory, and a computer program stored on the memory;
The data processing pipeline of the AI chip is divided into the following three stages: data acquisition and preprocessing, neural network model processing, and neural network model post-processing; the processing of the three stages is a parallel pipeline structure;
When running the computer program, the first processor, the second processor, and the third processor implement the data processing method based on the AI chip according to any one of claims 1-10.
13. A computer-readable storage medium storing a computer program,
wherein when the computer program is executed by a processor, it implements the data processing method based on the AI chip according to any one of claims 1-10.
CN201810712195.0A 2018-06-29 2018-06-29 Data processing method and device based on AI chip Active CN108985451B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810712195.0A CN108985451B (en) 2018-06-29 2018-06-29 Data processing method and device based on AI chip

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810712195.0A CN108985451B (en) 2018-06-29 2018-06-29 Data processing method and device based on AI chip

Publications (2)

Publication Number Publication Date
CN108985451A true CN108985451A (en) 2018-12-11
CN108985451B CN108985451B (en) 2020-08-04

Family

ID=64539849

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810712195.0A Active CN108985451B (en) 2018-06-29 2018-06-29 Data processing method and device based on AI chip

Country Status (1)

Country Link
CN (1) CN108985451B (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111382857A (en) * 2018-12-29 2020-07-07 上海寒武纪信息科技有限公司 Task processing device, neural network processor chip, combination device and electronic equipment
CN111861852A (en) * 2019-04-30 2020-10-30 百度时代网络技术(北京)有限公司 Method and device for processing image and electronic equipment
CN112513817A (en) * 2020-08-14 2021-03-16 华为技术有限公司 Data interaction method of main CPU and NPU and computing equipment
CN114723033A (en) * 2022-06-10 2022-07-08 成都登临科技有限公司 Data processing method, data processing device, AI chip, electronic device and storage medium
WO2023124361A1 (en) * 2021-12-30 2023-07-06 上海商汤智能科技有限公司 Chip, acceleration card, electronic device and data processing method

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150002682A1 (en) * 2010-02-26 2015-01-01 Bao Tran High definition camera
CN107562660A (en) * 2017-08-29 2018-01-09 深圳普思英察科技有限公司 A kind of vision SLAM on-chip system and data processing method

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150002682A1 (en) * 2010-02-26 2015-01-01 Bao Tran High definition camera
CN107562660A (en) * 2017-08-29 2018-01-09 深圳普思英察科技有限公司 A kind of vision SLAM on-chip system and data processing method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
XU TINGFA ET AL.: "Gabor Wavelet Neural Network Image Target Recognition Based on a Multi-DSP Hybrid Structure", Journal of Beijing Institute of Technology *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111382857A (en) * 2018-12-29 2020-07-07 上海寒武纪信息科技有限公司 Task processing device, neural network processor chip, combination device and electronic equipment
CN111382857B (en) * 2018-12-29 2023-07-18 上海寒武纪信息科技有限公司 Task processing device, neural network processor chip, combination device and electronic equipment
CN111861852A (en) * 2019-04-30 2020-10-30 百度时代网络技术(北京)有限公司 Method and device for processing image and electronic equipment
CN112513817A (en) * 2020-08-14 2021-03-16 华为技术有限公司 Data interaction method of main CPU and NPU and computing equipment
CN112513817B (en) * 2020-08-14 2021-10-01 华为技术有限公司 Data interaction method of main CPU and NPU and computing equipment
WO2023124361A1 (en) * 2021-12-30 2023-07-06 上海商汤智能科技有限公司 Chip, acceleration card, electronic device and data processing method
CN114723033A (en) * 2022-06-10 2022-07-08 成都登临科技有限公司 Data processing method, data processing device, AI chip, electronic device and storage medium
CN114723033B (en) * 2022-06-10 2022-08-19 成都登临科技有限公司 Data processing method, data processing device, AI chip, electronic device and storage medium

Also Published As

Publication number Publication date
CN108985451B (en) 2020-08-04

Similar Documents

Publication Publication Date Title
CN108985451A (en) Data processing method and equipment based on AI chip
WO2020248376A1 (en) Emotion detection method and apparatus, electronic device, and storage medium
CN111914937A (en) Lightweight improved target detection method and detection system
US20200293878A1 (en) Handling categorical field values in machine learning applications
US11423307B2 (en) Taxonomy construction via graph-based cross-domain knowledge transfer
CN110851255B (en) Method for processing video stream based on cooperation of terminal equipment and edge server
US11967150B2 (en) Parallel video processing systems
TW201633181A (en) Event-driven temporal convolution for asynchronous pulse-modulated sampled signals
CN110852295B (en) Video behavior recognition method based on multitasking supervised learning
CN111950700A (en) Neural network optimization method and related equipment
CN110298296A (en) Face identification method applied to edge calculations equipment
CN111488813B (en) Video emotion marking method and device, electronic equipment and storage medium
CN111176442A (en) Interactive government affair service system and method based on VR virtual reality technology
CN112528108B (en) Model training system, gradient aggregation method and device in model training
CN110600020A (en) Gradient transmission method and device
Hoai et al. An attention-based method for action unit detection at the 3rd abaw competition
CN113095506A (en) Machine learning method, system and medium based on end, edge and cloud cooperation
CN108924145A (en) Network transfer method, device and equipment
WO2023124361A1 (en) Chip, acceleration card, electronic device and data processing method
CN114391260A (en) Character recognition method and device, storage medium and electronic equipment
CN116778527A (en) Human body model construction method, device, equipment and storage medium
CN110491372A (en) A kind of feedback information generating method, device, storage medium and smart machine
CN109271637A (en) A kind of semantic understanding method and device
CN109246331A (en) A kind of method for processing video frequency and system
KR102574434B1 (en) Method and apparatus for realtime construction of specialized and lightweight neural networks for queried tasks

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20231023

Address after: Building 2540, Building B7 #~B11 #, Phase II and Phase III, Central Mansion B (Greenland International Plaza), Xinli, No. 1088 Nanhuan City Road, Nanguan District, Changchun City, Jilin Province, 130022

Patentee after: Jilin Huaqingyun Technology Group Co.,Ltd.

Address before: 100085 Baidu Building, 10 Shangdi Tenth Street, Haidian District, Beijing

Patentee before: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY Co.,Ltd.
