CN116383657A - Shadow data enhancement method and device, training method, storage medium and chip - Google Patents


Info

Publication number
CN116383657A
CN116383657A
Authority
CN
China
Prior art keywords
shadow
data
pixel
noise
event
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310377862.5A
Other languages
Chinese (zh)
Inventor
Wang Kai (王凯)
Xing Yannan (邢雁南)
Qiao Ning (乔宁)
Hu Yalun (胡雅伦)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengdu Shizhi Technology Co ltd
Shenzhen Shizhi Technology Co ltd
Original Assignee
Chengdu Shizhi Technology Co ltd
Shenzhen Shizhi Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chengdu Shizhi Technology Co ltd, Shenzhen Shizhi Technology Co ltd filed Critical Chengdu Shizhi Technology Co ltd
Priority to CN202310377862.5A priority Critical patent/CN116383657A/en
Publication of CN116383657A publication Critical patent/CN116383657A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computational Linguistics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a shadow data enhancement method and device, a training method, a storage medium and a chip. To address the problem that, in real environments, shadow noise causes an image sensor to generate unintended shadow data, the invention adds shadow data to sample data before training a spiking neural network. The method comprises: determining a pixel region satisfying a preset condition based on the sample data, to obtain a target range; and selecting at least one pixel point outside the target range as a shadow center point for shadow data enhancement. In this way, diverse training data containing shadow noise is obtained, so that the spiking neural network can adapt to complex and variable real working environments with better accuracy and stability. The invention is applicable to the fields of event cameras and brain-inspired computing.

Description

Shadow data enhancement method and device, training method, storage medium and chip
Technical Field
The invention relates to a shadow data enhancement method and device, a training method, a storage medium and a chip, and in particular to a data enhancement method, training method, storage medium and chip that give a spiking neural network higher accuracy, robustness and generality.
Background
The spiking neural network (SNN) is known as the third-generation neural network. It imitates the operating principles of the brain, has event-driven characteristics and rich spatio-temporal dynamics, and offers low computational cost and low power consumption. Notably, neuromorphic hardware and brain-inspired chips use non-von Neumann architectures that do not perform conventional mathematical/procedural function computations driven by a computer program.
Event-based imaging devices are a new class of bio-inspired vision sensors, i.e. neuromorphic vision sensors such as event cameras, dynamic vision sensors (DVS, DAVIS), and fusion sensors based on event imaging; the event camera is used as the example below, without limitation. Unlike a conventional frame image sensor (such as an APS sensor), an event camera does not capture images at a fixed rate: each pixel operates independently and, according to the perceived change in light, outputs an ON event (light intensity increase) or an OFF event (light intensity decrease) when the change exceeds a certain threshold, as described in prior art 1: EP3731516A1.
The event camera captures change/motion information in the scene. An output event typically includes the timestamp of event generation (accurate to µs/ns), the pixel coordinates (x, y) at which the event was generated, and the polarity of the event (light intensity becoming brighter or darker, or the value of the pixel's photovoltage, i.e. a gray-scale value); in some cases the polarity can be ignored. Because the event camera generates events from light-intensity changes, its output carries polarity but not absolute intensity. An artificial or deep neural network cannot analyse the cause of an event, so spurious events can provide wrong feature information and degrade training results.
A sensing-and-computing scheme that combines an event camera (or a spike sequence obtained from frame images via frame differencing) with a spiking neural network (SNN) can provide an integrated intelligent perception-computation solution with low power consumption (down to milliwatts) and high real-time performance (down to microseconds), applicable to terminal scenarios such as edge computing and the Internet of Things, realising intelligent terminals without a network connection. Event streams generated by event/spike-based imaging devices are currently the type of dataset best suited to SNN applications. However, existing event-camera datasets, such as N-MNIST and DVS-Gesture, are relatively small in scale and cover few application scenes; developing SNN training datasets that are large-scale or fit actual usage scenarios is a difficult problem in industry.
The imaging principle of the event camera is that each pixel receives light from a corresponding position in physical space and generates event data based on light changes. It is highly sensitive to light variation and therefore easily captures minor external disturbances as noise events. In particular, when a user is illuminated by a light source, shadows may be cast on the floor the user stands on, a nearby desk surface, a wall, a mirror, and so on; if those shadows fall within the field of view of the event camera, the shadows on the floor or wall also generate event data whenever the user performs an action. The position and size of the shadow are affected by the distances and angles among the user, the light source and the sensor (also called the lens). Because shadows appear very commonly in the field of view in practical application scenarios, the inference accuracy of neuromorphic hardware configured with network parameters trained on such datasets is not sufficiently stable or robust, and inference errors easily occur. Moreover, given the complexity and variability of the environment, those skilled in the art expect neuromorphic hardware to retain the accuracy of the trained model in real environments and to adapt well to them; since recorded training data cannot cover all possible shadow positions and sizes, shadow-related data enhancement must be applied to the training data to obtain diverse shadow-containing training data, so that the model gains better robustness and environmental adaptability.
Disclosure of Invention
In order to solve or alleviate some or all of the above technical problems, the present invention is implemented by the following technical solutions:
a shadow data enhancement method, consider the influence of shadow noise, increase the shadow data in the sample data, comprising the following steps:
determining a pixel area meeting preset conditions based on sample data to obtain a target range;
selecting at least one pixel point outside the target range as a shadow center point to enhance the shadow data;
wherein the preset condition is one of the following conditions:
i) within a continuous region, the pixel value of any pixel unit, or the number of spike events it generates, is greater than or equal to a first threshold;
ii) within the continuous region, the number of pixel units whose pixel value or number of generated spike events is greater than or equal to the first threshold is itself greater than or equal to a second threshold.
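As a non-authoritative sketch of these two claimed steps, assuming an event frame stored as a 2-D NumPy array of per-pixel event counts (the function names, threshold values and test frame below are illustrative, not from the patent):

```python
import numpy as np

def find_target_range(frame, first_threshold):
    """Boolean mask of pixels whose pixel value / event count reaches
    the first threshold (condition i)."""
    return frame >= first_threshold

def pick_shadow_center(frame, first_threshold, seed=None):
    """Select one pixel point outside the target range as the shadow
    center point (uniformly at random, one possible policy)."""
    rng = np.random.default_rng(seed)
    outside = np.argwhere(~find_target_range(frame, first_threshold))
    y, x = outside[rng.integers(len(outside))]
    return int(y), int(x)

frame = np.zeros((8, 8), dtype=int)
frame[2:5, 2:5] = 10                      # simulated "real action" blob
cy, cx = pick_shadow_center(frame, first_threshold=5, seed=0)
assert frame[cy, cx] < 5                  # center lies outside the target range
```

Any selection rule that keeps the center outside the target range would satisfy the claim; random choice simply maximises the diversity of the generated shadows.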
In some embodiments, a pixel region generating shadow noise is obtained based on a preset shadow size and/or shape, centred on the shadow center point; the pixel values of all pixel units in that region, or their numbers of generated shadow events, are set to a first value, and the region is rotated by a preset angle to obtain the shadow data.
In other embodiments, the pixel region generating shadow noise is obtained in the same way, but the pixel values (or numbers of generated shadow events) of more than half of its pixel units, chosen at random, are set to the first value before the region is rotated by a preset angle to obtain the shadow data.
In certain embodiments, the first value is less than the first threshold.
In certain embodiments, the first value is greater than or equal to a third threshold.
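A minimal sketch of the random-fill embodiment above, assuming a rectangular shadow region and restricting the preset rotation angle to multiples of 90 degrees for simplicity (the function name and fill ratio are assumptions; the patent does not fix them):

```python
import numpy as np

def make_shadow_patch(height, width, first_value, fill_ratio=0.6,
                      quarter_turns=1, seed=None):
    """Rectangular shadow-noise region: set a random subset of pixels
    (aiming for more than half, per the preferred embodiment) to
    `first_value`, then rotate by a preset angle (k * 90 degrees here)."""
    rng = np.random.default_rng(seed)
    patch = np.zeros((height, width), dtype=int)
    mask = rng.random((height, width)) < fill_ratio   # ~60% of pixels
    patch[mask] = first_value
    return np.rot90(patch, quarter_turns)

patch = make_shadow_patch(4, 6, first_value=3, seed=0)
assert patch.shape == (6, 4)              # rotated by 90 degrees
assert (patch[patch > 0] == 3).all()      # shadow pixels carry the first value
```

Per the surrounding claims, `first_value` should be chosen below the first threshold (so the patch is never mistaken for a real action) and at or above a third threshold (so it is not filtered out as background).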
In some embodiments, if the sample data is not in event frame format, it is preprocessed and converted into event frames; the target range is then obtained based on the pixel values of the pixel units in the event frame, or the numbers of spike events they generate.
In some embodiments, the shadow size, and hence the pixel area producing shadow noise, is adjusted in one of the following ways:
i) the size of the pixel region generating shadow noise is inversely proportional to the distance between the shadow and the event camera, where the distance is a horizontal and/or vertical coordinate distance;
ii) the size of the pixel region generating shadow noise is inversely proportional to the size of the light source;
iii) the size of the pixel region generating shadow noise is inversely proportional to how perpendicular the user, action or object is to the reflective surface.
In some embodiments, the magnitude of the first value is proportional to the motion speed of the user, action or object.
In some embodiments, the number of pixel units set to the first value within the shadow-noise-producing pixel region is proportional to the motion speed of the user, action or object.
In some embodiments, the shadow data is added to the event frame data corresponding to the sample data to complete the shadow data enhancement, or the pixel values of the pixel units at the coordinates of the shadow data are updated directly in the sample data.
In certain embodiments, the shape is circular or rectangular.
In some embodiments, at least one pixel point to the left of, to the right of, or below the target range, within the field of view, is selected as the shadow center point.
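The enhancement step, adding shadow data to the event frame corresponding to a sample, might look like the following sketch (a hypothetical helper; clipping the patch at the frame border is one possible policy, not specified by the patent):

```python
import numpy as np

def add_shadow(frame, patch, center):
    """Overlay a shadow-noise patch onto an event frame, centred on the
    chosen shadow center point; the patch is clipped at frame borders."""
    out = frame.copy()
    h, w = patch.shape
    cy, cx = center
    # frame-side window, clipped to valid coordinates
    y0, x0 = max(cy - h // 2, 0), max(cx - w // 2, 0)
    y1 = min(cy - h // 2 + h, out.shape[0])
    x1 = min(cx - w // 2 + w, out.shape[1])
    # matching patch-side window
    py0, px0 = y0 - (cy - h // 2), x0 - (cx - w // 2)
    out[y0:y1, x0:x1] += patch[py0:py0 + (y1 - y0), px0:px0 + (x1 - x0)]
    return out

frame = np.zeros((10, 10), dtype=int)
patch = np.full((3, 3), 2)                # uniform shadow patch, first value 2
shadowed = add_shadow(frame, patch, center=(1, 1))
assert shadowed.shape == frame.shape and shadowed.sum() == 18
```

Adding event counts (rather than overwriting) mimics shadow noise superimposed on whatever the pixel already recorded; overwriting pixel values directly, as the alternative embodiment describes, would replace the `+=` with an assignment.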
A shadow data enhancement apparatus, comprising:
a judging module, configured to determine a pixel region satisfying a preset condition based on the sample dataset, to obtain a target range;
a selection module, coupled to the judging module, which selects a pixel point outside the target range as the shadow center point and selects a shadow size and shape;
an enhancement module, coupled to the selection module, which performs the shadow data enhancement centred on the shadow center point, based on the parameters determined by the selection module;
wherein the preset condition is one of the following conditions:
i) within a continuous region, the pixel value of any pixel unit, or the number of spike events it generates, is greater than or equal to a first threshold;
ii) within the continuous region, the number of pixel units whose pixel value or number of generated spike events is greater than or equal to the first threshold is itself greater than or equal to a second threshold.
In some embodiments, the enhancement module obtains a pixel region generating shadow noise, centred on the shadow center point, based on the parameters determined by the selection module; it sets the pixel values (or numbers of generated shadow events) of more than half of the pixel units in that region to a first value, and rotates the region by a preset angle to obtain the shadow data.
In certain embodiments, the first value is less than the first threshold.
In certain embodiments, the first value is greater than or equal to a third threshold.
In some embodiments, the magnitude of the first value is proportional to the motion speed of the user, action or object.
In some embodiments, the number of pixel units set to the first value within the shadow-noise-producing pixel region is proportional to the motion speed of the user, action or object.
In some embodiments, if the sample data is not in event frame format, the shadow data enhancement apparatus further comprises a preprocessing module, coupled between the sample data and the judging module, for converting the sample data into event frames.
A spiking neural network training method: based on the shadow data enhancement method described above, data enhancement is applied at least once to the sample data of a spiking neural network training set, and the spiking neural network is trained on the enhanced training set.
A storage medium having computer code stored thereon which, when executed, implements the shadow data enhancement method described above.
A chip comprising a spiking neural network processor deployed with the optimal configuration parameters obtained using the spiking neural network training method described above.
In certain embodiments, the chip is a brain-inspired or neuromorphic chip with an event-triggering mechanism.
In certain embodiments, the chip includes an image sensor, either integrated with the spiking neural network processor or coupled to it through an interface.
An electronic product provided with a chip as described above.
Some or all embodiments of the present invention have the following beneficial technical effects:
1) By performing data enhancement on the training data to obtain shadow-containing training data, the spiking neural network can adapt to complex and variable real working environments with better accuracy and stability.
2) The method is simple and easy to implement, and effectively addresses the fact that shadows are present in the sensor's field of view in practical application scenarios.
3) The invention is advantageous when the amount of training data is small: rich and diverse training data is obtained based on the relative position of the user and the event camera, the relative positions (distances and angles) of the user, the light source and the reflective surface, and the motion speed.
Further advantageous effects are described in the preferred embodiments.
The technical solutions/features above summarise those described in the detailed description, so the scopes described may not be exactly the same. However, the new solutions disclosed in this section are also part of the many solutions disclosed in this document; the technical features disclosed here, those disclosed in the detailed description below, and content in the drawings not explicitly described in the specification disclose further solutions in reasonable combination with one another.
The technical solutions formed by combining technical features disclosed anywhere in the invention are used to support generalisation of the technical solutions, amendment of the patent document, and disclosure of the technical solutions.
Drawings
FIG. 1 is a schematic diagram illustrating the shadow phenomenon according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of spiking neural network training and parameter deployment mapping according to an embodiment of the present invention;
FIG. 3 is a diagram illustrating shadow data in various situations in an embodiment of the invention;
FIG. 4 is a diagram of shadow data in another embodiment of the invention;
FIG. 5 is a flow chart of shadow data enhancement in an embodiment of the invention;
FIG. 6 is a diagram of several examples of shadow data enhancement in accordance with an embodiment of the present invention;
FIG. 7 is a diagram of shadow data enhancement in accordance with a preferred embodiment of the present invention;
FIG. 8 is a block diagram of shadow data enhancement in an embodiment of the invention.
Detailed Description
Since the various alternatives cannot be exhaustively listed, the gist of the technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the drawings. Other technical solutions and details not disclosed below generally belong to technical objects or features achievable by conventional means in the art and, for reasons of space, are not described in detail.
Except where it denotes division, any "/" in this disclosure means a logical "or". The ordinal numbers "first", "second", etc. anywhere in the present invention are used merely as distinguishing labels in the description; they imply no absolute order in time or space, nor that a term preceded by one ordinal is necessarily different from the same term preceded by another.
The present invention is described in terms of various elements used in various combinations of embodiments, which elements are to be combined into various methods and products. Even where only the gist of a method/product scheme is described, the corresponding product/method scheme is meant to explicitly include the technical feature.
The description of a step, module or feature at any position in the disclosure does not imply that it is the only step or feature present; other embodiments may be implemented by those skilled in the art using other technical means in accordance with the disclosed technical solutions. The embodiments are generally disclosed as preferred embodiments, but this does not mean that embodiments contrary to the preferred ones fall outside the invention, so long as such embodiments address at least one technical problem solved by the present invention. Based on the gist of the specific embodiments, a person skilled in the art may substitute, delete, add, combine or reorder certain technical features to obtain a technical solution that still follows the inventive concept. Such solutions, which do not depart from the technical idea of the invention, also fall within its scope of protection.
Spiking neural network (Spiking neural network, SNN): a third-generation artificial neural network and the model executed by event-driven neuromorphic chips, with rich spatio-temporal dynamics, diverse coding mechanisms, event-driven characteristics, low computational cost and low power consumption. The type of spiking neural network is not limited here; it may be built according to the actual application scenario, e.g. a spiking convolutional neural network (SCNN) or a spiking recurrent neural network (SRNN), as long as the spike-signal- or event-driven network is suitable for the method provided by the embodiments of the invention.
Event camera: an event-driven image sensor, also known as a dynamic vision sensor (DVS). Some schemes fuse this principle with the pixels of a conventional frame image, yielding sensors that can output both events and pixel brightness, such as DAVIS and ATIS sensors. These event-based sensors (EBS) are collectively referred to as event imaging devices in the present invention and belong to the class of sensors. The invention discloses the shadow data enhancement scheme using the event camera as an example.
FIG. 1 is a schematic diagram of the shadow phenomenon in an embodiment of the present invention, covering both sample data and real-time data acquired in a real environment where shadows exist. Sample data is usually not rich enough: it is recorded in a specific scene, the data is typical, and the imaging quality is usually good. In an actual indoor environment, however, shadows of user actions are ubiquitous, and their presence causes the event camera to generate undesired events (or spikes) such as the shadow data shown in the figure. These undesired events present false "motion" scenes or result in very poor imaging quality, which in turn increases the difficulty of subsequent information processing and degrades its capability.
Because neuromorphic datasets for SNN training are still in their infancy, current SNN sample datasets are not only small but also target only a few single tasks, while datasets recorded for a given task are further limited by sample, funding and time constraints. Through shadow data enhancement, the invention on the one hand expands the training data into a rich dataset, and on the other hand makes the dataset fit the actual usage scene (where the action and the light source may cross). Configuration parameters trained on such a dataset can cope with complex task scenarios, generalise better, and further improve the inference accuracy and stability of the SNN or neuromorphic hardware.
FIG. 2 is a schematic diagram of spiking neural network training and parameter deployment mapping in an embodiment of the present invention: raw sample data undergoes shadow data enhancement to obtain a dataset containing shadow data, which is then used to train the spiking neural network. Because the position and size of a shadow depend on the action and on the relative positions of the light source and the sensor, the invention enhances shadow data for a variety of situations, obtaining multiple datasets and using one or more of them to train the spiking neural network. Training seeks the optimal network configuration parameters for given sample data (training or test set), such that, for any given input, the SNN outputs a result matching the input sample. Note that a chip or neuromorphic hardware configured with a spiking neural network does not follow the traditional von Neumann architecture, so there is no concept of "instructions" (a computer program) either. Typically, SNN training and inference run on different devices: training runs on a training device (e.g. a high-performance CPU or GPU), where the SNN simulates the on-chip SNN, and the trained configuration parameters are mapped onto the on-chip SNN. Alternatively, training and execution can run on the same device, e.g. neuromorphic hardware with a training mode and an inference mode performing on-chip learning.
Numerous tests with real chips show that shadow events can be broadly divided into two shapes: circular (including oval) and rectangular (including square).
In general, the positions of the light source and the reflective surface are fixed, and both the motion and the shadow are assumed to lie within the field of view of the event camera. For a small (or point) light source, the motion may completely block the light source; the shadow is then large and relatively blurred, i.e. the pixel area generating shadow noise events is large but each pixel generates few shadow noise events. As the distance between the motion and the light source increases, the shadow gradually shrinks and sharpens, i.e. the pixel area generating shadow noise events becomes small while each pixel generates more of them. For a large light source, shadows are less likely than for a small one, and under the same conditions (e.g. the same hand-to-light-source distance and angle) the extent and sharpness of the shadow image are smaller than under a small light source.
In addition, experiments show that the shadow phenomenon also exists in event frames or spike sequences converted from difference-frame images generated by a conventional frame image sensor: the action is imaged on the reflective surface, changing the light intensity at the corresponding position and thus affecting the pixel values at the shadow location.
FIG. 3 is a schematic diagram of shadow data under different situations in an embodiment of the present invention. FIG. 3(a) shows the imaging effect after framing the events generated by the event camera when no shadow exists. FIGS. 3(b) and 3(c) show the imaging effect when the light source is in front of the user, the shadow is cast on the wall, and the arm is swung at different speeds: the wall shadow causes the event camera to generate shadow noise events, i.e. shadow data, and these undesired data severely degrade imaging quality and hence the difficulty and capability of subsequent information processing. FIG. 3(b) shows the framed shadow noise events at a slower swing speed, and FIG. 3(c) at a faster speed.
FIG. 4 is a diagram of shadow data in another embodiment of the invention. Shadow data, also called shadow noise data, includes information such as the size and shape of the pixel region in which shadow noise events are generated and the number of shadow noise events generated within that region. Shadow data is constrained not only by the relative distance between the shadow and the event camera, but also by the light source size, the relative distances and angles among the hand, the light source and the reflective surface, and the speed/intensity of the motion. Other things being equal, the influence of these environmental factors is illustrated as follows:
Similar to the imaging of the user/action itself, within the field of view the farther the shadow is from the event camera, the smaller the framed shadow noise image, and vice versa; the relative distance here is a horizontal and/or vertical coordinate distance.
The farther the user/action is from the light source, the smaller and sharper the shadow on the reflective surface: the pixel range of shadow noise generated by the event camera shrinks while events within the noise region become denser (i.e. each pixel unit generates more shadow noise events). Conversely, the closer the user is to the light source, the larger the shadow noise range and the sparser the events within it.
The larger the light source, the smaller the pixel range producing shadow noise, and vice versa.
The shadow noise range is largest when the user's action/hand is parallel to the reflective surface; as the motion becomes more perpendicular to the surface, the range shrinks.
The faster the motion, the faster the light changes and the more numerous and denser the events in the noise region, as shown in FIG. 3(c); as the motion slows, the light changes more slowly and the events in the noise region become fewer and sparser, as shown in FIG. 3(b).
In addition, the pixel points within the pixel region corresponding to the shadow generate roughly the same number of noise events (or pixel values) as one another, and this number is smaller than the number of valid events (or pixel values) generated by pixel points in the region corresponding to the real action.
FIG. 5 is a flow chart of shadow data augmentation in an embodiment of the present invention, comprising the steps of:
s101, preprocessing.
The sample data set may be a set of difference frame images or an event stream. If the sample data is an event stream, the event stream data is framed. If the sample data is in difference frame form, it is converted into event frame data. If the sample data is already event frame data, this step is skipped.
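One plausible way to convert difference-frame data into event frames — the patent does not prescribe a formula, so the thresholding rule and all names below are assumptions for illustration:

```python
import numpy as np

def diff_frame_to_event_frame(diff, threshold):
    """Convert a difference-frame image into a two-channel event frame.

    Channel 0 marks pixels whose intensity rose by at least `threshold`
    (ON events); channel 1 marks pixels whose intensity fell by at least
    `threshold` (OFF events). This thresholding rule is an assumption,
    not the patent's prescribed conversion.
    """
    on = (diff >= threshold).astype(np.int32)
    off = (diff <= -threshold).astype(np.int32)
    return np.stack([on, off])
```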
For an event stream generated directly by an event camera or the like, the events within a time window are compressed to generate an event frame; an event frame is thus an aggregation of the events within a time window. An event frame may be single-channel or two-channel. A single-channel event frame ignores event polarity and stacks all time-stamped events of each pixel within the time window, or alternatively stacks only the ON events (light intensity increase) or only the OFF events (light intensity decrease) of each pixel. A two-channel event frame stacks all time-stamped events of each pixel within the time window separately by event polarity. The present invention is not limited in the manner in which event frames are generated.
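As a concrete illustration (not part of the patent — the function name and the `[x, y, t, polarity]` event layout are assumptions), single- and two-channel event-frame generation could be sketched as:

```python
import numpy as np

def events_to_frame(events, height, width, mode="single"):
    """Compress the events of one time window into an event frame.

    `events` is an (N, 4) array of [x, y, t, polarity] rows with
    polarity in {+1, -1}. "single" ignores polarity and counts all
    events per pixel; "two_channel" keeps separate ON/OFF channels.
    """
    if mode == "single":
        frame = np.zeros((height, width), dtype=np.int32)
        for x, y, _, _ in events:
            frame[int(y), int(x)] += 1          # stack all events, polarity ignored
        return frame
    elif mode == "two_channel":
        frame = np.zeros((2, height, width), dtype=np.int32)
        for x, y, _, p in events:
            ch = 0 if p > 0 else 1              # channel 0: ON, channel 1: OFF
            frame[ch, int(y), int(x)] += 1
        return frame
    raise ValueError(f"unknown mode: {mode}")
```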
S102, determining a pixel area that satisfies a preset condition to obtain a target range. For example, a set of pixels satisfying the preset condition is searched for on the event frame data derived from the sample data set.
In one embodiment, the target range is determined based on the pixel values of the pixels: a continuous area in which the pixel value of each pixel unit, or the number of events it generates, is greater than or equal to a first threshold is searched for on the event frame data to obtain the target range. The target range is the pixel area corresponding to the actual action captured by the event camera.
In a preferred embodiment, within an area of adjacent addresses (horizontal or/and vertical coordinates), a set of pixels whose pixel values are greater than or equal to the first threshold and whose count is greater than or equal to a second threshold constitutes the target range.
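A minimal sketch of the target-range search described above, assuming 4-connected adjacency and a NumPy event frame of per-pixel event counts (names and the "largest region wins" tie-break are illustrative choices, not fixed by the patent):

```python
import numpy as np
from collections import deque

def find_target_range(frame, first_threshold, second_threshold=1):
    """Return the largest 4-connected set of pixels whose event count is
    >= first_threshold, provided the set contains at least
    second_threshold pixels; otherwise return an empty list.
    Coordinates are (row, col)."""
    mask = frame >= first_threshold
    seen = np.zeros_like(mask, dtype=bool)
    best = []
    h, w = frame.shape
    for r in range(h):
        for c in range(w):
            if mask[r, c] and not seen[r, c]:
                # breadth-first flood fill over the adjacent-address region
                region, q = [], deque([(r, c)])
                seen[r, c] = True
                while q:
                    y, x = q.popleft()
                    region.append((y, x))
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < h and 0 <= nx < w and mask[ny, nx] and not seen[ny, nx]:
                            seen[ny, nx] = True
                            q.append((ny, nx))
                if len(region) > len(best):
                    best = region
    return best if len(best) >= second_threshold else []
```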
S103, selecting a region in which to add shadow data based on the target range, and performing shadow data enhancement.
A pixel point at any coordinate on the left, right, or lower side of the target range may serve as the center point for shadow data enhancement. In addition, to increase the amount of sample data, one or more pixel units at any position outside the target range may be selected as center points for shadow data enhancement. The embodiments take a single shadow center point as an example, but are not limited thereto.
Taking the shadow center point as the center, the pixel values of all pixel units in the pixel area corresponding to the shadow (also referred to herein simply as the shadow area), or the numbers of shadow events they generate, are set to a first value based on a preset shadow size or/and shape, and the shadow area is then rotated by a preset angle to generate a set of shadow data. In other embodiments, the pixel values, or numbers of generated pulse events, of more than half of the pixel units in the shadow area are randomly set to the first value, and the shadow area is rotated by a preset angle to obtain the shadow data.
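The shadow-generation step above — pick a center, fill a preset size/shape with a first value, rotate by a preset angle — could be sketched as follows (all names are illustrative; nearest-pixel rounding after rotation is an assumption):

```python
import math
import random

def make_shadow(center, size, value, angle_deg, fill_ratio=1.0, seed=None):
    """Generate shadow data as a dict mapping pixel (row, col) -> count.

    A `size` = (height, width) rectangle is laid out around `center`,
    roughly `fill_ratio` of its pixels receive the first value `value`,
    and every pixel offset is rotated by `angle_deg` about the center.
    """
    rng = random.Random(seed)
    cy, cx = center
    h, w = size
    theta = math.radians(angle_deg)
    shadow = {}
    for dy in range(-(h // 2), h - h // 2):
        for dx in range(-(w // 2), w - w // 2):
            if rng.random() > fill_ratio:   # randomly leave some pixels unset
                continue
            # rotate the offset about the shadow center point
            ry = round(dy * math.cos(theta) + dx * math.sin(theta))
            rx = round(-dy * math.sin(theta) + dx * math.cos(theta))
            shadow[(cy + ry, cx + rx)] = value
    return shadow

def apply_shadow(frame, shadow):
    """Write the shadow noise events into a sample event frame (in place)."""
    h, w = frame.shape
    for (y, x), v in shadow.items():
        if 0 <= y < h and 0 <= x < w:
            frame[y, x] = v
    return frame
```

`fill_ratio=0.5` would correspond to the variant in which only about half of the shadow-area pixels are set to the first value.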
The shadow data may first be stored and later added to the sample data map to complete the shadow data enhancement, or the sample data map may be updated directly to obtain an enhanced data set.
By adjusting the size or shape parameters of the shadow noise, a new set of shadow data is obtained. Shadow data enhancement is performed on the sample data of the training set using one or more groups of shadow data corresponding to different shadow sizes or shapes, yielding one or more new sample data sets.
Since the distance between the shadow and the event camera is inversely related to the size of the shadow data, the farther apart the shadow and the event camera are, the smaller the pixel area of the shadow noise data, and vice versa. Moreover, factors such as light sources of different sizes, the relative distance and angle between the light source and the hand, and occlusion of the light source all affect the shadow's shape, size, and sharpness. The invention therefore randomly selects the center point or/and sets the shadow size or/and shape to simulate a variety of usage scenarios.
The shadow shape may be any shape, such as a circle (including a quasi-circle) or a rectangle (square or oblong). For simplicity of operation, the shadow shape in the present invention is preferably a square or rectangle, but is not limited thereto.
In some embodiments, the shadow is set as a rectangle whose width and height are randomly selected from preset width and height parameters; the pixel values, or numbers of generated shadow noise events, of at least half of the pixel units in the rectangular area are randomly set to the same first value, and the rectangle is rotated by a preset angle to generate the shadow data. The invention does not restrict the order of the value-setting step and the rotation step; they may even be performed in parallel.
Taking the sample data shown in FIG. 3(a) as an example: if a larger first threshold is selected, the target range it determines is the arm region with obvious motion. A position below the target range is selected as the center point for shadow data enhancement, the size of a rectangular shadow is set, the pixel values of all pixels in the rectangular region are set to a first value, and the rectangle is rotated by a preset angle to generate the shadow data. Similarly, if a smaller first threshold is selected, the target range it determines is the region occupied by the entire human figure; a position at the lower right outside the target range is then selected as the center point, the rectangular shadow size is set, the pixel values of all pixels in the rectangle are set to the first value, and the rectangle is rotated by a preset angle to generate the shadow data. In a preferred embodiment, the first value is less than the first threshold.
The size of the rectangle is adjusted to simulate scenes with shadows of different sizes, and the magnitude of the first value is adjusted to simulate scenes with shadows of different intensities. In some embodiments, the first value is greater than or equal to a third threshold. Preferably, the first value is greater than or equal to the third threshold and less than the first threshold.
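Sweeping rectangle sizes and first values to simulate shadows of different sizes and intensities might look like this (a sketch with assumed names; rotation is omitted for brevity):

```python
import itertools

def sweep_shadow_variants(frame, center, widths, heights, values):
    """Yield one augmented copy of `frame` per (width, height, value)
    combination. Bigger rectangles simulate larger shadows; bigger first
    values simulate stronger shadows (more noise events per pixel)."""
    cy, cx = center
    for w, h, v in itertools.product(widths, heights, values):
        out = frame.copy()
        y0, x0 = max(cy - h // 2, 0), max(cx - w // 2, 0)
        out[y0:cy + (h - h // 2), x0:cx + (w - w // 2)] = v  # stamp the shadow block
        yield out
```

Each yielded frame is one enhanced sample, so a single sweep over the parameter grid produces the "one or more new sample data sets" described above.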
FIG. 6 shows several examples of shadow data enhancement according to embodiments of the present invention. The invention determines a set of pixel points satisfying a preset condition, i.e., a target range, and selects one or more pixel units at any position outside the target range as center points for shadow data enhancement. The size and shape of the shadow noise are set based on the relative position (distance and angle) of the light source and the distance between the shadow and the event camera; the same pixel value is set for all pixels within the shadow-shaped (e.g., rectangular) area centered on the shadow center point, and the shadow area is rotated by an angle to generate the shadow data. FIGS. 6(a) to 6(d) show shadow data enhancement performed under different situations. Training data that has undergone shadow enhancement not only conforms to the shadow phenomena present in actual use but also covers a variety of possible situations, improving the diversity of the training data and giving the trained network better robustness and generality.
FIG. 7 illustrates shadow data enhancement in a preferred embodiment of the present invention. An event generated by an event camera includes coordinate information and time information, i.e., the pixel coordinates at which the event was generated and the time stamp of its generation. The two-dimensional coordinates (x, y) indicate a two-dimensional sensor, but the event-generating sensor may also be one-dimensional, such as an audio sensor or a vibration sensor, or of higher dimension; the invention is not limited in this respect. For example, the pixel at coordinates (x, y) generates event e at time t and event e-1 at time t-1. The target range is obtained based on a preset condition, as shown in FIG. 7(a): the set of pixel points in which each pixel unit generates a number of events greater than or equal to a first threshold (here, 5) is taken as the target area. One pixel point outside the target area is selected as the center point for shadow data enhancement, the number of noise events generated by all pixel points in the rectangular area shown in FIG. 7(b) is set to 4, the rectangular area is rotated to obtain the shadow data, and the generated shadow data is added to the sample data to complete the shadow data enhancement.
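The FIG. 7 walkthrough can be mimicked numerically. The event counts below are invented for illustration; only the first threshold (5) and the per-pixel noise count (4) come from the embodiment, and rotation by a preset angle is omitted:

```python
import numpy as np

# Toy per-pixel event counts standing in for FIG. 7(a); the layout is
# invented, but the thresholds follow the embodiment.
frame = np.array([
    [0, 0, 0, 0, 0, 0],
    [0, 6, 7, 5, 0, 0],
    [0, 5, 9, 6, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
])

FIRST_THRESHOLD = 5   # pixel units with >= 5 events form the target range
NOISE_COUNT = 4       # shadow noise events per pixel, kept below the threshold

# Target range: every pixel whose event count meets the first threshold.
target = {(int(y), int(x)) for y, x in zip(*np.where(frame >= FIRST_THRESHOLD))}

# Choose a center point below the target range and stamp a 2x3 noise block.
cy, cx = 4, 2
frame[cy - 1:cy + 1, cx - 1:cx + 2] = NOISE_COUNT
```

Because the noise count stays below the first threshold, the added shadow region cannot be mistaken for the real-action target range by the same preset condition.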
The invention also relates to an impulse (spiking) neural network training method: the above data enhancement method is used to perform data enhancement at least once on the sample data of the impulse neural network training set, and the impulse neural network is then trained based on the enhanced training set.
FIG. 8 is a block diagram of a shadow data enhancing apparatus in an embodiment of the present invention, comprising a judging module, a selection module, and an enhancement module coupled in sequence. The judging module determines a pixel area satisfying a preset condition based on the sample data set to obtain a target range, for example by searching for a continuous area in which the pixel value of each pixel unit is greater than or equal to a first threshold. The selection module selects a pixel point outside the target range as the center point of the shadow noise and selects the shadow size and shape. The enhancement module, based on the parameters determined by the selection module, sets the same pixel value or the same number of shadow events around the shadow noise center point, rotates the pixel area in which the shadow noise is generated by a preset angle to generate shadow data, and adds the generated shadow data to the sample data to complete the shadow data enhancement.
In a preferred embodiment, the apparatus further includes a preprocessing module coupled between the sample data set and the judging module for converting the sample data into event frame data. If the sample data is already event frame data, this module is bypassed.
The invention also relates to a chip comprising an impulse neural network processor deployed with optimal configuration parameters obtained using the impulse neural network training method described above.
In a preferred embodiment, the chip is a brain-like chip or a neuromorphic chip having event-triggering characteristics.
In another preferred embodiment, the chip includes an event-imaging-based sensor and an impulse neural network processor, either integrated together or coupled through an interface. The chip is deployed with network configuration parameters obtained by training on multiple groups (e.g., 50 groups) of data sets with different shadow data enhancements; when the chip is used for sensing and computing in an actual application scenario, it therefore performs better and has better stability and adaptability.
In another preferred embodiment, the chip includes a frame image sensor that is integrated with the impulse neural network processor or coupled to it through an interface.
The invention also relates to an electronic product provided with the chip.
Although the present invention has been described with reference to specific features and embodiments, various modifications, combinations, and substitutions can be made without departing from the invention. The scope of the present application is not limited to the particular embodiments of the process, machine, manufacture, composition of matter, means, methods, and steps described in the specification; the methods and modules may be practiced in one or more associated, interdependent, or cooperating products, methods, and systems, or in preceding/subsequent stages.
The specification and drawings should accordingly be regarded as an introduction to some embodiments of the technical solutions defined by the appended claims, construed according to the broadest reasonable interpretation, and intended to cover, as far as possible, all modifications, changes, combinations, or equivalents within the scope of the disclosure, while avoiding unreasonable interpretation.
Those skilled in the art may further improve the technical solutions on the basis of the present invention to achieve better technical results or to meet the needs of particular applications. However, even where such a partial improvement or design is inventive or/and progressive, as long as it relies on the technical idea of the invention and covers the technical features defined in the claims, the technical solution falls within the protection scope of the present invention.
The features recited in the appended claims may be presented as alternative features, and the order of certain technical process steps or the sequence of material organization may be recombined. Having understood the present invention, those skilled in the art could readily change such process sequences or material organizations and then employ substantially the same means to solve substantially the same technical problem and achieve substantially the same technical result; such modifications, changes, and substitutions shall therefore be covered under the doctrine of equivalents, even where the claims define the features specifically.
The steps and components of the embodiments have been described above generally in terms of their functions to clearly illustrate the interchangeability of hardware and software. Whether the steps or modules described in connection with the embodiments disclosed herein are implemented in hardware, software, or a combination of both depends on the particular application and the design constraints imposed on the solution. Those of ordinary skill in the art may implement the described functionality in different ways for each particular application, but such implementations are not to be regarded as beyond the scope of the claimed invention.

Claims (10)

1. A shadow data enhancement method is characterized in that:
adding shadow data to the sample data taking into account the effect of shadow noise, comprising the steps of:
determining a pixel area meeting preset conditions based on sample data to obtain a target range;
selecting at least one pixel point outside the target range as a shadow center point to enhance the shadow data;
wherein the preset condition is one of the following conditions:
i) In the continuous area, the pixel value of any pixel unit or the number of generated pulse events is larger than or equal to a first threshold value;
ii) the number of pixel cells within the continuous region having a pixel value or number of generated pulse events greater than or equal to a first threshold is greater than or equal to a second threshold.
2. The shadow data enhancement method of claim 1, wherein:
taking the shadow center point as a center, and obtaining a pixel area generating shadow noise based on a preset shadow size or/and shape;
and randomly setting the pixel values of more than half of pixel units in the pixel area generating the shadow noise or the number of generated shadow events as a first value, and rotating the pixel area generating the shadow noise by a preset angle so as to obtain shadow data.
3. A shadow data enhancing method according to claim 1 or 2, wherein:
if the sample data is not in the event frame format, preprocessing the sample data and converting the sample data into an event frame; the target range is obtained based on the pixel values of the pixel cells in the event frame or the number of generated pulse events.
4. A shadow data enhancing apparatus, comprising:
the judging module is used for determining a pixel area meeting preset conditions based on the sample data set to obtain a target range;
the selection module is coupled with the judgment module, selects a pixel point outside the target range as a shadow center point, and selects a shadow size and a shadow shape;
the enhancement module is coupled with the selection module and is used for enhancing the shadow data by taking the shadow center point as the center based on the parameters determined by the selection module;
wherein the preset condition is one of the following conditions:
i) In the continuous area, the pixel value of any pixel unit or the number of generated pulse events is larger than or equal to a first threshold value;
ii) the number of pixel cells within the continuous region having a pixel value or number of generated pulse events greater than or equal to a first threshold is greater than or equal to a second threshold.
5. The shadow data enhancing apparatus of claim 4, wherein:
the enhancement module is used for obtaining a pixel area for generating shadow noise by taking the shadow center point as the center based on the parameters determined by the selection module;
setting the pixel value of more than half of pixel units in the pixel area generating shadow noise or the quantity of generated shadow events as a first value, and rotating the pixel area generating shadow noise by a preset angle so as to obtain shadow data.
6. The shadow data enhancing apparatus of claim 4 or 5, wherein:
the magnitude of the first value is proportional to the speed of movement of the user or the action or the object.
7. A training method, characterized in that:
performing at least one data enhancement on sample data of a training set of impulse neural networks based on the shadow data enhancement method of any one of claims 1 to 3;
and training the impulse neural network based on the enhanced training set.
8. A storage medium, characterized by:
the storage medium has stored thereon computer code, characterized by: a shadow data enhancing method according to any one of claims 1 to 3, implemented by executing the computer code.
9. A chip, characterized in that:
comprising an impulse neural network processor deployed with optimal configuration parameters obtained using the training method of claim 7.
10. An electronic product, characterized in that:
the electronic product is provided with the chip as claimed in claim 9.
CN202310377862.5A 2023-03-31 2023-03-31 Shadow data enhancement method and device, training method, storage medium and chip Pending CN116383657A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310377862.5A CN116383657A (en) 2023-03-31 2023-03-31 Shadow data enhancement method and device, training method, storage medium and chip


Publications (1)

Publication Number Publication Date
CN116383657A true CN116383657A (en) 2023-07-04

Family

ID=86965353




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination