CN111695608A - Data expansion method for preserving original sample distribution characteristics - Google Patents

Data expansion method for preserving original sample distribution characteristics

Info

Publication number
CN111695608A
Authority
CN
China
Prior art keywords
data
sample
distribution characteristics
micro
interval
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010458307.1A
Other languages
Chinese (zh)
Other versions
CN111695608B (en)
Inventor
唐樟春
梁堃
李顺
谢葭
杨宗承
李征泰
李贵杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China
Priority to CN202010458307.1A
Publication of CN111695608A
Application granted
Publication of CN111695608B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/10 Pre-processing; Data cleansing
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01R MEASURING ELECTRIC VARIABLES; MEASURING MAGNETIC VARIABLES
    • G01R31/00 Arrangements for testing electric properties; Arrangements for locating electric faults; Arrangements for electrical testing characterised by what is being tested not provided for elsewhere
    • G01R31/28 Testing of electronic circuits, e.g. by signal tracer
    • G01R31/2851 Testing of integrated circuits [IC]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/211 Selection of the most significant subset of features
    • G06F18/2113 Selection of the most significant subset of features by ranking or filtering the set of features, e.g. using a measure of variance or of feature cross-correlation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Hardware Design (AREA)
  • Microelectronics & Electronic Packaging (AREA)
  • Testing Of Individual Semiconductor Devices (AREA)
  • Monitoring And Testing Of Nuclear Reactors (AREA)

Abstract

The invention discloses a data expansion method that preserves the distribution characteristics of the original sample, and belongs to the technical field of micro-system board power supplies and electrical systems. The invention addresses the small volume of test data available for micro-system power panels and the scarcity of methods for fitting such data. The method retains and extracts the distribution characteristics of the original data sample and fits the variation of the micro-system power panel under nuclear irradiation stress. The result can serve as an effective, visual basis for analyzing that variation, and the method offers an improved statistical processing approach.

Description

Data expansion method for preserving original sample distribution characteristics
Technical Field
The invention belongs to the technical field of micro-system board power supplies and electrical systems, and in particular relates to an algorithm that expands a small sample of nuclear radiation stress test results on the basis of the distribution characteristics of the original data.
Background
With the rapid development of power electronics, typical circuit boards are now widely used in household appliances, vehicles, ships, aerospace and other fields, where they play an important role.
A micro-system power panel behaves differently under different stress conditions, and its service life and other parameters differ from one environment to another. Existing methods mainly address how to process the variation of power panels under common stresses such as high temperature, high humidity and strong current; the nuclear irradiation environment has received less study.
Existing methods for expanding a small sample are mainly mathematical-statistical methods such as Bootstrap resampling. The principle is to draw repeatedly from the original sample; confidence intervals and similar quantities can then be constructed through variance estimation, which extends the range of application of resampling. Bootstrap estimates the true distribution by resampling the original data values, but it gives little consideration to the distribution characteristics of the sample, a problem that many traditional statistical methods largely ignore.
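For context, Bootstrap resampling can be sketched in a few lines. The sketch below uses a percentile interval for the mean rather than the variance-based construction mentioned above, and the function name, resample count and input values are illustrative assumptions. It shows the limitation the background points to: every resample contains only values already present in the original sample.

```python
import random
import statistics


def bootstrap_mean_ci(data, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of `data`."""
    rng = random.Random(seed)
    n = len(data)
    means = []
    for _ in range(n_resamples):
        # Draw n values with replacement; no new values are ever created.
        resample = [rng.choice(data) for _ in range(n)]
        means.append(statistics.fmean(resample))
    means.sort()
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


voltages = [4.973, 4.976, 4.989, 4.960, 4.992, 4.994]  # illustrative readings
low, high = bootstrap_mean_ci(voltages)
```

Because the resamples reuse the original values, the interval always lies between the smallest and largest observation.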
Disclosure of Invention
To address the shortcomings of the background art, the invention solves the problem of the small quantity of test results for micro-system power panels in an irradiation environment by effectively expanding the original sample data while preserving the distribution characteristics of the original data.
The technical scheme of the invention is as follows: a method for expanding data while preserving distribution characteristics of an original sample, the method comprising:
step 1: acquiring microchip voltage data in an irradiation stress environment, and determining a data failure threshold value according to the electrical characteristics of the microchip in the irradiation stress environment;
step 2: screening the microchip voltage data under the irradiation stress environment obtained in the step 1 according to the data failure threshold determined in the step 1;
step 3: obtaining a finite data sample set H from the screening in step 2, and expanding the sample data to N according to the range distribution characteristics of all the finite data in H.
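Steps 1 and 2 can be sketched as a simple filter. The text does not give the threshold value or say which side of it is kept, so the sketch assumes that readings below a hypothetical 1.0 V threshold count as failed and are dropped; `screen_by_threshold` is an illustrative name.

```python
def screen_by_threshold(voltages, failure_threshold):
    """Keep readings at or above the failure threshold (assumed convention)."""
    return [v for v in voltages if v >= failure_threshold]


# Readings of the No. 1 board from Table 1; the 1.0 V threshold is hypothetical.
readings = [4.973, 4.976, 4.989, 0.074, 0.045]
H = screen_by_threshold(readings, failure_threshold=1.0)
```

If the opposite convention applies, only the comparison in the list comprehension changes.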
Further, the specific method of step 3 is as follows:
step 3.1: arranging the data in the finite data sample set H in ascending order, the sorted set being $H_0 = \{a_0, a_1, \ldots, a_n, a_{n+1} \mid n > 0\}$, where $(a_0, \ldots, a_{n+1})$ denote the subsample data;
step 3.2: from the set $H_0$ of step 3.1, computing the differences between adjacent subsamples in $H_0$:
$$d_1 = a_1 - a_0;\quad d_2 = a_2 - a_1;\quad \ldots;\quad d_{n+1} = a_{n+1} - a_n$$
which gives the formula for $d_i$: $d_i = a_i - a_{i-1}$, $1 \le i \le n+1$;
step 3.3: from the result of step 3.2, computing the probability $P_i$ of falling into each range interval:
$$P_i = W_i / \sum W$$
where
$$\sum W = W_1 + W_2 + W_3 + \ldots + W_{n+1},\quad W_i = D / d_i$$
and
$$D = \sum d = d_1 + d_2 + d_3 + \ldots + d_{n+1}$$
step 3.4: according to the interval probabilities of step 3.3, dividing the sample data set $H_0$ into the corresponding set of probability sample intervals $Q$, $Q = \{(0, P_1), (P_1, P_2), (P_2, P_3), \ldots, (P_n, P_{n+1})\}$;
step 3.5: generating a random number $Y_1$ in the range (0,1);
if $Y \in (0, P_1)$, generating $X_1 \in (0, a_1)$ in the $(0, a_1)$ subsample interval of $H_0$;
if $Y \in (P_1, P_2)$, generating $X_2 \in (a_1, a_2)$ in the $(a_1, a_2)$ subsample interval of $H_0$;
……
if $Y \in (P_n, P_{n+1})$, generating $X_N \in (a_n, a_{n+1})$ in the $(a_n, a_{n+1})$ subsample interval of $H_0$;
repeating the method of step 3.5 to generate random numbers $Y_2, Y_3, \ldots$ in the range (0,1) until the sample data is expanded to N.
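Steps 3.1 to 3.5 can be sketched with NumPy as follows. This is one reading of the steps, not a definitive implementation, and it rests on several assumptions that the text leaves open: the boundaries of Q are taken as cumulative sums of the $P_i$, the first sample interval is taken as $(a_0, a_1)$, each X is drawn uniformly inside its interval, duplicate values are merged so that no gap is zero, and N is the total size of the expanded set. The name `expand_samples` is hypothetical.

```python
import numpy as np


def expand_samples(H, N, seed=None):
    """Expand sample set H to N values, favouring narrow gaps (steps 3.1-3.5)."""
    rng = np.random.default_rng(seed)

    # Step 3.1: sort ascending. Duplicates are merged (assumption) because a
    # zero gap would make W_i = D / d_i undefined.
    a = np.unique(np.asarray(H, dtype=float))
    if a.size < 2:
        raise ValueError("need at least two distinct sample values")

    # Step 3.2: gaps between neighbours, d_i = a_i - a_{i-1}.
    d = np.diff(a)

    # Step 3.3: W_i = D / d_i and P_i = W_i / sum(W).
    D = d.sum()
    W = D / d
    P = W / W.sum()

    # Step 3.4: interval boundaries in (0, 1), read here as cumulative sums.
    edges = np.cumsum(P)
    edges[-1] = 1.0  # guard against floating-point round-off

    # Step 3.5: draw Y, locate its probability interval, then draw X uniformly
    # (assumption) inside the matching sample interval (a_{i-1}, a_i).
    n_new = max(N - a.size, 0)
    Y = rng.random(n_new)
    idx = np.searchsorted(edges, Y, side="right")
    X = rng.uniform(a[idx], a[idx + 1])
    return np.sort(np.concatenate([a, X])), P


H = [4.790, 4.813, 4.992, 4.994]  # illustrative screened readings
expanded, P = expand_samples(H, N=200, seed=1)
```

Because $W_i$ is inversely proportional to the gap $d_i$, most new points land where the original samples are densest, which is how the expanded set keeps the shape of the original distribution.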
The beneficial effects of the invention are as follows: it addresses the small volume of test data available for micro-system power panels and the scarcity of methods for fitting such data. The method retains and extracts the distribution characteristics of the original data sample and fits the variation of the micro-system power panel under nuclear irradiation stress. The result can serve as an effective, visual basis for analyzing that variation, and the method offers an improved statistical processing approach.
Drawings
FIG. 1 is a schematic diagram of the principle of screening and rejecting samples according to the present invention.
FIG. 2 is a comparison of the original sample set and the expanded sample set according to the present invention.
FIG. 3 is a schematic diagram of step 3.4 of the present invention.
FIG. 4 is a schematic flow chart of the core algorithm of the present invention.
FIG. 5 shows the fitting results obtained with the Matlab Toolbox software package.
Detailed Description
FIG. 1 is a schematic diagram of the principle of screening and rejecting samples. In this embodiment, a TPS54328DDA micro power supply chip is selected as the test object, and the power supply outputs 5 V. From the large number of output voltage readings, the values lying in the range where the variation is most pronounced are retained; readings whose variation is not pronounced are not used as processing data.
After preprocessing, the voltage data of the micro power supply chip are as listed in Table 1.
Data in the range with obvious variation are selected by screening and rejection to form the original sample set H. After this screening, the finite number of samples in H are expanded to N groups according to their range distribution characteristics, using the algorithm provided by the invention. FIG. 4 is a schematic flow chart of the core algorithm.
The finite data in H are arranged in ascending order, and the sorted set is $H_0 = \{a_0, a_1, \ldots, a_n, a_{n+1} \mid n > 0\}$, where $(a_0, \ldots, a_{n+1})$ denote the subsample data. The subsamples are divided into probability intervals, and the probability of each range interval is calculated according to steps 3.2 and 3.3 (S32 and S33) as $P_i = W_i / \sum W$, where $\sum W = W_1 + W_2 + W_3 + \ldots + W_{n+1}$; in the present embodiment $\sum W \approx 4062$. According to the original data sample set, the interval probabilities may be divided into the corresponding set of probability sample intervals $Q$, $Q = \{(0, P_1), (P_1, P_2), (P_2, P_3), \ldots, (P_n, P_{n+1})\}$, where $i = 5$.
Then (m+1) random numbers $Y_{m+1}$ (m > 0) are generated in the range (0,1). For each one, the probability subsample interval into which $Y_{m+1}$ falls is determined, and the corresponding random number $X_{m+1}$ (m > 0) is generated in the matching interval of the original sample set H. FIG. 3 illustrates this correspondence between the probability set and the original sample set.
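The probability formula used above can be simplified, which makes its dependence on the gaps explicit. Every $W_i$ carries the same factor $D$, so it cancels in the ratio and $P_i$ depends only on the reciprocals of the gaps. The numbers in the second part are illustrative values chosen for easy arithmetic, not data from the embodiment.

```latex
P_i = \frac{W_i}{\sum W}
    = \frac{D/d_i}{\sum_{j=1}^{n+1} D/d_j}
    = \frac{1/d_i}{\sum_{j=1}^{n+1} 1/d_j}

% Illustrative gaps (assumed): d = (0.01, 0.02, 0.02)
D = 0.05, \quad W = (5,\ 2.5,\ 2.5), \quad \sum W = 10, \quad P = (0.5,\ 0.25,\ 0.25)
```

The narrowest gap receives the largest probability, so new samples concentrate where the original readings cluster.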
FIG. 2 compares the expanded sample data with the original sample set after processing and expansion. Processing based on the distribution characteristics of the original sample greatly increases the number of subsamples in the sample set.
After the sampling result set has been expanded to N, degradation processing is applied to all the re-expanded data and curve fitting is performed in the Matlab Toolbox software package; the fitted result is shown in FIG. 5. The fitted curve has reference value for research on micro-system power panels in a nuclear radiation stress environment. The method addresses the problem of small data volume and improves on conventional sample expansion methods.
TABLE 1 Preprocessed original sample data set
No. 1 board   No. 2 board   No. 3 board   No. 5 board   No. 6 board   No. 10 board
4.973 V       4.960 V       4.992 V       5.130 V       5.027 V       4.969 V
4.976 V       0.053 V       4.992 V       5.129 V       5.045 V       4.971 V
4.989 V       0.048 V       4.994 V       5.139 V       5.050 V       4.976 V
0.074 V       0.052 V       4.813 V       5.156 V       4.890 V       0.230 V
0.045 V       0.055 V       4.790 V       0.048 V       4.882 V       0.221 V
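As a rough illustration, steps 3.1 to 3.3 can be applied to the No. 3 board column of Table 1. The text does not say whether the embodiment treats each board separately or pools them, so working on a single column is an assumption. The column contains 4.992 V twice, which would give a zero gap and an undefined $W_i = D / d_i$; the sketch merges duplicates first, which is also an assumption.

```python
board_3 = [4.992, 4.992, 4.994, 4.813, 4.790]  # No. 3 board column of Table 1

# Step 3.1: sort ascending, merging duplicates (assumption).
a = sorted(set(board_3))

# Step 3.2: gaps between neighbouring values.
d = [hi - lo for lo, hi in zip(a, a[1:])]

# Step 3.3: P_i is proportional to 1 / d_i, since the factor D cancels.
inv = [1.0 / gap for gap in d]
total = sum(inv)
P = [w / total for w in inv]
```

The interval between 4.992 V and 4.994 V is by far the narrowest, so it receives most of the probability mass.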

Claims (1)

1. A method for expanding data while preserving distribution characteristics of an original sample, the method comprising:
step 1: acquiring microchip voltage data in an irradiation stress environment, and determining a data failure threshold value according to the electrical characteristics of the microchip in the irradiation stress environment;
step 2: screening the microchip voltage data under the irradiation stress environment obtained in the step 1 according to the data failure threshold determined in the step 1;
step 3: obtaining a finite data sample set H from the screening in step 2, and expanding all the finite data in H to N sample data according to their range distribution characteristics;
the specific method of the step 3 comprises the following steps:
step 3.1: arranging the data in the finite data sample set H in ascending order, the sorted set being $H_0 = \{a_0, a_1, \ldots, a_n, a_{n+1} \mid n > 0\}$, where $(a_0, \ldots, a_{n+1})$ denote the subsample data;
step 3.2: from the set $H_0$ of step 3.1, computing the differences between adjacent subsamples in $H_0$:
$$d_1 = a_1 - a_0;\quad d_2 = a_2 - a_1;\quad \ldots;\quad d_{n+1} = a_{n+1} - a_n$$
which gives the formula for $d_i$: $d_i = a_i - a_{i-1}$, $1 \le i \le n+1$;
step 3.3: from the result of step 3.2, computing the probability $P_i$ of falling into each range interval:
$$P_i = W_i / \sum W$$
where
$$\sum W = W_1 + W_2 + W_3 + \ldots + W_{n+1},\quad W_i = D / d_i$$
and
$$D = \sum d = d_1 + d_2 + d_3 + \ldots + d_{n+1}$$
step 3.4: according to the interval probabilities of step 3.3, dividing the sample data set $H_0$ into the corresponding set of probability sample intervals $Q$, $Q = \{(0, P_1), (P_1, P_2), (P_2, P_3), \ldots, (P_n, P_{n+1})\}$;
step 3.5: generating a random number $Y_1$ in the range (0,1);
if $Y \in (0, P_1)$, generating $X_1 \in (0, a_1)$ in the $(0, a_1)$ subsample interval of $H_0$;
if $Y \in (P_1, P_2)$, generating $X_2 \in (a_1, a_2)$ in the $(a_1, a_2)$ subsample interval of $H_0$;
……
if $Y \in (P_n, P_{n+1})$, generating $X_N \in (a_n, a_{n+1})$ in the $(a_n, a_{n+1})$ subsample interval of $H_0$;
repeating the method of step 3.5 to generate random numbers $Y_2, Y_3, \ldots$ in the range (0,1) until the sample data is expanded to N.
CN202010458307.1A 2020-05-27 2020-05-27 Data expansion method for preserving original sample distribution characteristics Active CN111695608B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010458307.1A CN111695608B (en) 2020-05-27 2020-05-27 Data expansion method for preserving original sample distribution characteristics

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010458307.1A CN111695608B (en) 2020-05-27 2020-05-27 Data expansion method for preserving original sample distribution characteristics

Publications (2)

Publication Number Publication Date
CN111695608A CN111695608A (en) 2020-09-22
CN111695608B CN111695608B (en) 2022-07-29

Family

ID=72478396

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010458307.1A Active CN111695608B (en) 2020-05-27 2020-05-27 Data expansion method for preserving original sample distribution characteristics

Country Status (1)

Country Link
CN (1) CN111695608B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110071965A1 (en) * 2009-09-24 2011-03-24 Yahoo! Inc. System and method for cross domain learning for data augmentation
CN102445712A (en) * 2011-11-22 2012-05-09 成都理工大学 Character window weighting related spectrum matching method facing rocks and minerals
CN103077288A (en) * 2013-01-23 2013-05-01 重庆科技学院 Small sample test data-oriented method for soft measurement and formula decision of multielement alloy material
CN103971024A (en) * 2014-05-26 2014-08-06 华北电力大学(保定) Method for evaluating reliability of relaying protection systems under small sample failure data
CN105677791A (en) * 2015-12-31 2016-06-15 新疆金风科技股份有限公司 Method and system used for analyzing operating data of wind generating set
CN108647272A (en) * 2018-04-28 2018-10-12 江南大学 A kind of small sample extending method based on data distribution
CN111161181A (en) * 2019-12-26 2020-05-15 深圳市优必选科技股份有限公司 Image data enhancement method, model training method, device and storage medium

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
JIA XIE等: "Reliability Evaluation Method of Linear Regulated Power Supply In Nuclear Radiation Environment", 《2019 PROGNOSTICS & SYSTEM HEALTH MANAGEMENT CONFERENCE—QINGDAO (PHM-2019 QINGDAO) 》 *
RALITSA B AKINS等: "Stability of response characteristics of a Delphi panel: application of bootstrap data expansion", 《BMC MEDICAL RESEARCH METHODOLOGY》 *
丁飞等: "小样本事件下液压支架可靠性评估", 《煤炭科学技术》 *
毕略等: "基于数据分布的小样本扩充方法及应用", 《控制工程》 *
赵晨旭等: "数据基故障诊断算法更新问题研究", 《国防科技大学学报》 *

Also Published As

Publication number Publication date
CN111695608B (en) 2022-07-29

Similar Documents

Publication Publication Date Title
CN111190088B (en) Method for extracting characteristic parameters of IGBT (insulated Gate Bipolar transistor) performance degradation
CN116609720B (en) Data-driven-based intelligent error compensation method and system for desk-top multimeter
CN107220500B (en) Bayesian reliability evaluation method for performance degradation test based on inverse Gaussian process
CN106908774B (en) One-dimensional range profile identification method based on multi-scale nuclear sparse preserving projection
Ruppeiner et al. Thermodynamic curvature of the multicomponent ideal gas
CN111695608B (en) Data expansion method for preserving original sample distribution characteristics
CN114970157A (en) Method for predicting test life of small sample of electronic product under voltage stress
Ding et al. Spacecraft leakage detection using acoustic emissions based on empirical mode decomposition and support vector machine
CN106951918A (en) A kind of individual particle image clustering method analyzed for Ice mapping
CN103529308A (en) Fuzzy method and equipment for electronic equipment equivalent radiation power test
CN110610203A (en) Electric energy quality disturbance classification method based on DWT and extreme learning machine
CN111626329B (en) Insulation pipe bus fault diagnosis method based on LDA optimizing multi-scale texture characteristics
CN108563889A (en) A kind of sampled analog method of stochastic variable
CN114972330A (en) Workpiece surface roughness detection optimization method based on improved histogram homogenization algorithm
Hartler Parameter estimation for the Arrhenius model
Jiang et al. Recurrence plot quantitative analysis-based fault recognition method of rolling bearing
CN112014821B (en) Unknown vehicle target identification method based on radar broadband characteristics
CN114970601A (en) Power equipment partial discharge type identification method, equipment and storage medium
CN113190728A (en) Oil-immersed transformer fault diagnosis method based on cluster optimization
CN112417709A (en) Dynamic modal analysis method based on schlieren image
CN113556132A (en) Novel improved electric power signal compressed sensing method based on signal singularity detection
Selman et al. on reliability estimation of stress-strength (SS) modified exponentiated Lomax distribution
CN111651948A (en) Parameterized circuit unit delay estimation model and modeling method and system thereof
Kantam et al. Control charts for the log-logistic distribution
Saha et al. Side-sensitive group runs chart for detecting mean shifts using auxiliary information

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant