CN110032745A - Method and apparatus for generating sensing data, and computer-readable storage medium - Google Patents

Info

Publication number
CN110032745A
Authority
CN
China
Prior art keywords
data
distribution
sequence
tentation
predetermined
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810027130.2A
Other languages
Chinese (zh)
Inventor
孙昊立
张沈斌
孙俊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fujitsu Ltd
Original Assignee
Fujitsu Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fujitsu Ltd
Priority to CN201810027130.2A
Publication of CN110032745A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00 Computer-aided design [CAD]
    • G06F30/20 Design optimisation, verification or simulation

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Evolutionary Computation (AREA)
  • Geometry (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a method and an apparatus for generating sensing data, and a computer-readable storage medium. The method includes: generating a first data sequence comprising n first data based on a predetermined trend, the trend characterizing the relationship between the magnitude of the data and the order of the data; obtaining, based on a predetermined function, a distribution characterizing the relationship between the magnitude of the data and the quantity of the data; generating n second data according to the distribution and a predetermined data range; and rearranging the n second data based on the first data sequence, so that the trend of the rearranged n second data approximates the predetermined trend.

Description

Method and apparatus for generating sensing data, and computer-readable storage medium
Technical field
The present invention relates generally to the field of image processing, and more particularly to a method and an apparatus for generating sensing data and to a computer-readable storage medium.
Background
The Internet of Things is a network that connects any article to the Internet through information sensing devices such as radio-frequency identification, infrared sensors, global positioning systems, laser scanners, and gas sensors, in accordance with agreed protocols, so as to exchange information and communicate, thereby achieving intelligent identification, positioning, tracking, monitoring, and management. During the development of an Internet of Things system, the Internet of Things applications that process massive amounts of data need to be tested; such applications include data analysis applications, data visualization applications, and the like. However, it is difficult to obtain a large amount of real data before sensor devices are deployed in large numbers.
Existing sensing data simulators cannot satisfy a user's requirements on data trend and data distribution at the same time. The prior art has proposed methods that, based on a sample of real data, provide a large amount of data whose trend and distribution are close to those of the sample. However, a real data sample cannot be flexibly defined by the user, and obtaining a real data sample is itself difficult.
Summary of the invention
A brief summary of the invention is given below in order to provide a basic understanding of certain aspects of the invention. It should be understood that this summary is not an exhaustive overview of the invention. It is not intended to identify key or critical parts of the invention, nor is it intended to limit the scope of the invention. Its sole purpose is to present some concepts in a simplified form as a prelude to the more detailed description that follows.
In view of the above drawbacks of the prior art, one object of the present invention is to provide a technique capable of generating sensing data having a predetermined trend and a predetermined distribution, where the trend characterizes the relationship between the magnitude of the data and the order of the data, and the distribution characterizes the relationship between the magnitude of the data and the quantity of the data.
According to one aspect of the present invention, a method for generating sensing data is provided, comprising: generating a first data sequence comprising n first data based on a predetermined trend characterizing the relationship between the magnitude of the data and the order of the data; obtaining, based on a predetermined function, a distribution characterizing the relationship between the magnitude of the data and the quantity of the data; generating n second data according to the distribution and a predetermined data range; and rearranging the n second data based on the first data sequence, so that the trend of the rearranged n second data approximates the predetermined trend.
According to another aspect of the present invention, an apparatus for generating sensing data is provided, comprising: a first data generating device for generating a first data sequence comprising n first data based on a predetermined trend characterizing the relationship between the magnitude of the data and the order of the data; an acquisition device for obtaining, based on a predetermined function, a distribution characterizing the relationship between the magnitude of the data and the quantity of the data; a second data generating device for generating n second data according to the distribution and a predetermined data range; and a rearrangement device for rearranging the n second data based on the first data sequence, so that the trend of the rearranged n second data approximates the predetermined trend.
According to yet another aspect of the present invention, a computer-readable storage medium is also provided. The computer-readable storage medium stores a program which, when run by a processor, performs the following operations: generating a first data sequence comprising n first data based on a predetermined trend characterizing the relationship between the magnitude of the data and the order of the data; obtaining, based on a predetermined function, a distribution characterizing the relationship between the magnitude of the data and the quantity of the data; generating n second data according to the distribution and a predetermined data range; and rearranging the n second data based on the first data sequence, so that the trend of the rearranged n second data approximates the predetermined trend.
According to a further aspect of the present invention, a program is also provided. The program includes machine-executable instructions which, when executed on an information processing apparatus, cause the information processing apparatus to perform the above method according to the present invention.
These and other advantages of the present invention will become more apparent from the following detailed description of preferred embodiments of the invention taken in conjunction with the accompanying drawings.
Brief description of the drawings
Other features and advantages of the present invention will be better understood by reading the embodiments with reference to the accompanying drawings. The drawings described here only illustrate embodiments of the invention schematically, do not cover all possible implementations, and are not intended to limit the scope of the invention. In the drawings:
Fig. 1 shows a flowchart of the method for generating sensing data according to an embodiment of the present invention.
Fig. 2 shows a schematic diagram of an example of obtaining a distribution based on a predetermined function according to an embodiment of the present invention.
Fig. 3 shows a flowchart of the processing of generating n second data according to the distribution and the predetermined data range in the method according to an embodiment of the present invention.
Fig. 4 shows a flowchart of the processing of rearranging the n second data in the method according to an embodiment of the present invention.
Fig. 5 shows a schematic diagram of an example of the processing of searching for identical sub-portions in 2 second data sequences in the method according to an embodiment of the present invention.
Fig. 6 shows a schematic diagram of an example of the processing of determining whether the predetermined data range matches the distribution based on the predetermined function in the method according to an embodiment of the present invention.
Fig. 7 shows a structural block diagram of the apparatus for generating sensing data according to an embodiment of the present invention.
Fig. 8 shows a structural block diagram of the second data generating device in the apparatus according to an embodiment of the present invention.
Fig. 9 shows a structural block diagram of the rearrangement device in the apparatus according to an embodiment of the present invention.
Fig. 10 shows a schematic block diagram of a computer that can implement the method and apparatus according to an embodiment of the present invention.
Detailed description of the embodiments
Exemplary embodiments of the present invention are described below with reference to the accompanying drawings. For clarity and conciseness, not all features of an actual implementation are described in the specification. It should be understood, however, that in developing any such actual implementation, many implementation-specific decisions must be made in order to achieve the developer's specific goals, such as compliance with system-related and business-related constraints, and that these constraints may vary from one implementation to another. It should also be understood that, although such development work may be very complex and time-consuming, it is merely a routine task for those skilled in the art having the benefit of this disclosure.
It should also be noted that, in order to avoid obscuring the invention with unnecessary detail, the drawings show only the apparatus structures and/or processing steps closely related to the solution according to the invention, and omit other details of little relevance to the invention.
Fig. 1 shows a flowchart of a method for generating sensing data according to an embodiment of the present invention. As shown in Fig. 1, the method 100 for generating sensing data includes step S110, step S120, step S150, and step S160. In step S110, a first data sequence comprising n first data is generated based on a predetermined trend characterizing the relationship between the magnitude of the data and the order of the data. In step S120, a distribution characterizing the relationship between the magnitude of the data and the quantity of the data is obtained based on a predetermined function. In step S150, n second data are generated according to the distribution and a predetermined data range. In step S160, the n second data are rearranged based on the first data sequence, so that the trend of the rearranged n second data approximates the predetermined trend.
Specifically, in step S110, the first data sequence comprising n first data is generated by sampling uniformly on a curve that satisfies the predetermined trend. The generation method is not limited to this, however; those skilled in the art may also apply other existing methods to generate a first data sequence that satisfies the predetermined trend.
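The uniform sampling of step S110 can be sketched as follows. This is only an illustration: the function name `first_data_sequence`, the sampling interval, and the sine-shaped trend are assumptions that do not appear in the specification.

```python
import math


def first_data_sequence(trend, n, x_start=0.0, x_end=1.0):
    """Return n values of `trend` sampled at evenly spaced x positions."""
    if n == 1:
        return [trend(x_start)]
    step = (x_end - x_start) / (n - 1)
    return [trend(x_start + i * step) for i in range(n)]


# Hypothetical predetermined trend: one period of a sine wave.
first = first_data_sequence(lambda x: math.sin(2 * math.pi * x), n=5)
```

Only the shape of the resulting sequence matters here; its values are later replaced by the n second data, which are reordered to follow this shape.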
Next, the processing in step S120 of obtaining, based on the predetermined function, the distribution characterizing the relationship between the magnitude of the data and the quantity of the data is described with reference to Fig. 2. Fig. 2 shows a schematic diagram of an example of obtaining the distribution based on a predetermined function. As shown in Fig. 2(a), when the distribution is obtained based on a predetermined function y = f(x), the curve of y = f(x) is first drawn in a two-dimensional coordinate system. Then, within a predetermined domain on the X axis of the coordinate system (defined as 0 to 100 in Fig. 2(a)), a large number of randomly scattered points are drawn between the curve and the X axis. The predetermined domain is then divided into multiple non-overlapping intervals, and the distribution is obtained from the ratio of the number of points contained in each interval to the total number of points, for example the histogram distribution shown in Fig. 2(b).
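The point-scattering procedure of step S120 can be sketched as follows. The sketch assumes that f is non-negative on the domain and bounded above by `y_max`; the function name, the parameter names, and the example function f(x) = x are hypothetical.

```python
import random


def estimate_distribution(f, x_min, x_max, y_max, bins, points=20_000, seed=0):
    """Scatter points under the curve y = f(x) and return the per-interval ratios."""
    rng = random.Random(seed)
    counts = [0] * bins
    width = (x_max - x_min) / bins
    kept = 0
    while kept < points:
        x = rng.uniform(x_min, x_max)
        y = rng.uniform(0.0, y_max)
        if y <= f(x):  # keep only points between the curve and the X axis
            index = min(int((x - x_min) / width), bins - 1)
            counts[index] += 1
            kept += 1
    return [count / points for count in counts]


# Hypothetical predetermined function f(x) = x on the domain 0 to 100, 10 intervals.
ratios = estimate_distribution(lambda x: x, 0.0, 100.0, 100.0, bins=10)
```

The returned list plays the role of the histogram in Fig. 2(b): entry k is the share of all scattered points that fell into interval k.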
Fig. 3 is a flowchart showing the processing in step S150 of generating n second data according to the distribution and the predetermined data range. As shown in Fig. 3, the processing of step S150 includes steps S151 to S155. In step S151, the number of second data in each interval is calculated based on the value of n and on the ratio of the number of points contained in that interval to the total number of points. For example, the number for any interval x can be calculated by the following equation:
However, the method of calculating the number of second data in each interval is not limited to this, and those skilled in the art may use other methods as actually needed. In step S152, within the value range of each interval, the magnitudes of the corresponding number of second data in that interval are determined at random. For example, as shown in Fig. 2(b), the value range of interval 10 is 90 to 100, so a second datum in interval 10 can take any value between 90 and 100. Then, in step S153, the second data that fall within the predetermined data range are extracted and added to a data set. In step S154, it is determined whether the number of data in the data set is greater than or equal to n; if it is less than n, processing returns to step S151 and the operations of steps S151 to S154 are repeated. That is, steps S151 to S154 are performed iteratively until the number of data in the data set is greater than or equal to n. When it is determined that the number of data in the data set is greater than or equal to n, processing proceeds to step S155, in which n data are taken at random from the data set as the n second data to be generated.
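Steps S151 to S155 can be sketched as follows. The equation for the per-interval count is not reproduced in this text, so the sketch assumes the simplest reading of the description, namely the interval's ratio multiplied by n and rounded. The function name, the `max_rounds` guard, and the example ratios are likewise assumptions.

```python
import random


def generate_second_data(ratios, x_min, x_max, data_range, n, seed=0, max_rounds=1000):
    """Draw values per interval, keep those inside data_range, return n of them."""
    rng = random.Random(seed)
    low, high = data_range
    width = (x_max - x_min) / len(ratios)
    data_set = []
    for _ in range(max_rounds):
        for k, ratio in enumerate(ratios):
            count = round(ratio * n)  # S151 (assumed formula: ratio times n)
            left = x_min + k * width
            # S152: random magnitudes within the value range of interval k
            values = [rng.uniform(left, left + width) for _ in range(count)]
            # S153: keep only values inside the predetermined data range
            data_set.extend(v for v in values if low <= v <= high)
        if len(data_set) >= n:  # S154
            return rng.sample(data_set, n)  # S155
    raise ValueError("predetermined data range does not match the distribution")


# Hypothetical 5-interval distribution over 0 to 100, data range 20 to 80.
second = generate_second_data([0.1, 0.2, 0.4, 0.2, 0.1], 0.0, 100.0, (20.0, 80.0), n=50)
```

The `max_rounds` guard is not part of the described method; it only keeps the sketch from looping indefinitely when the data range and the distribution barely overlap.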
Compared with existing methods that generate data from a distribution, the method of the present invention allows the user to select, within the distribution, the numerical range of the data to be generated, and is therefore more flexible to apply.
Next, the processing in step S160 of rearranging the n second data is described with reference to Fig. 4. As shown in Fig. 4, the processing in step S160 includes steps S161 to S1610. First, in step S161, m permutations are selected at random from the n! full permutations formed by the n second data, giving m second data sequences, and the iteration count C is reset, i.e., C is set to 0.
In step S162, a score characterizing the similarity in trend between each of the m second data sequences and the first data sequence is calculated, the 2 second data sequences with the highest scores are extracted from the m second data sequences, and the 2 extracted second data sequences are added to a sequence set. For example, the score characterizing the similarity between any second data sequence and the first data sequence can be calculated by the following equation:
where $d^{2}_{i}$ denotes the i-th datum in the second data sequence and $d^{1}_{i}$ denotes the i-th datum in the first data sequence (the superscript identifies the sequence and is not a power). Those skilled in the art may also use other existing calculation methods as actually needed.
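The scoring equation is not reproduced in this text, so the following sketch uses an assumed stand-in rather than the specification's formula: the reciprocal of one plus the sum of squared element-wise differences. It is positive, equals 1 for identical sequences, and decreases as the sequences diverge. The name `trend_score` is hypothetical.

```python
def trend_score(second, first):
    """Assumed similarity score: 1 / (1 + sum of squared element-wise differences)."""
    squared_error = sum((d2 - d1) ** 2 for d2, d1 in zip(second, first))
    return 1.0 / (1.0 + squared_error)


target = [1, 2, 3]
identical = trend_score([1, 2, 3], target)
close = trend_score([1, 3, 2], target)
reversed_order = trend_score([3, 2, 1], target)
```

Any score used here should be non-negative if it is also to serve as the weight in roulette-wheel selection.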
In step S163, 2 second data sequences are selected from the m second data sequences using the roulette-wheel selection algorithm. Roulette-wheel selection is well known to those skilled in the art, and for brevity its detailed operation is not described here.
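For readers unfamiliar with it, roulette-wheel selection picks an item with probability proportional to its score. A minimal sketch, with hypothetical names and assuming non-negative scores with a positive sum:

```python
import random


def roulette_select(population, scores, rng):
    """Pick one item with probability proportional to its (non-negative) score."""
    total = sum(scores)
    pick = rng.uniform(0.0, total)
    running = 0.0
    for item, score in zip(population, scores):
        running += score
        if pick < running:
            return item
    return population[-1]  # reached only when pick equals the total


rng = random.Random(0)
picks = [roulette_select(["low", "high"], [1.0, 3.0], rng) for _ in range(4000)]
```

Calling the function twice yields the 2 sequences needed in step S163; in this sketch the same sequence may be drawn both times.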
In step S164, the sub-portions of the 2 selected second data sequences that contain identical elements are exchanged. As an example, the identical sub-portions in the 2 second data sequences can be found and exchanged by the following operations.
First, as shown in Fig. 5(a), an n × n matrix M is constructed from the 2 second data sequences (denoted S1 and S2 in the description below), where the value of element M_ij depends on whether the i-th datum of sequence S1 and the j-th datum of sequence S2 are identical: if they are identical, M_ij is 1; otherwise it is empty.
Then, matrix M is searched for a submatrix that contains no empty row and no empty column and whose size is smaller than a predetermined size, such as submatrix 501 shown in Fig. 5(a). Preferably, the predetermined size is n/2 × n/2.
Then, the elements of sequences S1 and S2 that correspond to the submatrix are exchanged. As shown in Fig. 5(a), the elements of S1 corresponding to submatrix 501 are (1, 2, 3) and the elements of S2 corresponding to submatrix 501 are (3, 2, 1), so the elements (1, 2, 3) in S1 are exchanged with the elements (3, 2, 1) in S2, as shown in Fig. 5(b). This yields the 2 exchanged second data sequences.
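The exchange of step S164 can be sketched in simplified form. Instead of building matrix M explicitly, the sketch looks for equal-length contiguous windows of S1 and S2 that hold the same elements, which corresponds to a contiguous square submatrix without empty rows or columns. The contiguity restriction, the function names, and the 8-element example sequences are assumptions, not taken from Fig. 5.

```python
def find_common_segment(s1, s2, limit):
    """Return (i, j, length) for windows of s1 and s2 holding the same elements.

    Longer windows are preferred; only lengths below `limit` are considered.
    """
    n = len(s1)
    for length in range(min(limit, n + 1) - 1, 1, -1):
        windows = {}
        for j in range(n - length + 1):
            windows.setdefault(tuple(sorted(s2[j:j + length])), j)
        for i in range(n - length + 1):
            j = windows.get(tuple(sorted(s1[i:i + length])))
            if j is not None:
                return i, j, length
    return None


def exchange(s1, s2, limit):
    """Swap the matching windows between copies of s1 and s2."""
    a, b = list(s1), list(s2)
    found = find_common_segment(s1, s2, limit)
    if found is not None:
        i, j, length = found
        a[i:i + length] = s2[j:j + length]
        b[j:j + length] = s1[i:i + length]
    return a, b


s1 = [7, 1, 2, 3, 4, 8, 5, 6]
s2 = [4, 8, 3, 2, 1, 6, 7, 5]
new_s1, new_s2 = exchange(s1, s2, limit=len(s1) // 2)
```

Because the two windows hold the same elements, both results remain rearrangements of the original n second data, so the distribution obtained earlier is preserved.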
In step S165, the positions of the elements within each of the 2 exchanged second data sequences are adjusted according to a predetermined probability. The predetermined probability can be set to zero as actually needed, in which case the positions of the elements in the second data sequences are not adjusted.
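The specification does not say how positions are adjusted. One common choice, assumed here purely for illustration, is to swap two randomly chosen positions with the predetermined probability; the function name is hypothetical.

```python
import random


def adjust_positions(sequence, probability, rng):
    """With the given probability, swap two randomly chosen elements of a copy."""
    result = list(sequence)
    if len(result) >= 2 and rng.random() < probability:
        i, j = rng.sample(range(len(result)), 2)
        result[i], result[j] = result[j], result[i]
    return result


rng = random.Random(1)
original = [1, 2, 3, 4, 5]
unchanged = adjust_positions(original, 0.0, rng)  # probability zero: no adjustment
swapped = adjust_positions(original, 1.0, rng)    # probability one: always swaps
```

A swap keeps the sequence a rearrangement of the same n second data, which is why it is a convenient adjustment here.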
In step S166, the 2 second data sequences resulting from the above operations are added to the sequence set.
In step S167, it is determined whether the number of sequences in the sequence set is greater than or equal to m. If the number of sequences in the sequence set is less than m, processing returns to step S163 and the operations of steps S163 to S167 continue. That is, steps S163 to S167 are performed iteratively until the number of sequences is greater than or equal to m. If it is determined that the number of sequences in the sequence set is greater than or equal to m, processing proceeds to step S168.
In step S168, any m-2 sequences other than the 2 second data sequences extracted in step S162 are taken from the sequence set, the m second data sequences are updated using these m-2 sequences together with the 2 second data sequences extracted in step S162, and the iteration count C is incremented by 1.
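The population update of step S168 can be sketched as follows, assuming that the sequence set holds the 2 extracted sequences plus the sequences added in step S166. The function name and the toy sequences are hypothetical.

```python
import random


def update_population(elite, sequence_set, m, rng):
    """Keep the 2 extracted sequences and add any m-2 of the other sequences."""
    # Exclude the extracted sequences by identity, so that an offspring
    # that happens to equal one of them is still eligible.
    others = [s for s in sequence_set if all(s is not e for e in elite)]
    return list(elite) + rng.sample(others, m - 2)


rng = random.Random(0)
elite = [[1, 2, 3], [1, 3, 2]]
sequence_set = elite + [[2, 1, 3], [3, 1, 2], [2, 3, 1]]
population = update_population(elite, sequence_set, m=4, rng=rng)
```

Carrying the 2 best sequences forward unchanged means the best score in the population never decreases from one iteration to the next.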
Then, in step S169, it is determined whether the iteration count C has reached a predetermined number. If the iteration count C has not yet reached the predetermined number, processing returns to step S162 and the operations of steps S162 to S169 continue. That is, steps S162 to S169 are performed iteratively until the iteration count reaches the predetermined number. If it is determined that the iteration count has reached the predetermined number, processing proceeds to step S1610.
In step S1610, the second data sequence with the highest score is selected from the finally obtained sequence set as the rearranged n second data.
The method of generating n second data based on the distribution obtained from the predetermined function and on the predetermined data range has been described above. If the predetermined data range and the distribution do not match, however, data generation will take a considerable amount of time. For example, as shown in Fig. 6, if the user selects a Gaussian function as the predetermined function but selects [17, 25] as the predetermined data range, the distribution obtained from that function does not match the predetermined data range: the probability that a datum falls within [17, 25] is too small, so generating the second data takes a considerable amount of time.
In view of this, in another embodiment according to the present invention, the method 100 for generating sensing data may further include step S130 and step S140. In step S130, it is determined whether the predetermined data range is suitable for the distribution obtained based on the predetermined function. If it is determined that the predetermined data range is not suitable for the distribution, then in step S140 the user is notified to reselect the predetermined data range and/or the predetermined function, after which processing returns to step S120 and continues. That is, steps S120 to S140 are performed iteratively until the selected predetermined data range matches the predetermined function. If, on the other hand, it is determined in step S130 that the predetermined data range is suitable for the obtained distribution, processing proceeds to step S150.
As an example, determining in step S130 whether the predetermined data range is suitable for the distribution obtained based on the predetermined function may include, as shown in Fig. 6: drawing the curve of the predetermined function in a two-dimensional coordinate system; drawing a large number of randomly scattered points between the curve and the X axis within a predetermined domain on the X axis of the coordinate system (such as 0 to 25 in Fig. 6); and determining whether the predetermined data range is suitable for the distribution based on the ratio of the number of points within the predetermined data range to the total number of points. Here, the ratio of the number of points within the predetermined data range to the total number of points can be compared with a predetermined threshold: if the ratio is greater than the threshold, the obtained distribution is considered to match the predetermined data range; otherwise the range is considered unsuitable.
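The suitability check of step S130 can be sketched as follows. The Gaussian parameters (mean 10, standard deviation 3), the threshold of 0.1, and all names are assumptions chosen for illustration; only the domain 0 to 25 and the range [17, 25] come from the example above.

```python
import math
import random


def range_matches(f, x_min, x_max, y_max, data_range, threshold, points=20_000, seed=0):
    """Return True if the share of scattered points inside data_range exceeds threshold."""
    rng = random.Random(seed)
    low, high = data_range
    kept = inside = 0
    while kept < points:
        x = rng.uniform(x_min, x_max)
        y = rng.uniform(0.0, y_max)
        if y <= f(x):  # point lies between the curve and the X axis
            kept += 1
            if low <= x <= high:
                inside += 1
    return inside / kept > threshold


def gaussian(x):
    """Hypothetical unnormalized Gaussian with mean 10 and standard deviation 3."""
    return math.exp(-((x - 10.0) ** 2) / (2 * 3.0 ** 2))


tail_ok = range_matches(gaussian, 0.0, 25.0, 1.0, (17.0, 25.0), threshold=0.1)
center_ok = range_matches(gaussian, 0.0, 25.0, 1.0, (5.0, 15.0), threshold=0.1)
```

Under these assumed parameters only about 1% of the points fall in [17, 25], so the check rejects that range, while a range around the peak passes.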
The method 100 for generating sensing data according to an embodiment of the present invention has been described above with reference to Figs. 1 to 6. Next, the apparatus for generating sensing data according to an embodiment of the present invention is described with reference to Figs. 7 to 9.
Fig. 7 shows a structural block diagram of the apparatus 700 for generating sensing data according to an embodiment of the present invention. As shown in Fig. 7, apparatus 700 includes: a first data generating device 710 configured to generate a first data sequence comprising n first data based on a predetermined trend characterizing the relationship between the magnitude of the data and the order of the data; an acquisition device 720 configured to obtain, based on a predetermined function, a distribution characterizing the relationship between the magnitude of the data and the quantity of the data; a second data generating device 730 configured to generate n second data according to the distribution and a predetermined data range; and a rearrangement device 740 configured to rearrange the n second data based on the first data sequence, so that the trend of the rearranged n second data approximates the predetermined trend.
In an embodiment according to the present invention, acquisition device 720 is configured to: draw the curve of the predetermined function in a two-dimensional coordinate system; draw a large number of randomly scattered points between the curve and the X axis within a predetermined domain on the X axis of the coordinate system; divide the predetermined domain into multiple non-overlapping intervals; and obtain the distribution based on the ratio of the number of points contained in each interval to the total number of points.
Fig. 8 shows a structural block diagram of the second data generating device in the apparatus according to an embodiment of the present invention. As shown in Fig. 8, the second data generating device 730 includes an iteration unit 731 and a retrieval unit 732. Iteration unit 731 is configured to repeat the following operations until the number of data in the data set is greater than or equal to n: calculating the number of second data in each interval based on the value of n and on the ratio of the number of points contained in that interval to the total number of points; randomly determining, within the value range of each interval, the magnitudes of the corresponding number of second data in that interval; and extracting the second data that fall within the predetermined data range and adding them to the data set. Retrieval unit 732 is configured to take n data at random from the data set as the n second data.
Fig. 9 shows a structural block diagram of the rearrangement device in the apparatus according to an embodiment of the present invention. As shown in Fig. 9, rearrangement device 740 may include a selection unit 741, an iteration unit 742, and an acquiring unit 743. Selection unit 741 is configured to select m permutations at random from the n! full permutations of the n second data, giving m second data sequences. Iteration unit 742 is configured to repeat the following operations until a predetermined number of iterations is reached:
1) calculating a score characterizing the similarity in trend between each of the m second data sequences and the first data sequence, taking the 2 second data sequences with the highest scores from the m second data sequences, and adding the 2 sequences taken to a sequence set;
2) performing the following operations until the number of sequences in the sequence set is greater than or equal to m:
i) selecting 2 second data sequences from the m second data sequences using the roulette-wheel selection algorithm;
ii) exchanging the sub-portions of the 2 second data sequences that contain identical elements;
iii) adjusting the positions of the elements in the 2 exchanged second data sequences according to a predetermined probability; and
iv) adding the 2 second data sequences resulting from the above operations to the sequence set; and
3) updating the m second data sequences using the 2 second data sequences extracted in 1) together with any m-2 sequences in the sequence set other than the 2 second data sequences extracted in 1).
Acquiring unit 743 is configured to obtain the second data sequence with the highest score from the finally obtained sequence set as the rearranged n second data.
In another embodiment according to the present invention, acquisition device 720 is configured to repeat the following operations until the predetermined data range is suitable for the distribution: 1) obtaining, based on the predetermined function, the distribution characterizing the relationship between the magnitude of the data and the quantity of the data; 2) determining whether the predetermined data range is suitable for the obtained distribution; and 3) when it is determined that the predetermined data range is not suitable for the obtained distribution, updating the predetermined data range and/or the predetermined function.
In another embodiment, 1) obtaining the distribution based on the predetermined function includes: drawing the curve of the predetermined function in a two-dimensional coordinate system; drawing a large number of randomly scattered points between the curve and the X axis within a predetermined domain on the X axis of the coordinate system; dividing the predetermined domain into multiple non-overlapping intervals; and obtaining the distribution based on the ratio of the number of points contained in each interval to the total number of points.
For the specific operations of the above devices and units, reference may be made to the corresponding description of the method for generating sensing data above; they are not repeated here.
The method and apparatus for generating sensing data according to embodiments of the present invention can generate sensing data having a predetermined trend and a predetermined distribution.
It should also be noted that each component of the above apparatus may be configured by software, firmware, hardware, or a combination thereof. The specific means or manner of configuration is well known to those skilled in the art and is not described here. In the case of implementation by software or firmware, a program constituting the software is installed from a storage medium or a network onto a computer having a dedicated hardware structure (for example, the general-purpose computer 1000 shown in Fig. 10), and the computer, when various programs are installed, is capable of executing various functions and the like.
Fig. 10 shows a schematic block diagram of a computer that can be used to implement the method and apparatus according to an embodiment of the present invention.
In Fig. 10, a central processing unit (CPU) 1001 executes various processing according to programs stored in read-only memory (ROM) 1002 or programs loaded from storage section 1008 into random access memory (RAM) 1003. RAM 1003 also stores, as needed, the data required when CPU 1001 executes the various processing. CPU 1001, ROM 1002, and RAM 1003 are connected to one another via a bus 1004. An input/output interface 1005 is also connected to bus 1004.
The following components are connected to input/output interface 1005: an input section 1006 (including a keyboard, a mouse, etc.), an output section 1007 (including a display, such as a cathode-ray tube (CRT) or a liquid crystal display (LCD), and a loudspeaker, etc.), a storage section 1008 (including a hard disk, etc.), and a communication section 1009 (including a network interface card such as a LAN card, a modem, etc.). Communication section 1009 performs communication processing via a network such as the Internet. A drive 1010 may also be connected to input/output interface 1005 as needed. A removable medium 1011, such as a magnetic disk, an optical disc, a magneto-optical disk, or a semiconductor memory, is mounted on drive 1010 as needed, so that a computer program read from it is installed into storage section 1008 as needed.
In the case where the above series of processing is implemented by software, the program constituting the software is installed from a network such as the Internet or from a storage medium such as removable medium 1011.
Those skilled in the art will understand that this storage medium is not limited to the removable medium 1011 shown in Fig. 10, in which the program is stored and which is distributed separately from the apparatus in order to provide the program to the user. Examples of removable medium 1011 include magnetic disks (including floppy disks (registered trademark)), optical discs (including compact disc read-only memory (CD-ROM) and digital versatile disc (DVD)), magneto-optical disks (including MiniDisc (MD) (registered trademark)), and semiconductor memory. Alternatively, the storage medium may be ROM 1002, a hard disk contained in storage section 1008, or the like, in which the program is stored and which is distributed to the user together with the apparatus containing it.
The present invention also proposes a program product storing machine-readable instruction code. When the instruction code is read and executed by a machine, the above method according to embodiments of the present invention can be performed.
Correspondingly, a storage medium carrying the above program product storing machine-readable instruction code is also included within the scope of the present invention. The storage medium includes, but is not limited to, floppy disks, optical discs, magneto-optical disks, memory cards, memory sticks, and the like.
It should be noted that method of the invention be not limited to specifications described in time sequencing execute, can also be by According to other order of order, concurrently or independently it executes.Therefore, the execution sequence of method described in this specification is not right Technical scope of the invention is construed as limiting.
It is above for a better understanding of the present invention, to be only exemplary to the description of each embodiment of the present invention, And it is not intended to limit the invention.It should be noted that in the above description, describing and/or showing for a kind of embodiment Feature can be used in one or more other embodiments in a manner of same or similar, in other embodiment Feature is combined, or the feature in substitution other embodiment.It will be understood by those skilled in the art that of the invention not departing from In the case where inventive concept, for the variations and modifications that embodiment described above carries out, belong to of the invention In range.
To sum up, in embodiment according to the present invention, the present invention provides following technical solutions.
Scheme 1. A method for generating sensing data, comprising:
generating a first data sequence including n first data based on a predetermined trend characterizing the relationship between the magnitude of data and the order of the data;
obtaining, based on a predetermined function, a distribution characterizing the relationship between the magnitude of data and the quantity of data;
generating n second data according to the distribution and a predetermined data range; and
rearranging the n second data based on the first data sequence, so that the trend of the rearranged n second data approaches the predetermined trend.
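The first step of scheme 1 can be illustrated with a minimal Python sketch. It assumes the predetermined trend is given as a function of a normalized position in [0, 1]; the names `first_data_sequence` and `daily_cycle`, and the sinusoidal shape, are hypothetical and are not taken from the patent text.

```python
import math


def first_data_sequence(trend, n):
    """Sample a trend function at n evenly spaced positions in [0, 1].

    The result relates the magnitude of each datum to its order in the
    sequence, which is the role of the first data sequence in scheme 1.
    """
    if n < 2:
        raise ValueError("n must be at least 2")
    return [trend(i / (n - 1)) for i in range(n)]


def daily_cycle(t):
    """Hypothetical trend: one sinusoidal cycle around a base level of 20."""
    return 20.0 + 5.0 * math.sin(2.0 * math.pi * t)
```

For example, `first_data_sequence(daily_cycle, 24)` would give 24 values following one full cycle; any other trend function could be substituted.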
Scheme 2. The method according to scheme 1, wherein obtaining, based on the predetermined function, the distribution characterizing the relationship between the magnitude of data and the quantity of data comprises:
drawing a distribution curve of the predetermined function in a two-dimensional coordinate system;
drawing a large number of randomly scattered points between the distribution curve and the X-axis within a predetermined domain on the X-axis of the two-dimensional coordinate system;
dividing the predetermined domain into a plurality of non-overlapping intervals; and
obtaining the distribution based on the ratio of the number of points contained in each interval to the total number of points.
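The point-scattering procedure of scheme 2 can be sketched as follows. This is only one possible reading: it assumes the predetermined function is non-negative on the domain, scatters points by rejection sampling, and uses equal-width intervals. The names `estimate_distribution` and `bell`, the grid-based upper bound, and the default of 10000 points are assumptions made for illustration.

```python
import math
import random


def estimate_distribution(func, domain, num_bins, num_points=10000, seed=0):
    """Estimate per-interval ratios by scattering points under the curve of func."""
    rng = random.Random(seed)
    lo, hi = domain
    width = (hi - lo) / num_bins
    # Upper bound for the y coordinate, approximated on a coarse grid (assumption).
    y_max = max(func(lo + (hi - lo) * i / 1000) for i in range(1001))
    counts = [0] * num_bins
    kept = 0
    while kept < num_points:
        x = rng.uniform(lo, hi)
        y = rng.uniform(0.0, y_max)
        if y <= func(x):  # keep only points between the curve and the X-axis
            idx = min(int((x - lo) / width), num_bins - 1)
            counts[idx] += 1
            kept += 1
    # Ratio of the points in each interval to all kept points.
    return [c / num_points for c in counts]


def bell(x):
    """Hypothetical predetermined function: an unnormalized Gaussian shape."""
    return math.exp(-x * x / 2.0)
```

Calling `estimate_distribution(bell, (-3.0, 3.0), 6)` would return six ratios that sum to 1, with the central intervals receiving the largest shares.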
Scheme 3. The method according to scheme 1, wherein obtaining, based on the predetermined function, the distribution characterizing the relationship between the magnitude of data and the quantity of data comprises
repeating the following operations until the obtained distribution matches the predetermined data range:
obtaining the distribution based on the predetermined function;
determining whether the predetermined data range is suitable for the distribution; and
when it is determined that the predetermined data range is not suitable for the distribution, updating the predetermined data range and/or the predetermined function.
Scheme 4. The method according to scheme 3, wherein determining whether the predetermined data range is suitable for the distribution comprises:
drawing a distribution curve of the predetermined function in a two-dimensional coordinate system;
drawing a large number of randomly scattered points between the distribution curve and the X-axis within a predetermined domain on the X-axis of the two-dimensional coordinate system; and
determining whether the predetermined data range is suitable for the distribution based on the ratio of the number of points within the predetermined data range to the total number of points.
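The suitability check of scheme 4 and the repeat-until-matched loop of scheme 3 can be sketched together. The schemes do not state how the ratio is judged or how the range is updated, so the threshold of 0.9 and the strategy of widening the range by a fixed fraction of the domain are assumptions, as are the names `range_coverage`, `fit_range`, and `hump`.

```python
import random


def range_coverage(func, domain, data_range, num_points=10000, seed=0):
    """Ratio of scattered points whose x lies inside data_range to all points."""
    rng = random.Random(seed)
    lo, hi = domain
    r_lo, r_hi = data_range
    y_max = max(func(lo + (hi - lo) * i / 1000) for i in range(1001))
    inside = kept = 0
    while kept < num_points:
        x = rng.uniform(lo, hi)
        if rng.uniform(0.0, y_max) < func(x):  # point lies under the curve
            kept += 1
            if r_lo <= x <= r_hi:
                inside += 1
    return inside / kept


def fit_range(func, domain, data_range, threshold=0.9, step=0.1, max_iters=100):
    """Widen data_range until it covers at least `threshold` of the points."""
    lo, hi = domain
    r_lo, r_hi = data_range
    for _ in range(max_iters):
        if range_coverage(func, domain, (r_lo, r_hi)) >= threshold:
            return (r_lo, r_hi)
        # Assumed update rule: pad both ends, clipped to the domain.
        pad = step * (hi - lo)
        r_lo, r_hi = max(lo, r_lo - pad), min(hi, r_hi + pad)
    raise RuntimeError("no suitable data range found")


def hump(x):
    """Hypothetical predetermined function: a triangle peaking at x = 0."""
    return max(0.0, 1.0 - abs(x))
```

Updating the predetermined function instead of the range, which scheme 3 also allows, would follow the same loop structure with a different update step.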
Scheme 5. The method according to scheme 3, wherein obtaining the distribution based on the predetermined function comprises:
drawing a distribution curve of the predetermined function in a two-dimensional coordinate system;
drawing a large number of randomly scattered points between the distribution curve and the X-axis within a predetermined domain on the X-axis of the two-dimensional coordinate system;
dividing the predetermined domain into a plurality of non-overlapping intervals; and
obtaining the distribution based on the ratio of the number of points contained in each interval to the total number of points.
Scheme 6. The method according to scheme 2 or 5, wherein generating the n second data comprises:
repeating the following operations until the number of data in a data set is greater than or equal to n:
i) calculating the number of second data in each interval based on the value of n and the ratio of the number of points contained in that interval to the total number of points;
ii) randomly determining, within the value range of each interval, the magnitudes of the corresponding number of second data in that interval; and
iii) extracting the second data that fall within the predetermined data range and adding them to the data set; and
randomly taking n data out of the data set as the n second data.
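The generation loop of scheme 6 can be sketched as below, assuming equal-width intervals over the domain and per-interval ratios that were obtained beforehand. The function name `generate_second_data`, the rounding of ratio times n, and the `max_rounds` guard against a range that never fills the data set are assumptions added for illustration.

```python
import random


def generate_second_data(ratios, domain, data_range, n, seed=0, max_rounds=1000):
    """Draw values per interval, keep those inside data_range, then pick n."""
    rng = random.Random(seed)
    lo, hi = domain
    r_lo, r_hi = data_range
    width = (hi - lo) / len(ratios)
    pool = []
    for _round in range(max_rounds):
        if len(pool) >= n:
            break
        for k, ratio in enumerate(ratios):
            count = round(ratio * n)  # i) number of second data in interval k
            left = lo + k * width
            for _draw in range(count):
                value = rng.uniform(left, left + width)  # ii) random magnitude
                if r_lo <= value <= r_hi:  # iii) keep values inside the range
                    pool.append(value)
    if len(pool) < n:
        raise RuntimeError("data range is too narrow for the distribution")
    return rng.sample(pool, n)  # take n data out of the data set at random
```

With ratios `[0.1, 0.2, 0.4, 0.2, 0.1]` over the domain `(0.0, 50.0)` and a data range of `(10.0, 40.0)`, the values drawn in the two outer intervals are discarded, so more than one pass is needed before n values are available.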
Scheme 7. The method according to any one of schemes 1 to 5, wherein rearranging the n second data comprises:
randomly selecting m permutations from the n! full permutations of the n second data to obtain m second data sequences;
repeating the following operations until a predetermined number of repetitions is reached:
1) calculating, for each of the m second data sequences, a score characterizing its similarity in trend to the first data sequence, taking the 2 second data sequences with the highest scores out of the m second data sequences, and adding the 2 second data sequences taken out to a sequence set;
2) performing the following operations until the number of sequences in the sequence set is greater than or equal to m:
i) selecting 2 second data sequences from the m second data sequences using a roulette-wheel selection algorithm;
ii) exchanging sub-parts of the 2 second data sequences that contain the same elements;
iii) adjusting the positions of elements in the 2 second data sequences after the exchange according to a predetermined probability; and
iv) adding the 2 second data sequences resulting from the above operations to the sequence set; and
3) updating the m second data sequences with the 2 second data sequences taken out in 1) and any m-2 sequences in the sequence set other than the 2 second data sequences taken out in 1); and
obtaining the second data sequence with the highest score from the sequence set as the rearranged n second data.
Scheme 8. The method according to scheme 7, wherein the score is calculated as the reciprocal of the sum of the absolute values of the differences between the elements at each same position of the second data sequence and the first data sequence.
Scheme 9. The method according to scheme 7 or 8, wherein the number of elements in the sub-part is at most n/2.
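Schemes 7 to 9 describe a genetic-algorithm style rearrangement, which can be sketched as follows. Several details are interpretations rather than statements from the text: sequences are handled as permutations of indices, the sub-part exchange is implemented as an order-preserving swap of the elements found in a random segment of at most n/2 positions, the position adjustment is a single random swap, and a small constant is added to the denominator of the score so that a perfect match does not divide by zero. The names `score` and `rearrange` and all default parameters are hypothetical.

```python
import random


def score(seq, first):
    """Reciprocal of the summed absolute differences at equal positions."""
    total = sum(abs(a - b) for a, b in zip(seq, first))
    return 1.0 / (total + 1e-12)  # small guard term (assumption)


def rearrange(second, first, m=20, iterations=50, p_mut=0.1, seed=0):
    """Reorder `second` so that its trend approaches that of `first`."""
    rng = random.Random(seed)
    n = len(second)

    def fit(perm):
        return score([second[i] for i in perm], first)

    # m random permutations of the n second data (as index lists).
    population = [rng.sample(range(n), n) for _ in range(m)]
    pool = population
    for _ in range(iterations):
        # 1) the two best sequences go into the sequence set.
        elite = sorted(population, key=fit, reverse=True)[:2]
        pool = [list(p) for p in elite]
        weights = [fit(p) for p in population]
        # 2) fill the sequence set with offspring.
        while len(pool) < m:
            a, b = (list(p) for p in rng.choices(population, weights=weights, k=2))
            length = rng.randint(1, max(1, n // 2))  # sub-part of at most n/2
            start = rng.randrange(n - length + 1)
            order_in_a = a[start:start + length]
            members = set(order_in_a)
            order_in_b = [x for x in b if x in members]
            pos_in_b = [j for j, x in enumerate(b) if x in members]
            a[start:start + length] = order_in_b  # same elements, order of b
            for j, x in zip(pos_in_b, order_in_a):
                b[j] = x  # same elements, order of a
            for child in (a, b):
                if rng.random() < p_mut:  # position adjustment
                    i, j = rng.sample(range(n), 2)
                    child[i], child[j] = child[j], child[i]
            pool += [a, b]
        # 3) next generation: the two best plus m-2 others from the set.
        population = elite + pool[2:m]
    best = max(pool, key=fit)
    return [second[i] for i in best]
```

Because the two best sequences are carried over unchanged in every repetition, the best score in the sequence set never decreases from one repetition to the next. The sketch assumes n is at least 2 and m is at least 2.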
Scheme 10. An apparatus for generating sensing data, comprising:
a first data generating device for generating a first data sequence including n first data based on a predetermined trend characterizing the relationship between the magnitude of data and the order of the data;
an acquisition device for obtaining, based on a predetermined function, a distribution characterizing the relationship between the magnitude of data and the quantity of data;
a second data generating device for generating n second data according to the distribution and a predetermined data range; and
a rearrangement device for rearranging the n second data based on the first data sequence, so that the trend of the rearranged n second data approaches the predetermined trend.
Scheme 11. The apparatus according to scheme 10, wherein the acquisition device is configured to:
draw a distribution curve of the predetermined function in a two-dimensional coordinate system;
draw a large number of randomly scattered points between the distribution curve and the X-axis within a predetermined domain on the X-axis of the two-dimensional coordinate system;
divide the predetermined domain into a plurality of non-overlapping intervals; and
obtain the distribution based on the ratio of the number of points contained in each interval to the total number of points.
Scheme 12. The apparatus according to scheme 10, wherein the acquisition device is configured to
repeat the following operations until the obtained distribution matches the predetermined data range:
obtaining the distribution based on the predetermined function;
determining whether the predetermined data range is suitable for the distribution; and
when it is determined that the predetermined data range is not suitable for the distribution, updating the predetermined data range and/or the predetermined function.
Scheme 13. The apparatus according to scheme 12, wherein determining whether the predetermined data range is suitable for the distribution comprises:
drawing a distribution curve of the predetermined function in a two-dimensional coordinate system;
drawing a large number of randomly scattered points between the distribution curve and the X-axis within a predetermined domain on the X-axis of the two-dimensional coordinate system; and
determining whether the predetermined data range is suitable for the distribution based on the ratio of the number of points within the predetermined data range to the total number of points.
Scheme 14. The apparatus according to scheme 12, wherein obtaining the distribution based on the predetermined function comprises:
drawing a distribution curve of the predetermined function in a two-dimensional coordinate system;
drawing a large number of randomly scattered points between the distribution curve and the X-axis within a predetermined domain on the X-axis of the two-dimensional coordinate system;
dividing the predetermined domain into a plurality of non-overlapping intervals; and
obtaining the distribution based on the ratio of the number of points contained in each interval to the total number of points.
Scheme 15. The apparatus according to scheme 11 or 14, wherein the second data generating device comprises:
an iteration unit configured to repeat the following operations until the number of data in a data set is greater than or equal to n:
i) calculating the number of second data in each interval based on the value of n and the ratio of the number of points contained in that interval to the total number of points;
ii) randomly determining, within the value range of each interval, the magnitudes of the corresponding number of second data in that interval; and
iii) extracting the second data that fall within the predetermined data range and adding them to the data set; and
a retrieval unit configured to randomly take n data out of the data set as the n second data.
Scheme 16. The apparatus according to any one of schemes 10 to 14, wherein the rearrangement device comprises:
a selection unit configured to randomly select m permutations from the n! full permutations of the n second data to obtain m second data sequences;
an iteration unit configured to repeat the following operations until a predetermined number of repetitions is reached:
1) calculating, for each of the m second data sequences, a score characterizing its similarity in trend to the first data sequence, taking the 2 second data sequences with the highest scores out of the m second data sequences, and adding the 2 second data sequences taken out to a sequence set;
2) performing the following operations until the number of sequences in the sequence set is greater than or equal to m:
i) selecting 2 second data sequences from the m second data sequences using a roulette-wheel selection algorithm;
ii) exchanging sub-parts of the 2 second data sequences that contain the same elements;
iii) adjusting the positions of elements in the 2 second data sequences after the exchange according to a predetermined probability; and
iv) adding the 2 second data sequences resulting from the above operations to the sequence set; and
3) updating the m second data sequences with the 2 second data sequences taken out in 1) and any m-2 sequences in the sequence set other than the 2 second data sequences taken out in 1); and
an acquiring unit configured to obtain the second data sequence with the highest score from the sequence set as the rearranged n second data.
Scheme 17. The apparatus according to scheme 16, wherein the score is calculated as the reciprocal of the sum of the absolute values of the differences between the elements at each same position of the second data sequence and the first data sequence.
Scheme 18. The apparatus according to scheme 16 or 17, wherein the number of elements in the sub-part is at most n/2.
Scheme 19. A computer-readable storage medium storing a program that can be run by a processor to perform the following operations:
generating a first data sequence including n first data based on a predetermined trend characterizing the relationship between the magnitude of data and the order of the data;
obtaining, based on a predetermined function, a distribution characterizing the relationship between the magnitude of data and the quantity of data;
generating n second data according to the distribution and a predetermined data range; and
rearranging the n second data based on the first data sequence, so that the trend of the rearranged n second data approaches the predetermined trend.

Claims (10)

1. A method for generating sensing data, comprising:
generating a first data sequence including n first data based on a predetermined trend characterizing the relationship between the magnitude of data and the order of the data;
obtaining, based on a predetermined function, a distribution characterizing the relationship between the magnitude of data and the quantity of data;
generating n second data according to the distribution and a predetermined data range; and
rearranging the n second data based on the first data sequence, so that the trend of the rearranged n second data approaches the predetermined trend.
2. The method according to claim 1, wherein obtaining, based on the predetermined function, the distribution characterizing the relationship between the magnitude of data and the quantity of data comprises
repeating the following operations until the obtained distribution matches the predetermined data range:
obtaining the distribution based on the predetermined function;
determining whether the predetermined data range is suitable for the distribution; and
when it is determined that the predetermined data range is not suitable for the distribution, updating the predetermined data range and/or the predetermined function.
3. The method according to claim 2, wherein determining whether the predetermined data range is suitable for the distribution comprises:
drawing a distribution curve of the predetermined function in a two-dimensional coordinate system;
drawing a large number of randomly scattered points between the distribution curve and the X-axis within a predetermined domain on the X-axis of the two-dimensional coordinate system; and
determining whether the predetermined data range is suitable for the distribution based on the ratio of the number of points within the predetermined data range to the total number of points.
4. The method according to claim 2, wherein obtaining the distribution based on the predetermined function comprises:
drawing a distribution curve of the predetermined function in a two-dimensional coordinate system;
drawing a large number of randomly scattered points between the distribution curve and the X-axis within a predetermined domain on the X-axis of the two-dimensional coordinate system;
dividing the predetermined domain into a plurality of non-overlapping intervals; and
obtaining the distribution based on the ratio of the number of points contained in each interval to the total number of points.
5. The method according to claim 1, wherein obtaining, based on the predetermined function, the distribution characterizing the relationship between the magnitude of data and the quantity of data comprises:
drawing a distribution curve of the predetermined function in a two-dimensional coordinate system;
drawing a large number of randomly scattered points between the distribution curve and the X-axis within a predetermined domain on the X-axis of the two-dimensional coordinate system;
dividing the predetermined domain into a plurality of non-overlapping intervals; and
obtaining the distribution based on the ratio of the number of points contained in each interval to the total number of points.
6. The method according to claim 4 or 5, wherein generating the n second data comprises:
repeating the following operations until the number of data in a data set is greater than or equal to n:
i) calculating the number of second data in each interval based on the value of n and the ratio of the number of points contained in that interval to the total number of points;
ii) randomly determining, within the value range of each interval, the magnitudes of the corresponding number of second data in that interval; and
iii) extracting the second data that fall within the predetermined data range and adding them to the data set; and
randomly taking n data out of the data set as the n second data.
7. The method according to any one of claims 1 to 5, wherein rearranging the n second data comprises:
randomly selecting m permutations from the n! full permutations of the n second data to obtain m second data sequences;
repeating the following operations until a predetermined number of repetitions is reached:
1) calculating, for each of the m second data sequences, a score characterizing its similarity in trend to the first data sequence, taking the 2 second data sequences with the highest scores out of the m second data sequences, and adding the 2 second data sequences taken out to a sequence set;
2) performing the following operations until the number of sequences in the sequence set is greater than or equal to m:
i) selecting 2 second data sequences from the m second data sequences using a roulette-wheel selection algorithm;
ii) exchanging sub-parts of the 2 second data sequences that contain the same elements;
iii) adjusting the positions of elements in the 2 second data sequences after the exchange according to a predetermined probability; and
iv) adding the 2 second data sequences resulting from the above operations to the sequence set; and
3) updating the m second data sequences with the 2 second data sequences taken out in 1) and any m-2 sequences in the sequence set other than the 2 second data sequences taken out in 1); and
obtaining the second data sequence with the highest score from the sequence set as the rearranged n second data.
8. The method according to claim 7, wherein the score is calculated as the reciprocal of the sum of the absolute values of the differences between the elements at each same position of the second data sequence and the first data sequence.
9. An apparatus for generating sensing data, comprising:
a first data generating device for generating a first data sequence including n first data based on a predetermined trend characterizing the relationship between the magnitude of data and the order of the data;
an acquisition device for obtaining, based on a predetermined function, a distribution characterizing the relationship between the magnitude of data and the quantity of data;
a second data generating device for generating n second data according to the distribution and a predetermined data range; and
a rearrangement device for rearranging the n second data based on the first data sequence, so that the trend of the rearranged n second data approaches the predetermined trend.
10. A computer-readable storage medium storing a program that can be run by a processor to perform the following operations:
generating a first data sequence including n first data based on a predetermined trend characterizing the relationship between the magnitude of data and the order of the data;
obtaining, based on a predetermined function, a distribution characterizing the relationship between the magnitude of data and the quantity of data;
generating n second data according to the distribution and a predetermined data range; and
rearranging the n second data based on the first data sequence, so that the trend of the rearranged n second data approaches the predetermined trend.
CN201810027130.2A 2018-01-11 2018-01-11 Generate the method and apparatus and computer readable storage medium of sensing data Pending CN110032745A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810027130.2A CN110032745A (en) 2018-01-11 2018-01-11 Generate the method and apparatus and computer readable storage medium of sensing data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810027130.2A CN110032745A (en) 2018-01-11 2018-01-11 Generate the method and apparatus and computer readable storage medium of sensing data

Publications (1)

Publication Number Publication Date
CN110032745A true CN110032745A (en) 2019-07-19

Family

ID=67234737

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810027130.2A Pending CN110032745A (en) 2018-01-11 2018-01-11 Generate the method and apparatus and computer readable storage medium of sensing data

Country Status (1)

Country Link
CN (1) CN110032745A (en)

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH0744357A (en) * 1993-08-03 1995-02-14 Seiko Epson Corp Rearranging device for data
US5432724A (en) * 1992-12-04 1995-07-11 U.S. Philips Corporation Processor for uniform operations on respective series of successive data in respective parallel data streams
CN1303061A (en) * 1999-10-21 2001-07-11 国际商业机器公司 System and method of sequencing and classifying attributes for better visible of multidimentional data
CN101568900A (en) * 2006-12-22 2009-10-28 日本电气株式会社 Parallel sort device, method, and program
US8214370B1 (en) * 2009-03-26 2012-07-03 Crossbow Technology, Inc. Data pre-processing and indexing for efficient retrieval and enhanced presentation
CN102930155A (en) * 2012-10-30 2013-02-13 国网能源研究院 Method and device for acquiring early-warming parameters of power demands
US20140279874A1 (en) * 2013-03-15 2014-09-18 Timmie G. Reiter Systems and methods of data stream generation
CN104881267A (en) * 2015-03-04 2015-09-02 西安电子科技大学 Weight method-based generation method of complex Nakagami-m fading random sequences
JP2015184877A (en) * 2014-03-24 2015-10-22 株式会社日立ソリューションズ Data processor and data processing program
CN106775582A (en) * 2015-11-20 2017-05-31 中移(杭州)信息技术有限公司 A kind of method and apparatus for generating random sequence
US20170249534A1 (en) * 2016-02-29 2017-08-31 Fujitsu Limited Method and apparatus for generating time series data sets for predictive analysis


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
储久良: "改进随机函数局限性样本取数优化算法", 《现代电子技术》 *
郭斯羽等: "一种挖掘相似子趋势的可变递增步长算法", 《浙江大学学报(工学版)》 *


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication (application publication date: 20190719)