CN109685101B - Multi-dimensional data self-adaptive acquisition method and system - Google Patents

Info

Publication number
CN109685101B
CN109685101B CN201811345413.8A CN201811345413A CN109685101B CN 109685101 B CN109685101 B CN 109685101B CN 201811345413 A CN201811345413 A CN 201811345413A CN 109685101 B CN109685101 B CN 109685101B
Authority
CN
China
Prior art keywords
data
acquisition
adaptive
dimensional
multidimensional
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811345413.8A
Other languages
Chinese (zh)
Other versions
CN109685101A (en
Inventor
蔺华庆
闫峥
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xidian University
Original Assignee
Xidian University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xidian University filed Critical Xidian University
Priority to CN201811345413.8A priority Critical patent/CN109685101B/en
Publication of CN109685101A publication Critical patent/CN109685101A/en
Application granted granted Critical
Publication of CN109685101B publication Critical patent/CN109685101B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G06F18/2135Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods based on approximation criteria, e.g. principal component analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/10Pre-processing; Data cleansing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention belongs to the technical field of big data acquisition and discloses an adaptive acquisition method and system for multidimensional data. The method reduces the multidimensional data to one dimension with a dimension reduction technique, yielding the one-dimensional principal component of the data; uses that principal component as the reference data for judging data change and feeds it into a one-dimensional adaptive data acquisition algorithm; and uses that algorithm to adjust the acquisition process of the multidimensional big data. The approach is feasible because PCA reduces dimension using the covariance of the multidimensional data, while one-dimensional adaptive acquisition likewise adjusts the acquisition frequency according to the magnitude of data change; experiments show the feasibility and effectiveness of the method. The invention applies to any service scenario that acquires multidimensional big data, and it improves acquisition performance while maintaining acquisition accuracy, thereby improving the efficiency of the application service.

Description

Multi-dimensional data self-adaptive acquisition method and system
Technical Field
The invention belongs to the technical field of big data acquisition, and particularly relates to a multi-dimensional data self-adaptive acquisition method and system.
Background
At present, the state of the art commonly used in the industry is as follows. In current internet application scenarios, data is becoming more and more important. Data is the basis on which many services are implemented, and data collection is the performance bottleneck of most data-related service systems. In network security, for example, a network system is protected by collecting communication data and analyzing its characteristics to detect attacks and intrusions. In the big data era, however, data has the 5V characteristics, and traditional data acquisition based on statistical sampling (periodic sampling, Poisson sampling and random sampling) can no longer meet current requirements. Furthermore, with the development of artificial intelligence, intelligent services have permeated every aspect of people's lives, so data collection today usually targets multidimensional rather than one-dimensional data. In short, adaptive acquisition of multidimensional big data is an urgent problem in the current big data era.
Beyond the traditional acquisition methods based on statistical sampling, prior work has proposed adaptive acquisition methods for one-dimensional data, such as predictive algorithms based on regression analysis and time series analysis. These methods adjust the data acquisition frequency adaptively, reducing the amount of data acquired and improving acquisition performance while maintaining acquisition accuracy. However, they cannot be applied to multidimensional data and therefore cannot solve the problem of adaptive acquisition of multidimensional data. In a one-dimensional adaptive data acquisition algorithm, the acquisition process is adjusted according to the change of the data: when the data changes greatly, the acquisition frequency is increased and more data is acquired to maintain acquisition accuracy; when the data changes little, the acquisition frequency is reduced, lowering the burden that data acquisition places on the application system. For multidimensional data, however, which of the multiple dimensions should serve as the reference data for this adaptive adjustment is an unsolved problem. Current work gives no solution; that is, no adaptive acquisition scheme for multidimensional data exists in current research.
In summary, the problems of the prior art are as follows. Most enterprises currently collect data with traditional statistical sampling methods, such as periodic, random, stratified and Poisson sampling. These methods can acquire multidimensional data directly but cannot acquire it adaptively. In addition, data acquisition for data analysis is at present full acquisition. In the current big data era the data volume keeps growing, and adaptive sampling is needed to reduce the amount of data acquired. No adaptive acquisition method for multidimensional big data has been proposed so far, which is why the present invention proposes one. Such a method will be of great value in the coming big data era: it avoids the performance bottleneck of data acquisition and thereby better supports the implementation of services.
The difficulty and significance of solving these technical problems are as follows. No solution to the problem of multidimensional big data acquisition has been proposed in current work. Data acquisition is typically accomplished with conventional sampling algorithms (periodic, random and Poisson sampling). These algorithms cannot adjust acquisition adaptively according to the context, so they reduce the amount of acquired data only at the cost of acquisition accuracy. The adaptive acquisition schemes proposed so far, based on regression prediction or time series analysis, are directed at one-dimensional data and cannot be applied to multidimensional data. Because the problem of finding reference data within multidimensional data is unsolved, that is, there is no reference data with which to adjust the acquisition process, adaptive acquisition of multidimensional big data cannot be realized.
Disclosure of Invention
Aiming at the problems in the prior art, the invention provides a multidimensional data self-adaptive acquisition method and system.
The invention is realized as follows. A multidimensional data adaptive acquisition method combines a one-dimensional data adaptive acquisition algorithm with a dimension reduction technique to adaptively adjust, and thereby carry out, the acquisition of multidimensional data. The method comprises the following steps:
Step one: using a dimension reduction technique, reduce the multidimensional data of the acquisition target to one dimension.
Step two: use the one-dimensional principal component obtained by reducing the original multidimensional data as the reference data for judging data change, and adjust the acquisition frequency of the multidimensional data accordingly, realizing adaptive acquisition of the multidimensional data.
In this way, adaptive acquisition of multidimensional data in the current big data scenario is achieved.
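Step one can be illustrated with a minimal NumPy sketch. The function names `fit_first_axis` and `to_reference` are hypothetical, and computing PCA through a singular value decomposition is one common choice, not something the text prescribes. Projecting the raw (uncentered) samples onto the axis is also an assumption made here so that the reference values keep a non-zero level.

```python
import numpy as np


def fit_first_axis(samples: np.ndarray) -> np.ndarray:
    """Return the unit-length first principal axis of an (N, n) sample array."""
    centered = samples - samples.mean(axis=0)
    # Rows of vt are the principal axes, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    axis = vt[0]
    # SVD leaves the sign of each axis arbitrary; fix it for reproducibility.
    return axis if axis.sum() >= 0 else -axis


def to_reference(samples: np.ndarray, axis: np.ndarray) -> np.ndarray:
    """Project n-dimensional samples onto the axis: one reference value per sample."""
    return samples @ axis


# Four two-dimensional samples that all lie on the line y = x.
samples = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 3.0]])
axis = fit_first_axis(samples)
reference = to_reference(samples, axis)
```

Standard PCA scores subtract the sample mean before projecting; the values returned here differ from those scores only by a constant offset, so changes between consecutive samples are the same.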
The multidimensional data self-adaptive acquisition method specifically comprises the following steps:
(1) Suppose the target of data acquisition is $Y_i = (y_1, y_2, y_3, \ldots, y_n)$, which is multidimensional data; $y_j$ ($j = 1, 2, 3, \ldots, n$) is each dimension of the target data, and $i = t+1, t+2, t+3, \ldots, t+N_r$, where $t$ is a certain acquisition time point. The predicted value of the data is defined as
Figure GDA0001973585630000031
and $Y_i$ is the actual value. $N_r$ is the actual number of acquisitions, $N_p$ is the number of predicted acquisitions, and $N_p = N_r$.
(2) The multidimensional data is reduced to a one-dimensional principal component, which serves as the reference data, mainly using the PCA algorithm:
$y_i = \mathrm{PCA}(Y_i)$
Figure GDA0001973585630000032
(3) calculating the mean value of the actual value and the predicted value of the target data based on the one-dimensional principal components as follows:
Figure GDA0001973585630000033
Figure GDA0001973585630000034
(4) The ratio of the mean predicted value to the mean actual value is denoted $R_M$. When $R_M$ is approximately 1, the data has not changed substantially; when $R_M$ is significantly greater than or less than 1, the target data value has changed greatly.
Figure GDA0001973585630000035
In theory, the data change ratio can also be calculated from the variance, denoted $R_D$. The calculation is as follows:
Figure GDA0001973585630000036
Figure GDA0001973585630000041
Figure GDA0001973585630000042
(5) based on the change ratio of the data, the specific adjustment method of the data acquisition process is as follows:
Figure GDA0001973585630000043
Figure GDA0001973585630000044
where $T_i$ represents the current sampling interval and $T_{i-1}$ the previous sampling interval. $T_{inc}$ is the amount by which the sampling interval is increased; $T_{dec}$ is the amount by which it is reduced. $Thr_u$ and $Thr_l$ are the upper and lower thresholds for judging data change. When $R_M$ is greater than $Thr_l$ and less than $Thr_u$, the data change is small and the acquisition interval should be increased by $T_{inc}$; when $R_M$ is greater than $Thr_u$ or less than $Thr_l$, the data change is large and the acquisition interval should be reduced by $T_{dec}$. $T_{max}$ and $T_{min}$ are the maximum and minimum values of the data acquisition interval: the adjusted interval must not exceed $T_{max}$ or fall below $T_{min}$, which serves as a constraint on the adjustment.
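The equation images referenced in steps (1) to (5) are not reproduced in this text. One reading that is consistent with the surrounding definitions is sketched below. The hat notation for predicted values, the symbols $M_r$, $M_p$, $D_r$ and $D_p$, and the exact form of every formula are assumptions made for illustration; they are not taken from the original figures.

```latex
\begin{aligned}
\hat{y}_i &= \mathrm{PCA}(\hat{Y}_i) \\
M_r &= \frac{1}{N_r}\sum_{i=t+1}^{t+N_r} y_i, \qquad
M_p = \frac{1}{N_p}\sum_{i=t+1}^{t+N_p} \hat{y}_i \\
R_M &= \frac{M_p}{M_r} \\
D_r &= \frac{1}{N_r}\sum_{i=t+1}^{t+N_r} (y_i - M_r)^2, \qquad
D_p = \frac{1}{N_p}\sum_{i=t+1}^{t+N_p} (\hat{y}_i - M_p)^2 \\
R_D &= \frac{D_p}{D_r} \\
T_i &=
\begin{cases}
T_{i-1} + T_{inc}, & Thr_l < R_M < Thr_u \\
T_{i-1} - T_{dec}, & \text{otherwise}
\end{cases} \\
T_i &\leftarrow \min\bigl(T_{max}, \max(T_{min}, T_i)\bigr)
\end{aligned}
```

Under this reading, $R_M$ is only informative when $M_r$ stays well away from zero, which is worth checking for whichever one-dimensional reference series is used.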
The invention also aims to provide a social network recommendation control system applying the multidimensional data adaptive acquisition method.
Another object of the present invention is to provide an intrusion detection system using the multidimensional data adaptive acquisition method.
The invention also aims to provide an asset portrait acquisition system applying the multidimensional data adaptive acquisition method.
The invention also aims to provide any business application system applying the multidimensional data self-adaptive acquisition method.
The invention provides a method for optimizing data acquisition in a big data scene. In the big data era, the data has the characteristics of large data volume, high flow rate and the like. Therefore, a more optimized method is needed to improve the performance of data collection to better support the implementation of services.
In summary, the advantages and positive effects of the invention are as follows. The original multidimensional data is reduced to a one-dimensional principal component by a dimension reduction technique (such as PCA), which finds the reference data automatically; combining this with a one-dimensional adaptive data acquisition algorithm and using the principal component to adjust the acquisition process yields an adaptive acquisition method for multidimensional big data. The most important advantage is that the problem of acquiring multidimensional data in a big data scenario is solved. The method is simple to implement, and dimension reduction techniques such as PCA have low computational complexity. It applies to any scenario that acquires multidimensional big data, such as network security, recommendation systems and social networks. It greatly reduces the amount of data acquired while maintaining acquisition accuracy, thereby improving the performance of big data acquisition and reducing the burden that acquisition places on the application system.
The required data can be acquired adaptively in scenarios that call for multidimensional big data acquisition. Data collection has a wide range of applications, including recommendation systems, social networks and intrusion detection. Conventional data acquisition is generally based on statistical sampling. In the current big data era, however, data has the 5V characteristics (Volume, Variety, Velocity, Value, Veracity). Meanwhile, with the development of technologies such as artificial intelligence, the demand for acquiring data in multiple dimensions has grown greatly. An adaptive method is needed that, without compromising the accuracy of the multidimensional data, greatly reduces the amount of data acquired, lowers the burden of data acquisition on the application system, and improves acquisition performance. The invention combines regression analysis with dimension reduction techniques from machine learning to design a general adaptive acquisition method for multidimensional data, which meets the acquisition requirements of the big data era and realizes adaptive acquisition of multidimensional data in specific service scenarios.
Drawings
Fig. 1 is a flowchart of a multidimensional data adaptive acquisition method according to an embodiment of the present invention.
Fig. 2 is a schematic structural diagram of a multidimensional data adaptive acquisition system according to an embodiment of the present invention.
Fig. 3 is a schematic diagram comparing the principles of one-dimensional data acquisition and multidimensional data acquisition according to the present invention.
Fig. 4 is a schematic diagram of the steps of multidimensional data acquisition according to the present invention.
Fig. 5 is a schematic diagram of an adaptive data acquisition result of a one-dimensional principal component according to an embodiment of the present invention.
Fig. 6 is a schematic diagram of a result of adaptive data acquisition of memory data according to an embodiment of the present invention.
Fig. 7 is a schematic diagram of an adaptive data acquisition result of CPU occupancy data according to an embodiment of the present invention.
Fig. 8 is a schematic diagram of a result of adaptive data acquisition of battery capacity data according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail with reference to the following embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
On the basis of ensuring the multi-dimensional data acquisition precision, the invention greatly reduces the data acquisition amount, thereby preventing the data acquisition from influencing the normal operation of an application system and improving the multi-dimensional data acquisition performance.
The following detailed description of the principles of the invention is provided in connection with the accompanying drawings.
As shown in fig. 1, the multidimensional data adaptive acquisition method provided by the embodiment of the present invention includes the following steps:
S101: reduce the multiple kinds of data, that is, the multidimensional data, to one dimension to obtain the one-dimensional principal component of the multidimensional data;
S102: use the one-dimensional principal component of the original multidimensional data as the reference data for judging data change, and adjust the acquisition frequency of the multidimensional data.
As shown in fig. 2, the multidimensional data adaptive acquisition method provided by the embodiment of the present invention mainly extends one-dimensional adaptive data acquisition so that it can be applied to multidimensional data acquisition scenarios. Put differently, the multidimensional big data acquisition problem is reduced to a one-dimensional data acquisition problem by means of a dimension reduction technique.
The invention mainly solves two problems: in a big data scenario, the burden placed on the application system by the large amount of data to be acquired; and the determination of reference data when a one-dimensional data acquisition algorithm is used to adaptively adjust the acquisition of multidimensional data. The main objective is to greatly reduce the amount of multidimensional data acquired while maintaining a given acquisition accuracy, thereby improving acquisition performance and preventing data acquisition from affecting the normal operation of the application system. For the adaptive acquisition of one-dimensional data, some research has already been proposed. Most of it predicts the change of the data by means such as regression analysis and time series analysis (e.g., the Auto-Regressive Moving Average model) and adjusts the acquisition process according to the change of the one-dimensional data: when the data changes greatly, the acquisition frequency is raised so that more data is acquired; when the data changes little, the acquisition frequency is lowered so that less data needs to be acquired. This maintains acquisition accuracy, reduces the amount of data acquired, improves acquisition performance, and suits big data scenarios. In current collection environments, however, and especially when data is processed by machine learning and data mining techniques, multiple kinds of data must be collected at the same time point; that is, multiple features, or multidimensional data, are collected simultaneously.
However, when several kinds of data are acquired simultaneously, it is difficult to decide which data should serve as the reference data, that is, whose changes should drive the adjustment of the acquisition frequency of the multidimensional data.
Dimension reduction is performed on the multiple kinds of data, that is, the multidimensional data, using a dimension reduction technique such as PCA (Principal Component Analysis), reducing the data to one dimension and yielding its one-dimensional principal component. The one-dimensional principal component of the original multidimensional data is then used as the reference data for judging data change, and the acquisition frequency of the multidimensional data is adjusted accordingly. PCA reduces dimension using the covariance of the multidimensional data, and the acquisition frequency is likewise adjusted according to the magnitude of data change. As shown in fig. 3, by analogy with the adaptive acquisition of one-dimensional data, the invention uses dimension reduction to reduce the multidimensional data to a one-dimensional principal component, takes that component as the reference data, and adjusts the acquisition of the multidimensional data according to its changes.
When one-dimensional data is acquired, the adaptive data acquisition algorithm adjusts the acquisition frequency according to the change of that one-dimensional data. For multidimensional data, however, it is difficult to decide which data the adjustment should be based on. Two solutions were considered:
1. Among the collected multidimensional data, find one main kind of data to serve as the reference data for adaptive adjustment; that is, adjust the frequency of multidimensional data collection according to the change of the main data, as in one-dimensional adaptive collection. This may be inaccurate. First, it is unclear how to determine the main data among multiple kinds of data: a general selection method cannot be designed, because the main data differs between business scenarios and finding it requires extensive statistical calculation on every kind of data in each scenario. Second, the main data does not necessarily reflect the trend of the other data, in which case the accuracy of the acquired data is low and the requirements of multidimensional big data acquisition cannot be met. The second solution was therefore designed, and it is the core of the present invention.
2. Perform dimension reduction on the multiple kinds of data, that is, the multidimensional data, using a dimension reduction technique such as PCA (Principal Component Analysis), reducing the data to one dimension and yielding its one-dimensional principal component. The one-dimensional principal component of the original multidimensional data is then used as the reference data for judging data change, and the acquisition frequency of the multidimensional data is adjusted accordingly.
This is reasonable because PCA reduces dimension using the covariance of the multidimensional data, and the acquisition frequency is adjusted according to the magnitude of data change. The specific operation steps are shown in fig. 4: (1) reduce the original multidimensional data to a one-dimensional principal component by PCA; (2) input the one-dimensional principal component as reference data, and the original multidimensional data as original data, into a one-dimensional adaptive data acquisition algorithm; (3) acquire the multidimensional data using the one-dimensional adaptive data acquisition algorithm. Conventional sampling methods can reduce the amount of data acquired but cannot adjust acquisition adaptively. The one-dimensional algorithm used in the multidimensional acquisition scheme can therefore be any one-dimensional adaptive data acquisition algorithm, such as one based on regression analysis or time series analysis.
The application effect of the present invention will be described in detail with reference to the simulation.
The experiment is simulated on a mobile terminal and considers the simultaneous acquisition of four kinds of data: system memory occupation, system CPU occupation, total battery capacity and CPU temperature. First, PCA reduces the four-dimensional data to a one-dimensional principal component; then a one-dimensional adaptive data acquisition algorithm (regression analysis) adjusts the acquisition frequency of the multidimensional data; finally, the simulation results are obtained and analyzed.
As shown in figs. 4 to 8, the experimental results include the adaptive acquisition result of the extracted one-dimensional principal component and the adaptive acquisition results of the multidimensional data, namely memory usage, CPU usage, battery capacity and CPU temperature, whose acquisition is adjusted according to changes in the principal component. The results show that the trend of the one-dimensional principal component reflects the trend of each kind of data. The acquisition results for each of the four kinds of data are satisfactory even though the acquisition of all four is adjusted according to the trend of the principal component, so PCA is feasible for the adaptive acquisition of multidimensional data (variables, attributes or features).
Before describing the present invention in detail, the one-dimensional adaptive data acquisition algorithm ACFAS_par (adaptive collecting frequency response adjusting length basis predicted acquired ratio) is described first. It comes from previous work. Its basic principle is to adjust the frequency of data acquisition according to data change, thereby enabling adaptive sampling. The magnitude of data change can be represented by the difference between the actual value and the predicted value of the data. When the actual value is very close to the predicted value, the current data change is very small, so the acquisition frequency is reduced and less data is acquired; when the difference between the actual and predicted values is large, the current data changes greatly, so the acquisition frequency is increased and more data is acquired, improving acquisition accuracy.
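The predict-then-compare idea can be sketched as follows. ACFAS_par itself is not specified in this text, so the sketch swaps in a plain least-squares straight-line fit as a stand-in predictor; the function names `predict_next` and `mean_ratio` and the window and horizon sizes are hypothetical.

```python
import numpy as np


def predict_next(history: np.ndarray, horizon: int) -> np.ndarray:
    """Fit a straight line to the history and extrapolate `horizon` further points."""
    steps = np.arange(len(history))
    # For deg=1, polyfit returns the coefficients as (slope, intercept).
    slope, intercept = np.polyfit(steps, history, deg=1)
    future = np.arange(len(history), len(history) + horizon)
    return slope * future + intercept


def mean_ratio(predicted: np.ndarray, actual: np.ndarray) -> float:
    """Mean of the predicted values divided by the mean of the actual values."""
    return float(np.mean(predicted) / np.mean(actual))


history = np.array([1.0, 2.0, 3.0, 4.0])
predicted = predict_next(history, horizon=2)

# Actual values that follow the trend give a ratio near 1 (little change);
# actual values that jump away from the trend push the ratio away from 1.
steady_ratio = mean_ratio(predicted, np.array([5.0, 6.0]))
jump_ratio = mean_ratio(predicted, np.array([10.0, 12.0]))
```

A ratio is used here rather than a raw difference because the later steps compare it against thresholds around 1; this assumes the mean of the actual values is not close to zero.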
The multidimensional data self-adaptive acquisition specifically comprises the following steps:
(1) The target of data acquisition is $Y_i = (y_1, y_2, y_3, y_4)$, which is four-dimensional data. Suppose $y_1$ is the memory occupation, $y_2$ is the CPU occupation, $y_3$ is the battery capacity and $y_4$ is the CPU temperature, and $i = t+1, t+2, t+3, \ldots, t+N_r$, where $t$ is a certain acquisition time point. The predicted value of the data is defined as
Figure GDA0001973585630000091
and $Y_i$ is the actual value. $N_r$ is the actual number of acquisitions, $N_p$ is the number of predicted acquisitions, and $N_p = N_r$.
(2) The multidimensional data is reduced to a one-dimensional principal component, which serves as the reference data; here the four-dimensional data is reduced to the one-dimensional principal component mainly using the PCA algorithm:
$y_i = \mathrm{PCA}(Y_i)$
Figure GDA0001973585630000092
(3) calculating the mean value of the actual value and the predicted value of the target data based on the one-dimensional principal components as follows:
Figure GDA0001973585630000093
Figure GDA0001973585630000094
(4) The ratio of the mean predicted value to the mean actual value is denoted $R_M$. When $R_M$ is approximately 1, the data has not changed substantially; when $R_M$ is significantly greater than or less than 1, the target data value has changed greatly.
Figure GDA0001973585630000101
In theory, the data change ratio can also be calculated from the variance, denoted $R_D$. The calculation is as follows:
Figure GDA0001973585630000102
Figure GDA0001973585630000103
Figure GDA0001973585630000104
(5) based on the change ratio of the data, the specific adjustment method of the data acquisition process is as follows:
Figure GDA0001973585630000105
Figure GDA0001973585630000106
where $T_i$ represents the current sampling interval and $T_{i-1}$ the previous sampling interval. $T_{inc}$ is the amount by which the sampling interval is increased; $T_{dec}$ is the amount by which it is reduced. $Thr_u$ and $Thr_l$ are the upper and lower thresholds for judging data change. When $R_M$ is greater than $Thr_l$ and less than $Thr_u$, the data change is small and the acquisition interval should be increased by $T_{inc}$; when $R_M$ is greater than $Thr_u$ or less than $Thr_l$, the data change is large and the acquisition interval should be reduced by $T_{dec}$. $T_{max}$ and $T_{min}$ are the maximum and minimum values of the data acquisition interval: the adjusted interval must not exceed $T_{max}$ or fall below $T_{min}$, which serves as a constraint on the adjustment.
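The adjustment rule described in step (5) can be written as a small function. This is a sketch of the rule as stated in the text; the function name `next_interval`, the parameter names and the numeric values in the usage lines are hypothetical, and a ratio exactly equal to a threshold is treated here as a large change, which the text does not specify.

```python
def next_interval(prev_interval: float, ratio: float,
                  thr_lower: float, thr_upper: float,
                  t_inc: float, t_dec: float,
                  t_min: float, t_max: float) -> float:
    """Return the new sampling interval given the previous one and the change ratio."""
    if thr_lower < ratio < thr_upper:
        # Small change: sample less often.
        candidate = prev_interval + t_inc
    else:
        # Large change: sample more often.
        candidate = prev_interval - t_dec
    # Keep the interval inside [t_min, t_max].
    return min(t_max, max(t_min, candidate))


# Illustrative settings only: thresholds 0.9 and 1.1, intervals in seconds.
calm = next_interval(10.0, 1.0, 0.9, 1.1, t_inc=2.0, t_dec=5.0, t_min=1.0, t_max=60.0)
busy = next_interval(10.0, 1.5, 0.9, 1.1, t_inc=2.0, t_dec=5.0, t_min=1.0, t_max=60.0)
floor = next_interval(3.0, 0.5, 0.9, 1.1, t_inc=2.0, t_dec=5.0, t_min=1.0, t_max=60.0)
capped = next_interval(59.0, 1.0, 0.9, 1.1, t_inc=2.0, t_dec=5.0, t_min=1.0, t_max=60.0)
```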
The implementation pseudo code of the multidimensional data self-adaptive acquisition method provided by the invention is as follows:
Figure GDA0001973585630000107
Figure GDA0001973585630000111
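The pseudo code figures are not reproduced in this text. The following is a minimal end-to-end sketch of how the described steps could fit together; it is not the inventors' pseudo code. Assumptions made for illustration: the function name `adaptive_collect` and all default parameter values are hypothetical, the principal axis is fitted once on a warm-up window, raw samples are projected without centering so the reference stays away from zero, a straight-line fit stands in for the one-dimensional predictor, and a single predicted point is compared with a single actual point ($N_p = N_r = 1$).

```python
import numpy as np


def adaptive_collect(stream, warmup=20, window=5, thr_lower=0.95, thr_upper=1.05,
                     t_inc=1, t_dec=2, t_min=1, t_max=10):
    """Return the row indices of `stream` that an adaptive collector would sample.

    `stream` is an (N, n) array standing in for a live n-dimensional data source.
    """
    stream = np.asarray(stream, dtype=float)

    # Warm-up: sample every row, then fit the first principal axis once.
    warm = stream[:warmup]
    _, _, vt = np.linalg.svd(warm - warm.mean(axis=0), full_matrices=False)
    axis = vt[0] if vt[0].sum() >= 0 else -vt[0]

    sampled = list(range(warmup))
    interval, pos = t_min, warmup - 1
    while pos + interval < len(stream):
        pos += interval
        sampled.append(pos)  # all n dimensions are collected at this time point

        # One-dimensional reference values for the most recent samples.
        recent = np.array(sampled[-(window + 1):])
        reference = stream[recent] @ axis

        # Predict the newest reference value from the preceding window.
        x = recent - recent[0]
        slope, intercept = np.polyfit(x[:-1], reference[:-1], deg=1)
        predicted = slope * x[-1] + intercept
        ratio = predicted / reference[-1]

        # Widen the interval when the prediction holds, shrink it otherwise.
        if thr_lower < ratio < thr_upper:
            interval += t_inc
        else:
            interval -= t_dec
        interval = min(t_max, max(t_min, interval))
    return sampled


# Synthetic inputs for illustration only (not measurements from the experiment).
k = np.arange(200)
steady = np.column_stack([100 + k, 50 + 2 * k])            # smooth linear trend
erratic = np.array([[100, 50] if j % 2 == 0 else [200, 100] for j in range(60)])

steady_idx = adaptive_collect(steady)
erratic_idx = adaptive_collect(erratic)
```

On the smooth synthetic trend the interval grows to `t_max`, so far fewer rows are sampled than exist; on the alternating series every prediction misses and the interval stays at `t_min`.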
The above description covers only preferred embodiments of the present invention and is not to be construed as limiting it; any modifications, equivalents and improvements made within the spirit and principle of the present invention are intended to be included within its scope.

Claims (5)

1. A social network recommendation control system for implementing a multidimensional data self-adaptive acquisition method, characterized in that the multidimensional data self-adaptive acquisition method uses a one-dimensional data self-adaptive acquisition algorithm in combination with a dimension reduction technique to adaptively adjust multidimensional data acquisition and thereby acquire multidimensional data; the multidimensional data self-adaptive acquisition method mainly comprises the following processes:
step one, applying a dimension reduction technique: reducing the multidimensional data of the acquisition target to one dimension;
step two, using the one-dimensional principal component obtained by reducing the dimension of the original multidimensional data as reference data for judging data change, and adjusting the acquisition frequency of the multidimensional data accordingly;
thereby realizing the self-adaptive acquisition of multidimensional data in current big data scenarios;
the multidimensional data self-adaptive acquisition method specifically comprises the following steps:
(1) the target of data acquisition is Y_i = (y_1, y_2, y_3, ..., y_n), which is multidimensional data; y_j (j = 1, 2, 3, ..., n) is each dimension of the target data, where i = t+1, t+2, t+3, ..., t+N_r and t is a certain acquisition time point; the predicted value of the data is defined as
Figure FDA0003144108190000011
Y_i is the actual value, N_r is the number of actual acquisitions, N_p is the number of predicted acquisitions, and N_p = N_r;
(2) Reducing the dimension of the multidimensional data to a one-dimensional principal component as reference data, wherein a PCA algorithm is mainly utilized:
y_i = PCA(Y_i)
Figure FDA0003144108190000014
(3) calculating the mean value of the actual value and the predicted value of the target data based on the one-dimensional principal components as follows:
Figure FDA0003144108190000012
Figure FDA0003144108190000013
(4) the ratio of the mean predicted value to the mean actual value is denoted R_M: when R_M is approximately 1, the data has not changed substantially; when R_M is significantly greater than or less than 1, the target data value has changed greatly;
Figure FDA0003144108190000021
in theory, the data change ratio can also be calculated from the variance, denoted R_D; the calculation process is as follows:
Figure FDA0003144108190000022
Figure FDA0003144108190000023
Figure FDA0003144108190000024
(5) based on the change ratio of the data, the specific adjustment method of the data acquisition process is as follows:
Figure FDA0003144108190000025
s.t.
Figure FDA0003144108190000026
where T_i represents the current sampling interval and T_{i-1} represents the previous sampling interval; T_inc represents the increment of the sampling interval, and T_dec represents the decrement of the sampling interval; Thr_u and Thr_l are the upper and lower thresholds for judging data change; when R_M is greater than Thr_l and less than Thr_u, the data change is small and the acquisition interval should be increased by T_inc; when R_M is greater than Thr_u or less than Thr_l, the data change is large and the acquisition interval should be reduced by T_dec; T_max and T_min represent the maximum and minimum values of the data acquisition interval, that is, the adjusted interval must not exceed T_max and must not fall below T_min, which serves as the constraint condition for data acquisition adjustment;
each parameter in the above formulas should be determined in the specific calculation process based on the statistical distribution characteristics of the target data collected in the service.
2. An intrusion detection system applying the multidimensional data self-adaptive acquisition method of the social network recommendation control system according to claim 1.
3. A service quality evaluation system applying the multidimensional data self-adaptive acquisition method of the social network recommendation control system according to claim 1.
4. A trust management system applying the multidimensional data self-adaptive acquisition method of the social network recommendation control system according to claim 1.
5. Any business application system applying the multidimensional data self-adaptive acquisition method of the social network recommendation control system according to claim 1.
CN201811345413.8A 2018-11-13 2018-11-13 Multi-dimensional data self-adaptive acquisition method and system Active CN109685101B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811345413.8A CN109685101B (en) 2018-11-13 2018-11-13 Multi-dimensional data self-adaptive acquisition method and system

Publications (2)

Publication Number Publication Date
CN109685101A CN109685101A (en) 2019-04-26
CN109685101B CN109685101B (en) 2021-09-28

Family

ID=66185393

Country Status (1)

Country Link
CN (1) CN109685101B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112507208B (en) * 2020-11-02 2021-07-20 北京迅达云成科技有限公司 Network data acquisition system based on big data

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2010057119A3 (en) * 2008-11-14 2011-11-24 Neurovigil, Inc. Methods of identifying sleep and waking patterns and uses
US8385662B1 (en) * 2009-04-30 2013-02-26 Google Inc. Principal component analysis based seed generation for clustering analysis
CN103970092A (en) * 2014-04-13 2014-08-06 北京工业大学 Multi-stage fermentation process fault monitoring method based on self-adaption FCM algorithm
CN104809476A (en) * 2015-05-12 2015-07-29 西安电子科技大学 Multi-target evolutionary fuzzy rule classification method based on decomposition
CN105678270A (en) * 2016-01-12 2016-06-15 哈尔滨工程大学 Screening method for stress wave signal characteristics of one-dimensional member
CN106250819A (en) * 2016-07-20 2016-12-21 上海交通大学 Based on face's real-time monitor and detection facial symmetry and abnormal method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
A Novel Two-Dimension Adaptive Data Collection Method for Network Management;Zhihui Ji 等;《2009 WRI International Conference on Communications and Mobile Computing》;IEEE;20090108;237-241 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant