CN103778155A - A data dimensionality reduction method - Google Patents

A data dimensionality reduction method

Info

Publication number
CN103778155A
Authority
CN
China
Prior art keywords
data
sample point
dimension reduction
weight matrix
reduction method
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201210410485.2A
Other languages
English (en)
Inventor
李兰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual
Priority to CN201210410485.2A
Publication of CN103778155A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G06F16/5838 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using colour
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Library & Information Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Image Analysis (AREA)

Abstract

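The invention discloses a data dimensionality reduction method comprising: (a) finding the K nearest neighbors of each sample point; (b) computing the local reconstruction weight matrix of each sample point from its neighbors; (c) computing the output value of each sample point from its local reconstruction weight matrix and its neighbors. The invention allows the reduced-dimension data to retain its original topological structure and can be widely applied to dimensionality reduction of nonlinear data, clustering, image segmentation, and similar fields.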

Description

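A data dimensionality reduction method
Technical Field
The invention relates to a data dimensionality reduction method.
Background
Data is the carrier of information, so analysis naturally focuses on the main information the data contains and on its main features; in other words, on the numerical characteristics of the data. Studying data is the collective term for a series of activities: collecting, classifying, entering, storing, statistically analyzing, and statistically testing data.
Data are physical symbols, arranged and combined according to certain rules, that carry or record information. They may be numbers, text, or images, or computer code. Receiving information begins with receiving data, and information can be obtained only by interpreting the data's context. The data context is the receiver's informational preparation for specific data: once the receiver understands the regularities of the sequence of physical symbols and knows the referent or meaning of each symbol and symbol combination, the information carried by a set of data can be obtained.
Summary of the Invention
The purpose of the invention is to overcome the shortcomings and defects of the prior art by providing a data dimensionality reduction method that can be widely applied to dimensionality reduction of nonlinear data, clustering, image segmentation, and similar fields, and that allows the reduced-dimension data to retain its original topological structure.
The purpose of the invention is achieved by the following technical solution: a data dimensionality reduction method comprising the following steps:
(a) designating the K sample points closest in distance to the sample point under consideration as that sample point's K nearest neighbors, where K is a value given in advance;
(b) computing the local reconstruction weight matrix of each sample point from its neighbors;
(c) using the reconstruction weight matrix of the high-dimensional input vectors to compute the low-dimensional embedding coordinates of the high-dimensional data, such that all output data preserve the original topological structure.
In summary, the beneficial effects of the invention are that it can be widely applied to dimensionality reduction of nonlinear data, clustering, image segmentation, and similar fields, and that it allows the reduced-dimension data to retain its original topological structure.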
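The text gives no formulas for steps (b) and (c). One possible reading, assuming the steps follow the standard locally linear embedding (LLE) formulation that they closely resemble, is sketched below. All symbols are notation introduced here and do not appear in the patent: x_i is the i-th high-dimensional sample, N(i) its set of K nearest neighbors, w_{ij} the reconstruction weights, y_i the d-dimensional output, and n the number of samples.

```latex
\begin{aligned}
\text{(b)}\quad & \min_{W}\ \sum_{i=1}^{n} \Bigl\| x_i - \sum_{j \in N(i)} w_{ij}\, x_j \Bigr\|^2
  \quad \text{s.t.}\ \sum_{j \in N(i)} w_{ij} = 1 \ \text{for every } i, \\
\text{(c)}\quad & \min_{Y}\ \sum_{i=1}^{n} \Bigl\| y_i - \sum_{j \in N(i)} w_{ij}\, y_j \Bigr\|^2
  \quad \text{s.t.}\ \sum_{i=1}^{n} y_i = 0,\quad \sum_{i=1}^{n} y_i y_i^{\top} = I_d .
\end{aligned}
```

Under this reading, the weights found in (b) are held fixed in (c), which is how the local neighborhood relations of the input would carry over to the low-dimensional output.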
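Detailed Description
The invention is described in further detail below with reference to an embodiment, but implementations of the invention are not limited to it.
Embodiment:
This embodiment relates to a data dimensionality reduction method comprising the following steps:
(a) finding the K nearest neighbors of each sample point;
(b) computing the local reconstruction weight matrix of each sample point from its neighbors;
(c) computing the output value of each sample point from its local reconstruction weight matrix and its neighbors.
The specific process of step (a) is: the K sample points closest in distance to the sample point under consideration are designated as that sample point's K nearest neighbors, where K is a value given in advance.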
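A minimal sketch of step (a) follows. It assumes Euclidean distance, which the text does not specify, and a brute-force search over all pairs; the function name `k_nearest_neighbors` and the sample data are hypothetical.

```python
import numpy as np

def k_nearest_neighbors(X, k):
    """Return an (n, k) array: row i holds the indices of the k samples closest to X[i]."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < number of samples")
    # Pairwise squared Euclidean distances via broadcasting, shape (n, n).
    diff = X[:, None, :] - X[None, :, :]
    dist2 = np.einsum("ijk,ijk->ij", diff, diff)
    # A sample is not counted as its own neighbor.
    np.fill_diagonal(dist2, np.inf)
    # Sort each row by distance and keep the first k indices.
    return np.argsort(dist2, axis=1)[:, :k]

# Four points on a line; K is the value "given in advance".
points = np.array([[0.0, 0.0], [1.0, 0.0], [3.0, 0.0], [7.0, 0.0]])
neighbors = k_nearest_neighbors(points, k=2)
```

The brute-force search uses memory quadratic in the number of samples; for large data sets a spatial index would likely be a better fit, but that choice is outside what the text describes.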
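The specific process of step (c) is: the reconstruction weight matrix of the high-dimensional input vectors is used to compute the low-dimensional embedding coordinates of the high-dimensional data, such that all output data preserve the original topological structure.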
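Steps (b) and (c) could be sketched as below, assuming they are carried out as in standard locally linear embedding (LLE); the text does not state the exact computation. The function names, the regularization constant `reg`, and the helix test data are all assumptions made for illustration.

```python
import numpy as np

def reconstruction_weights(X, neighbors, reg=1e-3):
    """Step (b): weights that best rebuild each sample from its K neighbors."""
    n, k = neighbors.shape
    W = np.zeros((n, n))
    for i in range(n):
        Z = X[neighbors[i]] - X[i]        # neighbors centered on sample i, shape (k, D)
        G = Z @ Z.T                       # local Gram matrix, shape (k, k)
        # Regularize: G is singular when k exceeds the input dimension D.
        trace = np.trace(G)
        G += np.eye(k) * reg * (trace if trace > 0 else 1.0)
        w = np.linalg.solve(G, np.ones(k))
        W[i, neighbors[i]] = w / w.sum()  # weights of each sample sum to one
    return W

def embed(W, d):
    """Step (c): low-dimensional coordinates that the same weights reconstruct best."""
    n = W.shape[0]
    I_minus_W = np.eye(n) - W
    M = I_minus_W.T @ I_minus_W
    # eigh returns eigenvalues of the symmetric matrix M in ascending order.
    _, vecs = np.linalg.eigh(M)
    # Skip the first eigenvector (normally the constant one) and keep the next d.
    return vecs[:, 1:d + 1]

# Hypothetical input: 30 samples along a helix in 3-D, reduced to 1-D.
t = np.linspace(0.0, 3.0, 30)
X = np.column_stack([np.cos(t), np.sin(t), t])
d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
np.fill_diagonal(d2, np.inf)
neighbors = np.argsort(d2, axis=1)[:, :5]   # step (a) with K = 5

W = reconstruction_weights(X, neighbors)
Y = embed(W, d=1)
```

Because each row of `W` sums to one, the constant vector is an eigenvector of `M` with eigenvalue zero, which is why the first eigenvector is discarded. The returned columns have unit norm, so the embedding is defined only up to sign and scale.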
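The above is merely a preferred embodiment of the invention and does not limit the invention in any form; any simple modification or equivalent change made to the above embodiment in accordance with the technical essence of the invention falls within the scope of protection of the invention.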

Claims (1)

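1. A data dimensionality reduction method, characterized in that it comprises the following steps:
(a) designating the K sample points closest in distance to the sample point under consideration as that sample point's K nearest neighbors, where K is a value given in advance;
(b) computing the local reconstruction weight matrix of each sample point from its neighbors;
(c) using the reconstruction weight matrix of the high-dimensional input vectors to compute the low-dimensional embedding coordinates of the high-dimensional data, such that all output data preserve the original topological structure.
CN201210410485.2A 2012-10-17 2012-10-17 A data dimensionality reduction method Pending CN103778155A (zh)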

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210410485.2A CN103778155A (zh) 2012-10-17 2012-10-17 A data dimensionality reduction method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201210410485.2A CN103778155A (zh) 2012-10-17 2012-10-17 A data dimensionality reduction method

Publications (1)

Publication Number Publication Date
CN103778155A (zh) 2014-05-07

Family

ID=50570397

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210410485.2A Pending CN103778155A (zh) 2012-10-17 2012-10-17 A data dimensionality reduction method

Country Status (1)

Country Link
CN (1) CN103778155A (zh)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
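CN106056068A (zh) * 2016-05-27 2016-10-26 大连楼兰科技股份有限公司 Method and system for feature transformation of vehicle low-speed collision signals
CN106547862A (zh) * 2016-10-31 2017-03-29 中原智慧城市设计研究院有限公司 Traffic big data dimensionality reduction method based on manifold learning
CN110827919A (zh) * 2019-11-05 2020-02-21 哈尔滨工业大学 A dimensionality reduction method applied to gene expression profile data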

Similar Documents

Publication Publication Date Title
CN104462184A (zh) A large-scale data anomaly identification method based on bidirectional sampling combination
WO2006020290A3 (en) Methods and systems for multi-pattern searching
CN103425639A (zh) A similar-information identification method based on information fingerprints
Tang et al. Comparison of different daily streamflow series in US and China, under a viewpoint of complex networks
CN103778155A (zh) A data dimensionality reduction method
Wang et al. Constructing slacks-based composite indicator of sustainable energy development for China: A meta-frontier nonparametric approach
CN102305792B (zh) Remote sensing estimation method for forest carbon sinks based on a nonlinear partial least squares optimization model
Yu et al. Based on quadtree fractal image compression improved algorithm for research
Wang et al. A density-based clustering structure mining algorithm for data streams
CN102708172B (zh) A method for mining outliers in RFID data
CN104361058A (zh) A hash-structure complex event detection method for massive data streams
Lai et al. Understanding China's resumption of work and production during the critical period of COVID‐19 based on multi‐source data
Eum Application of a statistical interpolation method to correct extreme values in high-resolution gridded climate variables
CN114880380A (zh) Implementation method for a power grid alarm data association and tracing system based on density clustering and self-organizing networks
CN109754159B (zh) An information extraction method and system for power grid operation logs
Zou et al. Research on privacy protection of large-scale network data aggregation process
Wen-pei et al. Study on the Development Index of Urban Atmospheric Environment Based on DPSIR Model
CN107423790A (zh) Selective storage method for transformer equipment temperature
Liu et al. Global seasonal-scale meteorological droughts. Part II: temperature anomaly-based classifications
Zhong et al. Energy Detection and Feature Extraction of Signals in SETI Radio Spectrograms
Binwal et al. Legal and Ethical Aspects of IoT Security
Zheng et al. Research of big data space-time analytics for clouding based contexts-aware IOV applications
Ye et al. Privacy preservation in a two-tiered sensor network through correlation tracking
Jeong et al. A decentralized approach to damage localization through smart wireless sensors
Bini et al. Robust transformation of proportions using the forward search

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20140507