CN109035078A - A housing listing aggregation method based on multi-dimensional information similarity calculation
- Publication number
- CN109035078A (application number CN201811009790.4A)
- Authority
- CN
- China
- Prior art keywords
- houses
- source
- polymerization
- platform
- matrix
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/10—Services
- G06Q50/16—Real estate
Landscapes
- Business, Economics & Management (AREA)
- Tourism & Hospitality (AREA)
- Health & Medical Sciences (AREA)
- Economics (AREA)
- General Health & Medical Sciences (AREA)
- Human Resources & Organizations (AREA)
- Marketing (AREA)
- Primary Health Care (AREA)
- Strategic Management (AREA)
- Physics & Mathematics (AREA)
- General Business, Economics & Management (AREA)
- General Physics & Mathematics (AREA)
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The present invention relates to a housing listing aggregation method based on multi-dimensional information similarity calculation, comprising the following steps: step (1), crawl the listings of each platform or brokerage, clean the listing information, and filter out listings with severely missing information as well as suspected fake or duplicate listings; step (2), identify whether listings from multiple platforms belong to the same property; step (3), select a benchmark listing for the aggregated multi-platform listings; step (4), check the accuracy and coverage of the listing aggregation; step (5), record each listing's history of going on the market, price increases, and price reductions on each platform in a history table, and display the listing's life cycle across all platforms on the network. The advantages of the invention are as follows: it merges data from the whole network, so listing information is more comprehensive; it brings historical data together, so that a user can understand, both across platforms and over time, the states and price trend of a listing from its first appearance on the network to the present; and a user can contact the brokerages and agents of interest with one click, which greatly improves house-hunting efficiency.
Description
Technical field
The present invention relates to a housing listing aggregation method based on multi-dimensional information similarity calculation.
Background art
At present, the official website of each brokerage shows only its own listings, and some platforms merely crawl listings from other platforms or brokerages and display them unchanged. Historical prices and status management for a property are essentially absent on every platform, because brokerages do not want users to know whether the price of a property has risen or fallen.
With the information that platforms and brokerages currently provide, a buyer who wants to know at how many brokerages a property is listed has to visit multiple brokerages or platforms and compare them, and the user does not even know how many brokerages in a given city may carry this property.
Summary of the invention
To overcome the shortcomings of the prior art, the present invention provides a housing listing aggregation method based on multi-dimensional information similarity calculation. The technical solution of the invention is as follows:
A housing listing aggregation method based on multi-dimensional information similarity calculation, comprising the following steps:
Step (1): crawl the listings of each platform or brokerage, clean the listing information, and filter out listings with severely missing information as well as suspected fake or duplicate listings;
Step (2): identify whether listings from multiple platforms belong to the same property;
Step (3): select a benchmark listing for the aggregated multi-platform listings;
Step (4): check the accuracy and coverage of the listing aggregation;
Step (5): record each listing's history of going on the market, price increases, and price reductions on each platform in a history table, and display the listing's life cycle across all platforms on the network.
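The cleaning in step (1) can be sketched as follows. This is only an illustration under stated assumptions: the field names in `REQUIRED`, the `max_missing` threshold, and the exact-match duplicate rule are hypothetical, since the source does not say how "severely missing" or "duplicate" is decided.

```python
# Hypothetical required fields; the source does not enumerate them.
REQUIRED = ("community_id", "area", "price", "rooms", "floor", "total_floors")


def missing_count(listing: dict) -> int:
    """Count required fields that are absent or empty."""
    return sum(1 for f in REQUIRED if listing.get(f) in (None, ""))


def clean(listings: list[dict], max_missing: int = 1) -> list[dict]:
    """Drop listings with too many missing fields and exact repeats.

    Assumption: two listings on the same platform with identical required
    fields are treated as duplicates.
    """
    seen = set()
    kept = []
    for item in listings:
        if missing_count(item) > max_missing:
            continue  # information is severely missing
        key = (item.get("platform"),) + tuple(item.get(f) for f in REQUIRED)
        if key in seen:
            continue  # repeated listing on the same platform
        seen.add(key)
        kept.append(item)
    return kept
```

Detecting suspected fake listings would need extra signals (for example, implausible prices), which the source does not describe and this sketch does not attempt.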
Step (2) is specifically as follows:
Entry criteria: when a new listing arrives, first find all properties in the database that are in the same residential community and have the same total number of floors, the same room count, and the same floor range. For listings that meet these entry criteria, a similarity weight is calculated; if the weight calculation satisfies the condition and fewer than 2 key value attributes differ, the listings are judged to be the same property and are aggregated;
Here, the name of the same residential community may differ between channels, but the community ID is identical, so comparing community IDs shows whether two listings belong to the same community. The mapping between the community names used by different channels and the community ID is established by combining the community's geographic location with name similarity;
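The matching rule of step (2) can be illustrated with a short sketch. The attribute names in `ENTRY_KEYS` and `KEY_ATTRS`, the weights, and the `threshold` are all assumptions made for illustration; the source only states that the entry criteria must hold, that the weight calculation must satisfy a condition, and that fewer than 2 key attributes may differ.

```python
# Hypothetical attribute names; the source names the entry criteria but
# does not list the key attributes or the weights.
ENTRY_KEYS = ("community_id", "total_floors", "rooms", "floor_range")
KEY_ATTRS = ("area", "orientation", "decoration", "building_no")


def meets_entry_criteria(a: dict, b: dict) -> bool:
    """Same community, total floors, room count, and floor range."""
    return all(a.get(k) == b.get(k) for k in ENTRY_KEYS)


def weighted_similarity(a: dict, b: dict, weights: dict) -> float:
    """Share of the total weight carried by attributes that agree."""
    total = sum(weights.values())
    agreed = sum(w for k, w in weights.items() if a.get(k) == b.get(k))
    return agreed / total


def is_same_property(a: dict, b: dict, weights: dict, threshold: float = 0.5) -> bool:
    """Aggregate only if entry criteria hold, the similarity passes the
    (assumed) threshold, and fewer than 2 key attributes differ."""
    if not meets_entry_criteria(a, b):
        return False
    mismatches = sum(1 for k in KEY_ATTRS if a.get(k) != b.get(k))
    return weighted_similarity(a, b, weights) >= threshold and mismatches < 2
```

Exact equality on `area` is a simplification; a real system would probably compare numeric attributes within a tolerance.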
Step (3) is specifically as follows: when listings of a property from two channels are aggregated, the listing from the channel with the higher priority serves as the benchmark; when a listing from a third channel satisfies the aggregation condition with these two listings and the third channel has a higher priority, the benchmark is changed to the third channel's listing. The specific aggregation procedure is:
The multi-dimensional features of a listing (area, price, room count, floor) are abstracted into a feature vector. Input: the sample set $D = (x_1, x_2, \ldots, x_n)$, the method for generating the similarity matrix, the dimension $k_1$ after dimensionality reduction, and the dimension $k_2$ after clustering. Output: the cluster partition $C(c_1, c_2, \ldots, c_{k_2})$;
1) Construct the similarity matrix $S$ of the samples according to the input method for generating the similarity matrix;
2) Construct the adjacency matrix $W$ from the similarity matrix $S$, and construct the degree matrix $D$;
3) Compute the Laplacian matrix $L$;
4) Construct the normalized Laplacian matrix $D^{-1/2} L D^{-1/2}$;
5) Compute the eigenvectors $f$ corresponding to the $k_1$ smallest eigenvalues of $D^{-1/2} L D^{-1/2}$;
6) Normalize by rows the matrix formed by the eigenvectors $f$, yielding the $n \times k_1$ feature matrix $F$;
7) Treat each row of $F$ as a $k_1$-dimensional sample, $n$ samples in total, and cluster them with the input clustering method into $k_2$ clusters;
8) Obtain the cluster partition $C(c_1, c_2, \ldots, c_{k_2})$. With this algorithm, identical listings from different channels are aggregated.
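Sub-steps 1) to 8) describe normalized spectral clustering, and can be sketched with NumPy as below. Several choices are assumptions not stated in the source: a Gaussian (RBF) kernel as the similarity generator, $L = D - W$ as the unnormalized Laplacian, and a small k-means with farthest-point initialization as the clustering method. Features on very different scales (price versus floor) would also need standardizing first, which is omitted here.

```python
import numpy as np


def rbf_similarity(X, gamma=1.0):
    """Step 1 (assumed kernel): S[i, j] = exp(-gamma * ||x_i - x_j||^2)."""
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    return np.exp(-gamma * sq)


def kmeans(F, k, iters=50):
    """Small k-means with deterministic farthest-point initialization."""
    centers = [F[0]]
    for _ in range(1, k):
        d = np.min([np.sum((F - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(F[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(iters):
        dist = np.sum((F[:, None, :] - centers[None, :, :]) ** 2, axis=-1)
        labels = np.argmin(dist, axis=1)
        new = np.array([F[labels == j].mean(axis=0) if np.any(labels == j)
                        else centers[j] for j in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    return labels


def spectral_clustering(X, k1, k2, gamma=1.0):
    S = rbf_similarity(X, gamma)                    # 1) similarity matrix S
    W = S - np.diag(np.diag(S))                     # 2) adjacency W (no self-loops)
    deg = np.maximum(W.sum(axis=1), 1e-12)          #    diagonal of degree matrix D
    L = np.diag(deg) - W                            # 3) Laplacian L = D - W
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    L_norm = d_inv_sqrt[:, None] * L * d_inv_sqrt[None, :]  # 4) D^-1/2 L D^-1/2
    _, vecs = np.linalg.eigh(L_norm)                # eigenvalues in ascending order
    F = vecs[:, :k1]                                # 5) k1 smallest eigenvectors
    norms = np.maximum(np.linalg.norm(F, axis=1, keepdims=True), 1e-12)
    F = F / norms                                   # 6) row-normalize to n x k1
    return kmeans(F, k2)                            # 7)-8) cluster rows into k2 groups
```

Listings that receive the same label would then be treated as candidates for the same property.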
Step (4) is specifically as follows: aggregated listings are spot-checked to judge whether they really are the same property; if not, the wrongly merged listings are adjusted;
The coverage of unaggregated listings is checked: listings that are suspected of needing aggregation are screened out and a decision is made on whether to aggregate them; if so, the unaggregated listings are adjusted.
The advantages of the invention are as follows: it merges data from the whole network, so listing information is more comprehensive; it brings historical data together, so that a user can understand, both across platforms and over time, the states and price trend of a listing from its first appearance on the network to the present; and a user can contact the brokerages and agents of interest with one click, which greatly improves house-hunting efficiency.
Specific embodiment
The invention is further described below with reference to specific embodiments, from which its advantages and features will become clearer. These embodiments are merely exemplary and do not limit the scope of the invention in any way. Those skilled in the art should understand that the details and form of the technical solution may be modified or substituted without departing from the spirit and scope of the invention, and that such modifications and substitutions all fall within the scope of protection of the invention.
The present invention relates to a housing listing aggregation method based on multi-dimensional information similarity calculation, comprising the following steps:
Step (1): crawl the listings of each platform or brokerage, clean the listing information, and filter out listings with severely missing information as well as suspected fake or duplicate listings;
Step (2): identify whether listings from multiple platforms belong to the same property;
Step (3): select a benchmark listing for the aggregated multi-platform listings;
Step (4): check the accuracy and coverage of the listing aggregation;
Step (5): record each listing's history of going on the market, price increases, and price reductions on each platform in a history table, and display the listing's life cycle across all platforms on the network.
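The history table of step (5) could look like the following sketch, which uses Python's built-in `sqlite3` module. The table name `listing_history`, its columns, and the event labels `listed`, `price_up`, and `price_down` are hypothetical; the source only says that listing, price-increase, and price-reduction events per platform are recorded.

```python
import sqlite3


def open_history(path=":memory:"):
    """Create (if needed) the hypothetical history table."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS listing_history (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               property_id TEXT, platform TEXT,
               event TEXT, price REAL, event_date TEXT)"""
    )
    return conn


def record_price(conn, property_id, platform, price, event_date):
    """Compare with the last known price on this platform and store an event."""
    row = conn.execute(
        "SELECT price FROM listing_history "
        "WHERE property_id = ? AND platform = ? ORDER BY id DESC LIMIT 1",
        (property_id, platform),
    ).fetchone()
    if row is None:
        event = "listed"          # first time seen on this platform
    elif price > row[0]:
        event = "price_up"
    elif price < row[0]:
        event = "price_down"
    else:
        return None               # unchanged price: nothing to record
    conn.execute(
        "INSERT INTO listing_history "
        "(property_id, platform, event, price, event_date) VALUES (?, ?, ?, ?, ?)",
        (property_id, platform, event, price, event_date),
    )
    conn.commit()
    return event


def lifecycle(conn, property_id):
    """All recorded events for one property across platforms, oldest first."""
    return conn.execute(
        "SELECT platform, event, price, event_date FROM listing_history "
        "WHERE property_id = ? ORDER BY event_date, id",
        (property_id,),
    ).fetchall()
```

Calling `lifecycle` for an aggregated property returns its events on every platform in date order, which is the cross-platform life cycle the step describes.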
Step (2) is specifically as follows:
Entry criteria: when a new listing arrives, first find all properties in the database that are in the same residential community and have the same total number of floors, the same room count, and the same floor range. For listings that meet these entry criteria, a similarity weight is calculated; if the weight calculation satisfies the condition and fewer than 2 key value attributes differ, the listings are judged to be the same property and are aggregated;
Here, the name of the same residential community may differ between channels, but the community ID is identical, so comparing community IDs shows whether two listings belong to the same community. The mapping between the community names used by different channels and the community ID is established by combining the community's geographic location with name similarity;
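One way to build the name-to-ID mapping described above is to accept a known community only if it is geographically close and its name is similar enough. The sketch below is an assumption-laden illustration: the haversine distance, the `difflib.SequenceMatcher` ratio, and both thresholds (`max_dist_m`, `min_name_sim`) are choices made here, not details given in the source.

```python
import math
from difflib import SequenceMatcher


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    r = 6371000.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def resolve_community_id(name, lat, lon, known, max_dist_m=300.0, min_name_sim=0.6):
    """Return the ID of the nearby known community with the most similar name,
    or None if no candidate passes both (assumed) thresholds."""
    best_id, best_sim = None, 0.0
    for c in known:
        if haversine_m(lat, lon, c["lat"], c["lon"]) > max_dist_m:
            continue  # too far away to be the same community
        sim = SequenceMatcher(None, name, c["name"]).ratio()
        if sim >= min_name_sim and sim > best_sim:
            best_id, best_sim = c["id"], sim
    return best_id
```

A channel-specific name that resolves to `None` would be a candidate for a new community record or for manual review.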
Step (3) is specifically as follows: when listings of a property from two channels are aggregated, the listing from the channel with the higher priority serves as the benchmark; when a listing from a third channel satisfies the aggregation condition with these two listings and the third channel has a higher priority, the benchmark is changed to the third channel's listing. The specific aggregation procedure is:
The multi-dimensional features of a listing (area, price, room count, floor) are abstracted into a feature vector. Input: the sample set $D = (x_1, x_2, \ldots, x_n)$, the method for generating the similarity matrix, the dimension $k_1$ after dimensionality reduction, and the dimension $k_2$ after clustering. Output: the cluster partition $C(c_1, c_2, \ldots, c_{k_2})$;
1) Construct the similarity matrix $S$ of the samples according to the input method for generating the similarity matrix;
2) Construct the adjacency matrix $W$ from the similarity matrix $S$, and construct the degree matrix $D$;
3) Compute the Laplacian matrix $L$;
4) Construct the normalized Laplacian matrix $D^{-1/2} L D^{-1/2}$;
5) Compute the eigenvectors $f$ corresponding to the $k_1$ smallest eigenvalues of $D^{-1/2} L D^{-1/2}$;
6) Normalize by rows the matrix formed by the eigenvectors $f$, yielding the $n \times k_1$ feature matrix $F$;
7) Treat each row of $F$ as a $k_1$-dimensional sample, $n$ samples in total, and cluster them with the input clustering method into $k_2$ clusters;
8) Obtain the cluster partition $C(c_1, c_2, \ldots, c_{k_2})$. With this algorithm, identical listings from different channels are aggregated.
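The benchmark rule at the start of step (3) can be sketched in a few lines. The channel names and the numeric values in `CHANNEL_PRIORITY` are hypothetical; the source only says that the listing from the higher-priority channel becomes the benchmark and is replaced when a higher-priority channel joins the group.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical priorities: a larger number means a more trusted channel.
CHANNEL_PRIORITY = {"channel_a": 3, "channel_b": 2, "channel_c": 1}


def priority(listing: dict) -> int:
    """Priority of the channel a listing came from (0 if unknown)."""
    return CHANNEL_PRIORITY.get(listing["channel"], 0)


@dataclass
class AggregatedProperty:
    """All listings judged to describe one property, plus the benchmark."""
    listings: list = field(default_factory=list)
    benchmark: Optional[dict] = None

    def add(self, listing: dict) -> None:
        self.listings.append(listing)
        # Switch the benchmark only when the new channel ranks strictly higher.
        if self.benchmark is None or priority(listing) > priority(self.benchmark):
            self.benchmark = listing
```

Using a strict comparison keeps the existing benchmark when two channels tie, which is one possible reading of the rule.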
Step (4) is specifically as follows: aggregated listings are spot-checked to judge whether they really are the same property; if not, the wrongly merged listings are adjusted;
The coverage of unaggregated listings is checked: listings that are suspected of needing aggregation are screened out and a decision is made on whether to aggregate them; if so, the unaggregated listings are adjusted.
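The two checks in step (4) can be supported by small helpers such as the ones below. This is a sketch under assumptions: the source does not say how samples are drawn, how accuracy is computed, or how suspected misses are screened, so random sampling, a simple correct-fraction, and grouping by a caller-supplied `block_key` function are illustrative choices.

```python
import random


def sample_groups(groups, n, seed=0):
    """Draw up to n aggregated groups at random for manual spot-checking."""
    rng = random.Random(seed)
    return rng.sample(groups, min(n, len(groups)))


def accuracy(verdicts):
    """Fraction of reviewed groups confirmed as the same property."""
    return sum(verdicts) / len(verdicts) if verdicts else 0.0


def suspected_missed(unaggregated, block_key):
    """Group unaggregated listings by a coarse key; any bucket holding more
    than one listing is a suspected missed aggregation to review."""
    buckets = {}
    for listing in unaggregated:
        buckets.setdefault(block_key(listing), []).append(listing)
    return [b for b in buckets.values() if len(b) > 1]
```

A natural `block_key` would reuse the entry criteria of step (2), for example the tuple of community ID and room count.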
Claims (4)
1. A housing listing aggregation method based on multi-dimensional information similarity calculation, characterized by comprising the following steps:
Step (1): crawl the listings of each platform or brokerage, clean the listing information, and filter out listings with severely missing information as well as suspected fake or duplicate listings;
Step (2): identify whether listings from multiple platforms belong to the same property;
Step (3): select a benchmark listing for the aggregated multi-platform listings;
Step (4): check the accuracy and coverage of the listing aggregation;
Step (5): record each listing's history of going on the market, price increases, and price reductions on each platform in a history table, and display the listing's life cycle across all platforms on the network.
2. The housing listing aggregation method based on multi-dimensional information similarity calculation according to claim 1, characterized in that step (2) is specifically as follows:
Entry criteria: when a new listing arrives, first find all properties in the database that are in the same residential community and have the same total number of floors, the same room count, and the same floor range. For listings that meet these entry criteria, a similarity weight is calculated; if the weight calculation satisfies the condition and fewer than 2 key value attributes differ, the listings are judged to be the same property and are aggregated;
Here, the name of the same residential community may differ between channels, but the community ID is identical, so comparing community IDs shows whether two listings belong to the same community. The mapping between the community names used by different channels and the community ID is established by combining the community's geographic location with name similarity.
3. The housing listing aggregation method based on multi-dimensional information similarity calculation according to claim 1, characterized in that step (3) is specifically as follows: when listings of a property from two channels are aggregated, the listing from the channel with the higher priority serves as the benchmark; when a listing from a third channel satisfies the aggregation condition with these two listings and the third channel has a higher priority, the benchmark is changed to the third channel's listing. The specific aggregation procedure is:
The multi-dimensional features of a listing (area, price, room count, floor) are abstracted into a feature vector. Input: the sample set $D = (x_1, x_2, \ldots, x_n)$, the method for generating the similarity matrix, the dimension $k_1$ after dimensionality reduction, and the dimension $k_2$ after clustering. Output: the cluster partition $C(c_1, c_2, \ldots, c_{k_2})$;
1) Construct the similarity matrix $S$ of the samples according to the input method for generating the similarity matrix;
2) Construct the adjacency matrix $W$ from the similarity matrix $S$, and construct the degree matrix $D$;
3) Compute the Laplacian matrix $L$;
4) Construct the normalized Laplacian matrix $D^{-1/2} L D^{-1/2}$;
5) Compute the eigenvectors $f$ corresponding to the $k_1$ smallest eigenvalues of $D^{-1/2} L D^{-1/2}$;
6) Normalize by rows the matrix formed by the eigenvectors $f$, yielding the $n \times k_1$ feature matrix $F$;
7) Treat each row of $F$ as a $k_1$-dimensional sample, $n$ samples in total, and cluster them with the input clustering method into $k_2$ clusters;
8) Obtain the cluster partition $C(c_1, c_2, \ldots, c_{k_2})$. With this algorithm, identical listings from different channels are aggregated.
4. The housing listing aggregation method based on multi-dimensional information similarity calculation according to claim 1, characterized in that step (4) is specifically as follows: aggregated listings are spot-checked to judge whether they really are the same property; if not, the wrongly merged listings are adjusted;
The coverage of unaggregated listings is checked: listings that are suspected of needing aggregation are screened out and a decision is made on whether to aggregate them; if so, the unaggregated listings are adjusted.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811009790.4A CN109035078A (en) | 2018-08-31 | 2018-08-31 | A housing listing aggregation method based on multi-dimensional information similarity calculation |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811009790.4A CN109035078A (en) | 2018-08-31 | 2018-08-31 | A housing listing aggregation method based on multi-dimensional information similarity calculation |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109035078A (en) | 2018-12-18 |
Family
ID=64622929
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811009790.4A (Pending) CN109035078A (en) | 2018-08-31 | 2018-08-31 | A housing listing aggregation method based on multi-dimensional information similarity calculation |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109035078A (en) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109977287A (en) * | 2019-03-28 | 2019-07-05 | 国家计算机网络与信息安全管理中心 | A kind of house property data identity method of discrimination of different aforementioned sources |
CN110096634A (en) * | 2019-04-29 | 2019-08-06 | 成都理工大学 | A kind of house property data vector alignment schemes based on particle group optimizing |
CN110618982A (en) * | 2018-12-26 | 2019-12-27 | 北京时光荏苒科技有限公司 | Multi-source heterogeneous data processing method, device, medium and electronic equipment |
CN110633726A (en) * | 2018-12-25 | 2019-12-31 | 北京时光荏苒科技有限公司 | Room source identification method and device, storage medium and electronic equipment |
CN111260445A (en) * | 2020-01-20 | 2020-06-09 | 北京无限光场科技有限公司 | House resource information display method, device, terminal and storage medium |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20140337347A1 (en) * | 2013-04-19 | 2014-11-13 | Tencent Technology (Shenzhen) Company Limited | Cluster method and apparatus based on user interest |
CN104281967A (en) * | 2013-07-10 | 2015-01-14 | 永庆房屋仲介股份有限公司 | Object renting and selling system with object price fluctuation as display basis |
CN107908677A (en) * | 2017-10-27 | 2018-04-13 | 链家网(北京)科技有限公司 | Cell source of houses methods of exhibiting and device based on intelligent terminal |
CN108197312A (en) * | 2018-01-31 | 2018-06-22 | 平安好房(上海)电子商务有限公司 | Obtain source of houses data method, device, equipment and readable storage medium storing program for executing |
CN108197311A (en) * | 2018-01-31 | 2018-06-22 | 平安好房(上海)电子商务有限公司 | Source of houses data aggregate methods of exhibiting, device, equipment and readable storage medium storing program for executing |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110633726A (en) * | 2018-12-25 | 2019-12-31 | 北京时光荏苒科技有限公司 | Room source identification method and device, storage medium and electronic equipment |
CN110618982A (en) * | 2018-12-26 | 2019-12-27 | 北京时光荏苒科技有限公司 | Multi-source heterogeneous data processing method, device, medium and electronic equipment |
CN110618982B (en) * | 2018-12-26 | 2022-09-30 | 北京时光荏苒科技有限公司 | Multi-source heterogeneous data processing method, device, medium and electronic equipment |
CN109977287A (en) * | 2019-03-28 | 2019-07-05 | 国家计算机网络与信息安全管理中心 | A kind of house property data identity method of discrimination of different aforementioned sources |
CN110096634A (en) * | 2019-04-29 | 2019-08-06 | 成都理工大学 | A kind of house property data vector alignment schemes based on particle group optimizing |
CN110096634B (en) * | 2019-04-29 | 2023-02-24 | 成都理工大学 | House property data vector alignment method based on particle swarm optimization |
CN111260445A (en) * | 2020-01-20 | 2020-06-09 | 北京无限光场科技有限公司 | House resource information display method, device, terminal and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109035078A (en) | A housing listing aggregation method based on multi-dimensional information similarity calculation | |
Bauman et al. | Optimizing the choice of a spatial weighting matrix in eigenvector‐based methods | |
CN109873501B (en) | Automatic identification method for low-voltage distribution network topology | |
US9287713B2 (en) | Topology identification in distribution network with limited measurements | |
CN110110881B (en) | Power customer demand prediction analysis method and system | |
Bornmann | How to analyze percentile citation impact data meaningfully in bibliometrics: The statistical analysis of distributions, percentile rank classes, and top‐cited papers | |
CN106952159B (en) | Real estate collateral risk control method, system and storage medium | |
Militino et al. | Alternative models for describing spatial dependence among dwelling selling prices | |
US20140058705A1 (en) | System and Method for Detecting Abnormal Occurrences | |
Mohammadian et al. | Data-driven classifier for extreme outage prediction based on Bayes decision theory | |
Micevski et al. | Regionalisation of the parameters of the log‐Pearson 3 distribution: A case study for New South Wales, Australia | |
CN106026092A (en) | Island dividing method for power distribution network comprising distributed power supply | |
CN103559426A (en) | Protein functional module excavating method for multi-view data fusion | |
CN103581982B (en) | A kind of detection method of traffic hotspots, determine method, localization method and device | |
CN106326923A (en) | Sign-in position data clustering method in consideration of position repetition and density peak point | |
CN104735710A (en) | Mobile network performance early warning pre-judging method based on trend extrapolation clustering | |
US10557720B2 (en) | Unauthorized electrical grid connection detection and characterization system and method | |
CN117200217A (en) | Power system scheduling method based on load classification | |
Steinley et al. | A note on the expected value of the Rand index | |
US20050131873A1 (en) | System and method for adaptive pruning | |
Xia | Improve the resilience of multilayer supply chain networks | |
Kabir et al. | Power outage prediction using data streams: An adaptive ensemble learning approach with a feature‐and performance‐based weighting mechanism | |
CN109656904B (en) | Case risk detection method and system | |
Nabian et al. | Uncertainty quantification and pca-based model reduction for parallel monte carlo analysis of infrastructure system reliability | |
CN112598041B (en) | Power distribution network cloud platform data verification method based on K-MEANS algorithm |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20181218 |