CN112418939A - Method for mining space-time correlation of house price based on neural network to predict house price - Google Patents

Info

Publication number
CN112418939A
Authority
CN
China
Prior art keywords
data
house
price
neural network
rate
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011344364.3A
Other languages
Chinese (zh)
Inventor
李俊
刘胜强
聂俊
杨文韬
舒文杰
蓝子璇
许高武
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Advanced Technology University of Science and Technology of China
Original Assignee
Institute of Advanced Technology University of Science and Technology of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Advanced Technology University of Science and Technology of China filed Critical Institute of Advanced Technology University of Science and Technology of China
Priority to CN202011344364.3A priority Critical patent/CN112418939A/en
Publication of CN112418939A publication Critical patent/CN112418939A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G06Q30/0206 Price or cost determination based on market factors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/16 Real estate

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Strategic Management (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Development Economics (AREA)
  • General Physics & Mathematics (AREA)
  • Finance (AREA)
  • Accounting & Taxation (AREA)
  • Economics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • General Health & Medical Sciences (AREA)
  • Marketing (AREA)
  • Health & Medical Sciences (AREA)
  • General Business, Economics & Management (AREA)
  • Data Mining & Analysis (AREA)
  • Tourism & Hospitality (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computational Linguistics (AREA)
  • Game Theory and Decision Science (AREA)
  • Human Resources & Organizations (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Primary Health Care (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a method for predicting house prices by mining the spatiotemporal correlation of house prices based on a neural network, comprising the following steps: acquiring a data set, separating the data with house price labels from the data without house price labels, and dividing the data set with house price labels into a plurality of sub-datasets; inputting the sub-datasets separately into a first neural network model for training to obtain a first house price prediction model; inputting the data without house price labels into the first house price prediction model, and taking the predicted price as the house price of the data without house price labels; fusing the data without house price labels and the data with house price labels into a new data set, and inputting the new data set into a second neural network model for training to obtain a second house price prediction model; and inputting the feature information of the house to be predicted, after classification processing, into the second house price prediction model to obtain the predicted house price. The method addresses two problems in the prior art: recent data without house price labels is ignored, and the intrinsic relationship between time and house price is difficult to determine.

Description

Method for mining space-time correlation of house price based on neural network to predict house price
Technical Field
The invention relates to the technical field of house price prediction, and in particular to a method, a device and a computer-readable storage medium for predicting house prices by mining the spatiotemporal correlation of house prices based on a neural network.
Background
In recent years, with the rapid development of the real estate market, a large number of real estate transactions have taken place, and the parties to these transactions increasingly need to evaluate the market price of the property being traded. At present, traditional real estate appraisal relies mainly on an appraiser who estimates prices over a small area based on historical data and working experience. As the real estate market heats up, the manpower and financial resources consumed by this approach are increasing rapidly, and manual price estimation is not suitable for large-scale house price prediction.
Data-based house price prediction has gradually become the focus of research in this field, but existing approaches have the following technical problems. First, only recently traded house price data is used for training, and recent data without house prices is ignored. Second, in second-hand house transaction data sets, the time span between transactions of the same house is large, so it is difficult for a model to learn the intrinsic relationship between time and house price.
Therefore, it is important to design a method that mines the intrinsic temporal and spatial correlation of house prices based on a neural network and predicts house prices accordingly.
Disclosure of Invention
The main object of the invention is to provide a method, a device and a computer-readable storage medium for predicting house prices by mining the spatiotemporal correlation of house prices based on a neural network, so as to solve the problems in the prior art that recent data without house price labels is ignored and that the intrinsic relationship between time and house price is difficult to determine.
In order to achieve the above object, the present invention provides a method for predicting house prices by mining the spatiotemporal correlation of house prices based on a neural network, the method comprising the following steps:
acquiring a data set, separating the data with house price labels from the data without house price labels, and dividing the data set with house price labels into a plurality of sub-datasets;
inputting the sub-datasets separately into a first neural network model for training to obtain a first house price prediction model;
inputting the data without house price labels into the first house price prediction model, and taking the predicted price as the house price of the data without house price labels;
fusing the data without house price labels and the data with house price labels into a new data set, and inputting the new data set into a second neural network model for training to obtain a second house price prediction model;
and inputting the feature information of the house to be predicted, after classification processing, into the second house price prediction model to obtain the predicted house price.
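
The data flow of these steps can be sketched as follows. This is only an illustration under stated assumptions: a nearest-neighbour lookup (`fit_nearest`, `predict_nearest`) stands in for both neural network models, and the function names, the feature layout and the toy prices are hypothetical and do not come from the patent.

```python
def fit_nearest(rows):
    """Stand-in 'model': remember the labelled (features, price) pairs."""
    return list(rows)


def predict_nearest(model, x):
    """Return the price of the stored sample closest to x (squared distance)."""
    _, price = min(model, key=lambda row: sum((a - b) ** 2 for a, b in zip(row[0], x)))
    return price


def build_fused_dataset(labelled_by_period, unlabelled_by_period):
    """Pseudo-label unlabelled houses per period, then merge everything."""
    fused = []
    for period, rows in labelled_by_period.items():
        model = fit_nearest(rows)  # first-stage model for this time period
        # Labelled samples keep their real price; the period becomes a feature.
        fused += [([period] + list(x), y) for x, y in rows]
        for x in unlabelled_by_period.get(period, []):
            # The predicted price is used as the label of the unlabelled sample.
            fused.append(([period] + list(x), predict_nearest(model, x)))
    return fused


# Toy data: features are [floor area, floor number]; prices are made up.
labelled = {
    0: [([80.0, 1.0], 100.0), ([120.0, 3.0], 150.0)],
    1: [([80.0, 1.0], 110.0), ([120.0, 3.0], 165.0)],
}
unlabelled = {1: [[85.0, 1.0]]}

fused = build_fused_dataset(labelled, unlabelled)
second_model = fit_nearest(fused)  # second stage sees period + house features
print(predict_nearest(second_model, [1, 118.0, 3.0]))  # 165.0
```

In this sketch, unlabelled samples from a period that has no labelled data are simply skipped, which is a simplification rather than something the patent specifies.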
In one embodiment, acquiring the data set and separating the data with house price labels from the data without house price labels comprises: collecting a house data set for an area, detecting and classifying the data type of each item of data in the house data set, and separating the data with house price labels from the data without house price labels.
In one embodiment, the data types include at least a timestamp type, a numerical type, and a category type; detecting and classifying the data types of the data in the house data set comprises the following steps:
if the detected data type is the timestamp type, separating the year, month and day of the timestamp into new data features and processing them as numerical features;
if the detected data type is the numerical type, filling in the missing numerical features using mean imputation;
if the detected data type is the category type, counting the number of categories in each categorical feature in the data set, and converting the categorical data into data the model can recognize using one-hot encoding, which increases the number of features.
In one embodiment, the method further comprises: normalizing each numerical feature to change the feature distribution, wherein the normalization makes the mean of the numerical feature 0 and the variance 1.
In one embodiment, dividing the data set with house price labels into a plurality of sub-datasets comprises:
acquiring the data set with house price labels after classification processing, and dividing it into a plurality of sub-datasets by time unit;
and dividing each sub-dataset proportionally into a training set, a validation set and a test set, wherein the ratio of the training set to the validation set to the test set is 8:1:1.
In one embodiment, the first house price prediction model comprises at least two sub-house-price prediction models; inputting the sub-datasets separately into the first neural network model for training to obtain the first house price prediction model comprises:
counting the number of features of the house data set to be trained, and inputting the sub-datasets separately into the first neural network model for training to obtain the corresponding sub-house-price prediction models;
and predicting, according to the sub-house-price prediction model of the corresponding time period, the price of houses without house price labels whose feature information is similar to that of houses with house price labels.
In one embodiment, the method further comprises: adjusting the neural network parameters to optimize the model.
In one embodiment, the parameters include at least: the number of hidden layers of the neural network and the number of neurons in each hidden layer.
In order to achieve the above object, the present invention further provides a house price prediction device comprising a memory, a processor, and a program, stored in the memory and executable on the processor, for predicting house prices by mining the spatiotemporal correlation of house prices based on a neural network, wherein the program, when executed by the processor, implements the steps of the method described above.
To achieve the above object, the present invention further provides a computer-readable storage medium storing a program for predicting house prices by mining the spatiotemporal correlation of house prices based on a neural network, wherein the program, when executed by a processor, implements the steps of the method described above.
The method, the device and the computer-readable storage medium provided by the embodiments of the application have at least the following technical effects:
A house data set for an area is collected, the data type of each item of data in the house data set is detected and classified, the data with house price labels is separated from the data without house price labels, and the data set with house price labels is divided into sub-datasets; the sub-datasets are input separately into a first neural network model for training to obtain a first house price prediction model; the data without house price labels is input into the first house price prediction model, and the predicted price is taken as the house price of the data without house price labels; the data without house price labels and the data with house price labels are fused into a new data set, and the new data set is input into a second neural network model for training to obtain a second house price prediction model. The spatial dependence of house prices is used to estimate the price of houses with similar spatial attributes in the same time period, so that the time span between the price labels of a house is greatly shortened. This helps the network learn the temporal correlation of the data and makes predictions more accurate at different time points.
Drawings
Fig. 1 is a schematic structural diagram of a house price prediction device according to an embodiment of the present invention;
Fig. 2 is a schematic flowchart of a first embodiment of the method for predicting house prices by mining the spatiotemporal correlation of house prices based on a neural network according to the present invention;
Fig. 3 is a detailed flowchart of step S110 of the first embodiment of the method;
Fig. 4 is a detailed flowchart of step S111 of the first embodiment of the method;
Fig. 5 is a detailed flowchart of step S120 of the first embodiment of the method;
Fig. 6 is a schematic flowchart of a second embodiment of the method.
The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
In order to solve the problems in the prior art that recent data without house price labels is ignored and that the intrinsic relationship between time and house price is difficult to determine, the method comprises: acquiring a data set, separating the data with house price labels from the data without house price labels, and dividing the data set with house price labels into a plurality of sub-datasets; inputting the sub-datasets separately into a first neural network model for training to obtain a first house price prediction model; inputting the data without house price labels into the first house price prediction model, and taking the predicted price as the house price of the data without house price labels; fusing the data without house price labels and the data with house price labels into a new data set, and inputting the new data set into a second neural network model for training to obtain a second house price prediction model; and inputting the feature information of the house to be predicted, after classification processing, into the second house price prediction model to obtain the predicted house price. This reduces the cost of house price prediction and improves its accuracy.
For a better understanding of the above technical solutions, exemplary embodiments of the present application will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present application are shown in the drawings, it should be understood that the present application may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
It will be understood by those skilled in the art that the structure of the house price prediction device shown in Fig. 1 does not constitute a limitation of the device, and that the device may include more or fewer components than those shown, may combine some components, or may arrange the components differently.
As one implementation, Fig. 1 shows a schematic structural diagram of a house price prediction device according to an embodiment of the present invention.
Processor 1100 may be an integrated circuit chip with signal processing capabilities. In implementation, the steps of the above method may be performed by integrated logic circuits in hardware or by instructions in the form of software in the processor 1100. The processor 1100 may be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic, or discrete hardware components, and may implement or perform the methods, steps and logic blocks disclosed in the embodiments of the present invention. A general-purpose processor may be a microprocessor or any conventional processor. A software module may reside in a storage medium well known in the art, such as random access memory, flash memory, read-only memory, programmable read-only memory, erasable programmable read-only memory, or registers. The storage medium is located in the memory 1200; the processor 1100 reads the information in the memory 1200 and performs the steps of the above method in combination with its hardware.
It will be appreciated that the memory 1200 in embodiments of the invention may be volatile memory or non-volatile memory, or may include both. The non-volatile memory may be read-only memory (ROM), programmable ROM (PROM), erasable PROM (EPROM), electrically erasable PROM (EEPROM), or flash memory. The volatile memory may be random access memory (RAM), which acts as an external cache. By way of illustration and not limitation, many forms of RAM are available, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), and direct Rambus RAM (DR RAM). The memory 1200 of the systems and methods described in connection with the embodiments of the invention is intended to include, without being limited to, these and any other suitable types of memory.
For a software implementation, the techniques described in this disclosure may be implemented with modules (e.g., procedures, functions, and so on) that perform the functions described in this disclosure. The software codes may be stored in a memory and executed by a processor. The memory may be implemented within the processor or external to the processor.
Based on the above structure, an embodiment of the present invention is proposed.
Referring to Fig. 2, which is a schematic flowchart of a first embodiment of the method for predicting house prices by mining the spatiotemporal correlation of house prices based on a neural network, the method comprises the following steps:
Step S110: acquiring a data set, separating the data with house price labels from the data without house price labels, and dividing the data set with house price labels into a plurality of sub-datasets.
In this embodiment, a house data set for an area is obtained, which includes data with house price labels and data without house price labels. The data type of each item of data in the house data set is detected and classified, the data with house price labels is separated from the data without house price labels, and the data set with house price labels is divided into a plurality of sub-datasets.
Referring to Fig. 3, which is a detailed flowchart of step S110 of the method, step S110 comprises the following steps:
Step S111: collecting a house data set for an area, detecting and classifying the data type of each item of data in the house data set, and separating the data with house price labels from the data without house price labels.
In this embodiment, the sources of the house data set include, but are not limited to, publicly available data from websites, newspapers, advertisements and the like. The area may be a province, a city, an administrative district and the like; for example, house data for the past 10 years for administrative district C of city B is collected from housing listing website A. The data in the house data set includes house attributes, geographic location, house developer, house transaction records and the like. The house attributes may include floor area, decoration condition, house type, orientation, the residential community to which the house belongs, and the like; the geographic location may include the location of the administrative district, the location of the residential community, and the like. The data types include at least a timestamp type, a numerical type and a category type. After house data for an area is collected, it is screened to remove obviously erroneous data, data with severe information loss, and the like. By detecting and classifying the data type of each item of data in the house data set, the features of the data can be divided into at least three categories: time data, such as the sale time of a house, is classified as timestamp features; numerical data, such as floor area and house price, is classified as numerical features; and data that is difficult to quantify, such as the residential community and orientation, is classified as categorical features.
Referring to Fig. 4, which is a detailed flowchart of step S111 of the method, step S111 comprises the following steps:
Step S1111: if the detected data type is the timestamp type, the year, month and day of the timestamp are separated into new data features and processed as numerical features.
In the present embodiment, the timestamp refers to data generated by using a digital signature technology, and is mainly intended to authenticate the time of data generation by a certain technical means, so as to verify whether the data is falsified after the generation.
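
As a minimal sketch of step S1111, the snippet below splits a sale date into numeric year, month and day features. The function name `split_timestamp`, the `YYYY-MM-DD` input format and the field names are assumptions made for illustration; the patent does not specify a date format.

```python
from datetime import datetime


def split_timestamp(value, fmt="%Y-%m-%d"):
    """Split a date string into numeric year, month and day features."""
    parsed = datetime.strptime(value, fmt)
    return {"year": parsed.year, "month": parsed.month, "day": parsed.day}


# Hypothetical record: the sale time becomes three numerical features.
record = {"sale_time": "2016-03-05", "area": 89.5}
features = split_timestamp(record["sale_time"])
print(features)  # {'year': 2016, 'month': 3, 'day': 5}
```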
Step S1112: if the detected data type is the numerical type, the missing numerical features are filled in using mean imputation.
In this embodiment, if the detected data type is numerical, the missing numerical features are filled in using mean imputation. For example, suppose four houses A, B, C and D in a residential community were sold in March, but only the sale times of houses A, C and D were collected: March 5, March 15 and March 21, respectively, while the sale time of house B is missing. From the sale times of A, C and D, the average selling interval is about one house every 5 days, so the sale time of house B is estimated to be about March 10, and the missing value is filled in with March 10.
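
A minimal sketch of mean imputation for one numerical column is shown below. It uses the plain column mean of the observed values, which is one common reading of "average value filling" and differs from the interval-based estimate in the sale-time example; the function name and the sample floor areas are hypothetical.

```python
from statistics import mean


def fill_missing_with_mean(values):
    """Replace None entries with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    column_mean = mean(observed)
    return [column_mean if v is None else v for v in values]


# Hypothetical floor areas in square metres; the second one is missing.
areas = [80.0, None, 100.0, 120.0]
print(fill_missing_with_mean(areas))  # [80.0, 100.0, 100.0, 120.0]
```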
Step S1113: if the detected data type is the category type, the number of categories in each categorical feature in the data set is counted, and the categorical data is converted into data the model can recognize using one-hot encoding, which increases the number of features.
In this embodiment, if the detected data type is the category type, such as residential community, orientation, decoration condition or house type, the number of categories in each categorical feature in the data set is counted, and the categorical data is converted into data the model can recognize using one-hot encoding, which increases the number of features; that is, discrete data in the data set is converted into continuous data. One-hot encoding, also known as one-bit effective encoding, uses an N-bit state register to encode N states, where each state has its own register bit and only one bit is active at any time. Each categorical feature is represented as a binary vector: each category value is first mapped to an integer value, and each integer value is then represented as a binary vector. For example, when the "orientation" feature in the data set is one-hot encoded and its detected categories are east, south and west, the feature is encoded on the principle that an N-bit state register encodes N states: east is converted into 100, south into 010, and west into 001, that is, into the vectors [1, 0, 0], [0, 1, 0] and [0, 0, 1]. For categorical features, a missing value is itself treated as a category and does not need to be filled in; if a feature has a large number of missing values, the feature can be removed directly.
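
The encoding described in step S1113 can be sketched as follows. The function name `one_hot_encode` and the `<missing>` placeholder are assumptions for illustration; the sketch follows the text in treating a missing value as its own category, and orders categories by first appearance.

```python
def one_hot_encode(values, missing_label="<missing>"):
    """Return (categories, vectors) with one binary vector per input value."""
    # A missing value is treated as a category of its own, not imputed.
    labelled = [missing_label if v is None else v for v in values]
    categories = list(dict.fromkeys(labelled))  # unique, in first-seen order
    index = {category: i for i, category in enumerate(categories)}
    vectors = []
    for value in labelled:
        vector = [0] * len(categories)
        vector[index[value]] = 1  # exactly one bit is active
        vectors.append(vector)
    return categories, vectors


categories, vectors = one_hot_encode(["east", "south", "west"])
print(categories)  # ['east', 'south', 'west']
print(vectors)     # [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
```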
Step S1114: each numerical feature is normalized to change the feature distribution, where the normalization makes the mean of the numerical feature 0 and the variance 1.
In this embodiment, after the discrete features are one-hot encoded, the feature in each dimension of the encoding can be regarded as a continuous feature and can be normalized in the same way as other continuous features. Each feature dimension is normalized so that the mean of the numerical feature is 0 and the variance is 1.
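
A minimal sketch of this normalization (z-score standardization) for a single feature column follows. The function name `standardize` and the sample values are hypothetical, and returning zeros for a constant column is a design choice of this sketch, not something the patent states.

```python
from statistics import fmean, pstdev


def standardize(values):
    """Scale a numeric column to mean 0 and (population) variance 1."""
    mu = fmean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


areas = [60.0, 80.0, 100.0, 120.0]
scaled = standardize(areas)
print(round(fmean(scaled), 9), round(pstdev(scaled), 9))
```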
In summary, a house data set for an area is collected, the data type of each item of data in the house data set is detected and classified, and the data with house price labels is separated from the data without house price labels. If the detected data type is the timestamp type, the year, month and day of the timestamp are separated into new data features and treated as numerical features; if the detected data type is the numerical type, the missing numerical features are filled in using mean imputation; if the detected data type is the category type, the number of categories in each categorical feature in the data set is counted, and text data and missing categories are converted by one-hot encoding into a form the model can recognize. This increases the number of features and gives the model more information, so that the model can still make predictions when some features are missing and the house price prediction is more accurate. Normalizing the numerical data improves the distribution of the data features and helps the model learn the underlying patterns in the data.
Step S112: acquiring the data set with house price labels after classification processing, and dividing it into a plurality of sub-datasets by time unit.
In this embodiment, the data set with house price labels after classification processing is obtained and divided into a plurality of sub-datasets by time unit. Let $D$ denote the data with house prices in the data set and $\hat{D}$ the data without house prices. The data with real house price labels is divided by transaction time into a plurality of sample sets $D_1, D_2, \ldots, D_n$, which correspond to the time periods $T_1, T_2, \ldots, T_n$, respectively. For example, with half a year or one year as the time period, the data in each time period is placed in one sample set. If a house has a transaction record and house information in 2015 but no transaction record in 2016, then for 2016 the house has house information but no house price label.
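
The division by time period can be sketched as below, assuming half-year periods. The record layout (a `date` field and a `price` field that is `None` when no label exists) and the function names are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from datetime import date


def period_key(day, months_per_period=6):
    """Map a date to (year, index of the period within that year)."""
    return (day.year, (day.month - 1) // months_per_period)


def partition_by_period(records, months_per_period=6):
    """Group records by time period, keeping labelled and unlabelled apart."""
    labelled, unlabelled = defaultdict(list), defaultdict(list)
    for record in records:
        target = unlabelled if record.get("price") is None else labelled
        target[period_key(record["date"], months_per_period)].append(record)
    return dict(labelled), dict(unlabelled)


records = [
    {"date": date(2015, 3, 1), "price": 100.0},
    {"date": date(2015, 9, 1), "price": 105.0},
    {"date": date(2016, 2, 1), "price": None},  # house info, no price label
]
labelled, unlabelled = partition_by_period(records)
print(sorted(labelled))  # [(2015, 0), (2015, 1)]
print(list(unlabelled))  # [(2016, 0)]
```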
Step S113: each sub-dataset is divided proportionally into a training set, a validation set and a test set, wherein the ratio of the training set to the validation set to the test set is 8:1:1.
In this embodiment, before $D_1, D_2, \ldots, D_n$ are used to separately train neural networks that fit the spatial correlation of the house price data in each time period, each sub-dataset is divided proportionally into a training set, a validation set and a test set with a ratio of 8:1:1. The split ratio may be adjusted according to how well the model actually fits and is not limited to the ratio given in this application.
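
A minimal sketch of the 8:1:1 split of one sub-dataset is given below. Shuffling before splitting and the fixed seed are assumptions of this sketch; the patent only states the ratio.

```python
import random


def split_811(samples, seed=0):
    """Shuffle samples and split them into train/validation/test at 8:1:1."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) * 8 // 10
    n_val = len(shuffled) // 10
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # the remainder goes to the test set
    return train, val, test


train, val, test = split_811(range(100))
print(len(train), len(val), len(test))  # 80 10 10
```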
In this technical solution, a house data set for an area is collected, the data type of each item of data in the house data set is detected and classified, the data with house price labels is separated from the data without house price labels, and the classified data set with house price labels is divided into a plurality of sub-datasets by time unit. This helps the network learn the temporal correlation of the data, so that predictions are more accurate at different time points.
Step S120: inputting the sub-datasets separately into a first neural network model for training to obtain a first house price prediction model.
In this embodiment, the sub-datasets are input separately into a first neural network model for training to obtain a first house price prediction model, which comprises at least two sub-house-price prediction models. Because each of the sub-datasets $D_1, D_2, \ldots, D_n$ contains samples from a short time period, the house prices within each sub-dataset do not fluctuate much over time, while the spatial correlation of the data features is strong. For houses with similar spatial attributes, such as similar buildings in the same residential community, the house prices are very likely to be similar, so the spatial dependence of house prices can be used to estimate the price of houses with similar spatial attributes in the same time period.
Referring to fig. 5, fig. 5 is a schematic flowchart illustrating a detailed process of step S120 of the first embodiment of the method for predicting a room price based on a neural network mining space-time correlation of the room price, including the following steps:
step S121, counting the characteristic quantity of the house data set to be trained, and inputting the sub data sets into the first neural network model respectively for training to obtain a corresponding ovary price prediction model.
In this embodiment, the feature quantity of the house data set to be trained is counted, which determines the quantity of neurons in the input layer of the neural network, and the quantity of neurons in the output layer is 1 because the room price prediction task belongs to the regression task; inputting the subdata sets into the first neural network model for training, namely using D1、D2、…、DnRespectively training a neural network to fit the spatial relevance of the room price data in each time period, wherein the first neural network model mainly mines the influence of spatial factors such as the property of a house, the geographical position of the house and the like on the room price; obtaining corresponding ovary price prediction models, namely obtaining n neural network models M1、…、Mn
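A minimal sketch of step S121 is given below: one small regression network is trained per time period, with the input width taken from the feature count and a single output neuron. The class `TinyMLP`, its single tanh hidden layer, the plain stochastic gradient descent loop and the synthetic data are all assumptions chosen to keep the example self-contained; the source does not specify the network architecture.

```python
import math
import random


class TinyMLP:
    """One-hidden-layer regression network: n_features inputs, 1 output."""

    def __init__(self, n_features, n_hidden=8, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_features)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def _forward(self, x):
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(self.w1, self.b1)]
        return h, sum(w * hi for w, hi in zip(self.w2, h)) + self.b2

    def predict(self, x):
        return self._forward(x)[1]

    def fit(self, X, y, epochs=200, lr=0.05):
        for _ in range(epochs):
            for x, t in zip(X, y):
                h, out = self._forward(x)
                err = out - t  # gradient of 0.5 * (out - t)^2 w.r.t. out
                for j, hj in enumerate(h):
                    # Back-propagate through tanh before updating w2[j].
                    grad_h = err * self.w2[j] * (1.0 - hj * hj)
                    self.w2[j] -= lr * err * hj
                    self.b1[j] -= lr * grad_h
                    for i, xi in enumerate(x):
                        self.w1[j][i] -= lr * grad_h * xi
                self.b2 -= lr * err
        return self


def mse(model, X, y):
    """Mean squared error of a model on one data set."""
    return sum((model.predict(x) - t) ** 2 for x, t in zip(X, y)) / len(y)


def train_period_models(subsets):
    """Train one network per sub-data set (X, y), giving M1..Mn."""
    models = []
    for X, y in subsets:
        n_features = len(X[0])  # sets the input-layer width
        models.append(TinyMLP(n_features).fit(X, y))
    return models


def make_period(offset):
    # Synthetic, already-normalized features; prices shift by `offset` per period.
    X = [[a / 2.0, b / 2.0] for a in (-2, -1, 1, 2) for b in (-1, 1)]
    y = [x0 + 0.5 * x1 + offset for x0, x1 in X]
    return X, y


subsets = [make_period(0.0), make_period(0.3)]
models = train_period_models(subsets)
```

Keeping one model per period means each network only has to capture how prices vary across houses, not how they drift over time.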
Step S122, predicting, according to the sub-house-price prediction models, the house prices of houses that have no house price label but have house feature information similar to that of labeled houses in the corresponding time period.
In this embodiment, the n neural network models M1, …, Mn obtained above are used to predict the house prices of houses that have no house price label but have house feature information similar to that of labeled houses in the corresponding time period, that is, to estimate the price of a house that has house information but no house price label in that time period. The data distribution differs between time periods. For example, if residential community S has many transactions in time period T1 but none in time period T2, only M1 is suitable for predicting the house prices of community S in time period T1.
Because the number of features of the house data set to be trained is counted, the sub-data sets are respectively input into the first neural network model for training to obtain the corresponding sub-house-price prediction models, and the house prices of unlabeled houses with house feature information similar to that of labeled houses in the corresponding time period are predicted according to the sub-house-price prediction models, this technical scheme uses the spatial dependence of house prices to estimate the prices of houses with similar spatial attributes in the same time period, so that predictions at different time points are more accurate.
Step S130, inputting the data without house price labels into the first house price prediction model, and taking the predicted house prices as the house prices of the data without house price labels.
In this embodiment, the first house price prediction model includes the n neural network models M1, …, Mn. The data without house price labels is input into the first house price prediction model, which predicts the house prices of unlabeled houses whose house feature information is similar to that of labeled houses in the corresponding time period, and the predicted house prices are taken as the house prices of the data without house price labels.
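Step S130 amounts to filling in missing labels with the prediction of the sub-model for the matching time period. The Python sketch below shows that routing. The record fields `period` and `features`, the `pseudo` flag and the two stand-in predictors are hypothetical; in practice each entry of `models` would be one of the trained networks M1, …, Mn.

```python
def pseudo_label(unlabeled, models):
    """Fill in missing prices using the sub-model of each record's period."""
    filled = []
    for rec in unlabeled:
        model = models.get(rec["period"])
        if model is None:
            continue  # no sub-model was trained for this period
        # Copy the record and attach the predicted price as its label.
        filled.append({**rec, "price": model(rec["features"]), "pseudo": True})
    return filled


# Stand-in predictors (assumptions): any callable mapping features to a price.
models = {
    1: lambda f: 2.0 * f[0],
    2: lambda f: 3.0 * f[0],
}
unlabeled = [
    {"id": "A", "period": 1, "features": [1.5]},
    {"id": "B", "period": 2, "features": [1.5]},
    {"id": "C", "period": 9, "features": [1.5]},
]
filled = pseudo_label(unlabeled, models)
```

Records from a period with no trained sub-model are skipped here rather than labeled by a model from another period, which follows the remark that only the model of the matching period is suitable.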
Step S140, fusing the data without house price labels and the data with house price labels into a new data set, and inputting the new data set into a second neural network model for training to obtain a second house price prediction model.
In this embodiment, the data that now carries predicted labels, namely the data that originally had no house price label, is fused with the data with house price labels in the original data set into a new data set. The new data set keeps the spatial correlation of the original data set and also carries time correlation, so a network trained on it can readily learn the space-time relation of house prices, which makes the prediction result more accurate. The new data set is input into a second neural network model for training to obtain a second house price prediction model. The second neural network model is built in a way similar to the first: the number of features of the house data set to be trained is counted first, which determines the number of neurons in the input layer; because house price prediction is a regression task, the output layer has 1 neuron; the second neural network model is then built and trained with the new data set to fit the time and space relevance of the house price data.
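The fusion in step S140 can be pictured as concatenating the really labeled and the pseudo-labeled records into one training set whose inputs include a time feature. In the Python sketch below, the record layout, the function name `fuse` and the use of a single period index as the time feature are assumptions for illustration only.

```python
def fuse(labeled, pseudo_labeled):
    """Merge real and pseudo-labeled records into one training set (X, y)."""
    X, y = [], []
    for rec in list(labeled) + list(pseudo_labeled):
        # Prepend the period index so the second network can see time.
        X.append([float(rec["period"])] + list(rec["features"]))
        y.append(float(rec["price"]))
    return X, y


# The same house (same features) with a real price in period 1
# and a predicted price in period 2.
labeled = [{"period": 1, "features": [0.2, 1.0], "price": 1.5}]
pseudo = [{"period": 2, "features": [0.2, 1.0], "price": 1.7}]
X, y = fuse(labeled, pseudo)
n_inputs = len(X[0])  # input-layer width of the second network
```

Because the same house now appears in several periods, the second network sees how a fixed set of spatial features maps to different prices over time.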
Step S150, inputting the feature information of the house to be predicted, after classification processing, into the second house price prediction model to obtain the predicted house price.
In this embodiment, the data set used to train the second neural network model is larger, so the network has more layers and more neurons. The feature information of the house to be predicted must first be classified: the time point to be predicted and the house information are preprocessed into a data format that can be input into the model. The processed data is input into the second neural network model to obtain the normalized predicted house price, and the more accurate predicted house price is then obtained through inverse normalization.
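The inverse normalization at the end of step S150 can be sketched as below, assuming the price labels were standardized to zero mean and unit variance during training. The class name `PriceScaler` and the sample prices are hypothetical, and the source does not state which normalization is applied to the price itself.

```python
import statistics


class PriceScaler:
    """Standardize prices for training and map predictions back to real prices."""

    def fit(self, prices):
        self.mean = statistics.fmean(prices)
        # Population standard deviation; fall back to 1.0 for constant data.
        self.std = statistics.pstdev(prices) or 1.0
        return self

    def transform(self, price):
        return (price - self.mean) / self.std

    def inverse(self, z):
        return z * self.std + self.mean


scaler = PriceScaler().fit([100.0, 200.0, 300.0])
normalized_prediction = scaler.transform(250.0)  # stands in for a model output
predicted_price = scaler.inverse(normalized_prediction)
```

The scaler must be fitted on the training prices only and reused unchanged at prediction time, otherwise the inverse step would not undo the normalization the network was trained with.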
Because the following steps are adopted: acquiring a data set, separating the data with house price labels from the data without house price labels, and dividing the data set with house price labels into a plurality of sub-data sets; inputting the sub-data sets into a first neural network model respectively for training to obtain a first house price prediction model; inputting the data without house price labels into the first house price prediction model, and taking the predicted house prices as the house prices of the data without house price labels; fusing the data without house price labels and the data with house price labels into a new data set, and inputting the new data set into a second neural network model for training to obtain a second house price prediction model; and inputting the feature information of the house to be predicted, after classification processing, into the second house price prediction model to obtain the predicted house price, this technical scheme greatly shortens the time span of the house price labels and helps the network learn the time relevance of the data, so that predictions at different time points are more accurate.
Referring to fig. 6, fig. 6 is a flowchart of a second embodiment of the method for mining space-time correlation of house prices based on a neural network to predict house prices, including the following steps:
Step S210, acquiring a data set, separating the data with house price labels from the data without house price labels, and dividing the data set with house price labels into a plurality of sub-data sets.
Step S220, inputting the sub-data sets into a first neural network model respectively for training to obtain a first house price prediction model.
Step S230, inputting the data without house price labels into the first house price prediction model, and taking the predicted house prices as the house prices of the data without house price labels.
Step S240, fusing the data without house price labels and the data with house price labels into a new data set, and inputting the new data set into a second neural network model for training to obtain a second house price prediction model.
Step S250, inputting the feature information of the house to be predicted, after classification processing, into the second house price prediction model to obtain the predicted house price.
Step S260, adjusting neural network parameters to perform model optimization.
In this embodiment, the neural network parameters include at least: the number of hidden layers of the neural network, the number of neurons in each hidden layer, the learning rate, the optimizer, the loss function and the like. In this application, the model is optimized by adjusting the number of hidden layers of the first and second neural networks and the number of neurons in each hidden layer, the learning rate, the number of samples used in each training step, the optimizer and other parameters. The first and second neural networks use the mean squared error as the loss function and the mean absolute percentage error as the evaluation metric of the model. Training ends when the loss on the verification set has increased for n consecutive training rounds, which indicates overfitting, or when the maximum number of training iterations is reached. The model with the lowest mean absolute percentage error on the verification set is selected as the trained model, and the mean absolute percentage error on the test set is used as the metric for evaluating the model.
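The stopping rule and model selection described above can be sketched in Python as follows. The function names `mape` and `select_checkpoint`, the `patience` parameter (the "n consecutive rounds") and the per-epoch numbers are assumptions for illustration; the sketch works on recorded verification metrics rather than on a live training loop.

```python
def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def select_checkpoint(val_losses, val_mapes, patience=3):
    """Return (stop_epoch, best_epoch) for a sequence of per-epoch metrics.

    Training stops once the verification loss has risen for `patience`
    consecutive epochs, or when the sequence (the maximum number of
    epochs) is exhausted. The best epoch is the one with the lowest
    verification MAPE seen up to the stop.
    """
    best_epoch, best_mape = None, float("inf")
    rises, prev_loss, stop_epoch = 0, None, len(val_losses) - 1
    for epoch, (loss, m) in enumerate(zip(val_losses, val_mapes)):
        if m < best_mape:
            best_epoch, best_mape = epoch, m
        # Count consecutive increases of the verification loss.
        rises = rises + 1 if prev_loss is not None and loss > prev_loss else 0
        prev_loss = loss
        if rises >= patience:
            stop_epoch = epoch
            break
    return stop_epoch, best_epoch


# Hypothetical per-epoch verification metrics.
val_losses = [1.0, 0.8, 0.6, 0.7, 0.8, 0.9, 0.5]
val_mapes = [30.0, 25.0, 20.0, 22.0, 24.0, 26.0, 10.0]
stop_epoch, best_epoch = select_checkpoint(val_losses, val_mapes, patience=3)
```

In this example the loss rises at epochs 3, 4 and 5, so training stops at epoch 5 and the checkpoint from epoch 2 is kept; the later epoch with a lower error is never reached.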
Compared with the first embodiment, the second embodiment adds step S260; the other steps are the same as those of the first embodiment and are not described again.
Because the following steps are adopted: acquiring a data set, separating the data with house price labels from the data without house price labels, and dividing the data set with house price labels into a plurality of sub-data sets; inputting the sub-data sets into a first neural network model respectively for training to obtain a first house price prediction model; inputting the data without house price labels into the first house price prediction model, and taking the predicted house prices as the house prices of the data without house price labels; fusing the data without house price labels and the data with house price labels into a new data set, and inputting the new data set into a second neural network model for training to obtain a second house price prediction model; inputting the feature information of the house to be predicted, after classification processing, into the second house price prediction model to obtain the predicted house price; and adjusting the neural network parameters to optimize the model, this technical scheme helps the network learn the time relevance of the data and optimizes the model, so that predictions at different time points are more accurate.
Based on the same inventive concept, the invention also provides a house price prediction device, which comprises a memory, a processor, and a program, stored in the memory and executable on the processor, for mining space-time correlation of house prices based on a neural network to predict house prices. When executed by the processor, the program implements each step of the method for mining space-time correlation of house prices based on a neural network to predict house prices and achieves the same technical effect; to avoid repetition, the details are not described again here.
Since the house price prediction device provided in the embodiment of the present application is the device used to implement the method of the embodiment of the present application, a person skilled in the art can, based on the method described in the embodiment of the present application, understand the specific structure and variations of the device, so the details are not described here. Any house price prediction device used with the method of the embodiment of the present application falls within the protection scope of the present application. The above serial numbers of the embodiments of the present invention are for description only and do not represent the merits of the embodiments.
Based on the same inventive concept, the embodiment of the present application further provides a computer-readable storage medium storing a program for mining space-time correlation of house prices based on a neural network to predict house prices. When executed by a processor, the program implements the steps of the method for mining space-time correlation of house prices based on a neural network to predict house prices described above and achieves the same technical effect; to avoid repetition, the details are not described again here.
Since the computer-readable storage medium provided in the embodiments of the present application is a computer-readable storage medium used for implementing the method in the embodiments of the present application, based on the method described in the embodiments of the present application, those skilled in the art can understand the specific structure and modification of the computer-readable storage medium, and thus details are not described herein. Any computer-readable storage medium that can be used with the methods of the embodiments of the present application is intended to be within the scope of the present application.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It should be noted that in the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In the unit claims enumerating several means, several of these means may be embodied by one and the same item of hardware. The usage of the words first, second and third, etcetera do not indicate any ordering. These words may be interpreted as names.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the invention.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.

Claims (10)

1. A method for mining space-time correlation of house prices based on a neural network to predict house prices, the method comprising:
acquiring a data set, separating data with house price labels from data without house price labels, and dividing the data set with house price labels into a plurality of sub-data sets;
inputting the sub-data sets into a first neural network model respectively for training to obtain a first house price prediction model;
inputting the data without house price labels into the first house price prediction model, and taking the predicted house prices as the house prices of the data without house price labels;
fusing the data without house price labels and the data with house price labels into a new data set, and inputting the new data set into a second neural network model for training to obtain a second house price prediction model;
and inputting the feature information of the house to be predicted, after classification processing, into the second house price prediction model to obtain the predicted house price.
2. The method of claim 1, wherein the acquiring a data set and separating data with house price labels from data without house price labels comprises: collecting a house data set of an area, detecting and classifying the data type of each item of data in the house data set, and separating the labeled house price data from the unlabeled house price data.
3. The method of claim 2, wherein the data types include at least a timestamp type, a numerical type and a category type; the detecting and classifying the data type of each item of data in the house data set comprises:
if the detected data type is the timestamp type, separating the year, month and day of the timestamp into new data features and processing them as numerical features;
if the detected data type is the numerical type, filling missing numerical features by mean imputation;
if the detected data type is the category type, counting the number of categories in each category feature in the data set, and converting the category data into data the model can recognize by one-hot encoding, which increases the number of features.
4. The method for mining space-time correlation of house prices based on a neural network to predict house prices as claimed in claim 3, further comprising: normalizing each numerical feature to change its distribution, wherein the normalization gives the numerical feature a mean of 0 and a variance of 1.
5. The method of claim 2, wherein the dividing the data set with house price labels into a plurality of sub-data sets comprises:
acquiring the classified data set with house price labels, and dividing the data set with house price labels into a plurality of sub-data sets by time period;
and dividing each sub-data set proportionally into a training set, a verification set and a test set, wherein the ratio of the training set to the verification set to the test set is 8:1:1.
6. The method for mining space-time correlation of house prices based on a neural network to predict house prices as claimed in claim 5, wherein the first house price prediction model comprises at least two sub-house-price prediction models; the step of inputting the sub-data sets into a first neural network model respectively for training to obtain a first house price prediction model comprises:
counting the number of features of the house data set to be trained, and inputting the sub-data sets into the first neural network model respectively for training to obtain the corresponding sub-house-price prediction models;
and predicting, according to the sub-house-price prediction models, the house prices of houses that have no house price label but have house feature information similar to that of labeled houses in the corresponding time period.
7. The method for mining space-time correlation of house prices based on a neural network to predict house prices as claimed in claim 1, further comprising: adjusting neural network parameters to perform model optimization.
8. The method of claim 7, wherein the parameters include at least: the number of hidden layers of the neural network and the number of neurons in each hidden layer.
9. A house price prediction device comprising a memory, a processor, and a program, stored in the memory and executable on the processor, for mining space-time correlation of house prices based on a neural network to predict house prices, wherein the program, when executed by the processor, implements the steps of the method for mining space-time correlation of house prices based on a neural network to predict house prices according to any one of claims 1 to 8.
10. A computer-readable storage medium storing a program for mining space-time correlation of house prices based on a neural network to predict house prices, wherein the program, when executed by a processor, implements the steps of the method for mining space-time correlation of house prices based on a neural network to predict house prices according to any one of claims 1 to 8.
CN202011344364.3A 2020-11-24 2020-11-24 Method for mining space-time correlation of house price based on neural network to predict house price Pending CN112418939A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011344364.3A CN112418939A (en) 2020-11-24 2020-11-24 Method for mining space-time correlation of house price based on neural network to predict house price

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011344364.3A CN112418939A (en) 2020-11-24 2020-11-24 Method for mining space-time correlation of house price based on neural network to predict house price

Publications (1)

Publication Number Publication Date
CN112418939A true CN112418939A (en) 2021-02-26

Family

ID=74843064

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011344364.3A Pending CN112418939A (en) 2020-11-24 2020-11-24 Method for mining space-time correlation of house price based on neural network to predict house price

Country Status (1)

Country Link
CN (1) CN112418939A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113344645A (en) * 2021-07-07 2021-09-03 中国工商银行股份有限公司 House price prediction method and device and electronic equipment
CN113627977A (en) * 2021-07-30 2021-11-09 北京航空航天大学 House value prediction method based on heteromorphic graph
CN116032359A (en) * 2022-12-27 2023-04-28 中国联合网络通信集团有限公司 Characteristic network data prediction method and system and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210226
