WO2008145798A2 - A system and a method for assessing objects - Google Patents

A system and a method for assessing objects

Publication number
WO2008145798A2
Authority
WO
WIPO (PCT)
Prior art keywords
model
value
data
vehicle
estimate
Prior art date
Application number
PCT/FI2007/000150
Other languages
French (fr)
Inventor
Mikael Teerilahti
Markus Halonen
Risto Teerilahti
Teppo Teerilahti
Samuli Teerilahti
Original Assignee
Grey-Hen Oy
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Grey-Hen Oy
Priority to PCT/FI2007/000150 (WO2008145798A2)
Priority to EP08709307A (EP2153393A4)
Priority to PCT/FI2008/050039 (WO2008145805A1)
Priority to US12/602,244 (US20100179861A1)
Publication of WO2008145798A2

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/067 Enterprise or organisation modelling
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G06Q30/0204 Market segmentation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0278 Product appraisal
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G06Q30/0601 Electronic shopping [e-shopping]

Definitions

  • the invention pertains to evaluating characteristics of various objects. Particularly the invention concerns provision of information relating to the estimated value of target object(s) on the basis of the evaluation results.
  • Reverting to the trade of pre-owned cars, for example, if second-hand products were as homogeneous as new ones, with only a few dynamic characteristics such as age affecting the running value, purely intellectual or manual pricing thereof would not necessarily cause insuperable problems.
  • Pre-owned cars are associated with a great number of price-affecting variables, the values of which can be either categorical (e.g. brand, model) or continuous (e.g. distance driven or "mileage", which despite the term can in the context of the invention be given in kilometres or other preferred units, and age, even if modelled as a mere "model year").
  • The seller of a more expensive product may offer additional services not provided by other sellers. Thus, in a market economy even homogeneous products differ in price; such a difference may be produced by factors that cannot be, or at least are not, parameterized, e.g. unknown and random factors (results of expert evaluations, etc.).
  • Some computerized sales services that are accessible through the Internet, for example, comprise a database wherein details such as the aforesaid variable values of the individual sales objects of multiple sellers are stored.
  • A user of the system may then execute searches in the database on the basis of available search terms that are defined via desired variable values and value ranges, for example.
  • the system then returns the sales objects that fulfill the conditions set by the user.
  • the user may manually investigate the retrieved price requests and try to figure out why these differ between each object, although each object has fulfilled the initial search criteria.
  • the user may also exploit this arrangement for pricing his own article he is willing to place for sale by calculating an average price from the returned data and adjusting the sales price of his article subjectively based on that, for example; if the user is of the opinion that his article is in exceptionally good condition, he may set the price above the average and vice versa.
  • this kind of pricing method is not accurate as the user has to, in practice, extrapolate and thus mentally model a running value for his article on the basis of known articles and their specifications and price requests.
  • This kind of mental modelling may not be possible, especially when there is a lot of data and many parameters.
  • the effect the individual variables eventually have in the prices is likewise very hard to estimate on the basis of such database arrangements, not forgetting the fact that some variables may have a dependency between them in so far as the sales value of the article is in question.
  • It is tempting to model the effect of distance driven in an oversimplified manner, e.g. as a straight line where distance driven is shown on one axis and price reduction on the other.
  • The territorial effect, i.e. where the object is geographically offered for sale, is typically omitted from modelling; all the data in the database (concerning e.g. a single country) is simply considered equally.
  • the objective is to provide an arrangement for assessing objects on a basis of a plurality of characterizing features thereof, and simultaneously alleviating at least some of the defects present in various prior art solutions reviewed above.
  • a computerized system for assessing vehicles comprises
  • means for calculating a value estimation model for vehicles on the basis of the received data, said model comprising a plurality of parameters to be determined and a plurality of explanatory variables associated with the parameters and defined on the basis of said characteristics, wherein at least one of said age and mileage is configured to affect the value estimate, the value estimate being a dependent variable in the model, and
  • a method for assessing vehicles comprises receiving data from a plurality of sources, said data representing characteristics of a plurality of vehicles, said data including indication of a value of the vehicle and of at least one characteristic selected from the group consisting of: age and mileage of the vehicle,
  • calculating a value estimation model for vehicles on the basis of the received data, said model comprising a plurality of parameters to be determined and a plurality of explanatory variables associated with the parameters and defined on the basis of said characteristics, wherein at least one of said age and mileage affects the value estimate, the value estimate being a dependent variable in the model,
  • mapping at least part of said number of predetermined characteristics of the target vehicle, via said value estimation model, to said value estimate.
  • age and mileage jointly affect the value estimate.
  • first and second storages refer to first and second entities such as databases, respectively.
  • first and second storages refer to physically and/or logically different memory locations of one or more memory entities.
  • said data also include indication of a location of each vehicle, whereby the location is configured to affect the value estimate, via the value estimation model, as an explanatory variable and/or as a part of a joint (interactive) variable defined by a plurality of characteristics and/or other preferred factors.
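  • As a minimal sketch of how age, mileage, a joint (interactive) age and mileage variable, and location could be turned into explanatory variables, consider the following. The function name, the region list and the rescaling of mileage are illustrative assumptions, not details taken from the description.

```python
# Hypothetical region categories (an assumption); the first one is the baseline.
REGIONS = ["north", "south", "west"]


def explanatory_variables(age_years, mileage_km, region):
    """Return the explanatory variable values for one vehicle (illustrative only)."""
    if region not in REGIONS:
        raise ValueError(f"unknown region: {region}")
    km_10k = mileage_km / 10_000.0  # mileage rescaled to units of 10 000 km
    row = [
        1.0,                 # constant term
        age_years,           # age on its own
        km_10k,              # mileage on its own
        age_years * km_10k,  # joint (interactive) age x mileage variable
    ]
    # One 0/1 indicator per non-baseline region, so location can shift the estimate.
    row += [1.0 if region == r else 0.0 for r in REGIONS[1:]]
    return row


print(explanatory_variables(4, 80_000, "south"))
```

  • Encoding location as indicator variables is only one option; a location could equally be combined with other characteristics into further joint variables.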
  • The present invention enables modelling the (value) effect of desired characteristics (e.g. distance driven or age) with different modelling accuracy. It may be computationally efficient to model certain characteristics in a coarser manner, i.e. concerning a larger group of vehicles (e.g. certain vehicle types of a certain vehicle model, certain brands, certain engine types, etc.) at a time, whereas some other characteristics shall be modelled more independently.
  • The current invention advantageously enables constructing only a single value estimation model via which value estimates of a plurality of different vehicle types, e.g. car types according to a predetermined classification system, can be modelled, instead of dividing the whole vehicle type population into more homogeneous subsets that are independently modelled via a plurality of models.
  • the present invention thus provides a function that maps various characteristics of a vehicle to a value estimate.
  • An available vehicle type register (a classification database of vehicles into homogeneous subsets according to predetermined rules) can be used to supplement and/or correct the value information, e.g. sales/asking price, provided by the rest of the sources in relation to the vehicles. It may happen that the sources, e.g. car dealers, provide information that is erroneous, inaccurate, or otherwise deficient. If such imperfect information is combined with the information present in the type register, data in the first storage can be verified upon input or afterwards. In that sense the input data to be stored in the first storage may initially reside fragmented in multiple locations.
  • the arrangement of the present invention provides reliable value estimations for target vehicles the types of which may be present in the available type register but for which real-life findings have been insufficient or nonexistent.
  • The present invention may be used to interpolate (or extrapolate, on certain occasions) the effect of various characteristics of the target vehicle on the value estimate based on the corresponding effect of the neighbouring types having more modelling data available.
  • Various embodiments of the invention gather data from a plurality of sources for estimating the parameters of a value estimation model for articles such as pre-owned vehicles; the resulting model is more precise than previous models because it takes e.g. the regional effect into account together with mileage and/or age. A further synergy benefit is obtained when the raw data gathered from the plurality of sources is translated, for example periodically, into the model capitalizing on e.g. the regional effect, so that the model can be kept detailed and accurate without adding to the response time upon receiving a service request from a user for a value estimate for a specific vehicle.
  • the resulting estimation result can be obtained relatively fast, e.g. in a few seconds.
  • The estimation can be determined on a monthly, weekly or daily basis for each object; for example, a retailer having a plurality of pre-owned cars in stock can enter details of the cars into the system and update the information in case of changes, whereupon the system automatically, e.g. once a week on the basis of updated sample data, or in response to an update request, (re)calculates the parameters of an empirical model describing the relationships between the predetermined dependent and explanatory variables thereof.
  • the user may relatively easily detect significant price deviations. For example, the user can notice whether the true asking price of a target vehicle is exceptionally high or low based on comparison with a purely mathematical value estimate obtained by the model. This phenomenon could be illustrated as an isolated point located far away from the corresponding expectation value surface as defined by the model. The expectation value surfaces provided by the model thus effectively filter out indiscriminate noise from the data.
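  • The comparison between a true asking price and the model estimate could be sketched as below. The 25 % threshold and all names are assumptions made for illustration; the description does not state how large a deviation counts as significant.

```python
def price_deviation(asking_price, estimated_price):
    """Relative deviation of the asking price from the model's value estimate."""
    return (asking_price - estimated_price) / estimated_price


def flag_deviations(listings, threshold=0.25):
    """Return the ids of listings priced far from the expectation value.

    listings: iterable of (listing_id, asking_price, estimated_price) tuples.
    threshold: assumed cut-off for an 'exceptionally high or low' price.
    """
    return [
        listing_id
        for listing_id, asking, estimate in listings
        if abs(price_deviation(asking, estimate)) > threshold
    ]


# Made-up example listings: (id, asking price, model estimate).
sample = [("a", 10_000, 10_500), ("b", 16_000, 11_000), ("c", 6_000, 9_000)]
print(flag_deviations(sample))
```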
  • The modelling solution of the invention is specifically tailored for estimating running value, e.g. asking or sales price, of pre-owned vehicles such as cars. Regional price differences are also taken into account in the model.
  • Fig. 1 discloses one embodiment of a system according to the invention.
  • Fig. 2 is one example of a feasible visualization of an expectation value surface relating to a market asking price of an MB C 220 CDI STW 5d type passenger car in mid-August 2003.
  • Fig. 3 is a flow diagram illustrating one embodiment of a modeling method in accordance with the present invention.
  • Fig. 4 is a flow diagram disclosing a more detailed view of the embodiment of the modeling method in accordance with the present invention.
  • Figure 1 visualizes an example 100 of a system for determining a value estimate, e.g. asking or running price, of various objects on a basis of available sample data.
  • the example shown relates to vehicles, but various embodiments may as well relate to other equivalent objects available for trade.
  • Reference numerals 101, 102 and 103 refer to input sources.
  • The input sources 101, 102 and 103 can be located remotely from a central first database 104.
  • The remote input sources 101, 102 and 103 input various data relating to the objects being characterised.
  • Dotted line 120 surrounds entities that are, in one embodiment, at least logically located together to form a server side of the system including first and second databases 104 and 105.
  • Such databases 104, 105 are managed and accessed via suitable interfacing 116 (e.g. wireless or wired data interface such as a network interface), memory 110 and data processing 114 (e.g. one or more processors, microcontrollers, DSPs, programmable logic chips, etc) means.
  • Local UI 112, e.g. a display
  • input, e.g. a keyboard, keypad, mouse
  • The remote input sources 101, 102 and 103 represent vehicle dealers such as car dealers. From a technical standpoint the illustrated remote input sources 101, 102, 103 may specifically refer to data systems of the aforesaid dealers enabled and/or configured to communicate over available communication means with the first database 104, which may thus be considered a central database collecting information from a number of remote entities.
  • The sources 101, 102, 103 can be located in various different geographic locations, even in different countries.
  • The sources 101, 102 and 103 may provide time-dependent (e.g. timed) updates of locally gathered data into the first database 104. Such data include vehicle specifications (type information, variable values) and pricing information available at the source systems.
  • The remote input sources 101, 102 and 103 provide and update the data of the first database 104 in a timed manner and/or upon a local/remote data transfer and/or update request, for example.
  • The remote input sources 101, 102 and 103 and the first database 104 are preferably coupled with each other over a network connection.
  • The Internet or some other IP-based connection can be used in coupling the remote input sources 101, 102 and 103 with the first database 104.
  • Wireless connections can likewise be exploited.
  • the first database 104 collects various data relating to the objects being characterized. Further embodiments of such data are provided hereinafter in connection with the description of the related functionalities of the system.
  • The first database 104 comprises so-called 'raw data' material received from the remote input sources 101, 102, and 103.
  • such raw data includes individual indications of various objects and their variable values, but as such the raw data do not yet determine a feasible model for reliably and freely estimating an expectation value of one or more random variables (dependent variables), e.g. a value estimate (running/sales/asking price) of a particular individual car, from a number of other random variables (explanatory variables), e.g. 'specs' (mileage, engine type, etc) of the car.
  • the first database 104 may be relatively large in terms of bit size in order to store enough data.
  • the system 100 also includes a second database 105.
  • the second database 105 is coupled with the first database 104. It should be noted that there can be two or more databases in the various systems and it is not vital how many physical databases there really are as long as in the logical sense both the raw data in the first database 104 and the second database 105 are provided. It is also noted that even a single database can alternatively be used as long as the two entities 104 and 105 can be logically or functionally, i.e. data-wise, separable from each other.
  • the second database 105 comprises modeling information, which may be represented e.g. via different expectation value surfaces as depicted in the figure via contour lines and a single target location on one of the surfaces in a form of a dot, as determined on the basis of the raw data.
  • the second database 105 may thus comprise data logically establishing at least one expectation value surface.
  • the expectation value surface is established from the raw data in relation to a selected variable by the underlying model.
  • various expectation value surfaces can be thus established.
  • the system 100 of Figure 1 preferably comprises an element (not shown) for establishing the expectation value surfaces.
  • the system 100 of Figure 1 has another element (not shown) for updating the second database 105 on a basis of the first database 104.
  • The system is substantially real-time functional, and said element may update the second database 105 by revising the underlying model, for example on an hourly, daily, weekly, or monthly basis. Therefore the second database 105 contains, in practical circumstances, near real-time information, and e.g. various expectation value surfaces can be constructed for characterising the objects. There can be various timed periods of updating in addition to non-timed (for example event-triggered) updates. In any case, the second database 105 can thus obtain an updated model with updated expectation value surface(s) on a timely basis.
  • The remote input sources 101, 102 and 103 can independently provide the first database 104 with various data updates or amendments, which affect the model.
  • the first database 104 can thereafter submit the updated data collectively or in parts to the second database 105, for example.
  • The system 100 can establish one or more expectation value surfaces on the basis of the collective update data and optionally store them in the second database 105. Characterization of the objects is thus easily obtainable.
  • The execution time needed for obtaining the result can thereby be reduced considerably, as model parameters (and thus the related expectation value surfaces) are not calculated from scratch whenever a party, hereinafter a "user" or a "client", wants to access the service offered by the system for assessing the objects.
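  • The separation between (re)calculating the model and answering queries could be sketched as follows. The class stands in for the second database 105; its name, its methods and the parameter values are hypothetical, and a log-linear model is assumed purely for illustration.

```python
import math


class ModelStore:
    """Stand-in for the second database: holds the latest stored parameter estimates."""

    def __init__(self):
        self._params = None
        self.version = 0

    def update(self, params):
        """Store freshly estimated parameters, e.g. after a periodic re-fit."""
        self._params = dict(params)
        self.version += 1

    def estimate(self, features):
        """Answer a query from the stored parameters only; no re-fitting happens here."""
        if self._params is None:
            raise RuntimeError("no model has been stored yet")
        log_price = self._params["intercept"] + sum(
            self._params[name] * value for name, value in features.items()
        )
        return math.exp(log_price)


store = ModelStore()
# Illustrative parameter values, not results from any real data set.
store.update({"intercept": math.log(20_000), "age": -0.1})
print(round(store.estimate({"age": 2})))
```

  • Because a query only evaluates the stored parameters, its cost does not grow with the amount of raw data held in the first database.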
  • a terminal 106 of the user, e.g. a computer, a mobile terminal, or a PDA
  • The terminal 106 can be coupled via the Internet, an IP-based connection or the like. A wireless connection can also be used.
  • the actual query can be performed from the terminal 106 as defined by the user.
  • the user can query desired characteristic(s) of objects by entering a corresponding query on the terminal 106 or some other apparatus providing the terminal 106 with such information.
  • The terminal sends the query towards the second database 105.
  • the query is carried out by accessing the model defining e.g. the one or more expectation value surfaces.
  • the result of the query is transmitted back to the terminal 106 from the system 120, optionally via a number of intermediate devices.
  • The model and the various expectation value surfaces it defines result in a good approximation of the characteristics, e.g. a value estimate, of the object of the query, ignoring false or misleading variable values while still advantageously fitting to the system/model space those variable values provided in the query that are not even directly present in the raw data of the first database 104; raw data elements are often just a small sample of the real-world situation and the underlying pattern, and some sort of interpolation is required to extend the model to the whole variable value space required for providing a comprehensive model.
  • the user wants to know a running asking price for a certain target car.
  • the user enters various variable values identifying the car like model year, mileage, type of the car (brand, model), equipment and accessories in a form of a query called herein a request.
  • the request can be accordingly transmitted to the system 120 and the second database 105 thereof.
  • A result is generated.
  • the result can thus be an estimate of an asking price or the like.
  • the effective processing and technical work is done by the advantageous utilization of the model that can be illustrated by the expectation value surface(s), for example.
  • the result is then returned to the terminal 106 the user of which may evaluate or set a (realistic) price request for the target car.
  • Erroneous information, e.g. wrong parameters of an object corresponding to wrong variable values in the corresponding mathematical model
  • Erroneous information in the first database 104 and the second database 105, due to e.g. typing errors introduced during the information input process, can be deleted, or its effect minimized or rendered non-influential, either prior to storing the corresponding erroneous data elements in the first database 104 or prior to defining the model of the second database 105.
  • Deletion or nullification of erroneous or insufficient information can be performed by fixed or adaptive elimination, wherein on the basis of a predetermined or adaptive elimination model (e.g. filtering rules) some input data may be classified as erroneous and eliminated, which can even take place prior to storing the filtered data in the first database 104.
  • suspiciously deviating data values or entities can be first detected and then deleted or ignored, thereby not affecting the model itself.
  • Information conversion procedures are preferably applied to make the received data elements uniform, if necessary. For example, if the mileage of vehicles included in the sample data is given either in miles or kilometres, depending on the data source, it is advantageous to unify the presentation prior to determining any empirical model therefrom.
  • Unit conversions may be executed via tables or mathematical formulae, for example.
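  • A formula-based conversion of mileage into a single unit could look like the sketch below. The function name and the accepted unit spellings are assumptions; the conversion factor is the standard definition of the international mile.

```python
KM_PER_MILE = 1.609344  # exact by definition of the international mile


def mileage_to_km(value, unit):
    """Convert a mileage reading to kilometres so that all sources use one unit."""
    unit = unit.strip().lower()
    if unit in ("km", "kilometre", "kilometres", "kilometer", "kilometers"):
        return float(value)
    if unit in ("mi", "mile", "miles"):
        return value * KM_PER_MILE
    raise ValueError(f"unsupported mileage unit: {unit!r}")


print(mileage_to_km(100, "miles"))
```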
  • The substantially electronic system 100, 120 comprising one or more physical elements such as computers or data storage units may have a processing unit comprising at least one microprocessor (or microcontroller, DSP, etc.), possibly a memory and software.
  • the processing unit controls, on the basis of the software, the operations of the elements and the databases, such as receiving, storing and deleting data relating to the objects, categorizing the object parameters and the objects, establishing and creating the model(s) and related one or more expectation value surfaces, updating them, controlling periodic updates, etc.
  • Various operations to be performed and means for carrying them out have been described in the examples.
  • the systems of various embodiments of the invention can be implemented by a computer system comprising or being connected to a network, for example the global Internet or the like.
  • Software, programming means, elements, etc can be used in performing the various operations and functions of the embodiments, examples of which have been described above.
  • middleware, programming logic, or circuit logic can be applied (not shown).
  • The software may be provided as a software product on a carrier such as a memory card, a floppy disc, a CD, a DVD, a hard disk, etc.
  • a designer may have a sort of conception of variables that are significant in finding a way to statistically describe interrelationships and causal connections between various random variables such as price (or "value"), brand, model, age, location, mileage, etc.
  • The conception may be initially generated on the basis of personal knowledge and experience, for example. The model is then defined by the selected variables and later updated when it is realized that the remaining modelling error is not purely due to random factors but also due to yet unknown variables, so-called lurking variables, which should thus be brought into the model in order to enhance its overall accuracy.
  • the systematic deviation can be illustrated or figured out by using a function such as a price function between the characteristic group and the price group.
  • the expectation value, e.g. an average or median price
  • the function can determine fairly accurately the association between the price and other parameters of the objects.
  • the logarithmic price is a dependent variable and the variation thereof can be depicted on a basis of factors such as the location and timing of the sale, i.e. explanatory variables.
  • the function including the dependent/explanatory variables and various parameters can be estimated in accordance with a regression model, for example. Regression analysis and resulting models are generally used to determine a first degree equation describing the empiric sample pairs of the variables with maximal fit.
  • the estimated model can be constructed as follows:
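  • A form consistent with the symbol definitions that follow is given here as a sketch rather than as the original formula; the intercept $\beta_0$ in particular is an assumption:

```latex
\log P_i = \beta_0 + \beta_1 x_{i1} + \dots + \beta_k x_{ik} + \varepsilon_i,
\qquad i = 1, \dots, n
```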
  • $\log P_i$ is the logarithmic price, the dependent variable.
  • Variables $x_{i1}, \dots, x_{ik}$ are thus explanatory.
  • Index $i$ refers to the number or index within the data, wherein there are $n$ observations.
  • Term $\varepsilon_i$ refers to a random error term, which is the part of the variation of the dependent variable left unexplained due to true random factors or the aforesaid lurking variables.
  • The $\beta$ terms are the parameters of the model to be estimated (not parameters/characteristics of the individual objects, i.e. variable values such as vehicle brand or mileage). Instead of the logarithmic version, other mappings (square root, cubic root, more complex functions, etc.) can also be used in the model as desired.
  • covariance models or mixed models can be utilized as estimation techniques.
  • The brand, type and technical features such as engine power, body type, drive type (front, rear, 4-wheel), fuel type, engine size, gearbox, equipment/accessories, etc., and, of course, the age of the car (which may also be indicated as "model year"), usage kilometres or miles (mileage), sale region, sale period, and many interactions of these and derivative variables can be used as explanatory variables, for example.
  • Erroneous information can be tracked down by threshold-comparison tests between input values and allowed values, or by comparing the input values with data available in the type register used (the type register may be locally available at the system 120 or be considered one of the input sources), for example.
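  • Both kinds of check could be sketched as follows. The allowed ranges, the field names and the single type register entry are invented for illustration and do not come from the description.

```python
# Assumed allowed value ranges per field (illustrative thresholds only).
ALLOWED_RANGES = {
    "age_years": (0, 40),
    "mileage_km": (0, 1_000_000),
    "price": (100, 500_000),
}

# Hypothetical type register: type id -> characteristics fixed for that type.
TYPE_REGISTER = {
    "type-001": {"fuel": "diesel", "body": "estate"},
}


def validate(record):
    """Return the names of fields that fail a threshold or type register check."""
    problems = []
    for field, (low, high) in ALLOWED_RANGES.items():
        value = record.get(field)
        if value is None or not (low <= value <= high):
            problems.append(field)
    reference = TYPE_REGISTER.get(record.get("type_id"))
    if reference is None:
        problems.append("type_id")
    else:
        for field, expected in reference.items():
            # Only fields the source actually supplied are compared.
            if field in record and record[field] != expected:
                problems.append(field)
    return problems


record = {"type_id": "type-001", "age_years": 4, "mileage_km": 9_000_000,
          "price": 15_000, "fuel": "petrol"}
print(validate(record))
```

  • A record with a non-empty result could be corrected from the type register, ignored, or held back before the model is estimated.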
  • The statistical model can be estimated by determining the parameters by which the squared error (sum) is minimized. Thus the result is the 'best fit' for the prices in the source data of the first database 104.
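  • As a minimal sketch of such an estimation, the following fits the logarithm of the price by ordinary least squares, solving the normal equations with plain Gaussian elimination. The function names, the added constant term and the synthetic data are assumptions; a production system would more likely rely on an established statistics library.

```python
import math


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: explanatory variables are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_log_price(rows, prices):
    """Least-squares fit of log(price); returns [constant, coefficient per variable]."""
    design = [[1.0] + list(r) for r in rows]  # prepend an assumed constant term
    y = [math.log(p) for p in prices]
    k = len(design[0])
    xtx = [[sum(x[i] * x[j] for x in design) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * yi for x, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Synthetic observations: (age in years, mileage in units of 10 000 km).
observations = [(1, 2), (2, 3), (3, 9), (5, 8), (8, 15), (4, 4)]
synthetic_prices = [math.exp(10 - 0.1 * a - 0.02 * km) for a, km in observations]
print(fit_log_price(observations, synthetic_prices))
```

  • Since the synthetic prices are generated without noise from known parameters, the fit should recover those parameters up to rounding, which makes the sketch easy to check.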
  • the estimate of the regression model can be calculated in accordance with the parameter estimates and variables as follows:
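  • Assuming the linear form suggested by the surrounding definitions, with an intercept estimate $\hat{\beta}_0$ added as an assumption, the estimate could be written as:

```latex
\widehat{\log P}_i = \hat{\beta}_0 + \hat{\beta}_1 x_{i1} + \dots + \hat{\beta}_k x_{ik}
```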
  • $\widehat{\log P}_i$ is a logarithmic estimate for the price of the car with running number $i$. $\hat{\beta}_1, \dots, \hat{\beta}_k$ are parameter estimates, thus estimates of the influence of various factors on the dependent variable, i.e. the log price.
  • Once the features (variable values for the formula) $x_{i1}, \dots, x_{ik}$ of the target car are obtained (model, type, year, kilometres, and various other interactive or derivative variables thereof), a precise and absolute expectation value with even several decimal accuracy can be advantageously calculated on the basis of the above formula.
  • An estimate of the price can be obtained from the log price on a basis of the exponential function, which is an inverse function of the logarithm simply as follows:
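  • Assuming the logarithm in the model is the natural logarithm, the back-transformation could be written as:

```latex
\hat{P}_i = \exp\!\left( \widehat{\log P}_i \right)
```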
  • the expectation value of the price can be determined more accurately than a single price.
  • the variance of single prices is greater than the one of the estimated average. For example, let us have a look at an example wherein the average height of 50 adult males is determined. By this sample, the average height of the whole population can be estimated. Such average height can be, on the basis of the sample, determined much more precisely than a height of a randomly selected male. Similarly, an expectation or average value for the price of a vehicle is easier to determine than just a single price on the market.
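  • The height example can be made concrete under the assumption of $n = 50$ independent observations with a common standard deviation $\sigma$; the sample mean $\bar{h}$ then varies far less than a single observation:

```latex
\operatorname{Var}(\bar{h}) = \frac{\sigma^2}{n},
\qquad
\operatorname{sd}(\bar{h}) = \frac{\sigma}{\sqrt{n}} = \frac{\sigma}{\sqrt{50}} \approx 0.14\,\sigma
```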
  • One important use of the model relates to predicting market prices of a certain car at a certain moment.
  • An increase in the number of explanatory variables also increases the explanatory degree of the model and furthermore decreases the error scattering. For example, if there are 10 000 observations and accordingly the same number of explanatory/descriptive variables, the correspondence is complete.
  • Such a model cannot, however, estimate, for example interpolate or extrapolate, values between or outside the existing material.
  • The model of various embodiments aims not at this kind of over-parameterization, but at advantageously obtaining results outside or between the known data values. Over-parameterization can be avoided, for example, by ignoring part of the available parameters during the estimation. Alternatively, a check for correspondence between price estimates and realized prices can be carried out.
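  • One way to check the correspondence between price estimates and realized prices is an error measure computed on observations that were not used in the estimation. The sketch below uses the mean absolute percentage error; the choice of measure, the function name and the sample figures are assumptions.

```python
def mean_abs_pct_error(estimates, realized):
    """Mean absolute percentage error between price estimates and realized prices."""
    if len(estimates) != len(realized) or not realized:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(e - r) / r for e, r in zip(estimates, realized)) / len(realized)


# Made-up held-out sample: model estimates versus prices actually realized.
held_out_estimates = [9_000, 11_000, 14_500]
held_out_realized = [10_000, 10_000, 15_000]
print(mean_abs_pct_error(held_out_estimates, held_out_realized))
```

  • An error that is much larger on held-out observations than on the data used for estimation would suggest that the model has too many parameters.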
  • Figure 2 depicts an expectation value surface (2-d projection thereof) for a chosen vehicle and for a chosen dependent variable, in this case an asking price of the car, in function of the combination of age and mileage.
  • the example relating a certain car type, which can be selected by adjusting preferred values for e.g. categorizing variables defining the type as desired, is evidently rather homogeneous because the technical characteristics of the object are the same. Therefore, foreseeable price deviation within the same type is due to the age, usage kilometres and accessories/equipment only.
  • On a basis of a formula, price estimates can be depicted for a desired car type with the variance of age and mileage, for example, which then converts into a representation of three-dimensional expectation value surface(s).
  • Figure 3 discloses a flow diagram of one embodiment of a method in accordance with the present invention.
  • the system 120 receives and stores data relating a plurality of objects such as vehicles on the basis of which the value estimating model is constructed.
  • the value estimation model is created as described in this text, for example periodically or upon explicit instruction to create/update the model.
  • the model is stored, for example parametrically.
  • Step 308 refers to a situation wherein a user of the system provides a request identifying characteristics of an object for which he wants to receive a value estimate.
  • the value can be represented as an asking or sales price, for example, wherein the selection of the representation method most logically corresponds to the value information given in relation to the objects forming the sample database.
  • the characteristics of the target object are input to the model and a value estimate is obtained as an output.
  • Figure 4 discloses a more detailed view of the above method in accordance with the invention.
  • the example of figure 4 discloses, in addition to system 120 tasks, also measures taken by the data sources 101, 102, 103 and user 106.
  • the remote input sources, e.g. car dealers 101, 102, and 103 in the context of pre-owned cars' trade, first locally acquire data about their stock and valuation (e.g. asking prices) situation. Such information is nowadays typically readily available in computerized accounting/sales systems.
  • the sources 101, 102, 103 transmit the data towards the service system 120 that receives the data 404, executes optional filtering and data conversion procedures thereto, and stores 406 at least the filtered/converted data in a first database 104.
  • step 408 the system 120 is either triggered by a local timer or an external entity to create/update the value estimation model.
  • step 410 the model is created/updated and in step 412 stored in a second database 105.
  • step 414 the user 106 generates a query and transmits a corresponding value estimation request to the system 120, said request including necessary details for running the model.
  • the system 120 receives the request and optionally verifies its validity and/or user's 106 entitlement to utilize the system 120.
  • the system 120 calculates the value estimate in step 416 by the model that applies e.g. age, location, and mileage information as explained herein.
  • step 418 the value estimate is transmitted to the user 106 either in a default form (e.g. SMS/MMS message, e-mail, predetermined database format, PDF, HTML) or in a supported alternative format preferred by the user 106 and indicated by the request or e.g. user-dependent service registration information available at the system 120.
  • the value estimate is outputted at the user 106, e.g. via a display.
  • inventive system 100, 120 of the various embodiments does not necessarily determine or even visualize the value estimate graphically.
  • Various embodiments still advantageously utilize a parametric (e.g. numerical) representation of the model that may be visualized via expectation value surface(s), for example.
  • analyzing the surfaces one can easily and quickly inspect the value development in relation to other changing variable values.
  • the calculation of the average for each object such as vehicle/car type with various combinations of age-mileage values can be avoided. In practice this kind of estimation would be near impossible to perform as there are e.g. about 20 000 different types of cars (depending on the type register used and local market; e.g. one 'type' may be defined via a plurality of values of categorizing variables) and accordingly, a huge number of variable value combinations per each type. Thereby a true average etc could be calculated only from a small portion of the whole car pool.
  • the estimation methods of the various embodiments disclosed herein advantageously solve or at least alleviate such problem.
  • calculations are not necessarily performed independently for each car type concerning e.g. the effect the mileage has on the value estimate.
  • the estimation can be performed more generally by examining which characteristics together with mileage shift the value estimate.
  • various embodiments determine different mileage effect factors for groups of cars instead of single cars (or car types).
  • Various embodiments can utilize e.g. different brands and fuels as a distinguishing factor. For example, the effect of mileage is smaller with diesel cars or with cars having a larger engine. As another example, between different brands also the effect of mileage changes.
  • the geometry of the expectation value surface depends on the characteristics of the vehicle. It is formed in accordance with the data input so that a group of expectation value surfaces results in good fit with real values, thereby also being able to extrapolate the near future trends.
  • Regional effect can be taken into account in a variety of ways.
  • the regionality is included in the model as a categorizing variable.
  • the regionality is implemented via regional correction factors.
  • Log-effect 0.1 relating to a certain area compares with 10.5% higher price on that area in contrast to the overall average price.
  • the regional differences in vehicle valuation, i.e. pricing are static between regions. This is not always the case as the regional effect may be itself dependent on various other variables such as age and/or distance driven. Therefore, in another embodiment the regional effect can be modelled with increased precision, yet not leading to a situation where a fully independent pricing model would be determined for each region, because the latter technique would require enormous amount of processing and sample data to avoid results becoming stochastic; there would be no necessary number of observations available relating to certain car brands and models for smaller regions (single retailer, etc).
  • the regression model described hereinbefore contained an error term (see formula 4) that should represent white noise in case the residual error has no further explanatory variables hidden therein. Nevertheless, it was found that in most cases by suitable interaction between regional information and car properties the error could be minimized further, i.e. the residual was not pure white noise.
  • two knots may be located at regional percentile points corresponding to 15 and 75 per cent in such a manner that Age_p15(a) relates to a region a so that 15% of the cars in that region are younger than Age_p15 indicates.
  • Age_p75(a) indicates the age below which 75% of the cars of the region fall, i.e. 75% of the cars are younger.
  • Each region preferably has its own knots.
  • age-related spline variables can be defined as follows:
  • Age1P2 = (Age > Age_p15) * (Age - Age_p15)^2. (7)
  • the above variable gets value 0 as long as the age ≤ Age_p15 and for other values of the age the variable corresponds to the age exceeding the knot value to the power of two.
  • the curve has a linear start but at point wherein 15% of cars are younger, a second degree term is added thereto.
  • the interactions between region and age play no (or at least no major) role, which is also intuitively logical as new cars are often rather homogeneously priced. Accordingly, the second spline variable can be defined:
  • Age2P2 = (Age > Age_p75) * (Age - Age_p75)^2. (8)
  • the resulting curve would first start as a straight line, then convert into a first second-order polynomial approximation at the first knot and further to a second second-order polynomial approximation at the second knot.
  • the resulting curve is thus continuous and differentiable also at knots.
  • spline variables (p15 and optionally also p75) for mileage can be defined.
  • the number of parameters to be estimated could be calculated: 2*18+G*7 (9)
  • Multipliers 2 and G result from the time period split into two parts and the number of parameters to be estimated per greater region, respectively.
  • One example of a regression function for estimating, in relation to a region a, deviation from the overall average price level is formed as: c_A * Age1p2_i + d_A * Age2p2_i
  • ⁇ a therefore describes how the difference between region a price level and the average has changed from a period to another.
  • the value of G would be 6 (b, c, d, e, f, and g), but in one alternative embodiment, a mileage junction point could be removed, resulting in G valued as 5, without necessarily heavily affecting the modelling accuracy, which would then lead to a reduced number of 71 parameters in total according to formula 9.
  • Measuring price levels separately for the last period ensures that the log-difference between the prediction result and prices from the sample space has a zero average for the last 30 days, i.e. the estimates for the log-price are unbiased in relation to the data samples of last 30 days. For estimating other parameters, data from a longer period can be used to obtain stable results.
  • a regional price function can be defined as:
  • logPA(X_ai) = logP(X_ai) + logA(X_ai) (11)
  • the car is located in a region a, which resides within a greater region A.
  • logP(X_ai) is the overall price estimate covering all the regions whereas logA(X_ai) is a regional correction factor.
  • Another option is to include, as mentioned hereinbefore, the regional variables in the actual price model. This way determining the overall price estimate gets a bit more complicated as parameter estimates shall be determined in relation to each particular region.
  • the regional effect may be limited in connection with specific variable values, e.g. particularly high mileage and age values, because flexible function forms utilized in the invention may behave unexpectedly in border areas wherein the amount of sample data is not adequate, e.g. in the context of particularly old and/or much driven cars.
  • the limitation can be introduced by applying a certain age variable value Age_P98 for cars that are older than 98% of all cars in the sample data. For mileage, a corresponding limitation can be set.
  • Such limits are preferably determined for each region separately as age distributions may differ significantly between different regions. For example, in more wealthy regions the cars may be younger and/or less driven than elsewhere.
  • Data input sources include e.g. a selected type register and price information (e.g. asking prices).
  • the type register can be used to validate/match the obtained price information with detailed type information so that the input data for the model is complete and correct.
  • the type register classifies the cars into homogenous subsets and includes information such as car brand, model, generation, etc.
  • Commercial type registers are available and produced by CAP and JATO, for example. Within one type the cars are somewhat similar and differ possibly in minor issues such as color, equipment, etc. By utilizing a detailed type register the quality of other input data can be improved instead of purely exploiting data manually typed in by a plurality of car salesmen and thus possibly including defective information.
  • Variables of the model can be then defined as follows:
  • Such a model could be of covariance type including a plurality of categorizing variables (bolded) and also several continuous variables (e.g. Age, Mileage, and Enginesize), for example. Further, the model may include derivative variables defining a combined effect of several variables, see e.g. Fuel*Mileage and Enginesize*Mileage in the above variable list.
  • the price estimate would equal the price average of the cars of the same model generation in the available input data.
  • more variables, such as mileage, are added to the model, they will explain the price variation within each model generation; e.g. when analyzing two cars with 50 000 km and 200 000 km on the clock, the expectation value for the price of the car with the latter mileage would be lower.
  • a plurality of parameter estimates for the price equation is obtained and stored (e.g. in a computer-readable file).
  • Parameters can be then pre-calculated for each type in the type register, provided that there's enough data available for such estimation (e.g. data related to few vehicles only do not generally enable reliable estimation).
  • the types in the type register are modelled price-wise via a parametric mapping that converts e.g. age and mileage into a price estimate.
  • an expectation value surface of the price is obtained for each type.
  • Variable values that vary within a type and per each car entity, e.g. age and mileage cannot be fixed beforehand.
  • Variables such as model generation, body category (e.g. coupe, sedan, SUV, wagon), gearbox (manual, automatic, etc) and fuel (petrol, diesel) remain constant within a type definition of this example; therefore they can be determined at an early stage for future use.
  • the parameter of the Modelgeneration variable is, as calculated on the basis of available data, set at 9.5 and the parameter of the Enginesize variable is set at 0.0003.
  • Age variable has a parameter value (i.e. multiplier) -0.08
  • Mileage variable has a parameter value -0.00095
  • Diesel*Mileage has a parameter value +0.00025
  • the price of the car will decrease 9.5% per 100 000 mileage units (e.g. kilometres), except for diesel-engined cars for which the Diesel*Mileage correction factor will finetune the reduction to 7%.
  • an expectation value surface for a car belonging to the above model generation and having 1. ⁇ litre petrol engine can be described parameter-wise as:
  • a price multiplier file can be formed, including a separate parametric description (in relation to age, mileage, and price coordinate system, for example) for each car type. If modelling data does not include enough examples of a certain type, the parameters can be interpolated/extrapolated from the parameters of neighbouring (according to predetermined criteria) types.
  • the selected parametric mappings can be further refined by transforming linear relationships (e.g. between age and mileage) into non-linear ones by utilizing splines, polynomial approximation, etc, and by utilizing several parameters/variables instead of one for desired properties.
  • the users of the system may transmit requests including information (e.g. plate number) required for defining the target car. Based on the obtained means, the system may figure out the exact type of the car by accessing the type register, for example. If the type cannot be defined accurately enough by the given information, the user may be asked for additional information for type recognition purposes.
  • the expectation value surfaces have been predetermined for different car types, real-time calculations can be kept to a minimum upon receipt of the inquiry, and a proper location, as determined by target car entity-specific variables such as age, mileage, etc, on a corresponding type-specific expectation value surface can be found quickly.
  • the system may perform a plurality of simultaneous queries with reduced load. Age, mileage, equipment and sales region belong to the variables that are typically specific to each individual car entity, not to each type; therefore their effect on the price cannot be pre-determined on a type level.
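The parametric mapping described above can be illustrated with a short sketch. It is not the implementation of the invention: it merely combines the example parameter values quoted above (9.5, 0.0003, -0.08, -0.00095 and +0.00025) with the truncated quadratic spline variables of formulas 7 and 8 and the additive regional correction of formula 11. The function names, the units (engine size in cubic centimetres, mileage in thousands of kilometres, chosen so that 100 000 km lowers the log-price by 0.095) and the regional coefficients are assumptions made for illustration only.

```python
import math


def spline_term(value, knot):
    """Truncated quadratic of formulas 7 and 8: 0 up to the knot,
    (value - knot)**2 beyond it."""
    return (value - knot) ** 2 if value > knot else 0.0


def log_price(age, mileage, engine_size, diesel):
    """Log-price for the example model generation.

    Assumed units (not stated in the text): age in years, mileage in
    thousands of km, engine_size in cubic centimetres.
    """
    lp = 9.5                      # Modelgeneration parameter
    lp += 0.0003 * engine_size    # Enginesize parameter
    lp += -0.08 * age             # Age parameter
    lp += -0.00095 * mileage      # Mileage parameter
    if diesel:
        lp += 0.00025 * mileage   # Diesel*Mileage interaction
    return lp


def regional_log_correction(age, knot15, knot75, level, c, d):
    """Regional deviation logA: a level shift plus the two age spline
    terms. level, c and d are hypothetical coefficients."""
    return level + c * spline_term(age, knot15) + d * spline_term(age, knot75)


def price_estimate(log_value):
    """Invert the logarithm to obtain a price estimate."""
    return math.exp(log_value)


# Petrol car, 5 years old, 100 000 km, 1800 cc, in a region whose
# knots are at 4 and 9 years (hypothetical values).
overall = log_price(age=5, mileage=100, engine_size=1800, diesel=False)
regional = overall + regional_log_correction(5, 4, 9, 0.1, -0.002, 0.001)
print(round(price_estimate(overall)), round(price_estimate(regional)))
```

With a constant regional log-effect of 0.1 and no spline contribution, the ratio between the regional and the overall estimate is exp(0.1), i.e. about 1.105, which matches the roughly 10.5% regional premium mentioned above.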

Description

A system and a method for assessing objects
FIELD OF THE INVENTION
Generally the invention pertains to evaluating characteristics of various objects. Particularly the invention concerns provision of information relating to the estimated value of target object(s) on the basis of the evaluation results.
BACKGROUND OF THE INVENTION
Traditionally, business-wise successful trade of second-hand commodities such as pre-owned vehicles (cars, motorbikes, etc) has been a business of highly-trained and experienced individuals, i.e. "experts", who have assessed the running price (sales/purchase) of the products by mainly relying on their cultivated perception and instinct, i.e. mental arithmetic. Additionally, some relatively simple statistical tools such as calculating a price average from the available data including sales catalogues and various advertisements have been utilized as a pricing basis in the decision-making process.
Reverting to the trade of pre-owned cars, for example, if second-hand products were as homogenous as new ones with only a few dynamic characteristics such as age having an effect on the running value, purely intellectual or manual pricing thereof would not necessarily cause insuperable problems. However, in reality the pre-owned cars are associated with a great number of price-affecting variables, the values of which can be either of categorizing (e.g. brand, model) or continuous (distance driven or "mileage" (despite the term, it can in the context of the invention be given in kilometres or other preferred alternative units), and age, even if modeled as mere "model year") type, for example.
One could estimate the overall number of different types of cars (an exemplary nationwide car type register may contain tens of thousands of different type approvals, for example), which is in most market areas quite considerable. Further, various variables in addition to the type information may indicate the presence of different equipment such as air-con or alloy wheels, engine size, maximum horsepower/torque, consumption figures, trunk space, number of doors and/or seats, etc. Thus, one may fairly state that each article on sale is a unique entity, keeping also the distance driven and age in mind. These factors result in scenarios wherein a salesman or a buyer seldom has enough comparison data available for estimating the value of an object in order to either price the object or evaluate the asking price, respectively. Especially from the standpoint of the buyer, as the objects available for sale are not fully homogenous, it is hard to assess the validity of price requests, i.e. which objects are actually cheap and which are more expensive. In addition to objective specifications reflected by the above variable values, there are a number of factors that are hard or practically impossible to represent parametrically, said factors including the overall condition of the object (e.g. in relation to cars, condition of paint work, windshield, dampers, and springs, number and nature of dents, interior wear, engine leaks, etc). In case there are two parametrically quite comparable objects available for sale but the prices differ, one could legitimately expect that the more pricey one is either in better shape or the seller just wants more money for it. Alternatively, the seller of a more expensive product may offer additional services not provided by other sellers. Thus, in a market economy even homogenous products differ in price; such difference may be produced by factors that cannot be or at least, are not, parameterized, e.g. unknown and random factors (results of expert evaluations, etc).
Some computerized sales services that are accessible through the Internet, for example, comprise a database wherein details such as the aforesaid variable values of the individual sales objects of multiple sellers are stored. A user of the system may then execute searches in the database on the basis of available search terms that are defined via desired variable values and value ranges, for example. The system then returns the sales objects that fulfill the conditions set by the user. The user may manually investigate the retrieved price requests and try to figure out why these differ between objects, although each object has fulfilled the initial search criteria. The user may also exploit this arrangement for pricing his own article he is willing to place for sale by calculating an average price from the returned data and adjusting the sales price of his article subjectively based on that, for example; if the user is of the opinion that his article is in exceptionally good condition, he may set the price above the average and vice versa. However, this kind of pricing method is not accurate as the user has to, in practice, extrapolate and thus mentally model a running value for his article on the basis of known articles and their specifications and price requests. This kind of mental modelling may not be possible, especially when there is a lot of data and parameters. The effect the individual variables eventually have on the prices is likewise very hard to estimate on the basis of such database arrangements, not forgetting the fact that some variables may have a dependency between them in so far as the sales value of the article is in question.
Particularly in the context of vehicle trade, the effect of distance driven is alluring to model in an oversimplified manner, e.g. as a straight line where distance driven is shown on one axis and price reduction on another. Further, the territorial effect, i.e. where the object is offered for sale geographically, is typically omitted from modeling; all the data in the database (concerning e.g. a single country) is just equally considered.
Yet, in computerized solutions where the amount of available source data is massive, e.g. a nationwide database, the tendency is to cut most of the data straight away from slowing down the real-time analysis while determining a price estimate for a certain type of an object. This may certainly reduce the processing requirements but at the same time the precision of results may decrease as a major part of data initially available for analysis is filtered out. For example, if the user is solely interested in purchasing, selling or maybe just assessing one particular car model ("model" defined according to the utilized type/model system, for example) with certain age and kilometer specs, the system may simply filter out the data not fulfilling the criteria (i.e. falling within the desired parameter range) and then determine basic characteristics such as average or standard deviation from the asking prices of the remaining database objects. To the contrary, expert evaluation may actually offer more accurate results than oversimplified computerized modeling, however at the cost of processing delay, which may in the case of a single expert or an expert council extend over several days.
One more issue frequently popping up in the context of computerized solutions is the presence of erroneous information (e.g. wrong parameters) in the database information due to e.g. typing errors introduced during the information input process. If this erroneous information is taken into account upon determining indications of a presumed value of a certain-type object and the total number of objects used in the evaluation is low, the erroneous information may dramatically reduce the precision of the assessment.
Notwithstanding the existence of tools for assisting a product dealer or a buyer with object value evaluations, the current computerized solutions are either extremely simple and cursory, omitting many factors in reality affecting the value, or exhaustively based on personal knowledge and thus bias of certain experts, which make the analysis generally inaccurate and in the latter case, also time-consuming.
SUMMARY OF THE INVENTION
The objective is to provide an arrangement for assessing objects on a basis of a plurality of characterizing features thereof, and simultaneously alleviating at least some of the defects present in various prior art solutions reviewed above.
According to an aspect of the invention a computerized system for assessing vehicles comprises
means for receiving data from a plurality of sources, said data representing characteristics of a plurality of vehicles, said data including indication of a value of the vehicle and of at least one characteristic selected from the group consisting of: age and mileage of the vehicle,
means for storing said received data,
means for receiving a request for a value estimate of a target vehicle having a number of predetermined characteristics,
means for producing a value estimate for the target vehicle on the basis of the data, characterized by
means for calculating a value estimation model for vehicles on the basis of the received data, said model comprising a plurality of parameters to be determined and a plurality of explanatory variables associated with the parameters and defined on the basis of said characteristics, wherein at least one of said age and mileage is configured to affect the value estimate, the value estimate being a dependent variable in the model, and
means for storing at least part of the value estimation model including said plurality of parameters modeled, whereby, upon receiving the request said means for producing the value estimate is configured to map at least part of said number of predetermined characteristics of the target vehicle, via said value estimation model, to said value estimate.
According to another aspect of the invention a method for assessing vehicles comprises receiving data from a plurality of sources, said data representing characteristics of a plurality of vehicles, said data including indication of a value of the vehicle and of at least one characteristic selected from the group consisting of: age and mileage of the vehicle,
storing said received data in a first storage,
receiving a request for a value estimate of a target vehicle having a number of predetermined characteristics, and producing a value estimate for the target vehicle on the basis of the data, characterized by
calculating a value estimation model for vehicles on the basis of the received data, said model comprising a plurality of parameters to be determined and a plurality of explanatory variables associated with the parameters and defined on the basis of said characteristics, wherein at least one of said age and mileage affect the value estimate, the value estimate being a dependent variable in the model,
storing at least part of the value estimation model including said plurality of parameters modelled in a second storage, and, upon receiving the request,
mapping at least part of said number of predetermined characteristics of the target vehicle, via said value estimation model, to said value estimate.
In an embodiment, age and mileage jointly affect the value estimate.
In a further embodiment, the first and second storages refer to first and second entities such as databases, respectively. In another embodiment, the first and second storages refer to physically and/or logically different memory locations of one or more memory entities.
In one embodiment, said data also include indication of a location of each vehicle, whereby the location is configured to affect the value estimate, via the value estimation model, as an explanatory variable and/or as a part of a joint (interactive) variable defined by a plurality of characteristics and/or other preferred factors.
The present invention enables modeling the (value) effect of desired characteristics (e.g. distance driven or age) with different modelling accuracy. It may be computationally efficient to model certain characteristics in a more coarse manner, i.e. concerning a larger group of vehicles (e.g. certain vehicle types of a certain vehicle model, certain brands, certain engine types, etc) at a time, whereas some other characteristics shall be modelled more independently.
Further, the current invention advantageously enables constructing only a single value estimation model via which value estimates of a plurality of different vehicle types, e.g. car types according to a predetermined classification system, can be modelled instead of dividing the whole vehicle type population into more homogenous subsets that are independently modelled via a plurality of models.
The present invention thus provides a function that maps various characteristics of a vehicle to a value estimate.
In one embodiment of the invention, an available vehicle type register (classification database of vehicles into homogenous subsets according to predetermined rules) can be used to supplement and/or correct the value information, e.g. sales/asking price, provided by the rest of the sources in relation to the vehicles. It may happen that the sources, e.g. car dealers, provide information that is either erroneous, inaccurate, or otherwise deficient. If such imperfect information is combined with the information present in the type register, data in the first storage can be verified upon input or afterwards. In that sense the input data to be stored in the first storage may initially reside as fragmented in multiple locations.
Still, in one embodiment the arrangement of the present invention provides reliable value estimations for target vehicles the types of which may be present in the available type register but for which real-life findings have been insufficient or nonexistent. The present invention may be used to interpolate (or extrapolate, in certain occasions) the effect of various characteristics of the target vehicle in the value estimate based on the corresponding effect of the neighbouring types having more modelling data available.
Various embodiments of the invention, which gather data from a plurality of sources for estimating parameters of a value estimation model for articles such as pre-owned vehicles, are more precise than previous models due to taking e.g. the regional effect into account together with mileage and/or age. A further synergy benefit is obtained when the raw data gathered from the plurality of sources is translated, for example periodically, into the model capitalizing on e.g. the regional effect so that the model can be kept detailed and accurate without adding to the response time upon receiving a service request from a user for obtaining a value estimate for a specific vehicle. In various further embodiments, the resulting estimation result can be obtained relatively fast, e.g. in a few seconds. This is mainly due to the predetermined or precalculated model parameters that enable utilizing and/or depicting the model via expectation value surfaces in relation to the model variables. The estimation can be determined on a monthly, weekly or daily basis for each object; for example, a retailer having a plurality of pre-owned cars in stock can enter details of the cars into the system and update the information in case of changes, whereupon the system automatically, e.g. once a week on the basis of updated sample data, or in response to an update request, (re)calculates the parameters of an empirical model describing the relationships between the predetermined dependent and explanatory variables thereof.
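The periodic (re)calculation of model parameters can be pictured with a minimal sketch. The estimation technique is assumed here to be ordinary least squares on log prices with a single explanatory variable (age); the text does not prescribe this particular procedure, the function name is hypothetical, and the sample data is synthetic, generated from known parameters purely so that the fit can be checked.

```python
import math


def fit_log_linear(ages, prices):
    """Fit log(price) = intercept + slope * age by ordinary least squares.

    Single-variable closed form; a real model would have many more
    explanatory variables.
    """
    logs = [math.log(p) for p in prices]
    n = len(ages)
    mean_x = sum(ages) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in ages)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ages, logs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Synthetic sample: prices generated from intercept 9.5 and slope -0.08.
ages = [1, 3, 5, 8]
prices = [math.exp(9.5 - 0.08 * a) for a in ages]

intercept, slope = fit_log_linear(ages, prices)
print(intercept, slope)
```

Because the parameters are stored once fitted, answering a later request only requires evaluating the stored formula, which is what keeps the response time short.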
Further, in various embodiments of the invention the user may relatively easily detect significant price deviations. For example, the user can notice whether the true asking price of a target vehicle is exceptionally high or low based on comparison with a purely mathematical value estimate obtained by the model. This phenomenon could be illustrated as an isolated point located far away from the corresponding expectation value surface as defined by the model. The expectation value surfaces provided by the model thus effectively filter out indiscriminate noise from the data.
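The comparison between a true asking price and the model estimate can be sketched as follows. The log-space comparison and the tolerance of 0.15 are assumptions chosen for illustration; the text does not specify how large a deviation counts as exceptional, and the function names are hypothetical.

```python
import math


def log_deviation(asking_price, estimate):
    """Distance of the asking price from the expectation value surface,
    measured on the log-price scale."""
    return math.log(asking_price) - math.log(estimate)


def classify_asking_price(asking_price, estimate, tolerance=0.15):
    """Label an asking price relative to the model estimate.

    tolerance is a hypothetical threshold on the log-price difference.
    """
    deviation = log_deviation(asking_price, estimate)
    if deviation > tolerance:
        return "high"
    if deviation < -tolerance:
        return "low"
    return "typical"


print(classify_asking_price(12000, 10000))
print(classify_asking_price(10500, 10000))
print(classify_asking_price(8000, 10000))
```

A point labelled "high" or "low" corresponds to the isolated point located far away from the expectation value surface described above.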
In a certain embodiment of the invention the modelling solution is specifically tailored for estimating the running value, e.g. asking or sales price, of pre-owned vehicles such as cars. Regional price differences are also taken into account in the model.
BRIEF DESCRIPTION OF THE RELATED DRAWINGS
Below, the embodiments of the invention are described in more detail with reference to the attached drawings in which:
Fig. 1 discloses one embodiment of a system according to the invention.
Fig. 2 is one example of a feasible visualization of an expectation value surface relating to a market asking price of an MB C 220 CDI STW 5d type passenger car in mid-August 2003.
Fig. 3 is a flow diagram illustrating one embodiment of a modeling method in accordance with the present invention.
Fig. 4 is a flow diagram disclosing a more detailed view of the embodiment of the modeling method in accordance with the present invention.
DETAILED DESCRIPTION OF THE EMBODIMENTS
Figure 1 visualizes an example 100 of a system for determining a value estimate, e.g. asking or running price, of various objects on a basis of available sample data. The example shown relates to vehicles, but various embodiments may as well relate to other equivalent objects available for trade. Reference numerals 101, 102 and 103 refer to input sources. The input sources 101, 102 and 103 can be located remotely from a central first database 104. The remote input sources 101, 102 and 103 input various data relating to the objects being characterised. Dotted line 120 surrounds entities that are, in one embodiment, at least logically located together to form a server side of the system including first and second databases 104 and 105. Such databases 104, 105 are managed and accessed via suitable interfacing 116 (e.g. wireless or wired data interface such as a network interface), memory 110 and data processing 114 (e.g. one or more processors, microcontrollers, DSPs, programmable logic chips, etc) means. Local UI 112 (e.g. a display) and input 118 (e.g. a keyboard, keypad, mouse) means may also be available for enabling local control of the system.
In an embodiment, the remote input sources 101 , 102 and 103 represent vehicle dealers such as car dealers. From a technical standpoint the illustrated remote input sources 101 , 102, 103 may specifically refer to data systems of the aforesaid dealers enabled and/or configured to communicate over available communication means with the first database 104 that may be thus considered as a central database collecting information from a number of remote entities. The sources 101 , 102, 103 can be located in various different geographic locations, even in different countries. The sources 101 , 102 and 103 may provide time dependent (e.g. timed) updates of locally gathered data into the first database 104. Such data include vehicle specifications (type information, variable values) and pricing information available at the source systems.
Referring back to embodiments of Figure 1 , the remote input sources 101 , 102 and 103 provide and update the data of the first database 104 in a timed manner and/or upon local/remote data transfer and/or update request, for example. The remote input sources 101 , 102 and 103 and the first database 104 are preferably coupled with each other over a network connection. For example, the Internet or some other IP based connection can be used in coupling the remote input sources 101 , 102 and 103 with the first database 104. Wireless connections can likewise be exploited. The first database 104 collects various data relating to the objects being characterized. Further embodiments of such data are provided hereinafter in connection with the description of the related functionalities of the system.
In an embodiment the first database 104 comprises so-called 'raw data' material received from the remote input sources 101 , 102, and 103. In various further embodiments such raw data includes individual indications of various objects and their variable values, but as such the raw data do not yet determine a feasible model for reliably and freely estimating an expectation value of one or more random variables (dependent variables), e.g. a value estimate (running/sales/asking price) of a particular individual car, from a number of other random variables (explanatory variables), e.g. 'specs' (mileage, engine type, etc) of the car. In vehicle industry and trade the number of variables affecting the sales value is intuitively so huge that it is relatively difficult to evaluate or characterise a particular object purely on a basis of the raw data. However, the raw data still advantageously comprise a lot of information relating to and characterising the objects, although it is not a trivial task to retrieve a proper model out therefrom. The first database 104 may be relatively large in terms of bit size in order to store enough data.
Yet, referring to embodiments of Figure 1 , the system 100 also includes a second database 105. The second database 105 is coupled with the first database 104. It should be noted that there can be two or more databases in the various systems and it is not vital how many physical databases there really are as long as in the logical sense both the raw data in the first database 104 and the second database 105 are provided. It is also noted that even a single database can alternatively be used as long as the two entities 104 and 105 can be logically or functionally, i.e. data-wise, separable from each other.
The second database 105 comprises modeling information, which may be represented e.g. via different expectation value surfaces as depicted in the figure via contour lines and a single target location on one of the surfaces in a form of a dot, as determined on the basis of the raw data. The second database 105 may thus comprise data logically establishing at least one expectation value surface. The expectation value surface is established from the raw data in relation to a selected variable by the underlying model. In various further embodiments various expectation value surfaces can be thus established. As previously described the expectation value surfaces determined by the model to be reviewed hereinafter are advantageously utilized for characterising the objects. The system 100 of Figure 1 preferably comprises an element (not shown) for establishing the expectation value surfaces.
Further, the system 100 of Figure 1 has another element (not shown) for updating the second database 105 on a basis of the first database 104. Advantageously in a further embodiment the system is substantially real-time functional and said element may update the second database 105 by revising the underlying model for example on an hourly, daily, weekly, or monthly basis. Therefore the second database 105 contains in practical circumstances near real-time information and e.g. various expectation value surfaces can be constructed for characterising the objects. There can be various timed periods of updating in addition to non-timed (for example event-triggered) updates. In any case, the second database 105 can thus obtain an updated model with updated expectation value surface(s) on a timely basis. This is advantageous because the remote input sources 101, 102 and 103 can independently provide the first database 104 with various data updates or amendments, which affect the model. The first database 104 can thereafter submit the updated data collectively or in parts to the second database 105, for example. The system 100 can establish one or more expectation value surfaces on a basis of the collective update data and optionally store them in the second database 105. Characterization of the objects is thus easily obtainable. Advantageously, the execution time needed for obtaining the result can thereby be reduced considerably as model parameters (and thus related expectation value surfaces) are not calculated from scratch whenever a party, hereinafter a "user" or a "client", wants to access the service offered by the system for assessing the objects.
Still with reference to the embodiments of Figure 1 , a terminal 106 of the user, e.g. a computer, a mobile terminal, or a PDA, is functionally connected to the second database 105. The terminal 106 can be coupled by the Internet or IP based connection or the like. Also a wireless connection can be used. The actual query can be performed from the terminal 106 as defined by the user. The user can query desired characteristic(s) of objects by entering a corresponding query on the terminal 106 or some other apparatus providing the terminal 106 with such information. The terminal sends and provides the query towards the second database 105. The query is carried out by accessing the model defining e.g. the one or more expectation value surfaces. The result of the query is transmitted back to the terminal 106 from the system 120, optionally via a number of intermediate devices. The model and various expectation value surfaces the model defines result in a good approximate of the characteristics, e.g. a value estimate, of the object of the query, ignoring the false or misleading variable values while still advantageously fitting those variable values provided in the query to the system/model space that are not even directly presented in the raw data of the first database 104; raw data elements are often just a small sample of the real world situation and the underlying pattern, and some sort of interpolation is required to extend the model to the whole variable value space required for providing a comprehensive model.
For example, in one scenario the user wants to know a running asking price for a certain target car. The user enters various variable values identifying the car, like model year, mileage, type of the car (brand, model), equipment and accessories, in a form of a query called herein a request. The request can be accordingly transmitted to the system 120 and the second database 105 thereof. On the basis of the model, a query result is generated. The result can thus be an estimate of an asking price or the like. The effective processing and technical work is done by the advantageous utilization of the model that can be illustrated by the expectation value surface(s), for example. The result is then returned to the terminal 106, the user of which may evaluate or set a (realistic) price request for the target car.
Erroneous information (e.g. wrong parameters of an object corresponding to wrong variable values in the corresponding mathematical model) in the first database 104 and the second database 105, due to e.g. typing errors introduced during the information input process, can be deleted, or its effect minimized or rendered non-influential, either prior to storing the corresponding erroneous data elements in the first database 104 or prior to defining the model of the second database 105.
In further embodiments, deletion or nullification of erroneous or insufficient information can be performed by fixed or adaptive elimination, wherein on a basis of a predetermined or adaptive elimination model (e.g. filtering rules) some input data may be classified as erroneous and eliminated, which can even take place prior to storing the filtered data in the first database 104. Alternatively or additionally, while determining or calculating the model, suspiciously deviating data values or entities can be first detected and then deleted or ignored, thereby not affecting the model itself. Correspondingly, upon receipt or, at the latest, during actual creation of the model, information conversion procedures are preferably applied to harmonize the received data elements, if necessary. For example, if the mileage of vehicles included in the sample data is given either in miles or kilometres, depending on the data source, it is advantageous to unify the presentation prior to determining any empirical model therefrom. Unit conversions may be executed via tables or mathematical formulae, for example.
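The unit conversion mentioned above can be sketched with a simple formula-based helper. The record layout, field names and the function `mileage_in_km` are assumptions made for illustration; the description does not prescribe any particular data format.

```python
KM_PER_MILE = 1.609344  # definition of the international mile in kilometres

def mileage_in_km(value: float, unit: str) -> float:
    """Convert a mileage reading to kilometres; reject unknown units."""
    unit = unit.strip().lower()
    if unit in ("km", "kilometres", "kilometers"):
        return float(value)
    if unit in ("mi", "miles"):
        return value * KM_PER_MILE
    raise ValueError(f"unknown mileage unit: {unit!r}")

# Hypothetical records from two sources that report in different units.
records = [
    {"id": 1, "mileage": 100000, "unit": "km"},
    {"id": 2, "mileage": 62137.1, "unit": "miles"},
]
normalized = [
    {"id": r["id"], "mileage_km": mileage_in_km(r["mileage"], r["unit"])}
    for r in records
]
```

Raising an error for an unrecognized unit, rather than guessing, keeps ambiguous records out of the sample until they have been checked.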
Concerning the general implementation, the substantially electronic system 100, 120 comprising one or more physical elements such as computers or data storage units may have a processing unit comprising at least one microprocessor (or microcontroller, DSP, etc) , possibly a memory and software. The processing unit controls, on the basis of the software, the operations of the elements and the databases, such as receiving, storing and deleting data relating to the objects, categorizing the object parameters and the objects, establishing and creating the model(s) and related one or more expectation value surfaces, updating them, controlling periodic updates, etc. Various operations to be performed and means for carrying out them have been described in the examples. Generally, the systems of various embodiments of the invention can be implemented by a computer system comprising or being connected to a network, for example the global Internet or the like. Software, programming means, elements, etc can be used in performing the various operations and functions of the embodiments, examples of which have been described above. Alternatively, middleware, programming logic, or circuit logic can be applied (not shown). The software may be provided as a software product on a carrier such as a memory card, a floppy disc, a cd, a dvd, hard disk, etc.
As to the establishment of the actual model describing the market sample obtained via the input sources 101, 102, and 103, deviations in the market price of sales objects such as vehicles (cars, motorcycles, lorries, trucks, tractors, snowmobiles, ATVs, etc) are based on systematic and random factors. Systematic deviation arises from variations in the vehicle characteristics, sales period, and sales region, for example. Leftover deviation in the price of basically homogeneous objects is caused by e.g. different pricing principles and desired profit margins of different parties, i.e. typically random factors from the viewpoint of a third party constructing a value estimation model.
At the time of creating a model from scratch a designer may have some conception of the variables that are significant in finding a way to statistically describe interrelationships and causal connections between various random variables such as price (or "value"), brand, model, age, location, mileage, etc. The conception may be initially generated on the basis of personal knowledge and experience, for example. The model is then defined by the selected variables and later on updated when it is realized that the remaining modelling error is not purely due to random factors but also due to yet unknown variables, so-called lurking variables, that should thus be brought into the model in order to enhance the overall accuracy thereof.
Accordingly, the systematic deviation can be illustrated or figured out by using a function such as a price function between the characteristic group and the price group. The expectation value (eg. an average or median price) is generally a function of characteristics, such as features, type, accessories, sales moment and/or period, sales area or sales location, etc. The function can determine fairly accurately the association between the price and other parameters of the objects. The logarithmic price is a dependent variable and the variation thereof can be depicted on a basis of factors such as the location and timing of the sale, i.e. explanatory variables. The function including the dependent/explanatory variables and various parameters can be estimated in accordance with a regression model, for example. Regression analysis and resulting models are generally used to determine a first degree equation describing the empiric sample pairs of the variables with maximal fit.
The estimated model can be constructed as follows:
$$\log P_i = \alpha + \sum_{k=1}^{K} \beta_k x_{ik} + \varepsilon_i = \alpha + \beta_1 x_{i1} + \dots + \beta_K x_{iK} + \varepsilon_i, \quad i = 1, \dots, n, \tag{1}$$
wherein log P_i is a logarithmic price, the dependent variable. Variables x_i1, ..., x_iK are thus explanatory. Index i refers to the number or index within the data, wherein there are n observations. Term ε_i refers to a random error term, which is the part left unexplained in the variation of the dependent variable due to true random factors or the aforesaid lurking variables. The β's are parameters of the model to be estimated (not parameters/characteristics of the individual objects, i.e. variable values such as vehicle brand or mileage). Instead of the logarithmic version, also other mappings (square root, cubic root, more complex functions, etc) can be used in the model as desired. Correspondingly, instead of, or in addition to, a basic regression model, also covariance models or mixed models can be utilized as estimation techniques. The brand, type and technical features such as engine power, body type, drive type (front, rear, 4-wheel), fuel type, engine size, gearbox, equipment/accessories, etc, and, of course, the age of the car (which may also be indicated as "model year"), usage kilometres or miles (mileage), sale region, sale period, and many interactions of these and derivative variables can be used as explanatory variables, for example.
Generally the input data contain a small amount of erroneous information, as mentioned hereinbefore, wherein the age, mileage, type of the car or e.g. price is erroneous. The aim is to reduce such erroneous data to some extent beforehand because they can disadvantageously skew the price expectation value surfaces and the model itself. Erroneous information can be tracked down by threshold comparison type tests between input values and allowed values, or by comparing the input values with data available in the type register used (the type register may be locally available at the system 120 or be considered as one of the input sources), for example.
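A threshold comparison test of the kind mentioned above might look like the following sketch. The allowed ranges, field names and the helper `find_errors` are purely illustrative assumptions; in practice the limits would come from e.g. a type register or agreed filtering rules.

```python
# Hypothetical allowed ranges per field: (lowest, highest) accepted value.
ALLOWED = {
    "age_years": (0, 40),
    "mileage_km": (0, 1_000_000),
    "price": (100, 500_000),
}

def find_errors(record: dict, allowed: dict = ALLOWED) -> list:
    """Return the names of fields that are missing or outside their range."""
    bad = []
    for field, (low, high) in allowed.items():
        value = record.get(field)
        if value is None or not (low <= value <= high):
            bad.append(field)
    return bad

sample = [
    {"age_years": 4, "mileage_km": 85000, "price": 18900},
    {"age_years": 4, "mileage_km": 8500000, "price": 18900},  # likely typing error
]
clean = [r for r in sample if not find_errors(r)]
```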
It should be noted that a deviation between actual prices and the estimates does not indicate the presence of an error in the value estimation model. Identical products may always be priced differently by different sellers. Thus the main reason for the differences is the dispersion of the price of homogeneous products among different parties. The individual prices establish a distribution around the average expectation value of the price.
The statistical model can be estimated by defining the parameters by which the sum of squared errors is minimized. Thus the result is a 'best fit' for the prices in the source data of the first database 104. When the parameters have been estimated, the estimate of the regression model can be calculated in accordance with the parameter estimates and variables as follows:
$$\widehat{\log P_i} = \hat{\alpha} + \sum_{k=1}^{K} \hat{\beta}_k x_{ik}, \quad i = 1, \dots, n. \tag{2}$$
In formula (2), the left-hand side is the logarithmic estimate for the price of the car with running number i, and β_1, ..., β_K denote the parameter estimates, i.e. estimates of the influence of various factors on the dependent variable, the log price. When the features (variable values for the formula) x_i1, ..., x_iK of the target car are obtained (model, type, year, kilometres, and various interactions or derivative variables thereof), a precise and absolute expectation value with even several decimal accuracy can be advantageously calculated on a basis of the above formula. An estimate of the price can be obtained from the log price by means of the exponential function, which is the inverse function of the logarithm, simply as follows:
$$\text{Price estimate}_i = \exp\!\left(\widehat{\log P_i}\right) = e^{\widehat{\log P_i}}. \tag{3}$$
The above result is a 'point-estimate' of the price. In practice due to the price deviation of the homogenous products such an estimate is not perfectly precise, but the estimate has distribution and variance. If the prices deterministically followed a certain formula, there would not be any uncertainty. However, this is not the case in real life, for example due to the market economy, as precisely identical cars may have totally different prices as explained hereinbefore. A 'standard error of prediction' depicts how great the uncertainty in determining the expectation value of the price is.
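The estimation and the point estimate of formulae (1) to (3) can be sketched with an ordinary least squares fit. The sample below is synthetic, with only age and mileage as explanatory variables; all numeric values, variable names and the helper `price_estimate` are assumptions for illustration, not data or code from the described system.

```python
import numpy as np

# Synthetic sample: age in years, mileage in units of 10 000 km.
rng = np.random.default_rng(0)
n = 200
age = rng.uniform(0, 10, n)
km = rng.uniform(0, 25, n)
# Assumed "true" relationship plus a random error term, as in formula (1).
log_price = 10.0 - 0.08 * age - 0.02 * km + rng.normal(0, 0.05, n)

# Least squares: choose the parameters minimizing the sum of squared errors.
X = np.column_stack([np.ones(n), age, km])
coef, *_ = np.linalg.lstsq(X, log_price, rcond=None)
alpha_hat, beta_hat = coef[0], coef[1:]

def price_estimate(age_years: float, km_10k: float) -> float:
    """Point estimate: formula (2) followed by the exponential of formula (3)."""
    log_hat = alpha_hat + beta_hat @ np.array([age_years, km_10k])
    return float(np.exp(log_hat))

estimate = price_estimate(4.0, 8.0)
```

Once `alpha_hat` and `beta_hat` are stored, answering a request only requires the few arithmetic operations in `price_estimate`, which is in line with the short response times described for the precalculated model.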
The expectation value of the price can be determined more accurately than a single price. Thus the variance of single prices is greater than the one of the estimated average. For example, let us have a look at an example wherein the average height of 50 adult males is determined. By this sample, the average height of the whole population can be estimated. Such average height can be, on the basis of the sample, determined much more precisely than a height of a randomly selected male. Similarly, an expectation or average value for the price of a vehicle is easier to determine than just a single price on the market.
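The height example can be made concrete with the standard result for the mean of n independent observations. The standard deviation of 7 cm used below is an assumed figure chosen only for illustration; it is not stated in the description.

```latex
% sigma: standard deviation of an individual height (assumed 7 cm)
% n:     sample size (50 adult males, as in the example above)
\operatorname{sd}(\bar{h}) = \frac{\sigma}{\sqrt{n}}
  = \frac{7\ \text{cm}}{\sqrt{50}} \approx 0.99\ \text{cm}
```

Under this assumption the average is known to within about one centimetre, whereas the height of a single randomly selected male still varies with the full 7 cm standard deviation.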
The estimates produced above (formulae 2 and 3) are based on the presumption of unbiasedness:
$$E[\varepsilon_i] = 0, \tag{4}$$
which also indicates that the price estimates produced by the model are on average the same as the "market prices" in the sample. If categorizing variables have been utilized in the model, the estimates for each category in the corresponding variable correspond to the prices found in that category.
One important use of the model relates to predicting market prices of a certain car at a certain moment.
Generally an increase in the number of explanatory variables also increases the explanatory degree of the model and furthermore decreases the error scattering. For example, if there are 10 000 observations and accordingly the same number of explanatory/descriptive variables, the correspondence is complete. Such a model cannot, however, estimate, for example interpolate or extrapolate, values between or outside the existing material. Therefore the model of various embodiments aims not at this kind of over-parameterization but at advantageously obtaining results outside or between the known data values. Over-parameterization can be avoided, for example, by ignoring part of the available parameters during the estimation. Alternatively a check for a correspondence between price estimates and realized prices can be carried out.
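The effect of over-parameterization can be demonstrated on a small scale. In the following sketch, which uses purely random synthetic data and hypothetical names, a model with as many parameters as observations reproduces the sample exactly even though there is nothing to learn, while a small model leaves a residual.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10
y = rng.normal(size=n)  # pure noise: there is no real pattern to model

X_full = rng.normal(size=(n, n))  # as many parameters as observations
X_small = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])

def residual_sum_of_squares(X: np.ndarray, y: np.ndarray) -> float:
    """Sum of squared residuals of the least squares fit of y on X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return float(np.sum((y - X @ coef) ** 2))

rss_full = residual_sum_of_squares(X_full, y)    # essentially zero
rss_small = residual_sum_of_squares(X_small, y)  # clearly positive
```

The complete correspondence of the full model says nothing about values between or outside the sample, which is why a vanishing in-sample error is a warning sign rather than a merit.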
One representation of the estimate produced by the system can be seen in Figure 2. Figure 2 depicts an expectation value surface (a 2-d projection thereof) for a chosen vehicle and for a chosen dependent variable, in this case an asking price of the car, as a function of the combination of age and mileage. The example relates to a certain car type, which can be selected by adjusting preferred values for e.g. the categorizing variables defining the type as desired, and is evidently rather homogeneous because the technical characteristics of the object are the same. Therefore, foreseeable price deviation within the same type is due to the age, usage kilometres and accessories/equipment only. On a basis of a formula
$$\hat{y}_i = \hat{\alpha} + \sum_{k=1}^{K} \hat{\beta}_k x_{ik}, \quad i = 1, \dots, n, \tag{5}$$
price estimates can be depicted for a desired car type with varying age and mileage, for example, which then converts into a representation of three-dimensional expectation value surface(s).
In Figure 2 estimates for an asking price of a specific car type "MB C 220 CDI STW 5d" have been illustrated in a form of an expectation value surface (the situation corresponds to a year 2003 scenario in Finland). The expectation value surface can be traversed and desired points picked (indicating a certain mileage and age) for a quick asking price evaluation. The expectation value surfaces shall be reviewed in more detail hereinafter.
Figure 3 discloses a flow diagram of one embodiment of a method in accordance with the present invention. In step 302 the system 120 receives and stores data relating to a plurality of objects such as vehicles, on the basis of which the value estimation model is constructed. In step 304 the value estimation model is created as described in this text, for example periodically or upon explicit instruction to create/update the model. In step 306 the model is stored, for example parametrically. Step 308 refers to a situation wherein a user of the system provides a request identifying characteristics of an object for which he wants to receive a value estimate. The value can be represented as an asking or sales price, for example, wherein the selection of the representation method most logically corresponds to the value information given in relation to the objects forming the sample database. The characteristics of the target object are input to the model and a value estimate is obtained as an output.
Figure 4 discloses a more detailed view of the above method in accordance with the invention. The example of Figure 4 discloses, in addition to the tasks of the system 120, also measures taken by the data sources 101, 102, 103 and the user 106. In step 402 the remote input sources, e.g. car dealers 101, 102, and 103 in the context of the pre-owned car trade, first locally acquire data about their stock and valuation (e.g. asking prices) situation. Such information is nowadays typically readily available in computerized accounting/sales systems. Then the sources 101, 102, 103 transmit the data towards the service system 120, which receives the data 404, executes optional filtering and data conversion procedures thereon, and stores 406 at least the filtered/converted data in a first database 104. In step 408 the system 120 is triggered either by a local timer or by an external entity to create/update the value estimation model. In step 410 the model is created/updated and in step 412 stored in a second database 105. In step 414 the user 106 generates a query and transmits a corresponding value estimation request to the system 120, said request including the necessary details for running the model. The system 120 receives the request and optionally verifies its validity and/or the entitlement of the user 106 to utilize the system 120. Next, the system 120 calculates the value estimate in step 416 by the model that applies e.g. age, location, and mileage information as explained herein. In step 418 the value estimate is transmitted to the user 106 either in a default form (e.g. SMS/MMS message, e-mail, predetermined database format, PDF, HTML) or in a supported alternative format preferred by the user 106 and indicated by the request or by e.g. user-dependent service registration information available at the system 120. In step 420 the value estimate is output to the user 106, e.g. via a display.
It should be noted that the inventive system 100, 120 of the various embodiments does not necessarily determine or even visualize the value estimate graphically. Various embodiments still advantageously utilize a parametric (e.g. numerical) representation of the model that may be visualized via expectation value surface(s), for example. By analyzing the surfaces one can easily and quickly inspect the value development in relation to other changing variable values. However, the calculation of the average for each object such as vehicle/car type with various combinations of age-mileage values can be avoided. In practice this kind of estimation would be near impossible to perform as there are e.g. about 20 000 different types of cars (depending on the type register used and local market; e.g. one 'type' may be defined via a plurality of values of categorizing variables) and accordingly, a huge number of variable value combinations per each type. Thereby a true average etc could be calculated only from a small portion of the whole car pool. The estimation methods of the various embodiments disclosed herein advantageously solve or at least alleviate such problem.
Namely, in various embodiments calculations are not necessarily performed independently for each car type concerning e.g. the effect the mileage has on the value estimate. Advantageously the estimation can be performed more generally by examining which characteristics together with mileage shift the value estimate.
Thus various embodiments determine different mileage effect factors for groups of cars instead of single cars (or car types). Various embodiments can utilize e.g. different brands and fuels as a distinguishing factor. For example, the effect of mileage is smaller with diesel cars or with cars having a larger engine. As another example, between different brands also the effect of mileage changes.
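A group-specific mileage effect can be expressed in a regression as an interaction between mileage and a group indicator. The sketch below uses synthetic data in which the mileage effect is assumed to be smaller for diesel cars; the coefficients, variable names and the choice of fuel type as the grouping factor are illustrative assumptions only.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 400
km = rng.uniform(0, 25, n)       # mileage in units of 10 000 km
diesel = rng.integers(0, 2, n)   # 1 = diesel, 0 = petrol

# Assumed truth: mileage lowers the log price less for diesel cars.
log_price = (10.0 - 0.030 * km + 0.010 * km * diesel + 0.05 * diesel
             + rng.normal(0, 0.03, n))

# The column km * diesel lets the mileage slope differ between the groups.
X = np.column_stack([np.ones(n), km, diesel, km * diesel])
coef, *_ = np.linalg.lstsq(X, log_price, rcond=None)

slope_petrol = coef[1]
slope_diesel = coef[1] + coef[3]
```

Only one extra parameter per group is estimated here, instead of a separate mileage effect for every single car type.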
Thus the geometry of the expectation value surface (e.g. steepness with respect to age and mileage) depends on the characteristics of the vehicle. It is formed in accordance with the data input so that a group of expectation value surfaces results in good fit with real values, thereby also being able to extrapolate the near future trends.
Regional effect can be taken into account in a variety of ways. In one embodiment the regionality is included in the model as a categorizing variable. In another embodiment the regionality is implemented via regional correction factors.
Accordingly, estimates for comparing average price levels between different regions can be obtained whilst the other factors of the applied model remain unaltered.
The percentage effect and the log-difference have the following relationship:
Percentual effect = 100*exp(Log-effect)-100. (6)
Thus e.g. a Log-effect of 0.1 relating to a certain area corresponds to a 10.5% higher price in that area in contrast to the overall average price. In this embodiment it is assumed that notwithstanding the car type and specifications the regional differences in vehicle valuation, i.e. pricing, are static between regions. This is not always the case as the regional effect may itself be dependent on various other variables such as age and/or distance driven. Therefore, in another embodiment the regional effect can be modelled with increased precision, yet not leading to a situation where a fully independent pricing model would be determined for each region, because the latter technique would require an enormous amount of processing and sample data to avoid results becoming stochastic; the necessary number of observations relating to certain car brands and models would not be available for smaller regions (single retailer, etc).
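Formula (6) and the numerical example above can be checked with a few lines of code. The function names are hypothetical; the inverse function is added here only for convenience and follows directly from formula (6).

```python
import math

def percentage_effect(log_effect: float) -> float:
    """Formula (6): percentage price difference implied by a log-effect."""
    return 100.0 * math.exp(log_effect) - 100.0

def log_effect_from_percentage(percentage: float) -> float:
    """Inverse of formula (6)."""
    return math.log(1.0 + percentage / 100.0)

effect = percentage_effect(0.1)  # about 10.5 per cent
```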
It would, in many cases, be advantageous to model regional price differences utilizing a technique that still enhances the modelling result while requiring fewer parameters to be estimated than a considerable number of fully independent models.
The regression model described hereinbefore contained an error term (see formula 4) that should represent white noise in case the residual error has no further explanatory variables hidden therein. Nevertheless, it was found that in most cases by suitable interaction between regional information and car properties the error could be minimized further, i.e. the residual was not pure white noise.
Although the exemplary regression model was linear (in relation to the input parameters), there is no reason to presume that the interactions between age, mileage and region, for example, are linear. Instead, a piecewise polynomial approximation, a so-called spline function, changing at selected junction points often called knots, can be exploited.
E.g. two knots may be located at regional percentile points corresponding to 15 and 75 per cent in such a manner that Age_p15(a) relates to a region a so that 15% of the cars in that region are younger than Age_p15 indicates. Respectively, Age_p75(a) indicates the age so that 75% of the cars of the region are younger. Each region preferably has its own knots.
Now, age-related spline variables can be defined as follows:
$$\mathrm{Age1P2} = (\mathrm{Age} > \mathrm{Age\_p15}) \cdot (\mathrm{Age} - \mathrm{Age\_p15})^2. \tag{7}$$
The above variable gets value 0 as long as the age does not exceed Age_p15, and for other values of the age the variable corresponds to the amount by which the age exceeds the knot value, raised to the power of two. The curve has a linear start, but at the point at which 15% of the cars are younger, a second-degree term is added thereto. Thus for young cars the interaction between region and age plays no (or at least no major) role, which is also intuitively logical as new cars are often rather homogeneously priced. Accordingly, the second spline variable can be defined:
$$\mathrm{Age2P2} = (\mathrm{Age} > \mathrm{Age\_p75}) \cdot (\mathrm{Age} - \mathrm{Age\_p75})^2. \tag{8}$$
Thus the resulting curve would first start as a straight line, then convert into a first second-order polynomial approximation at the first knot and further into a second second-order polynomial approximation at the second knot. The resulting curve is thus continuous and differentiable also at the knots.
Correspondingly, spline variables (p15 and optionally also p75) for mileage can be defined.
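The spline variables described above can be sketched, purely as an illustration, as follows. The nearest-rank percentile convention, the function names and the sample ages are assumptions made for the example; the disclosure does not prescribe a particular percentile method.

```python
def percentile(values, pct):
    """Nearest-rank percentile: the smallest value with at least pct% of the data at or below it."""
    ordered = sorted(values)
    rank = max(1, (pct * len(ordered) + 99) // 100)  # integer ceiling of pct * n / 100
    return ordered[rank - 1]

def spline_term(x, knot):
    """Truncated quadratic term: 0 up to the knot, (x - knot)^2 beyond it."""
    return float((x - knot) ** 2) if x > knot else 0.0

def age_spline_features(age, knot_p15, knot_p75):
    """Linear age term plus the two spline variables of formulas 7 and 8."""
    return {
        "Age": float(age),
        "Age1P2": spline_term(age, knot_p15),
        "Age2P2": spline_term(age, knot_p75),
    }

# Hypothetical ages (in years) of the cars observed in one region.
region_ages = list(range(1, 21))
knot_p15 = percentile(region_ages, 15)
knot_p75 = percentile(region_ages, 75)

# A young car gets no spline contribution; an old car gets both terms.
young = age_spline_features(2, knot_p15, knot_p75)
old = age_spline_features(18, knot_p15, knot_p75)
```

Because each term and its first derivative are zero at the knot, adding the terms keeps the overall curve continuous and differentiable there, as stated above. Mileage spline variables would be built the same way from mileage knots.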
The regional distribution of data is not always uniform as described above; there is only a small amount of data available for certain regions, which would lead to somewhat stochastic analysis results. One way to overcome this is to model an individual price level for each region but estimate the interactions between age, region, and distance driven in a coarser manner, for each greater region. The number of parameters to be estimated (and the required processing) is thus reduced, and there will be enough data to enable modelling with reasonable precision.
For example, if there are 18 regions and 7 greater regions, the number of parameters to be estimated could be calculated as: 2*18+G*7 (9)
which is deduced from the more generic formula 2*O+G*P, wherein O is the number of regions and P is the number of greater regions. The multipliers 2 and G result from the time period being split into two parts and from the number of parameters to be estimated per greater region, respectively.
One example of a regression function for estimating, in relation to a region a, the deviation from the overall average price level is formed as:
[Figure imgf000022_0001: left-hand side and leading terms of formula (10), present only as an image in the source]
$$\ldots\; c_A \cdot \mathrm{Age1p2}_i + \sum d_A \cdot \mathrm{Age2p2}_i + \sum e_A \cdot \mathrm{Km}_i + \sum f_A \cdot \mathrm{KM1p2}_i + \sum g_A \cdot \mathrm{KM2p2}_i + \varepsilon_i \qquad (10)$$
In the above formula, i corresponds to a sample number, a indicates the region in question, A indicates the greater region in question, P refers to the number of greater regions, and AD_ai is a regional indicator variable that equals 1 if sample i is in region a and 0 otherwise. Old is an indicator variable that equals 1 if the sample is older than, for example, 30 days or another desired period. The data used for the function advantageously covers, for example, a 60-day period. α_a alone describes how much the price level of region a diverges from the overall average, at zero age and mileage, during the last 30 days. α_a + β_a, for its part, tells how much the price level of region a diverged from the overall average earlier, in this particular example 31-60 days ago. β_a therefore describes how the difference between the price level of region a and the average has changed from one period to the next. In this example the value of G would be 6 (b, c, d, e, f, and g), but in an alternative embodiment a mileage junction point could be removed, giving G = 5 without necessarily heavily affecting the modelling accuracy; this would reduce the total to 71 parameters according to formula 9.
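As a simple check of formula 9, the parameter count for the two values of G discussed above can be computed as in the following sketch. The function and argument names are hypothetical.

```python
def parameter_count(regions: int, greater_regions: int, per_greater_region: int, periods: int = 2) -> int:
    """Formula 9: one price-level parameter per region and period, plus G parameters per greater region."""
    return periods * regions + per_greater_region * greater_regions

print(parameter_count(18, 7, 6))  # 78
print(parameter_count(18, 7, 5))  # 71
```

For comparison, 18 fully independent regional models with the same G + 2 parameters each would already require more parameters than the shared greater-region structure.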
Measuring price levels separately for the last period, e.g. aforesaid 30 days, ensures that the log-difference between the prediction result and prices from the sample space has a zero average for the last 30 days, i.e. the estimates for the log-price are unbiased in relation to the data samples of last 30 days. For estimating other parameters, data from a longer period can be used to obtain stable results.
When analyzing a certain market and e.g. average asking prices, a proper result can be obtained by concentrating on the cars priced during a predetermined period, e.g. 30 days. Alternatively, one could analyze the cars that were for sale, though not necessarily priced, during that period. Such a method is, in reality, more or less biased, because cheaper cars are typically sold faster than more expensive ones, which thus remain for sale longer. As a result, cars with a high price tag end up weighing more heavily in the value estimation, which raises the estimated average prices.
The regional correction may be either integrated into the overall price estimation model (equations 1 and 2) or used for explaining the corresponding residual, the "random" error. The remaining question is how to correct the price estimate to obtain unbiased results in relation to each region and the varying feature combinations of the cars for sale.
A regional price function can be defined as:
logPA (Xai) = logP(Xai) + logA(Xai) (11)
wherein logPA(Xai) is a regional, logarithmic price estimate for a car i having a feature vector X=(x1,x2,...,xp). The car is located in a region a, which resides within a greater region A. logP(Xai) is the overall price estimate covering all the regions, whereas logA(Xai) is a regional correction factor.
Another option is to include, as mentioned hereinbefore, the regional variables in the actual price model. This way determining the overall price estimate gets a bit more complicated as parameter estimates shall be determined in relation to each particular region.
One may further consider what the regional correction factor function should be like to produce unbiased regional price estimates in each area and for each car type/sub-group.
As yet a further modelling feature, the regional effect may be limited in connection with specific variable values, e.g. particularly high mileage and age values, because the flexible function forms utilized in the invention may behave unexpectedly in border areas wherein the amount of sample data is not adequate, e.g. in the context of particularly old and/or heavily driven cars. The limitation can be introduced by applying a certain age variable value Age_P98 for cars that are older than 98% of all cars in the sample data. For mileage, a corresponding limitation can be set. Such limits are preferably determined for each region separately, as age distributions may differ significantly between regions. For example, in wealthier regions the cars may be younger and/or less driven than elsewhere.
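The regional price function of formula 11, together with the limitation of age and mileage at the 98th percentile, may be sketched as follows. All coefficient and limit values are hypothetical, and a linear correction is used here only for brevity in place of the spline-based function described above.

```python
import math

# Hypothetical per-region limits (98th percentiles) and correction coefficients.
REGION_LIMITS = {"age_p98": 20.0, "mileage_p98": 350.0}
REGION_COEFFS = {"level": 0.05, "age": -0.002, "mileage": -0.0001}

def regional_log_correction(age, mileage, coeffs, limits):
    """logA(X): regional correction with age and mileage capped at the regional limits."""
    capped_age = min(age, limits["age_p98"])
    capped_mileage = min(mileage, limits["mileage_p98"])
    return coeffs["level"] + coeffs["age"] * capped_age + coeffs["mileage"] * capped_mileage

def regional_price(overall_log_price, age, mileage, coeffs, limits):
    """Formula 11: logPA(X) = logP(X) + logA(X), converted back to price units."""
    correction = regional_log_correction(age, mileage, coeffs, limits)
    return math.exp(overall_log_price + correction)

# Beyond the limits the correction no longer changes, so sparse data at the
# extremes cannot drive the regional effect to implausible values.
at_limit = regional_log_correction(20.0, 350.0, REGION_COEFFS, REGION_LIMITS)
beyond_limit = regional_log_correction(30.0, 500.0, REGION_COEFFS, REGION_LIMITS)
```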
Next, a few more hands-on examples of the applicability of the invention in the context of vehicle, especially car, sales are presented. Data input sources include e.g. a selected type register and price information (e.g. asking prices). The type register can be used to validate/match the obtained price information with detailed type information so that the input data for the model is complete and correct. The type register classifies the cars into homogeneous subsets and includes information such as car brand, model, generation, etc. Commercial type registers are available and produced by CAP and JATO, for example. Within one type the cars are somewhat similar and differ possibly in minor issues such as color, equipment, etc. By utilizing a detailed type register the quality of other input data can be improved, instead of purely exploiting data manually typed in by a plurality of car salesmen and thus possibly including defective information.
Variables of the model (see e.g. formulae 1 and 2) can be then defined as follows:
LogP= Modelgeneration Category Gearbox Enginesize Modelyear Age Mileage Fuel Fuel*Mileage Enginesize*Mileage Brand*Mileage Power Weight width Consumption Drivetype
Such a model could be of covariance type, including a plurality of categorizing variables (bolded) and also several continuous variables (e.g. Age, Mileage, and Enginesize), for example. Further, the model may include derivative variables defining a combined effect of several variables; see e.g. Fuel*Mileage and Enginesize*Mileage in the above variable list.
In case only one explanatory variable were utilized in the model, being e.g. the model generation, the price estimate would equal the price average of the cars of the same model generation in the available input data. Now, if more variables, such as mileage, are added to the model, they will explain the price variation within each model generation; e.g. when analyzing two cars with 50 000 km and 200 000 km on the clock, the expectation value for the price of the car with the latter mileage would be lower.
As a result of model estimation, a plurality of parameter estimates for the price equation is obtained and stored (e.g. in a computer-readable file).
Parameters can be then pre-calculated for each type in the type register, provided that there's enough data available for such estimation (e.g. data related to few vehicles only do not generally enable reliable estimation). Hence, the types in the type register are modelled price-wise via a parametric mapping that converts e.g. age and mileage into a price estimate. By varying the values of age and mileage variables, an expectation value surface of the price is obtained for each type. Variable values that vary within a type and per each car entity, e.g. age and mileage, cannot be fixed beforehand. Variables such as model generation, body category (e.g. coupe, sedan, SUV, wagon), gearbox (manual, automatic, etc) and fuel (petrol, diesel) remain constant within a type definition of this example; therefore they can be determined at an early stage for future use.
Let us now consider a model comprising the following variables:
LogP=Modelgeneration Age Mileage Enginesize Diesel*Mileage
For a certain exemplary car model, the parameter of the Modelgeneration variable is, as calculated on the basis of available data, set at 9.5 and the parameter of the Enginesize variable is set at 0.0003.
Thus an expectation value surface for the cars of that model generation is obtained with a constant 9.5 + 0.0003*Enginesize. Concerning a car within the model generation having a 1600cm3 engine, the constant is 9.5 + 0.0003*1600, which equals 9.98.
Accordingly, a price estimate for this car type with the age of 0 and mileage of 0 km/miles is exp(9.98) ≈ 21590 price units.
If the Age variable has a parameter value (i.e. multiplier) of -0.08, the Mileage variable has a parameter value of -0.00095, and Diesel*Mileage has a parameter value of +0.00025, then, with mileage taken to be expressed in thousands of units, the price of the car will decrease by roughly 9.5% per 100 000 mileage units (e.g. kilometres), except for diesel-engined cars, for which the Diesel*Mileage correction factor fine-tunes the reduction to roughly 7%.
Accordingly, an expectation value surface for a car belonging to the above model generation and having a 1.6-litre petrol engine can be described parameter-wise as:
Constant=9.98
Age= -0.08
Mileage= -0.00095
And for the 1.6-litre diesel car:
Constant=9.98
Age= -0.08
Mileage= -0.0007
Thus the mileage effect differs from a corresponding petrol-powered car.
For a 1.9-litre diesel car the parameters are:
Constant= 10.07
Age = -0.08
Mileage = -0.0007
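The parameter sets listed above can be reproduced, and a point on the resulting expectation value surface evaluated, with the following sketch. It assumes that age is expressed in years and mileage in thousands of units, which is consistent with the percentages given above but is not stated explicitly; the function names are hypothetical.

```python
import math

# Parameter values taken from the worked example.
MODEL_GENERATION = 9.5        # constant for the model generation
ENGINE_SIZE_COEF = 0.0003     # per cm3 of engine size
AGE_COEF = -0.08              # per unit of age (assumed: years)
MILEAGE_COEF = -0.00095       # per unit of mileage (assumed: thousands of km)
DIESEL_MILEAGE_COEF = 0.00025

def type_parameters(engine_cc, diesel):
    """Pre-computable, type-level description of the expectation value surface."""
    mileage = MILEAGE_COEF + (DIESEL_MILEAGE_COEF if diesel else 0.0)
    return {
        "constant": MODEL_GENERATION + ENGINE_SIZE_COEF * engine_cc,
        "age": AGE_COEF,
        "mileage": mileage,
    }

def price_estimate(params, age, mileage_thousands):
    """Evaluate one point of the surface and convert the log-price to price units."""
    log_price = params["constant"] + params["age"] * age + params["mileage"] * mileage_thousands
    return math.exp(log_price)

petrol_1_6 = type_parameters(1600, diesel=False)
diesel_1_9 = type_parameters(1900, diesel=True)

print(round(price_estimate(petrol_1_6, 0, 0)))  # 21590
```

Evaluating price_estimate over a grid of age and mileage values for one parameter set yields the expectation value surface for that type.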
From the above parameters a price multiplier file can be formed, including a separate parametric description (in relation to age, mileage, and price coordinate system, for example) for each car type. If modelling data does not include enough examples of a certain type, the parameters can be interpolated/extrapolated from the parameters of neighbouring (according to predetermined criteria) types.
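One possible way of deriving parameters for a type with insufficient data is linear interpolation between two neighbouring types, sketched below. Using engine size as the neighbourhood criterion and the 1.8-litre target type are assumptions made for the example; the disclosure leaves the criteria open.

```python
def interpolate_parameters(params_low, params_high, size_low, size_high, size_target):
    """Linearly interpolate every parameter between two neighbouring types."""
    weight = (size_target - size_low) / (size_high - size_low)
    return {
        name: params_low[name] + weight * (params_high[name] - params_low[name])
        for name in params_low
    }

# Parameter sets of the 1.6-litre and 1.9-litre diesel types from the example above.
diesel_1_6 = {"constant": 9.98, "age": -0.08, "mileage": -0.0007}
diesel_1_9 = {"constant": 10.07, "age": -0.08, "mileage": -0.0007}

# Hypothetical 1.8-litre diesel type with too few observations of its own.
diesel_1_8 = interpolate_parameters(diesel_1_6, diesel_1_9, 1600, 1900, 1800)
print(round(diesel_1_8["constant"], 2))  # 10.04
```

A target size outside the range of the two neighbours gives a weight below 0 or above 1, in which case the same function extrapolates.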
Also other parametric descriptions following the above logic but including more/more complex variables can be correspondingly calculated for car types in the car register.
Reverting to the embodiments disclosed hereinbefore, the selected parametric mappings can be further refined by transforming linear relationships (e.g. between age and mileage) into non-linear ones by utilizing splines, polynomial approximation, etc., and by utilizing several parameters/variables instead of one for desired properties.
After constructing the model, i.e. determining parameters and/or expectation value surfaces, the users of the system may transmit requests including information (e.g. plate number) required for defining the target car. Based on the obtained information, the system may figure out the exact type of the car by accessing the type register, for example. If the type cannot be defined accurately enough from the given information, the user may be asked for additional information for type recognition purposes. As the expectation value surfaces have been predetermined for different car types, real-time calculations can be kept to a minimum upon receipt of the inquiry, and the proper location on the corresponding type-specific expectation value surface, as determined by target car entity-specific variables such as age, mileage, etc., can be found quickly. The system may perform a plurality of simultaneous queries with reduced load. Age, mileage, equipment and sales region belong to the variables that are typically specific to each individual car entity, not to each type; therefore their effect on the price cannot be predetermined on a type level.
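The request handling described above may be sketched as follows. The plate number, the type identifier and the in-memory dictionaries standing in for the type register and the stored price multiplier file are all hypothetical; mileage is again assumed to be expressed in thousands of units.

```python
import math

# Hypothetical type register: plate number -> type identifier.
TYPE_REGISTER = {"ABC-123": "example-model-1.6-petrol"}

# Hypothetical pre-calculated parameters per type (values from the worked example).
PRICE_PARAMETERS = {
    "example-model-1.6-petrol": {"constant": 9.98, "age": -0.08, "mileage": -0.00095},
}

def estimate_price(plate, age, mileage_thousands):
    """Resolve the type, then evaluate the pre-calculated surface at the car-specific point."""
    car_type = TYPE_REGISTER.get(plate)
    if car_type is None:
        return None  # type not recognized: the user would be asked for more information
    params = PRICE_PARAMETERS.get(car_type)
    if params is None:
        return None  # no model for this type: interpolation from similar types could be used
    log_price = (params["constant"]
                 + params["age"] * age
                 + params["mileage"] * mileage_thousands)
    return math.exp(log_price)

new_car = estimate_price("ABC-123", 0, 0)
used_car = estimate_price("ABC-123", 5, 100)
unknown = estimate_price("ZZZ-999", 5, 100)
```

Only a dictionary lookup and one exponential are needed per request, which illustrates why the real-time load stays low once the parameters have been stored.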
Ramifications and Scope
Although the description above contains many specifics, these are merely provided to illustrate the invention and should not be construed as limitations of the invention's scope. It should also be noted that the many specifics can be combined in various ways in a single embodiment or in multiple embodiments. Thus it will be apparent to those skilled in the art that various modifications and variations can be made in the apparatuses and processes of the present invention without departing from the spirit or scope of the invention. The feasibility of the invention is not strictly limited to vehicles, and it can be used to assess other commodities as well; the variables used may then be defined in a case-specific manner.

Claims

1. A computerized system (120) for assessing vehicles comprising
means for receiving data (116) from a plurality of sources (101, 102, 103), said data representing characteristics of a plurality of vehicles, said data including indication of a value of the vehicle and of at least one characteristic selected from the group consisting of: age and mileage of the vehicle,
means (104) for storing said received data,
means for receiving (116) a request (106) for a value estimate of a target vehicle having a number of predetermined characteristics,
means for producing (114) a value estimate for the target vehicle on the basis of the data, characterized by
means for calculating (114) a value estimation model for vehicles on the basis of the received data, said model comprising a plurality of parameters to be determined and a plurality of explanatory variables associated with the parameters and defined on the basis of said characteristics, wherein at least one of said age and mileage is configured to affect the value estimate, the value estimate being a dependent variable in the model, and
means (105) for storing at least part of the value estimation model including said plurality of parameters modelled, whereby, upon receiving the request, said means for producing the value estimate (114) is configured to map at least part of said number of predetermined characteristics of the target vehicle, via said value estimation model, to said value estimate.
2. The system of claim 1, wherein said data further include indication of a location of each vehicle, whereby the location is configured to affect the value estimate.
3. The system of claim 2, wherein regional effect in said value estimate is based on the locations of vehicles and modelled via at least one explanatory variable.
4. The system of claim 2, wherein regional effect is taken into account in the value estimate on the basis of said location via a regional correction factor.
5. The system of any preceding claim, wherein during said mapping a location in a variable value space is obtained by the model, said location obtained being indicative of the value estimate.
6. The system of claim 5, wherein an expectation value surface for a value estimate is spanned by a plurality of locations in the variable value space produced as an output of the model to the input of predetermined variable value combinations, wherein a number of variables in the predetermined variable value combinations are kept static whereas the remaining ones are traversed through in relation to their potential values.
7. The system of any preceding claim, wherein said indication of the value corresponds to or is based on the asking or sales price of the vehicle.
8. The system of any preceding claim, wherein said model includes at least one of the following: a regression model, a covariance model, or a mixed model.
9. The system of any preceding claim, further comprising means for deleting data classified as erroneous or insufficient from the received data.
10. The system of any preceding claim, wherein said age is represented as a model year or age in months calculated from the manufacture or registration date of the vehicle.
11. The system of any preceding claim, wherein at least one parameter is jointly determined for a group of vehicles with partially similar and partially different characteristics.
12. The system of any preceding claim, wherein said received data includes type register data.
13. The system of claim 12, wherein the received data in relation to at least one vehicle is a data aggregate from multiple sources, said sources including a predetermined type register and a source providing value-related information.
14. The system of any preceding claim, wherein said model includes at least one of the following elements as an explanatory variable or a part of a joint explanatory variable: engine type, fuel type, engine size, body type, vehicle model, and engine power.
15. The system of claim 14, wherein said vehicle type is determined on the basis of at least one of the following elements: brand, model, engine, and level of equipment.
16. The system of any preceding claim, wherein said model includes a joint variable formed by at least fuel type and mileage information.
17. The system of any preceding claim, wherein upon detecting said target vehicle to fall under a vehicle type, according to a predetermined type register, on the basis of which the model has not been calculated, said system is configured to interpolate or extrapolate the value estimate by utilizing information available from similar types as determined according to predefined criteria.
18. A method for assessing vehicles comprises
receiving data from a plurality of sources (302, 404), said data representing characteristics of a plurality of vehicles, said data including indication of a value of the vehicle and of at least one characteristic selected from the group consisting of: age and mileage of the vehicle,
storing said received data in a first storage (302, 406),
receiving a request for a value estimate of a target vehicle having a number of predetermined characteristics (408), and
producing a value estimate for the target vehicle on the basis of the data (308), characterized by
calculating a value estimation model for vehicles on the basis of the received data (304, 410), said model comprising a plurality of parameters to be determined and a plurality of explanatory variables associated with the parameters and defined on the basis of said characteristics, wherein at least one of said age and mileage affect the value estimate, the value estimate being a dependent variable in the model,
storing at least part of the value estimation model including said plurality of parameters modelled in a second storage (306, 412), and, upon receiving the request,
mapping at least part of said number of predetermined characteristics of the target vehicle, via said value estimation model, to said value estimate (416).
19. The method according to claim 18, wherein said data further include indication of a location of each vehicle, whereby the location is configured to affect the value estimate.
20. Computer software adapted, when run on a computer, to execute the method steps of claim 18 or 19.
21. A carrier medium carrying the computer software according to claim 20.
PCT/FI2007/000150 2007-05-31 2007-05-31 A system and a method for assessing objects WO2008145798A2 (en)

Priority Applications (4)

Application Number Priority Date Filing Date Title
PCT/FI2007/000150 WO2008145798A2 (en) 2007-05-31 2007-05-31 A system and a method for assessing objects
EP08709307A EP2153393A4 (en) 2007-05-31 2008-02-01 System and method for assessing and managing objects
PCT/FI2008/050039 WO2008145805A1 (en) 2007-05-31 2008-02-01 System and method for assessing and managing objects
US12/602,244 US20100179861A1 (en) 2007-05-31 2008-02-01 System and method for assessing and managing objects

Publications (1)

Publication Number Publication Date
WO2008145798A2 true WO2008145798A2 (en) 2008-12-04

Family

ID=40074607

Family Applications (2)

Application Number Title Priority Date Filing Date
PCT/FI2007/000150 WO2008145798A2 (en) 2007-05-31 2007-05-31 A system and a method for assessing objects
PCT/FI2008/050039 WO2008145805A1 (en) 2007-05-31 2008-02-01 System and method for assessing and managing objects

Country Status (3)

Country Link
US (1) US20100179861A1 (en)
EP (1) EP2153393A4 (en)
WO (2) WO2008145798A2 (en)

Also Published As

Publication number Publication date
EP2153393A4 (en) 2012-05-02
EP2153393A1 (en) 2010-02-17
WO2008145805A1 (en) 2008-12-04
US20100179861A1 (en) 2010-07-15

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 07730619

Country of ref document: EP

Kind code of ref document: A2

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 07730619

Country of ref document: EP

Kind code of ref document: A2