CN110059118A

CN110059118A - Weight calculation method and device for characteristic attributes, and terminal device

Info

Publication number: CN110059118A
Application number: CN201910340945.0A
Authority: CN
Inventors: 彭明喜; 张胜; 邱祥平; 雷霆; 杜渂; 林永生; 周赵云; 王月; 王孟轩; 韩国令; 张昆鹏; 何共晖; 陈健
Original assignee: Di'aisi Information Technology Ltd By Share Ltd
Current assignee: Di'aisi Information Technology Ltd By Share Ltd
Priority date: 2019-04-26
Filing date: 2019-04-26
Publication date: 2019-07-26
Anticipated expiration: 2039-04-26
Also published as: CN110059118B

Abstract

The invention discloses a weight calculation method and device for characteristic attributes, and a terminal device, and relates to the field of feature calculation. The method comprises: collecting case samples; clustering the case samples according to a preset characteristic attribute to obtain m classes of case sample databases; randomly selecting n case samples from each class of case sample database; for each randomly selected case sample, finding its k nearest case samples in the case sample database of its own class and finding k neighbor case samples in each case sample database of the other classes; and calculating the weight of a characteristic attribute from the n case samples randomly selected from each class of case sample database, their corresponding k nearest case samples, and their (m-1)k neighbor case samples. A weight calculated in this way is less affected by differences in the distribution of case samples, so a more effective weight of the characteristic attribute is obtained.

Description

Weight calculation method and device for characteristic attributes, and terminal device

Technical field

The present invention relates to the field of feature calculation, and in particular to a weight calculation method and device for characteristic attributes, and a terminal device.

Background art

The probability of urban fires keeps rising, and such fires easily cause heavy casualties and property loss. After a fire alarm is received, the number and types of fire engines to dispatch are generally decided from human experience, but decisions made this way are somewhat arbitrary and blind, and they are time-consuming.

At present, the conventional solution to this problem is a retrieval strategy based on the kNN (k-Nearest Neighbor) method: similar cases are retrieved, and a dispatch strategy is given according to them. However, the kNN method does not consider the influence of characteristic attributes on the weights. Different characteristic attributes usually carry different weights in describing a case, whereas the kNN method manually sets the weight of every characteristic attribute to the same value. This affects the reliability of the calculation result, and the reasonableness of using the retrieved similar cases as the recommended scheme suffers as well.

Summary of the invention

The object of the present invention is to provide a weight calculation method and device for characteristic attributes, and a terminal device, such that the calculated weights of the characteristic attributes match real case conditions and more effective weights are obtained, laying a foundation for improving the accuracy of retrieval results in subsequent use.

The technical solution provided by the invention is as follows:

A weight calculation method for characteristic attributes, comprising the following steps: collecting case samples; clustering the case samples according to a preset characteristic attribute to obtain m classes of case sample databases; randomly selecting n case samples from each class of case sample database; for each randomly selected case sample, finding its k nearest case samples in the case sample database of its own class, and finding k neighbor case samples in each case sample database of the other classes, wherein a nearest case sample is a case sample closest to the randomly selected case sample within the case sample database of that sample's own class, and a neighbor case sample is a case sample closest to the randomly selected case sample within a case sample database of another class; and calculating the weight of a characteristic attribute from the n case samples randomly selected from each class of case sample database, their corresponding k nearest case samples, and their (m-1)k neighbor case samples; wherein m, n and k are integers greater than or equal to 1.

In the above technical solution, the weight of a characteristic attribute is obtained by calculating the respective values for the case samples selected from each class of case sample database, together with their nearest case samples and neighbor case samples, and then averaging. This reduces the influence of differences in case sample distribution on the weight, so a more effective feature weight is obtained, and it lays a foundation for more scientific retrieval results when the weights calculated in this way are later used for case retrieval.

Further, the preset characteristic attribute is the fire level.

In the above technical solution, the number of fire levels in fire-fighting cases is appropriate: it is neither so small that the weight calculation is affected nor so large that the calculation becomes overly complex.

Further, when there are multiple characteristic attributes, the weight of each characteristic attribute is calculated separately from the n case samples randomly selected from each class of case sample database, their corresponding k nearest case samples, and their (m-1)k neighbor case samples.

In the above technical solution, the randomly selected case samples can be reused in the weight calculation of every characteristic attribute. The weight of each characteristic attribute is obtained simply by repeating the calculation, which is convenient and fast, and the weights obtained are well representative.

Further, the formula for calculating the weight of a characteristic attribute from the n case samples randomly selected from each class of case sample database, their corresponding k nearest case samples, and their (m-1)k neighbor case samples is as follows:

where W'(A) is the weight of characteristic attribute A; W(A) is the initial weight of characteristic attribute A; R_ti is the t-th case sample randomly selected from the i-th class case sample database; H_tij is the j-th nearest case sample, found in the i-th class case sample database, of that t-th randomly selected case sample; M_tij(C) is the j-th neighbor case sample, found in the class-C case sample database, of the t-th case sample randomly selected from the i-th class case sample database, where class C is any class other than class i; diff(A, R_ti, H_tij) denotes the difference between case samples R_ti and H_tij on characteristic attribute A; and diff(A, R_ti, M_tij(C)) denotes the difference between case samples R_ti and M_tij(C) on characteristic attribute A.
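One form that is consistent with the symbol definitions above, and with the statement that the weight is obtained after averaging, is sketched below. The signs (nearest-sample differences subtracted, neighbor-sample differences added, as in Relief-style feature weighting) and the normalizing constants 1/(mnk) and 1/(mn(m-1)k) are assumptions for illustration, not taken from the source; the second term is only defined for m greater than 1.

```latex
\[
W'(A) = W(A)
  - \frac{1}{mnk} \sum_{i=1}^{m} \sum_{t=1}^{n} \sum_{j=1}^{k}
      \operatorname{diff}\bigl(A, R_{ti}, H_{tij}\bigr)
  + \frac{1}{mn(m-1)k} \sum_{i=1}^{m} \sum_{t=1}^{n} \sum_{C \neq i} \sum_{j=1}^{k}
      \operatorname{diff}\bigl(A, R_{ti}, M_{tij}(C)\bigr)
\]
```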

In the above technical solution, the weight of a characteristic attribute is obtained after averaging, which reduces the influence of differences in case sample distribution on the weight and makes the calculated weight more effective.

Further, the formula for calculating diff(A, R_ti, H_tij) is:

where R_ti(A) is the value on characteristic attribute A of the t-th case sample randomly selected from the i-th class case sample database; H_tij(A) is the value on characteristic attribute A of the j-th nearest case sample, found in the i-th class case sample database, of that t-th randomly selected case sample; max(A) is the maximum value of characteristic attribute A among the collected case samples; and min(A) is the minimum value of characteristic attribute A among the collected case samples.

Further, the formula for calculating diff(A, R_ti, M_tij(C)) is:

where R_ti(A) is the value on characteristic attribute A of the t-th case sample randomly selected from the i-th class case sample database; M_tij(C)(A) is the value on characteristic attribute A of the j-th neighbor case sample, found in the class-C case sample database, of the t-th case sample randomly selected from the i-th class case sample database, where class C is any class other than class i; max(A) is the maximum value of characteristic attribute A among the collected case samples; and min(A) is the minimum value of characteristic attribute A among the collected case samples.
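A form of the two difference functions that matches these definitions, and the normalization by the difference between the maximum and minimum values that the description mentions, might read as follows. The absolute value is an assumption; the source only speaks of "the difference".

```latex
\[
\operatorname{diff}\bigl(A, R_{ti}, H_{tij}\bigr)
  = \frac{\lvert R_{ti}(A) - H_{tij}(A) \rvert}{\max(A) - \min(A)}
\]
\[
\operatorname{diff}\bigl(A, R_{ti}, M_{tij}(C)\bigr)
  = \frac{\lvert R_{ti}(A) - M_{tij}(C)(A) \rvert}{\max(A) - \min(A)}
\]
```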

In the above technical solution, the formulas apply dimensional normalization to the weight of characteristic attribute A, which makes subsequent use of the weight easier.

The present invention also provides a weight calculation device for characteristic attributes, comprising: a collection module for collecting case samples; a clustering module for clustering the case samples according to a preset characteristic attribute to obtain m classes of case sample databases; an extraction module for randomly selecting n case samples from each class of case sample database; a search module for finding, for each randomly selected case sample, its k nearest case samples in the case sample database of its own class and k neighbor case samples in each case sample database of the other classes, wherein a nearest case sample is a case sample closest to the randomly selected case sample within the case sample database of that sample's own class, and a neighbor case sample is a case sample closest to the randomly selected case sample within a case sample database of another class; and a calculation module for calculating the weight of a characteristic attribute from the n case samples randomly selected from each class of case sample database, their corresponding k nearest case samples, and their (m-1)k neighbor case samples; wherein m, n and k are integers greater than or equal to 1.

Further, the calculation module is further configured to, when there are multiple characteristic attributes, calculate the weight of each characteristic attribute separately from the n case samples randomly selected from each class of case sample database, their corresponding k nearest case samples, and their (m-1)k neighbor case samples.

The present invention also provides a terminal device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when running the computer program, implements the steps of the weight calculation method for characteristic attributes according to any of the above.

The present invention also provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the weight calculation method for characteristic attributes according to any of the above.

Compared with the prior art, the weight calculation method and device for characteristic attributes and the terminal device of the present invention have the following beneficial effect:

The weight of a characteristic attribute is obtained only from the case samples selected from each class of case sample database and their nearest case samples and neighbor case samples, by calculating the respective values and then averaging. This reduces the influence of differences in case sample distribution on the weight, so a more effective weight of the characteristic attribute is obtained, and it lays a foundation for more scientific retrieval results when the weights calculated in this way are later used for case retrieval.

Brief description of the drawings

The preferred embodiments are described below in a clear and easily understood manner with reference to the drawings, to further explain the above characteristics, technical features, advantages and implementations of the weight calculation method and device for characteristic attributes and the terminal device.

Fig. 1 is a flow chart of one embodiment of the weight calculation method for characteristic attributes of the present invention;

Fig. 2 is a flow chart of another embodiment of the weight calculation method for characteristic attributes of the present invention;

Fig. 3 is a structural schematic diagram of one embodiment of the weight calculation device for characteristic attributes of the present invention;

Fig. 4 is a flow chart of another embodiment of the weight calculation method for characteristic attributes of the present invention;

Fig. 5 is a structural schematic diagram of one embodiment of the terminal device of the present invention.

Explanation of reference numerals:

3. weight calculation device for characteristic attributes, 31. collection module, 32. clustering module, 33. extraction module, 34. search module, 35. calculation module, 5. terminal device, 51. memory, 52. computer program, 53. processor.

Detailed description of the embodiments

In the following description, specific details such as particular system structures and technologies are set forth for illustration rather than limitation, in order to provide a thorough understanding of the embodiments of the present application. However, it will be clear to those skilled in the art that the present application may also be implemented in other embodiments without these specific details. In other cases, detailed descriptions of well-known systems, devices, circuits and methods are omitted so that unnecessary details do not obscure the description of the present application.

It should be understood that, when used in this specification and the appended claims, the term "comprising" indicates the presence of the described features, wholes, steps, operations, elements and/or components, but does not exclude the presence or addition of one or more other features, wholes, steps, operations, elements, components and/or sets thereof.

To keep the drawings concise, each figure only schematically shows the parts related to the present invention; they do not represent the actual structure of a product. In addition, to keep the drawings concise and easy to understand, where several components in a figure have the same structure or function, only one of them is drawn or labeled. Herein, "a" or "one" does not only mean "only one" but can also mean "more than one".

It should be further understood that the term "and/or" used in this specification and the appended claims refers to any combination and all possible combinations of one or more of the associated listed items, and includes these combinations.

In specific implementations, the terminal device described in the embodiments of the present application includes, but is not limited to, portable devices such as mobile phones, laptop computers or tablet computers with a touch-sensitive surface (for example, a touch-screen display and/or a touch pad). It should also be understood that in some embodiments the terminal device is not a portable communication device but a desktop computer with a touch-sensitive surface (for example, a touch-screen display and/or a touch pad).

In the following discussion, a terminal device including a display and a touch-sensitive surface is described. However, it should be understood that the terminal device may include one or more other physical user interface devices such as a physical keyboard, a mouse and/or a joystick.

The terminal device supports various applications, such as one or more of the following: a drawing application, a presentation application, a web authoring application, a word processing application, a disc burning application, a spreadsheet application, a game application, a telephone application, a video conferencing application, an email application, an instant messaging application, an exercise support application, a photo management application, a digital camera application, a digital video camera application, a web browsing application, a digital music player application and/or a digital video player application.

The various applications that can be executed on the terminal device may use at least one common physical user interface device, such as the touch-sensitive surface. One or more functions of the touch-sensitive surface, and the corresponding information displayed on the terminal, can be adjusted and/or changed between applications and/or within a given application. In this way, a common physical architecture of the terminal (for example, the touch-sensitive surface) can support the various applications with user interfaces that are intuitive and transparent to the user.

In addition, in the description of the present application, the terms "first", "second" and the like are only used to distinguish descriptions and should not be understood as indicating or implying relative importance.

In order to explain the embodiments of the present invention or the technical solutions in the prior art more clearly, specific embodiments of the present invention are described below with reference to the drawings. Obviously, the drawings described below are only some embodiments of the present invention; those of ordinary skill in the art can obtain other drawings, and other embodiments, from these drawings without creative effort.

It should be noted that the weight calculation method for characteristic attributes of the present invention is an improvement based on the kNN algorithm. Besides the fire-fighting field, it can be applied to other fields: it is applicable wherever similar cases are retrieved using the weights of characteristic attributes.

Fig. 1 shows the implementation flow of a weight calculation method for characteristic attributes of the present invention. The weight calculation method can be applied to a terminal device (for example, a computer; for ease of understanding, this embodiment is described throughout with a computer as the executing entity, but those skilled in the art will understand that the method can also be applied to other terminal devices, as long as they can realize the corresponding functions). The weight calculation method comprises the following steps:

S101: collect case samples.

Specifically, case samples are collected from different sources according to actual needs. For example, if fire rescue cases are needed, they can be collected from a fire alarm database; if user usage data for platform A is needed, it can be collected from the database of platform A.

While collecting case samples, information describing the characteristic attributes of each case sample can be extracted, for example: weather information, case filing time, description of the person reporting the alarm, combustible substance, burning floor, disposal object, and fire level. Each case sample has its own characteristic attribute information.

Preferably, after the case samples have been collected, data cleaning is performed on them to remove duplicated, redundant fields and fields irrelevant to the representation of the case sample, and the characteristic attributes of the case samples are encoded to facilitate subsequent use.

For example, combustible substances can be coded from low to high according to their flammability. The specific coding scheme is determined by the actual use case and is not limited here.
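The cleaning and encoding step can be sketched as follows. This is a minimal illustration only: the field names, the list of relevant fields and the flammability codes are all hypothetical and would be chosen according to the actual data.

```python
# Hypothetical choice of fields that describe a case sample.
RELEVANT_FIELDS = ("combustible", "burning_floor", "fire_level")

# Hypothetical ordinal coding: a higher code means a more flammable substance.
FLAMMABILITY_CODE = {"concrete": 0, "wood": 1, "paper": 2, "gasoline": 3}


def clean_and_encode(record):
    """Keep only the relevant fields and replace the combustible name by its code."""
    cleaned = {field: record[field] for field in RELEVANT_FIELDS}
    cleaned["combustible"] = FLAMMABILITY_CODE[cleaned["combustible"]]
    return cleaned


raw = {"combustible": "wood", "burning_floor": 3, "fire_level": 2, "operator_note": "n/a"}
encoded = clean_and_encode(raw)
```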

S102: cluster the collected case samples according to a preset characteristic attribute to obtain m classes of case sample databases.

Specifically, an existing clustering algorithm can be used to cluster the collected case samples, for example the K-Means clustering algorithm, which runs fast and quickly yields the required result.

The preset characteristic attribute can be set according to actual needs. For example, for fire-fighting case samples, the fire level can be used as the preset characteristic attribute. The reason is that fire-fighting cases generally have 5 fire levels, so clustering on this attribute yields an appropriate number of classes that is convenient for calculation. Other characteristic attributes have either too few categories (3 or fewer), which affects the weight calculation, or too many (disposal object, for example, generally has about 70). Clustering on the fire level as the preset characteristic attribute is therefore appropriate.

For case samples from other fields, a suitable characteristic attribute can be selected as the preset characteristic attribute according to the attributes available, so that clustering yields an appropriate number of case sample databases.
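When the preset characteristic attribute takes a handful of discrete values, as the fire level does, the clustering step amounts to grouping the case samples by that value. The sketch below assumes exactly that simplification; a continuous attribute would need a real clustering algorithm such as the K-Means mentioned above. The function and field names are hypothetical.

```python
from collections import defaultdict


def build_case_databases(samples, preset_attribute):
    """Group case samples by their value on the preset characteristic attribute."""
    databases = defaultdict(list)
    for sample in samples:
        databases[sample[preset_attribute]].append(sample)
    return dict(databases)


cases = [
    {"fire_level": 1, "burning_floor": 2},
    {"fire_level": 3, "burning_floor": 7},
    {"fire_level": 3, "burning_floor": 12},
]
databases = build_case_databases(cases, "fire_level")
```

Here m is simply `len(databases)`.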

S103: randomly select n case samples from each class of case sample database.

Specifically, this step selects n case samples from the case sample database of every class. In this embodiment the selection can be done in either of two ways: 1) select n samples from each class of case sample database in a single pass, or 2) select one sample from each class of case sample database at a time and repeat n times, as shown in Fig. 4.

With method 2), the same case sample may occasionally be selected more than once, although this has little effect when the data volume is large. Method 1) has no such problem, so method 1) is preferred.

For example, when all collected case samples are clustered into 5 classes of case sample databases according to fire level and n=5, 5 case samples are selected from each class of case sample database, for 5*5=25 case samples in total.
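Selection method 1) corresponds to sampling without replacement within each class. A minimal sketch, assuming each case sample database is a Python list and using a hypothetical `select_samples` helper:

```python
import random


def select_samples(databases, n, seed=None):
    """Select n distinct case samples from every class of case sample database."""
    rng = random.Random(seed)
    # random.sample draws without replacement, so no case sample repeats in a class.
    return {label: rng.sample(cases, n) for label, cases in databases.items()}


# Five fire levels with 20 placeholder case samples each.
databases = {level: [f"case-{level}-{i}" for i in range(20)] for level in range(1, 6)}
selected = select_samples(databases, n=5, seed=0)
total = sum(len(samples) for samples in selected.values())
```

With m=5 and n=5 the total is mn=25, matching the example above. Method 2) would correspond to calling `rng.choice` n times per class, which can return the same case sample twice.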

This embodiment fixes both the number of randomly selected case samples and the classes they come from. This avoids selecting repeatedly from the same class, which could give that class undue weight; in other words, the selection scheme of this embodiment ensures that the selected case samples are reasonably representative.

S104: for each randomly selected case sample, find its k nearest case samples in the case sample database of its own class, and find k neighbor case samples in each case sample database of the other classes; wherein a nearest case sample is a case sample closest to the randomly selected case sample within the case sample database of that sample's own class, and a neighbor case sample is a case sample closest to the randomly selected case sample within a case sample database of another class.

Specifically, take one randomly selected case sample B as an example; it belongs to the case sample database whose fire level is 3.

The k nearest case samples of case sample B are the k case samples in the fire-level-3 case sample database that are closest to case sample B in the feature space.

Assume k=3 and that the distances between four case samples in the fire-level-3 case sample database and case sample B are as shown in Table 1 below; then case samples 1, 2 and 4 are the nearest case samples of case sample B.

Table 1

Case sample in the fire-level-3 case sample database	Distance from case sample B
Case sample 1	3
Case sample 2	2
Case sample 3	4
Case sample 4	0.5
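The selection of the k nearest case samples from the distances listed in Table 1 can be reproduced with a few lines. The helper name `k_nearest` is hypothetical; the distances are those of the table.

```python
import heapq

# Distances taken from Table 1.
table_1 = {
    "case sample 1": 3,
    "case sample 2": 2,
    "case sample 3": 4,
    "case sample 4": 0.5,
}


def k_nearest(distances, k):
    """Return the k labels with the smallest distance, closest first."""
    return heapq.nsmallest(k, distances, key=distances.get)


nearest = k_nearest(table_1, k=3)
```

The result lists case samples 4, 2 and 1 in order of increasing distance; case sample 3, at distance 4, is left out.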

Neighbor case samples are found in the case sample databases of the classes to which case sample B does not belong: k neighbor case samples are found in each of those case sample databases.

For example, there are 5 classes of case sample databases in total: case sample database 1 with fire level 1, case sample database 2 with fire level 2, case sample database 3 with fire level 3, case sample database 4 with fire level 4, and case sample database 5 with fire level 5. Case sample B belongs to case sample database 3.

The neighbor case samples of case sample B consist of the k closest case samples found in case sample database 1, the k closest found in case sample database 2, the k closest found in case sample database 4, and the k closest found in case sample database 5, for 4k neighbor case samples in total.
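The search over the own class and the other classes can be sketched together. The sketch assumes that case samples are numeric feature tuples, that distance is Euclidean, and that the selected sample itself is excluded from its own nearest case samples; none of these choices is specified in the source, and `find_neighbors` is a hypothetical name.

```python
import heapq
import math


def find_neighbors(sample, own_label, databases, k):
    """Return (k nearest case samples, {other class: k neighbor case samples})."""
    def dist(case):
        return math.dist(sample, case)  # Euclidean distance (assumed metric)

    own = [case for case in databases[own_label] if case is not sample]
    nearest = heapq.nsmallest(k, own, key=dist)
    neighbors = {
        label: heapq.nsmallest(k, cases, key=dist)
        for label, cases in databases.items()
        if label != own_label
    }
    return nearest, neighbors


# Five classes (fire levels 1 to 5) with six toy two-feature case samples each.
databases = {level: [(float(level), float(i)) for i in range(6)] for level in range(1, 6)}
sample_b = databases[3][0]
nearest, neighbors = find_neighbors(sample_b, 3, databases, k=2)
neighbor_count = sum(len(found) for found in neighbors.values())
```

With m=5 and k=2 this yields 2 nearest case samples and (m-1)k=8 neighbor case samples.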

S105: calculate the weight of the characteristic attribute from the n case samples randomly selected from each class of case sample database (mn in total), their corresponding k nearest case samples, and their (m-1)k neighbor case samples; wherein m, n and k are integers greater than or equal to 1.

Specifically, the values of m, n and k are set according to the actual case samples. The weight of the characteristic attribute is calculated from the mn randomly selected case samples and the nearest case samples and neighbor case samples corresponding to each of them.

The formula for calculating the weight of a characteristic attribute from the n case samples randomly selected from each class of case sample database, their corresponding k nearest case samples, and their (m-1)k neighbor case samples is as follows:

where W'(A) is the weight of characteristic attribute A; W(A) is the initial weight of characteristic attribute A; R_ti is the t-th case sample randomly selected from the i-th class case sample database; H_tij is the j-th nearest case sample, found in the i-th class case sample database, of the t-th case sample randomly selected from the i-th class case sample database; M_tij(C) is the j-th neighbor case sample, found in the class-C case sample database, of the t-th case sample randomly selected from the i-th class case sample database, where class C is any class other than class i; diff(A, R_ti, H_tij) denotes the difference between case samples R_ti and H_tij on characteristic attribute A; and diff(A, R_ti, M_tij(C)) denotes the difference between case samples R_ti and M_tij(C) on characteristic attribute A.

Specifically, in the weight calculation of a characteristic attribute, the initial weight is generally set to 0, so its influence is not considered. Only two quantities are calculated for the selected case samples: the difference on this characteristic attribute within the correct class (between each randomly selected case sample and its nearest case samples) and the difference across the wrong classes (between each randomly selected case sample and its neighbor case samples). Each is averaged, and the two averages are then combined to obtain the weight.

In one embodiment, preferably, the formula for calculating diff(A, R_ti, H_tij) is:

where R_ti(A) is the value on characteristic attribute A of the t-th case sample randomly selected from the i-th class case sample database; H_tij(A) is the value on characteristic attribute A of the j-th nearest case sample, found in the i-th class case sample database, of the t-th case sample randomly selected from the i-th class case sample database; max(A) is the maximum value of characteristic attribute A among the collected case samples; and min(A) is the minimum value of characteristic attribute A among the collected case samples.

Specifically, in the above formula, after the difference between case samples R_ti and H_tij on characteristic attribute A is obtained, it is further divided by the difference between the maximum and minimum values of characteristic attribute A. This applies dimensional normalization to the weight of characteristic attribute A, which makes subsequent use of the weight easier.

In one embodiment, preferably, the formula for calculating diff(A, R_ti, M_tij(C)) is:

Similarly, in the above formula, after the difference between case samples R_ti and M_tij(C) on characteristic attribute A is obtained, it is further divided by the difference between the maximum and minimum values of characteristic attribute A. This applies dimensional normalization to the weight of characteristic attribute A, which makes subsequent use of the weight easier.

For example, when calculating the weight of the characteristic attribute "burning floor", the number of the burning floor is each case sample's value on this attribute. The differences between R_ti(A) and H_tij(A) are calculated, summed and averaged; the differences between R_ti(A) and M_tij(C)(A) are likewise calculated, summed and averaged; and the two averages are combined to obtain the weight of this characteristic attribute.

As another example, when normalization is needed, each difference between R_ti(A) and H_tij(A) is divided by the difference between the maximum and minimum burning floor before being summed; the same is done for R_ti(A) and M_tij(C)(A); and the two results are then combined to obtain the weight of this characteristic attribute.
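The burning-floor calculation can be sketched for a single attribute as follows. Several points are assumptions rather than statements from the source: the difference is taken as an absolute value, the average nearest-sample difference is subtracted and the average neighbor-sample difference is added, and the initial weight defaults to 0. The function names and the toy floor numbers are hypothetical.

```python
def diff(value_r, value_other, attr_min, attr_max):
    """Normalized difference on one attribute (absolute value is an assumption)."""
    span = attr_max - attr_min
    return abs(value_r - value_other) / span if span else 0.0


def attribute_weight(selections, attr_min, attr_max, initial=0.0):
    """selections: list of (r, nearest_values, neighbor_values) for one attribute."""
    hit = [diff(r, h, attr_min, attr_max) for r, hits, _ in selections for h in hits]
    miss = [diff(r, x, attr_min, attr_max) for r, _, misses in selections for x in misses]

    def mean(values):
        return sum(values) / len(values) if values else 0.0

    # Assumed sign convention: small differences to same-class samples and large
    # differences to other-class samples both raise the weight.
    return initial - mean(hit) + mean(miss)


# Toy burning-floor values: floors range from 0 to 30 in the collected samples.
selections = [
    (10, [9, 11], [20, 30]),
    (20, [19, 21], [10, 0]),
]
weight = attribute_weight(selections, attr_min=0, attr_max=30)
```

In this toy data every nearest case sample differs by one floor (1/30 after normalization) and the neighbor case samples differ by 10 or 20 floors (mean 0.5), so the weight is 0.5 - 1/30.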

In the present embodiment, the weight of characteristic attribute is unrelated with its initial weight, according only to from all kinds of case sample databases Several case samples and its nearest case sample, the neighbour's case sample extracted asks flat after calculating respective weight , influence of the case sample distribution difference to the weight of characteristic attribute is reduced, to obtain the power of more effective characteristic attribute Weight, when being that the weight for the characteristic attribute that subsequent calls are calculated in this way carries out Case Retrieval, search result has more Science lays the foundation.

As an improvement on the above embodiment, in another embodiment of the present invention, as shown in Fig. 2, a weight calculation method for characteristic attributes comprises the following steps:

S201: collect case samples;

S202: cluster the collected case samples according to a preset characteristic attribute to obtain m classes of case sample databases; optionally, the preset characteristic attribute is the fire grade;

S203: randomly select n case samples from each class of case sample database;

S204: for each randomly selected case sample, find its k nearest case samples in the case sample database of the class to which it belongs, and find k neighbor case samples in each case sample database of the other classes; here, a nearest case sample is a case sample that is closest to the randomly selected case sample within the case sample database of its own class, and a neighbor case sample is a case sample that is closest to the randomly selected case sample within a case sample database of another class;

S205: when there are multiple characteristic attributes, calculate the weight of each characteristic attribute separately from the n case samples randomly selected from each class of case sample database and, for each of these case samples, its k nearest case samples and (m-1)k neighbor case samples; here, m, n, and k are all integers greater than or equal to 1.
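The search in S204 can be sketched as below. The section does not state which distance measure is used, so Euclidean distance over numeric attribute tuples is an assumption here, as are the helper names `k_closest` and `find_neighbors`, the class labels, and the sample values.

```python
import math


def k_closest(target, candidates, k):
    """Return the k candidates closest to target (Euclidean distance is assumed)."""
    others = [c for c in candidates if c is not target]  # skip the sample itself
    return sorted(others, key=lambda c: math.dist(target, c))[:k]


def find_neighbors(sample, own_class, classes, k):
    """k nearest samples from the sample's own class, k neighbor samples per other class."""
    nearest = k_closest(sample, classes[own_class], k)
    neighbors = {
        label: k_closest(sample, members, k)
        for label, members in classes.items()
        if label != own_class
    }
    return nearest, neighbors


# Hypothetical case sample databases, one list of attribute tuples per class.
classes = {
    "minor": [(1.0, 2.0), (1.5, 2.5), (4.0, 4.0)],
    "major": [(8.0, 9.0), (7.0, 7.5), (10.0, 12.0)],
}
sample = classes["minor"][0]
nearest, neighbors = find_neighbors(sample, "minor", classes, k=2)
print(nearest)    # [(1.5, 2.5), (4.0, 4.0)]
print(neighbors)  # {'major': [(7.0, 7.5), (8.0, 9.0)]}
```

With m classes this returns k nearest case samples and (m-1)k neighbor case samples for each randomly selected case sample, matching the counts used in S205.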

Specifically, in S205 the formula for calculating the weight of one characteristic attribute is:

The weight of each characteristic attribute is calculated from the randomly selected case samples together with their nearest case samples and neighbor case samples.

For example, if the weights of three characteristic attributes are required, the above formula is applied three times, each time on a different characteristic attribute, and each application yields the weight of one characteristic attribute.
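The repetition over attributes can be illustrated with a short sketch. The weight formula itself is not reproduced in this text, so the code substitutes a plain ReliefF-style average (mean neighbor difference minus mean nearest difference, scaled by the attribute range); that substitution, the function names, the attribute names, and all values are assumptions for illustration only.

```python
def attribute_weight(attr, triples, attr_min, attr_max):
    """ReliefF-style weight of one attribute (assumed form, not the patent's formula).

    triples: list of (sample, nearest_samples, neighbor_samples), each case a dict.
    """
    span = (attr_max - attr_min) or 1.0  # avoid dividing by zero for constant attributes
    total = 0.0
    for sample, nearest, neighbors in triples:
        hit = sum(abs(sample[attr] - h[attr]) for h in nearest) / len(nearest)
        miss = sum(abs(sample[attr] - m[attr]) for m in neighbors) / len(neighbors)
        total += (miss - hit) / span
    return total / len(triples)  # average over the randomly selected samples


def all_weights(attrs, triples, ranges):
    """Apply the same calculation once per attribute, reusing the same samples."""
    return {a: attribute_weight(a, triples, *ranges[a]) for a in attrs}


# Hypothetical data: one selected sample with one nearest and one neighbor sample.
sample = {"floor": 10, "area": 200, "trapped": 2}
nearest = [{"floor": 12, "area": 500, "trapped": 2}]
neighbors = [{"floor": 30, "area": 260, "trapped": 6}]
ranges = {"floor": (0, 40), "area": (0, 1000), "trapped": (0, 10)}

weights = all_weights(["floor", "area", "trapped"], [(sample, nearest, neighbors)], ranges)
print(weights)
```

Under this assumed form, an attribute that separates the classes well (large neighbor difference, small nearest difference) receives a larger weight, and the three weights come from three passes over the same selected samples.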

It should be noted that diff(A, R_ti, H_tij) and diff(A, R_ti, M_tij(C)) are calculated with the same formulas as in the above embodiment, which are not repeated here.

In the present embodiment, when the weights of multiple characteristic attributes are required, the randomly selected case samples can be reused in the weight calculation of every characteristic attribute, and the weight of each characteristic attribute is obtained simply by repeating the calculation. The calculation is convenient and fast, and the weights obtained for the characteristic attributes are more representative.

It should be understood that, in the above embodiments, the magnitude of a step number does not imply an order of execution. The order in which the steps are executed should be determined by their function and internal logic, and shall not constitute any limitation on the implementation of the embodiments of the present invention.

Fig. 3 is a schematic diagram of the weight calculation device 3 for characteristic attributes provided by the present application. For ease of description, only the parts relevant to the embodiments of the present application are shown.

The weight calculation device for characteristic attributes may be a software unit, a hardware unit, or a combined software and hardware unit built into a terminal device, or it may be integrated into the terminal device as an independent add-on.

The weight calculation device for characteristic attributes includes:

Collection module 31, configured to collect case samples.

Preferably, after collecting the case samples, the collection module 31 may clean the collected data by removing duplicate and redundant fields and fields unrelated to the representation of the case samples, and may encode the characteristic attributes of the case samples to facilitate subsequent use.
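The cleaning and encoding performed by the collection module can be sketched as follows. The function name `clean_and_encode`, the field names, and the category codes are hypothetical; the source does not specify a record layout or an encoding scheme.

```python
def clean_and_encode(records, keep_fields, encodings):
    """Keep only relevant fields, encode categorical values, and drop duplicates."""
    seen = set()
    cleaned = []
    for record in records:
        # Drop fields that are unrelated to the representation of the case sample.
        row = {field: record[field] for field in keep_fields}
        # Encode categorical characteristic attributes as integers.
        for field, mapping in encodings.items():
            row[field] = mapping[row[field]]
        # Remove repeated case samples.
        key = tuple(row[field] for field in keep_fields)
        if key not in seen:
            seen.add(key)
            cleaned.append(row)
    return cleaned


# Hypothetical raw records; "operator_note" stands in for an unrelated field.
records = [
    {"building_type": "residential", "burning_floor": 12, "operator_note": "call 1"},
    {"building_type": "residential", "burning_floor": 12, "operator_note": "call 2"},
    {"building_type": "warehouse", "burning_floor": 1, "operator_note": ""},
]
cleaned = clean_and_encode(
    records,
    keep_fields=["building_type", "burning_floor"],
    encodings={"building_type": {"residential": 0, "warehouse": 1}},
)
print(cleaned)
```

Encoding the attributes as numbers up front lets the later distance and difference calculations work directly on the stored case samples.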

Clustering module 32, configured to cluster the collected case samples according to a preset characteristic attribute to obtain m classes of case sample databases.

Extraction module 33, configured to randomly select n case samples from each class of case sample database.

Specifically, the above step amounts to selecting n case samples from the case sample database of every class. In this embodiment, the selection may be made either 1) by directly selecting n case samples at once from each class of case sample database, or 2) by selecting one case sample at a time from each class of case sample database and repeating this n times.
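The two selection modes can be sketched as below. The function names are hypothetical, and the second mode is written without replacement so that the same case sample is not selected twice; the source does not say whether repeats are allowed, so that choice is an assumption.

```python
import random


def draw_at_once(class_db, n, rng):
    """Mode 1: select n case samples from one class database in a single draw."""
    return rng.sample(class_db, n)


def draw_one_by_one(class_db, n, rng):
    """Mode 2: select one case sample at a time and repeat n times (no repeats assumed)."""
    remaining = list(class_db)  # copy so the class database itself is left unchanged
    drawn = []
    for _ in range(n):
        drawn.append(remaining.pop(rng.randrange(len(remaining))))
    return drawn


# Hypothetical class database holding 20 case sample identifiers.
class_db = list(range(20))
batch = draw_at_once(class_db, 5, random.Random(0))
stepwise = draw_one_by_one(class_db, 5, random.Random(0))
print(len(batch), len(stepwise))  # 5 5
```

Both modes yield n distinct case samples per class; applying either one to each of the m class databases gives the m times n samples used in the weight calculation.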

Searching module 34, configured to, for each randomly selected case sample, find its k nearest case samples in the case sample database of the class to which it belongs and find k neighbor case samples in each case sample database of the other classes. Here, a nearest case sample is a case sample that is closest to the randomly selected case sample within the case sample database of its own class, and a neighbor case sample is a case sample that is closest to the randomly selected case sample within a case sample database of another class.

Computing module 35, configured to calculate the weight of the characteristic attribute from the n case samples randomly selected from each class of case sample database and their corresponding k nearest case samples and (m-1)k neighbor case samples. Here, m, n, and k are all integers greater than or equal to 1.

The formula for calculating the weight of the characteristic attribute from the n case samples randomly selected from each class of case sample database and, for each of these case samples, its k nearest case samples and (m-1)k neighbor case samples is:

For specific examples, refer to the corresponding method embodiment; they are not repeated here.

In the present embodiment, the weight of a characteristic attribute is independent of its initial weight. It is obtained only from the case samples selected from each class of case sample database, together with their nearest case samples and neighbor case samples, by calculating the respective weights and averaging them. This reduces the influence of differences in case sample distribution on the weight of the characteristic attribute and yields a more effective feature weight, which lays the foundation for more scientific search results when the weights calculated in this way are later used for case retrieval.

As an improvement on the above embodiment, in another device embodiment of the present invention, the weight calculation device 3 for characteristic attributes includes:

Collection module 31, configured to collect case samples;

Clustering module 32, configured to cluster the collected case samples according to a preset characteristic attribute to obtain m classes of case sample databases; optionally, the preset characteristic attribute is the fire grade;

Extraction module 33, configured to randomly select n case samples from each class of case sample database;

Searching module 34, configured to, for each randomly selected case sample, find its k nearest case samples in the case sample database of the class to which it belongs and find k neighbor case samples in each case sample database of the other classes; here, a nearest case sample is a case sample that is closest to the randomly selected case sample within the case sample database of its own class, and a neighbor case sample is a case sample that is closest to the randomly selected case sample within a case sample database of another class;

Computing module 35, configured to, when there are multiple characteristic attributes, calculate the weight of each characteristic attribute separately from the n case samples randomly selected from each class of case sample database and, for each of these case samples, its k nearest case samples and (m-1)k neighbor case samples; here, m, n, and k are all integers greater than or equal to 1.

Specifically, the formula used by the computing module 35 to calculate the weight of one characteristic attribute is:

It will be clear to those skilled in the art that, for convenience and brevity of description, the division into the above program modules is given only as an example. In practical applications, the above functions may be assigned to different program modules as required; that is, the internal structure of the device may be divided into different program units or modules to perform all or part of the functions described above. The program modules in the embodiments may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one processing unit. The integrated unit may be implemented in hardware or as a software program unit. In addition, the specific names of the program modules serve only to distinguish them from one another and are not intended to limit the protection scope of the present application.

Fig. 5 is a structural schematic diagram of the terminal device 5 provided in one embodiment of the present invention. As shown in Fig. 5, the terminal device 5 of this embodiment includes a processor 53, a memory 51, and a computer program 52 that is stored in the memory 51 and can run on the processor 53, for example a weight calculation program for characteristic attributes. When executing the computer program 52, the processor 53 implements the steps of the above embodiments of the weight calculation method for characteristic attributes, or alternatively implements the functions of the modules in the above embodiments of the weight calculation device for characteristic attributes.

The terminal device 5 may be a desktop computer, notebook computer, palmtop computer, tablet computer, mobile phone, or similar device. The terminal device 5 may include, but is not limited to, the processor 53 and the memory 51. Those skilled in the art will understand that Fig. 5 is only an example of the terminal device and does not limit the terminal device 5, which may include more or fewer components than shown, combine certain components, or use different components. For example, the terminal device may further include input/output devices, a display device, a network access device, a bus, and the like.

The processor 53 may be a central processing unit (Central Processing Unit, CPU), or another general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and the like. The general-purpose processor may be a microprocessor or any conventional processor.

The memory 51 may be an internal storage unit of the terminal device 5, for example the hard disk or internal memory of the terminal device. The memory may also be an external storage device of the terminal device, for example a plug-in hard disk, a smart media card (Smart Media Card, SMC), a secure digital (Secure Digital, SD) card, or a flash card (Flash Card) fitted to the terminal device. Further, the memory 51 may include both an internal storage unit of the terminal device 5 and an external storage device. The memory 51 is used to store the computer program 52 and other programs and data required by the terminal device 5. The memory may also be used to temporarily store data that has been output or is about to be output.

In the above embodiments, the description of each embodiment has its own emphasis. For parts that are not detailed or recorded in one embodiment, refer to the related descriptions of the other embodiments.

Those of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware or in a combination of computer software and electronic hardware. Whether these functions are executed in hardware or in software depends on the specific application and the design constraints of the technical solution. Skilled practitioners may use different methods to implement the described functions for each specific application, but such implementations should not be considered to go beyond the scope of the present application.

In the embodiments provided herein, it should be understood that the disclosed device/terminal device and method may be implemented in other ways. For example, the device/terminal device embodiments described above are only schematic. The division into modules or units is only a division by logical function, and other divisions are possible in an actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical, mechanical, or of another form.

Units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in hardware or as a software functional unit.

If the integrated module/unit is implemented as a software functional unit and sold or used as an independent product, it may be stored in a computer readable storage medium. Based on this understanding, all or part of the processes in the methods of the above embodiments may also be completed by a computer program instructing the relevant hardware. The computer program may be stored in a computer readable storage medium and, when executed by a processor, implements the steps of the above method embodiments. The computer program includes computer program code, which may be in source code form, object code form, an executable file, or some intermediate form. The computer readable storage medium may include any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, a computer memory, a read-only memory (ROM, Read-Only Memory), a random access memory (RAM, Random Access Memory), an electrical carrier signal, a telecommunication signal, a software distribution medium, and the like. It should be noted that the content covered by the computer readable storage medium may be increased or decreased as appropriate according to the requirements of legislation and patent practice in a jurisdiction; for example, in some jurisdictions, according to legislation and patent practice, computer readable media do not include electrical carrier signals and telecommunication signals.

It should be noted that the above embodiments may be freely combined as needed. The above are only preferred embodiments of the present invention. It should be pointed out that those of ordinary skill in the art may make several improvements and modifications without departing from the principle of the present invention, and these improvements and modifications should also be regarded as falling within the protection scope of the present invention.

Claims

1. A weight calculation method for characteristic attributes, characterized by comprising the following steps:

collecting case samples;

clustering the collected case samples according to a preset characteristic attribute to obtain m classes of case sample databases;

randomly selecting n case samples from each class of case sample database;

for each randomly selected case sample, finding its k nearest case samples in the case sample database of the class to which it belongs, and finding k neighbor case samples in each case sample database of the other classes;

wherein a nearest case sample is a case sample that is closest to the randomly selected case sample within the case sample database of the class to which the randomly selected case sample belongs, and a neighbor case sample is a case sample that is closest to the randomly selected case sample within a case sample database of another class;

calculating the weight of the characteristic attribute from the n case samples randomly selected from each class of case sample database and their corresponding k nearest case samples and (m-1)k neighbor case samples;

wherein m, n, and k are all integers greater than or equal to 1.

2. The weight calculation method for characteristic attributes according to claim 1, characterized in that the preset characteristic attribute is the fire grade.

3. The weight calculation method for characteristic attributes according to claim 1, characterized in that:

when there are multiple characteristic attributes, the weight of each characteristic attribute is calculated separately from the n case samples randomly selected from each class of case sample database and their corresponding k nearest case samples and (m-1)k neighbor case samples.

4. The weight calculation method for characteristic attributes according to claim 1, characterized in that the formula for calculating the weight of the characteristic attribute from the n case samples randomly selected from each class of case sample database and their corresponding k nearest case samples and (m-1)k neighbor case samples is:

wherein W'(A) is the weight of characteristic attribute A; W(A) is the initial weight of characteristic attribute A; R_ti is the t-th case sample randomly selected from the i-th class case sample database; H_tij is the j-th nearest case sample, found in the i-th class case sample database, of the t-th randomly selected case sample; M_tij(C) is the j-th neighbor case sample, found in the C-th class case sample database, of the t-th case sample randomly selected from the i-th class case sample database, the C-th class case sample database being different from the i-th class case sample database; diff(A, R_ti, H_tij) denotes the difference between case sample R_ti and case sample H_tij on characteristic attribute A; and diff(A, R_ti, M_tij(C)) denotes the difference between case sample R_ti and case sample M_tij(C) on characteristic attribute A.

5. The weight calculation method for characteristic attributes according to claim 4, characterized in that the formula for calculating diff(A, R_ti, H_tij) is:

wherein R_ti(A) is the value on characteristic attribute A of the t-th case sample randomly selected from the i-th class case sample database; H_tij(A) is the value on characteristic attribute A of the j-th nearest case sample, found in the i-th class case sample database, of the t-th randomly selected case sample; max(A) is the maximum value of characteristic attribute A among the collected case samples; and min(A) is the minimum value of characteristic attribute A among the collected case samples.

6. The weight calculation method for characteristic attributes according to claim 4, characterized in that the formula for calculating diff(A, R_ti, M_tij(C)) is:

wherein R_ti(A) is the value on characteristic attribute A of the t-th case sample randomly selected from the i-th class case sample database; M_tij(C)(A) is the value on characteristic attribute A of the j-th neighbor case sample, found in the C-th class case sample database, of the t-th case sample randomly selected from the i-th class case sample database, the C-th class case sample database being different from the i-th class case sample database; max(A) is the maximum value of characteristic attribute A among the collected case samples; and min(A) is the minimum value of characteristic attribute A among the collected case samples.

7. A weight calculation device for characteristic attributes, characterized by comprising:

a collection module, configured to collect case samples;

a clustering module, configured to cluster the collected case samples according to a preset characteristic attribute to obtain m classes of case sample databases;

an extraction module, configured to randomly select n case samples from each class of case sample database;

a searching module, configured to, for each randomly selected case sample, find its k nearest case samples in the case sample database of the class to which it belongs and find k neighbor case samples in each case sample database of the other classes;

a computing module, configured to calculate the weight of the characteristic attribute from the n case samples randomly selected from each class of case sample database and their corresponding k nearest case samples and (m-1)k neighbor case samples;

wherein m, n, and k are all integers greater than or equal to 1.

8. The weight calculation device for characteristic attributes according to claim 7, characterized in that:

the computing module is further configured to, when there are multiple characteristic attributes, calculate the weight of each characteristic attribute separately from the n case samples randomly selected from each class of case sample database and their corresponding k nearest case samples and (m-1)k neighbor case samples.

9. A terminal device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor, when running the computer program, implements the steps of the weight calculation method for characteristic attributes according to any one of claims 1-6.

10. A computer readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the steps of the weight calculation method for characteristic attributes according to any one of claims 1-6.