CN102937985A

CN102937985A - Method for classifying, optimizing and analyzing website based on user mental model

Info

Publication number: CN102937985A
Application number: CN2012104137748A
Authority: CN
Inventors: 吴鹏; 张佩佩
Original assignee: Nanjing University of Science and Technology
Current assignee: Nanjing University of Science and Technology
Priority date: 2012-10-25
Filing date: 2012-10-25
Publication date: 2013-02-20
Anticipated expiration: 2032-10-25
Also published as: CN102937985B

Abstract

The invention discloses a website classification optimization analysis method based on the user mental model. The method comprises the following steps. First, the website log data are preprocessed; the log data contain concepts that users have submitted for optimizing the website classification catalogue, based on their own cognition of that catalogue, and preprocessing extracts these concepts from the log data. Next, the co-occurrence relations between the user-submitted concepts and the concepts in the website classification catalogue are determined using mental model classification theory, where a concept is a specific category name in the website classification catalogue, such as "books" or "daily articles". The co-occurrence relations are then converted into a co-occurrence matrix, and the co-occurrence matrix is converted into a similarity matrix using the Pearson correlation coefficient. Finally, cluster analysis and multidimensional scaling analysis are carried out to analyze the similarity and spatiality among the concepts in the users' cognition of the website classification catalogue. Through these six steps, the method provides quantitative decision support for optimizing the website classification catalogue based on the mental model of the website's users.

Description

Website classification optimization analysis method based on the user mental model

Technical field

The present invention relates to a website classification optimization analysis method, and in particular to a website classification optimization analysis method based on the user mental model.

Background technology

Optimizing a website's information classification system means assessing the existing classification system, deciding whether it needs to be adjusted and, if so, determining how to adjust it. Research on optimization methods for website information classification systems is still scarce. It concentrates mainly on theoretical questions such as the basis, standards and principles of classification, and on specific problems such as classification depth and granularity. It also tends only to look for defects in existing classification methods; few studies explore the problem in depth with the support of an established theoretical method.

In his book "The Design of Everyday Things", Norman first proposed that three kinds of mental model exist in interaction design: the representation model, the user mental model and the system model. He argued that the closer the representation model is to the user mental model, the better users understand how a website is organized and the more efficiently they can acquire information. Therefore, when optimizing a website's information classification system, the user's mental model, that is, the user's cognition of the website classification system, should be examined as far as possible.

In measuring the mental model of website users, psychological proximity data are subjective evaluations of the relations between concepts as perceived by individuals, with "similarity" and "spatiality" as the main angles of measurement. Most quantitative measurements of mental models start from concept similarity: the relevant concepts of the research topic are extracted, experiments are designed with different classification methods, the subjects' similarity assessments of the concepts are collected, and the data are analyzed to characterize the users' mental model of the topic. Cluster analysis is typically used to process concept similarity data and to classify concepts according to their similarity. The spatiality of concepts refers to the relative positions of different concepts in the subjects' psychological space (Rusbult C.E., Onizuka R.K., Lipkus I. What Do We Really Want: Mental Models of Ideal Romantic Involvement Explored through Multidimensional Scaling [J]. Journal of Experimental Social Psychology, 1993, 29(9): 493-527). Multidimensional scaling can be used to measure concept spatiality, yielding a spatial representation of the concepts from which the users' mental model of a given domain can be observed directly. No existing research, however, combines the two to measure the user mental model for optimizing website classification catalogues.

Current research on website users also remains at the stage of traditional user surveys. The survey methods used mainly include scenario methods, focus groups, usability testing, in-depth interviews and observation. These methods all have limitations: the valid data they yield are very limited and the cost is high, and because a survey conducted this way cannot cover many questions, the information obtained is too coarse to provide the detailed information that is really useful for user behaviour research.

Existing website classification system optimization methods therefore still have the following problems: (1) it is difficult to carry out effective user research and to collect users' cognition of a website comprehensively; (2) website classification system optimization is seldom carried out in a "user-centered" way.

Summary of the invention

The technical problem solved by the present invention is to provide a website classification optimization analysis method based on the user mental model.

The technical solution that achieves the object of the invention is a website classification optimization analysis method based on the user mental model, with the following steps:

Step 1, preprocess the website log data, specifically:

Step 1-1, clean the website log data by deleting data in the log file that are irrelevant to the purpose of the analysis or that contain errors. Data irrelevant to the purpose of the analysis include data that contain concepts already in the classification catalogue and data in which products are represented by codes; data containing errors include misspellings and incorrect product descriptions. Then select the attributes needed for data processing: user name, user region, user cognitive concept and product category. A user cognitive concept is a concept for optimizing the website classification catalogue that a user submits based on their cognition of that catalogue;

Step 1-2, convert the format of the data cleaned in step 1-1 so that the extracted user cognitive concept, region and name attributes have a uniform format, specifically by removing numbering, unifying letter case and unifying singular and plural forms;

Step 1-3, determine the frequency of each user cognitive concept and then set a threshold, which is chosen according to the actual data volume and the number of extracted user cognitive concepts; select the user cognitive concepts whose frequency is greater than the threshold and record their frequencies;

Step 2, determine whether user cognitive concepts co-occur with concepts in the website classification catalogue: applying mental model classification theory, submit each user cognitive concept to the website as a search keyword and count the classification catalogue concepts that appear in the search results together with their frequencies;

Step 3, generate the co-occurrence matrix. The co-occurrence matrix is symmetric; its first row and first column list the concepts, comprising both the user cognitive concepts and the concepts in the website classification catalogue, and each remaining cell holds the co-occurrence frequency between the concept in its row and the concept in its column;

Step 4, generate the similarity matrix from the co-occurrence matrix of step 3;

Step 5, carry out cluster analysis on the basis of step 4: cluster the similarity matrix using hierarchical clustering and then determine the clustering result of the concepts from the clustering statistics, where the concepts comprise the extracted user cognitive concepts and the concepts in the website classification catalogue;

Step 6, analyze the similarity matrix of step 4 using multidimensional scaling to obtain the multidimensional scaling space diagram of the corresponding dimension, thereby completing the website classification optimization analysis.

Compared with the prior art, the present invention has the following notable advantages: (1) it uses website log data directly for user research, which saves the cost of user surveys and allows user information to be obtained comprehensively; (2) it uses quantitative calculation, so the results are accurate and the final analysis results can directly provide a basis for optimizing the website classification system; (3) cluster analysis and multidimensional scaling represent the two key aspects of the user mental model, "similarity" and "spatiality", and the two sets of results can verify each other and present the outcome visually.

The present invention is described in further detail below with reference to the accompanying drawings.

Description of drawings

Fig. 1 is the flow chart of the website classification optimization analysis method based on the user mental model according to the present invention.

Fig. 2 is the clustering tree of the second-level classification catalogue concepts and the custom group names.

Fig. 3 is the multidimensional scaling space diagram of the second-level classification catalogue concepts and the custom group names.

Embodiment

A website classification optimization analysis method based on the user mental model, with the following steps:

Step 1, preprocess the website log data, specifically:

Step 1-1, clean the website log data by deleting data in the log file that are irrelevant to the purpose of the analysis or that contain errors. Data irrelevant to the purpose of the analysis include data that contain concepts already in the classification catalogue and data in which products are represented by codes; data containing errors include misspellings and incorrect product descriptions. Then select the attributes needed for data processing: user name, user region, user cognitive concept and product category. A user cognitive concept is a concept for optimizing the website classification catalogue that a user submits based on their cognition of that catalogue: when a user browsing the classification catalogue cannot find the most suitable concept, the user submits, through the website's interactive interface, the concept that they consider more appropriate. For example, a user on Joyo.com uses the classification catalogue to look for books on data mining and finds that such books belong to the "database" category. The user considers this inappropriate and thinks that data mining should appear directly as a category in the classification catalogue; "data mining" is then a user cognitive concept.

Step 1-3, determine the frequency of each user cognitive concept and then set a threshold, which is chosen according to the actual data volume and the number of extracted user cognitive concepts. For example, when both the actual data volume and the number of user cognitive concepts are small, a smaller threshold can be set in order to retain a sufficient amount of data. Select the user cognitive concepts whose frequency is greater than the threshold and record their frequencies;
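The format unification and frequency threshold of Step 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in the embodiment: the function names `normalize_concept` and `frequent_concepts`, the numbering pattern, the underscore joining and the naive plural rule are all hypothetical choices made for the example.

```python
import re
from collections import Counter


def normalize_concept(raw: str) -> str:
    """Remove numbering, unify case, and unify singular/plural forms."""
    text = raw.strip().lower()
    # Remove leading numbering such as "01.", "3)" or "2 -" (assumed pattern).
    text = re.sub(r"^\d+\s*[.)\-:]+\s*", "", text)
    # Join words with underscores so that spacing differences do not matter.
    text = re.sub(r"\s+", "_", text)
    # Naive plural rule (assumption): drop one trailing "s", but keep "ss".
    if len(text) > 3 and text.endswith("s") and not text.endswith("ss"):
        text = text[:-1]
    return text


def frequent_concepts(raw_concepts, threshold):
    """Count normalized concepts and keep those with frequency > threshold."""
    counts = Counter(normalize_concept(c) for c in raw_concepts if c.strip())
    return {concept: n for concept, n in counts.items() if n > threshold}
```

With a threshold of 2, for instance, `frequent_concepts(["LED Lights", "led light", "1. LED lights", "Torch"], 2)` keeps only `led_light`, with a recorded frequency of 3.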

According to mental model classification theory, when users acquire information on a website through the classification catalogue, they mainly click in horizontal, vertical or balanced horizontal-vertical patterns, and during clicking they choose the concepts that are most strongly correlated according to the correlation between concepts in the classification catalogue. Applying this principle, each user cognitive concept is submitted to the website as a search keyword, and the classification catalogue concepts that appear in the search results are counted together with their frequencies, in order to analyze the correlation between the user cognitive concepts and the concepts in the website classification catalogue.

Mental model classification theory comes from experiments by Charles Cole, Yang Lin et al., who found that the three most common types of human mental model are the vertical type (26%), the horizontal type (31%) and the equal type (21%), which together account for 78% of people's mental model types. The classification of a mental model is determined by the number of levels and the number of regions in the mental model diagram. The features of the three common types are as follows:

● Vertical: a mental model with more levels in the vertical dimension than in the horizontal dimension

● Horizontal: a mental model with more levels in the horizontal dimension than in the vertical dimension

● Equal: a mental model with the same number of levels in the vertical and horizontal dimensions

This theory is extended to the process in which users acquire information through the classification catalogue: it is assumed that, when users acquire information on a website through the classification catalogue, they likewise click in vertical, horizontal or balanced horizontal-vertical patterns.

The co-occurrence frequencies between the concepts in the co-occurrence matrix are determined as follows:

Step 3-1, determine the co-occurrence frequency between user cognitive concepts and concepts in the classification catalogue, which falls into two cases. The first is the co-occurrence frequency between a user cognitive concept and a concept in the second-level classification catalogue, denoted F₁:

F₁ = p*x

where p is the frequency with which the second-level classification catalogue concept appears in the search results, and x is the frequency of the user cognitive concept.

The second is the co-occurrence frequency between a user cognitive concept and a concept in the third-level classification catalogue, which is the frequency of the user cognitive concept;

Step 3-2, determine the co-occurrence frequency between two concepts in the classification catalogue: for every user cognitive concept, take the smaller of its co-occurrence frequencies with the two catalogue concepts, then sum these minima. The result is denoted F₂. With m and n denoting the co-occurrence frequencies of catalogue concepts A and B with a user cognitive concept, the formula is:

F₂ = SUM(MIN(m, n))

Step 3-3, determine the co-occurrence frequency between user cognitive concepts: the co-occurrence frequency between any two user cognitive concepts is 0.
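Steps 3-1 to 3-3 can be sketched in Python for the second-level catalogue case. This is a hedged illustration only: the function name `build_cooccurrence` and the input layout (`catalogue`, `user_freq`, `hits`) are assumptions made for the example, and leaving the diagonal at 0 is likewise an assumption, since the text does not define a concept's co-occurrence with itself.

```python
def build_cooccurrence(catalogue, user_freq, hits):
    """Return (labels, matrix) for the symmetric co-occurrence matrix.

    catalogue: list of second-level catalogue concepts
    user_freq: {user concept: frequency x}
    hits:      {user concept: {catalogue concept: frequency p in search results}}
    """
    labels = list(catalogue) + list(user_freq)
    index = {name: i for i, name in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]

    # Step 3-1: F1 = p * x for each (user concept, catalogue concept) pair.
    f1 = {
        user: {c: hits.get(user, {}).get(c, 0) * x for c in catalogue}
        for user, x in user_freq.items()
    }
    for user, row in f1.items():
        for c, value in row.items():
            matrix[index[user]][index[c]] = value
            matrix[index[c]][index[user]] = value

    # Step 3-2: F2 = SUM(MIN(m, n)), summed over all user concepts.
    for i, a in enumerate(catalogue):
        for b in catalogue[i + 1:]:
            value = sum(min(row[a], row[b]) for row in f1.values())
            matrix[index[a]][index[b]] = value
            matrix[index[b]][index[a]] = value

    # Step 3-3: user-user cells stay 0; the diagonal is also left at 0 (assumption).
    return labels, matrix
```

For the third-level catalogue case the same sketch applies with p fixed at 1 for every catalogue concept that appears, because the co-occurrence frequency is then simply the frequency of the user cognitive concept.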

The similarity matrix is generated using the Pearson correlation coefficient as the similarity measure, with the formula

r = \frac{\sum_{i=1}^{n} (X_{i} - \bar{X})(Y_{i} - \bar{Y})}{\sqrt{\sum_{i=1}^{n} (X_{i} - \bar{X})^{2}} \sqrt{\sum_{i=1}^{n} (Y_{i} - \bar{Y})^{2}}}

where r measures the strength of the linear correlation between the two variables and in general satisfies 0 ≤ |r| ≤ 1, n is the sample size, and X_{i}, Y_{i} and \bar{X}, \bar{Y} are, respectively, the observed values and the means of the two variables.
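The Pearson calculation can be written directly from the formula. The sketch below is illustrative: the names `pearson` and `similarity_matrix` are hypothetical, each row of the co-occurrence matrix is treated as one variable, and returning 0.0 for a constant row (where the coefficient is undefined) is an assumption of the example rather than something the source specifies.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # assumption: treat an undefined coefficient as no correlation
    return cov / (sd_x * sd_y)


def similarity_matrix(cooccurrence):
    """Correlate every pair of rows of the co-occurrence matrix."""
    size = len(cooccurrence)
    return [
        [1.0 if i == j else pearson(cooccurrence[i], cooccurrence[j])
         for j in range(size)]
        for i in range(size)
    ]
```

Two rows that rise and fall together give a coefficient close to 1, while rows that move in opposite directions give a coefficient close to -1.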

The similarity matrix is clustered using hierarchical clustering, and the clustering result of the concepts is then determined from the clustering statistics, specifically through the following steps:

Step 5-1, determine the distances between samples to form a symmetric distance matrix. Let T_{i} and T_{j} denote samples i and j, and let d(T_{i}, T_{j}), abbreviated d_{ij}, denote the distance between them. The variance-weighted distance formula used is

d_{ij} = \left[\sum_{k=1}^{p} \frac{(T_{ik} - T_{jk})^{2}}{S_{k}^{2}}\right]^{\frac{1}{2}}

where S_{k}^{2} is the variance of the k-th variable. Treat the N samples as N classes. Let M_{p} and M_{q} denote two classes containing N_{p} and N_{q} samples respectively, with the distance between M_{p} and M_{q} denoted D_{pq}. Compute the distance between every pair of samples to form a symmetric distance matrix D(0);

Step 5-2, merge classes and generate a new distance matrix: select the smallest off-diagonal element of D(0), say D_{pq}, where at this point M_{p} = {X_{p}} and M_{q} = {X_{q}}; merge M_{p} and M_{q} into a new class M_{r} = {X_{p}, X_{q}}; delete the rows and columns of D(0) corresponding to M_{p} and M_{q}, and add a row and a column holding the distances between the new class M_{r} and each of the remaining classes not yet merged, giving a new distance matrix D(1), which is a square matrix of order N-1;

Step 5-3, repeat step 5-2 until the N samples are merged into a single class;

Step 5-4, determine the clustering result of the concepts from the statistics of the hierarchical clustering method, which comprise the R² statistic, the semi-partial R² statistic, the pseudo-F statistic and the pseudo-t² statistic.
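Steps 5-1 to 5-3 can be sketched as a small agglomerative procedure. The sketch makes several assumptions that are not stated in the step descriptions: the function names are hypothetical, S_k² is taken to be the sample variance of the k-th variable, variables with zero variance are skipped, and the between-class distance is updated with Ward's criterion through the Lance-Williams formula on squared distances. The statistics of step 5-4 are not computed; the number of classes is passed in instead.

```python
import math


def variance_weighted_distances(samples):
    """Step 5-1: symmetric matrix of variance-weighted distances d_ij."""
    n, p = len(samples), len(samples[0])
    means = [sum(row[k] for row in samples) / n for k in range(p)]
    variances = [
        sum((row[k] - means[k]) ** 2 for row in samples) / (n - 1)
        for k in range(p)
    ]

    def dist(a, b):
        # Variables with zero variance are skipped (assumption).
        return math.sqrt(sum(
            (a[k] - b[k]) ** 2 / variances[k]
            for k in range(p) if variances[k] > 0
        ))

    return [[dist(samples[i], samples[j]) for j in range(n)] for i in range(n)]


def agglomerate(distances, n_clusters):
    """Steps 5-2 and 5-3: merge the two closest classes until n_clusters remain."""
    n = len(distances)
    if not 1 <= n_clusters <= n:
        raise ValueError("n_clusters must be between 1 and the sample count")
    clusters = {i: [i] for i in range(n)}
    # Ward's update operates on squared distances; keys are (smaller id, larger id).
    d2 = {(i, j): distances[i][j] ** 2 for i in range(n) for j in range(i + 1, n)}
    next_id = n
    while len(clusters) > n_clusters:
        (p, q), d_pq = min(d2.items(), key=lambda item: item[1])
        n_p, n_q = len(clusters[p]), len(clusters[q])
        merged = clusters.pop(p) + clusters.pop(q)
        for k, members in clusters.items():
            n_k = len(members)
            d_pk = d2.pop((min(p, k), max(p, k)))
            d_qk = d2.pop((min(q, k), max(q, k)))
            # Lance-Williams update for Ward's method.
            d2[(k, next_id)] = (
                (n_p + n_k) * d_pk + (n_q + n_k) * d_qk - n_k * d_pq
            ) / (n_p + n_q + n_k)
        del d2[(p, q)]
        clusters[next_id] = merged
        next_id += 1
    return sorted(sorted(members) for members in clusters.values())
```

Ward's criterion is chosen here only because the worked example later reports the best results with method=ward; a single-linkage or complete-linkage update could be substituted in the inner loop.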

Step 6, analyze the similarity matrix of step 4 using multidimensional scaling to obtain the multidimensional scaling space diagram of the corresponding dimension, thereby completing the website classification optimization analysis. Analyzing the similarity matrix with multidimensional scaling and generating the multidimensional scaling space diagram specifically comprises the following steps:

Step 6-1, generate the observation matrix: describe the space using a Euclidean stimulus space, computed with the Minkowski distance function. Suppose that, for the website classification catalogue, the subjects' cognition of the relations between concepts is the basic input data and that there are n objects; the distances S_{ij} between pairs of objects can then be obtained, and the distance between points i and j is denoted d_{ij}. The formula used is:

S_{ij} = \left[\sum_{a=1}^{v} (x_{ia} - x_{ja})^{2}\right]^{\frac{1}{2}}

where v is the number of dimensions, x_{ia} is the coordinate of point i on dimension a, and x_{ja} is the coordinate of point j on dimension a;

Step 6-2, homomorphic mapping: seek a reduced q-dimensional space and apply a homomorphic mapping so that the distances d_{ij} between object pairs in the q-dimensional space match the original distances S_{ij} in the p-dimensional space. If d_{ij} matches S_{ij} completely, the distances between the paired objects satisfy d_{i1} > d_{i2} > ... > d_{im}, that is, this descending order of distances is consistent with the ascending order of the original similarities;

Step 6-3, test reliability and validity and determine the optimal number of dimensions. Specifically, compute the degree of difference K, called the Kruskal coefficient, which is used to check whether the resulting space diagram is validly representative, and the stress index (Stress), which is the goodness-of-fit value, defined as the deviation between the theoretical distances represented by the similarity assessment data and the calculated distances. Stress is computed as:

Stress = \sqrt{\frac{\sum_{i} \sum_{j} (d_{ij} - \hat{d}_{ij})^{2}}{\sum_{i} \sum_{j} d_{ij}^{2}}}

where \hat{d}_{ij} is the reference value that satisfies the order relation of the subjects' original input concept distances while minimizing the stress index. The larger the K value the better, and a value of 0.60 or above is generally acceptable; a stress value within 0.20 is generally acceptable. The relation between the size of the stress index and the goodness of fit is shown in Table 1.

Table 1. Relation between the size of the stress index and the goodness of fit

Stress	Goodness of fit
0.200	Poor
0.100	Fair
0.050	Good
0.025	Excellent
0.000	Perfect

Step 6-4, generate the multidimensional scaling space diagram with the optimal number of dimensions determined in step 6-3.
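The distance of step 6-1 and the stress index of step 6-3 can be computed directly from their formulas. The sketch below is a minimal illustration with hypothetical function names; it assumes that a configuration of points and the reference values \hat{d}_{ij} have already been produced by a multidimensional scaling routine, which is not implemented here.

```python
import math


def euclidean_distances(coords):
    """Step 6-1: pairwise Euclidean distances between points in v dimensions."""
    return [
        [math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q))) for q in coords]
        for p in coords
    ]


def stress(distances, references):
    """Step 6-3: Stress = sqrt(sum((d - d_hat)^2) / sum(d^2)) over all i, j."""
    size = len(distances)
    numerator = sum(
        (distances[i][j] - references[i][j]) ** 2
        for i in range(size) for j in range(size)
    )
    denominator = sum(
        distances[i][j] ** 2 for i in range(size) for j in range(size)
    )
    if denominator == 0:
        raise ValueError("all distances are zero; stress is undefined")
    return math.sqrt(numerator / denominator)
```

A candidate number of dimensions would then be kept when `stress(...) <= 0.20`, in line with the acceptance level stated above, and the value can be read against Table 1.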

The present invention is described in further detail below with reference to an embodiment:

Research objective: optimization analysis of the classification catalogue for lighting products on Made-in-China.com.

Data description: custom group name data (6872 records) from users in the four provinces and municipalities of Zhejiang, Shanghai, Jiangsu and Guangdong, under the Lights & Lighting category of the product classification catalogue of Made-in-China.com (international site, http://www.made-in-china.com/). Made-in-China.com refers to user cognitive concepts as custom group names.

Step 1, preprocess the website log data, specifically:

1) After the website log data are cleaned, the attributes needed for data processing are filtered out, comprising company name, province, city and custom group name, in the format shown in Table 2:

Table 2. Data format after data cleaning

2) First remove the numbering contained in the custom group names, then convert the custom group names to lower case, remove plural forms and sort them alphabetically by first letter;

3) Because there are a great many custom group names with low frequency, the threshold is set to 4 and the custom group names with a frequency greater than 4 are selected; 114 custom group names are finally selected and their frequencies recorded. The selected custom group names are shown in Table 3:

Table 3. Custom group name selection results

Step 2, determine whether the custom group names co-occur with concepts in the website classification catalogue. The specific operations are as follows:

1) Log in to the Made-in-China.com international site http://www.made-in-china.com/;

2) Enter the custom group name to be searched in the search box, select "Lights & Lighting" in the "all categories" drop-down menu, then click search;

3) Count the second-level classification catalogue concepts that appear under "catalog" in the search results;

4) Click each of the concepts that appear under "catalog" in turn; the concepts that then appear under "catalog" are the concepts in the corresponding third-level classification catalogue;

Record the second-level and third-level classification catalogue concepts that appear under "catalog" and enter 1 in the corresponding cells to obtain the original co-occurrence statistics table; partial results are shown in Table 4:

Table 4. Partial co-occurrence statistics

In the subsequent processing, custom group names are handled in the same way with respect to second-level and third-level classification catalogue concepts; the co-occurrence relation between second-level classification catalogue concepts and custom group names is taken as the example below.

Step 3, generate the co-occurrence matrix, specifically:

1) Determine the co-occurrence frequency between second-level classification catalogue concepts and custom group names, by multiplying the frequency of each custom group name by the frequency with which the second-level classification catalogue concept appears; partial results are shown in Table 5:

Table 5. Partial co-occurrence frequencies between second-level classification catalogue concepts and custom group names

2) Determine the co-occurrence frequency between second-level classification catalogue concepts;

This is calculated from the results of the previous step. For example, for the co-occurrence frequency of Interior lighting and LED lighting, whose columns in Excel are B and C, the formula is SUM(MIN(B, C)): for each row, take the smaller of the values in the two columns, then sum the results;

3) Fill in 0 for all co-occurrence frequencies between custom group names. The resulting co-occurrence matrix is shown in Table 6:

Table 6. Part of the co-occurrence matrix of second-level classification catalogue concepts and custom group names

	interior_lighting	led_lighting	lighting_fixtures	bulb_lamp	lighting_decoration
interior_lighting	-	14441	6587	11403	10697
led_lighting	14441	-	6643	12204	11108
lighting_fixtures	6587	6643	-	6467	5836
bulb_lamp	11403	12204	6467	-	9433
lighting_decoration	10697	11108	5836	9433	-
outdoor_lighting	14640	17255	6620	12498	11189
camping_light	1116	1116	995	1100	1110
emergency_indicator_light	2245	2240	2129	2226	2205
torch	653	653	582	645	648
portable_lighting	1364	1388	1289	1356	1353

Step 4, generate the similarity matrix: using SAS software with the Pearson correlation coefficient, compute the similarity matrix; partial results are shown in Table 7:

Table 7. Partial similarity matrix results

Step 5, cluster analysis: using SAS software with the hierarchical clustering method, carry out the cluster analysis. The ward, complete and single methods, among others, were tried as the between-class distance method, and comparison showed that method=ward gave the best results. The samples are merged two classes at a time; the results of the last 15 merges are shown in Table 8:

Table 8. SAS clustering process results with method=ward

Based on three statistics of the hierarchical clustering method, the semi-partial R² statistic (SPRSQ), the pseudo-F statistic (PSF) and the pseudo-t² statistic (PST2), the optimal number of classes is chosen as 4. The clustering result contains 127 concepts in total, of which 114 are custom group names and 13 are second-level classification catalogue concepts; the 13 second-level catalogue concepts fall into two of the four classes. The clustering result is shown in Table 9 (concepts in bold are second-level classification catalogue concepts; concepts not in bold are custom group names).

Table 9. Clustering result of custom group names and second-level catalogue concepts

The four classes marked in the clustering tree of Fig. 2 correspond to the clustering result in Table 9. Concepts clustered into the same class are the most strongly correlated with one another. For example, in the fourth class, the eight concepts led_plug_light, induction_lamp, led_module, led_rigid_bar, led_moving_head, led_rope_light, led_dance_floor and led_recessed_light are clustered into one class, which indicates that, among all the concepts, the correlation between these eight is the greatest and that they can be placed in the same category.

Step 6, multidimensional scaling: in this example the multidimensional scaling function in SAS software is used directly. Multidimensional scaling can verify the accuracy of the clustering result and present it visually.

To make the multidimensional scaling result clearer, the concepts are replaced with variables X1 to X127, numbered consistently with the concept sequence numbers in the clustering result. The multidimensional scaling space diagram shows that the 127 concepts are divided into four classes, which clearly verifies the clustering result and also presents it intuitively, thereby completing the classification catalogue optimization analysis for lighting products on Made-in-China.com.

As the above example shows, the method of the present invention uses website log data directly for user research, saving the cost of user surveys and obtaining user information comprehensively.

Claims

1. A website classification optimization analysis method based on the user mental model, characterized in that the steps are as follows:

Step 1, preprocess the website log data, specifically:

Step 1-3, determine the frequency of each user cognitive concept, then set a threshold, select the user cognitive concepts whose frequency is greater than the threshold, and record their frequencies;

2. The website classification optimization analysis method based on the user mental model according to claim 1, characterized in that, in step 2, according to mental model classification theory, when users acquire information on a website through the classification catalogue, they mainly click in horizontal, vertical or balanced horizontal-vertical patterns, and during clicking they choose the concepts that are most strongly correlated according to the correlation between concepts in the classification catalogue; applying this principle, each user cognitive concept is submitted to the website as a search keyword, and the classification catalogue concepts that appear in the search results are counted together with their frequencies, in order to analyze the correlation between the user cognitive concepts and the concepts in the website classification catalogue.

3. The website classification optimization analysis method based on the user mental model according to claim 1, characterized in that the co-occurrence frequencies between the concepts in the co-occurrence matrix in step 3 are determined as follows:

Step 3-1, determine the co-occurrence frequency between user cognitive concepts and concepts in the classification catalogue, which falls into two cases: the first is the co-occurrence frequency between a user cognitive concept and a concept in the second-level classification catalogue, denoted F₁,

x = the frequency of the user cognitive concept

F₂ = SUM(MIN(m, n))

4. The website classification optimization analysis method based on the user mental model according to claim 1, characterized in that step 4 generates the similarity matrix using the Pearson correlation coefficient as the similarity measure, with the formula

r = \frac{\sum_{i=1}^{n} (X_{i} - \bar{X})(Y_{i} - \bar{Y})}{\sqrt{\sum_{i=1}^{n} (X_{i} - \bar{X})^{2}} \sqrt{\sum_{i=1}^{n} (Y_{i} - \bar{Y})^{2}}}

where X_{i}, Y_{i} and \bar{X}, \bar{Y} are, respectively, the observed values and the means of the two variables.

5. The website classification optimization analysis method based on the user mental model according to claim 1, characterized in that, in step 5, the similarity matrix is clustered using hierarchical clustering and the clustering result of the concepts is then determined from the clustering statistics, specifically through the following steps:

d_{ij} = \left[\sum_{k=1}^{p} \frac{(T_{ik} - T_{jk})^{2}}{S_{k}^{2}}\right]^{\frac{1}{2}}

Step 5-3, repeat step 5-2 until the N samples are merged into a single class;

6. The website classification optimization analysis method based on the user mental model according to claim 1, characterized in that, in step 6, the similarity matrix is analyzed using multidimensional scaling and the multidimensional scaling space diagram is generated, specifically through the following steps:

Step 6-1, generate the observation matrix: describe the space using a Euclidean stimulus space, computed with the Minkowski distance function. Suppose that, for the website classification catalogue, the subjects' cognition of the relations between concepts is the basic input data and that there are n objects; the distances S_{ij} between pairs of objects can then be obtained, and the distance between points i and j is denoted d_{ij}. The formula used is:

S_{ij} = \left[\sum_{a=1}^{v} (x_{ia} - x_{ja})^{2}\right]^{\frac{1}{2}}

Step 6-3, test reliability and validity and determine the optimal number of dimensions. Specifically, compute the degree of difference K, called the Kruskal coefficient, which is used to check whether the resulting space diagram is validly representative, and the stress index (Stress), which is the goodness-of-fit value, defined as the deviation between the theoretical distances represented by the similarity assessment data and the calculated distances. Stress is computed as:

Stress = \sqrt{\frac{\sum_{i} \sum_{j} (d_{ij} - \hat{d}_{ij})^{2}}{\sum_{i} \sum_{j} d_{ij}^{2}}}

where \hat{d}_{ij} is the reference value that satisfies the order relation of the subjects' original input concept distances while minimizing the stress index;