CN115239106A - Analysis method based on all-purpose card big data

Info

Publication number: CN115239106A
Application number: CN202210820713.7A
Authority: CN
Inventors: 张滢雪; 司占军; 卢勇拾; 邢斌; 李龙
Original assignee: Tianjin University of Science and Technology
Current assignee: Tianjin University of Science and Technology
Priority date: 2022-07-12
Filing date: 2022-07-12
Publication date: 2022-10-25

Abstract

The invention relates to an analysis method based on all-purpose card big data, comprising the following steps: acquiring an original data set for a preset period; cleaning the original data set and computing statistics on it to obtain a campus life behavior data set for analysis; performing cluster analysis on two-dimensional data consisting of consumption frequency and total consumption amount; deriving the students' consumption behavior characteristics from the clustering result; jointly analyzing the campus life behavior data set and its valid data subset with the Apriori association rule method; and, from the association rule results, deriving the students' life trajectory behavior characteristics and providing meal preparation planning suggestions to canteen managers and dining recommendations to students. On the one hand, the invention provides a profile reflecting each student's living behavior characteristics and helps students with self-management and self-improvement; on the other hand, it provides school administrators with overall management suggestions for the students' living and learning environments, early warning of potential problems, and the like, and thus has more multidimensional and broader application scenarios.

Description

Analysis method based on one-card big data

Technical Field

The invention relates to the technical field of data analysis, in particular to an analysis method based on all-purpose card big data.

Background

With the rapid development of information technology and the related software and hardware, schools can use a campus card and its data management system to provide students with integrated, efficient information services such as payment, access control, book borrowing, identity authentication and service reservation, so that the campus card accompanies every aspect of campus life. Widespread use of the campus card makes student life more convenient and campus management more standardized and intelligent.

In building digital and smart campuses, campus card data is an important foundation. To extract valuable information from the massive raw data generated by students' card swipes, and thereby support student behavior analysis and school safety management, an appropriate data analysis or data mining method is essential. By deriving high-level semantic information from the raw data, such a method can, on the one hand, give students an analysis of their living habits and advice on healthy living and, on the other hand, give schools a useful reference for daily management and safety work.

At present, research on campus card data analysis is still relatively limited. On the one hand, analysis focuses mainly on consumption data, covers few data dimensions, and yields conclusions aimed mainly at school administrators. Yet with the spread of mobile devices, mobile applications and big data concepts, students have become increasingly interested in the statistical patterns of their own data and more willing to assess their own state and act on suggestions derived from big data analysis; the annual summaries offered by various mobile applications, for example, are popular among young people. On the other hand, the analysis methods used are limited. One of the most common is K-means clustering, which requires the number of clusters to be set in advance; for campus card data of ever-growing volume, the distribution of the data and the likely number of categories are often hard to determine, so efficient clustering and an optimal analysis result cannot be achieved. Massive, multidimensional campus card data therefore places more diverse demands on analysis techniques: the analysis should minimize manual intervention, provide efficient and reliable analysis services to both schools and students, and promote the development of related applications.

In view of the above background and problems, there is a need for an analysis method based on campus card data that can efficiently and automatically analyze the massive, multidimensional data generated by campus card swipes and derive the students' campus life behavior characteristics and the associations among them. On the one hand, such a method provides a profile reflecting each student's living behavior characteristics and helps students with self-management and self-improvement; on the other hand, it provides school administrators with overall management suggestions for the students' living and learning environments, such as canteens, dormitories and teaching buildings, as well as early warning of students' financial and safety problems. By fully mining the rich information contained in campus card data, valuable analysis results are provided to students and schools alike.

Disclosure of Invention

In order to overcome the defects of the prior art, the invention aims to provide an analysis method based on one-card big data. By fully mining the multidimensional information in campus card data, the method provides valuable data analysis results for both students and schools. On the one hand, it provides a profile reflecting each student's living behavior characteristics and helps students with self-management and self-improvement; on the other hand, it provides school administrators with overall management suggestions for the students' living and learning environments, such as canteens, dormitories and teaching buildings, early warning of potential problems, and the like, and thus has more multidimensional and broader application scenarios.

In order to achieve the purpose, the invention provides the following scheme:

An analysis method based on one-card big data comprises the following steps:

S1: reading the personal information of the student group using the campus card within a preset period, together with the campus card consumption card-swiping records and access control card-swiping records for the same period, to acquire an original data set reflecting campus life, wherein the original data set uses the card number as a unique identifier that corresponds one-to-one with students;

S2: cleaning the original data set and computing statistics on it, and processing it according to the different living behavior characteristics, to obtain a campus life behavior data set for analysis and its valid data subset;

S3: obtaining, from the valid data subset, two-dimensional data consisting of consumption frequency and total consumption amount, and performing cluster analysis on this two-dimensional data with the mean shift method to obtain the clustered data;

S4: deriving the students' consumption behavior characteristics from the clustered data, and providing references for management to schools and for self-management to students;

S5: jointly analyzing the campus life behavior data set for analysis and its valid data subset with the Apriori association rule method to obtain a joint analysis result, and analyzing the life trajectory preferences of different student groups;

S6: deriving the students' life trajectory behavior characteristics from the joint analysis result, and providing meal preparation planning suggestions to canteen managers and dining recommendations to students.

Preferably, step S1 comprises: reading the N consumption card-swiping records of the students' campus cards within a preset period Dur, where each consumption record $p_n$ can be expressed as a set of features as follows:

$$p_n = \{(\mathit{CardNo}_n, \mathit{Time}_n, \mathit{Location}_n, \mathit{Money}_n) \mid n = 1, 2, 3, \ldots, N\} \tag{1}$$

where $\mathit{CardNo}_n$, $\mathit{Time}_n$, $\mathit{Location}_n$ and $\mathit{Money}_n$ are, respectively, the card number, consumption time, consumption place and consumption amount of consumption record $p_n$;

the personal information of the students is obtained by using the card number $\mathit{CardNo}_n$ in each record to look up and read the access control identification code $\mathit{Acc}_n$ and the sex information $\mathit{Sex}_n$ of the student corresponding to that card number;

the student personal information and the consumption records $p_n$ together form the consumption place data set, which contains the N card-swiping records within the preset period Dur together with the students' personal information:

$$l_n = \{(\mathit{CardNo}_n, \mathit{Time}_n, \mathit{Location}_n, \mathit{Money}_n, \mathit{Acc}_n, \mathit{Sex}_n) \mid n = 1, 2, 3, \ldots, N\} \tag{2}$$
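The construction of the consumption place data set in formula (2) can be sketched as a simple lookup join. This is a minimal illustration under stated assumptions, not the implementation of the invention: the names `Swipe`, `PlaceRecord`, `build_place_records` and the `students` mapping are hypothetical, and dropping swipes whose card number has no student entry is an assumed policy.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Swipe:
    """One consumption record p_n (formula (1))."""
    card_no: str
    time: datetime
    location: str
    money: float


@dataclass(frozen=True)
class PlaceRecord:
    """One enriched record l_n (formula (2))."""
    card_no: str
    time: datetime
    location: str
    money: float
    acc: str
    sex: str


def build_place_records(swipes, students):
    """Attach Acc_n and Sex_n to each swipe.

    `students` maps card number -> (access control id, sex).
    """
    records = []
    for s in swipes:
        info = students.get(s.card_no)
        if info is None:
            continue  # assumption: swipes without a matching student are dropped
        acc, sex = info
        records.append(PlaceRecord(s.card_no, s.time, s.location, s.money, acc, sex))
    return records
```

Keeping the card number as the join key mirrors the text, where the card number is the unique identifier linking the record types.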

Preferably, step S2 comprises:

counting the consumption frequency and the total consumption amount of each student within the preset period, and expressing them as a set of two-dimensional feature vectors of total consumption amount and consumption frequency:

$$\{v_i = (t_i, f_i) \mid i = 1, 2, \ldots, I\} \tag{3}$$

where I is the total number of students and the campus card number of each student i is $C^i$. All consumption records $p_n$ ($n \le N$) in the consumption data set whose card number satisfies $\mathit{CardNo}_n = C^i$ constitute the student's consumption record subset $P^i$. The total consumption amount $t_i$ is the sum of the consumption amounts in $P^i$:

$$t_i = \sum_{p_n \in P^i} \mathit{Money}_n \tag{4}$$

and the consumption frequency $f_i$ is the number of records in $P^i$:

$$f_i = |P^i| \tag{5}$$
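As an illustration of the statistics in formulas (3) to (5), the following sketch accumulates the total amount and the record count per card number. The function name `consumption_features` and the `(card_no, money)` input shape are assumptions made for the example.

```python
from collections import defaultdict


def consumption_features(records):
    """Return {card_no: (t_i, f_i)} from an iterable of (card_no, money) pairs.

    t_i is the sum of the amounts in P^i, f_i is the number of records |P^i|.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for card_no, money in records:
        totals[card_no] += money
        counts[card_no] += 1
    return {card: (totals[card], counts[card]) for card in totals}


features = consumption_features([("C1", 10.0), ("C1", 5.5), ("C2", 8.0)])
```

Each value in `features` is one point $v_i = (t_i, f_i)$ of the two-dimensional set that is clustered in step S3.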

obtaining the valid consumption place record subset $L^i$ of student i: for each student i, whose campus card number is $C^i$, all consumption records $l_n$ ($n \le N$) in the consumption place data set whose card number satisfies $\mathit{CardNo}_n = C^i$ constitute the student's consumption place record subset $L^i$. Let the time interval threshold be $T_{interv}$. If a later record $l_{n+m}$ and the record $l_n$ satisfy the condition of formula (6), that is, they were swiped at the same consumption place less than $T_{interv}$ apart, then $l_{n+m}$ and $l_n$ are regarded as the same consumption process, for which only one card-swiping record, $l_n$, is kept, and $l_{n+m}$ is removed from $L^i$. For $l_n$, all $l_{n+m}$ are traversed; if the condition of formula (6) is not satisfied, let $l_n = l_{n+m+1}$ and repeat the above check until all data items in $L^i$ have been checked, yielding the valid consumption place record subset $L^i$ of student i;

repeating the above steps until the consumption place record subset of every student has been obtained;
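The cleaning pass described above can be sketched as follows for the records of a single student. This is one reading of the procedure, not a definitive implementation: `dedupe_swipes` and its `(time, location)` input are hypothetical, records are assumed to be comparable after sorting by time, and a later swipe is dropped when it is at the same place as the most recently kept record and less than `t_interv` after it.

```python
from datetime import datetime, timedelta


def dedupe_swipes(records, t_interv):
    """Keep one record per consumption process for ONE student.

    `records` is a list of (time, location) pairs; `t_interv` is a timedelta.
    """
    kept = []
    for time, location in sorted(records):
        if kept:
            ref_time, ref_location = kept[-1]  # current reference record l_n
            if location == ref_location and time - ref_time < t_interv:
                continue  # same consumption process: drop l_{n+m}
        kept.append((time, location))
    return kept
```

Comparing against the last kept record, rather than the immediately preceding swipe, follows the text in treating $l_n$ as the fixed reference for all subsequent $l_{n+m}$.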

obtaining the valid access control card-swiping records of student i: for the preset period Dur, reading the access control card-swiping records of the same group of students over the same period as the consumption data in $l_n$; using the access control identification code $\mathit{Acc}_n$ and the card-swiping time $\mathit{Time}_n$ in $l_n$, selecting from the access control records those whose swiping time is adjacent to the consumption time, and adding the access control swiping place $\mathit{Tbuilding}_n$ from those records to the consumption place data records, to obtain the students' complete campus life behavior data set $a_n$:

$$a_n = \{(\mathit{CardNo}_n, \mathit{Time}_n, \mathit{Location}_n, \mathit{Tbuilding}_n, \mathit{Money}_n, \mathit{Acc}_n, \mathit{Sex}_n) \mid n = 1, 2, 3, \ldots, N\} \tag{7}$$

The students' complete campus life behavior data set $a_n$ is the valid data set available for analysis.

Preferably, step S3 comprises:

S3.1: the two-dimensional feature vectors $v_i$ can be regarded as a set of points in two-dimensional space with $(t_i, f_i)$ as horizontal and vertical coordinates, where $i = 1, 2, \ldots, I$; each point corresponds to the consumption behavior of one student, and each point is taken as an independent initial class to initialize the clustering process;

S3.2: randomly selecting a point $v_x$ as the initial centroid $cen_x$;

S3.3: selecting a sliding window of bandwidth r centered on the centroid $cen_x$, denoting the set of all points within the window as W, temporarily marking these points as belonging to class $clu_x$, and increasing by 1 the access frequency of these points within that class;

S3.4: calculating the radial-basis-kernel-weighted average offset of all points in the sliding window relative to the centroid $cen_x$, and taking it as the mean shift vector $M_r$ (formula (8));

S3.5: updating the centroid coordinates with the mean shift vector $M_r$:

$$cen_{x+1} = M_r + cen_x \tag{9}$$

S3.6: repeating steps S3.3 to S3.5 until the magnitude of the offset $M_r$ is less than a threshold $T_{conv}$; the centroid $cen_X$ at that point is taken as a cluster center, all points accessed during the iterations of steps S3.3 to S3.5 belong to the class $clu_X$ corresponding to that center, and the current shift has converged;

S3.7: if the distance between the cluster center of the current class $clu_X$ and the center of an existing class is less than a threshold $T_{dis}$, merging the two classes; otherwise, keeping the current class as a new class;

S3.8: repeating steps S3.1 to S3.7 until all points have been accessed, ending the mean shift clustering process;

S3.9: assigning all points to their cluster centers according to the marks; if a point has been marked as accessed by several classes, assigning it to the class with the higher access frequency.
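Steps S3.1 to S3.9 can be sketched in plain Python as below. This is an illustrative reading, not the reference implementation of the invention: the function name `mean_shift`, the Gaussian form of the radial basis kernel with bandwidth `r`, the default `t_dis = r / 2`, the `max_iter` safeguard and the fixed random seed are all assumptions, and a merge keeps the center of the existing class.

```python
import math
import random


def mean_shift(points, r, t_conv=1e-3, t_dis=None, max_iter=300, seed=0):
    """Cluster 2-D points (t_i, f_i); returns (centers, labels)."""
    t_dis = r / 2 if t_dis is None else t_dis  # assumed default for T_dis
    rng = random.Random(seed)
    n = len(points)
    centers, visits = [], []  # visits[k][i]: accesses of point i by class k
    unvisited = set(range(n))
    while unvisited:
        cx, cy = points[rng.choice(sorted(unvisited))]  # S3.2: random start
        seen = [0] * n
        for _ in range(max_iter):
            # S3.3: points inside the window of bandwidth r around the centroid
            window = [i for i in range(n)
                      if math.hypot(points[i][0] - cx, points[i][1] - cy) <= r]
            if not window:
                break
            for i in window:
                seen[i] += 1
            unvisited.difference_update(window)
            # S3.4: kernel-weighted mean offset (Gaussian kernel assumed)
            w = [math.exp(-((points[i][0] - cx) ** 2 + (points[i][1] - cy) ** 2)
                          / (2 * r * r)) for i in window]
            total = sum(w)
            mx = sum(wi * (points[i][0] - cx) for wi, i in zip(w, window)) / total
            my = sum(wi * (points[i][1] - cy) for wi, i in zip(w, window)) / total
            cx, cy = cx + mx, cy + my  # S3.5: cen_{x+1} = M_r + cen_x
            if math.hypot(mx, my) < t_conv:  # S3.6: converged
                break
        for k, (kx, ky) in enumerate(centers):  # S3.7: merge with a close class
            if math.hypot(kx - cx, ky - cy) < t_dis:
                visits[k] = [a + b for a, b in zip(visits[k], seen)]
                break
        else:
            centers.append((cx, cy))
            visits.append(seen)
    # S3.9: each point joins the class that accessed it most often
    labels = [max(range(len(centers)), key=lambda k: visits[k][i]) for i in range(n)]
    return centers, labels
```

Unlike K-means, the number of classes is not an input here; it follows from the bandwidth `r` and the merge threshold, which is the property the text relies on.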

Preferably, step S4 comprises:

according to the frequency-total clustering result, feeding back each student's position in the two-dimensional frequency-total space and providing students with individual consumption behavior reports; analyzing the result to identify abnormal classes and abnormal items and feeding them back to the student management department as early-warning information, prompting it to pay special attention to the corresponding student groups; and at the same time providing students with necessary consumption suggestions.

Preferably, the abnormal classes and abnormal items include:

potential economically disadvantaged students; potential off-campus students; and a potential high-consumption group;

potential economically disadvantaged students: for the cluster center point set CEN, if $CEN_x(t_x, f_x)$ and its corresponding class $clus_x$ satisfy the following conditions, the students in that class have a very low total consumption amount but a relatively high consumption frequency within the preset period,

where $t_{CEN}$ is the set of abscissas of the cluster center point set CEN, $f_I$ is the set of ordinates of all points, and mean(), std() and num() are the mean, standard deviation and counting functions, respectively;

potential off-campus students: for the cluster center point set CEN, if $CEN_x(t_x, f_x)$ and its corresponding class $clus_x$ satisfy the following conditions, both the total consumption amount and the consumption frequency of the students in that class within the preset period are extremely low,

potential high-consumption group: for the cluster center point set CEN, if $CEN_x(t_x, f_x)$ and its corresponding class $clus_x$ satisfy the following conditions, both the total consumption amount and the consumption frequency of the students in that class within the preset period are extremely high,

Preferably, step S5 includes:

mining association rules between the gender item set and the consumption place item set in the consumption place record subset and judging whether strong association rules exist: if they exist, the students' gender is clearly associated with their choice of canteen; if not, there is no obvious interaction between the students' gender and their choice of canteen. This specifically comprises the following steps:

step 5.1.1: from each record of the consumption place record subset $l_n$, taking the two elements consumption place and gender to form $\tau_n = \{\mathit{Location}_n, \mathit{Sex}_n \mid n = 1, 2, \ldots, N\}$ as a transaction for association rule mining; the M different places and 2 different genders contained therein, $\mathit{Location}^m$ and $\mathit{Sex}^q$ ($m = 1, 2, \ldots, M$; $q = 1, 2$), are the items in the transactions, and the transaction database is $D^l = \{\tau_1, \tau_2, \ldots, \tau_N\}$;

step 5.1.2: letting $Ite = \{\mathit{Location}^m, \mathit{Sex}^q \mid m = 1, 2, \ldots, M;\ q = 1, 2\}$ be the set of all items in $D^l$, so that any non-empty subset X of Ite is an item set in $D^l$; to determine the association rules between gender and consumption place, first computing the support of each 2-item set $X_k$ ($k = 1, 2, \ldots, M \times 2$), which contains 2 items and consists of one place item and one gender item in Ite. The support of $X_k$ is the number of transactions in $D^l$ that contain the item set $X_k$ divided by N, the total number of transactions in $D^l$;

step 5.1.3: setting a minimum support threshold; for each 2-item set $X_k$ in step 5.1.2, if its support meets the minimum support threshold, then $X_k$ is a frequent item set, and the set of all frequent item sets is denoted $X^F$;

step 5.1.4: to learn whether a strong association exists between student gender and canteen choice, generating from the frequent item set collection $X^F$ all association rules between gender and consumption place; taking rules with gender as the condition and place as the result as an example:

$$\mathit{Sex}^q \rightarrow \mathit{Location}^m, \quad m = 1, 2, \ldots, M,\ q = 1, 2 \tag{14}$$

step 5.1.5: computing the confidence of each association rule;

step 5.1.6: setting a minimum confidence threshold; for the item sets in the frequent item set collection $X^F$, if the confidence of a rule meets the minimum confidence threshold, then $\mathit{Sex}^q \rightarrow \mathit{Location}^m$ is a strong association rule;
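Because every transaction here contains exactly one condition item and one result item, steps 5.1.1 to 5.1.6 reduce to counting pairs. The sketch below illustrates that reduced case only and is not a general Apriori implementation with candidate generation; the name `strong_rules`, the use of "greater than or equal to" for both thresholds, and the sample data are assumptions.

```python
from collections import Counter


def strong_rules(transactions, min_support, min_confidence):
    """Find strong rules cond -> res among 2-item transactions.

    `transactions` is a list of (cond, res) pairs, e.g. (sex, location).
    Returns {(cond, res): (support, confidence)}.
    """
    n = len(transactions)
    pair_count = Counter(transactions)
    cond_count = Counter(cond for cond, _ in transactions)
    rules = {}
    for (cond, res), count in pair_count.items():
        support = count / n  # share of transactions containing the 2-item set
        if support < min_support:
            continue  # not a frequent item set
        confidence = count / cond_count[cond]  # supp(pair) / supp(cond)
        if confidence >= min_confidence:
            rules[(cond, res)] = (support, confidence)
    return rules


sample = ([("F", "Canteen 1")] * 6 + [("F", "Canteen 2")] * 2
          + [("M", "Canteen 2")] * 7 + [("M", "Canteen 1")] * 5)
rules = strong_rules(sample, min_support=0.2, min_confidence=0.7)
```

With this invented sample, only the rule from "F" to "Canteen 1" passes both thresholds; the same function applies unchanged to (teaching building, canteen) pairs.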

mining association rules between the access place item set and the consumption place item set in the campus life behavior data set and judging whether strong association rules exist: if they exist, the choice of teaching building is clearly associated with the choice of canteen; if not, there is no obvious interaction between the choice of teaching building and the choice of canteen. This specifically comprises the following sub-steps:

step 5.2.1: from each record of the campus life behavior data set $a_n$, taking the two elements consumption place and access place to form $\alpha_n = \{\mathit{Location}_n, \mathit{Tbuilding}_n \mid n = 1, 2, \ldots, N\}$ as a transaction for association rule mining; the M different canteens and S different teaching buildings contained therein, $\mathit{Location}^m$ and $\mathit{Tbuilding}^s$ ($m = 1, 2, \ldots, M$; $s = 1, 2, \ldots, S$), are the items in the transactions, and the transaction database is $D^a = \{\alpha_1, \alpha_2, \ldots, \alpha_N\}$;

step 5.2.2: letting $Ite = \{\mathit{Location}^m, \mathit{Tbuilding}^s \mid m = 1, 2, \ldots, M;\ s = 1, 2, \ldots, S\}$ be the set of all items in $D^a$, so that any non-empty subset X of Ite is an item set in $D^a$; to determine the association rules between teaching buildings and canteens, first computing the support of each 2-item set $X_k$ ($k = 1, 2, \ldots, S \times M$), which contains 2 items and consists of one teaching building item and one canteen item in Ite. The support of $X_k$ is the number of transactions in $D^a$ that contain the item set $X_k$ divided by N, the total number of transactions in $D^a$;

step 5.2.3: setting a minimum support threshold; for each 2-item set $X_k$ in step 5.2.2, if its support meets the minimum support threshold, then $X_k$ is a frequent item set, and the set of all frequent item sets is denoted $X^F$;

step 5.2.4: to learn whether a strong association exists between teaching buildings and canteens, generating from the frequent item set collection $X^F$ all association rules between teaching buildings and canteens; taking rules with the teaching building as the condition and the canteen as the result as an example:

$$\mathit{Tbuilding}^s \rightarrow \mathit{Location}^m, \quad m = 1, 2, \ldots, M,\ s = 1, 2, \ldots, S \tag{17}$$

step 5.2.5: computing the confidence of each association rule;

step 5.2.6: setting a minimum confidence threshold; for the item sets in the frequent item set collection $X^F$, if the confidence of a rule meets the minimum confidence threshold, then $\mathit{Tbuilding}^s \rightarrow \mathit{Location}^m$ is a strong association rule.
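For reference, the support and confidence measures used in steps 5.2.2 to 5.2.6 can be written in the conventional association-rule form below. This is a sketch in standard notation rather than a reproduction of the numbered formulas of the source: the symbols $\mathrm{supp}$, $\mathrm{conf}$ and $\sigma$ (the number of transactions in $D^a$ containing an item set) are assumptions.

```latex
\mathrm{supp}(X_k) = \frac{\sigma(X_k)}{N},
\qquad
\mathrm{conf}\!\left(\mathit{Tbuilding}^s \rightarrow \mathit{Location}^m\right)
  = \frac{\mathrm{supp}\!\left(\{\mathit{Tbuilding}^s, \mathit{Location}^m\}\right)}
         {\mathrm{supp}\!\left(\{\mathit{Tbuilding}^s\}\right)}
```

Under this reading, a rule is strong when both quantities reach their respective minimum thresholds; the gender-to-canteen rules of steps 5.1.2 to 5.1.6 follow the same pattern with $\mathit{Sex}^q$ in place of $\mathit{Tbuilding}^s$ and $D^l$ in place of $D^a$.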

Preferably, step S6 includes:

using the strong association rules to learn the potential influence of gender on canteen choice and, on that basis, providing canteens with suggestions on the amount and types of food to prepare: canteens strongly associated with female or male students respectively add dishes that better match the preferences of female or male students, and adjust the amount of food prepared according to the number of diners of each gender and their food intake;

using the strong association rules to learn the potential influence of entering and leaving different teaching buildings on the students' canteen choice and to identify the canteens strongly associated with each teaching building; combining this with the start and end times of the classes scheduled in the corresponding teaching building on the day and the number of students in those classes, providing canteen managers with suggestions on meal supply times and the amount of food to prepare, and providing students with suggestions on where to eat and where to eat during off-peak times.

According to the specific embodiment provided by the invention, the invention discloses the following technical effects:

The invention can use the data generated by campus card swipes to automatically analyze multiple dimensions, such as consumption behavior, life behavior trajectories and consumption place preferences, and can provide valuable data analysis results for both students and schools. The invention clusters the consumption data in two dimensions, so that total consumption amount and consumption frequency are analyzed jointly and a more comprehensive analysis of consumption behavior is obtained; it also removes the manual constraint on the number of clusters, so that the results better reflect objective conditions and are more inclusive.

Drawings

In order to illustrate the embodiments of the present invention or the technical solutions of the prior art more clearly, the drawings required for the embodiments are briefly described below. The drawings described below show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.

FIG. 1 is a flow chart of a method in an embodiment provided by the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be described clearly and completely with reference to the accompanying drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

The invention aims to provide an analysis method based on one-card big data. By fully mining the multidimensional information in campus card data, the method provides valuable data analysis results for both students and schools. On the one hand, it provides a profile reflecting each student's living behavior characteristics and helps students with self-management and self-improvement; on the other hand, it provides school administrators with overall management suggestions for the students' living and learning environments, such as canteens, dormitories and teaching buildings, and early warning of potential problems, and thus has more multidimensional and broader application scenarios.

In order to make the aforementioned objects, features and advantages of the present invention more comprehensible, the present invention is described in detail with reference to the accompanying drawings and the detailed description thereof.

As shown in FIG. 1, the invention provides an analysis method based on one-card big data, which comprises the following steps:

S1: reading the personal information of the student group using the campus card within a preset period, together with the campus card consumption card-swiping records and access control card-swiping records for the same period, to acquire an original data set reflecting campus life, wherein the original data set uses the card number as a unique identifier that corresponds one-to-one with students; since the consumption and access control card-swiping records all contain card number information, the three groups of data records can be associated with one another through the card number.

S3: obtaining, from the valid data subset, two-dimensional data consisting of consumption frequency and total consumption amount, and performing cluster analysis on this two-dimensional data with the mean shift method to obtain the clustered data;

S4: deriving the students' consumption behavior characteristics from the clustered data, and providing references for management to schools and for self-management to students.

S6: deriving the students' life trajectory behavior characteristics from the joint analysis result, and providing meal preparation planning suggestions to canteen managers and dining recommendations to students.

Further, step S1 includes: reading the N consumption card-swiping records of the students' campus cards within a preset period Dur, where each consumption record $p_n$ can be expressed as a set of features as follows:

$$p_n = \{(\mathit{CardNo}_n, \mathit{Time}_n, \mathit{Location}_n, \mathit{Money}_n) \mid n = 1, 2, 3, \ldots, N\} \tag{1}$$

where $\mathit{CardNo}_n$, $\mathit{Time}_n$, $\mathit{Location}_n$ and $\mathit{Money}_n$ are, respectively, the card number, consumption time, consumption place and consumption amount of consumption record $p_n$; it should be noted that actual card-swiping records generally contain more information, and only the features relevant to the present invention are listed here;

Further, step S2 includes:

counting the consumption frequency and the total consumption amount of each student within the preset period, and expressing them as a set of two-dimensional feature vectors:

$$\{v_i = (t_i, f_i) \mid i = 1, 2, \ldots, I\} \tag{3}$$

where I is the total number of students and the campus card number of each student i is $C^i$. All consumption records $p_n$ ($n \le N$) in the consumption data set whose card number satisfies $\mathit{CardNo}_n = C^i$ constitute the student's consumption record subset $P^i$. The total consumption amount $t_i$ is the sum of the consumption amounts in $P^i$:

$$t_i = \sum_{p_n \in P^i} \mathit{Money}_n \tag{4}$$

and the consumption frequency $f_i$ is the number of records in $P^i$:

$$f_i = |P^i| \tag{5}$$

obtaining the valid consumption place record subset $L^i$ of student i: for each student i, the campus card number is $C^i$. In the consumption card-swiping data, records with the same card number and the same consumption place whose time interval is smaller than a preset threshold are regarded as the same consumption process; only one record is kept to represent that consumption place and the others are eliminated. All consumption records $l_n$ ($n \le N$) in the consumption place data set whose card number satisfies $\mathit{CardNo}_n = C^i$ constitute the student's consumption place record subset $L^i$. Let the time interval threshold be $T_{interv}$. If a later record $l_{n+m}$ and the record $l_n$ satisfy the condition of formula (6), then $l_{n+m}$ and $l_n$ are regarded as the same consumption process, for which only one card-swiping record, $l_n$, is kept, and $l_{n+m}$ is removed from $L^i$. For $l_n$, all $l_{n+m}$ are traversed; if the condition of formula (6) is not satisfied, let $l_n = l_{n+m+1}$ and repeat the above check until all data items in $L^i$ have been checked, yielding the valid consumption place record subset $L^i$ of student i;

repeating the above steps until the consumption place record subset of every student has been obtained;
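The condition referred to as formula (6) is not reproduced in the text. Based on the description of "same consumption place" and "time interval smaller than a preset threshold", one plausible form is given below; it is an assumed reconstruction, and it presumes that the records in $L^i$ are ordered by time so that the difference is non-negative.

```latex
\mathit{Location}_{n+m} = \mathit{Location}_n
\quad\text{and}\quad
0 \le \mathit{Time}_{n+m} - \mathit{Time}_n < T_{interv}
```

For example, with $T_{interv}$ set to one minute, two swipes at the same canteen window 30 seconds apart would count as a single consumption process, while a swipe at the same window ten minutes later would start a new one.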

obtaining the valid access control card-swiping records of student i: for the preset period Dur, reading the access control card-swiping records of the same group of students over the same period as the consumption data in $l_n$; using the access control identification code $\mathit{Acc}_n$ and the card-swiping time $\mathit{Time}_n$ in $l_n$, selecting from the access control records those whose swiping time is adjacent to the consumption time, and adding the access control swiping place $\mathit{Tbuilding}_n$ from those records to the consumption place data records, to obtain the students' complete campus life behavior data set $a_n$:

$$a_n = \{(\mathit{CardNo}_n, \mathit{Time}_n, \mathit{Location}_n, \mathit{Tbuilding}_n, \mathit{Money}_n, \mathit{Acc}_n, \mathit{Sex}_n) \mid n = 1, 2, 3, \ldots, N\} \tag{7}$$

Further, step S3 includes:

S3.1: the two-dimensional feature vectors $v_i$ can be regarded as a set of points in two-dimensional space with $(t_i, f_i)$ as horizontal and vertical coordinates, where $i = 1, 2, \ldots, I$; each point corresponds to the consumption behavior of one student, and each point is taken as an independent initial class to initialize the clustering process;

S3.2: randomly selecting a point $v_x$ as the initial centroid $cen_x$;

S3.3: selecting a sliding window of bandwidth r centered on the centroid $cen_x$, denoting the set of all points within the window as W, temporarily marking these points as belonging to class $clu_x$, and increasing by 1 the access frequency of these points within that class;

S3.4: calculating the radial-basis-kernel-weighted average offset of all points in the sliding window relative to the centroid $cen_x$, and taking it as the mean shift vector $M_r$ (formula (8));

S3.5: updating the centroid coordinates with the mean shift vector $M_r$:

$$cen_{x+1} = M_r + cen_x \tag{9}$$

S3.6: repeating steps S3.3 to S3.5 until the magnitude of the offset $M_r$ is less than a threshold $T_{conv}$; the centroid $cen_X$ at that point is taken as a cluster center, all points accessed during the iterations of steps S3.3 to S3.5 belong to the class $clu_X$ corresponding to that center, and the current shift has converged;

S3.7: if the distance between the cluster center of the current class $clu_X$ and the center of an existing class is less than a threshold $T_{dis}$, merging the two classes; otherwise, keeping the current class as a new class;

S3.9: assigning all points to their cluster centers according to the marks; if a point has been marked as accessed by several classes, assigning it to the class with the higher access frequency.
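The mean shift vector of formula (8) is not reproduced in the text. A common form that is consistent with the description in step S3.4 and with the update rule of formula (9) is sketched below; the Gaussian kernel $K$ and the use of the window bandwidth $r$ as the kernel width are assumptions.

```latex
M_r = \frac{\sum_{v_i \in W} K(v_i - cen_x)\,(v_i - cen_x)}
           {\sum_{v_i \in W} K(v_i - cen_x)},
\qquad
K(u) = \exp\!\left(-\frac{\lVert u \rVert^2}{2 r^2}\right)
```

Substituting this into formula (9) gives $cen_{x+1} = cen_x + M_r$, the kernel-weighted mean of the points in W, so each iteration moves the centroid toward the denser part of the window and the iteration stops once $\lVert M_r \rVert < T_{conv}$.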

Preferably, step S4 comprises:

Further, the abnormal classes and abnormal items include:

potential economically disadvantaged students; potential off-campus students; and a potential high-consumption group;

potential economically disadvantaged students: for the cluster center point set CEN, if $CEN_x(t_x, f_x)$ and its corresponding class $clus_x$ satisfy the following conditions, the students in that class have a very low total consumption amount but a relatively high consumption frequency within the preset period,

potential off-campus students: for the cluster center point set CEN, if $CEN_x(t_x, f_x)$ and its corresponding class $clus_x$ satisfy the following conditions, both the total consumption amount and the consumption frequency of the students in that class within the preset period are extremely low,

potential high-consumption group: for the cluster center point set CEN, if $CEN_x(t_x, f_x)$ and its corresponding class $clus_x$ satisfy the following conditions, both the total consumption amount and the consumption frequency of the students in that class within the preset period are extremely high,

Further, step S5 includes:

Mining association rules between the gender item set and the consumption place item set in the consumption place record subset, and judging whether strong association rules exist. If one exists, there is a relatively clear association between student gender and canteen choice; if none exists, student gender and canteen choice have no obvious mutual influence. This specifically comprises the following steps:

Step 5.1.1: From each item of the consumption place record subset l_n, take the two elements consumption place and gender to form τ_n = {Loction_n, Sex_n | n = 1, 2, …, N} as the transactions for association rule mining. With M different places and 2 different genders, Loction^m and Sex^q (m = 1, 2, …, M; q = 1, 2) are the items in the transactions, and the transaction database is D^l = {τ_1, τ_2, …, τ_N};

Step 5.1.2: Let Ite = {Loction^m, Sex^q | m = 1, 2, …, M; q = 1, 2} be the set of all items in D^l; then any non-empty subset X of Ite is an item set in D^l. To determine the association rules between gender and consumption place, first compute the support of each 2-item set X_k (k = 1, 2, …, M × 2) formed from one place item and one gender item in Ite,

where the support of X_k is the number of transactions in D^l that contain the item set X_k divided by N, the total number of transactions in D^l;

Step 5.1.3: Set a minimum support threshold. A 2-item set X_k from step 5.1.2 whose support reaches this threshold is a frequent item set, and the set of all frequent item sets is denoted X^F;

Step 5.1.4: To determine whether student gender and canteen choice are strongly associated, generate from the frequent item set collection X^F all association rules between gender and consumption place. Taking the rules with gender as the condition and place as the result as an example:

Sex^q → Location^m, m = 1, 2, …, M, q = 1, 2 (14)

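Step 5.1.5: Compute the confidence of each association rule, i.e. the fraction of the transactions containing the condition item that also contain the result item;

Step 5.1.6: Set a minimum confidence threshold. For a 2-item set {Sex^q, Location^m} in the frequent item set collection X^F, if the confidence of the rule reaches this threshold, then Sex^q → Location^m is a strong association rule;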

Mining association rules between the access control place item set and the consumption place item set in the campus life behavior data set, and judging whether strong association rules exist. If one exists, there is a relatively clear association between teaching building choice and canteen choice; if none exists, teaching building choice and canteen choice have no obvious mutual influence. This specifically comprises the following substeps:

Step 5.2.1: From each item of the campus life behavior data set a_n, take the two elements consumption place and access control place (teaching building) to form α_n = {Loction_n, Tbuilding_n | n = 1, 2, …, N} as the transactions for association rule mining. With M different canteens and S different teaching buildings, Loction^m and Tbuilding^s (m = 1, 2, …, M; s = 1, 2, …, S) are the items in the transactions, and the transaction database is D^a = {α_1, α_2, …, α_N};

Step 5.2.2: Let Ite = {Location^m, Tbuilding^s | m = 1, 2, …, M; s = 1, 2, …, S} be the set of all items in D^a; then any non-empty subset X of Ite is an item set in D^a. To determine the association rules between teaching building and canteen, first compute the support of each 2-item set X_k (k = 1, 2, …, S × M) formed from one teaching building item and one canteen item in Ite, i.e. the number of transactions in D^a that contain X_k divided by the total number of transactions N;

Step 5.2.3: Set a minimum support threshold. A 2-item set X_k from step 5.2.2 whose support reaches this threshold is a frequent item set, and the set of all frequent item sets is denoted X^F;

Step 5.2.4: To determine whether teaching building choice and canteen choice are strongly associated, generate from the frequent item set collection X^F all association rules between teaching building and canteen. Taking the rules with teaching building as the condition and canteen as the result as an example:

Tbuilding^s → Location^m, m = 1, 2, …, M, s = 1, 2, …, S (17)

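Step 5.2.5: Compute the confidence of each association rule, i.e. the fraction of the transactions containing the condition item that also contain the result item;

Step 5.2.6: Set a minimum confidence threshold. For a 2-item set {Tbuilding^s, Location^m} in the frequent item set collection X^F, if the confidence of the rule reaches this threshold, then Tbuilding^s → Location^m is a strong association rule.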

Further, step S6 includes:

The invention also provides a specific embodiment:

This embodiment centers on campus card consumption swiping data and identifies students' consumption behavior characteristics, the associations and preferences among card swiping places, abnormal consumption behaviors, and the like:

Step 1: Read the personal information of a student group, the campus card consumption swiping records within a preset period, and the access control card swiping records for the same period, which together form an original data set reflecting campus life behavior. (Note that campus card data uses the card number as a unique identifier in one-to-one correspondence with students, and both consumption and access control swiping records contain the card number, so the three groups of records can be associated with one another through the card number.) Step 1 comprises the following substeps:

Step 1.1: Read the N = 241014 consumption card swiping records of the campus cards of 3267 students within the preset period Dur, which in this embodiment is March 1 to March 30. Each consumption card swiping record p_n can be expressed as a set of several features:

p_n = {(CardNo_n, Time_n, Loction_n, Money_n) | n = 1, 2, 3, …, N} (1)

where CardNo_n, Time_n, Loction_n and Money_n are respectively the card number, consumption time, consumption place and consumption amount of the consumption card swiping record p_n. Note that an actual card swiping record typically contains more information; only the features relevant to the present invention are listed here.

Step 1.2: The campus card uses the card number as a unique identifier in one-to-one correspondence with students. Acquiring the students' personal information here means using the card number CardNo_n in each record to look up and read the access control identification code Acc_n and the gender information Sex_n of the corresponding student. The student personal information and the consumption card swiping records p_n from step 1.1 together form the consumption place data set, which contains the N = 241014 card swiping records within the preset period Dur together with the student personal information:

l_n = {(CardNo_n, Time_n, Loction_n, Money_n, Acc_n, Sex_n) | n = 1, 2, 3, …, N} (2)

Step 2: Clean and aggregate the data set. The original data set obtained in step 1 is processed according to different life behavior characteristics to obtain a campus life behavior data set for analysis. This comprises the following substeps:

Step 2.1: For each card number (i.e. each student), count the consumption frequency and the total consumption amount within the preset period, and express them as a set of two-dimensional feature vectors:

{v_i = (t_i, f_i) | i = 1, 2, …, I} (3)

where I = 3267 is the total number of students. Since the campus card uses the card number as a unique identifier in one-to-one correspondence with students, let C^i be the campus card number of student i; then all consumption records p_n (n ≤ N) in the consumption data set with CardNo_n = C^i form the consumption record subset P^i of that student. The total consumption t_i is the sum of the consumption amounts in P^i, and the consumption frequency f_i is the number of records in P^i:

t_i = Σ_{p_n ∈ P^i} Money_n (4)

f_i = |P^i| (5)
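As an illustration only, the counting in step 2.1 can be sketched in Python as follows. The function name `frequency_and_total`, the tuple layout and the sample records are assumptions made for this sketch and do not come from the embodiment.

```python
from collections import defaultdict


def frequency_and_total(records):
    """Aggregate swipe records into per-card features.

    records: iterable of (card_no, time, location, money) tuples,
             mirroring the fields of formula (1).
    Returns {card_no: (t_i, f_i)} with t_i the total amount and
    f_i the number of records for that card.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for card_no, _time, _location, money in records:
        totals[card_no] += money   # contributes to t_i
        counts[card_no] += 1       # contributes to f_i = |P^i|
    return {card: (totals[card], counts[card]) for card in totals}


# Hypothetical sample records, invented for this sketch.
sample = [
    ("A001", "03-01 07:30", "canteen_1", 6.5),
    ("A001", "03-01 12:05", "canteen_2", 12.0),
    ("B002", "03-01 12:10", "canteen_1", 9.0),
]
features = frequency_and_total(sample)
```

Each value of `features` is one point v_i = (t_i, f_i) of the set in formula (3).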

Step 2.2: For the consumption card swiping data, consumption records of the same card number at the same consumption place whose time interval is less than a preset threshold are regarded as one consumption process; only one record is kept to represent the consumption place and the others are removed, forming a consumption place record subset. Specifically, let C^i be the campus card number of student i; then all consumption records l_n (n ≤ N) in the consumption place data set with CardNo_n = C^i form the consumption place record subset L^i of that student. The time interval threshold T_interv can be determined by a person skilled in the art; in this embodiment T_interv = 60 min. If a later record l_{n+m} has the same consumption place as l_n and the time interval between the two is less than T_interv (the condition of formula (6)), then l_{n+m} and l_n are regarded as card swiping records of the same consumption process; only l_n is kept and l_{n+m} is removed from L^i. For l_n, traverse all l_{n+m}; once the condition of formula (6) is no longer satisfied, let l_n = l_{n+m+1} and repeat the above check until all data items in L^i have been examined, which yields the valid consumption place record subset L^i of student i.

Step 2.3: Repeat step 2.2 until the consumption place record subsets for all card numbers have been obtained. In this embodiment this yields 133082 valid consumption place records for the 3267 students.
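The de-duplication of steps 2.2 and 2.3 might be sketched as below. This is a minimal sketch under stated assumptions: the exact form of formula (6) is not reproduced in the text, so the condition is taken here to be "same card, same place, and less than T_interv after the record that was kept"; the function name, the record layout and the sample data are hypothetical.

```python
from datetime import datetime, timedelta


def valid_location_records(records, t_interv=timedelta(minutes=60)):
    """Keep one record per consumption process.

    records: list of (card_no, time, location) with time a datetime.
    A record is dropped when the most recently kept record of the same
    card has the same location and lies less than t_interv before it
    (assumed reading of formula (6)).
    """
    kept = []
    last_kept = {}  # card_no -> (time, location) of the last kept record
    for card_no, time, location in sorted(records, key=lambda r: (r[0], r[1])):
        prev = last_kept.get(card_no)
        if prev is not None and prev[1] == location and time - prev[0] < t_interv:
            continue  # same consumption process as the kept record
        kept.append((card_no, time, location))
        last_kept[card_no] = (time, location)
    return kept


# Hypothetical swipes of one card, invented for this sketch.
t0 = datetime(2022, 3, 1, 12, 0)
records = [
    ("A001", t0, "canteen_1"),
    ("A001", t0 + timedelta(minutes=5), "canteen_1"),   # same process, dropped
    ("A001", t0 + timedelta(minutes=90), "canteen_1"),  # new process, kept
    ("A001", t0 + timedelta(minutes=95), "canteen_2"),  # other place, kept
]
kept = valid_location_records(records)
```

Sorting by card number and time handles all card numbers in one pass, which corresponds to repeating step 2.2 for every student.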

Step 2.4: For the preset period Dur, read in the access control card swiping records of the same group of students for the same period as the consumption data in l_n (i.e. March 1 to March 30). Using the access control identification code Acc_n and the card swiping time Time_n in l_n, select from the access control card swiping records those whose swiping time is adjacent to the consumption time, and add the access control swiping place Tbuilding_n of those records to the consumption place data records, giving the complete student campus life behavior data set a_n:

a_n = {(CardNo_n, Time_n, Loction_n, Tbuilding_n, Money_n, Acc_n, Sex_n) | n = 1, 2, 3, …, N} (7)
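One possible reading of step 2.4 is sketched below. The text does not define when a swipe counts as "adjacent" to a consumption time, so this sketch assumes the nearest access swipe with the same Acc within a hypothetical window `max_gap`, and drops consumption records that have no such swipe; the function name and the dictionary layout are also assumptions.

```python
from datetime import datetime, timedelta


def attach_teaching_building(consumption, access, max_gap=timedelta(hours=2)):
    """Add 'Tbuilding' to consumption records from nearby access swipes.

    consumption: list of dicts with at least 'Acc' and 'Time'.
    access: list of (acc, time, building) tuples.
    max_gap: assumed meaning of "adjacent" (not given in the text).
    """
    by_acc = {}
    for acc, time, building in access:
        by_acc.setdefault(acc, []).append((time, building))

    joined = []
    for rec in consumption:
        swipes = by_acc.get(rec["Acc"], [])
        # Nearest access swipe in time, or None when the student has none.
        nearest = min(swipes, key=lambda s: abs(s[0] - rec["Time"]), default=None)
        if nearest is not None and abs(nearest[0] - rec["Time"]) <= max_gap:
            joined.append({**rec, "Tbuilding": nearest[1]})
    return joined


# Hypothetical records, invented for this sketch.
t = datetime(2022, 3, 1, 12, 0)
consumption = [
    {"Acc": "X1", "Time": t, "Loction": "canteen_1"},
    {"Acc": "Y2", "Time": t, "Loction": "canteen_2"},
]
access = [
    ("X1", t - timedelta(minutes=20), "building_2"),
    ("X1", t - timedelta(hours=4), "building_8"),
]
joined = attach_teaching_building(consumption, access)
```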

Step 3: Perform cluster analysis with the mean shift method on the two-dimensional data formed by consumption frequency and total consumption. This comprises the following substeps:

Step 3.1: The two-dimensional feature vectors v_i can be regarded as a set of points in two-dimensional space with (t_i, f_i), i = 1, 2, …, I, as abscissa and ordinate; each point corresponds to the consumption behavior of one student. Each point is taken as a separate initial class to initialize the clustering process.

Step 3.2: Randomly select a point v_x as the initial centroid cen_x.

Step 3.3: Taking the centroid cen_x as the center, select a sliding window of bandwidth r, denote the set of all points within the window as W, temporarily mark these points as belonging to class clu_x, and increase the access frequency of these points within the class by 1.

Step 3.4: Calculate the radial-basis-kernel-weighted average distance M_r from all points in the sliding window to the initial centroid cen_x, and use it as the mean shift vector:

where g() denotes a radial basis kernel function.

Step 3.5: Update the centroid coordinates with the mean shift vector M_r:

cen_{x+1} = M_r + cen_x (9)

Step 3.6: Repeat steps 3.3 to 3.5 until the offset M_r is less than the threshold T_conv. The centroid cen_X at that point is taken as a cluster center, all points accessed during the iterations of steps 3.3 to 3.5 belong to the class clu_X corresponding to that center, and the drift has converged.

Step 3.7: If the distance between the cluster center of the current class clu_X and the center of an existing class is less than the threshold T_dis, merge the current class into that existing class; otherwise, keep the current class as a new class.

Step 3.8: Repeat steps 3.1 to 3.7 until all points have been accessed, which ends the mean shift clustering process.

Step 3.9: Assign every point to its cluster center according to the marks; if a point carries access marks from several classes, assign it to the class with the highest access frequency. The number of clusters finally obtained in this embodiment is 6.
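Steps 3.2 to 3.9 can be illustrated with the following pure-Python sketch. It is not the embodiment's implementation: the kernel is assumed to be Gaussian, g = exp(-d²/(2r²)), because the kernel formula is not reproduced in the text, and the default T_dis = r/2, the iteration cap and the fixed random seed are choices made only for this sketch.

```python
import math
import random


def mean_shift(points, r, t_conv=1e-3, t_dis=None, seed=0, max_iter=300):
    """Mean shift clustering of 2-D points; returns (centers, labels)."""
    rng = random.Random(seed)
    t_dis = r / 2 if t_dis is None else t_dis   # assumed merge threshold
    centers, votes = [], []   # votes[c][i]: access frequency of point i in class c
    unvisited = set(range(len(points)))
    while unvisited:
        # Step 3.2: random unvisited point as initial centroid.
        cx, cy = points[rng.choice(sorted(unvisited))]
        visits = [0] * len(points)
        for _ in range(max_iter):
            # Step 3.3: points inside the window of bandwidth r.
            window = [i for i, (px, py) in enumerate(points)
                      if math.hypot(px - cx, py - cy) <= r]
            if not window:
                break
            for i in window:
                visits[i] += 1
                unvisited.discard(i)
            # Step 3.4: kernel-weighted mean shift vector (Gaussian assumed).
            wsum = sx = sy = 0.0
            for i in window:
                px, py = points[i]
                w = math.exp(-((px - cx) ** 2 + (py - cy) ** 2) / (2 * r * r))
                wsum += w
                sx += w * (px - cx)
                sy += w * (py - cy)
            mx, my = sx / wsum, sy / wsum
            # Step 3.5: cen_{x+1} = M_r + cen_x.
            cx, cy = cx + mx, cy + my
            # Step 3.6: stop once the shift is below T_conv.
            if math.hypot(mx, my) < t_conv:
                break
        # Step 3.7: merge into a nearby existing class or keep as new class.
        for c, (ex, ey) in enumerate(centers):
            if math.hypot(ex - cx, ey - cy) < t_dis:
                votes[c] = [a + b for a, b in zip(votes[c], visits)]
                break
        else:
            centers.append((cx, cy))
            votes.append(visits)
    # Step 3.9: each point goes to the class that accessed it most often.
    labels = [max(range(len(centers)), key=lambda c: votes[c][i])
              for i in range(len(points))]
    return centers, labels


# Hypothetical (t_i, f_i) points forming two well-separated groups.
pts = [(0, 0), (0.1, 0), (0, 0.1), (10, 10), (10.1, 10), (10, 10.1)]
centers, labels = mean_shift(pts, r=1.0)
```

Because the number of classes emerges from the bandwidth r and the merge threshold rather than being fixed in advance, the sketch reflects the property the description attributes to the method, namely that no cluster count has to be specified manually. In practice t_i and f_i have different scales, so some normalization before clustering would likely be needed; the text does not state how this is handled.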

Step 4: According to the cluster analysis result, obtain the consumption behavior characteristics of students and provide references for school management and for students' self-management. This comprises the following substeps:

Step 4.1: According to the frequency-total clustering result, report to each student his or her position in the two-dimensional frequency-total space and provide an individual consumption behavior report, helping students understand and improve their own consumption behavior and habits and develop financial awareness. For example, students in a class with relatively low consumption frequency and relatively high total consumption have a high average amount per purchase, which reflects higher expectations regarding consumption or quality of life.

Step 4.2: Analyze the frequency-total clustering result to obtain abnormal classes and abnormal items, feed them back to the student management department as early warning information so that the corresponding student groups receive special attention, and at the same time provide students with appropriate consumption suggestions. In particular, the following classes of data are labeled as potentially abnormal:

(1) For the cluster center point set CEN, if a center CEN_x(t_x, f_x) and its corresponding class clus_x satisfy the following conditions, the total consumption of the students in that class within the preset period is very low while their consumption frequency is relatively high, and they are potential students with economic difficulties:

where t_CEN is the set of abscissas of the center point set CEN, f_I is the set of ordinates of all points, and mean(), std() and num() are the mean, standard deviation and count functions, respectively.

(2) For the cluster center point set CEN, if a center CEN_x(t_x, f_x) and its corresponding class clus_x satisfy the following conditions, both the total consumption and the consumption frequency of the students in that class within the preset period are extremely low, and they are potentially students not at school:

(3) For the cluster center point set CEN, if a center CEN_x(t_x, f_x) and its corresponding class clus_x satisfy the following conditions, both the total consumption and the consumption frequency of the students in that class within the preset period are extremely high, and they belong to a potential high-consumption group:

In this embodiment, no class satisfies condition (1) or (2), and one class satisfies condition (3). The students in that class form a potential high-consumption group; they can be given suggestions on rational consumption through applications associated with the campus card and receive continued attention from the student management department as a special attention group.

Step 5: Perform joint analysis on the campus life behavior data set and its subsets using the Apriori association rule analysis method. Taking the canteen as the center, mine the associations between student attributes, teaching building access and canteen choice, and analyze the life behavior trajectory preferences of different student groups. This comprises the following substeps:

Step 5.1: Mine association rules between the gender item set and the consumption place item set in the consumption place record subset obtained in step 2.3, and judge whether strong association rules exist. If one exists, there is a relatively clear association between student gender and canteen choice; if none exists, student gender and canteen choice have no obvious mutual influence. The specific implementation process is as follows:

Step 5.1.1: From each item of the consumption place record subset l_n, take the two elements consumption place and gender to form τ_n = {Loction_n, Sex_n | n = 1, 2, …, N} as the transactions for association rule mining. This embodiment contains 6 different places and 2 different genders; Loction^m and Sex^q (m = 1, 2, …, 6; q = 1, 2) are the items in the transactions, and the transaction database is D^l = {τ_1, τ_2, …, τ_N}.

Step 5.1.2: Let Ite = {Loction^m, Sex^q | m = 1, 2, …, 6; q = 1, 2} be the set of all items in D^l; then any non-empty subset X of Ite is an item set in D^l. To determine the association rules between gender and consumption place, first compute the support of each 2-item set X_k (k = 1, 2, …, 12) formed from one place item and one gender item in Ite,

where the support of X_k is the number of transactions in D^l that contain the item set X_k divided by N, the total number of transactions in D^l.

Step 5.1.3: Set a minimum support threshold. In a specific implementation, the threshold is selected by a person skilled in the art. For each 2-item set X_k in step 5.1.2, if its support reaches the minimum support threshold, then X_k is a frequent item set; the set of all frequent item sets is denoted X^F.

Step 5.1.4: To determine whether student gender and canteen choice are strongly associated, generate from the frequent item set collection X^F all association rules between gender and consumption place. Taking the rules with gender as the condition and place as the result as an example:

Sex^q → Location^m, m = 1, 2, …, 6, q = 1, 2 (14)

Step 5.1.5: Compute the confidence of each association rule, i.e. the fraction of the transactions containing the condition item that also contain the result item.

Step 5.1.6: Set a minimum confidence threshold. For a 2-item set {Sex^q, Location^m} in the frequent item set collection X^F, if the confidence of the rule reaches this threshold, then Sex^q → Location^m is a strong association rule. For the association rule Location^m → Sex^q, the support and confidence are computed in the same way and are not repeated here. The strong association rule found in this embodiment is "Bamboo Garden canteen → female students", with support 0.1502 and confidence 0.7366.
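The support and confidence checks of steps 5.1.2 to 5.1.6 can be sketched as follows for the rule direction Sex → Location. This is a simplified illustration restricted to the 2-item sets the text evaluates, not a full Apriori implementation; the function name, the sample transactions and both threshold values are hypothetical, since the thresholds used in the embodiment are not reproduced in the text.

```python
from collections import Counter


def strong_rules(transactions, min_support, min_confidence):
    """Find strong rules sex -> location among 2-item sets.

    transactions: list of (sex, location) pairs, one per tau_n.
    Returns {(sex, location): (support, confidence)}.
    """
    n = len(transactions)
    pair_count = Counter(transactions)                  # count of {sex, location}
    sex_count = Counter(sex for sex, _ in transactions)  # count of {sex}
    rules = {}
    for (sex, location), count in pair_count.items():
        support = count / n
        if support < min_support:
            continue                      # not a frequent 2-item set
        confidence = count / sex_count[sex]
        if confidence >= min_confidence:
            rules[(sex, location)] = (support, confidence)
    return rules


# Hypothetical transactions, invented for this sketch.
sample = ([("F", "canteen_1")] * 6
          + [("F", "canteen_2")] * 2
          + [("M", "canteen_2")] * 2)
rules = strong_rules(sample, min_support=0.3, min_confidence=0.7)
```

With these made-up data only ("F", "canteen_1") survives both thresholds, with support 6/10 and confidence 6/8. Swapping the roles of the two fields gives the reverse rule Location → Sex, and replacing gender with the teaching building gives the analysis of step 5.2.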

Step 5.2: Mine association rules between the access control place (i.e. teaching building) item set and the consumption place (i.e. canteen) item set in the campus life behavior data set obtained in step 2.4, and judge whether strong association rules exist. If one exists, there is a relatively clear association between teaching building choice and canteen choice; if none exists, teaching building choice and canteen choice have no obvious mutual influence. The specific implementation process is as follows:

Step 5.2.1: From each item of the campus life behavior data set a_n, take the two elements consumption place and access control place (teaching building) to form α_n = {Loction_n, Tbuilding_n | n = 1, 2, …, N} as the transactions for association rule mining. This embodiment contains 6 different canteens and 8 different teaching buildings; Loction^m and Tbuilding^s (m = 1, 2, …, 6; s = 1, 2, …, 8) are the items in the transactions, and the transaction database is D^a = {α_1, α_2, …, α_N}.

Step 5.2.2: Let Ite = {Location^m, Tbuilding^s | m = 1, 2, …, 6; s = 1, 2, …, 8} be the set of all items in D^a; then any non-empty subset X of Ite is an item set in D^a. To determine the association rules between teaching building and canteen, first compute the support of each 2-item set X_k (k = 1, 2, …, 48) formed from one teaching building item and one canteen item in Ite,

where the support of X_k is the number of transactions in D^a that contain the item set X_k divided by N, the total number of transactions in D^a.

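Step 5.2.3: Set a minimum support threshold. For each 2-item set X_k in step 5.2.2, if its support reaches the minimum support threshold, then X_k is a frequent item set; the set of all frequent item sets is denoted X^F.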

Step 5.2.4: To determine whether teaching building choice and canteen choice are strongly associated, generate from the frequent item set collection X^F all association rules between teaching building and canteen. Taking the rules with teaching building as the condition and canteen as the result as an example:

Tbuilding^s → Location^m, m = 1, 2, …, 6, s = 1, 2, …, 8 (17)

Step 5.2.5: Compute the confidence of each association rule, i.e. the fraction of the transactions containing the condition item that also contain the result item.

Step 5.2.6: Set a minimum confidence threshold. For a 2-item set {Tbuilding^s, Location^m} in the frequent item set collection X^F, if the confidence of the rule reaches this threshold, then Tbuilding^s → Location^m is a strong association rule. For the association rule Location^m → Tbuilding^s, the support and confidence are computed in the same way and are not repeated here. The strong association rules found in this embodiment are "Lan Garden canteen → Teaching Building 8", with support 0.1520 and confidence 0.5080, and "Bamboo Garden canteen → Teaching Building 2", with support 0.1566 and confidence 0.6731.

Step 6: According to the association rule analysis results, obtain the life trajectory behavior characteristics of students, and provide meal preparation planning suggestions to canteen managers and dining recommendations to students. This comprises the following substeps:

Step 6.1: From the strong association rule obtained in step 5.1, students dining at the Bamboo Garden canteen are more likely to be female. The canteen can therefore add more dishes that suit the preferences of female students and, considering that female students generally need smaller portions than male students, reduce portion sizes appropriately to avoid waste.

Step 6.2: From the strong association rules obtained in step 5.2, students going to the Lan Garden canteen most likely come from Teaching Building 8, and students going to the Bamboo Garden canteen most likely come from Teaching Building 2. The canteen management can therefore adjust meal supply times and preparation quantities according to the course schedules of Teaching Building 8 and Teaching Building 2 (e.g. class start and end times and course enrollment). For students, a forecast of canteen crowding can be provided based on the number of students in the teaching buildings that day, helping them choose off-peak dining times.

The invention has the following beneficial effects:

The invention can use the data generated by campus card swiping to automatically analyze multiple dimensions such as consumption behavior, life activity trajectories and consumption place preferences, and can provide valuable analysis results for students and schools. The invention clusters the consumption data in two dimensions, so that total consumption and consumption frequency are analyzed jointly to give a more comprehensive picture of consumption behavior, and it removes any manual limit on the number of clusters, so that the results better reflect objective conditions and are more inclusive.

The embodiments in the present description are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other.

The principles and embodiments of the present invention have been described herein using specific examples, which are provided only to help understand the method and the core concept of the present invention; meanwhile, for a person skilled in the art, according to the idea of the present invention, the specific embodiments and the application range may be changed. In view of the foregoing, the description is not to be taken in a limiting sense.

Claims

1. An analysis method based on all-purpose card big data, characterized by comprising the following steps:

S1: Reading the personal information of a group of campus card students, their campus card consumption swiping records within a preset period, and the access control card swiping records for the same period, to obtain an original data set reflecting campus life, wherein the original data set uses the card number as a unique identifier in one-to-one correspondence with students;

S2: Cleaning and aggregating the original data set, and processing it according to different life behavior characteristics to obtain a campus life behavior data set for analysis and a valid data subset thereof;

S3: Obtaining, from the valid data subset, two-dimensional data consisting of consumption frequency and total consumption, and performing cluster analysis on the two-dimensional data using a mean shift method to obtain the cluster analysis result;

S5: Performing joint analysis on the campus life behavior data set and its valid data subset using an Apriori association rule analysis method to obtain joint analysis results, and analyzing the life behavior trajectory preferences of different student groups;

2. The analysis method based on all-purpose card big data according to claim 1, wherein step S1 comprises: presetting a period Dur and reading in the N consumption card swiping records of the students' campus cards within that period, each consumption card swiping record p_n being expressed as a set of several features:

p_n = {(CardNo_n, Time_n, Loction_n, Money_n) | n = 1, 2, 3, …, N} (1)

where CardNo_n, Time_n, Loction_n and Money_n are respectively the card number, consumption time, consumption place and consumption amount of the consumption card swiping record p_n;

l_n = {(CardNo_n, Time_n, Loction_n, Money_n, Acc_n, Sex_n) | n = 1, 2, 3, …, N} (2).

3. The analysis method based on all-purpose card big data according to claim 2, wherein step S2 comprises:

{v_i = (t_i, f_i) | i = 1, 2, …, I} (3)

where I is the total number of students and C^i is the campus card number of student i; all consumption records p_n (n ≤ N) in the consumption data set with CardNo_n = C^i form the consumption record subset P^i of that student; the total consumption t_i is the sum of the consumption amounts in P^i, and the consumption frequency f_i is the number of records in P^i:

f_i = |P^i| (5)

obtaining the valid consumption place record subset L^i of student i: for each student i, with C^i the campus card number of that student, all consumption records l_n (n ≤ N) in the consumption place data set with CardNo_n = C^i form the consumption place record subset L^i of that student; the time interval threshold is set to T_interv, and if the condition of formula (6) is satisfied, then l_{n+m} and l_n are regarded as card swiping records generated in the same consumption process, only l_n is kept and l_{n+m} is removed from L^i; for l_n, all l_{n+m} are traversed, and once the condition of formula (6) is no longer satisfied, l_n = l_{n+m+1} is set and the above check is repeated until all data items in L^i have been examined, giving the valid consumption place record subset L^i of student i;

obtaining the valid access control card swiping records of student i: for the preset period Dur, reading in the access control card swiping records of the same group of students for the same period as the consumption data in l_n; using the access control identification code Acc_n and the card swiping time Time_n in l_n, selecting from the access control card swiping records those whose swiping time is adjacent to the consumption time, and adding the access control swiping place Tbuilding_n of those records to the consumption place data records, to obtain the complete student campus life behavior data set a_n:

4. The analysis method based on all-purpose card big data according to claim 3, wherein step S3 comprises:

S3.1: The two-dimensional feature vectors v_i are regarded as a set of points in two-dimensional space with (t_i, f_i), i = 1, 2, …, I, as abscissa and ordinate; each point corresponds to the consumption behavior of one student, and each point is taken as a separate initial class to initialize the clustering process;

S3.2: Randomly selecting a point v_x as the initial centroid cen_x;

S3.4: Calculating the radial-basis-kernel-weighted average distance M_r from all points in the sliding window to the initial centroid cen_x, and using it as the mean shift vector;

S3.5: Updating the centroid coordinates with the mean shift vector M_r:

cen_{x+1} = M_r + cen_x (9)

S3.6: Repeating steps S3.3 to S3.5 until the offset M_r is less than the threshold T_conv; the centroid cen_X at that point is taken as a cluster center, all points accessed during the iterations of steps S3.3 to S3.5 belong to the class clu_X corresponding to that center, and the drift has converged;

S3.7: If the distance between the cluster center of the current class clu_X and the center of an existing class is less than the threshold T_dis, merging the current class into that existing class; otherwise, keeping the current class as a new class;

S3.9: Assigning every point to its cluster center according to the marks, and if a point carries access marks from several classes, assigning it to the class with the highest access frequency, to obtain the frequency-total clustering result.

5. The analysis method based on all-purpose card big data according to claim 4, wherein step S4 comprises:

6. The analysis method based on all-purpose card big data according to claim 5, wherein the abnormal items and the abnormal classes comprise:

students with potential economic difficulties: for the cluster center point set CEN, if a center CEN_x(t_x, f_x) and its corresponding class clus_x satisfy the following conditions, the total consumption of the students in that class within the preset period is very low while their consumption frequency is relatively high;

students potentially not at school: for the cluster center point set CEN, if a center CEN_x(t_x, f_x) and its corresponding class clus_x satisfy the following conditions, both the total consumption and the consumption frequency of the students in that class within the preset period are extremely low;

potential high-consumption groups: for the cluster center point set CEN, if a center CEN_x(t_x, f_x) and its corresponding class clus_x satisfy the following conditions, both the total consumption and the consumption frequency of the students in that class within the preset period are extremely high.

7. the method for analyzing the big data of the all-purpose card according to claim 6, wherein the step S5 comprises:

performing association rule mining on the gender item set and the consumption site item set in the consumption site record subset, judging whether a strong association rule exists, and reflecting that the gender of the student is clearly associated with the selection of the dining room if the strong association rule exists; if the gender of the student does not exist, the fact that obvious mutual influence does not exist between the gender of the student and the selection of the canteen is shown, and the method specifically comprises the following steps:

Step 5.1.1: from each record $l_n$ of the consumption location record subset, take the two elements consumption location and gender to form $\tau_n = \{Location_n, Sex_n \mid n = 1, 2, \ldots, N\}$ as a transaction for association rule mining; the $M$ different locations and 2 different genders, $Location^m$ and $Sex^q$ ($m = 1, 2, \ldots, M$; $q = 1, 2$), are the items in the transactions, and the transaction database is $D^l = \{\tau_1, \tau_2, \ldots, \tau_N\}$;

Step 5.1.2: let $Ite = \{Location^m, Sex^q \mid m = 1, 2, \ldots, M;\ q = 1, 2\}$ be the set of all items in $D^l$; then any non-empty subset $X$ of $Ite$ is an itemset in $D^l$. To determine the association rules between gender and consumption location, first compute the support of each 2-itemset $X_k$ ($k = 1, 2, \ldots, M \times 2$) consisting of one location item and one gender item in $Ite$:

where the support of $X_k$ is the number of transactions in $D^l$ that contain the itemset $X_k$ divided by $N$, the total number of transactions in $D^l$;
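The support computation of steps 5.1.1 and 5.1.2 can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name `support_of_pairs`, the tuple layout `(location, sex)`, and the sample records are hypothetical, and support is taken in its usual sense as the fraction of transactions containing the itemset.

```python
from collections import Counter


def support_of_pairs(transactions):
    """Return {(location, sex): support} for every 2-itemset that occurs.

    Each transaction is assumed to be a (location, sex) tuple, so a
    transaction contains exactly one location-gender 2-itemset.
    """
    n = len(transactions)
    if n == 0:
        return {}
    counts = Counter(transactions)  # occurrences of each 2-itemset
    return {pair: count / n for pair, count in counts.items()}


# Hypothetical transaction database D^l
transactions = [
    ("Canteen1", "F"),
    ("Canteen1", "F"),
    ("Canteen1", "M"),
    ("Canteen2", "M"),
    ("Canteen2", "M"),
]
supports = support_of_pairs(transactions)
```

Because every transaction holds exactly one location and one gender, counting identical tuples is sufficient here; a general Apriori implementation would instead test subset containment for each candidate itemset.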

Step 5.1.3: set a minimum support threshold; each 2-itemset $X_k$ from step 5.1.2 whose support meets this threshold is a frequent itemset, and the frequent itemsets are collected into the set $X^F$;

Step 5.1.4: to determine whether student gender and canteen selection are strongly associated, generate from the frequent itemset collection $X^F$ all association rules between gender and consumption location, where the rules with gender as the antecedent and location as the consequent are:

$Sex^q \rightarrow Location^m,\quad m = 1, 2, \ldots, M,\quad q = 1, 2 \qquad (14)$

Step 5.1.5: compute the confidence of each association rule:

Step 5.1.6: set a minimum confidence threshold; for each frequent itemset in $X^F$, if the confidence of the corresponding rule meets this threshold, then $Sex^q \rightarrow Location^m$ is a strong association rule;
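Steps 5.1.3 to 5.1.6 can be illustrated with a short Python sketch. It assumes the standard definition of confidence (the count of transactions containing both gender and location divided by the count containing the gender) and treats both thresholds as inclusive; the function name `strong_rules`, the threshold values, and the sample data are hypothetical.

```python
from collections import Counter


def strong_rules(transactions, min_support, min_confidence):
    """Find strong rules sex -> location from (location, sex) transactions.

    Both thresholds are applied as ">="; that inclusiveness is an assumption.
    Returns a list of (sex, location, support, confidence) tuples.
    """
    n = len(transactions)
    pair_counts = Counter(transactions)
    sex_counts = Counter(sex for _, sex in transactions)
    rules = []
    for (location, sex), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue  # not a frequent 2-itemset, so no rule is generated
        confidence = count / sex_counts[sex]
        if confidence >= min_confidence:
            rules.append((sex, location, support, confidence))
    return rules


# Hypothetical transactions and thresholds
transactions = [
    ("Canteen1", "F"),
    ("Canteen1", "F"),
    ("Canteen1", "M"),
    ("Canteen2", "M"),
    ("Canteen2", "M"),
]
rules = strong_rules(transactions, min_support=0.3, min_confidence=0.6)
```

With this sample, the pair (Canteen1, M) is dropped at the support stage, while F -> Canteen1 and M -> Canteen2 pass both thresholds.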

Step 5.2.1: from each record $a_n$ of the campus life behavior data set, take the two elements consumption location and teaching building to form $\alpha_n = \{Location_n, Tbuilding_n \mid n = 1, 2, \ldots, N\}$ as a transaction for association rule mining; the $M$ different canteens and $S$ different teaching buildings, $Location^m$ and $Tbuilding^s$ ($m = 1, 2, \ldots, M$; $s = 1, 2, \ldots, S$), are the items in the transactions, and the transaction database is $D^a = \{\alpha_1, \alpha_2, \ldots, \alpha_N\}$;

Step 5.2.2: let $Ite = \{Location^m, Tbuilding^s \mid m = 1, 2, \ldots, M;\ s = 1, 2, \ldots, S\}$ be the set of all items in $D^a$; then any non-empty subset $X$ of $Ite$ is an itemset in $D^a$. To determine the association rules between teaching buildings and canteens, first compute the support of each 2-itemset $X_k$ ($k = 1, 2, \ldots, S \times M$) consisting of one teaching building item and one canteen item in $Ite$:

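where the support of $X_k$ is the number of transactions in $D^a$ that contain the itemset $X_k$ divided by $N$, the total number of transactions in $D^a$;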

Step 5.2.3: set a minimum support threshold; each 2-itemset $X_k$ from step 5.2.2 whose support meets this threshold is a frequent itemset, and the frequent itemsets are collected into the set $X^F$;

Step 5.2.4: to determine whether teaching buildings and canteen selection are strongly associated, generate from the frequent itemset collection $X^F$ all association rules between teaching buildings and canteens, where the rules with the teaching building as the antecedent and the canteen as the consequent are:

$Tbuilding^s \rightarrow Location^m,\quad m = 1, 2, \ldots, M,\quad s = 1, 2, \ldots, S \qquad (17)$

Step 5.2.5: compute the confidence of each association rule:

Step 5.2.6: set a minimum confidence threshold; for each frequent itemset in $X^F$, if the confidence of the corresponding rule meets this threshold, then $Tbuilding^s \rightarrow Location^m$ is a strong association rule.

8. The analysis method based on all-purpose card big data according to claim 7, wherein step S6 comprises:

using the strong association rules to determine the potential influence of entering and leaving different teaching buildings on canteen selection; and, for each canteen strongly associated with a teaching building, combining the class start and end times of the courses scheduled that day in the corresponding teaching building with the number of students in those courses, so as to provide the canteen management with suggestions on meal supply times and meal preparation quantities, and to provide students with suggestions on dining location selection and off-peak dining.
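One hypothetical way to turn these inputs into numbers is sketched below in Python. The text does not specify a formula, so scaling course enrollment by rule confidence is purely an illustrative assumption, as are the function name `expected_arrivals`, the data layout, and the sample values.

```python
from collections import defaultdict


def expected_arrivals(strong_rules, courses):
    """Estimate diners per (canteen, class end time).

    strong_rules: {(building, canteen): confidence} for strong rules only
    courses: list of (building, end_time, enrolled_students)
    Multiplying enrollment by confidence is an assumed heuristic,
    not a formula given in the source.
    """
    estimate = defaultdict(float)
    for building, end_time, students in courses:
        for (rule_building, canteen), confidence in strong_rules.items():
            if rule_building == building:
                estimate[(canteen, end_time)] += confidence * students
    return dict(estimate)


# Hypothetical strong rules and same-day course schedule
rules = {("BuildingA", "Canteen1"): 0.7, ("BuildingB", "Canteen2"): 0.6}
courses = [
    ("BuildingA", "11:40", 120),
    ("BuildingA", "11:40", 80),
    ("BuildingB", "12:10", 50),
]
plan = expected_arrivals(rules, courses)
```

Grouping the estimate by class end time gives the canteen a rough arrival profile for planning supply times, and the same profile can be shown to students to indicate likely peak periods.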