CN113010578B - Community data analysis method and device, community intelligent interaction platform and storage medium - Google Patents

Community data analysis method and device, community intelligent interaction platform and storage medium

Info

Publication number
CN113010578B
Authority
CN
China
Prior art keywords
community
residents
data
node
centrality
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110303133.6A
Other languages
Chinese (zh)
Other versions
CN113010578A (en)
Inventor
莫海彤
刘玉亭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
South China University of Technology SCUT
Original Assignee
South China University of Technology SCUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by South China University of Technology SCUT
Priority to CN202110303133.6A
Publication of CN113010578A
Application granted
Publication of CN113010578B

Classifications

    • G06F16/2465 Query processing support for facilitating data mining operations in structured databases
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2474 Sequence data queries, e.g. querying versioned data
    • G06F16/285 Clustering or classification
    • G06F16/29 Geographical information databases
    • G06Q10/06393 Score-carding, benchmarking or key performance indicator [KPI] analysis
    • G06Q50/26 Government or public services
    • G06F2216/03 Data mining

Abstract

The invention discloses a community data analysis method and device, a community intelligent interaction platform and a storage medium. The method comprises the following steps: acquiring space-time track data and personal information data of community residents, binding both to basic geographic block data, and marking the residents' activities as different types; recording the daily activities of each resident as a sequence of activity types; performing sequence analysis on the residents' activities in each community to obtain the basic behavior patterns of residents of different communities; and mining, from the residents' personal information data, the socioeconomic attributes associated with each behavior pattern. The invention helps to improve the quality and efficiency of grassroots community governance, clarify the intelligent methods and collaborative mechanisms of community governance, and promote the construction and improvement of a community system of co-construction, co-governance and sharing.

Description

Community data analysis method and device, community intelligent interaction platform and storage medium
Technical Field
The invention relates to a community data analysis method and device, a community intelligent interaction platform and a storage medium, and belongs to the field of digital management and community development planning.
Background
The community is the cornerstone of social governance, and advancing community governance is the top-level design guidance for community planning in the new era. The construction of complete communities was also listed among the 2020 work priorities of the national Ministry of Housing and Urban-Rural Development; the emphasis is on further strengthening community construction, on the basis of making up for shortfalls in public service resources, so as to better meet community residents' growing needs for a better life.
To build and popularize complete communities and a community governance system of co-construction, co-governance and sharing, urban and rural communities across China have successively launched activities for jointly creating a good environment and a happy life, going into grassroots communities to carry out public interviews, questionnaires, discussion meetings and similar activities, and experts and scholars have offered much useful guidance on how such activities should be developed and implemented. However, because financial, material and human resources are insufficient, the communities covered are still very limited and long-term operating mechanisms are difficult to form. Grassroots communities still face prominent governance difficulties and need to make up for shortfalls in community facilities, resources and management capacity, so that the governance level of grassroots communities can be improved intelligently, efficiently and systematically.
Information and communication technology has penetrated deeply into every field of national governance, and digital governance based on it has become a basic feature of the modernization of the national governance system and governance capability. At the community level, however, the methods and mechanisms for digital, refined and intelligent governance still need further exploration.
Disclosure of Invention
To overcome the shortcomings of the prior art, the invention provides a community data analysis method and device, a community intelligent interaction platform and a storage medium. A big-data technology platform is used to collect, analyze and process massive community data resources in a specialized way, so as to obtain the basic behavior patterns of residents of different communities, mine the socioeconomic attributes of residents associated with each behavior pattern, calculate the accessibility of community life-circle facilities, and analyze the role of each subject in the community governance network. On this basis, a community intelligent interaction platform is provided for information exchange and decision interaction among multiple subjects such as governments, enterprises, experts and scholars, community residents and social organizations. The invention helps to improve the quality and efficiency of grassroots community governance in China, clarify the intelligent methods and collaborative mechanisms of community governance, and advance the construction and improvement of complete communities and of a system of co-construction, co-governance and sharing.
A first object of the present invention is to provide a community data analysis method.
A second object of the present invention is to provide a community data analysis device.
The third object of the invention is to provide a community intelligent interaction platform.
A fourth object of the present invention is to provide a storage medium.
The first object of the present invention can be achieved by adopting the following technical scheme:
a community data analysis method, the method comprising:
acquiring space-time track data and personal information data of community residents, binding the space-time track data and the personal information data with basic geographic block data, and marking various activities of the community residents as different types;
recording the daily activities of each resident of the community through a sequence consisting of different activity types;
performing sequence analysis on resident activities of each community to obtain basic behavior patterns of residents of different communities;
and mining the socioeconomic attributes of residents associated with each behavior pattern according to the residents' personal information data.
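The second step above, recording a resident's day as a sequence of activity types, can be illustrated with a minimal sketch. The one-letter codes, the hourly time slots and the function name `build_daily_sequences` are assumptions made for illustration; the text only states that activities are marked as different types and recorded as a sequence.

```python
from collections import defaultdict

# Hypothetical one-letter codes for activity types (not given in the source).
ACTIVITY_CODES = {"home": "H", "work": "W", "shopping": "S", "leisure": "L"}

def build_daily_sequences(records, slots_per_day=24):
    """Turn (resident_id, hour, activity_type) records into one string per resident.

    Hours without a record are filled with '-' so all sequences have equal length.
    """
    by_resident = defaultdict(lambda: ["-"] * slots_per_day)
    for resident_id, hour, activity in records:
        by_resident[resident_id][hour] = ACTIVITY_CODES[activity]
    return {rid: "".join(slots) for rid, slots in by_resident.items()}

# Made-up records for two residents.
records = [
    ("r1", 0, "home"), ("r1", 9, "work"), ("r1", 19, "shopping"),
    ("r2", 0, "home"), ("r2", 10, "leisure"),
]
sequences = build_daily_sequences(records)
```

Strings of this kind are the input that a sequence comparison step would operate on.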
Further, performing sequence analysis on resident activities of each community to obtain basic behavior patterns of residents of different communities specifically comprises:
comparing the sequences of the residents of each community;
classifying the sequences according to sequence similarity, wherein each sequence class represents a behavior pattern;
and obtaining the basic behavior patterns of residents of different communities from the comparison results of the resident sequences of each community.
Further, comparing the sequences of the residents of each community specifically comprises:
constructing a scoring function: if the corresponding characters of the two sequences match, a score of 1 is assigned; if they mismatch, a score of 0 is assigned; a gap in either sequence is assigned the gap penalty d;
pairwise sequence alignment: when two sequences are aligned, global alignment uses the Needleman-Wunsch algorithm and local alignment uses the Smith-Waterman algorithm; for each alignment both algorithms are run, and the one with the better matching result is selected to complete the sequence alignment.
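As a minimal sketch of the two alignment scores, the functions below use the scoring function stated above (1 for a match, 0 for a mismatch). Treating d as a positive value that is subtracted for each gap, and the boundary initialization, are assumptions; the text does not state the sign convention or how the "better" result is judged, so both scores are simply returned.

```python
def score(a, b):
    """Scoring function from the text: 1 for a match, 0 for a mismatch."""
    return 1 if a == b else 0

def needleman_wunsch(s, t, d=1):
    """Global alignment score; d is assumed to be a positive gap penalty."""
    rows, cols = len(s) + 1, len(t) + 1
    H = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        H[i][0] = -d * i          # leading gaps in t
    for j in range(1, cols):
        H[0][j] = -d * j          # leading gaps in s
    for i in range(1, rows):
        for j in range(1, cols):
            H[i][j] = max(H[i - 1][j - 1] + score(s[i - 1], t[j - 1]),
                          H[i - 1][j] - d,
                          H[i][j - 1] - d)
    return H[-1][-1]

def smith_waterman(s, t, d=1):
    """Local alignment score: cells are floored at 0 and the maximum cell is returned."""
    rows, cols = len(s) + 1, len(t) + 1
    H = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            H[i][j] = max(0,
                          H[i - 1][j - 1] + score(s[i - 1], t[j - 1]),
                          H[i - 1][j] - d,
                          H[i][j - 1] - d)
            best = max(best, H[i][j])
    return best

def compare(s, t, d=1):
    """Run both algorithms and return (global_score, local_score)."""
    return needleman_wunsch(s, t, d), smith_waterman(s, t, d)
```

For example, `compare("HWS", "HS")` aligns two short activity strings and returns one score per algorithm; the selection rule between them is left open here because the source does not specify it.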
Further, classifying according to sequence similarity, with each sequence class representing a behavior pattern, specifically comprises:
in global alignment, H_{ij} is calculated as follows: [formula not reproduced in this text]
in local alignment, H_{ij} is calculated as follows: [formula not reproduced in this text]
the similarity score of the two sequences is denoted S and is calculated as:
S = \max_{i,j} H_{ij}
classifying the sequences with the highest pairwise comparison scores into the same class according to the S value, wherein each sequence class represents a resident behavior pattern; here H_{ij} denotes the similarity value of the matrix element in row i and column j, W_{ij} denotes the similarity weight for row i and column j, and d denotes the gap penalty.
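For reference, the textbook recurrences of the two named algorithms can be written with the symbols defined above. This is a sketch under stated assumptions, not the formulas of the original filing: d is taken as a non-negative penalty that is subtracted (if d is defined as a negative score, the terms become + d), W_{ij} is the match/mismatch score of the i-th and j-th characters, and the usual boundary conditions H_{i0} = -i d, H_{0j} = -j d (global) and H_{i0} = H_{0j} = 0 (local) are assumed.

```latex
% Global alignment (Needleman-Wunsch)
H_{ij} = \max\left\{ H_{i-1,j-1} + W_{ij},\; H_{i-1,j} - d,\; H_{i,j-1} - d \right\}

% Local alignment (Smith-Waterman)
H_{ij} = \max\left\{ 0,\; H_{i-1,j-1} + W_{ij},\; H_{i-1,j} - d,\; H_{i,j-1} - d \right\}

% Similarity score as stated in the text
S = \max_{i,j} H_{ij}
```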
Further, obtaining the basic behavior patterns of residents of different communities from the comparison results of the resident sequences of each community specifically comprises:
performing multiple sequence alignment on the sequences, constructing an evolutionary tree from the alignment result by the maximum likelihood method, extracting the typical sequence of each branch of the evolutionary tree, and analyzing these sequences to obtain the basic behavior patterns of residents of different communities.
Further, mining the socioeconomic attributes of residents associated with each behavior pattern according to the residents' personal information data specifically comprises:
obtaining the residents' socioeconomic attributes from the personal information data, coding these attributes, and performing frequent-itemset mining on the coded attributes within each sequence class to obtain the high-frequency socioeconomic attributes associated with each behavior pattern.
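The frequent-item step can be illustrated with a brute-force counter over coded attributes. This is a simplified stand-in: the figure list refers to FP-trees, which suggests an FP-growth style miner in the embodiment, whereas the sketch below simply enumerates small itemsets. The attribute codes and the support threshold are hypothetical.

```python
from collections import Counter
from itertools import combinations

def frequent_itemsets(transactions, min_support, max_size=2):
    """Count itemsets of up to max_size items; keep those in >= min_support transactions."""
    counts = Counter()
    for items in transactions:
        unique = sorted(set(items))            # ignore repeats within one resident
        for size in range(1, max_size + 1):
            counts.update(combinations(unique, size))
    return {itemset: n for itemset, n in counts.items() if n >= min_support}

# Hypothetical coded attributes of the residents in one behavior-pattern class
# (a = age band, b = education level, c = income band).
pattern_class = [
    ["a1", "b2", "c3"],
    ["a1", "b2", "c1"],
    ["a2", "b2", "c3"],
    ["a1", "b2", "c3"],
]
result = frequent_itemsets(pattern_class, min_support=3)
```

With this toy data the surviving itemsets are the single codes and pairs that occur in at least three of the four residents, which is the kind of high-frequency attribute combination the step is after.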
Further, the method further comprises:
inferring, from the resident behavior patterns of each community, the types of service facilities commonly used by its residents, and calculating the accessibility of the community life-circle facilities.
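The beneficial-effects passage of this document names the cumulative opportunity method for this accessibility calculation. A minimal sketch of that idea counts the facilities of the relevant types that lie within a threshold of a community's location. Straight-line (haversine) distance, the 1000 m threshold and all coordinates below are assumptions for illustration; a real calculation might use network travel time instead.

```python
import math

def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in metres between two lon/lat points."""
    r = 6371000.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def cumulative_opportunities(origin, facilities, wanted_types, threshold_m=1000.0):
    """Count facilities of the wanted types within threshold_m of the origin."""
    count = 0
    for lon, lat, ftype in facilities:
        if ftype in wanted_types and haversine_m(origin[0], origin[1], lon, lat) <= threshold_m:
            count += 1
    return count

# Made-up community centroid and facilities (lon, lat, type).
community = (113.0, 23.0)
facilities = [
    (113.001, 23.0, "shopping"),   # roughly 100 m away
    (113.0, 23.005, "dining"),     # roughly 550 m away
    (113.05, 23.0, "shopping"),    # roughly 5 km away
    (113.002, 23.0, "school"),     # close, but not a wanted type
]
reachable = cumulative_opportunities(community, facilities, {"shopping", "dining"})
```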
Further, the method further comprises:
acquiring the follow (attention) information and comment information of multiple subjects in the community communication forum of the interaction platform, and analyzing the roles of the subjects in the community governance network by constructing a relationship network of the multiple subjects in community governance, which specifically comprises:
constructing the relationship network of the subjects in community governance, calculating the degree centrality, closeness centrality and betweenness centrality of each node in the network, and analyzing the roles of the multiple subjects in the community governance network.
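The three centrality measures can be sketched in plain Python for a small undirected, unweighted network. The example graph and its node names are hypothetical, the closeness formula assumes a connected graph, and betweenness is left unnormalized; how edges are derived from follow and comment records is not specified here.

```python
from collections import deque

def degree_centrality(graph):
    """Share of the other nodes that each node is directly linked to."""
    n = len(graph)
    return {v: len(graph[v]) / (n - 1) for v in graph}

def closeness_centrality(graph):
    """(reachable nodes) / (sum of shortest-path lengths), via breadth-first search."""
    result = {}
    for s in graph:
        dist, queue = {s: 0}, deque([s])
        while queue:
            v = queue.popleft()
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        total = sum(dist.values())
        result[s] = (len(dist) - 1) / total if total > 0 else 0.0
    return result

def betweenness_centrality(graph):
    """Brandes' algorithm; counts shortest paths passing through each node."""
    bc = {v: 0.0 for v in graph}
    for s in graph:
        stack, preds = [], {v: [] for v in graph}
        sigma = {v: 0 for v in graph}
        dist = {v: -1 for v in graph}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in graph}
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    return {v: b / 2 for v, b in bc.items()}  # undirected: each pair was counted twice

# Hypothetical governance network as adjacency sets.
graph = {
    "government": {"enterprise", "social_org", "residents"},
    "enterprise": {"government"},
    "social_org": {"government"},
    "residents": {"government", "owners_committee"},
    "owners_committee": {"residents"},
}
degree = degree_centrality(graph)
closeness = closeness_centrality(graph)
betweenness = betweenness_centrality(graph)
```

In this toy network the node with the highest betweenness is the one that most other subjects must pass through to reach each other, which is one way to read a subject's role as an intermediary.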
The second object of the invention can be achieved by adopting the following technical scheme:
a community data analysis device applied to a cloud server, the device comprising:
the acquisition unit is used for acquiring space-time track data and personal information data of community residents, binding the space-time track data and the personal information data with basic geographic block data and marking all activities of the community residents as different types;
the sequence unit is used for recording the daily activities of all residents in the community through a sequence consisting of different activity types;
the analysis unit is used for carrying out sequence analysis on resident activities of all communities to obtain basic behavior patterns of residents of different communities;
and the association unit is used for mining the socioeconomic attributes of residents associated with each behavior pattern according to the residents' personal information data.
The third object of the present invention can be achieved by adopting the following technical scheme:
a community intelligent interaction platform, comprising a user end, a staff end and a cloud server that are connected to one another in pairs, wherein:
the user end is used for inputting information and sending the input information to the staff end;
the staff end is used for auditing and registering information and filing information sent by the user end, returning the audited information to the user end and sending the audited information to the cloud server;
the cloud server is used for executing the community data analysis method.
The fourth object of the present invention can be achieved by adopting the following technical scheme:
a storage medium storing a program which, when executed by a processor, implements the community data analysis method described above.
Compared with the prior art, the invention has the following beneficial effects:
By acquiring the space-time track data and personal information data of community residents and applying sequence analysis, the invention obtains the basic behavior patterns and socioeconomic attributes of residents of different communities; these can assist planning practitioners in making planning decisions on community public service facilities and give related enterprises a decision reference for the layout and positioning of commercial service facilities. The types of service facilities commonly used by the residents of each community are inferred from their behavior patterns, and the accessibility of community life-circle facilities is calculated by the cumulative opportunity method, which can be used to evaluate and design the overall facility accessibility of each community. By constructing a relationship network of the multiple subjects in community governance, the network relations among stakeholders such as governments, enterprises, social organizations, experts and scholars, residents and owners' committees can be observed, and the role of each subject in the governance network can be measured quantitatively. The community intelligent platform collects news, comments and complaints about each community from various network platforms in real time and extracts useful information to guide the improvement of the community living environment.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to the structures shown in these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a block diagram of a community intelligent interaction platform according to embodiment 1 of the present invention.
Fig. 2 is a flowchart of a community data analysis method according to embodiment 1 of the present invention.
Fig. 3 shows the result of the global alignment algorithm for two sequences better suited to global alignment in embodiment 1 of the present invention.
Fig. 4 shows the result of the local alignment algorithm for two sequences better suited to global alignment in embodiment 1 of the present invention.
Fig. 5 shows the result of the global alignment algorithm for two sequences better suited to local alignment in embodiment 1 of the present invention.
Fig. 6 shows the result of the local alignment algorithm for two sequences better suited to local alignment in embodiment 1 of the present invention.
Fig. 7 is a schematic diagram of the FP-tree of …_k of embodiment 1 of the present invention.
Fig. 8 is a schematic diagram of the final assembled FP-tree of embodiment 1 of the present invention.
Fig. 9 is a schematic diagram of FP subtrees of c3 of embodiment 1 of the present invention.
Fig. 10 is a schematic diagram of FP subtrees of b2 of embodiment 1 of the present invention.
Fig. 11 is a block diagram showing the configuration of a community data analysis device according to embodiment 2 of the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments, and all other embodiments obtained by those skilled in the art without making any inventive effort based on the embodiments of the present invention are within the scope of protection of the present invention.
Example 1:
This embodiment provides a community intelligent interaction platform. As shown in fig. 1, the platform is built on a WeChat mini program/APP/website, a cloud server and multi-source heterogeneous data, and comprises a user end, a staff end and a cloud server. The user end is connected to the staff end, and the cloud server is connected to both the user end and the staff end. The user end comprises an information registration module, a community service module and a supervision module; the staff end comprises an authentication module; and the cloud server comprises a community service module, a data service module and a supervision module. Each module is described below:
1. Information registration module
The user login/registration interface is used by residents, enterprises, social organizations and the like to register information. Residents fill in personal information such as identity information, education level, income level and marital status; enterprises (such as property management companies) and social organizations (such as elderly-care service stations, workers' homes and volunteer organizations) register the information of the enterprise/organization, its project library, business data and the like.
2. Authentication module
Staff audit the registered resident information in the background, and carry out qualification screening and information registration and filing for enterprises, social organizations and their practitioners. Staff here mainly refers to the responsible personnel of the relevant government departments, sub-district offices, residents' committees and the like.
3. Community service module
Selecting 'community service' in the platform jumps to this page, which mainly comprises five parts: community information display, project application and management, activity registration/service application, community communication forum, and community management. Further sections can be embedded as needed.
(I) Community information display: information on community-related policies, planning documents, enterprise/social organization data, community public service facilities, public open spaces and the like, including their spatial location and their use and maintenance status.
(II) Project application and management: on this page the government presents the policy standards, rules and budgets for government purchase of social services, award-based subsidies and the like, and opens a channel through which sub-district offices, enterprises, community organizations, residents and others submit applications for related community projects. The results of project applications, together with each project's operating status, financial balance, social benefit and other information, are displayed/disclosed on this page.
(III) Activity registration/service application: this page displays the content, schedule and other information of community activities and services, and provides an entry for residents to apply to take part in activities/services or to serve as volunteers. Community activity types include community micro-projects, community farms, art activities, communication meetings and the like; community service types include elderly care, early education, canteens and the like.
(IV) Community communication forum: users can access this page after logging in. Communication is implemented as a web forum (Bulletin Board System, BBS) plus group chat, including discussion groups within the owners' committee and among residents; every party can post and reply on the forum. A built-in survey page collects the satisfaction of governments, enterprises, social organizations and residents with the various aspects of community work, and opinion-collection activities such as voting on community affairs can be carried out. The page also provides an interface for complaints and reports.
(V) Community management: comprises two parts, security and property management. The security section covers networked management of the closed-circuit monitoring system, parking space management, fire safety management, display of information on access-control duty personnel, and the like. The property management section covers payment of property, water, electricity, gas and other fees, repair requests, consultation and complaints, neighborhood mediation, and the like.
In addition, other sections can be built into the community service module, such as community residents' cognitive map surveys and community facility and public space management, and dedicated sections can be set up for important community activities such as community micro-renovation and co-construction.
4. Data service module
Selecting 'data service' in the platform jumps to this page. The data service module mainly comprises three parts: data acquisition and processing, data sharing, and model analysis.
(I) Data acquisition and processing
1. The data sources mainly comprise:
(1) Official base database:
accessing the basic databases accumulated over the years by governments at all levels, sub-district offices, residents' committees, police stations, planning and design institutes and the like, including the geographic information base, legal person base, project base, population base, community planning documents, taxi registration information, monitoring point information, crime records, medical records, student registration information, statistical yearbooks, statistical bulletin data, community housing attributes and the like;
(2) Network open source database:
Community physical-space attributes: access network map data. Using Python's requests, pandas and json libraries and related code, crawl the POI and AOI information under each category (such as 'business house') of an open network map (for example through the APIs of Baidu Maps and Amap), and extract the points of interest (POI) of urban and rural communities and of various facilities together with their vector boundaries (areas of interest, AOI), as well as roads, green spaces, water systems and the like.
Space-time trajectories of residents: extract residents' trajectory data from mobile phone signaling data (such as Baidu Maps Huiyan data and China Unicom Smart Footprint).
Community public opinion: collect, in real time, news and comment/complaint information about each community from network platforms (such as People's Daily Online, Xinhua News Agency, Heimao Tousu (Black Cat Complaints), Zhihu and Baidu Tieba);
(3) Interaction platform data:
Socioeconomic attributes of residents: the information registration module collects residents' personal information such as identity information, education level, income level and marital status.
Basic information of enterprises (such as property management and various life-service businesses) and social organizations (such as elderly-care service stations, workers' homes and volunteer organizations), together with their project libraries, business data and the like, is collected through the information registration module.
Satisfaction of each party: the satisfaction of governments, residents and enterprises/social organizations with the various aspects of community work is obtained through the interaction platform.
Because the collected data differ in structure, and much of the raw data is voluminous and messy, all data and information need to be cleaned and processed so that their value can be fully mined.
2. The content of the data processing mainly comprises:
(1) Community physical-space attributes
After the POI and AOI data crawled from the network map is de-duplicated and invalid information is removed, the names, vector boundaries and longitude/latitude of each community and each type of facility (such as dining, shopping, companies and enterprises, and transport facility services) are extracted, together with the vector boundaries of roads, water systems and green spaces.
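The de-duplication and invalid-record removal described above might look like the following minimal sketch. The record fields (`name`, `lon`, `lat`, `category`) and the rule that two records are duplicates when name and rounded coordinates coincide are assumptions; real map API responses have their own schemas.

```python
def clean_pois(raw_pois, precision=5):
    """Drop invalid POI records and de-duplicate by (name, rounded lon, rounded lat)."""
    seen = set()
    cleaned = []
    for poi in raw_pois:
        name = (poi.get("name") or "").strip()
        lon, lat = poi.get("lon"), poi.get("lat")
        if not name or lon is None or lat is None:
            continue                                  # missing name or coordinates
        if not (-180 <= lon <= 180 and -90 <= lat <= 90):
            continue                                  # coordinates out of range
        key = (name, round(lon, precision), round(lat, precision))
        if key in seen:
            continue                                  # duplicate of an earlier record
        seen.add(key)
        cleaned.append({"name": name, "lon": lon, "lat": lat,
                        "category": poi.get("category")})
    return cleaned

# Made-up crawl output.
raw = [
    {"name": "Community Canteen", "lon": 113.3245, "lat": 23.1012, "category": "dining"},
    {"name": "Community Canteen ", "lon": 113.3245, "lat": 23.1012, "category": "dining"},
    {"name": "", "lon": 113.33, "lat": 23.10, "category": "shopping"},
    {"name": "Metro Station A", "lon": 113.3301, "lat": 23.0987, "category": "transport"},
    {"name": "Broken Record", "lon": 999.0, "lat": 23.0, "category": "shopping"},
]
pois = clean_pois(raw)
```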
(2) Resident identity information and trajectory data
First, the basic geographic block information is integrated for later use. The raw mobile phone signaling data is then processed, and users' real residence and trip information is restored using noise-reduction and calibration techniques. A Spark cluster is selected as the data processing platform; the cleaned signaling data and basic map information are stored in it, the topological relationship between base-station cells and geographic blocks (AOI) is matched, and user trajectory data is projected onto geographic entities. In addition, because mobile phone signaling data is bound to residents' identity card information, it can be associated by name with the personal information collected in the information registration module, yielding residents' personal socioeconomic attribute data (gender, age, education level, income level and so on) together with their travel trajectory data.
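Projecting trajectory points onto geographic blocks can be illustrated with a plain point-in-polygon test. This is a single-machine sketch, not the Spark-based matching of base-station cells described above; the block polygons, identifiers and track points are made up, and points that fall exactly on a boundary are not handled specially.

```python
def point_in_polygon(lon, lat, polygon):
    """Ray-casting test; polygon is a list of (lon, lat) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > lat) != (y2 > lat):                  # edge crosses the horizontal ray
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside

def project_track_to_blocks(track, blocks):
    """Map each (timestamp, lon, lat) point to the first block containing it, else None."""
    result = []
    for ts, lon, lat in track:
        block_id = next((bid for bid, poly in blocks.items()
                         if point_in_polygon(lon, lat, poly)), None)
        result.append((ts, block_id))
    return result

# Hypothetical blocks (simple squares) and a three-point track.
blocks = {
    "residential_A": [(0, 0), (1, 0), (1, 1), (0, 1)],
    "office_B": [(2, 0), (3, 0), (3, 1), (2, 1)],
}
track = [(8, 0.5, 0.5), (9, 2.5, 0.5), (10, 5.0, 5.0)]
projected = project_track_to_blocks(track, blocks)
```

Once each point carries a block identifier, the block's land-use type can be used to label the activity at that time.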
(3) Community-related enterprise and social organization information
The enterprises and social organizations of each community and their staff are registered by category, and the projects of each enterprise/social organization that relate to community construction and management are classified and archived.
(4) News stories, comments, complaints, etc
Because there is currently no official community communication platform, comments and complaints from community owners and tenants are scattered across WeChat groups, Black Cat Complaints, Zhihu, Baidu Tieba, Sina Weibo and the like, and online media such as People's Daily Online and Xinhua also carry information on individual communities. Information relevant to each community is collected from these websites by web crawlers; the text is then screened with a Python program, spam such as pornographic content and advertisements is removed, and the remaining text is retained for later use.
(5) Data information relating to personal privacy information and business confidentiality
The platform must, in particular, classify and specially process data involving personal privacy and business confidentiality, for example by specifically protecting sensitive information such as names, identity card numbers, home addresses and incomes, so as to strengthen the security protection of the data.
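The protection of sensitive fields described above can be sketched as follows. This is only a minimal illustration under stated assumptions: the field names (`name`, `id_number`, `home_address`, `income`), the masking layout, the income bands and the salt are all hypothetical and are not specified in the text.

```python
import hashlib


def mask_id_number(id_number: str) -> str:
    """Keep the first 3 and last 2 characters and replace the rest with '*'."""
    if len(id_number) <= 5:
        return "*" * len(id_number)
    return id_number[:3] + "*" * (len(id_number) - 5) + id_number[-2:]


def pseudonymize(value: str, salt: str) -> str:
    """Replace a value with the first 12 hex characters of a salted SHA-256 digest."""
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()[:12]


def income_band(income: float) -> str:
    """Coarsen an exact income into a band (band limits are hypothetical)."""
    if income < 5000:
        return "low"
    if income < 15000:
        return "middle"
    return "high"


def desensitize(record: dict, salt: str = "demo-salt") -> dict:
    """Return a copy of the record with sensitive fields masked or removed."""
    out = dict(record)
    if "name" in out:
        out["name"] = pseudonymize(out["name"], salt)
    if "id_number" in out:
        out["id_number"] = mask_id_number(out["id_number"])
    if "income" in out:
        out["income"] = income_band(out["income"])
    out.pop("home_address", None)  # the address is dropped entirely in this sketch
    return out
```

Non-sensitive fields such as the community name pass through unchanged, so the desensitized record can still be used for community-level analysis.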
(II) Data sharing
Different data are shared with different types of users, and users are allowed to modify, add to or delete part of the data content. The user types include government, technical support (such as scientific research institutions and planning and design institutes), social organizations, enterprises, community residents and the like; data involving personal privacy and business confidentiality are desensitized before sharing. In addition, various forms, downloadable specification documents, work manuals and example showcases of community activities are provided.
(III) Model analysis
The basic data are stored in the cloud server for background analysts to perform model operation analysis.
As shown in fig. 2, the embodiment provides a community data analysis method, which includes resident activity sequence analysis, community life circle facility accessibility analysis and community management network analysis, and specifically includes the following steps:
S201, acquiring space-time track data and personal information data of community residents, binding the space-time track data and the personal information data with basic geographical block data, and marking all activities of the community residents as different types.
Further, the step S201 specifically includes:
S2011, integrating basic geographic block information.
After the POI and AOI data crawled from the online map are de-duplicated and invalid information is removed, the names, vector boundaries, and longitude and latitude of each community and of the various facilities (such as catering, shopping, companies and enterprises, transport facilities and services, and the like) are extracted, together with the vector boundaries of roads, water systems and green spaces, and are integrated.
S2012, original mobile phone signaling data are processed, and real residence and travel information of the user is restored based on a noise reduction calibration technology.
S2013, selecting a Spark cluster as the data processing platform, storing the cleaned mobile phone signaling data and the basic map block information on it, matching the topological relationship between base station cells and geographic block AOIs, and projecting the user trajectory data onto geographic entities. Because the mobile phone signaling data are bound to the residents' identity card information, they can be associated with the residents' personal information collected in the information registration module, yielding each resident's personal socioeconomic attribute data (such as gender, age, education level and income level) together with travel trajectory data.
S2014, based on the space-time track data of residents and the collected personal information data, bound to the basic geographic block data, each activity of a resident can be identified as one of nine types: home (A), work (B), school (C), medical visit (D), shopping (E), leisure and exercise (F), sightseeing (G), errands outside the home (at banks, post offices, libraries and the like) (H), and other (I). The identification of the place of residence and the place of work/study additionally applies the following two rules:
(1) The place with the longest stay between 9:00 and 17:00 is designated as the workplace/school, and the place with the longest stay between 21:00 and 8:00 the next day is designated as the residence;
(2) The condition in (1) is satisfied on more than 10 working days within one month.
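The two identification rules can be sketched as follows. This is a simplified illustration under stated assumptions: the stay-record layout `(day, place, start_hour, end_hour)` is hypothetical, the night window is split at midnight and credited to the same calendar day, and every supplied day is treated as a working day.

```python
from collections import Counter, defaultdict

WORK_WINDOW = [(9.0, 17.0)]
# 21:00 to 08:00 the next day, split at midnight (a simplification of this sketch)
HOME_WINDOW = [(0.0, 8.0), (21.0, 24.0)]


def overlap(start, end, windows):
    """Hours of the stay [start, end) that fall inside any of the windows."""
    return sum(max(0.0, min(end, w_end) - max(start, w_start))
               for w_start, w_end in windows)


def dominant_place(stays, windows, min_days=10):
    """Return the place with the longest stay in the window on more than
    `min_days` days, or None if no place qualifies."""
    per_day = defaultdict(Counter)  # day -> place -> hours inside the window
    for day, place, start, end in stays:
        hours = overlap(start, end, windows)
        if hours > 0:
            per_day[day][place] += hours
    # For each day, the place with the longest stay "wins" that day.
    wins = Counter(hours.most_common(1)[0][0] for hours in per_day.values())
    if not wins:
        return None
    place, days = wins.most_common(1)[0]
    return place if days > min_days else None
```

Calling `dominant_place(stays, WORK_WINDOW)` gives the candidate workplace/school and `dominant_place(stays, HOME_WINDOW)` the candidate residence for one resident's monthly stay records.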
S202, recording the daily activities of all residents of the community through sequences composed of different activity types.
The activity types of community residents are denoted A to I, and each resident's activities on each day are recorded as a sequence of these letters, for example: A (home) - B (work) - H (errands) - A (home) - F (leisure and exercise).
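A minimal sketch of this encoding step is given below. The activity labels used as dictionary keys and the `(start_hour, activity)` input layout are hypothetical, and collapsing consecutive repeats of the same activity into one letter is an assumption of this sketch rather than something the text states.

```python
# Letter codes A-I as listed in step S2014; the English keys are illustrative.
ACTIVITY_CODES = {
    "home": "A", "work": "B", "school": "C", "medical": "D", "shopping": "E",
    "leisure": "F", "sightseeing": "G", "errand": "H", "other": "I",
}


def encode_day(activities):
    """Turn one resident's activities for one day into a letter sequence.

    `activities` is a list of (start_hour, activity_label) tuples in any order.
    Unknown labels are mapped to "I" (other).
    """
    ordered = sorted(activities)  # chronological order by start hour
    letters = []
    for _, label in ordered:
        code = ACTIVITY_CODES.get(label, "I")
        if not letters or letters[-1] != code:  # collapse consecutive repeats
            letters.append(code)
    return "".join(letters)
```

With this sketch, the example day in the text (home, work, errands, home, leisure and exercise) is encoded as the string `ABHAF`.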
S203, carrying out sequence analysis on resident activities of all communities to obtain basic behavior patterns of residents of different communities.
Further, the step S203 specifically includes:
S2031, comparing the resident sequences of each community.
The resident sequences are compared pairwise with a sequence alignment algorithm implemented in Python, specifically as follows:
(1) Construction of a scoring function
If the characters at corresponding positions of the two sequences match, a score of 1 is assigned; if they do not match, a score of 0 is assigned; a gap in either sequence incurs the gap penalty d, taken here as -1. The actual values for match, mismatch and gap penalty can be adjusted as needed. Let the similarity weight of row i and column j be $W_{ij}$; with $a_i$ and $b_j$ denoting the characters compared at row i and column j, the formula is as follows:
$$W_{ij} = \begin{cases} 1, & a_i = b_j \\ 0, & a_i \neq b_j \end{cases}$$
(2) Alignment of two sequences
$H_{ij}$ denotes the similarity score of the matrix element in row i and column j, and d denotes the gap penalty. Global alignment uses the Needleman-Wunsch (NW) algorithm, which emphasizes matching residents' activities over the whole day; local alignment uses the Smith-Waterman (SW) algorithm, which emphasizes matching activities within part of the day. Because the two methods may give different results for a given pair of sequences, each pair is aligned with both algorithms, and the alignment with the better matching result is used.
In global alignment, $H_{ij}$ is calculated as:
$$H_{ij} = \max\{H_{i-1,j-1} + W_{ij},\; H_{i-1,j} + d,\; H_{i,j-1} + d\}$$
In local alignment, $H_{ij}$ is calculated as:
$$H_{ij} = \max\{0,\; H_{i-1,j-1} + W_{ij},\; H_{i-1,j} + d,\; H_{i,j-1} + d\}$$
A. Scenario in which global alignment gives the better result:
Assume the two sequences are ADAF and ABHAF, with GAP-A-D-A-F on the horizontal axis and GAP-A-B-H-A-F on the vertical axis. The value of each cell in the first row and first column is 0 minus the gap penalty (1 in this example). The value of every other cell can come from three directions: from above, from the left, and diagonally from the upper left. If the two letters of the cell match, the diagonal candidate is the upper-left value plus 1; if they do not match, it is the upper-left value plus 0. The candidate from above is the upper value minus 1 (the gap penalty), and likewise the candidate from the left is the left value minus 1. The cell takes the largest of the three candidates. With local alignment, if this maximum is negative, 0 is taken instead; with global alignment, the maximum is kept as it is, whether or not it is negative.
After the matrix is filled, the maximum value in the last column is located and the alignment is traced back from it, following the largest values along the path, to obtain the optimal alignment path.
If the global alignment NW algorithm is used, the alignment result of the two sequences is as shown in FIG. 3, and three pairs of characters can be matched.
If the local alignment SW algorithm is used, the alignment result is as shown in FIG. 4, and only two pairs of characters can be matched, so global alignment is more suitable for these two sequences.
B. Scenario in which local alignment gives the better result:
Assume the two sequences are AFAC and GHAEA. If the global alignment NW algorithm is used, the alignment result is as shown in FIG. 5, and one pair of characters can be matched; if the local alignment SW algorithm is used, the alignment result is as shown in FIG. 6, and two pairs of characters can be matched.
Local alignment is more suitable for these two sequences, since it better captures the portions of the two sequences that match closely.
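The scoring scheme above (match 1, mismatch 0, gap -1) can be sketched in Python as follows. This is an illustrative sketch, not the implementation behind FIGS. 3 to 6: it computes alignment scores only, with no traceback, and it initializes the border of the global matrix cumulatively (0, -1, -2, ...), which is the usual Needleman-Wunsch convention and an assumption here. The function names are hypothetical.

```python
def score_matrix(a, b, match=1, mismatch=0, gap=-1, local=False):
    """Fill the dynamic-programming matrix H for sequences a (rows) and b (columns)."""
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    if not local:
        # Global alignment: leading gaps are penalized cumulatively (assumption).
        for i in range(1, rows):
            H[i][0] = i * gap
        for j in range(1, cols):
            H[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            w = match if a[i - 1] == b[j - 1] else mismatch
            best = max(H[i - 1][j - 1] + w,   # diagonal: match or mismatch
                       H[i - 1][j] + gap,     # from above: gap
                       H[i][j - 1] + gap)     # from the left: gap
            # Local alignment never lets a cell drop below 0.
            H[i][j] = max(best, 0) if local else best
    return H


def nw_score(a, b, **kwargs):
    """Global (Needleman-Wunsch) score: the bottom-right cell."""
    return score_matrix(a, b, local=False, **kwargs)[-1][-1]


def sw_score(a, b, **kwargs):
    """Local (Smith-Waterman) score: the largest cell anywhere in the matrix."""
    return max(max(row) for row in score_matrix(a, b, local=True, **kwargs))
```

Under these assumptions, AFAC and GHAEA score 0 globally and 2 locally, which agrees with the preference for local alignment in scenario B. ADAF and ABHAF score 2 under both methods, so the preference for global alignment in scenario A rests on the number of matched character pairs found by the traceback, which this sketch does not compute.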
S2032, classifying according to sequence similarity, wherein each sequence class represents a behavior pattern.
The sequences are classified according to their similarity, with the sequences of greatest mutual similarity forming a group; each sequence class represents a behavior pattern.
The similarity score of two sequences is denoted S and is calculated as:
$$S = \max_{i,j} H_{ij}$$
According to the S value, the sequences that score highest in pairwise comparison are placed in the same class, and each sequence class represents a resident behavior pattern, such as "home - work - errands - work - shopping - home" or "home - leisure and exercise - medical visit - shopping - home".
S2033, obtaining the basic behavior patterns of residents of different communities from the comparison results of the resident sequences of each community.
The sequences are imported into the MEGA software, multiple sequence alignment is performed using ClustalW, and the result is exported in MEGA format. The MEGA file is then loaded back into MEGA, an evolutionary tree is constructed by the maximum likelihood method, and typical sequences of each branch of the tree are extracted and analyzed to summarize the main types of resident behavior patterns. Applying this method to each community gives the basic behavior patterns of residents in different communities.
S204, mining social and economic attributes of residents associated with various behavior modes according to personal information data of the residents.
The socioeconomic attribute data of residents are obtained from the residents' personal information data and encoded, and the frequent items of resident socioeconomic attributes in each sequence class are mined with the FP-Growth algorithm implemented in Python.
The socioeconomic attributes of residents are encoded at two levels. The first level comprises gender (a), age (b), occupation (c), income (d), expenditure level (e), health condition (f) and the like; the second level consists of the options under each first-level category, for example gender is divided into male (1) and female (2). Under this rule, male is denoted "a1" and female is denoted "a2". The socioeconomic attributes of residents are then mined separately for each resident behavior pattern obtained in the previous step.
First, the attribute sequence of each resident is denoted $S_1 = \{a1, b2, c2, d1, e2, f2\}$, $S_2 = \{a2, b2, c1, d2, e3, f1\}$, and so on; the data set of all socioeconomic attribute items in all sequences is traversed, and the support of each item, that is, its number of occurrences, is calculated.
Second, all socioeconomic attribute values are sorted in descending order of support, for example "c3: 54 times; a1: 49 times; a2: 46 times; b2: 40 times; d2: 38 times; e5: 32 times; f1: 30 times; ...".
Third, the infrequent items of resident socioeconomic attributes, that is, values that occur fewer than a given number of times, are eliminated; in this example the minimum support is set to 10% of the total sample size (the number of residents).
Fourth, the sequences ($S_1$, $S_2$, etc.) are read one by one, and the characters of each sequence are inserted into the FP-tree in order; a character ranked earlier becomes an ancestor node, and a character ranked later becomes a descendant node. For the sequence $S_k = \{a1, b2, c3, d2, e5, f1\}$, for example, the FP-tree is constructed as shown in FIG. 7. The other sequences are read and inserted into the FP-tree in the same way; where a common ancestor exists, the count of the corresponding common ancestor node is increased by 1. The complete FP-tree is finally formed as shown in FIG. 8, which is only a schematic diagram.
Fifth, conditional FP-trees are constructed: starting from the node characters at the lowest level, the conditional pattern base corresponding to each entry of the item header table is searched upward in turn, and the frequent item sets are mined recursively and returned. For example, the subtree of c3 is shown in FIG. 9, and its frequent item set is (c3:54).
For the subtree of b2, the frequent item set contains (b2:24) itself; mining yields (a1:17), (a2:7) and (c3:24), which, prefixed with b2, become (b2a1:17), (b2a2:7) and (b2c3:24); these are merged with the results of the other subtrees to obtain the final form, as shown in FIG. 10.
The extraction of the frequent items described above can be implemented in Python. In this way the high-frequency resident socioeconomic attributes, and combinations of them, associated with each behavior pattern are obtained, which answers the question of which residents exhibit which spatiotemporal behavior patterns. Likewise, by applying the method to each community and combining it with the previous step, the main behavior patterns of the residents of each community and their main socioeconomic attributes can be determined; this can assist planning practitioners in making planning decisions on community public service facilities and provide related enterprises with a decision reference for the layout and positioning of commercial service facilities.
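To illustrate what the mining step returns, the sketch below counts frequent item sets by direct enumeration. It is deliberately not an FP-Growth implementation: it builds no FP-tree and is a brute-force stand-in that yields the same frequent item sets on small inputs. The function name, the `max_size` cap and the sample data are assumptions of this sketch.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support_ratio=0.10, max_size=3):
    """Return {itemset (sorted tuple): count} for item sets whose count is at
    least min_support_ratio * number of transactions."""
    min_count = min_support_ratio * len(transactions)
    # Count single items and keep the frequent ones.
    item_counts = Counter(item for t in transactions for item in set(t))
    frequent_items = {i for i, c in item_counts.items() if c >= min_count}
    result = {}
    for size in range(1, max_size + 1):
        counts = Counter()
        for t in transactions:
            kept = sorted(set(t) & frequent_items)  # drop infrequent items first
            counts.update(combinations(kept, size))
        found = {combo: c for combo, c in counts.items() if c >= min_count}
        if not found:  # no frequent set of this size, so none of a larger size
            break
        result.update(found)
    return result


# Hypothetical attribute sequences of four residents in one behavior pattern.
residents = [
    ["a1", "b2", "c3"],
    ["a1", "b2"],
    ["a2", "b2", "c3"],
    ["a1", "c3"],
]
patterns = frequent_itemsets(residents, min_support_ratio=0.5)
```

For real data volumes, an FP-Growth implementation as named in the text would replace the enumeration, since enumeration grows quickly with the number of attributes.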
S205, reversely deducing the types of service facilities commonly used by residents in each community according to the behavior mode of the residents in each community, and calculating the accessibility of living circle facilities in the community.
Using the resident activity sequence analysis described above, the resident behavior patterns of each community are obtained, from which the types of service facilities commonly used by the residents of each community can be inferred. These are denoted as the facility set F, and each facility in the set is denoted $F_j$, j = 1, 2, ..., n, so that $F = \{F_1, F_2, \ldots, F_n\}$.
The cumulative opportunity method counts the facilities reachable within a given time or distance and is used to measure the accessibility of community facilities. According to the national Standard for Urban Residential Area Planning and Design (GB 50180-2018), urban residential areas are divided into three levels of living circles, reachable on foot within five, ten and fifteen minutes, corresponding to walking distances of 300, 500 and 1000 meters respectively; the accessibility of the various facilities within the three levels of living circles can be calculated by the cumulative opportunity method.
The calculation formula of the cumulative opportunity method is as follows:
$$A_j = P_j f(d)$$
where $A_j$ is the cumulative opportunity accessibility of facility j for the community, $P_j$ is the opportunity (quantity or quality) offered by facility j, and f(d) is a binary variable: f(d) = 1 when the distance cost d from the community to facility j is less than a set threshold, and f(d) = 0 otherwise. Combining POI data, the cumulative opportunity accessibility of each facility $F_j$ in the facility set of each community is calculated in turn for the different living circle ranges (300, 500 and 1000 meters). If the cumulative opportunity accessibility of a facility is 0 in all three living circle ranges, the corresponding facility is added with reference to the facility configuration provisions for each living circle in the Standard for Urban Residential Area Planning and Design. For the specific construction of a facility, the results of the resident activity sequence analysis can be consulted, so that the facility is designed in a more targeted way according to the high-frequency socioeconomic attributes of the residents whose activity patterns involve that facility.
In addition, the cumulative opportunity method can also carry out comprehensive measurement on the accessibility of various facilities in communities, evaluate the total accessibility of the facilities in each community, and the calculation formula is as follows:
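The formula for the overall measure is not reproduced in this text. The sketch below assumes that the total accessibility of a community is the sum of the per-facility values $A_j = P_j f(d)$ over the facility set, which is the usual form of the cumulative opportunity measure. The input layout `(facility_type, opportunity, distance_m)` and the function names are hypothetical.

```python
LIVING_CIRCLES_M = (300, 500, 1000)  # five-, ten- and fifteen-minute walking distances


def cumulative_opportunity(facilities, threshold_m):
    """Per facility type, sum P_j * f(d), where f(d) = 1 if d < threshold else 0."""
    per_type = {}
    for facility_type, opportunity, distance_m in facilities:
        f_d = 1 if distance_m < threshold_m else 0
        per_type[facility_type] = per_type.get(facility_type, 0) + opportunity * f_d
    return per_type


def accessibility_report(facilities, required_types):
    """Return (per-circle accessibility, types missing in every circle, totals)."""
    report = {}
    for radius in LIVING_CIRCLES_M:
        per_type = cumulative_opportunity(facilities, radius)
        report[radius] = {t: per_type.get(t, 0) for t in required_types}
    # A type whose accessibility is 0 in all three circles should be supplemented.
    missing = [t for t in required_types
               if all(report[r][t] == 0 for r in LIVING_CIRCLES_M)]
    # Assumed overall measure: sum over the facility set within each circle.
    total = {r: sum(report[r].values()) for r in LIVING_CIRCLES_M}
    return report, missing, total


# Hypothetical POI-derived facilities around one community.
pois = [("clinic", 1, 250), ("market", 1, 450), ("market", 1, 900), ("park", 1, 1500)]
report, missing, total = accessibility_report(pois, ["clinic", "market", "park"])
```

In this example the park lies outside all three living circles, so it is reported as a facility type to be supplemented.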
S206, acquiring the follow and comment information of the multiple subjects in the community communication forum of the interaction platform, and analyzing the role of each subject in the community governance network by constructing a relationship network of the multiple subjects in community governance.
The multiple subjects include governments, enterprises, social organizations, homeowners' committees, experts and scholars, and residents.
The relationship network of the subjects in community governance is constructed with the NetworkX library for Python, and the degree centrality, closeness centrality and betweenness centrality of each node in the network are calculated.
Further, the method specifically comprises the following steps:
S2061, constructing a relationship network of the multiple subjects in community governance.
The relationships between subjects are of two types, symmetric and one-way. A symmetric relationship refers to mutual following, mutual commenting and the like in the community communication forum of the interaction platform; a one-way relationship refers to one-sided following, commenting on a post, and the like. From the resulting social network, the network relationships among the stakeholders in community governance, such as governments, enterprises, social organizations, experts and scholars, residents and homeowners' committees, can be observed.
S2062, quantifying the roles of the multiple subjects in the governance network by calculating the degree centrality, closeness centrality and betweenness centrality of each node in the network.
(1) Degree centrality
The degree centrality of a node is the total number of links between that node and the other nodes in the network, and can represent the cohesive strength of the corresponding subject in the community governance network. Because links are directional, degree centrality is further divided into in-degree and out-degree. In-degree represents the attention the subject receives in the community governance network; a larger value indicates that the subject enjoys a higher reputation and is more likely to lead the actions and communication of the network. Out-degree represents the extent to which the subject pays attention to other subjects in the network; a larger value indicates that the subject is more interactive and active in the network. In practice, subjects with higher in-degree may be identified as leaders of community governance, while subjects with higher out-degree are identified as active participants.
If a network has N nodes, the degree centrality $C_D(i)$ of node i is calculated as:
$$C_D(i) = \sum_{j=1,\, j \neq i}^{N} x_{ij}$$
where $x_{ij}$ indicates a link between node i and node j, so that the sum counts the links of node i with the other N-1 nodes. To eliminate the influence of network size on the degree centrality value, it is standardized as follows:
$$C_D(i)' = C_D(i)/(N-1)$$
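The standardized in-degree and out-degree can be sketched in plain Python as follows. The edge direction convention (source follows or comments on target) and the sample subjects are assumptions of this sketch; the NetworkX functions `in_degree_centrality` and `out_degree_centrality` apply the same division by N-1.

```python
def degree_centralities(edges):
    """Return (in_centrality, out_centrality), each normalized by N - 1.

    `edges` is an iterable of (source, target) pairs meaning that the source
    follows or comments on the target. Duplicate edges and self-loops are ignored.
    """
    nodes = set()
    unique_edges = set()
    for source, target in edges:
        nodes.update((source, target))
        if source != target:
            unique_edges.add((source, target))
    in_degree = dict.fromkeys(nodes, 0)
    out_degree = dict.fromkeys(nodes, 0)
    for source, target in unique_edges:
        out_degree[source] += 1
        in_degree[target] += 1
    scale = len(nodes) - 1 if len(nodes) > 1 else 1
    in_centrality = {node: deg / scale for node, deg in in_degree.items()}
    out_centrality = {node: deg / scale for node, deg in out_degree.items()}
    return in_centrality, out_centrality


# Hypothetical follow/comment relations between four subjects.
relations = [
    ("resident", "government"),
    ("enterprise", "government"),
    ("government", "resident"),
    ("resident", "committee"),
]
in_c, out_c = degree_centralities(relations)
```

In this toy network the government has the highest in-degree (followed by two of the three other subjects), and the resident has the highest out-degree.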
(2) Closeness centrality
Closeness centrality represents how close a node is to the other nodes in the network. Because the links between nodes of the social network are directional, it can be divided into in-closeness centrality and out-closeness centrality. In fact, community governance activities such as joint construction are strongly tied to physical space; at the present stage most community governance activities are carried out online, and online professional social organizations and related practitioners also need to be cultivated. The closeness centrality index therefore identifies the community governance subjects that have higher closeness centrality in geographic space, and the communities in which they are located, so that the more important offline community governance and planning activities (such as community micro-renewal and co-construction workshops) can be held in such communities to extend the influence and radiating effect of the demonstration communities.
Specifically, the place of residence of each subject in the social network is extracted as the location of that subject's node. The closeness centrality formula of node i relative to the other nodes j is:
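The formula itself is not reproduced in this text. A commonly used normalized form, given here only as an assumption, is the reciprocal of the average distance from node i to the other N-1 nodes:

$$C_C(i) = \frac{N-1}{\sum_{j=1,\, j \neq i}^{N} d_{ij}}$$

where $d_{ij}$ is the distance between node i and node j. In the geographic reading described above, $d_{ij}$ would be the spatial distance between the residences of the two subjects; in a purely network reading it would be the shortest-path length. The symbols $C_C(i)$ and $d_{ij}$ are introduced here for illustration and do not appear in the original.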
(3) Betweenness centrality
The individual with the highest degree centrality in a social network is not necessarily the most active individual. Betweenness centrality measures the role of nodes that span groups in connecting different group networks. In practice, for example, a resident may be a member of a community social organization and of the homeowners' committee, take part in different community activities, and reply to or publish many posts, and thus belongs to several community governance networks made up of different subjects. In this way, the roles of certain key governance subjects (such as core members of social organizations and members of homeowners' committees) in connecting different governance networks can be identified, as can their roles in connecting members within their own networks.
The betweenness centrality of node β is calculated as:
$$c_b(\beta) = \sum_{i \neq j \neq \beta} \frac{\sigma_{ij}(\beta)}{\sigma_{ij}}$$
where $\sigma_{ij}(\beta)$ is the number of shortest paths from node i to node j that pass through node β, $\sigma_{ij}$ is the total number of shortest paths from node i to node j, and $c_b(\beta)$ measures the importance of node β in connecting node i and node j.
In practical terms, a path between node i and node j through node β can be interpreted as member i becoming friends with member j on the recommendation of member β (similar to a contact-card recommendation function), as members i and j commenting on each other under a post published by member β, or as members i and j both joining a group created by member β, and the like.
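The betweenness definition above can be sketched in plain Python for a small directed, unweighted network. This is a direct, unnormalized evaluation of the ratio of shortest-path counts, not an optimized algorithm; the function names and the sample edges are assumptions of this sketch. NetworkX, which the text names, provides `betweenness_centrality` for the same measure.

```python
from collections import deque


def bfs_counts(adjacency, source):
    """Breadth-first search returning (distance, number of shortest paths)
    from `source` to every reachable node."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adjacency.get(u, ()):
            if v not in dist:  # first time v is reached
                dist[v] = dist[u] + 1
                sigma[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:  # u lies on a shortest path to v
                sigma[v] += sigma[u]
    return dist, sigma


def betweenness(edges):
    """Sum over ordered pairs (i, j) of sigma_ij(beta) / sigma_ij for each node beta."""
    adjacency, nodes = {}, set()
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        nodes.update((u, v))
    info = {s: bfs_counts(adjacency, s) for s in nodes}
    score = dict.fromkeys(nodes, 0.0)
    for i in nodes:
        dist_i, sigma_i = info[i]
        for j in nodes:
            if j == i or j not in dist_i:
                continue
            for b in nodes:
                if b == i or b == j or b not in dist_i:
                    continue
                dist_b, sigma_b = info[b]
                # b is on a shortest i-j path iff the distances add up exactly.
                if j in dist_b and dist_i[b] + dist_b[j] == dist_i[j]:
                    score[b] += sigma_i[b] * sigma_b[j] / sigma_i[j]
    return score


# Hypothetical network: a reaches c through b or d, and c is the only link to e.
links = [("a", "b"), ("b", "c"), ("a", "d"), ("d", "c"), ("c", "e")]
scores = betweenness(links)
```

Here node c scores highest because every shortest path into e passes through it, while b and d share the paths from a to c equally.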
It should be noted that while the method operations of the above embodiments are described in a particular order, this does not require or imply that the operations must be performed in that particular order or that all of the illustrated operations be performed in order to achieve desirable results. Rather, the depicted steps may change the order of execution. Additionally or alternatively, certain steps may be omitted, multiple steps combined into one step to perform, and/or one step decomposed into multiple steps to perform.
In addition, the method can identify the physical spatial environment characteristics of a community from data such as community photographs, construct corresponding evaluation indicators to score and monitor the environmental quality of the community, and use the evaluation results to guide planning practice such as community micro-renewal and the renovation of old residential communities.
In addition, other analysis functions can be built into the model analysis module. For example, a lightweight and fast MobileNet convolutional neural network model is used to identify the physical spatial environment characteristics of a community from data such as community photographs, corresponding evaluation indicators are constructed to score and monitor the environmental quality of the community, and the evaluation results can be used to guide planning practice such as community micro-renewal and the renovation of old residential communities.
5. Supervision module
The supervision indicators for government work include working efficiency, financial income and expenditure, the number of consultation meetings and related activities organized, resident satisfaction and the like. The supervision indicators for enterprises/social organizations include project progress, financial income and expenditure, social benefit, government satisfaction, resident satisfaction and the like. The supervision indicators for resident participation mainly include the proportion of residents who serve in the residents' committee, Party organizations, mutual-aid organizations and third-party community organizations, as well as the adoption rate of residents' suggestions, the complaint handling rate, the number of consultation meetings attended and the like. Satisfaction survey, complaint and reporting pages are built into the interactive platform, through which the satisfaction and complaints of the government, residents and enterprises/social organizations regarding community public services and facilities, social organizations, property management companies and other work are surveyed. Satisfaction is measured on a five-point Likert scale, and the performance and scores of each item of community work are published on the platform regularly according to the supervision indicators.
In addition, the community intelligent system can collect, in real time, news and comment/complaint information about each community from various online platforms (such as the national 12315 platform, People's Daily Online, Xinhua News Agency, Black Cat Complaints, Zhihu, Baidu Tieba and the like). After spam text is removed, sentiment analysis is performed in Python on text such as news reports, comments and posts to identify comments with positive or negative sentiment, which are then reviewed and handled by dedicated staff so as to extract useful information and guide the improvement of the quality of the community living environment.
The intelligent interaction platform is built on a user side, a staff side and a cloud server. The user side is used to acquire information entered by users; the staff side is used to review and register the information sent from the user side, archive it, and return information data to the user side; the cloud server is the background platform that collects, processes, stores and analyzes multi-source heterogeneous data. The cloud server is not limited to a specific one; for example, a rented cloud server may be used.
To obtain the information data entered by users, the user side may take the form of a WeChat mini program, an APP, a web page and the like, where the WeChat mini program/APP/website is both the entry point for user access and the page on which information is displayed. The users' information data are sent to the cloud server through interfaces such as the WeChat mini program, the APP and the web page. Through data collection and processing, the cloud server performs sequence analysis on the resident activities of each community to obtain the basic behavior patterns and the socioeconomic attributes of the residents of different communities; infers, from the resident behavior patterns of each community, the types of service facilities commonly used by its residents and calculates the accessibility of the community's living circle facilities; and acquires the follow and comment information of the multiple subjects in the community communication forum of the interaction platform, so as to analyze the role of each subject in the community governance network by constructing a relationship network of the multiple subjects in community governance.
Example 2:
As shown in FIG. 11, the present embodiment provides a community data analysis device, which is applied to a cloud server and includes an acquisition unit 1101, a sequence unit 1102, an analysis unit 1103 and an association unit 1104, where the specific functions of the respective units are as follows:
an obtaining unit 1101, configured to obtain spatiotemporal trajectory data and personal information data of a community resident, bind the spatiotemporal trajectory data and the personal information data with basic geographic block data, and identify each activity of the community resident as a different type;
a sequence unit 1102, configured to record daily activities of each resident of the community through a sequence composed of different activity types;
an analysis unit 1103, configured to perform a sequence analysis on activities of residents in each community, so as to obtain basic behavior patterns of residents in different communities;
and the association unit 1104 is used for mining social and economic attributes of residents associated with various behavior modes according to personal information data of the residents.
Further, the apparatus further comprises:
a calculating unit 1105, configured to inversely deduce the types of service facilities commonly used by residents in each community according to the behavior patterns of residents in each community, and calculate the reachability of the living circle facilities in the community.
Further, the apparatus further comprises:
The second analysis unit 1106 is configured to acquire the follow information and comment information of the multiple subjects in the community communication forum of the interaction platform, and to analyze the role of each subject in the community governance network by constructing a relationship network of the multiple subjects in community governance.
For the specific implementation of each unit in this embodiment, reference may be made to Embodiment 1, and details are not repeated here. It should be noted that the device provided in this embodiment is illustrated only by the above division of functional units; in practical applications, the above functions may be allocated to different functional units as needed, that is, the internal structure may be divided into different functional units to perform all or part of the functions described above.
It will be understood that the terms "first," "second," etc. used in the devices of the present embodiments may be used to describe various elements, but these elements are not limited by these terms. These terms are only used to distinguish one element from another element. For example, a first hint unit may be referred to as a second hint unit, and similarly, a second hint unit may be referred to as a first hint unit, both being hint units, but not the same hint unit, without departing from the scope of the present invention.
Example 3:
The present embodiment provides a storage medium, which is a computer-readable storage medium storing a computer program that, when executed by a processor, implements the community data analysis method of Embodiment 1 above, as follows:
acquiring space-time track data and personal information data of community residents, binding the space-time track data and the personal information data with basic geographic block data, and marking various activities of the community residents as different types;
recording a sequence of activities of each resident of the community each day, the activities comprising different activity types;
performing sequence analysis on resident activities of each community to obtain basic behavior patterns of residents of different communities;
and mining social and economic attributes of residents associated with various behavior modes according to personal information data of the residents.
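As a rough illustration of the recording step, a resident's day can be written as a fixed-length string with one character per time slot. The sketch below assumes hourly slots and single-letter codes for the nine activity types named in claim 1; the codes, the slot length and the helper name `encode_day` are illustrative assumptions, not part of the method as stated.

```python
# Hypothetical one-letter codes for the nine activity types named in claim 1.
ACTIVITY_CODES = {
    "residence": "H",
    "work": "W",
    "school": "S",
    "medical visit": "M",
    "shopping": "P",
    "leisure exercise": "L",
    "sightseeing tour": "T",
    "business errand": "B",
    "other": "O",
}


def encode_day(episodes, slots=24):
    """Turn (start_hour, end_hour, activity) episodes into a slot string.

    Slots not covered by any episode default to "other".
    """
    sequence = [ACTIVITY_CODES["other"]] * slots
    for start, end, activity in episodes:
        for hour in range(max(start, 0), min(end, slots)):
            sequence[hour] = ACTIVITY_CODES[activity]
    return "".join(sequence)


# Hypothetical day of one resident.
day = [(0, 8, "residence"), (8, 12, "work"), (12, 13, "shopping"),
       (13, 18, "work"), (18, 20, "leisure exercise"), (20, 24, "residence")]
print(encode_day(day))  # → HHHHHHHHWWWWPWWWWWLLHHHH
```

Strings of equal length over a small alphabet are convenient inputs for the pairwise alignment used in the sequence analysis step.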
Further, the method further comprises: inferring the types of service facilities commonly used by residents of each community from the resident behavior patterns of that community, and calculating the accessibility of the living circle facilities of the community.
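The cumulative opportunity measure stated in claim 1 takes the opportunity of facility j when its distance cost is below a threshold, and 0 otherwise. A minimal sketch follows; the function names, the sample data, and the choice to sum over facilities to get one figure per community are assumptions made for illustration.

```python
def cumulative_opportunity(opportunity, distance_cost, threshold):
    """A_j = P_j * f(d), with f(d) = 1 if d < threshold, else 0."""
    return opportunity if distance_cost < threshold else 0


def community_accessibility(facilities, threshold):
    """Sum A_j over the facilities of one community.

    `facilities` is a list of (opportunity P_j, distance cost d) pairs.
    Summing over facilities is an assumption of this sketch.
    """
    return sum(cumulative_opportunity(p, d, threshold) for p, d in facilities)


# Hypothetical facilities: (opportunities, walking minutes from the community).
facilities = [(3, 5.0), (1, 12.0), (2, 20.0), (4, 15.0)]
print(community_accessibility(facilities, threshold=15.0))  # → 4
```

The facility at exactly 15 minutes is excluded because the condition is a strict "smaller than".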
Further, the method further comprises: acquiring the attention information and comment information of multiple subjects in the community communication forum of the interaction platform, and analyzing the role of each subject in the community governance network by constructing a relationship network of the multiple subjects in community governance.
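The three centrality measures named in claim 1 can be sketched in plain Python on a small directed network in which an edge (u, v) means that subject u follows or comments on subject v. This is an illustrative sketch: the node names are hypothetical, closeness is taken here as the number of reachable nodes divided by their total distance (an assumption), and betweenness uses Brandes' algorithm to accumulate the ratio of shortest paths through a node to all shortest paths.

```python
from collections import deque


def build_graph(edges):
    """Adjacency lists for a directed graph; edge (u, v) means u follows v."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, []).append(v)
        graph.setdefault(v, [])
    return graph


def degree_centrality(graph):
    """Normalized in-degree and out-degree, each divided by N - 1."""
    n = len(graph)
    indeg = {node: 0 for node in graph}
    for u in graph:
        for v in graph[u]:
            indeg[v] += 1
    return ({k: d / (n - 1) for k, d in indeg.items()},
            {k: len(graph[k]) / (n - 1) for k in graph})


def bfs(graph, source):
    """Distances, shortest-path counts, predecessors and visit order."""
    dist = {source: 0}
    sigma = {node: 0 for node in graph}
    sigma[source] = 1
    preds = {node: [] for node in graph}
    order = []
    queue = deque([source])
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
            if dist[v] == dist[u] + 1:
                sigma[v] += sigma[u]
                preds[v].append(u)
    return dist, sigma, preds, order


def out_closeness(graph):
    """Reachable-node count over total distance (0.0 if nothing is reachable)."""
    result = {}
    for node in graph:
        dist, _, _, _ = bfs(graph, node)
        total = sum(dist.values())
        result[node] = (len(dist) - 1) / total if total else 0.0
    return result


def betweenness(graph):
    """Brandes' algorithm for unweighted directed graphs."""
    score = {node: 0.0 for node in graph}
    for source in graph:
        _, sigma, preds, order = bfs(graph, source)
        delta = {node: 0.0 for node in graph}
        for w in reversed(order):
            for u in preds[w]:
                delta[u] += sigma[u] / sigma[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]
    return score


# Hypothetical follow relations between four subjects.
g = build_graph([("resident", "committee"), ("enterprise", "committee"),
                 ("committee", "government")])
in_deg, out_deg = degree_centrality(g)
print(in_deg["committee"], betweenness(g)["committee"])
```

In this toy network the committee receives the most attention and lies on both shortest paths to the government node, which is the kind of bridging role the betweenness measure is meant to expose.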
The storage medium in this embodiment may be a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM), a random access memory (RAM), a USB flash drive, a removable hard disk, or the like.
In summary, the present invention acquires space-time track data and personal information data of community residents and performs sequence analysis on resident activities to obtain the basic behavior patterns of residents of different communities; by mining the socioeconomic attributes of residents associated with each behavior pattern, it can assist planning practitioners in making planning decisions on community public service facilities and provide enterprises with a decision reference for the layout and positioning of commercial service facilities. The types of service facilities commonly used by residents of each community are inferred from the resident behavior patterns of that community, and the accessibility of the living circle facilities of the community is calculated using the cumulative opportunity method, which can be used to evaluate and design the overall accessibility of facilities in each community. By constructing a relationship network of the multiple subjects in community governance, the network relationships among stakeholders such as government, enterprises, social organizations, experts and scholars, residents and owners' committees can be observed, and the role of each subject in the governance network can be measured with quantifiable indicators.
The above describes only preferred embodiments of the present invention, but the protection scope of the present invention is not limited thereto; any equivalent substitution or modification that a person skilled in the art makes to the technical solution and inventive concept of the present invention, within the scope disclosed herein, falls within the protection scope of the present invention.

Claims (8)

1. A community data analysis method applied to a cloud server, the method comprising:
acquiring space-time track data and personal information data of community residents, binding the space-time track data and the personal information data with basic geographic block data, and marking various activities of the community residents as different types;
recording the daily activities of each resident of the community through a sequence consisting of different activity types;
performing sequence analysis on resident activities of each community to obtain basic behavior patterns of residents of different communities;
mining social and economic attributes of residents associated with various behavior modes according to personal information data of the residents;
inferring the types of service facilities commonly used by residents of each community from the resident behavior patterns of that community, and calculating the accessibility of the living circle facilities of the community using a cumulative opportunity method; the calculation formula of the cumulative opportunity method is as follows:
$A_j = P_j f(d)$
wherein $A_j$ represents the cumulative opportunity accessibility of facility j in the community, $P_j$ is the opportunity of facility j, and $f(d)$ is a binary variable: when the distance cost d from the community to facility j is smaller than a set threshold, $f(d)$ takes the value 1, and otherwise $f(d)$ takes the value 0;
acquiring attention information and comment information of multiple subjects in a community communication forum of an interaction platform, and analyzing the role of each subject in the community governance network by constructing a relationship network of the multiple subjects in community governance; wherein analyzing the role of each subject in the community governance network by constructing the relationship network specifically comprises:
constructing the relationship network of the subjects in community governance, calculating the degree centrality, closeness centrality and betweenness centrality of each node in the network, and analyzing the roles of the multiple subjects in the community governance network; the degree centrality is divided into in-degree and out-degree, the in-degree representing the attention a node subject receives in the community governance network and the out-degree representing the attention the node subject pays to other subjects in the network; the closeness centrality represents how close the node is to the other nodes in the network and is divided into in-closeness centrality and out-closeness centrality; the betweenness centrality measures the role of cross-group nodes in connecting different group networks;
the degree centrality, closeness centrality and betweenness centrality of each node in the network are calculated as follows:
if the network has N nodes, the degree centrality $C_D(i)$ of node i is calculated as:
$C_D(i) = \sum_{j=1,\, j \neq i}^{N} x_{ij}$
wherein $\sum_{j=1,\, j \neq i}^{N} x_{ij}$ counts the ties between node i and the other N-1 nodes, and the degree centrality formula is normalized as:
$C_D(i)' = C_D(i)/(N-1)$
the residence of each subject in the social network is extracted as the node position of that subject, and the closeness centrality formula of node i relative to the other nodes j is as follows:
the betweenness centrality of node β is calculated as:
$c_b(\beta) = \sum_{i \neq \beta \neq j} \sigma_{ij}(\beta) / \sigma_{ij}$
wherein $\sigma_{ij}(\beta)$ is the number of shortest paths from node i to node j that pass through node β, $\sigma_{ij}$ is the number of shortest paths from node i to node j, and $c_b(\beta)$ measures the importance of node β in connecting node i and node j;
wherein acquiring the space-time track data and personal information data of community residents, binding them with the basic geographic block data, and marking the activities of the community residents as different types specifically comprises:
removing duplicate and invalid information from the POI and AOI data crawled from a web map, extracting the names, vector boundaries, longitudes and latitudes of communities and of the various facility categories, together with the vector boundaries of roads, water systems and green spaces, and integrating them as the basic geographic block data;
processing the raw mobile phone signaling data, and restoring the actual residence and trip information of the user based on a noise reduction and calibration technique;
selecting a Spark cluster data processing platform, storing the cleaned mobile phone signaling data and basic map block data, matching the topological relation between a base station cell and a geographic block AOI, projecting user track data to a geographic entity, and correlating according to acquired resident personal information to obtain resident personal economic attribute data and travel track data;
the space-time track data and the personal information data are bound with the basic geographical block data, and all activities of residents are identified as nine types of residence, work, school, medical visit, shopping, leisure exercise, sightseeing tour, going out to transact business and others.
2. The community data analysis method according to claim 1, wherein performing sequence analysis on the resident activities of each community to obtain the basic behavior patterns of residents of different communities specifically comprises:
comparing the sequences of the residents of each community;
classifying according to sequence similarity, wherein each sequence class represents a behavior pattern;
and obtaining the basic behavior patterns of residents of different communities according to the comparison results of the resident sequences of each community.
3. The community data analysis method according to claim 2, wherein comparing the sequences of the residents of each community specifically comprises:
constructing a scoring function: if the corresponding characters of the two sequences match, a score of 1 is assigned, and if they mismatch, a score of 0 is assigned; if a gap occurs in either sequence, a gap penalty d is assigned;
pairwise sequence alignment: when two sequences are aligned, global alignment uses the Needleman-Wunsch algorithm and local alignment uses the Smith-Waterman algorithm; both algorithms are run for each alignment, and the one giving the better matching result is selected to complete the sequence alignment.
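The scoring function and the two alignment algorithms named in this claim can be sketched as follows. This is a minimal illustration, assuming that d is a positive penalty subtracted once per gap position and that the global score is read from the last cell of the matrix (the usual Needleman-Wunsch convention); the function names are illustrative.

```python
def score_pair(a, b):
    """Scoring function from the text: 1 for a match, 0 for a mismatch."""
    return 1 if a == b else 0


def needleman_wunsch(x, y, d=1):
    """Global alignment score; d is subtracted for every gap position."""
    rows, cols = len(x) + 1, len(y) + 1
    h = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        h[i][0] = -d * i
    for j in range(1, cols):
        h[0][j] = -d * j
    for i in range(1, rows):
        for j in range(1, cols):
            h[i][j] = max(h[i - 1][j - 1] + score_pair(x[i - 1], y[j - 1]),
                          h[i - 1][j] - d,
                          h[i][j - 1] - d)
    return h[-1][-1]


def smith_waterman(x, y, d=1):
    """Local alignment score: cells are floored at 0 and the best cell wins."""
    rows, cols = len(x) + 1, len(y) + 1
    h = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            h[i][j] = max(0,
                          h[i - 1][j - 1] + score_pair(x[i - 1], y[j - 1]),
                          h[i - 1][j] - d,
                          h[i][j - 1] - d)
            best = max(best, h[i][j])
    return best


# Two hypothetical activity strings (H = residence, W = work).
print(needleman_wunsch("HWH", "HH"))  # → 1
print(smith_waterman("OOHWWHOO", "HWWH"))  # → 4
```

Global alignment penalizes the unmatched character in the first pair, while local alignment ignores the flanking characters in the second pair and scores only the shared core.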
4. The community data analysis method according to claim 2, wherein classifying according to sequence similarity, each sequence class representing a behavior pattern, specifically comprises:
in global alignment, $H_{ij}$ is calculated as follows:
in local alignment, $H_{ij}$ is calculated as follows:
the similarity score of the two sequences is denoted S and is calculated as:
$S = \max(H_{ij})$
sequences with the highest pairwise alignment scores are classified into the same class according to the value of S, each sequence class representing a resident behavior pattern; wherein $H_{ij}$ represents the similarity of the matrix element in row i and column j, $W_{ij}$ represents the similarity weight for row i and column j, and d represents the gap penalty.
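For reference, the Needleman-Wunsch and Smith-Waterman recurrences in their usual textbook form, written with the symbols defined in this claim, are given below. The sign convention is an assumption: d is taken to be a positive penalty that is subtracted for each gap.

```latex
% Global alignment (Needleman-Wunsch), textbook form
H_{ij} = \max
\begin{cases}
  H_{i-1,j-1} + W_{ij} \\
  H_{i-1,j} - d \\
  H_{i,j-1} - d
\end{cases}

% Local alignment (Smith-Waterman), textbook form
H_{ij} = \max
\begin{cases}
  0 \\
  H_{i-1,j-1} + W_{ij} \\
  H_{i-1,j} - d \\
  H_{i,j-1} - d
\end{cases}
```

The only difference between the two is the extra 0 branch, which lets a local alignment restart wherever the running score would turn negative.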
5. The community data analysis method according to claim 2, wherein obtaining the basic behavior patterns of residents of different communities according to the comparison results of the resident sequences of each community specifically comprises:
performing multiple sequence alignment analysis on the sequences, constructing an evolutionary tree from the alignment result by the maximum likelihood method, extracting the typical sequence of each branch of the evolutionary tree, and analyzing them to obtain the basic behavior patterns of residents of different communities.
6. The community data analysis method according to claim 1, wherein mining the socioeconomic attributes of residents associated with each behavior pattern according to the personal information data of the residents specifically comprises:
obtaining the socioeconomic attributes of residents from the resident personal information data, coding the socioeconomic attributes, and mining frequent items of the socioeconomic attributes of residents in each sequence class to obtain the high-frequency socioeconomic attributes of residents associated with each behavior pattern.
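Frequent-item mining within one sequence class can be illustrated with plain support counting. The sketch below enumerates attribute combinations by brute force rather than using Apriori or FP-growth; the attribute codes, the support threshold and the function name `frequent_itemsets` are assumptions for illustration.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(records, min_support=0.5, max_size=2):
    """Return attribute combinations whose share of records is >= min_support.

    `records` is a list of sets of coded attributes for one sequence class.
    """
    counts = Counter()
    for record in records:
        for size in range(1, max_size + 1):
            for itemset in combinations(sorted(record), size):
                counts[itemset] += 1
    total = len(records)
    return {items: n / total for items, n in counts.items()
            if n / total >= min_support}


# Hypothetical coded attributes of residents in one behavior-pattern class.
commuter_class = [
    {"age:25-34", "income:mid", "job:office"},
    {"age:25-34", "income:mid", "job:service"},
    {"age:35-44", "income:mid", "job:office"},
    {"age:25-34", "income:high", "job:office"},
]
for items, support in sorted(frequent_itemsets(commuter_class, 0.75).items()):
    print(items, support)
```

With a threshold of 0.75 only the three single attributes shared by three of the four residents survive; lowering the threshold to 0.5 also returns the attribute pairs.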
7. A community data analysis device applied to a cloud server, the device comprising:
the acquisition unit is used for acquiring space-time track data and personal information data of community residents, binding the space-time track data and the personal information data with basic geographic block data and marking all activities of the community residents as different types;
the sequence unit is used for recording the daily activities of all residents in the community through a sequence consisting of different activity types;
the first analysis unit is used for carrying out sequence analysis on resident activities of all communities to obtain basic behavior patterns of residents of different communities;
the association unit is used for mining social and economic attributes of residents associated with various behavior modes according to personal information data of the residents;
the computing unit is used for inferring the types of service facilities commonly used by residents of each community from the resident behavior patterns of that community, and calculating the accessibility of the living circle facilities of the community using a cumulative opportunity method; the calculation formula of the cumulative opportunity method is as follows:
$A_j = P_j f(d)$
wherein $A_j$ represents the cumulative opportunity accessibility of facility j in the community, $P_j$ is the opportunity of facility j, and $f(d)$ is a binary variable: when the distance cost d from the community to facility j is smaller than a set threshold, $f(d)$ takes the value 1, and otherwise $f(d)$ takes the value 0;
the second analysis unit is used for acquiring the attention information and comment information of multiple subjects in the community communication forum of the interaction platform, and analyzing the role of each subject in the community governance network by constructing a relationship network of the multiple subjects in community governance; wherein analyzing the role of each subject in the community governance network by constructing the relationship network specifically comprises:
constructing the relationship network of the subjects in community governance, calculating the degree centrality, closeness centrality and betweenness centrality of each node in the network, and analyzing the roles of the multiple subjects in the community governance network; the degree centrality is divided into in-degree and out-degree, the in-degree representing the attention a node subject receives in the community governance network and the out-degree representing the attention the node subject pays to other subjects in the network; the closeness centrality represents how close the node is to the other nodes in the network and is divided into in-closeness centrality and out-closeness centrality; the betweenness centrality measures the role of cross-group nodes in connecting different group networks;
the degree centrality, closeness centrality and betweenness centrality of each node in the network are calculated as follows:
if the network has N nodes, the degree centrality $C_D(i)$ of node i is calculated as:
$C_D(i) = \sum_{j=1,\, j \neq i}^{N} x_{ij}$
wherein $\sum_{j=1,\, j \neq i}^{N} x_{ij}$ counts the ties between node i and the other N-1 nodes, and the degree centrality formula is normalized as:
$C_D(i)' = C_D(i)/(N-1)$
the residence of each subject in the social network is extracted as the node position of that subject, and the closeness centrality formula of node i relative to the other nodes j is as follows:
the betweenness centrality of node β is calculated as:
$c_b(\beta) = \sum_{i \neq \beta \neq j} \sigma_{ij}(\beta) / \sigma_{ij}$
wherein $\sigma_{ij}(\beta)$ is the number of shortest paths from node i to node j that pass through node β, $\sigma_{ij}$ is the number of shortest paths from node i to node j, and $c_b(\beta)$ measures the importance of node β in connecting node i and node j;
wherein acquiring the space-time track data and personal information data of community residents, binding them with the basic geographic block data, and marking the activities of the community residents as different types specifically comprises:
removing duplicate and invalid information from the POI and AOI data crawled from a web map, extracting the names, vector boundaries, longitudes and latitudes of communities and of the various facility categories, together with the vector boundaries of roads, water systems and green spaces, and integrating them as the basic geographic block data;
processing the raw mobile phone signaling data, and restoring the actual residence and trip information of the user based on a noise reduction and calibration technique;
selecting a Spark cluster data processing platform, storing the cleaned mobile phone signaling data and basic map block data, matching the topological relation between a base station cell and a geographic block AOI, projecting user track data to a geographic entity, and correlating according to acquired resident personal information to obtain resident personal economic attribute data and travel track data;
the space-time track data and the personal information data are bound with the basic geographical block data, and all activities of residents are identified as nine types of residence, work, school, medical visit, shopping, leisure exercise, sightseeing tour, going out to transact business and others.
8. A community intelligent interaction platform, characterized by comprising a user end, a staff end and a cloud server, wherein the user end is connected with the staff end, and the cloud server is connected with the user end and the staff end respectively;
the user end is used for inputting information and sending the input information to the staff end;
the staff end is used for auditing and registering information and filing information sent by the user end, returning the audited information to the user end and sending the audited information to the cloud server;
the cloud server is configured to perform the community data analysis method of any one of claims 1 to 6.
CN202110303133.6A 2021-03-22 2021-03-22 Community data analysis method and device, community intelligent interaction platform and storage medium Active CN113010578B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110303133.6A CN113010578B (en) 2021-03-22 2021-03-22 Community data analysis method and device, community intelligent interaction platform and storage medium


Publications (2)

Publication Number Publication Date
CN113010578A CN113010578A (en) 2021-06-22
CN113010578B true CN113010578B (en) 2024-03-15

Family

ID=76404401

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110303133.6A Active CN113010578B (en) 2021-03-22 2021-03-22 Community data analysis method and device, community intelligent interaction platform and storage medium

Country Status (1)

Country Link
CN (1) CN113010578B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113792245A (en) * 2021-09-18 2021-12-14 合肥学院 Smart community dynamic information acquisition method
CN114997678B (en) * 2022-06-14 2024-09-20 上海同济城市规划设计研究院有限公司 Public service facility layout evaluation method based on travel life circle and facility type
CN115687280B (en) * 2022-10-19 2024-03-19 无锡辅仁信息科技有限公司 Cloud platform-based community information sharing system and method
CN116703189B (en) * 2022-11-01 2024-07-12 清华大学 Regional information processing method and device based on object movement unbalance analysis
CN116307653B (en) * 2023-05-25 2023-08-18 北京数立通科技有限责任公司 Community economy sharing system and method for information identification area planning integration

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018237098A1 (en) * 2017-06-20 2018-12-27 Graphika, Inc. Methods and systems for identifying markers of coordinated activity in social media movements
CN109388663A (en) * 2018-08-24 2019-02-26 中国电子科技集团公司电子科学研究院 A kind of big data intellectualized analysis platform of security fields towards the society
CN110619277A (en) * 2019-08-15 2019-12-27 青岛文达通科技股份有限公司 Multi-community intelligent deployment and control method and system
CN110968617A (en) * 2019-10-16 2020-04-07 北京交通大学 Road network key road section correlation analysis method based on position field
CN112215735A (en) * 2020-09-30 2021-01-12 全民认证科技(杭州)有限公司 Floating population intelligent analysis system based on cloud computing and analysis method thereof
CN112309505A (en) * 2020-11-05 2021-02-02 湖南大学 Anti-neocoronal inflammation drug discovery method based on network characterization
CN112380425A (en) * 2020-10-23 2021-02-19 华南理工大学 Community recommendation method, system, computer equipment and storage medium
CN112437091A (en) * 2020-11-30 2021-03-02 成都信息工程大学 Abnormal flow detection method oriented to host community behaviors

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI570646B (en) * 2015-11-23 2017-02-11 財團法人資訊工業策進會 Location based community integration matchmaking system, method and computer readable recording media for optimizing sales


Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
"厚数据+大数据" 激活老旧社区公共生活――以北京鸭子桥社区为例;张希煜;茅明睿;邢晓旭;高硕;张鹏英;姜冬睿;;北京规划建设;20180915(05);全文 *
中国城市社区管理与服务的智慧化路径;柴彦威等;《地理科学进展》;20150415(第04期);全文 *
城市居民时空行为序列模式挖掘方法;李雄等;《地理与地理信息科学》;20090315(第02期);正文10-14页 *
曹杨新村社区更新的社会绩效评估――基于社会网络分析方法;杨辰;辛蕾;;城乡规划;20200215(01);全文 *
魏冬青,戴昊,贾贵华,张永红,徐沁."计算机辅助药物设计 全局和局部序列对比".《计算机辅助药物设计》.2017, *

Also Published As

Publication number Publication date
CN113010578A (en) 2021-06-22


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant