CN112241422A - Intelligent water visitor mining analysis method based on entry and exit records - Google Patents

Publication number
CN112241422A
Authority
CN
China
Prior art keywords
water
entry
visitor
exit
behavior
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010985527.XA
Other languages
Chinese (zh)
Inventor
苏学武
水军
杨刚
冯少龙
曾志梁
周昭丽
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhuhai Xindehui Information Technology Co ltd
Original Assignee
Zhuhai Xindehui Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhuhai Xindehui Information Technology Co ltd
Priority to CN202010985527.XA
Publication of CN112241422A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465 Query processing support for facilitating data mining operations in structured databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/02 Knowledge representation; Symbolic representation
    • G06N5/022 Knowledge engineering; Knowledge acquisition
    • G06N5/025 Extracting rules from data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/26 Government or public services

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Tourism & Hospitality (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Educational Administration (AREA)
  • Human Resources & Organizations (AREA)
  • Probability & Statistics with Applications (AREA)
  • Fuzzy Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Economics (AREA)
  • General Health & Medical Sciences (AREA)
  • Development Economics (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • General Business, Economics & Management (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses an intelligent water visitor mining analysis method based on entry and exit records, comprising the following steps: S1, cleaning the existing massive entry and exit data and marking cases in which one person holds multiple certificates; S2, analyzing the entry and exit records of captured water visitors to obtain water visitor behavior rules; S3, cleaning and analyzing the historical data marked in step S1 according to the behavior rules from step S2, and mining the records that conform to those rules; and S4, automatically pushing potential water visitor behavior objects at regular intervals. By using big data technology and data mining/machine learning algorithms, the invention obtains an intelligent water visitor mining analysis model that automatically pushes potential water visitor behavior objects, improves the efficiency of research and judgment, and effectively assists the passenger inspection department in cracking down on water visitor behavior.

Description

Intelligent water visitor mining analysis method based on entry and exit records
Technical Field
The invention relates to the technical field of public safety management, and in particular to an intelligent water visitor mining analysis method based on entry and exit records, applied to on-site supervision of customs passenger inspection at land ports.
Background
Customs is one of the most important bodies in China's system for managing inbound and outbound activities, and is mainly responsible for port supervision of inbound and outbound personnel, articles, goods and trade flows. At land inspection ports (such as Zhuhai and Macau), according to incomplete statistics, the professional water visitors (parallel traders who repeatedly carry goods across the border) travelling between Zhuhai and Macau year-round number in the tens of thousands, which disturbs market order and causes loss of customs duty. Because water visitors pass through mixed in with ordinary tourists, workers, local residents and others, they cannot be effectively identified during passenger inspection.
To address this problem, the current approach is to raise an early warning at clearance verification when the same certificate is found to have crossed several times in one day and arrange a luggage check, and otherwise to rely on spot checks based on the border inspectors' years of experience.
Although these methods catch some water visitors, they have the following major disadvantages: first, spot checks are subjective and prone to misjudgment; second, they do not consider one person using several certificates for clearance, for example using different certificates such as a passport and a Hong Kong and Macau travel permit for multiple clearances in one day, so a person clearing customs multiple times on the same day cannot be identified effectively and in time, and cases are easily missed.
Disclosure of Invention
The invention provides an intelligent water visitor mining analysis method based on entry and exit records. It aims to solve the problems that existing entry and exit clearance verification is prone to misjudgment and cannot identify, effectively and in time, a person who clears customs multiple times on the same day, so that cases are easily missed. With the method, such same-day repeated clearances can be identified effectively and in time, omissions are prevented, and potential professional water visitor behavior objects can be intelligently inferred from massive passenger clearance records.
In order to solve the technical problems, the technical scheme adopted by the invention is as follows.
The intelligent water visitor mining analysis method based on entry and exit records comprises the following steps:
S1, cleaning the existing massive entry and exit data, and marking cases in which one person holds multiple certificates;
S2, analyzing the entry and exit records of captured water visitors to obtain water visitor behavior rules;
S3, cleaning and analyzing the historical data marked in step S1 according to the water visitor behavior rules from step S2, and mining the records that conform to those rules;
S4, automatically pushing potential water visitor behavior objects at regular intervals.
As a further refinement of the technical solution, step S1 includes the following steps:
S11, extracting the certificate number information, dimension record information and transit photo information from all entry and exit records, and checking whether there are cases in which the certificate number information differs while the dimension record information is the same;
S12, for the cases in which the certificate number information differs while the dimension record information is the same, comparing the transit photo information;
S13, if the transit photo information compared in step S12 is the same, determining that the corresponding certificate numbers belong to the same object, and marking them.
As a further refinement of the technical solution, the dimension record information comprises name, date of birth, gender and place of residence.
As a further refinement of the technical solution, the dimension record information comprises name, date of birth, gender and place of certificate issue.
As a further refinement of the technical solution, step S2 includes the following steps:
S21, tracing back the historical entry and exit records of captured water visitors;
S22, mining frequent item sets and association rules among the elements in the water visitors' entry and exit records using the Apriori algorithm;
S23, generating a list of item sets for each element in the entry and exit records;
S24, scanning the records to calculate the support of each item set, and removing the item sets that do not meet the minimum support;
S25, combining the remaining item sets;
S26, rescanning the entry and exit records, and removing the item sets that do not meet the minimum support;
S27, repeating step S25 and step S26 until no item sets remain;
S28, obtaining the key elements and the associations among them, and determining the water visitor behavior rules.
In step S21, the time interval of each entry and exit, the number of round trips per day, and the proportion of round-trip days in each month are calculated.
In step S22, the elements include: gender, nationality, entry and exit category, entry and exit border station, entry and exit time, traffic mode of each entry and exit, the proportion of each entry and exit time interval in each month, the proportion of the number of round trips per day in each month, and the proportion of round-trip days in each month.
In steps S3 and S4, the back end automatically runs a full comparison of each month's entry and exit records against the water visitor behavior rules and automatically pushes information on potential water visitor behavior objects, to assist passenger inspection officers in cracking down on water visitors.
Due to the adoption of the technical scheme, the technical progress of the invention is as follows.
By using big data technology and data mining/machine learning algorithms, the invention obtains an intelligent water visitor mining analysis model that automatically pushes potential water visitor behavior objects, improves the efficiency of research and judgment, and effectively assists the passenger inspection department in cracking down on water visitor behavior.
Through data cleaning and mining of clearance patterns in the historical entry and exit records, the invention automatically pushes potential water visitors who match the rules, resolves the association between one person and multiple certificates, and minimizes omissions when analyzing water visitor behavior from entry and exit records.
Drawings
FIG. 1 is a flow chart of the architecture of the present invention.
Detailed Description
The invention will be described in further detail below with reference to the figures and specific examples.
An intelligent water visitor mining analysis method based on entry and exit records is shown in FIG. 1 and comprises the following steps:
S1, data cleaning (one person, multiple certificates): the existing massive entry and exit data is cleaned and cases in which one person holds multiple certificates are marked, which improves the accuracy of the inference.
Step S1 includes the following steps:
S11, extracting the certificate number information, dimension record information and transit photo information from all entry and exit records, and checking whether there are cases in which the certificate number information differs while the dimension record information is the same. The dimension record information includes name, date of birth, gender, place of residence, and the like. Alternatively, the dimension record information includes name, date of birth, gender, place of certificate issue, and the like.
S12, the cases in which the certificate number information differs while the dimension record information is the same fall into the following two situations:
A. the certificate numbers differ, and the dimension records of name, date of birth, gender, place of residence and the like are the same;
B. the certificate numbers differ, and the dimension records of name, date of birth, gender, place of certificate issue and the like are the same.
On this basis, the transit photo information is compared.
S13, if the transit photo information compared in step S12 is the same, the corresponding certificate numbers are determined to belong to the same object and are marked. During data mining, the records generated under the different certificate numbers are all analyzed as belonging to the same object.
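As a rough illustration of steps S11 to S13, the following Python sketch groups certificate numbers that share the same dimension fields and merges those whose photos match. The field names (`cert_no`, `name`, `birth_date`, `gender`, `place`, `photo`) and the `same_photo` comparison callable are assumptions made for this example; the text does not specify a record layout or how photos are compared.

```python
from collections import defaultdict
from itertools import combinations

def mark_same_person(records, same_photo):
    """Map each certificate number to a representative certificate number.

    records: list of dicts with hypothetical keys cert_no, name,
             birth_date, gender, place, photo.
    same_photo: hypothetical callable(photo_a, photo_b) -> bool.
    """
    # S11: collect certificate numbers that share the same dimension fields.
    groups = defaultdict(dict)
    for r in records:
        key = (r["name"], r["birth_date"], r["gender"], r["place"])
        groups[key].setdefault(r["cert_no"], r["photo"])

    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)

    for certs in groups.values():
        for cert_no in certs:
            find(cert_no)  # register every certificate, even unmatched ones
        # S12: compare photos only within a group of matching dimensions.
        for (c1, p1), (c2, p2) in combinations(certs.items(), 2):
            if same_photo(p1, p2):
                union(c1, c2)  # S13: same object, mark as linked

    return {cert_no: find(cert_no) for cert_no in parent}
```

Certificates that map to the same representative can then be analyzed as one object in the later mining steps.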
S2, generating the water visitor behavior rules: new elements are calculated from the entry and exit records of the water visitors captured so far, and the daily/monthly behavior rules of water visitors are obtained by analysis.
Step S2 includes the following steps:
S21, tracing back the historical entry and exit records of captured water visitors. In step S21, the time interval between each entry and exit, the number of round trips per day, and the proportion of round-trip days in each month are calculated.
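The derived elements of step S21 could be computed along the following lines. This is a minimal sketch that assumes each crossing is a `(timestamp, direction)` pair for a single person and that a round trip is an exit followed by an entry on the same calendar day; both the input shape and that pairing rule are assumptions for illustration.

```python
import calendar
from collections import defaultdict
from datetime import datetime

def round_trip_features(crossings):
    """crossings: list of (datetime, direction), direction in {'exit', 'entry'}.

    Returns (intervals_per_day, trips_per_day, share_of_days_per_month).
    """
    intervals = defaultdict(list)  # date -> exit-to-entry intervals in minutes
    last_exit = None
    for ts, direction in sorted(crossings):
        if direction == "exit":
            last_exit = ts
        elif direction == "entry" and last_exit is not None:
            if last_exit.date() == ts.date():
                minutes = (ts - last_exit).total_seconds() / 60
                intervals[ts.date()].append(minutes)
            last_exit = None

    # Number of round trips on each day that has at least one.
    trips_per_day = {day: len(v) for day, v in intervals.items()}

    # Share of days in each month that contain at least one round trip.
    days_by_month = defaultdict(int)
    for day in intervals:
        days_by_month[(day.year, day.month)] += 1
    share = {m: n / calendar.monthrange(m[0], m[1])[1]
             for m, n in days_by_month.items()}
    return dict(intervals), trips_per_day, share
```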
S22, mining frequent item sets and association rules among the elements in the water visitors' entry and exit records using the Apriori algorithm.
In step S22, the elements include: gender, nationality, entry and exit category, entry and exit border station (port), entry and exit time, traffic mode of each entry and exit, the proportion of each entry and exit time interval in each month, the proportion of the number of round trips per day in each month, and the proportion of round-trip days in each month.
S23, generating a list of item sets for each element in the entry and exit records.
Taking the 2019 entry and exit records of water visitors captured at the Zhuhai port border station as an example, the items are as follows:
[Table image BDA0002689112750000051 is not reproduced in this text.]
S24, scanning the records to calculate the support of each item set, and removing the item sets that do not meet the minimum support. The calculated support of each item set is as follows:
Item set                                                              Support
Nationality (CHI)                                                     0.5
Nationality (HK)                                                      0.3
Nationality (TAI)                                                     0.2
Traffic mode (walking)                                                0.7
Traffic mode (car)                                                    0.3
Same-day one-exit-one-entry time interval (0-20 min)                  0.2
Same-day one-exit-one-entry time interval (20-40 min)                 0.6
Same-day one-exit-one-entry time interval (40-60 min)                 0.2
Same-day records satisfying the entry/exit time interval (>= 3)       1.0
Number of qualifying days per month (>= 9)                            0.7
Number of qualifying days per month (< 9)                             0.3
With the minimum support set to 0.3, only items whose support is at least 0.3 belong to the frequent item sets, and the result is as follows:
Item set                                                              Support
Nationality (CHI)                                                     0.5
Nationality (HK)                                                      0.3
Traffic mode (walking)                                                0.7
Traffic mode (car)                                                    0.3
Same-day one-exit-one-entry time interval (20-40 min)                 0.6
Same-day records satisfying the entry/exit time interval (>= 3)       1.0
Number of qualifying days per month (>= 9)                            0.7
Number of qualifying days per month (< 9)                             0.3
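The pruning in step S24 can be checked against the two tables above with a few lines of Python. The support values are copied from the first table; the shortened item labels and the function name are introduced here only for illustration.

```python
# Supports copied from the single-item table above (labels shortened).
supports = {
    "Nationality (CHI)": 0.5,
    "Nationality (HK)": 0.3,
    "Nationality (TAI)": 0.2,
    "Traffic mode (walking)": 0.7,
    "Traffic mode (car)": 0.3,
    "Interval 0-20 min": 0.2,
    "Interval 20-40 min": 0.6,
    "Interval 40-60 min": 0.2,
    "Same-day qualifying records >= 3": 1.0,
    "Qualifying days per month >= 9": 0.7,
    "Qualifying days per month < 9": 0.3,
}

def frequent_items(supports, min_support):
    """Keep the items whose support reaches the minimum support."""
    return {item: s for item, s in supports.items() if s >= min_support}

frequent = frequent_items(supports, 0.3)
for item, s in frequent.items():
    print(f"{item}: {s}")
```

The comparison is inclusive (`>=`) because the second table keeps items whose support is exactly 0.3; a strict comparison would drop three of them.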
S25, combining the remaining item sets to generate item sets containing two or more elements.
Combining the items in pairs gives the following item sets:
[Table images BDA0002689112750000061 and BDA0002689112750000071 are not reproduced in this text.]
S26, rescanning the entry and exit records, and removing the item sets that do not meet the minimum support.
The calculated support of each item set is as follows:
[Table images BDA0002689112750000072 and BDA0002689112750000081 are not reproduced in this text.]
With the minimum support set to 0.5, only item sets whose support is more than 0.5 belong to the frequent item sets, and the final result is as follows:
[Table image BDA0002689112750000082 is not reproduced in this text.]
S27, repeating step S25 and step S26 until no item sets remain, that is, until no larger item set meets the minimum support.
S28, obtaining the key elements and the associations among them, and determining the water visitor behavior rules.
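The loop in steps S23 to S27 follows the usual level-wise structure of Apriori. The sketch below is a generic textbook version under the assumption that each transaction is a set of item labels; it is not the patent's implementation, and the function name and the labels used with it are hypothetical.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {frozenset(items): support} for every frequent item set."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # Fraction of transactions that contain every item of the set.
        return sum(1 for t in transactions if itemset <= t) / n

    def keep_frequent(candidates):
        scored = {c: support(c) for c in candidates}
        return {c: s for c, s in scored.items() if s >= min_support}

    # S23/S24: single-item sets, pruned by minimum support.
    current = keep_frequent({frozenset([i]) for t in transactions for i in t})
    frequent = dict(current)
    k = 2
    while current:  # S27: stop when no item set survives
        # S25: join surviving sets into candidates with k items.
        candidates = {a | b for a, b in combinations(current, 2)
                      if len(a | b) == k}
        # Apriori property: every (k-1)-subset must itself be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in current
                             for s in combinations(c, k - 1))}
        # S26: rescan the records and prune again.
        current = keep_frequent(candidates)
        frequent.update(current)
        k += 1
    return frequent
```

The largest surviving item sets are the candidates for the key elements and associations referred to in step S28.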
The key elements are the entry and exit border station (port), the entry and exit time, the traffic mode of each entry and exit, the entry and exit time interval, the number of round trips per day, and the monthly proportions, where the monthly proportions are the proportion of each entry and exit time interval in each month, the proportion of the number of round trips per day in each month, and the proportion of round-trip days in each month.
The following relationship is finally obtained: a. traffic mode (walking); b. same-day one-exit-one-entry time interval (20-40 min); c. number of same-day records satisfying the entry/exit time interval (>= 3); d. number of qualifying days per month (>= 9).
All data is then scanned against these rules. Any object whose transit records satisfy all 4 rules simultaneously is pushed to the front end as a suspicious water visitor object, to assist passenger inspection officers in cracking down on water visitors.
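One possible reading of the four rules, applied to a single person's records for one month, is sketched below. The input shape, the function name, and the interpretation that rule c counts walking round trips within the 20-40 minute window on the same day are assumptions; the thresholds are the ones stated above.

```python
def is_suspect(month_trips, min_trips_per_day=3, min_days=9, interval=(20, 40)):
    """month_trips: {day: [(mode, interval_minutes), ...]} for one person/month.

    Returns True when all four rules hold:
      a. traffic mode is walking
      b. same-day exit-to-entry interval within `interval` minutes
      c. at least `min_trips_per_day` such records on a day
      d. at least `min_days` such days in the month
    """
    low, high = interval
    qualifying_days = 0
    for trips in month_trips.values():
        hits = sum(1 for mode, minutes in trips
                   if mode == "walking" and low <= minutes <= high)
        if hits >= min_trips_per_day:
            qualifying_days += 1
    return qualifying_days >= min_days
```

Keeping the thresholds as parameters lets the rules be re-derived when the mining step is rerun on newer captured records.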
S3, according to the water visitor behavior rules from step S2, the historical data marked in step S1 is cleaned and analyzed and the records conforming to the rules are mined; in particular, a full computation is run over people whose entry and exit records in the last month conform to the water visitor rules, and intelligent mining and inference is performed.
S4, automatically pushing potential water visitor behavior objects at regular intervals.
All suspected water visitor information obtained through algorithmic mining and analysis is written as updates into a designated data table. The business analysis system periodically fetches the latest records from that table, and the relevant objects are re-checked/compared to finally confirm their identity and background.
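The update-and-poll pattern described here could look like the following sketch, which uses SQLite from the Python standard library purely for illustration. The table name `suspect_push`, its columns, and both function names are hypothetical; the text does not name a database or schema.

```python
import sqlite3

def push_suspects(conn, suspects, run_date):
    """Insert or update flagged objects in the designated table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS suspect_push ("
        "person_id TEXT PRIMARY KEY, last_flagged TEXT NOT NULL)"
    )
    # INSERT OR REPLACE overwrites the row when the person is already listed.
    conn.executemany(
        "INSERT OR REPLACE INTO suspect_push (person_id, last_flagged) "
        "VALUES (?, ?)",
        [(person_id, run_date) for person_id in suspects],
    )
    conn.commit()

def latest_suspects(conn, since):
    """Fetch objects flagged on or after `since` (ISO date string)."""
    rows = conn.execute(
        "SELECT person_id FROM suspect_push "
        "WHERE last_flagged >= ? ORDER BY person_id",
        (since,),
    )
    return [row[0] for row in rows]
```

A scheduled job would call `push_suspects` after each monthly run, and the business analysis system would call `latest_suspects` with the date of its previous poll.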
In steps S3 and S4, the back end automatically runs a full comparison of each month's entry and exit records against the water visitor behavior rules and automatically pushes information on potential water visitor behavior objects, to assist passenger inspection officers in cracking down on water visitors.
Through data cleaning, the invention resolves the association between one person and multiple certificates and minimizes omissions when analyzing water visitor behavior from entry and exit records.
By using big data technology and data mining/machine learning algorithms, the invention obtains an intelligent water visitor mining analysis model that automatically pushes potential water visitor behavior objects, improves the efficiency of research and judgment, and effectively assists the passenger inspection department in cracking down on water visitor behavior.

Claims (8)

1. An intelligent water visitor mining analysis method based on entry and exit records, characterized by comprising the following steps:
S1, cleaning the existing massive entry and exit data, and marking cases in which one person holds multiple certificates;
S2, analyzing the entry and exit records of captured water visitors to obtain water visitor behavior rules;
S3, cleaning and analyzing the historical data marked in step S1 according to the water visitor behavior rules from step S2, and mining the records that conform to those rules;
S4, automatically pushing potential water visitor behavior objects at regular intervals.
2. The intelligent water visitor mining analysis method based on entry and exit records according to claim 1, wherein step S1 comprises the following steps:
S11, extracting the certificate number information, dimension record information and transit photo information from all entry and exit records, and checking whether there are cases in which the certificate number information differs while the dimension record information is the same;
S12, for the cases in which the certificate number information differs while the dimension record information is the same, comparing the transit photo information;
S13, if the transit photo information compared in step S12 is the same, determining that the corresponding certificate numbers belong to the same object, and marking them.
3. The intelligent water visitor mining analysis method based on entry and exit records according to claim 2, wherein the dimension record information comprises name, date of birth, gender and place of residence.
4. The intelligent water visitor mining analysis method based on entry and exit records according to claim 2, wherein the dimension record information comprises name, date of birth, gender and place of certificate issue.
5. The intelligent water visitor mining analysis method based on entry and exit records according to claim 1, wherein step S2 comprises the following steps:
S21, tracing back the historical entry and exit records of captured water visitors;
S22, mining frequent item sets and association rules among the elements in the water visitors' entry and exit records using the Apriori algorithm;
S23, generating a list of item sets for each element in the entry and exit records;
S24, scanning the records to calculate the support of each item set, and removing the item sets that do not meet the minimum support;
S25, combining the remaining item sets;
S26, rescanning the entry and exit records, and removing the item sets that do not meet the minimum support;
S27, repeating step S25 and step S26 until no item sets remain;
S28, obtaining the key elements and the associations among them, and determining the water visitor behavior rules.
6. The intelligent water visitor mining analysis method based on entry and exit records according to claim 5, wherein in step S21, the time interval of each entry and exit, the number of round trips per day, and the proportion of round-trip days in each month are calculated respectively.
7. The intelligent water visitor mining analysis method based on entry and exit records according to claim 6, wherein in step S22, the elements comprise: gender, nationality, entry and exit category, entry and exit border station, entry and exit time, traffic mode of each entry and exit, the proportion of each entry and exit time interval in each month, the proportion of the number of round trips per day in each month, and the proportion of round-trip days in each month.
8. The intelligent water visitor mining analysis method based on entry and exit records according to claim 1, wherein in steps S3 and S4, the back end automatically runs a full comparison of each month's entry and exit records against the water visitor behavior rules and automatically pushes information on potential water visitor behavior objects, to assist passenger inspection officers in cracking down on water visitors.
CN202010985527.XA 2020-09-18 2020-09-18 Intelligent water visitor mining analysis method based on entry and exit records Pending CN112241422A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010985527.XA CN112241422A (en) 2020-09-18 2020-09-18 Intelligent water visitor mining analysis method based on entry and exit records

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010985527.XA CN112241422A (en) 2020-09-18 2020-09-18 Intelligent water visitor mining analysis method based on entry and exit records

Publications (1)

Publication Number Publication Date
CN112241422A true CN112241422A (en) 2021-01-19

Family

ID=74171556

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010985527.XA Pending CN112241422A (en) 2020-09-18 2020-09-18 Intelligent water visitor mining analysis method based on entry and exit records

Country Status (1)

Country Link
CN (1) CN112241422A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2002077776A2 (en) * 2001-03-27 2002-10-03 Vande Pol Mark E Free-market environmental management system having insured certification to a process standard
CN102075963A (en) * 2009-11-25 2011-05-25 中国移动通信集团贵州有限公司 A mobile business data acquisition analysis method and a system for the same
CN110716957A (en) * 2019-09-23 2020-01-21 珠海市新德汇信息技术有限公司 Intelligent mining and analyzing method for class case suspicious objects

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
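Hu Zhen (胡振): "Analysis of passenger travel behavior at Shenzhen-Hong Kong ports based on big data" (基于大数据的深港口岸客流出行行为分析), China Excellent Doctoral and Master's Theses Full-text Database (Master's) *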

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210119