CN117291758A - Hotel consumption full data aggregation platform based on data mining - Google Patents

Publication number
CN117291758A
Authority
CN
China
Prior art keywords
hotel
consumption
data
data mining
database
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311189235.5A
Other languages
Chinese (zh)
Inventor
张长征
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Zhenhui Information Technology Co ltd
Original Assignee
Shanghai Zhenhui Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Zhenhui Information Technology Co ltd
Priority to CN202311189235.5A
Publication of CN117291758A
Legal status: Pending

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention provides a hotel consumption analysis method based on data mining, comprising the following steps: acquiring offline hotel consumption data and online hotel consumption data updated in real time through a data mining technology; processing the online and offline consumption data separately and storing the resulting hotel consumption data to form a database; extracting fields containing hotel names from the database by using a data mining technology, and verifying and comparing them to obtain the data set corresponding to each hotel name, so as to construct a corresponding hotel aggregation platform. The hotel aggregation platform comprises at least a database for storing the data sets, a data input unit, a data retrieval and query unit, and a statistical analysis unit. The statistical analysis unit obtains the average consumption amount, average consumption frequency, and consumption trend graph of a hotel over a set time period through an index algorithm, and produces a statistical analysis report. By obtaining complete consumption data through data mining, the invention helps obtain optimal discounts and save travel costs.

Description

Hotel consumption full data aggregation platform based on data mining
Technical Field
The invention relates to the technical field of data mining, in particular to a hotel consumption full data aggregation platform based on data mining.
Background
In the prior art, the acquisition and organization of hotel consumption data generally require manual operation, which is inefficient and prone to data errors. Furthermore, the prior art does not provide detailed consumption reports and insights, and therefore cannot help enterprises understand and optimize travel costs. In particular, because the data of the online and offline sales modes are isolated from each other, hotel consumption data for both modes cannot be obtained well enough to identify reasonably priced hotels, so a new technical solution is needed to solve these problems.
Disclosure of Invention
The invention aims to use data mining, in the purchasing or tourism industry, to solve the problem that enterprises working under multiple TMC modes cannot acquire complete consumption data because of data isolation. Through the aggregation platform, enterprises can obtain complete consumption data, which helps them obtain optimal discounts and save travel costs.
The invention is realized by the following technical scheme.
The hotel consumption analysis method based on data mining comprises the following steps:
acquiring offline hotel consumption data and online hotel consumption data updated in real time through a data mining technology;
after data processing is carried out on online hotel consumption data and offline hotel consumption data respectively, the obtained hotel consumption data are stored to form a database;
extracting a field containing a hotel name from a database by utilizing a data mining technology, and verifying and comparing to obtain a data set corresponding to the hotel name so as to construct a corresponding hotel aggregation platform;
the hotel aggregation platform at least comprises a database for storing a data set, a data input unit, a data retrieval and query unit and a statistical analysis unit;
the statistical analysis unit is used for acquiring average consumption amount, average consumption frequency and consumption trend graph of the hotel through an index algorithm in a set time period, and obtaining a statistical analysis report.
As a further improvement of the invention, the data mining technology includes acquiring hotel consumption data through the interfaces of third-party partner institutions, by way of cooperation with those institutions.
As a further improvement of the invention, the data mining technology also includes obtaining hotel consumption data by cooperating with hotels or through surveys and interviews.
As a further improvement of the invention, the data processing is specifically: removing duplicates by comparing and deleting hotel consumption records containing the date, the amount, the hotel name and the place; checking for missing values and filling them in; unifying the hotel name and the place into the same field format; and converting the amount into a unified unit.
As a further improvement of the invention, the invention also comprises the construction of an extraction model of hotel names, which is specifically as follows:
S11) constructing a database table containing text data by using online hotel consumption data;
S12) obtaining features from the text data through a language processing technology;
S13) labeling samples containing hotel names and non-hotel names by using an algorithm to form labeled data;
S14) constructing an extraction model with naive Bayes, and then performing cross-validation to obtain the final extraction model.
As a further improvement of the present invention, the language processing technique in step S12) is specifically an N-gram model, which segments text data into a plurality of words, each word and the word before segmentation respectively constitute a feature.
As a further improvement of the invention, before constructing the corresponding hotel aggregation platform, the method further comprises acquiring the association relations between different hotel names through the segmented words, and placing hotels with cross-association into the same data set.
As a further improvement of the invention, the database also comprises consumption items; the hotel aggregation platform is embedded with an Apriori algorithm, which generates association rules from the frequent item sets in the data set and then obtains the association rules among the consumption items by using the support and the confidence.
As a further improvement of the invention, the generation of the consumption report specifically comprises the following steps:
S21) obtaining the average consumption amounts for each day, each month and each quarter by using an average value method, and generating corresponding change trend graphs;
S22) selecting hotel consumption data in a set time period, and drawing the consumption frequency per unit time in the time period together with its change trend graph;
S23) selecting an exponential smoothing method from among time series analysis methods to obtain and predict the consumption trend graph in a certain time period;
S24) displaying the consumption trend graphs, analyzing them with the standard deviation to obtain periodic change rules, integrating the analysis results of the consumption trend graphs with a decision tree, and applying correlation analysis to the consumption trend graphs to obtain their correlations, so as to form the statistical analysis report.
As a further improvement of the invention, the hotel aggregation platform also comprises a visualization component and an interactive dashboard for selecting to view consumption reports and average consumption trends.
The beneficial effects of the invention are as follows:
According to the invention, online and offline hotel consumption data are obtained by data mining, so that offline hotel consumption data are not omitted and the data are more complete; users or purchasers therefore have more choices, and basic data are provided for subsequent analysis.
In the invention, the hotel name data are mined, so that the associations among multiple possibly related hotels are uncovered. This provides a data basis for subsequent long-term cooperation and gives the enterprise an advantage in later negotiations, for example over the level of discount.
The generated hotel aggregation platform has comprehensive data and supports a variety of analyses, which gives users a better experience, lets them make the corresponding selections more intuitively, and saves more cost for enterprises and users.
Drawings
Fig. 1 is a flow chart of a hotel consumption analysis method based on data mining.
Fig. 2 is a flow chart of the construction of the extraction model of hotel names provided by the invention.
FIG. 3 is a flow chart of generating a consumption report provided by the present invention.
Detailed Description
The present invention will be described in detail below with reference to the embodiments shown in the drawings. It should be understood that these embodiments do not limit the present invention, and that functional, methodological, or structural equivalents and alternatives made by those skilled in the art according to these embodiments fall within the scope of protection of the present invention.
The hotel consumption analysis method based on data mining in the embodiment comprises the following steps:
Firstly, acquiring offline hotel consumption data and online hotel consumption data updated in real time through a data mining technology.
Because publicly available data covers only part of hotel consumption data, several data mining approaches are selected in the invention in order to mine the data in depth, specifically:
1) The data mining technology includes acquiring hotel consumption data through the interfaces of third-party partner institutions, by way of cooperation with those institutions.
In this embodiment, third-party partner institutions include, but are not limited to, hotel management systems, payment institutions, and other institutions that handle reservations for hotel consumers. "Third party" is a relative term, and the data of these institutions is not limited to web pages or apps. For example, crawler technology may be used to obtain hotel consumption data from hotel reservation platforms, hotel management systems, and credit card payment channels.
Hotel reservation platform: by cooperating with the main online hotel reservation platforms, such as Booking.com, Expedia, and Ctrip, the hotel reservation and consumption data of users on these platforms can be obtained. These platforms typically provide API interfaces that can be used to obtain the data.
Credit card payment channel: by cooperating with credit card payment channels, hotel consumption data paid by credit card can be obtained. Cooperation with banks or payment institutions makes it possible to obtain users' consumption records at hotels.
2) To obtain more complete data, the data mining technology also includes obtaining hotel consumption data by cooperating with hotels or through surveys and interviews.
Specifically, for example, offline hotel consumption data may be obtained from a hotel management system by contacting and negotiating with the hotel owner or operator. Hotel management systems typically record guests' check-in information, amounts spent, payment methods, and so on.
Secondly, processing the online hotel consumption data and the offline hotel consumption data separately, and then storing the resulting hotel consumption data to form a database.
In this embodiment, the data processing specifically includes: removing duplicates by comparing and deleting hotel consumption records containing the date, the amount, the hotel name and the place; checking for missing values and filling them in; unifying the hotel name and the place into the same field format; and converting the amount into a unified unit. The purpose of this step is to ensure the accuracy and consistency of the data. A data cleaning tool may be used to clean the data, remove duplicate records, and convert the data to a uniform format. In this embodiment, the common fields in hotel consumption data, such as consumption date, consumption amount, and hotel name, are determined before processing. These common fields are used to integrate the data.
In this embodiment, data obtained through different acquisition channels may differ in form, so the cleaned data are stored and managed in a uniform format and structure to facilitate subsequent processing and analysis. A relational database or a NoSQL database may be used to store the data, with appropriate data models and indexes designed to support data storage and query operations.
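As one illustration of the storage step, the following is a minimal sketch of a relational layout using Python's built-in `sqlite3` module. The table name, column names, and index are assumptions made for this example; the source does not specify a schema or a particular database product.

```python
import sqlite3

# Hypothetical schema: table, column, and index names are illustrative
# assumptions, not taken from the patent text.
SCHEMA = """
CREATE TABLE IF NOT EXISTS hotel_consumption (
    id INTEGER PRIMARY KEY,
    consumed_on TEXT NOT NULL,   -- ISO date, e.g. 2023-09-15
    amount REAL NOT NULL,        -- amount already converted to one unit
    hotel_name TEXT NOT NULL,
    place TEXT,
    channel TEXT CHECK (channel IN ('online', 'offline'))
);
CREATE INDEX IF NOT EXISTS idx_consumption_hotel_date
    ON hotel_consumption (hotel_name, consumed_on);
"""


def open_database(path=":memory:"):
    """Open (or create) the database and make sure the schema exists."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def insert_record(conn, consumed_on, amount, hotel_name, place, channel):
    """Store one cleaned consumption record."""
    conn.execute(
        "INSERT INTO hotel_consumption "
        "(consumed_on, amount, hotel_name, place, channel) "
        "VALUES (?, ?, ?, ?, ?)",
        (consumed_on, amount, hotel_name, place, channel),
    )


def records_for_hotel(conn, hotel_name):
    """Return (date, amount, channel) rows for one hotel, oldest first."""
    cur = conn.execute(
        "SELECT consumed_on, amount, channel FROM hotel_consumption "
        "WHERE hotel_name = ? ORDER BY consumed_on",
        (hotel_name,),
    )
    return cur.fetchall()
```

The composite index on hotel name and date is chosen here because the later analysis steps query by hotel over a time period; a different access pattern would call for a different index.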
For example, a batch of hotel consumption data is obtained from a hotel reservation website, including fields for consumption date, consumption amount, hotel name, etc. First, the data is crawled from the website using data mining techniques and saved to a database. Then, the data is cleaned, repeated data is removed, and the data is converted into a uniform format. Finally, the cleaned and converted data is stored in a database for subsequent processing and analysis.
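The cleaning steps described above (duplicate removal, missing-value filling, format unification, and unit conversion) might be sketched as follows. The record layout, the accepted date formats, the default place value, and the exchange rates are all assumptions for illustration only.

```python
from datetime import datetime

# Assumed conversion rates to the unified unit; illustrative values only.
RATES_TO_UNIFIED = {"CNY": 1.0, "USD": 7.0}
# Assumed input date formats seen across acquisition channels.
DATE_FORMATS = ("%Y-%m-%d", "%Y/%m/%d", "%d.%m.%Y")


def normalize_date(text):
    """Convert a date string in any accepted format to ISO format."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date: {text!r}")


def clean_records(raw_records, default_place="unknown"):
    """Unify formats and units, fill missing places, and drop duplicates."""
    seen = set()
    cleaned = []
    for rec in raw_records:
        rate = RATES_TO_UNIFIED[rec.get("currency", "CNY")]
        row = {
            "date": normalize_date(rec["date"]),
            "amount": round(rec["amount"] * rate, 2),
            # Collapse repeated whitespace and unify capitalization.
            "hotel_name": " ".join(rec["hotel_name"].split()).title(),
            "place": (rec.get("place") or default_place).strip().title(),
        }
        key = (row["date"], row["amount"], row["hotel_name"], row["place"])
        if key in seen:  # same date, amount, hotel and place: a duplicate
            continue
        seen.add(key)
        cleaned.append(row)
    return cleaned
```

Duplicates are detected only after normalization, so the same stay reported by two channels in different formats collapses into one record.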
Thirdly, extracting fields containing hotel names from the database by using a data mining technology, and verifying and comparing them to obtain the data set corresponding to each hotel name, so as to construct a corresponding hotel aggregation platform.
First, an extraction model for hotel names is constructed, which specifically comprises the following steps:
S11) constructing a database table containing text data by using online hotel consumption data;
In this embodiment, a data mining technique is applied to extract the enterprise name field from the collected data. The fields containing the business name may be extracted from the data using text mining techniques such as keyword extraction and named entity recognition. Note that the database table formed from the data set contains fields that may contain business names. These fields contain at least one of Chinese characters, English letters, or Arabic numerals, and may also contain special characters and punctuation marks.
S12) obtaining characteristics in text data through a language processing technology;
The language processing technique in step S12) is specifically an N-gram model, which segments text data into a plurality of words, and each word and the word before segmentation respectively form a feature.
In this way, primary and secondary hotels, group-affiliated hotels, and associated hotels can be identified.
In this embodiment, feature extraction may be performed with a bag-of-words model, TF-IDF, N-grams, and so on, using natural language processing (NLP) techniques and libraries (e.g., NLTK, spaCy). However, because hotels may be associated in various ways, for example a parent company and its subsidiaries sharing the same word in their names, an N-gram model is chosen, which splits the text into sequences of N consecutive words. The N-gram model captures contextual information between words; for example, a 2-gram model represents pairs of consecutive words. This avoids missing hotels that may in fact be associated.
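To make the N-gram features concrete, here is a minimal sketch that emits single words plus sequences of up to N consecutive words. It assumes whitespace-separated text; Chinese hotel names would first need a word segmenter, which is not shown. The function names are hypothetical.

```python
def tokenize(text):
    """Split text into lowercase words (assumes whitespace-separated text)."""
    return text.lower().split()


def ngram_features(text, n=2):
    """Return single words followed by all sequences of 2..n consecutive words."""
    tokens = tokenize(text)
    features = list(tokens)  # 1-grams: the individual words
    for size in range(2, n + 1):
        for start in range(len(tokens) - size + 1):
            features.append(" ".join(tokens[start:start + size]))
    return features
```

For a name such as "Grand Plaza Hotel Shanghai" with `n=2`, the features are the four words followed by "grand plaza", "plaza hotel", and "hotel shanghai", so two names that share a brand word also share at least one feature.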
Specifically, the words in this embodiment include keywords, so that regular business names can subsequently be identified and extracted through regular expressions or keyword matching. A set of rules may be designed, or machine learning algorithms may be used, to identify and extract business names.
To ensure the authenticity of the data, in this embodiment the extracted enterprise names may also be verified and compared to determine their authenticity and credibility. An external data source or other verification means may be used for this purpose.
Suppose that some fields containing business names, such as "ABC company" and "XYZ group", are to be extracted from hotel consumption data. First, these fields are extracted from the data using data mining techniques. Then, by means of regular expressions, keyword matching, or similar methods, regular business names such as "ABC company" are identified and extracted. Finally, the extracted enterprise names are verified and compared to ensure their authenticity and credibility.
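The regular-expression and keyword-matching route could look like the sketch below. The suffix keywords and the rule "one to three capitalized words followed by a keyword" are assumptions chosen to fit the "ABC company" and "XYZ group" examples; a production rule set would be designed from the actual data.

```python
import re

# Assumed keyword suffixes that mark the end of a business name.
NAME_SUFFIXES = ("company", "group", "hotel")

# One to three capitalized words followed by a suffix keyword;
# the suffix itself is matched case-insensitively.
NAME_PATTERN = re.compile(
    r"\b((?:[A-Z][\w&'-]*\s+){1,3}(?i:" + "|".join(NAME_SUFFIXES) + r"))\b"
)


def extract_names(text):
    """Return the business names found in a free-text field, in order."""
    return [match.group(1) for match in NAME_PATTERN.finditer(text)]
```

Names extracted this way would still go through the verification and comparison step described above, since a pattern match alone says nothing about whether the business exists.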
S13) labeling samples containing hotel names and non-hotel names by using an algorithm to form labeling data;
In this step, a labeled data set containing samples of known business names and non-business names is prepared for the machine learning algorithm. Part of the data set may be labeled manually, or an existing labeled data set may be used.
S14) constructing an extraction model with naive bayes, and then performing cross-validation to obtain a final extraction model.
The naive Bayes algorithm is a classification algorithm based on Bayes' theorem, which can be used for feature extraction and classification tasks. In the naive Bayes algorithm, the process of feature extraction is implemented by calculating the conditional probability of each feature under a given class. The following are the basic steps of feature extraction in a naive Bayes algorithm:
Data preparation: a training data set containing features and categories is prepared. The data set should be cleaned and preprocessed, for example by removing noise and handling missing values.
Feature selection: appropriate features are selected for training the model. Domain knowledge, statistical methods, or feature selection algorithms (e.g., chi-square test, information gain) may be used to select the most relevant features.
Feature extraction: the conditional probability under a given class is calculated for each feature. The naive Bayes algorithm assumes that features are independent of each other, so the conditional probability of each feature can be calculated independently and the results then combined to obtain the overall probability.
Model training: the training data set is used to calculate the conditional probability of each feature under each category. Frequency statistics or smoothing techniques (e.g., Laplace smoothing) may be used to estimate the probability values.
Feature extraction and classification: for a new input sample, its feature values are calculated and the probability that the sample belongs to each class is calculated using Bayes' theorem. The category with the highest probability is selected as the prediction result.
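The steps above can be sketched as a small multinomial naive Bayes classifier with Laplace smoothing and a simple k-fold cross-validation. This is a toy illustration under stated assumptions: the class and function names, the word-plus-bigram features, and the fold assignment by index are all hypothetical, and a real system would more likely use an established library.

```python
import math
from collections import Counter, defaultdict


def text_features(text):
    """Words plus pairs of consecutive words (assumed feature set)."""
    tokens = text.lower().split()
    return tokens + [" ".join(pair) for pair in zip(tokens, tokens[1:])]


class NaiveBayes:
    """Multinomial naive Bayes with Laplace (add-one) smoothing."""

    def fit(self, samples, labels):
        self.class_counts = Counter(labels)
        self.feature_counts = defaultdict(Counter)
        for text, label in zip(samples, labels):
            self.feature_counts[label].update(text_features(text))
        self.vocab = {f for c in self.feature_counts.values() for f in c}
        return self

    def predict(self, text):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            counts = self.feature_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(count / total)  # log prior of the class
            for feat in text_features(text):
                # Features are treated as independent given the class.
                score += math.log((counts[feat] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def cross_validate(samples, labels, k=3):
    """Return the overall accuracy of k-fold cross-validation."""
    pairs = list(zip(samples, labels))
    correct = 0
    for fold in range(k):
        train = [p for i, p in enumerate(pairs) if i % k != fold]
        held_out = [p for i, p in enumerate(pairs) if i % k == fold]
        model = NaiveBayes().fit(*zip(*train))
        correct += sum(model.predict(s) == l for s, l in held_out)
    return correct / len(pairs)
```

Working in log space avoids numeric underflow when many small probabilities are multiplied, and the add-one term keeps an unseen feature from forcing a class probability to zero.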
The hotel aggregation platform at least comprises a database for storing a data set, a data input unit, a data retrieval and inquiry unit and a statistical analysis unit;
Specifically, to ensure that the associations between hotels are thoroughly mined, and in combination with the segmented text features of the N-gram model, the method further comprises, before constructing the corresponding hotel aggregation platform, acquiring the association relations between different hotel names through the segmented words and placing hotels with cross-association into the same data set.
In this embodiment, an aggregation platform is established on the basis of the extracted enterprise names and related information, for storing and managing the full consumption data. The database structure and data model are designed to support data storage and query operations. The basic functions of the aggregation platform are implemented, including data uploading, data storage, and data retrieval.
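One possible way to place cross-associated hotels into the same data set is to link any two names that share a non-generic word and then take the connected groups. The sketch below does this with a small union-find structure; the list of generic words and the rule "one shared word is enough" are assumptions, and place names would normally need to be excluded as well.

```python
from collections import defaultdict

# Assumed generic words that should not link two hotels by themselves.
GENERIC_TOKENS = {"hotel", "inn", "resort", "the", "and"}


def group_associated_hotels(names):
    """Group hotel names that share at least one non-generic word."""
    parent = {name: name for name in names}

    def find(name):
        # Follow parent links to the group representative, compressing paths.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    first_seen = {}  # word -> first hotel name that contained it
    for name in names:
        for token in set(name.lower().split()) - GENERIC_TOKENS:
            if token in first_seen:
                parent[find(name)] = find(first_seen[token])
            else:
                first_seen[token] = name

    groups = defaultdict(list)
    for name in names:
        groups[find(name)].append(name)
    return sorted(sorted(group) for group in groups.values())
```

Because the grouping is transitive, hotel A and hotel C end up together when each shares a word with hotel B, which matches the idea of cross-association but can over-merge if the generic word list is too short.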
The statistical analysis unit is used for acquiring average consumption amount, average consumption frequency and consumption trend graph of the hotel through an index algorithm in a set time period, and obtaining a statistical analysis report.
The database also comprises consumption items. The hotel aggregation platform is embedded with an Apriori algorithm, which generates association rules from the frequent item sets in the data set and then obtains the association rules among the consumption items by using the support and the confidence.
The specific association rule mining process is as follows:
To mine the associations between different consumption items of a hotel, an association rule mining algorithm may be used, the most commonly used being the Apriori algorithm. The Apriori algorithm discovers frequent item sets and association rules, which describe the association relations between different consumption items. The following are the basic steps of association rule mining using the Apriori algorithm:
Data preparation: a transaction data set containing hotel consumption items is prepared from the hotel aggregation platform or from an existing database. Each transaction contains one or more consumption items.
Frequent item set mining: using the Apriori algorithm, the combinations of consumption items that occur frequently in the data set, i.e., the frequent item sets, are found. A frequent item set is a combination of items whose frequency of occurrence in the data set exceeds a preset threshold.
Association rule generation: association rules are generated from the frequent item sets. An association rule is a rule of the form "A -> B", indicating that the occurrence of consumption item A is associated with the occurrence of consumption item B. The generation of association rules is based on the support and confidence measures.
Support and confidence calculation: the support and the confidence of each frequent item set and association rule are calculated. The support is the proportion of transactions containing the item set or rule among all transactions, and the confidence is the probability that B occurs given that A occurs.
Association rule screening: the association rules with sufficient support and confidence are screened out according to the set support and confidence thresholds. The thresholds may be adjusted as needed to control the number and quality of the rules.
Through these steps, the associations between different consumption items of a hotel can be found; for example, some consumption items may frequently appear together, or the appearance of some consumption items may be accompanied by the appearance of others. These association rules can give hotels insight into consumer behavior and consumption habits, so that they can optimize their operating policies and services. Enterprises can likewise use the association rules discovered by this mining method to better optimize travel costs.
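The Apriori steps listed above can be sketched in a few dozen lines. This is a minimal, unoptimized illustration; the function names and the tuple layout of a rule are assumptions, and the thresholds are supplied by the caller as the text describes.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for all item sets meeting min_support."""
    transactions = [frozenset(t) for t in transactions]
    total = len(transactions)
    items = {item for t in transactions for item in t}
    current = [frozenset([item]) for item in items]
    frequent = {}
    size = 1
    while current:
        # Count how many transactions contain each candidate item set.
        counts = {c: sum(c <= t for t in transactions) for c in current}
        level = {c: n / total for c, n in counts.items() if n / total >= min_support}
        frequent.update(level)
        size += 1
        candidates = {a | b for a in level for b in level if len(a | b) == size}
        # Apriori pruning: every (size-1)-subset must itself be frequent.
        current = [
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, size - 1))
        ]
    return frequent


def association_rules(frequent, min_confidence):
    """Return (antecedent, consequent, support, confidence) tuples."""
    rules = []
    for itemset, support in frequent.items():
        for size in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(itemset, size)):
                # confidence(A -> B) = support(A and B) / support(A)
                confidence = support / frequent[antecedent]
                if confidence >= min_confidence:
                    rules.append(
                        (set(antecedent), set(itemset - antecedent), support, confidence)
                    )
    return rules
```

With four transactions in which "breakfast" always appears together with "room", a minimum support of 0.5 and a minimum confidence of 0.8 keep the rule breakfast -> room but reject room -> breakfast, because one room-only stay lowers the latter's confidence to 0.75.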
The generation of the consumption report specifically comprises the following steps:
S21) obtaining the average consumption amounts for each day, each month and each quarter by using an average value method, and generating corresponding change trend graphs; for example, calculating the total amount consumed per month and then dividing it by the number of consumptions. The values are obtained with an embedded calculation formula.
S22) selecting hotel consumption data in a set time period, and drawing the consumption frequency per unit time in the time period together with its change trend graph; specifically, the total number of consumptions is divided by the length of the period, such as a month or a quarter. The frequency indicates how well regarded the hotel is and reveals its peak and off seasons.
S23) selecting an exponential smoothing method from among time series analysis methods to obtain and predict the consumption trend graph in a certain time period, where the consumption trend graph measures the change trend of the consumption amount or the number of consumptions so as to understand how consumption behavior is developing;
S24) displaying the consumption trend graphs, analyzing them with the standard deviation to obtain periodic change rules, integrating the analysis results of the consumption trend graphs with a decision tree, and applying correlation analysis to the consumption trend graphs to obtain their correlations, so as to form the statistical analysis report.
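Steps S21) to S23) can be illustrated with the short sketch below: a monthly average amount, a monthly consumption count, and simple exponential smoothing. The record layout of (date, amount) pairs, the function names, and the smoothing factor `alpha` are assumptions; the source does not state which variant of exponential smoothing is used.

```python
from collections import defaultdict
from datetime import date


def month_key(day):
    """Label a date with its month, e.g. 2023-07."""
    return f"{day.year}-{day.month:02d}"


def monthly_average_amount(records):
    """S21: total amount per month divided by the number of consumptions."""
    totals, counts = defaultdict(float), defaultdict(int)
    for day, amount in records:
        totals[month_key(day)] += amount
        counts[month_key(day)] += 1
    return {month: totals[month] / counts[month] for month in sorted(totals)}


def monthly_frequency(records):
    """S22: number of consumptions in each month of the period."""
    counts = defaultdict(int)
    for day, _amount in records:
        counts[month_key(day)] += 1
    return dict(sorted(counts.items()))


def exponential_smoothing(series, alpha=0.5):
    """S23: simple exponential smoothing of a series of period values.

    The last smoothed value serves as the one-step-ahead forecast.
    """
    if not series:
        raise ValueError("series must not be empty")
    smoothed = [series[0]]
    for value in series[1:]:
        smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
    return smoothed
```

The outputs of the first two functions are the points that would be plotted on the change trend graphs; a larger `alpha` makes the smoothed curve follow recent values more closely.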
In this embodiment, appropriate statistical indexes and algorithms are designed according to the needs and actual conditions of the enterprise. For example, in measuring the consumption of an enterprise, an average consumption amount, consumption frequency, etc. may be designed to better evaluate the consumption of an enterprise and travel costs. Meanwhile, these indices, such as average value, standard deviation, etc., may be calculated using a suitable algorithm.
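As a sketch of the standard deviation and correlation indexes mentioned here, the following uses Python's `statistics` module for the dispersion of a series and computes a Pearson correlation between two series by hand. The function names are hypothetical, and the choice of the population standard deviation and of Pearson correlation is an assumption, since the source does not name specific formulas.

```python
import math
import statistics


def dispersion(amounts):
    """Population standard deviation of a series of consumption amounts."""
    return statistics.pstdev(amounts)


def pearson_correlation(xs, ys):
    """Pearson correlation between two equally long series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length of at least 2")
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return covariance / math.sqrt(spread_x * spread_y)
```

A high standard deviation across months suggests strong seasonal swings, and a correlation near 1 between, say, monthly room spending and monthly restaurant spending indicates that the two trends move together.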
The analysis report in this embodiment takes into account both the design of the statistical indexes and algorithms and the utility of the generated consumption reports and insights. When generating detailed consumption reports and insights, it is necessary to ensure that their content has practical application value. For example, based on the results of statistics and analysis, visualizations such as consumption trend graphs and consumption profiles may be generated to show the consumption situation more intuitively. Specific optimization suggestions can also be provided, such as arranging travel schedules reasonably and selecting economical hotels, to help enterprises optimize travel costs.
In this embodiment, through the above values, when planning such as travel cost, a corresponding calculation method and algorithm can be designed according to the needs of the enterprise, for example, accumulating various costs, calculating average cost, and the like.
According to the invention, through the constructed platform, operators can select or add different data mining and statistical analysis methods, and then select proper data mining and statistical analysis methods according to actual conditions and analysis purposes. For example, the consumer data may be analyzed using cluster analysis, association rule mining, time series analysis, and the like. Depending on the nature of the data and the purpose of the analysis, a suitable method is selected to extract useful information and insight.
Design of statistical indexes and algorithms: and designing proper statistical indexes and algorithms to measure consumption conditions and travel costs according to the requirements and actual conditions of enterprises. For example, indices such as average consumption amount, frequency of consumption, consumption trend, etc., and corresponding calculation methods and algorithms may be designed. According to the specific requirements of enterprises, proper indexes and algorithms are selected to evaluate the consumption condition and the travel cost.
The hotel aggregation platform also comprises a visualization component and an interactive dashboard, where the interactive dashboard is used to select and view consumption reports and average consumption change trends.
In the invention, when in application, detailed consumption reports and insights are generated according to the results of statistics and analysis, and corresponding visualization tools and reports are provided. Data visualization tools, such as charts, dashboards, etc., can be used to present statistics and insight. Visual and understandable visualization tools and reports are designed according to the demands of enterprises and habits of users, so that the users can quickly understand and utilize statistical results and insights.
As an example of later application of the platform, assume that an enterprise uses the data-mining-based hotel consumption full data aggregation platform to collect statistics on and analyze its consumption data over a year. First, according to the requirements and actual conditions of the enterprise, suitable data mining and statistical analysis methods are selected, such as cluster analysis and association rule mining. Then, appropriate statistical indexes, such as average consumption amount and consumption frequency, are designed and calculated with appropriate algorithms. Next, detailed consumption reports and insights are generated from the results of the statistics and analysis. The report includes visualizations such as consumption trend graphs and consumption profiles to show consumption more intuitively. The report also provides specific optimization suggestions, such as arranging travel schedules reasonably and selecting economical hotels, to help the enterprise optimize travel costs. By implementing this improvement scheme, enterprises can evaluate consumption conditions and travel costs more accurately and optimize them according to the suggestions provided by the reports and insights, thereby reducing cost and improving efficiency.
For sales enterprises with frequent business trips, better cooperative hotels including hotels where staff check in for a long time can be selected by analyzing the report, long-term cooperative negotiations can be carried out by utilizing the data, and further, the enterprises can evaluate consumption conditions and business trip costs more accurately and optimize according to suggestions provided by the report and insight, so that the goals of reducing cost and improving efficiency are achieved. The improvement scheme has important practical application value, can help enterprises understand and optimize travel cost, and improves the competitiveness of the enterprises.
One application of the invention is as follows:
assume that an enterprise uses a data mining-based hotel consumption full data aggregation platform to manage and analyze hotel consumption data. In the implementation process, cluster analysis and association rule mining are selected as data mining and statistical analysis methods according to the requirements and actual conditions of enterprises.
Firstly, classifying enterprises according to consumption conditions through cluster analysis to obtain enterprise groups of different categories. Then, according to the demands of enterprises, statistical indexes such as average consumption amount, consumption frequency and the like are designed, and the indexes are calculated by using a corresponding algorithm. Through association rule mining, association relations among different consumption items, such as association between hotel reservations and restaurant consumption, are found.
Detailed consumption reports and insights are then generated from the results of the statistics and analysis. Data visualization tools, such as bar charts and line charts, display the consumption and travel costs of the different enterprise groups. An interactive dashboard is also provided, so that users can select and view the statistical results and insights that match their own needs and points of interest.
By implementing the improvement scheme, the enterprise can evaluate its consumption and travel costs more accurately, which helps it understand and optimize those costs. The generated consumption reports and insights can also be provided to business decision makers and administrators, helping them make better-informed decisions and optimize travel strategies.
The detailed descriptions above merely illustrate practicable embodiments of the present invention and are not intended to limit its scope; all equivalent embodiments or modifications that do not depart from the spirit of the present invention shall be included within its scope.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential characteristics thereof. The present embodiments are, therefore, to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference sign in a claim should not be construed as limiting the claim concerned.
Furthermore, it should be understood that although this specification is organized by embodiments, not every embodiment contains only one independent technical solution. This manner of description is adopted for clarity only; the disclosure is not limited to the embodiments described in detail, and the embodiments may be combined as appropriate to form other embodiments that will be apparent to those skilled in the art.

Claims (10)

1. A hotel consumption analysis method based on data mining, characterized by comprising the following steps:
acquiring offline hotel consumption data and online hotel consumption data updated in real time through a data mining technique;
processing the online hotel consumption data and the offline hotel consumption data separately, and storing the resulting hotel consumption data to form a database;
extracting fields containing a hotel name from the database by using the data mining technique, and verifying and comparing them to obtain a data set corresponding to the hotel name, so as to construct a corresponding hotel aggregation platform;
wherein the hotel aggregation platform comprises at least a database for storing the data set, a data input unit, a data retrieval and query unit, and a statistical analysis unit;
and the statistical analysis unit is used to obtain, through an index algorithm, the average consumption amount, average consumption frequency and consumption trend graph of the hotel within a set time period, and to produce a statistical analysis report.
2. The hotel consumption analysis method based on data mining according to claim 1, wherein the data mining technique comprises acquiring hotel consumption data by cooperating with a third-party agency and using an interface of the third-party agency.
3. The hotel consumption analysis method based on data mining according to claim 1, wherein the data mining technique further comprises obtaining hotel consumption data by cooperating with hotels or through surveys and interviews.
4. The hotel consumption analysis method based on data mining according to claim 1, wherein the data processing specifically comprises: removing duplicates by comparing and deleting hotel consumption data containing the date, the amount, the hotel name and the place; checking for and filling in missing values; unifying the hotel name and the place into the same field format; and converting the amount into a unified unit.
5. The hotel consumption analysis method based on data mining according to claim 1, further comprising constructing an extraction model for hotel names, specifically:
S11) constructing a database table containing text data by using the online hotel consumption data;
S12) obtaining features in the text data through a language processing technique;
S13) labeling samples containing hotel names and non-hotel names by using an algorithm to form labeled data;
S14) constructing an extraction model with Naive Bayes, and then performing cross-validation to obtain a final extraction model.
6. The hotel consumption analysis method based on data mining according to claim 5, wherein the language processing technique in step S12) is specifically an N-gram model, which divides the text data into a plurality of words, each word together with the word preceding it forming a feature.
7. The hotel consumption analysis method based on data mining according to claim 6, wherein, before the corresponding hotel aggregation platform is constructed, the method further comprises obtaining association relations between different hotel names through word segmentation, and placing hotels with cross-associations into the same data set.
8. The hotel consumption analysis method based on data mining according to claim 1, wherein the database further comprises consumption items, the hotel aggregation platform has an embedded Apriori algorithm, and association rules are generated from frequent item sets in the data set, so that the association rules among the consumption items are obtained by using support and confidence.
9. The hotel consumption analysis method based on data mining according to claim 1, wherein generating the consumption report specifically comprises the following steps:
S21) obtaining the average consumption amount for each day, each month and each quarter by using an averaging method, and generating a corresponding change trend graph;
S22) selecting hotel consumption data within a set time period, and plotting the consumption frequency per unit time and the change trend chart of the consumption frequency over that period;
S23) selecting exponential smoothing, a time series analysis method, to obtain and forecast the consumption trend graph for a given time period;
S24) displaying the consumption trend graphs, analyzing them with the standard deviation to obtain periodic change rules, integrating the analysis results of the consumption trend graphs with a decision tree, and performing correlation analysis on the consumption trend graphs to obtain their correlations, so as to form the statistical analysis report.
10. The hotel consumption analysis method based on data mining according to claim 1, wherein the hotel aggregation platform further comprises a visualization component and an interactive dashboard for selectively viewing consumption reports and average consumption trends.
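As an illustration of the exponential smoothing step recited in claim 9, the sketch below applies simple (single) exponential smoothing to a series of average consumption amounts. The claims do not specify which variant, smoothing factor, or initialization is used, so the recurrence s[t] = alpha * x[t] + (1 - alpha) * s[t-1] with s[0] = x[0], the value alpha = 0.5, and the sample series are all assumptions made for this example.

```python
def exponential_smoothing(series, alpha):
    """Simple exponential smoothing: s[0] = x[0], s[t] = a*x[t] + (1-a)*s[t-1]."""
    if not series:
        raise ValueError("series must not be empty")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    smoothed = [series[0]]
    for value in series[1:]:
        smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
    return smoothed


# Hypothetical average consumption amounts for four consecutive months.
monthly_average = [100.0, 120.0, 110.0, 130.0]
smoothed = exponential_smoothing(monthly_average, alpha=0.5)

# With simple smoothing, the one-step-ahead forecast is the last smoothed value.
forecast = smoothed[-1]
print(smoothed, forecast)
```

A larger `alpha` tracks recent months more closely, while a smaller one produces a flatter trend line.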
CN202311189235.5A 2023-09-14 2023-09-14 Hotel consumption full data aggregation platform based on data mining Pending CN117291758A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311189235.5A CN117291758A (en) 2023-09-14 2023-09-14 Hotel consumption full data aggregation platform based on data mining

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311189235.5A CN117291758A (en) 2023-09-14 2023-09-14 Hotel consumption full data aggregation platform based on data mining

Publications (1)

Publication Number Publication Date
CN117291758A true CN117291758A (en) 2023-12-26

Family

ID=89251016

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311189235.5A Pending CN117291758A (en) 2023-09-14 2023-09-14 Hotel consumption full data aggregation platform based on data mining

Country Status (1)

Country Link
CN (1) CN117291758A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination