CN112488384A - Method, terminal and storage medium for predicting target area based on social media sign-in

Info

Publication number: CN112488384A
Application number: CN202011358914.7A
Authority: CN
Inventors: 史文中; 刘哲维
Original assignee: Shenzhen Research Institute HKPU
Current assignee: Shenzhen Research Institute HKPU
Priority date: 2020-11-27
Filing date: 2020-11-27
Publication date: 2021-03-12
Anticipated expiration: 2040-11-27
Also published as: CN112488384B

Abstract

The invention discloses a method, a terminal and a storage medium for predicting a target area based on social media check-in. A geographical position tag in a social media check-in record is obtained, and a region feature vector is generated according to the geographical position tag. Multi-dimensional region feature vectors are generated according to the region feature vectors of all regions, a machine learning model is trained according to the multi-dimensional region feature vectors and target region correlation vectors generated based on the multi-dimensional region feature vectors, and the trained machine learning model is taken as a prediction model. A region feature vector to be predicted is obtained, the regions to be predicted are ranked by means of the prediction model and the region feature vector to be predicted, and a target region among the regions to be predicted is determined according to the ranking result. The invention abstracts the task of determining the user's resident area into a ranking problem, ranks each area visited by the user with a machine learning model, and thereby predicts the resident area of the user.

Description

Method, terminal and storage medium for predicting target area based on social media sign-in

Technical Field

The invention relates to the field of geographic information, in particular to a method, a terminal and a storage medium for predicting a target area based on social media sign-in.

Background

Using social media check-in data with geographic position tags to infer a user's resident area is an important research approach in fields such as geographic information science and human mobility pattern research. In terms of technique, common methods for inferring the resident area mostly rely on simple statistics. Such methods are mostly based on empirical intuition, lack rigorous proof and a theoretical basis, and yield results of low precision.

Thus, there is still a need for improvement and development of the prior art.

Disclosure of Invention

The technical problem to be solved by the present invention is to provide a method, a terminal and a storage medium for predicting a target area based on social media sign-in, aiming at solving the problem that the target area of a user is difficult to be accurately predicted according to social media sign-in data in the prior art.

The technical scheme adopted by the invention for solving the problems is as follows:

in a first aspect, an embodiment of the present invention provides a method for predicting a target area based on social media check-in, where the method includes:

acquiring a geographical position tag in the social media check-in record, and generating a region feature vector according to the geographical position tag;

generating multi-dimensional region feature vectors according to the region feature vectors of all regions, training a machine learning model according to the multi-dimensional region feature vectors and target region correlation vectors generated based on the multi-dimensional region feature vectors, and taking the trained machine learning model as a prediction model;

obtaining a region feature vector to be predicted, ranking the regions to be predicted through the prediction model and the region feature vector to be predicted, and determining a target region among the regions to be predicted according to the ranking result.

In one embodiment, the obtaining a geo-location tag in the social media check-in record, and the generating an area feature vector according to the geo-location tag includes:

obtaining a geographical position tag in the social media check-in record, and classifying the social media check-in record through the geographical position tag;

generating region check-in frequency data according to the classification result;

generating regional active day frequency data according to the classification result;

generating active frequency data of the region in a preset time period according to the classification result;

and generating a region feature vector of each region according to the region check-in frequency data, the region active day frequency data and the active frequency data of the region in a preset time period.

In one embodiment, the generating region check-in frequency data according to the classification result includes:

calculating the number of the social media check-ins published by the user in each region according to the classification result;

acquiring the total number of social media check-ins published by a user according to the geographical position tag;

and taking the ratio of the number of the social media check-ins published by the user in each region to the total number of the social media check-ins published by the user as region check-in frequency data of each region.

In one embodiment, the generating region active day frequency data according to the classification result includes:

calculating the number of active days of the user in each region according to the classification result; an active day is a day on which the user posts at least one social media check-in record;

adding the calculated active days of the user in each area to obtain the total active days of the user;

and taking the ratio of the number of active days of the user in each area to the total number of active days of the user as the frequency data of the number of active days of each area.

In one embodiment, the active frequency data of the region in a preset time period includes: night active frequency data, summer active frequency data and weekend active frequency data; the generating active frequency data of the region in a preset time period according to the classification result includes:

calculating the night active days of the region according to the classification result; the night active days are the days on which the user, located in the region, posted at least one social media check-in during a preset night time period;

adding the calculated night active days of the user in each region to obtain the total night active days;

taking the ratio of the night active days of the region to the total night active days as the night active frequency data of the region;

calculating the number of social media check-ins published by the user in the region between preset months according to the classification result;

acquiring the total number of social media check-ins issued by the user in each region between the preset months according to the geographical position tags;

taking the ratio of the number of social media check-ins posted by the user in the region between the preset months to the total number of social media check-ins posted by the user in each region between the preset months as the summer active frequency data;

calculating the number of social media check-ins published by the user in the region on the weekend according to the classification result;

acquiring the total number of social media check-ins issued by the user in each region on weekends according to the geographical position tags;

and taking the ratio of the number of social media check-ins posted by the user in the region on the weekend to the total number of social media check-ins posted by the user in each region on the weekend as the weekend active frequency data.

In one embodiment, the generating multidimensional region feature vectors according to the region feature vectors of all the regions, training a machine learning model according to the multidimensional region feature vectors and a target region correlation vector generated based on the multidimensional region feature vectors, and using the trained machine learning model as a prediction model includes:

acquiring and integrating the region feature vectors of all regions, and taking the integrated vector as a multi-dimensional region feature vector;

acquiring a target region correlation vector generated based on the multi-dimensional region feature vector;

taking the multi-dimensional region feature vector as input data of a machine learning model, taking the target region correlation vector generated based on the multi-dimensional region feature vector as output data of the machine learning model, and training the machine learning model;

and taking the trained machine learning model as a prediction model.

In one embodiment, the obtaining the target region correlation vector generated based on the multi-dimensional region feature vector comprises:

scoring each region according to the relevance between each region in the multi-dimensional region feature vector and a target region, to obtain the target region relevance score of each region;

and integrating the target region relevance scores of all the regions, and taking the vector obtained by integration as a target region relevance vector.

In one embodiment, the obtaining the feature vector of the region to be predicted, sorting the region to be predicted by the prediction model and the feature vector of the region to be predicted, and determining the target region in the region to be predicted according to the sorting result includes:

acquiring a region feature vector to be predicted, and inputting the region feature vector to be predicted into the prediction model;

obtaining a score which is output by the prediction model and generated based on the regional feature vector to be predicted;

and ranking the regions to be predicted based on the scores, and determining a target region among the regions to be predicted according to the ranking result.

In a second aspect, an embodiment of the present invention further provides a mobile terminal, where the mobile terminal includes: a processor, and a storage medium communicatively coupled to the processor, the storage medium being adapted to store a plurality of instructions; the processor is adapted to invoke the instructions in the storage medium to implement the steps of any of the above methods for predicting a target area based on social media check-in.

In a third aspect, an embodiment of the present invention further provides a computer-readable storage medium storing a plurality of instructions, where the instructions are adapted to be loaded and executed by a processor to implement the steps of any one of the above methods for predicting a target area based on social media check-in.

The beneficial effects of the invention are as follows: in the embodiment of the invention, the geographical position tags in the social media check-in record are obtained, and the region feature vectors are generated according to the geographical position tags; multi-dimensional region feature vectors are generated according to the region feature vectors of all regions, a machine learning model is trained according to the multi-dimensional region feature vectors and target region correlation vectors generated based on the multi-dimensional region feature vectors, and the trained machine learning model is taken as a prediction model; a region feature vector to be predicted is obtained, the regions to be predicted are ranked through the prediction model, and a target region among the regions to be predicted is determined according to the ranking result. The invention abstracts the task of determining the user's resident area into a ranking problem, ranks each area visited by the user with a machine learning model, and thereby predicts the resident area of the user.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments described in the present invention, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.

Fig. 1 is a flowchart illustrating a method for predicting a target area based on social media check-in according to an embodiment of the present invention.

Fig. 2 is a schematic flow chart of generating a region feature vector according to an embodiment of the present invention.

Fig. 3 is a schematic flowchart of obtaining a prediction model according to an embodiment of the present invention.

Fig. 4 is a schematic flowchart of determining a target area according to an embodiment of the present invention.

Fig. 5 is a schematic block diagram of a terminal according to an embodiment of the present invention.

Fig. 6 is a graph of experimental results provided by an embodiment of the present invention for evaluating the technical effects of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention clearer, the present invention is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.

It should be noted that, if directional indications (such as up, down, left, right, front, and back) are involved in the embodiment of the present invention, the directional indications are only used to explain the relative positional relationship between the components, the movement situation, and the like in a specific posture (as shown in the drawing), and if the specific posture is changed, the directional indications are changed accordingly.

In recent years, with the popularization of mobile positioning devices and the rise of location-based services, the fusion of traditional social networks and positioning technology has produced a new kind of online social media, the location-based social network, which lets users share their location information anytime and anywhere. Typical behaviors of users of such applications include posting social media check-ins and commenting on check-in places. An individual's check-in data can represent that person's historical movement trajectory, and the check-in data of a large number of users can reveal human movement patterns and regularities of daily life. Because check-in data is social network data that carries geographic information, it reflects both the social network behavior and the movement behavior of the user. Moreover, because such data is simple and inexpensive to acquire, more and more researchers have adopted check-in data for research in recent years.

One such line of research infers the user's resident area from social media check-in data with geographic position tags. This is an important research approach in fields such as geographic information science and human mobility pattern research. In terms of technique, common methods for inferring the resident area mostly rely on simple statistics. Such methods are mostly based on empirical intuition, lack rigorous proof and a theoretical basis, and yield results of low precision.

In view of the above drawbacks of the prior art, the present invention provides a method for determining a target area, so as to accurately determine the resident area of a user. The method abstracts the task of determining the user's resident area into a ranking problem and uses a machine learning model to rank each area visited by the user, thereby predicting the resident area of the user.

As shown in fig. 1, the method comprises the steps of:

Step S100, obtaining a social media check-in record of a user, and generating a region feature vector according to the social media check-in record.

A social media check-in record of the user incorporates the user's geographic location at the time of posting. In a check-in record, the user checks in at a place, discloses their geographic location, and may leave a comment about the check-in place. Combining social media with location makes it possible to analyze when and where the user is active and to better understand the user's activity patterns in each region. Therefore, social media check-in records of the user need to be obtained first, and region feature vectors of all regions are generated according to the social media check-in records.

As shown in fig. 2, in an implementation manner, the step S100 specifically includes the following steps:

step S110, obtaining geographic position tags in the social media check-in records, and classifying the social media check-in records through the geographic position tags;

step S120, generating region check-in frequency data according to the classification result;

step S130, generating region active day frequency data according to the classification result;

step S140, generating active frequency data of the region in a preset time period according to the classification result;

and step S150, generating a region feature vector of each region according to the region check-in frequency data, the region active day frequency data and the active frequency data of the region in a preset time period.

After social media check-in records of a user are obtained, in order to analyze the activity condition of the user in each area based on the social media check-in records, the social media check-in records need to be classified by taking the area as a unit according to geographical position tags in the social media check-in records, area check-in frequency data, area active day frequency data and active frequency data of the area in a preset time period are generated according to classification results, and finally a feature vector of the area is formed based on the frequency data. In an implementation manner, the number of the social media check-ins posted by the user in each region may be calculated according to the classification result, then, the total number of the social media check-ins posted by the user is obtained according to the social media check-in record of the user, and finally, the ratio of the number of the social media check-ins posted by the user in each region to the total number of the social media check-ins posted by the user is used as the region check-in frequency data of each region.
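The region check-in frequency described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function name `checkin_frequency` and the sample labels are hypothetical, and each check-in is assumed to have already been assigned a region label from its geographical position tag.

```python
from collections import Counter


def checkin_frequency(region_labels):
    """Return {region: share of the user's check-ins posted in that region}.

    region_labels holds one (hypothetical) region label per check-in of a
    single user, i.e. the result of classifying check-ins by region.
    """
    counts = Counter(region_labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {region: n / total for region, n in counts.items()}


# Hypothetical data: five check-ins spread over three regions.
labels = ["A", "A", "B", "A", "C"]
freq = checkin_frequency(labels)
```

Because every check-in falls in exactly one region, the frequencies of one user sum to 1.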

In addition, the number of active days of the user in each area can be calculated according to the classification result, and then the calculated number of active days of the user in each area is added to obtain the total number of active days of the user; and taking the ratio of the number of active days of the user in each area to the total number of active days of the user as the frequency data of the number of active days of each area.
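A possible sketch of the active day frequency, assuming each check-in is available as a hypothetical `(region, date)` pair. Following the text, the total is obtained by adding the per-region active days, so a day with check-ins in two regions contributes to both regions and counts twice in the total.

```python
from collections import defaultdict
from datetime import date


def active_day_frequency(checkins):
    """checkins: iterable of (region, date) pairs for one user.

    Returns {region: active days in region / sum of active days over regions}.
    """
    days_by_region = defaultdict(set)
    for region, day in checkins:
        days_by_region[region].add(day)  # a day counts once per region
    active = {region: len(days) for region, days in days_by_region.items()}
    total = sum(active.values())  # added up across regions, as in the text
    if total == 0:
        return {}
    return {region: n / total for region, n in active.items()}


# Hypothetical data: two check-ins on the same day in A count as one active day.
sample = [
    ("A", date(2020, 1, 1)),
    ("A", date(2020, 1, 1)),
    ("A", date(2020, 1, 2)),
    ("B", date(2020, 1, 2)),
]
day_freq = active_day_frequency(sample)
```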

And finally, taking the ratio of the number of the social media check-ins posted by the user in each region in the preset time period to the total number of the social media check-ins posted by the user in the preset time period as the active frequency data of each region in the preset time period.

In one implementation, the active frequency data of the region in a preset time period includes: night active frequency data, summer active frequency data, and weekend active frequency data. To obtain these three types of frequency data, this embodiment first calculates the night active days of the region according to the classification result, where the night active days are the days on which the user posted at least one social media check-in in the region during a preset night time period. The calculated night active days of the user in each region are then added to obtain the total night active days. Finally, the ratio of the night active days of the region to the total night active days is taken as the night active frequency data of the region.
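The night active frequency can be sketched along the same lines. The names below are hypothetical, the default window of 19:00 to 7:00 follows the example window of this embodiment, and attributing a check-in posted after midnight to its own calendar date is an assumption of this sketch, since the text does not say which day such a check-in belongs to.

```python
from collections import defaultdict
from datetime import datetime


def is_night(ts, start_hour=19, end_hour=7):
    """True when ts falls in a night window that wraps past midnight."""
    return ts.hour >= start_hour or ts.hour < end_hour


def night_active_frequency(checkins, start_hour=19, end_hour=7):
    """checkins: iterable of (region, datetime) pairs for one user."""
    nights = defaultdict(set)
    for region, ts in checkins:
        if is_night(ts, start_hour, end_hour):
            nights[region].add(ts.date())  # assumption: keyed by calendar date
    counts = {region: len(days) for region, days in nights.items()}
    total = sum(counts.values())  # total night active days over all regions
    if total == 0:
        return {}
    return {region: n / total for region, n in counts.items()}


# Hypothetical data: the noon check-in in B is ignored.
sample = [
    ("A", datetime(2020, 6, 1, 21, 0)),
    ("A", datetime(2020, 6, 2, 23, 30)),
    ("B", datetime(2020, 6, 2, 12, 0)),
    ("B", datetime(2020, 6, 3, 5, 0)),
]
night_freq = night_active_frequency(sample)
```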

In addition, the number of social media check-ins posted by the user in the region between the preset months needs to be calculated according to the classification result. Then, the total number of social media check-ins posted by the user in each region between the preset months is acquired according to the geographical position tags. Finally, the ratio of the number of social media check-ins posted by the user in the region between the preset months to the total number of social media check-ins posted by the user in each region between the preset months is taken as the summer active frequency data.

In addition, the number of social media check-ins posted by the user in the region on the weekend needs to be calculated according to the classification result. And then, acquiring the total number of social media check-ins published by the user in each region on the weekend according to the geographical position tag. And finally, taking the ratio of the number of social media check-ins posted by the user in the region on the weekend to the total number of social media check-ins posted by the user in each region on the weekend as the weekend active frequency data.

In one implementation, according to relevant research, social media check-ins posted at night reflect the user's resident area better than those posted during the day; similarly, check-ins posted in summer, from May to September, reflect the resident area better than those posted in winter, and check-ins posted on weekends reflect it better than those posted on working days. Based on this research, the present embodiment may set the preset night time period to 19:00 to 7:00, the preset months to May through September, and the weekend to Saturday and Sunday in the conventional sense.
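The summer and weekend frequencies are both ratios of check-in counts restricted to a time window, so one helper can serve both. The following is a hedged sketch: `windowed_frequency`, `is_summer` and `is_weekend` are hypothetical names, and the input is assumed to be `(region, date)` pairs for a single user.

```python
from collections import Counter
from datetime import date


def windowed_frequency(checkins, in_window):
    """Share of the user's in-window check-ins that fall in each region."""
    counts = Counter(region for region, day in checkins if in_window(day))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {region: n / total for region, n in counts.items()}


def is_summer(day, first_month=5, last_month=9):
    """Preset months of this embodiment: May through September."""
    return first_month <= day.month <= last_month


def is_weekend(day):
    """date.weekday() returns 5 for Saturday and 6 for Sunday."""
    return day.weekday() >= 5


# Hypothetical data for one user.
sample = [
    ("A", date(2020, 6, 6)),   # Saturday, summer
    ("B", date(2020, 6, 8)),   # Monday, summer
    ("A", date(2020, 12, 5)),  # Saturday, winter
    ("B", date(2020, 12, 9)),  # Wednesday, winter
]
summer_freq = windowed_frequency(sample, is_summer)
weekend_freq = windowed_frequency(sample, is_weekend)
```

Regions without any in-window check-in are simply absent from the result; a caller that needs a fixed-length feature vector would fill them with 0.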

For example, for a user $u_i$ whose social media check-in records carry geographical position tags showing that the user appeared in a certain region $r_j$, the feature vector of the region is $ur_i^j = (rt_p, rt_{ad}, rt_{np}, rt_{an}, rt_s, rt_w)$. Here $rt_p$ is the proportion of the social media check-ins that user $u_i$ posted in region $r_j$ among all of the user's social media check-ins; $rt_{ad}$ is the proportion of the active days of user $u_i$ in region $r_j$, where an active day is defined as a day on which the user posted at least one social media check-in; $rt_{an}$ is the proportion of the night active days of user $u_i$ in region $r_j$, where a night active day is defined as a day on which the user posted at least one social media check-in between 19:00 and 7:00; $rt_s$ is the proportion of the social media check-ins that user $u_i$ posted in region $r_j$ in summer, from May to September, among all of the user's summer check-ins; and $rt_w$ is the proportion of the social media check-ins that user $u_i$ posted in region $r_j$ on weekends among all of the user's weekend check-ins.

To implement the training of the machine learning model, as shown in fig. 1, the method further includes the following steps:

step S200, generating multi-dimensional region feature vectors according to the region feature vectors of all regions, training a machine learning model according to the multi-dimensional region feature vectors and target region correlation vectors generated based on the multi-dimensional region feature vectors, and taking the trained machine learning model as a prediction model.

Specifically, the embodiment uses a machine learning model of supervised learning type, that is, a training process of the machine learning model is changed into a learning task, and the machine learning model learns how to predict the output variables from the input variables by establishing a mathematical relationship between the input variables and the output variables. Therefore, it is necessary to first obtain a multi-dimensional region feature vector as input data and a target region correlation vector as output data, and then train a machine learning model according to the two vectors. The trained machine learning model can be used as a prediction model, for example, for predicting the resident area of the user.

As shown in fig. 3, in an implementation manner, the step S200 specifically includes the following steps:

step S210, obtaining and integrating the region feature vectors of all the regions, and taking the integrated vector as a multi-dimensional region feature vector;

step S220, obtaining a target area correlation vector generated based on the multi-dimensional area feature vector;

step S230, taking the multidimensional region feature vector as input data of a machine learning model, taking the target region correlation vector generated based on the multidimensional region feature vector as output data of the machine learning model, and training the machine learning model;

and step S240, taking the trained machine learning model as a prediction model.

Specifically, in this embodiment, the region feature vectors of all the regions are obtained and integrated to obtain the multi-dimensional region feature vector, and a target region correlation vector is then generated according to the multi-dimensional region feature vector. To generate the target region correlation vector, in one implementation, each region in the multi-dimensional region feature vector is scored according to its relevance to the target region to obtain a target region relevance score for each region; the target region relevance scores of all regions are then integrated, and the integrated vector is used as the target region correlation vector. The target region relevance score of each region indicates how closely that region is related to the target region. After the multi-dimensional region feature vector and the target region correlation vector are obtained, the multi-dimensional region feature vector is used as input data of a machine learning model, the target region correlation vector is used as output data of the machine learning model, the machine learning model is trained, and finally the trained machine learning model is used as a prediction model.

For example, when the target area to be predicted is the user's resident area, the present embodiment needs to collect the user's resident area in advance. Specifically, the user's social media personal homepage can be crawled with a web crawler, or the current city that the user filled in as personal information on social media such as Weibo, Twitter and Instagram can be used, and the resident area of the user is determined from this information. Then, for all the regions $(r_1, r_2, \ldots, r_m)$ visited by the user, a region's relevance score is 1 if the region is the user's resident area and 0 otherwise; once the relevance score of every region has been calculated, the scores form an $m$-dimensional vector $(0, 0, \ldots, 1, \ldots, 0)$.
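The labelling rule above amounts to a one-hot encoding of the resident region over the regions a user visited. A minimal sketch, with the function name and region identifiers chosen here purely for illustration:

```python
def relevance_vector(visited_regions, resident_region):
    """Return the m-dimensional 0/1 relevance vector for one user.

    visited_regions: ordered list of the regions (r_1, ..., r_m) the user visited.
    resident_region: the region collected in advance as the user's resident area.
    """
    return [1 if region == resident_region else 0 for region in visited_regions]


# Hypothetical user who visited four regions and resides in "r3".
labels = relevance_vector(["r1", "r2", "r3", "r4"], "r3")
```

If the collected resident region is not among the visited regions, the vector is all zeros; how such users are handled is not specified in the text.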

In one implementation, this embodiment uses the multiple-decision-tree LambdaMART model as the prediction model, because LambdaMART is very effective for building search ranking algorithms within the "learning to rank" (LTR) framework. LambdaMART is a listwise LTR algorithm that, building on the LambdaRank algorithm and the MART (Multiple Additive Regression Tree) algorithm, converts the problem of ranking search engine results into a regression decision tree problem. MART is in fact a Gradient Boosting Decision Tree (GBDT) algorithm. The core idea of GBDT is that, over successive iterations, the regression decision tree generated in each new iteration fits the negative gradient of the loss function, and all regression decision trees are finally summed to obtain the final model. LambdaMART is a mature model in the prior art, and its training process is highly standardized: to train the model, it is only necessary to construct the model's input data and output data as training data.
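The GBDT idea described above can be written in standard textbook notation as follows. The symbols are introduced here for illustration and do not appear in the original text: $L$ is the loss function, $(x_i, y_i)$ are the $n$ training samples, $h_t$ is the regression tree fitted in iteration $t$, $\nu$ is a learning rate, and $T$ is the number of trees.

```latex
\begin{align}
F_0(x) &= \arg\min_{c} \sum_{i=1}^{n} L(y_i, c), \\
g_i^{(t)} &= -\left[ \frac{\partial L\bigl(y_i, F(x_i)\bigr)}{\partial F(x_i)} \right]_{F = F_{t-1}}, \\
h_t &= \arg\min_{h} \sum_{i=1}^{n} \bigl( g_i^{(t)} - h(x_i) \bigr)^2, \\
F_t(x) &= F_{t-1}(x) + \nu \, h_t(x), \qquad t = 1, \dots, T.
\end{align}
```

In LambdaMART, as commonly described, the pseudo-residuals $g_i^{(t)}$ are replaced by the lambda gradients of LambdaRank, which are computed over pairs of items within the same ranked list (here, presumably the regions visited by one user).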

After acquiring the input data and the output data for training the machine learning model, in order to predict the target area of the user, as shown in fig. 1, the method further includes the following steps:

step S300, obtaining the region feature vector to be predicted, ranking the regions to be predicted through the prediction model and the region feature vector to be predicted, and determining a target region among the regions to be predicted according to the ranking result.

The prediction model has been trained, so it can automatically predict the correct output data from the input data. In order to predict the target area of the user, region feature vectors of all regions to be predicted are first generated according to the above steps; the region feature vectors to be predicted are then input into the prediction model, the regions to be predicted are ranked through the prediction model, the relevance between each region to be predicted and the target region is determined according to the ranking result, and the target region of the user is then determined among the regions to be predicted.

As shown in fig. 4, in an implementation manner, the step S300 specifically includes the following steps:

step S310, obtaining a region feature vector to be predicted, and inputting the region feature vector to be predicted into the prediction model;

step S320, obtaining a score which is output by the prediction model and generated based on the regional feature vector to be predicted;

and step S330, ranking the regions to be predicted based on the scores, and determining a target region among the regions to be predicted according to the ranking result.

Specifically, the region feature vectors of the regions to be predicted are input into the prediction model, the regions to be predicted are scored by the prediction model and ranked based on the scoring results, and the predicted target region is determined among the regions to be predicted according to the ranking result. In one implementation, the regions to be predicted may be sorted in descending order of their target region relevance scores, and when the target region to be predicted is the user's resident region, the region ranked first may be taken as the user's resident region.
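The final ranking step can be sketched as below. The scores stand in for the output of a trained prediction model; the function name and the sample values are hypothetical and chosen only to show the descending sort and the selection of the first-ranked region.

```python
def predict_target_region(regions, scores):
    """Rank regions by model score (descending) and return (top region, ranking).

    regions and scores are parallel lists for a single user.
    """
    if len(regions) != len(scores):
        raise ValueError("regions and scores must have the same length")
    if not regions:
        return None, []
    ranked = sorted(zip(regions, scores), key=lambda pair: pair[1], reverse=True)
    ranking = [region for region, _ in ranked]
    return ranking[0], ranking


# Hypothetical scores, as if returned by the prediction model for one user.
top, ranking = predict_target_region(["r1", "r2", "r3"], [0.12, 1.87, -0.40])
```

Python's `sorted` is stable, so regions with equal scores keep their input order; a tie-breaking rule is not given in the text.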

To illustrate the effect of the method for predicting the target area based on social media check-in provided by the embodiment of the invention, experiments were carried out with real data. Fig. 6 shows the experimental results obtained with real Instagram check-in data. In this example, indexes such as Accuracy, F-measure, and Balanced Accuracy are used to quantitatively evaluate the method of the present invention and the comparison methods. The quantitative results show that the method of the present invention achieves higher Accuracy, F-measure and Balanced Accuracy than the comparison methods, which demonstrates its advantage: when the target area to be predicted is the user's resident area, it predicts the resident area of social media users more accurately than the other prediction methods.
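For reference, the three indexes can be computed from binary labels as sketched below. The text does not give their exact definitions, so this sketch assumes the usual ones: F-measure is taken as F1 (the harmonic mean of precision and recall) and Balanced Accuracy as the mean of the true positive rate and the true negative rate. The labels are hypothetical and are not the experimental data of Fig. 6.

```python
def evaluate(y_true, y_pred):
    """Accuracy, F1 and balanced accuracy for binary 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0  # true positive rate
    specificity = tn / (tn + fp) if tn + fp else 0.0  # true negative rate
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    balanced_accuracy = (recall + specificity) / 2
    return {"accuracy": accuracy, "f1": f1, "balanced_accuracy": balanced_accuracy}


# Hypothetical labels: 1 marks the resident region, 0 any other region.
y_true = [1, 0, 0, 1, 0, 0]
y_pred = [1, 0, 0, 0, 1, 0]
metrics = evaluate(y_true, y_pred)
```

Because each user has one resident region among several visited regions, the classes are imbalanced, which is a common reason for reporting balanced accuracy alongside plain accuracy.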

Based on the above embodiment, the present invention further provides an intelligent terminal, a schematic block diagram of which may be as shown in FIG. 5. The intelligent terminal comprises a processor, a memory, a network interface, and a display screen connected through a system bus. The processor of the intelligent terminal provides computing and control capabilities. The memory of the intelligent terminal comprises a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The network interface of the intelligent terminal is used to connect to and communicate with an external terminal through a network. The computer program, when executed by the processor, implements a method of predicting a target area based on social media check-in. The display screen of the intelligent terminal may be a liquid crystal display screen or an electronic ink display screen.

It will be understood by those skilled in the art that the block diagram shown in FIG. 5 shows only the part of the structure related to the solution of the present invention and does not limit the intelligent terminal to which the solution is applied; a specific intelligent terminal may include more or fewer components than those shown in the figure, may combine some components, or may have a different arrangement of components.

In one implementation, one or more programs are stored in the memory of the intelligent terminal and are configured to be executed by one or more processors, the one or more programs including instructions for performing a method of predicting a target area based on social media check-in.

It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing the relevant hardware. The computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the method embodiments described above. Any reference to memory, storage, databases, or other media used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchronous link DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).

In summary, the present invention discloses a method for predicting a target area based on social media check-in, the method comprising: acquiring a geographical position tag in a social media check-in record, and generating a region feature vector according to the geographical position tag; generating multi-dimensional region feature vectors according to the region feature vectors of all regions, training a machine learning model according to the multi-dimensional region feature vectors and target region correlation vectors generated based on the multi-dimensional region feature vectors, and taking the trained machine learning model as a prediction model; and obtaining a feature vector of each region to be predicted, ranking the regions to be predicted by means of the prediction model, and determining a target region among the regions to be predicted according to the ranking result. The invention abstracts the task of determining a user's resident region into a ranking problem and ranks each region visited by the user with a machine learning model, thereby predicting the user's resident region.
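The first step of the summary, turning geotagged check-ins into per-region feature vectors, can be sketched as follows. This is an illustrative assumption rather than the method as claimed: the function name `region_feature_vectors`, the normalisation by the user's totals, the night hours (20:00 to 06:00), and the summer months (June to August) are all hypothetical choices; only the feature types (check-in frequency, active-day frequency, and night, summer, and weekend activity frequency) come from the text.

```python
from collections import defaultdict
from datetime import datetime
from typing import Dict, Iterable, List, Tuple

# Assumed definitions; the text does not fix these thresholds.
NIGHT_HOURS = set(range(0, 6)) | set(range(20, 24))
SUMMER_MONTHS = {6, 7, 8}


def share(part: int, total: int) -> float:
    """Fraction part/total, or 0.0 when total is zero."""
    return part / total if total else 0.0


def region_feature_vectors(
    checkins: Iterable[Tuple[str, datetime]],
) -> Dict[str, Tuple[float, float, float, float, float]]:
    """Build one vector per region for a single user.

    Vector order: (check-in share, active-day share, night share,
    summer share, weekend share), each relative to the user's totals.
    """
    by_region: Dict[str, List[datetime]] = defaultdict(list)
    for region_id, ts in checkins:
        by_region[region_id].append(ts)

    all_ts = [ts for stamps in by_region.values() for ts in stamps]
    total_days = len({ts.date() for ts in all_ts})
    total_night = sum(ts.hour in NIGHT_HOURS for ts in all_ts)
    total_summer = sum(ts.month in SUMMER_MONTHS for ts in all_ts)
    total_weekend = sum(ts.weekday() >= 5 for ts in all_ts)

    vectors = {}
    for region_id, stamps in by_region.items():
        vectors[region_id] = (
            share(len(stamps), len(all_ts)),
            share(len({ts.date() for ts in stamps}), total_days),
            share(sum(ts.hour in NIGHT_HOURS for ts in stamps), total_night),
            share(sum(ts.month in SUMMER_MONTHS for ts in stamps), total_summer),
            share(sum(ts.weekday() >= 5 for ts in stamps), total_weekend),
        )
    return vectors


# Made-up check-ins for one user; region ids are hypothetical.
checkins = [
    ("home", datetime(2020, 7, 4, 22, 0)),    # Saturday night, summer
    ("home", datetime(2020, 7, 5, 23, 30)),   # Sunday night, summer
    ("work", datetime(2020, 7, 6, 10, 0)),    # Monday daytime, summer
    ("home", datetime(2020, 11, 10, 21, 0)),  # Tuesday night
    ("work", datetime(2020, 11, 11, 9, 0)),   # Wednesday daytime
]
vectors = region_feature_vectors(checkins)
```

Stacking such vectors over all regions of all users would give the multi-dimensional region feature vectors used for training; how the target region correlation vectors are derived from them is not reproduced here.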

It is to be understood that the invention is not limited to the examples described above, but that modifications and variations may be effected thereto by those of ordinary skill in the art in light of the foregoing description, and that all such modifications and variations are intended to be within the scope of the invention as defined by the appended claims.

Claims

1. A method for predicting a target area based on social media check-in, the method comprising:

2. The method of claim 1, wherein the obtaining a geo-location tag in the social media check-in record and the generating a region feature vector according to the geo-location tag comprises:

3. The method of predicting target areas based on social media check-in of claim 2, wherein the generating area check-in frequency data according to classification results comprises:

4. The method of predicting target areas based on social media check-in of claim 2, wherein the generating area active-day frequency data according to the classification result comprises:

5. The method of claim 2, wherein the activity frequency data of the region in the preset time period, generated according to the classification result, comprises night activity frequency data, summer activity frequency data, and weekend activity frequency data:

6. The method of claim 1, wherein the generating multidimensional region feature vectors according to the region feature vectors of all regions, training a machine learning model according to the multidimensional region feature vectors and a target region correlation vector generated based on the multidimensional region feature vectors, and using the trained machine learning model as a prediction model comprises:

and taking the trained machine learning model as a prediction model.

7. The method of claim 6, wherein obtaining a target region relevance vector generated based on the multi-dimensional region feature vector comprises:

8. The method of claim 1, wherein the obtaining of the feature vector of the region to be predicted, the ranking of the region to be predicted according to the prediction model and the feature vector of the region to be predicted, and the determining of the target region in the region to be predicted according to the ranking result comprises:

9. A mobile terminal, comprising: a processor and a storage medium communicatively coupled to the processor, the storage medium being adapted to store a plurality of instructions; the processor being adapted to invoke the instructions in the storage medium to implement the steps of the method of any one of claims 1 to 8 for predicting a target area based on social media check-in.

10. A computer-readable storage medium having stored thereon instructions adapted to be loaded and executed by a processor to perform the steps of the method for predicting a target area based on social media check-in as claimed in any one of claims 1 to 8.