WO2023273298A1 - User track recognition method, apparatus and device, and storage medium - Google Patents

User track recognition method, apparatus and device, and storage medium

Info

Publication number
WO2023273298A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
wifi
identified
user
recognition
Prior art date
Application number
PCT/CN2022/071481
Other languages
French (fr)
Chinese (zh)
Inventor
张霖
徐赛奕
朱磊
赵文婕
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2023273298A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/29 Geographical information databases
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01S RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S19/00 Satellite radio beacon positioning systems; Determining position, velocity or attitude using signals transmitted by such systems
    • G01S19/01 Satellite radio beacon positioning systems transmitting time-stamped messages, e.g. GPS [Global Positioning System], GLONASS [Global Orbiting Navigation Satellite System] or GALILEO
    • G01S19/13 Receivers
    • G01S19/14 Receivers specially adapted for specific applications
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/901 Indexing; Data structures therefor; Storage structures
    • G06F16/9024 Graphs; Linked lists
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/216 Parsing using statistical methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/237 Lexical tools
    • G06F40/242 Dictionaries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W4/00 Services specially adapted for wireless communication networks; Facilities therefor
    • H04W4/02 Services making use of location information
    • H04W4/029 Location-based management or tracking services

Definitions

  • The present application relates to the field of data processing, and in particular to a user trajectory identification method, apparatus, device, and storage medium.
  • The main methods currently used for user trajectory identification are identification by mobile phone GPS and identification by mobile phone base station.
  • POI: point of interest.
  • The present application provides a user trajectory identification method, apparatus, device, and storage medium, which solve the technical problem that existing user trajectory identification methods identify a user's actual trajectory with low accuracy.
  • A first aspect of the present application provides a user trajectory identification method, including: obtaining original wifi data and GPS information of a user in a time period to be identified, the original wifi data including wifi connection times; performing data preprocessing on the original wifi data to obtain data to be identified; performing primary recognition on the data to be identified according to a preset expert rule dictionary to obtain a primary recognition result; if the primary recognition result is a success, obtaining the location category of the data to be identified; if the primary recognition result is a failure, inputting the data to be identified into a pre-trained wifi recognition model to obtain a secondary recognition result, wherein the secondary recognition result includes the location category of the data to be identified; slicing the time period to be identified according to the wifi connection times to obtain at least one wifi connection time segment, and labeling each wifi connection time segment according to the location category of the data to be identified to obtain user location labeling information; and generating the user trajectory of the user according to the user location labeling information, the original wifi data, and the GPS information.
  • A second aspect of the present application provides a user trajectory identification device, including a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, wherein the processor implements the following steps when executing the computer-readable instructions: obtaining original wifi data and GPS information of a user in a time period to be identified, the original wifi data including wifi connection times; performing data preprocessing on the original wifi data to obtain data to be identified; performing primary recognition on the data to be identified according to a preset expert rule dictionary to obtain a primary recognition result; if the primary recognition result is a success, obtaining the location category of the data to be identified; if the primary recognition result is a failure, inputting the data to be identified into a pre-trained wifi recognition model to obtain a secondary recognition result, wherein the secondary recognition result includes the location category of the data to be identified; slicing the time period to be identified according to the wifi connection times to obtain at least one wifi connection time segment, and labeling the wifi connection time segment according to the location
  • A third aspect of the present application provides a computer-readable storage medium storing computer instructions which, when run on a computer, cause the computer to perform the following steps: obtaining original wifi data and GPS information of a user in a time period to be identified, the original wifi data including wifi connection times; performing data preprocessing on the original wifi data to obtain data to be identified; performing primary recognition on the data to be identified according to a preset expert rule dictionary to obtain a primary recognition result; if the primary recognition result is a success, obtaining the location category of the data to be identified; if the primary recognition result is a failure, inputting the data to be identified into a pre-trained wifi recognition model to obtain a secondary recognition result, wherein the secondary recognition result includes the location category of the data to be identified; slicing the time period to be identified according to the wifi connection times to obtain at least one wifi connection time segment, and labeling the wifi connection time segment according to the location category of the data to be identified to obtain user location labeling information; according to the user location labeling information, the original wifi
  • A fourth aspect of the present application provides a user trajectory identification apparatus, including: an acquisition module for obtaining original wifi data and GPS information of a user in a time period to be identified, the original wifi data including wifi connection times; a preprocessing module for performing data preprocessing on the original wifi data to obtain data to be identified; a primary recognition module for performing primary recognition on the data to be identified according to a preset expert rule dictionary to obtain a primary recognition result and, when the primary recognition result is a success, obtaining the location category of the data to be identified; a secondary recognition module for inputting the data to be identified into a pre-trained wifi recognition model when the primary recognition result is a failure, to obtain a secondary recognition result, wherein the secondary recognition result includes the location category of the data to be identified; a labeling module for slicing the time period to be identified according to the wifi connection times to obtain at least one wifi connection
  • In the technical solution of the present application, the original wifi data and GPS information of the user in the time period to be identified are obtained, the original wifi data including wifi connection times; data preprocessing is performed on the original wifi data to obtain the data to be identified; primary recognition is performed on the data to be identified according to the preset expert rule dictionary to obtain a primary recognition result; if the primary recognition result is a success, the location category of the data to be identified is obtained; if the primary recognition result is a failure, the data to be identified is input into the pre-trained wifi recognition model to obtain a secondary recognition result, wherein the secondary recognition result includes the location category of the data to be identified; the time period to be identified is sliced according to the wifi connection times to obtain at least one wifi connection time segment, and each wifi connection time segment is labeled according to the location category of the data to be identified to obtain the user location labeling information; and the user trajectory of the user is generated according to the user location labeling information, the original wifi data, and the GPS information.
  • Based on deep learning technology, this method generates user location labeling information from wifi data, performs fine-grained, small-range user trajectory identification from the user location labeling information and wifi data, and combines GPS information for wide-area user trajectory identification, so as to generate the user trajectory of the user. Because the trajectory is identified from several kinds of data, recognition accuracy is improved and the identification process can be automated.
  • FIG. 1 is a schematic diagram of a first embodiment of the user trajectory identification method in the embodiment of the present application;
  • FIG. 2 is a schematic diagram of a second embodiment of the user trajectory identification method in the embodiment of the present application.
  • FIG. 3 is a schematic diagram of a third embodiment of the user trajectory identification method in the embodiment of the present application.
  • FIG. 4 is a schematic diagram of a fourth embodiment of the user trajectory identification method in the embodiment of the present application.
  • FIG. 5 is a schematic diagram of a fifth embodiment of the user trajectory identification method in the embodiment of the present application.
  • FIG. 6 is a schematic diagram of an embodiment of the user trajectory identification device in the embodiment of the present application.
  • FIG. 7 is a schematic diagram of another embodiment of the user trajectory identification device in the embodiment of the present application.
  • FIG. 8 is a schematic diagram of an embodiment of a user trajectory identification device in the embodiment of the present application.
  • Embodiments of the present application provide a user trajectory identification method, apparatus, device, and storage medium, which solve the technical problem that existing user trajectory identification methods identify a user's actual trajectory with low accuracy.
  • An embodiment of the user trajectory identification method in the embodiment of the present application includes:
  • the execution subject of the present application may be a user trajectory recognition device, and may also be a terminal or a server, which is not specifically limited here.
  • the embodiment of the present application is described by taking the server as an execution subject as an example.
  • the user track of the user in a time period can be identified, and the time period to be identified can be selected as one day or one week, which is not limited in this application.
  • The original wifi data mainly includes the wifi name, wifi ID, connecting device number, and corresponding wifi address of every wifi network the user connected to in the time period to be identified. Each wifi name, wifi ID, and device number correspond one-to-one; after category recognition is later performed on the wifi name, the recognized category, wifi name, wifi ID, and device number together form a user trajectory.
  • The GPS information describes the user's outdoor trajectory in the time period to be identified, while the original wifi data enables indoor positioning and describes the user's indoor trajectory in that period. Combining the indoor and outdoor trajectories yields the user's overall trajectory.
  • Data preprocessing mainly includes data cleaning and word segmentation.
  • Data cleaning mainly deletes abnormal data, invalid data, and blank data.
  • Word segmentation splits the wifi name data in the original wifi data, decomposing each wifi name into an array of words, and then removes the stop words from that array. Stop words are function words with no actual meaning, such as modal particles and punctuation. The words that remain are valid words, which together form valid phrases. The purpose is to reduce the amount of subsequent computation.
  • The wifi name is segmented mainly with jieba ("stutter") word segmentation, a Python word segmentation module that supports three modes: precise mode, full mode, and search engine mode.
  • The stop words in the wifi name data can be removed using a preset stop-word dictionary, and stop words can be added to the dictionary as needed.
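  • The cleaning, segmentation, and stop-word steps described above can be sketched as follows. This is a minimal illustration only: the regular-expression tokenizer is a stand-in for the jieba segmentation named in the text, and the `STOP_WORDS` set, function names, and sample wifi names are all hypothetical.

```python
import re

# Hypothetical stop-word dictionary; the text only says it is preset and can be extended.
STOP_WORDS = {"the", "a", "of", "-", "_"}

def clean(names):
    """Data cleaning: drop blank or non-string entries and trim whitespace."""
    return [n.strip() for n in names if isinstance(n, str) and n.strip()]

def tokenize(name):
    """Stand-in tokenizer (not jieba): runs of letters/digits, or single other symbols."""
    return re.findall(r"[a-z0-9]+|[^\sa-z0-9]", name.lower())

def preprocess(names):
    """Return one array of valid words per surviving wifi name."""
    return [[t for t in tokenize(n) if t not in STOP_WORDS] for n in clean(names)]

raw = ["  Airport_Free-WiFi ", "", None, "The Noodle Restaurant 5G"]
print(preprocess(raw))
```

  • Keeping cleaning, tokenization, and stop-word removal as separate functions makes it straightforward to swap the stand-in tokenizer for a real segmenter later.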
  • Certain specific words in a wifi name identify its category: names containing "airport" or "railway station" generally indicate transportation; names containing "restaurant" generally indicate catering; and names containing a common router brand name usually indicate home wifi.
  • An expert rule dictionary is established on this basis; it contains the specific words and the types and categories they correspond to.
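  • A rule dictionary of this kind could be represented as a simple keyword-to-category mapping. The sketch below is an assumption about one possible layout; the entries in `EXPERT_RULES`, including the router brand key, and the function name are hypothetical and not taken from the application.

```python
# Hypothetical expert rule dictionary: rule word -> location category.
EXPERT_RULES = {
    "airport": "transportation",
    "railway": "transportation",
    "restaurant": "catering",
    "tplink": "home",  # stands in for "a common router brand name"
}

def primary_recognition(tokens, rules=EXPERT_RULES):
    """Return the category of the first token that hits a rule word, or None if nothing hits."""
    for token in tokens:
        category = rules.get(token)
        if category is not None:
            return category
    return None  # primary recognition failed; secondary recognition is needed

print(primary_recognition(["airport", "free", "wifi"]))
print(primary_recognition(["xyz", "5g"]))
```

  • Returning `None` on a miss gives the caller an unambiguous signal to fall back to the model-based secondary recognition.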
  • the primary recognition result is a recognition failure
  • the secondary recognition result includes the location category of the data to be recognized
  • Primary recognition is performed with the expert rule dictionary: the wifi name word array output by the previous step is traversed against the dictionary. If none of the rule words in the dictionary is hit, primary recognition is deemed to have failed and secondary recognition is required. Secondary recognition uses the preset wifi recognition model, which is built by machine learning, trained automatically on already labeled training data.
  • The wifi recognition model is built as a DNN (deep neural network) comprising a word vector layer, a max pooling layer, fully connected hidden layers, and an output layer. The word vector layer maps the words in the wifi name array to word vectors, and max pooling is then applied along the sequence dimension. Pooling eliminates the difference in word count between corpus samples by extracting the maximum value at each index position across the word vectors. The pooled vector is passed through two consecutive fully connected hidden layers, and the output layer finally emits a normalized probability distribution whose values sum to 1.
  • The output layer has multiple neurons, as many as there are wifi categories to classify. The output of the x-th neuron can therefore be regarded as the predicted probability that the wifi name data belongs to the x-th wifi category; the wifi category with the maximum probability is taken as the category of the wifi name data and as the secondary recognition result.
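  • The forward pass through such a network can be sketched in plain Python. This is an illustrative sketch, not the application's implementation: the weights are random and untrained, the tanh activation, layer sizes, and the `WifiClassifier` class are assumptions, and unknown words are mapped to a zero vector for simplicity.

```python
import math
import random

def max_pool(vectors):
    """Column-wise maximum over a sequence of equal-length word vectors."""
    return [max(col) for col in zip(*vectors)]

def affine(x, weights, bias):
    """One row of weights per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]

def softmax(z):
    m = max(z)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def random_matrix(rng, rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

class WifiClassifier:
    """Word vector layer -> max pooling -> two hidden layers -> softmax output (untrained)."""

    def __init__(self, vocab, categories, dim=4, hidden=8, seed=0):
        rng = random.Random(seed)
        self.categories = list(categories)
        self.embed = {w: [rng.uniform(-0.5, 0.5) for _ in range(dim)] for w in vocab}
        self.unk = [0.0] * dim
        self.w1, self.b1 = random_matrix(rng, hidden, dim), [0.0] * hidden
        self.w2, self.b2 = random_matrix(rng, hidden, hidden), [0.0] * hidden
        self.w3, self.b3 = random_matrix(rng, len(self.categories), hidden), [0.0] * len(self.categories)

    def predict(self, tokens):
        vectors = [self.embed.get(t, self.unk) for t in tokens]
        pooled = max_pool(vectors)  # fixed length regardless of word count
        h1 = [math.tanh(v) for v in affine(pooled, self.w1, self.b1)]
        h2 = [math.tanh(v) for v in affine(h1, self.w2, self.b2)]
        probs = softmax(affine(h2, self.w3, self.b3))
        best = max(range(len(probs)), key=probs.__getitem__)
        return self.categories[best], probs

clf = WifiClassifier(["airport", "free", "wifi", "cafe"], ["transportation", "catering", "home"])
label, probs = clf.predict(["free", "cafe", "wifi"])
print(label, [round(p, 3) for p in probs])
```

  • Because the weights are untrained, the predicted label is arbitrary; the point is only that inputs of any word count yield one probability per category.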
  • the primary recognition result is obtained through the expert rule dictionary
  • the secondary recognition result is obtained through the wifi recognition model.
  • Both the primary recognition result and the secondary recognition result carry the recognized type of the original wifi data. The recognized type of the wifi name data is associated with the row of the wifi name that the user connected to, so that the type of wifi the user connected to can be identified.
  • the user location label information of the user can be identified.
  • the identified user location label information is as follows:
  • the user's GPS information can be used to locate the user, and then realize the generation of the user's track.
  • GPS cannot work effectively inside buildings and has an error of 0-100 meters, which causes the user's trajectory to be misjudged.
  • the wifi address information in the original wifi data can be used to describe the user's indoor user trajectory, mainly through the method of location fingerprinting.
  • Location fingerprinting associates each location with a "fingerprint", and each location corresponds to a unique fingerprint. The fingerprint can be single-dimensional or multi-dimensional: it can be one or more characteristics of a piece of information or a signal, such as the signal strength and delay of the connected wifi. The wifi address and location fingerprints are used to position the user indoors and, combined with outdoor GPS positioning, provide all-round precise positioning.
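  • One common way to use location fingerprints is nearest-neighbour matching against a stored database. The text does not say which matching rule is used, so the Euclidean nearest-neighbour rule, the `FINGERPRINTS` table, and its values below are assumptions for illustration only.

```python
import math

# Hypothetical fingerprint database: location -> (signal strength in dBm, delay in ms).
FINGERPRINTS = {
    "lobby": (-40.0, 5.0),
    "meeting room": (-65.0, 12.0),
    "canteen": (-80.0, 20.0),
}

def locate(observation, database=FINGERPRINTS):
    """Return the location whose stored fingerprint is closest to the observed one."""
    return min(database, key=lambda loc: math.dist(database[loc], observation))

print(locate((-63.0, 11.0)))
```

  • In practice the features would usually be scaled before computing distances, since signal strength and delay are measured in different units.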
  • The locations in the trajectory where the user connected to wifi are marked using the user location labeling information. From the marked information and the trajectory, it can be judged whether the user has deviated from the daily trajectory; if so, a person associated with the user is reminded in advance. For example, if the user is a student and the user's trajectory in the time period to be identified deviates from the daily trajectory, the parents can be reminded.
  • the type of wifi that the user is connected to can be identified through the user location annotation information of the user track, and if it is not the type of wifi that is connected daily, it can be determined that the user deviates from the daily track.
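  • The deviation check described above amounts to comparing the wifi types seen in the period against the types the user connects to daily. A minimal sketch, in which the function name and the category values are hypothetical:

```python
def unusual_categories(track_categories, daily_categories):
    """Return the wifi categories in the track that are not among the user's daily ones."""
    daily = set(daily_categories)
    return [c for c in track_categories if c not in daily]

daily = {"home", "school", "transportation"}
unusual = unusual_categories(["home", "school", "internet cafe"], daily)
if unusual:
    print("user deviates from the daily trajectory:", unusual)
```

  • An empty result means every connected wifi type is one the user connects to daily, so no reminder would be issued.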
  • In this embodiment, the original wifi data and GPS information of the user in the time period to be identified are obtained, the original wifi data including wifi connection times; data preprocessing is performed on the original wifi data to obtain the data to be identified; primary recognition is performed on the data to be identified according to the preset expert rule dictionary to obtain a primary recognition result; if the primary recognition result is a success, the location category of the data to be identified is obtained; if the primary recognition result is a failure, the data to be identified is input into the pre-trained wifi recognition model to obtain a secondary recognition result, wherein the secondary recognition result includes the location category of the data to be identified; the time period to be identified is sliced according to the wifi connection times to obtain at least one wifi connection time segment, and each wifi connection time segment is labeled according to the location category of the data to be identified to obtain the user location labeling information; and the user trajectory of the user is generated according to the user location labeling information, the original wifi data, and the GPS information.
  • Based on deep learning technology, this method generates user location labeling information from wifi data, performs fine-grained, small-range user trajectory identification from the user location labeling information and wifi data, and combines GPS information for wide-area user trajectory identification, so as to generate the user trajectory of the user. Because the trajectory is identified from several kinds of data, recognition accuracy is improved and the identification process can be automated.
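  • The slicing and labeling step summarized in this embodiment can be sketched as follows. The text does not specify how wifi connection times are stored, so representing each connection as a (start, end, wifi name) tuple in minutes, along with the `Segment` type and sample data, is an assumption.

```python
from dataclasses import dataclass

@dataclass
class Segment:
    start: int      # minutes since the start of the period to be identified
    end: int
    wifi_name: str
    category: str

def slice_period(connections, categories, period_start, period_end):
    """Clip each connection to the period and label it with its recognized category."""
    segments = []
    for start, end, name in sorted(connections):
        start, end = max(start, period_start), min(end, period_end)
        if start < end:  # keep only connections that overlap the period
            segments.append(Segment(start, end, name, categories.get(name, "unknown")))
    return segments

connections = [(600, 660, "Noodle Restaurant"), (0, 480, "HomeNet"), (1400, 1500, "HomeNet")]
categories = {"HomeNet": "home", "Noodle Restaurant": "catering"}
for seg in slice_period(connections, categories, 0, 1440):
    print(seg)
```

  • The resulting list of labeled segments plays the role of the user location labeling information; gaps between segments are the intervals where only GPS information would be available.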
  • the second embodiment of the user trajectory identification method in the embodiment of the present application includes:
  • Step 201 in this embodiment is similar to step 101 in the first embodiment, and will not be repeated here.
  • data preprocessing mainly includes data cleaning and word segmentation, wherein data cleaning mainly deletes abnormal data, deletes invalid data, and deletes blank data.
  • Word segmentation is performed on the wifi name data: each wifi name is decomposed into an array of words, and the stop words, which are function words with no actual meaning such as modal particles and punctuation, are then deleted from the wifi name array. The words that remain are valid words, which together form valid phrases. The purpose is to reduce the amount of subsequent computation.
  • According to the preset prefix dictionary, construct a directed acyclic graph of the sequence array, and calculate the probability of each path in the directed acyclic graph;
  • The wifi name is segmented mainly with jieba ("stutter") word segmentation, a Python word segmentation module that supports three modes: precise mode, full mode, and search engine mode.
  • The stop words in the wifi name data can be removed using a preset stop-word dictionary, and stop words can be added to the dictionary as needed.
  • the prefix dictionary is constructed based on the statistical dictionary.
  • The prefixes of the word "北京大学" (Peking University) in the statistical dictionary are "北", "北京", and "北京大"; the prefix of the word "大学" is "大". "北", "北京", "北京大", and "大" are used as prefixes, and the input text is then segmented based on the prefix dictionary.
  • The frequency of each word is recorded (the number of occurrences divided by the total count; when the overall sample is large, this approximates the probability of the word). Once the frequency of each word is known, dynamic programming can be used to find the segmentation path with the highest probability.
  • Dynamic programming usually searches for the optimal path from left to right, but here the search runs from right to left. This is mainly because the focus of a Chinese sentence, its backbone, often falls toward the end, so computing from right to left tends to be more accurate than computing from left to right.
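  • The prefix dictionary, directed acyclic graph, and right-to-left dynamic programming described above can be sketched in a few functions. This is an illustrative re-implementation of the idea, not jieba's own code; the word frequencies in `FREQ` are hypothetical, and a Latin-letter string is used in place of a Chinese wifi name.

```python
import math

# Hypothetical statistical dictionary: word -> number of occurrences.
FREQ = {"air": 5, "port": 5, "airport": 20, "wifi": 50, "free": 25}

def build_prefix_dict(freq):
    """Every prefix of every word gets an entry; prefixes that are not words get count 0."""
    prefix = dict(freq)
    for word in freq:
        for i in range(1, len(word)):
            prefix.setdefault(word[:i], 0)
    return prefix

def build_dag(text, prefix):
    """dag[i] lists every end index j such that text[i:j+1] is a dictionary word."""
    dag = {}
    for i in range(len(text)):
        ends, j, frag = [], i, text[i]
        while j < len(text) and frag in prefix:
            if prefix[frag] > 0:
                ends.append(j)
            j += 1
            frag = text[i:j + 1]
        dag[i] = ends or [i]  # fall back to a single character
    return dag

def segment(text, freq=FREQ):
    prefix = build_prefix_dict(freq)
    dag = build_dag(text, prefix)
    log_total = math.log(sum(freq.values()))
    n = len(text)
    best = {n: (0.0, 0)}  # best[i] = (log-probability of text[i:], end index of first word)
    for i in range(n - 1, -1, -1):  # right to left
        best[i] = max(
            (math.log(prefix.get(text[i:j + 1]) or 1) - log_total + best[j + 1][0], j)
            for j in dag[i]
        )
    words, i = [], 0
    while i < n:
        j = best[i][1]
        words.append(text[i:j + 1])
        i = j + 1
    return words

print(segment("airportwifi"))  # ['airport', 'wifi']
```

  • With these counts the path "airport" + "wifi" has a higher probability than "air" + "port" + "wifi", so the longer word wins.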
  • the primary recognition result is a recognition failure
  • the secondary recognition result includes the location category of the data to be recognized
  • Steps 207-211 in this embodiment are similar to steps 103-107 in the first embodiment, and will not be repeated here.
  • This embodiment describes in detail the process of performing data preprocessing on the original wifi data to obtain the data to be identified: data cleaning is performed on the wifi name data to obtain a data cleaning result; word segmentation is performed on the wifi name data in the data cleaning result to obtain a wifi word segmentation array; and the stop words in the wifi word segmentation array are removed to obtain the data to be identified.
  • Through word segmentation and stop-word removal, the remaining valid words together form valid phrases, which reduces the amount of subsequent computation and improves the efficiency of user trajectory recognition.
  • the third embodiment of the user trajectory identification method in the embodiment of the present application includes:
  • Steps 301-302 in this embodiment are similar to steps 101-102 in the first embodiment, and will not be repeated here.
  • the location category corresponding to the location word whose data to be identified is successfully matched is used as a recognition result
  • Certain specific words in a wifi name identify its category: names containing "airport" or "railway station" generally indicate transportation; names containing "restaurant" generally indicate catering; and names containing a common router brand name generally indicate home wifi.
  • an expert rule dictionary is established, and the dictionary contains specific words corresponding to types and categories.
  • Expert identification rules can classify most wifi names that contain keywords, which effectively reduces the amount of data for subsequent modeling; special rule screening is applied to wifi in specific places to improve recognition accuracy.
  • The wifi name array output by the previous step is traversed: if a rule word in the dictionary is hit, the wifi name is marked with the corresponding category; if no rule word in the dictionary is hit, primary recognition is deemed to have failed and secondary recognition is required.
  • the primary recognition result is a recognition failure
  • the secondary recognition result includes the location category of the data to be recognized
  • Steps 306-308 in this embodiment are similar to steps 104-106 in the first embodiment, and will not be repeated here.
  • This embodiment describes in detail the process of performing primary recognition on the data to be recognized according to the preset expert rule dictionary to obtain a primary recognition result.
  • If matching succeeds, the location category corresponding to the location word that the data to be recognized matched is used as the primary recognition result; if matching fails, the primary recognition result is set to recognition failure.
  • Through the preset expert rule dictionary, the data to be recognized is coarsely recognized, and data that is recognized successfully does not need to be recognized again, which improves the efficiency of user trajectory recognition.
  • the fourth embodiment of the user trajectory identification method in the embodiment of the present application includes:
  • Historical wifi data includes manually identified location categories;
  • A word vector model (word2vec, "word to vector") is pre-trained on the historical wifi data to obtain trained word vector weights, and the word vector weights are used in initializing the parameters of the word vector layer, convolutional network layer, and fully connected layer of the neural network model. The parameters of the word vector layer are set not to be updated before the n-th training round, where n is a positive integer whose value is chosen by the technician according to the actual situation.
  • From round n+1 onward, the parameters of the word vector layer are adjusted: the learning rate of all layers of the neural network model other than the word vector layer is set to the initial learning rate, the learning rate of the word vector layer is set to a value smaller than the initial learning rate, and training continues until the model converges, that is, until the loss value is less than the error threshold. The converged neural network model is taken as the wifi recognition model.
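  • The two-phase schedule described above can be expressed as a small function that returns per-layer learning rates for a given training round. This is a sketch under assumptions: rounds are counted from 1, rounds 1 through n are treated as the frozen phase, and the names `initial_lr` and `embed_scale` and their default values are hypothetical.

```python
def learning_rates(round_index, n, initial_lr=0.01, embed_scale=0.1):
    """Per-layer learning rates for a 1-based training round."""
    if round_index <= n:
        # Frozen phase: the word vector layer is not updated.
        return {"word_vector": 0.0, "other": initial_lr}
    # Fine-tuning phase: the word vector layer uses a smaller rate than the other layers.
    return {"word_vector": initial_lr * embed_scale, "other": initial_lr}

def has_converged(loss, error_threshold):
    """Training stops once the loss value falls below the error threshold."""
    return loss < error_threshold

for r in (1, 3, 4):
    print(r, learning_rates(r, n=3))
```

  • Keeping the word vectors fixed at first lets the randomly initialized layers settle before the pre-trained vectors are allowed to drift.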
  • Steps 406-409 in this embodiment are similar to steps 101-104 in the first embodiment, and will not be repeated here.
  • If the primary recognition result is a recognition failure, input the data to be recognized into the word vector layer in the wifi recognition model, and convert the data to be recognized into a word vector sequence;
  • The wifi recognition model includes a word vector layer, a max pooling layer, fully connected hidden layers, and an output layer. To better represent the semantic relationships between different words, the word vector layer first converts words into fixed-dimensional vectors.
  • the semantic similarity between words and words can be expressed by the distance between their word vectors. The more semantically similar, the closer the distance.
  • the maximum pooling layer is used to eliminate the difference in the number of words in different corpus samples. difference, and extract the maximum value at each subscript position in the word vector. After pooling, the vector sequence output by the word vector layer is converted into a fixed-dimensional vector.
  • the result of max pooling is: [7, 4, 6].
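The pooling step can be illustrated with a short sketch. The input vectors below are hypothetical, chosen only so that the pooled output matches the [7, 4, 6] result quoted above; the original input of that example is not available in this text.

```python
from typing import List, Sequence


def max_pool(word_vectors: Sequence[Sequence[float]]) -> List[float]:
    """Take the maximum at each subscript position across all word vectors."""
    if not word_vectors:
        raise ValueError("at least one word vector is required")
    dim = len(word_vectors[0])
    if any(len(vec) != dim for vec in word_vectors):
        raise ValueError("all word vectors must share the same dimension")
    # zip(*...) groups the values that sit at the same position.
    return [max(column) for column in zip(*word_vectors)]


# Hypothetical 3-dimensional vectors for a three-word wifi name.
vectors = [[7, 1, 6], [2, 4, 3], [5, 0, 2]]
pooled = max_pool(vectors)  # [7, 4, 6]
```

Whatever the number of words in the sample, the output always has the dimension of a single word vector, which is what lets the following fully connected layers accept names of different lengths.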
  • The fully connected hidden layer sends the max-pooled vector through two consecutive hidden layers for calculation.
  • The number of neurons in the output layer is consistent with the number of sample categories. For example, in a binary classification problem the output layer has 2 neurons.
  • With the Softmax activation function, the output is a normalized probability distribution whose values sum to 1. The model in this application is a multi-class classification model, so the output layer has multiple neurons.
  • The number of neurons equals the number of categories into which wifi needs to be classified, so the output of the x-th neuron can be regarded as the predicted probability that the wifi name data belongs to the x-th wifi category. The wifi category corresponding to the maximum value is used as the category of the wifi name data and as the secondary recognition result.
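A minimal sketch of the output layer's decision rule follows: Softmax turns raw scores into probabilities, and the category with the largest probability becomes the secondary recognition result. The function names and the example category names are illustrative assumptions; the application does not list its categories here.

```python
import math
from typing import List, Sequence, Tuple


def softmax(logits: Sequence[float]) -> List[float]:
    """Normalize raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


def classify(logits: Sequence[float],
             categories: Sequence[str]) -> Tuple[str, float]:
    """Return the most probable category and its probability."""
    if len(logits) != len(categories):
        raise ValueError("one output neuron is expected per category")
    probabilities = softmax(logits)
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    return categories[best], probabilities[best]


# Hypothetical categories and output-layer scores.
label, confidence = classify([0.1, 2.0, -1.0], ["home", "office", "restaurant"])
```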
  • this embodiment describes in detail the process of inputting data to be recognized into a pre-trained wifi recognition model to obtain a secondary recognition result.
  • The data to be recognized is converted into a word vector sequence; the word vector sequence is input into the maximum pooling layer in the wifi recognition model to obtain the maximum pooling result; the maximum pooling result is input into the fully connected hidden layer in the wifi recognition model, and the Softmax function classifies the output of the fully connected hidden layer to obtain the secondary recognition result, which includes the location category corresponding to the data to be recognized.
  • the specific training process of the wifi recognition model is described in detail.
  • the wifi recognition model constructed by deep learning can accurately determine the location category of wifi data, thereby improving the generation efficiency of user trajectories.
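The stopping test used during training (compare the loss value with a preset threshold) can be sketched as below. The application only speaks of a "preset loss function"; cross-entropy is assumed here because it is a common choice for Softmax classifiers, and all names are hypothetical.

```python
import math
from typing import Sequence


def cross_entropy(predicted: Sequence[float], true_index: int,
                  eps: float = 1e-12) -> float:
    """Negative log-probability assigned to the manually labeled category."""
    return -math.log(max(predicted[true_index], eps))


def mean_loss(batch_probabilities: Sequence[Sequence[float]],
              labels: Sequence[int]) -> float:
    """Average cross-entropy over a batch of predictions."""
    if len(batch_probabilities) != len(labels) or not labels:
        raise ValueError("need one label per prediction and a non-empty batch")
    losses = [cross_entropy(p, y) for p, y in zip(batch_probabilities, labels)]
    return sum(losses) / len(losses)


def should_stop(loss: float, threshold: float) -> bool:
    """Training stops once the loss falls below the preset threshold."""
    return loss < threshold
```

If `should_stop` returns False, the parameters would be updated by backpropagation and the loss recomputed on the next round.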
  • the fifth embodiment of the user trajectory identification method in the embodiment of the present application includes:
  • the primary recognition result is a recognition failure
  • Steps 501-505 in this embodiment are similar to steps 101-105 in the first embodiment, and will not be repeated here.
  • the user may have connected to multiple wifis during the time period to be identified, and there may be an intermediate interval between multiple wifi connections.
  • For example, if the time period to be identified is a certain day, the times at which the user is connected to wifi may be 0:00-8:00, 9:00-10:00, 12:00-13:00, 14:00-18:00, 18:00-22:00 and 22:00-24:00.
  • The connection time is stored as part of the original wifi data, and the identified category of the wifi name data is associated with the name of the wifi the user connected to, so that the category of each wifi connected by the user can be identified.
  • the user location label information of the user can be identified.
  • the identified user location label information is as follows:
  • Step 509 in this embodiment is similar to step 107 in the first embodiment, and will not be repeated here.
  • this embodiment describes in detail how to generate the user location label information of the user in the time period to be recognized according to the primary recognition result or the secondary recognition result.
  • the location type of the wifi data can be marked, and then the location type of each point in the user track can be generated.
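The slicing and labeling step of this embodiment can be sketched as follows. This is an illustration only: the record layout, the use of hours as the time unit, the `None` label for unconnected gaps, and the wifi names in the usage example are all assumptions.

```python
from typing import Dict, List, NamedTuple, Optional, Sequence, Tuple


class WifiRecord(NamedTuple):
    name: str          # wifi name from the original wifi data
    start_hour: float  # start of the connection
    end_hour: float    # end of the connection


Slice = Tuple[float, float, Optional[str]]


def label_period(records: Sequence[WifiRecord],
                 categories: Dict[str, str],
                 period: Tuple[float, float] = (0.0, 24.0)) -> List[Slice]:
    """Cut the period into slices and attach a location category to each.

    Intervals without a connection are kept with the label None.
    """
    slices: List[Slice] = []
    cursor = period[0]
    for record in sorted(records, key=lambda r: r.start_hour):
        start = max(record.start_hour, cursor)  # clip overlapping records
        end = min(record.end_hour, period[1])
        if end <= start:
            continue
        if start > cursor:
            slices.append((cursor, start, None))  # gap between connections
        slices.append((start, end, categories.get(record.name)))
        cursor = end
    if cursor < period[1]:
        slices.append((cursor, period[1], None))
    return slices


# Hypothetical usage: two connections within one day.
day = label_period(
    [WifiRecord("HomeNet", 0, 8), WifiRecord("OfficeNet", 9, 12)],
    {"HomeNet": "home", "OfficeNet": "office"},
)
```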
  • An embodiment of the user trajectory identification device in the embodiment of the application includes:
  • an obtaining module 601, configured to obtain the original wifi data and gps information of the user in the time period to be identified, the original wifi data including the wifi connection time;
  • a preprocessing module 602, configured to perform data preprocessing on the original wifi data to obtain the data to be recognized;
  • a primary recognition module 603, configured to perform primary recognition on the data to be recognized according to the preset expert rule dictionary to obtain a primary recognition result, and, when the primary recognition result is a successful recognition, to obtain the location category of the data to be recognized;
  • a secondary recognition module 604, configured to input the data to be recognized into a pre-trained wifi recognition model when the primary recognition result is a recognition failure, to obtain a secondary recognition result that includes the location category of the data to be recognized;
  • a labeling module 605, configured to slice the time period to be identified according to the wifi connection time to obtain at least one wifi connection time period, and to label the wifi connection time period according to the location category of the data to be recognized to obtain user location annotation information;
  • a trajectory drawing module 606 configured to generate a user trajectory of the user according to the user location annotation information, the original wifi data and the GPS information.
  • The user trajectory identification device carries out the above user trajectory identification method. The device obtains the original wifi data and gps information of the user in the time period to be identified, the original wifi data including the wifi connection time; performs data preprocessing on the original wifi data to obtain the data to be recognized; performs primary recognition on the data to be recognized according to the preset expert rule dictionary to obtain a primary recognition result; if the primary recognition result is a successful recognition, obtains the location category of the data to be recognized; if the primary recognition result is a recognition failure, inputs the data to be recognized into the pre-trained wifi recognition model to obtain a secondary recognition result that includes the location category of the data to be recognized; slices the time period to be identified according to the wifi connection time to obtain at least one wifi connection time period, and labels the wifi connection time period according to the location category of the data to be recognized to obtain the user location annotation information; and generates the user trajectory of the user according to the user location annotation information, the original wifi data and the gps information.
  • Based on deep learning technology, this method generates user location annotation information for wifi data, performs fine-grained, small-range user trajectory identification based on the user location annotation information and the wifi data, and combines GPS information for wide-area user trajectory identification to generate the user trajectory of the user. Performing user trajectory identification on multiple kinds of data improves the recognition accuracy of the user trajectory and allows the identification process to be automated.
  • the second embodiment of the user trajectory recognition device in the embodiment of the present application includes:
  • an obtaining module 601, configured to obtain the original wifi data and gps information of the user in the time period to be identified, the original wifi data including the wifi connection time;
  • a preprocessing module 602, configured to perform data preprocessing on the original wifi data to obtain the data to be recognized;
  • a primary recognition module 603, configured to perform primary recognition on the data to be recognized according to the preset expert rule dictionary to obtain a primary recognition result, and, when the primary recognition result is a successful recognition, to obtain the location category of the data to be recognized;
  • a secondary recognition module 604, configured to input the data to be recognized into a pre-trained wifi recognition model when the primary recognition result is a recognition failure, to obtain a secondary recognition result that includes the location category of the data to be recognized;
  • a labeling module 605, configured to slice the time period to be identified according to the wifi connection time to obtain at least one wifi connection time period, and to label the wifi connection time period according to the location category of the data to be recognized to obtain user location annotation information;
  • a trajectory drawing module 606 configured to generate a user trajectory of the user according to the user location annotation information, the original wifi data and the GPS information.
  • The preprocessing module 602 includes: a data cleaning unit 6021, configured to perform data cleaning on the wifi name data to obtain a data cleaning result; a word segmentation unit 6022, configured to perform word segmentation on the wifi name data in the data cleaning result to obtain a wifi word segmentation array; and a removing unit 6023, configured to remove stop words from the wifi word segmentation array to obtain the data to be recognized.
  • The word segmentation unit 6022 is specifically configured to: perform word segmentation on the wifi name data in the data cleaning result to obtain a sequence array; construct a directed acyclic graph of the sequence array according to a preset prefix dictionary, and calculate the probability of each path in the directed acyclic graph; and obtain the optimal word segmentation result from the path with the maximum probability in the directed acyclic graph, and use the optimal word segmentation result to segment the wifi name data in the data cleaning result to obtain a wifi word segmentation array.
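The dictionary-based segmentation described above can be sketched with a small dynamic program: build a directed acyclic graph of candidate words, then pick the path with the highest probability. This is a simplified illustration, not the applicant's implementation. A plain word-frequency dictionary stands in for the prefix dictionary, characters not covered by any word fall back to single-character tokens, and the dictionary contents are hypothetical.

```python
import math
from typing import Dict, List, Tuple


def build_dag(text: str, freq: Dict[str, int]) -> Dict[int, List[int]]:
    """Map each start index to the end indices of dictionary words."""
    dag: Dict[int, List[int]] = {}
    for i in range(len(text)):
        ends = [j for j in range(i, len(text)) if freq.get(text[i:j + 1], 0) > 0]
        dag[i] = ends or [i]  # unknown character: single-character token
    return dag


def segment(text: str, freq: Dict[str, int]) -> List[str]:
    """Return the segmentation along the maximum-probability path."""
    n = len(text)
    log_total = math.log(sum(freq.values()))
    dag = build_dag(text, freq)
    # route[i] = (best log-probability of text[i:], end index of first word)
    route: Dict[int, Tuple[float, int]] = {n: (0.0, 0)}
    for i in range(n - 1, -1, -1):
        route[i] = max(
            (math.log(freq.get(text[i:j + 1], 0) or 1) - log_total
             + route[j + 1][0], j)
            for j in dag[i]
        )
    words: List[str] = []
    i = 0
    while i < n:
        j = route[i][1]
        words.append(text[i:j + 1])
        i = j + 1
    return words


# Hypothetical frequency dictionary for wifi name fragments.
freq = {"star": 5, "bucks": 5, "starbucks": 20, "wifi": 30}
tokens = segment("starbuckswifi", freq)  # ['starbucks', 'wifi']
```

Working from the end of the string backwards means each position's best continuation is already known when it is needed, so every path in the graph is scored without enumerating them one by one.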
  • The primary recognition module 603 is specifically configured to: match the data to be recognized against the location words in the expert rule dictionary; if the matching succeeds, use the location category corresponding to the matched location word as the primary recognition result; if the matching fails, set the primary recognition result to recognition failure.
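The two-stage flow (dictionary match first, model only on failure) can be sketched as below. The dictionary entries, the function names and the stand-in model are hypothetical; in the application the fallback is the pre-trained wifi recognition model.

```python
from typing import Callable, Dict, Optional, Sequence, Tuple


def primary_recognition(tokens: Sequence[str],
                        rule_dictionary: Dict[str, str]) -> Optional[str]:
    """Return the category of the first token found in the expert rule
    dictionary, or None to signal a recognition failure."""
    for token in tokens:
        category = rule_dictionary.get(token)
        if category is not None:
            return category
    return None


def recognize(tokens: Sequence[str],
              rule_dictionary: Dict[str, str],
              model: Callable[[Sequence[str]], str]) -> Tuple[str, str]:
    """Try the dictionary first and fall back to the model on failure."""
    category = primary_recognition(tokens, rule_dictionary)
    if category is not None:
        return category, "primary"
    return model(tokens), "secondary"


# Hypothetical dictionary and a stand-in for the trained model.
rules = {"starbucks": "restaurant", "hotel": "hotel"}
fallback_model = lambda tokens: "residence"
```

Returning which stage produced the label makes it easy to measure how often the dictionary alone is sufficient.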
  • The secondary recognition module 604 is specifically configured to: input the data to be recognized into the word vector layer in the wifi recognition model and convert the data to be recognized into a word vector sequence; input the word vector sequence into the maximum pooling layer in the wifi recognition model to obtain the maximum pooling result; and input the maximum pooling result into the fully connected hidden layer in the wifi recognition model and classify the output of the fully connected hidden layer with the Softmax function to obtain the secondary recognition result.
  • The user trajectory identification device further includes a model training module 607, which is specifically configured to: acquire historical wifi data and a preset neural network model, and initialize the network parameters of the word vector layer, the maximum pooling layer and the fully connected hidden layer in the neural network model, the historical wifi data including manually labeled location categories; input the historical wifi data into the neural network model to obtain predicted location categories; calculate a preset loss function from the manually labeled location categories and the location categories predicted by the neural network model from the historical wifi data to obtain a loss value, and judge whether the loss value is less than a preset threshold; if so, determine the wifi recognition model according to the network parameters of the word vector layer, the maximum pooling layer and the fully connected hidden layer in the neural network model; if not, update the network parameters of the neural network model by the backpropagation algorithm according to the loss value and repeat the model training process until the loss value is less than the preset threshold, then determine the wifi recognition model according to the network parameters of the word vector layer, the maximum pooling layer and the fully connected hidden layer at that point.
  • The labeling module 605 is specifically configured to: slice the time period to be identified according to the wifi connection time of the original wifi data to obtain at least one wifi connection time period; determine the location category corresponding to each piece of original wifi data according to the correspondence between each piece of original wifi data and each piece of data to be recognized; and label the wifi connection time period according to the original wifi data and its location category to obtain the user location annotation information.
  • this embodiment describes in detail the specific functions of each module and the unit composition of some modules.
  • User location annotation information is generated for wifi data. The user location annotation information and the wifi data are used for fine-grained, small-range user trajectory identification and are combined with GPS information for wide-area user trajectory identification to generate the user trajectory of the user. Performing user trajectory identification on multiple kinds of data improves the recognition accuracy of user trajectories and allows the identification process to be automated.
  • Fig. 8 is a schematic structural diagram of a user trajectory recognition device provided by an embodiment of the present application.
  • The user trajectory recognition device 800 may vary considerably depending on configuration or performance, and may include one or more processors (central processing units, CPU) 810, a memory 820, and one or more storage media 830 (e.g., one or more mass storage devices) storing application programs 833 or data 832.
  • the memory 820 and the storage medium 830 may be temporary storage or persistent storage.
  • the program stored in the storage medium 830 may include one or more modules (not shown in the figure), and each module may include a series of instruction operations for the user trajectory recognition device 800 .
  • the processor 810 may be configured to communicate with the storage medium 830, and execute a series of instruction operations in the storage medium 830 on the user trajectory identification device 800, so as to implement the steps of the above user trajectory identification method.
  • The user trajectory recognition device 800 may also include one or more power sources 840, one or more wired or wireless network interfaces 850, one or more input/output interfaces 860, and/or one or more operating systems 831, such as Windows Server, Mac OS X, Unix, Linux, FreeBSD, etc.
  • A blockchain, essentially a decentralized database, is a series of data blocks linked to one another using cryptographic methods. Each data block contains a batch of network transaction information, which is used to verify the validity of its information (anti-counterfeiting) and to generate the next block.
  • the blockchain can include the underlying platform of the blockchain, the platform product service layer, and the application service layer.
  • the present application also provides a computer-readable storage medium.
  • the computer-readable storage medium may be a non-volatile computer-readable storage medium.
  • the computer-readable storage medium may also be a volatile computer-readable storage medium. Instructions are stored in the computer-readable storage medium, and when the instructions are run on the computer, the computer is made to execute the steps of the user trajectory identification method.
  • If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium.
  • The technical solution of the present application, in essence, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of the present application.
  • The aforementioned storage medium includes a USB flash drive (U disk), a mobile hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, an optical disc, and other media that can store program code.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Software Systems (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Remote Sensing (AREA)
  • Mathematical Physics (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Biophysics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Signal Processing (AREA)
  • Quality & Reliability (AREA)
  • Evolutionary Biology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention provides a user track recognition method, apparatus and device (800), and a storage medium, which relate to the field of data processing. The method comprises: obtaining original WiFi data and GPS information of a user in a time period to be recognized (101); performing data pre-processing on the original WiFi data to obtain data to be recognized (102); according to a preset expert rule dictionary, performing primary recognition on the data to be recognized to obtain a primary recognition result (103); if the primary recognition result is recognition failure, inputting the data to be recognized into a pretrained WiFi recognition model and obtaining a secondary recognition result (105); according to the primary recognition result or the secondary recognition result, generating user position annotation information of the user in the time period to be recognized; and generating a user track of the user according to the user position annotation information, the original WiFi data, and the GPS information (107). According to the present method, the user track of a user can be automatically recognized by means of a pre-established expert rule dictionary and a model.

Description

User trajectory recognition method, device, equipment and storage medium
This application claims priority to the Chinese patent application No. 202110732370.4, entitled "User Trajectory Identification Method, Device, Equipment, and Storage Medium", filed with the China Patent Office on June 30, 2021, the entire contents of which are incorporated into this application by reference.
Technical field
The present application relates to the field of data processing, and in particular to a user trajectory identification method, device, equipment and storage medium.
Background
The rapid development of smart terminals and positioning technology has greatly promoted the spread of location-based service applications. Today, users are the core foundation of the services many enterprises provide. Analyzing changes in a user's location makes it possible to describe the user's behavior, which is of great significance for optimizing user recommendation systems, improving enterprise service quality, and supporting smart city planning. Because a user's daily movement trajectory contains information about the user in time and space and is closely related to the user's daily behavior, research on user trajectories has long attracted the attention of scholars.
At present, the main methods used for user trajectory identification are mobile phone GPS identification and mobile phone base station identification. Identifying user trajectories through mobile phone GPS and base stations has the following deficiencies. First, because of signal quality, existing GPS and base station positioning has an error of 0-100 meters, which causes user trajectories to be judged incorrectly. Second, a single address or location may contain multiple POIs (Points of Interest), so the user's actual trajectory cannot be determined precisely.
Summary of the invention
The present application provides a user trajectory identification method, device, equipment and storage medium, which are used to solve the technical problem that existing user trajectory identification methods identify the user's actual trajectory with low accuracy.
The first aspect of the present application provides a user trajectory identification method, including: obtaining the original wifi data and gps information of a user in a time period to be identified, the original wifi data including the wifi connection time; performing data preprocessing on the original wifi data to obtain data to be recognized; performing primary recognition on the data to be recognized according to a preset expert rule dictionary to obtain a primary recognition result; if the primary recognition result is a successful recognition, obtaining the location category of the data to be recognized; if the primary recognition result is a recognition failure, inputting the data to be recognized into a pre-trained wifi recognition model to obtain a secondary recognition result, wherein the secondary recognition result includes the location category of the data to be recognized; slicing the time period to be identified according to the wifi connection time to obtain at least one wifi connection time period, and labeling the wifi connection time period according to the location category of the data to be recognized to obtain user location annotation information; and generating the user trajectory of the user according to the user location annotation information, the original wifi data and the gps information.
The second aspect of the present application provides user trajectory identification equipment, including a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, wherein the processor implements the following steps when executing the computer-readable instructions: obtaining the original wifi data and gps information of a user in a time period to be identified, the original wifi data including the wifi connection time; performing data preprocessing on the original wifi data to obtain data to be recognized; performing primary recognition on the data to be recognized according to a preset expert rule dictionary to obtain a primary recognition result; if the primary recognition result is a successful recognition, obtaining the location category of the data to be recognized; if the primary recognition result is a recognition failure, inputting the data to be recognized into a pre-trained wifi recognition model to obtain a secondary recognition result, wherein the secondary recognition result includes the location category of the data to be recognized; slicing the time period to be identified according to the wifi connection time to obtain at least one wifi connection time period, and labeling the wifi connection time period according to the location category of the data to be recognized to obtain user location annotation information; and generating the user trajectory of the user according to the user location annotation information, the original wifi data and the gps information.
The third aspect of the present application provides a computer-readable storage medium storing computer instructions which, when run on a computer, cause the computer to perform the following steps: obtaining the original wifi data and gps information of a user in a time period to be identified, the original wifi data including the wifi connection time; performing data preprocessing on the original wifi data to obtain data to be recognized; performing primary recognition on the data to be recognized according to a preset expert rule dictionary to obtain a primary recognition result; if the primary recognition result is a successful recognition, obtaining the location category of the data to be recognized; if the primary recognition result is a recognition failure, inputting the data to be recognized into a pre-trained wifi recognition model to obtain a secondary recognition result, wherein the secondary recognition result includes the location category of the data to be recognized; slicing the time period to be identified according to the wifi connection time to obtain at least one wifi connection time period, and labeling the wifi connection time period according to the location category of the data to be recognized to obtain user location annotation information; and generating the user trajectory of the user according to the user location annotation information, the original wifi data and the gps information.
The fourth aspect of the present application provides a user trajectory identification device, including: an obtaining module, configured to obtain the original wifi data and gps information of a user in a time period to be identified, the original wifi data including the wifi connection time; a preprocessing module, configured to perform data preprocessing on the original wifi data to obtain data to be recognized; a primary recognition module, configured to perform primary recognition on the data to be recognized according to a preset expert rule dictionary to obtain a primary recognition result, and, when the primary recognition result is a successful recognition, to obtain the location category of the data to be recognized; a secondary recognition module, configured to input the data to be recognized into a pre-trained wifi recognition model when the primary recognition result is a recognition failure, to obtain a secondary recognition result, wherein the secondary recognition result includes the location category of the data to be recognized; a labeling module, configured to slice the time period to be identified according to the wifi connection time to obtain at least one wifi connection time period, and to label the wifi connection time period according to the location category of the data to be recognized to obtain user location annotation information; and a trajectory drawing module, configured to generate the user trajectory of the user according to the user location annotation information, the original wifi data and the gps information.
In the technical solution provided by the present application, the original wifi data and gps information of the user in the time period to be identified are obtained, the original wifi data including the wifi connection time; data preprocessing is performed on the original wifi data to obtain the data to be recognized; primary recognition is performed on the data to be recognized according to the preset expert rule dictionary to obtain a primary recognition result; if the primary recognition result is a successful recognition, the location category of the data to be recognized is obtained; if the primary recognition result is a recognition failure, the data to be recognized is input into the pre-trained wifi recognition model to obtain a secondary recognition result, which includes the location category of the data to be recognized; the time period to be identified is sliced according to the wifi connection time to obtain at least one wifi connection time period, and the wifi connection time period is labeled according to the location category of the data to be recognized to obtain the user location annotation information; and the user trajectory of the user is generated according to the user location annotation information, the original wifi data and the gps information. Based on deep learning technology, this method generates user location annotation information for wifi data, performs fine-grained, small-range user trajectory identification based on the user location annotation information and the wifi data, and combines gps information for wide-area user trajectory identification to generate the user trajectory of the user. Performing user trajectory identification on multiple kinds of data improves the recognition accuracy of the user trajectory and allows the identification process to be automated.
Description of drawings
Fig. 1 is a schematic diagram of the first embodiment of the user trajectory identification method in the embodiments of the present application;
Fig. 2 is a schematic diagram of the second embodiment of the user trajectory identification method in the embodiments of the present application;
Fig. 3 is a schematic diagram of the third embodiment of the user trajectory identification method in the embodiments of the present application;
Fig. 4 is a schematic diagram of the fourth embodiment of the user trajectory identification method in the embodiments of the present application;
Fig. 5 is a schematic diagram of the fifth embodiment of the user trajectory identification method in the embodiments of the present application;
Fig. 6 is a schematic diagram of one embodiment of the user trajectory identification device in the embodiments of the present application;
Fig. 7 is a schematic diagram of another embodiment of the user trajectory identification device in the embodiments of the present application;
Fig. 8 is a schematic diagram of one embodiment of the user trajectory identification equipment in the embodiments of the present application.
Detailed description
Embodiments of the present application provide a user trajectory identification method, apparatus, device and storage medium, which address the technical problem that existing user trajectory identification approaches identify the user's actual trajectory with low accuracy.
The terms "first", "second", "third", "fourth" and so on (if any) in the specification, the claims and the above drawings of the present application are used to distinguish similar objects and do not necessarily describe a specific order or sequence. It should be understood that data used in this way are interchangeable under appropriate circumstances, so that the embodiments described herein can be implemented in an order other than the one illustrated or described here. Furthermore, the terms "comprising" and "having" and any variations thereof are intended to cover a non-exclusive inclusion: a process, method, system, product or device comprising a series of steps or units is not necessarily limited to the steps or units explicitly listed, and may include other steps or units that are not explicitly listed or that are inherent to the process, method, product or device.
For ease of understanding, the specific flow of the embodiments of the present application is described below. Referring to Fig. 1, an embodiment of the user trajectory identification method in the embodiments of the present application includes:
101. Acquire the original wifi data and GPS information of the user in the time period to be identified;
It can be understood that the execution subject of the present application may be a user trajectory identification apparatus, or a terminal or a server, which is not limited here. The embodiments of the present application are described with a server as the execution subject.
It should be emphasized that, to ensure the privacy and security of the data, the original wifi data may be stored in a node of a blockchain.
In this embodiment, the user track within a time period can be identified. The time period to be identified may be one day or one week, which is not limited in the present application.
In this embodiment, the original wifi data mainly includes, for every wifi the user connected to during the time period to be identified, the wifi name, the wifi ID (wifiid), the device number of the connected wifi device and the corresponding wifi address. Each wifi name, wifiid and device number correspond one to one. After the category of the wifi name has been recognized, the recognized category, the wifi name, the wifiid and the device number together make up the user track.
In this embodiment, the GPS information is used to draw the user's outdoor track during the time period to be identified, and the original wifi data is used for indoor positioning to draw the user's indoor track during the same period. Combining the indoor track and the outdoor track yields the user's overall track.
102. Perform data preprocessing on the original wifi data to obtain the data to be identified;
In practical applications, the collected original wifi data is not always complete: some values were never recorded, some are arbitrary numbers, and some records have data for one feature and 0 for all others. Such data does not meet the requirements of the algorithm itself. For example, in a regression model, correlated or collinear features can prevent the algorithm from converging or make it fail, so the data must be processed in advance. In this embodiment, data preprocessing mainly consists of data cleaning and word segmentation. Data cleaning mainly deletes abnormal data, invalid data and blank data. Word segmentation splits the wifi name data in the original wifi data into an array of words and removes stop words. Stop words are function words with no actual meaning, such as modal particles and punctuation; they are deleted from the wifi name array, and the remaining valid words together form the valid word group. The purpose is to reduce the amount of subsequent computation.
In this embodiment, the wifi name is segmented mainly with jieba, a Python word segmentation module that supports three segmentation modes: accurate mode, full mode and search engine mode. Stop words in the wifi name data are removed using a preset stop-word dictionary, and stop words can be added to the dictionary as needs change.
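The cleaning and stop-word removal described for step 102 can be sketched as follows. This is a minimal illustration, not the embodiment's implementation: the record field `wifi_name`, the contents of `STOP_WORDS` and the separator-based `tokenize` function are assumptions, and `tokenize` is only a standard-library stand-in for the jieba segmentation that the embodiment names.

```python
import re

# Hypothetical stop-word dictionary; the source only says it is preset
# and can be extended as needed.
STOP_WORDS = {"wifi", "free", "的", "了"}


def clean_records(records):
    """Data cleaning: drop records whose wifi name is missing, not text, or blank."""
    kept = []
    for rec in records:
        name = rec.get("wifi_name")
        if isinstance(name, str) and name.strip():
            kept.append({**rec, "wifi_name": name.strip()})
    return kept


def tokenize(name):
    """Stand-in tokenizer (assumption): split on whitespace, '_' and '-'.

    The embodiment uses jieba here; this version only keeps the sketch
    runnable with the standard library.
    """
    return [t for t in re.split(r"[\s_\-]+", name) if t]


def to_identify(name):
    """Segment a wifi name and remove stop words, leaving the valid words."""
    return [t.lower() for t in tokenize(name) if t.lower() not in STOP_WORDS]


records = [{"wifi_name": " Airport_Free-WiFi "}, {"wifi_name": ""}, {"wifi_name": None}]
cleaned = clean_records(records)
tokens = [to_identify(r["wifi_name"]) for r in cleaned]
```

Only the first record survives cleaning, and its name is reduced to the single valid word `airport`.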
103. Perform primary recognition on the data to be identified according to the preset expert rule dictionary to obtain a primary recognition result;
In this embodiment, certain specific words in a wifi name indicate its category: names containing "airport" or "railway station" are generally transportation; names containing "restaurant" are generally catering; names containing a common router brand name are generally home wifi. An expert rule dictionary is built for such special words; it contains the categories and the specific words corresponding to each category. Special dictionaries can also be built for particular application scenarios, for example a dictionary of car brand names to identify the wifi of car sales and car service venues, or a dictionary of KTV, club and equestrian venue names to identify the wifi of high-end consumption venues. The wifi name array output by the previous step is traversed, and if a rule word in the dictionary is hit, the wifi name is labeled with the corresponding category. The expert rules can classify most wifi names that contain keywords, which reduces the amount of data for subsequent modeling, and the special-rule screening for wifi at specific venues improves recognition accuracy.
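As an illustration of the dictionary lookup described above, the following sketch traverses a wifi word array and returns the first category whose rule word is hit. The dictionary contents are assumptions: the transportation and catering words come from the examples in the text, while the router brand strings and the category names are hypothetical placeholders.

```python
# Hypothetical expert rule dictionary: category -> rule words.
EXPERT_RULES = {
    "transportation": ["机场", "火车站", "airport"],
    "catering": ["饭店", "restaurant"],
    "home": ["tp-link", "xiaomi"],  # placeholder router brand names
}


def primary_recognition(tokens):
    """Return the category of the first rule word hit, or None on recognition failure."""
    for token in tokens:
        lowered = token.lower()
        for category, rule_words in EXPERT_RULES.items():
            if any(word in lowered for word in rule_words):
                return category
    return None  # no rule word hit: secondary recognition is needed
```

For example, `primary_recognition(["首都", "机场"])` yields `"transportation"`, while a name with no keyword yields `None` and would be passed on to the wifi recognition model.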
104. If the primary recognition result is a recognition success, obtain the location category of the data to be identified;
105. If the primary recognition result is a recognition failure, input the data to be identified into the pre-trained wifi recognition model to obtain a secondary recognition result, where the secondary recognition result includes the location category of the data to be identified;
In this embodiment, primary recognition is performed with the expert rule dictionary by traversing the wifi name array output in the previous step. If no rule word in the dictionary is hit, the primary recognition is determined to have failed and secondary recognition is required. Secondary recognition is performed by the preset wifi recognition model, which is built through automated training on already labeled training data, using machine learning.
In this embodiment, the wifi recognition model is a DNN (deep neural network) comprising a word vector layer, a max pooling layer, fully connected hidden layers and an output layer. The word vector layer maps the words in the wifi name array to word vectors. Max pooling is performed over the time sequence; it removes the difference in word count between samples and extracts the maximum value at each index position of the word vectors. The pooled vector is fed into two consecutive fully connected hidden layers, and the output layer finally outputs a normalized probability distribution whose values sum to 1. The output layer has multiple neurons, as many as the number of wifi categories to be classified, so the output of the x-th neuron can be regarded as the predicted probability that the wifi name data belongs to the x-th wifi category. The wifi category with the largest probability is taken as the category of the wifi name data and as the secondary recognition result.
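The forward pass of the described network can be sketched in plain Python as below. This is an illustrative sketch only: the weights are random rather than trained, bias terms are omitted, and the layer sizes, the `tanh` activation and the handling of unknown words are assumptions that the source does not specify.

```python
import math
import random


def init_model(vocab, embed_dim, hidden_dim, num_classes, seed=0):
    """Create randomly initialised weights (untrained, for illustration only)."""
    rng = random.Random(seed)

    def matrix(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

    return {
        "vocab": {word: i for i, word in enumerate(vocab)},
        "embed": matrix(len(vocab) + 1, embed_dim),  # last row: unknown words
        "w1": matrix(hidden_dim, embed_dim),
        "w2": matrix(hidden_dim, hidden_dim),
        "w_out": matrix(num_classes, hidden_dim),
    }


def dense(weights, x, activation=None):
    """Fully connected layer without bias."""
    out = [sum(w * v for w, v in zip(row, x)) for row in weights]
    return [activation(v) for v in out] if activation else out


def softmax(z):
    """Normalise scores into a probability distribution that sums to 1."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [v / total for v in exps]


def predict(model, tokens, categories):
    """Word vectors -> max pooling over time -> two hidden layers -> softmax."""
    unknown = len(model["vocab"])
    vectors = [model["embed"][model["vocab"].get(t, unknown)] for t in tokens]
    pooled = [max(column) for column in zip(*vectors)]  # max at each index position
    hidden = dense(model["w1"], pooled, math.tanh)
    hidden = dense(model["w2"], hidden, math.tanh)
    probs = softmax(dense(model["w_out"], hidden))
    best = max(range(len(probs)), key=probs.__getitem__)
    return categories[best], probs
```

Because pooling takes the maximum over all words at each vector index, inputs with different numbers of words produce a pooled vector of the same length, which is what lets the fully connected layers accept wifi names of any length.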
106. Slice the time period to be identified according to the wifi connection time to obtain at least one wifi connection time period, and label the wifi connection time period according to the location category of the data to be identified to obtain user location labeling information;
In this embodiment, the primary recognition result is obtained through the expert rule dictionary and the secondary recognition result through the wifi recognition model; both contain the category recognized for the original wifi data. The wifi name data whose category has been recognized is associated with the names of the wifi the user connected to, which identifies the category of each wifi the user connected to. Aggregating these categories yields the user location labeling information of the user. The resulting user location labeling information looks as follows:
0:00-8:00, home residence, wifiid: aaabac, device number xxxxx;
9:00, transportation, wifiid: erdfhethrh, device number xYYYY;
10:00-12:00, office/commercial building, wifiid: qegehwr, device number xYYX;
13:00, catering, wifiid: EHFDrh, device number xxxyzz;
14:00-18:00, office/commercial building, wifiid: ETHHDF, device number xxxyzzz;
18:00-22:00, entertainment facility, wifiid: ehdfhR, device number xxxyzzzz;
22:00-24:00, home residence, wifiid: erhDSG, device number xxxyyzzzz.
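The slicing and labeling of step 106 can be sketched as follows. The `Connection` record, its whole-hour fields and the `category_of` mapping are assumptions made for illustration; the source does not define a data format for the wifi connection time.

```python
from dataclasses import dataclass


@dataclass
class Connection:
    start_hour: int   # hour at which the wifi connection began
    end_hour: int     # hour at which the wifi connection ended
    wifiid: str
    device_no: str


def label_periods(connections, category_of, period=(0, 24)):
    """Slice the period to be identified by connection time and label each slice."""
    lo, hi = period
    labels = []
    for conn in sorted(connections, key=lambda c: c.start_hour):
        start, end = max(conn.start_hour, lo), min(conn.end_hour, hi)
        if start >= end:
            continue  # connection lies outside the period to be identified
        labels.append({
            "start": start,
            "end": end,
            "category": category_of.get(conn.wifiid, "unknown"),
            "wifiid": conn.wifiid,
            "device_no": conn.device_no,
        })
    return labels


connections = [
    Connection(9, 10, "erdfhethrh", "xYYYY"),
    Connection(0, 8, "aaabac", "xxxxx"),
    Connection(22, 26, "erhDSG", "xxxyyzzzz"),
]
category_of = {
    "aaabac": "home residence",
    "erdfhethrh": "transportation",
    "erhDSG": "home residence",
}
labels = label_periods(connections, category_of)
```

The result is ordered by time and clipped to the period to be identified, so the last connection above is labeled for 22 to 24 rather than 22 to 26.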
107. Generate the user track of the user according to the user location labeling information, the original wifi data and the GPS information.
In this embodiment, the user can be located through the user's GPS information, from which the user track is generated. However, because of severe signal attenuation and multipath effects, GPS does not work effectively inside buildings; errors of 0 to 100 meters occur and cause the user track to be judged incorrectly. Indoors, the wifi address information in the original wifi data can be used to draw the user's indoor track, mainly by location fingerprinting. Location fingerprinting associates a position in the real environment with some kind of "fingerprint", one unique fingerprint per position. The fingerprint may be one-dimensional or multi-dimensional: for example, if the device to be located is receiving or sending information, the fingerprint can be one or more features of that information or signal, such as the signal strength or delay of the connected wifi. The user is located indoors through the wifi address and the location fingerprint, and this is combined with outdoor GPS positioning for accurate positioning throughout.
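One common way to use location fingerprints is nearest-neighbour matching on signal strength, sketched below. The source names location fingerprinting but not a matching rule, so the Euclidean nearest-neighbour choice, the `MISSING_DBM` fill value and the example database are all assumptions.

```python
import math

MISSING_DBM = -100.0  # assumed signal strength for an access point that was not heard


def locate(observed, fingerprint_db):
    """Return the stored position whose fingerprint is closest to the observation.

    observed:       dict mapping access point id -> signal strength in dBm
    fingerprint_db: dict mapping position name -> fingerprint dict of the same shape
    """
    def distance(reference):
        access_points = set(observed) | set(reference)
        return math.sqrt(sum(
            (observed.get(ap, MISSING_DBM) - reference.get(ap, MISSING_DBM)) ** 2
            for ap in access_points
        ))

    return min(fingerprint_db, key=lambda pos: distance(fingerprint_db[pos]))


fingerprint_db = {
    "lobby": {"ap1": -40.0, "ap2": -70.0},
    "office": {"ap1": -75.0, "ap2": -45.0},
}
position = locate({"ap1": -42.0, "ap2": -68.0}, fingerprint_db)
```

A multi-dimensional fingerprint, as mentioned in the text, would simply add more features (for example delay) to each fingerprint dictionary.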
In this embodiment, after the user's indoor and outdoor tracks are combined, the places on the track where wifi was connected are labeled with the user location labeling information. The labels and the track make it possible to judge whether the user has deviated from the daily track; if so, a person associated with the user in advance is reminded. For example, if the user is a student and the user track deviates from the daily track during the time period to be identified, the parents can be reminded. In this embodiment, the user location labeling information of the user track identifies the type of wifi the user connected to; if it is not a type of wifi connected to daily, the user can be determined to have deviated from the daily track.
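The deviation check can be sketched as a comparison between the labeled categories and the set of categories the user connects to daily. The label format, the `usual_categories` set and the `notify` callback are hypothetical; the source does not say how the daily track is stored or how the reminder is delivered.

```python
def find_deviations(labels, usual_categories):
    """Return the labeled periods whose wifi category is not part of the daily track."""
    return [label for label in labels if label["category"] not in usual_categories]


def remind_if_deviating(labels, usual_categories, notify):
    """Call notify(deviations) when any period deviates; return whether it did."""
    deviations = find_deviations(labels, usual_categories)
    if deviations:
        notify(deviations)
    return bool(deviations)


sent = []
today = [
    {"start": 0, "end": 8, "category": "home residence"},
    {"start": 14, "end": 18, "category": "entertainment facility"},
]
deviated = remind_if_deviating(today, {"home residence", "school"}, sent.append)
```

In the student example from the text, `notify` would be whatever channel reaches the parents.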
In this embodiment, original wifi data and GPS information of a user in a time period to be identified are acquired, where the original wifi data includes a wifi connection time; data preprocessing is performed on the original wifi data to obtain data to be identified; primary recognition is performed on the data to be identified according to a preset expert rule dictionary to obtain a primary recognition result; if the primary recognition result is a recognition success, the location category of the data to be identified is obtained; if the primary recognition result is a recognition failure, the data to be identified is input into a pre-trained wifi recognition model to obtain a secondary recognition result, where the secondary recognition result includes the location category of the data to be identified; the time period to be identified is sliced according to the wifi connection time to obtain at least one wifi connection time period, and the wifi connection time period is labeled according to the location category of the data to be identified to obtain user location labeling information; and the user track of the user is generated according to the user location labeling information, the original wifi data and the GPS information.
Based on deep learning, the method generates user location labeling information from wifi data, performs fine-grained, small-range user track identification from the user location labeling information and the wifi data, and combines GPS information for wide-area user track identification to generate the user track of the user. Identifying the user track from several kinds of data improves identification accuracy and allows the identification process to be automated.
Referring to Fig. 2, a second embodiment of the user trajectory identification method in the embodiments of the present application includes:
201. Acquire the original wifi data and GPS information of the user in the time period to be identified;
Step 201 in this embodiment is similar to step 101 in the first embodiment and is not repeated here.
202. Perform data cleaning on the wifi name data to obtain a data cleaning result;
In this embodiment, data preprocessing mainly consists of data cleaning and word segmentation. Data cleaning mainly deletes abnormal data, invalid data and blank data. Word segmentation splits the wifi name data in the original wifi data into an array of words and removes stop words. Stop words are function words with no actual meaning, such as modal particles and punctuation; they are deleted from the wifi name array, and the remaining valid words together form the valid word group. The purpose is to reduce the amount of subsequent computation.
203. Split the wifi name data in the data cleaning result into single characters to obtain a sequence array;
204. Construct a directed acyclic graph of the sequence array according to a preset prefix dictionary, and calculate the probability of each path in the directed acyclic graph;
205. Obtain the optimal segmentation result from the path with the maximum probability in the directed acyclic graph, and segment the wifi name data in the data cleaning result according to the optimal segmentation result to obtain a wifi word array;
206. Remove the stop words from the wifi word array to obtain the data to be identified;
In this embodiment, the wifi name is segmented mainly with jieba, a Python word segmentation module that supports three segmentation modes: accurate mode, full mode and search engine mode. Stop words in the wifi name data are removed using a preset stop-word dictionary, and stop words can be added to the dictionary as needs change.
In this embodiment, the prefix dictionary is constructed from a statistical dictionary. For example, the prefixes of the word "北京大学" (Peking University) in the statistical dictionary are "北", "北京" and "北京大", and the prefix of the word "大学" (university) is "大"; "北", "北京", "北京大" and "大" are therefore taken as prefixes. The input text is then split on the basis of the prefix dictionary. For the character "去" (go) there is no prefix word, so there is only one way to split; for "北" there are three: "北", "北京" and "北京大学"; for "京" there is only one; for "大" there are two: "大" and "大学"; and so on, which gives the candidate words starting at each character. If the string to be segmented has m characters, then counting the positions to the left and right of each character gives m+1 points, numbered 0 to m. Treating the candidate words as edges, a segmentation word graph can be generated from the dictionary.
jieba records the frequency of each word (the number of occurrences divided by the total count; when the overall sample is large, this approximates the probability of the word). Once the frequency of each word is known, the segmentation path with the highest probability can be found by dynamic programming. Dynamic programming usually searches for the optimal path from left to right, but here the search runs from right to left. This is mainly because the focus of a Chinese sentence tends to fall at the end, where the main part of the sentence lies, so computing from right to left is usually more accurate than computing from left to right.
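Steps 203 to 205 can be illustrated with a small standard-library sketch of the directed acyclic graph and the right-to-left dynamic programming. It is not jieba's own code: the word counts in `FREQ` are invented for the example, the input "去北京大学" is assumed from the characters discussed above, and candidate words are found by scanning substrings rather than through a true prefix dictionary.

```python
import math

# Toy statistical dictionary with made-up occurrence counts (assumption).
FREQ = {"去": 50, "北": 10, "北京": 80, "北京大学": 30,
        "京": 5, "大": 40, "大学": 60, "学": 20}
TOTAL = sum(FREQ.values())


def build_dag(sentence):
    """For each start index i, list the end indices j where sentence[i:j+1] is a word."""
    n = len(sentence)
    dag = {}
    for i in range(n):
        ends = [j for j in range(i, n) if sentence[i:j + 1] in FREQ]
        dag[i] = ends or [i]  # an unknown single character is its own word
    return dag


def best_route(sentence, dag):
    """Right-to-left dynamic programming over log probabilities."""
    n = len(sentence)
    route = {n: (0.0, 0)}
    for i in range(n - 1, -1, -1):
        route[i] = max(
            (math.log(FREQ.get(sentence[i:j + 1], 1)) - math.log(TOTAL) + route[j + 1][0], j)
            for j in dag[i]
        )
    return route


def segment(sentence):
    """Follow the maximum-probability path to produce the word array."""
    dag = build_dag(sentence)
    route = best_route(sentence, dag)
    words, i = [], 0
    while i < len(sentence):
        j = route[i][1]
        words.append(sentence[i:j + 1])
        i = j + 1
    return words


dag = build_dag("去北京大学")
words = segment("去北京大学")
```

With these counts, the graph entry for "北" lists the three candidates "北", "北京" and "北京大学" (end indices 1, 2 and 4), and the maximum-probability path keeps "北京大学" as one word.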
207. Perform primary recognition on the data to be identified according to the preset expert rule dictionary to obtain a primary recognition result;
208. If the primary recognition result is a recognition success, obtain the location category of the data to be identified;
209. If the primary recognition result is a recognition failure, input the data to be identified into the pre-trained wifi recognition model to obtain a secondary recognition result, where the secondary recognition result includes the location category of the data to be identified;
210. Slice the time period to be identified according to the wifi connection time to obtain at least one wifi connection time period, and label the wifi connection time period according to the location category of the data to be identified to obtain user location labeling information;
211. Generate the user track of the user according to the user location labeling information, the original wifi data and the GPS information.
Steps 207-211 in this embodiment are similar to steps 103-107 in the first embodiment and are not repeated here.
On the basis of the previous embodiment, this embodiment describes in detail the process of performing data preprocessing on the original wifi data to obtain the data to be identified: data cleaning is performed on the wifi name data to obtain a data cleaning result; the wifi name data in the data cleaning result is segmented to obtain a wifi word array; and the stop words in the wifi word array are removed to obtain the data to be identified. After data cleaning, word segmentation and stop-word removal, only valid words remain; they together form the valid word group, which reduces the amount of subsequent computation and improves the efficiency of user track identification.
Referring to Fig. 3, a third embodiment of the user trajectory identification method in the embodiments of the present application includes:
301. Acquire the original wifi data and GPS information of the user in the time period to be identified;
302. Perform data preprocessing on the original wifi data to obtain the data to be identified;
Steps 301-302 in this embodiment are similar to steps 101-102 in the first embodiment and are not repeated here.
303. Match the data to be identified against the location words in the expert rule dictionary;
304. If the matching succeeds, take the location category corresponding to the location word matched by the data to be identified as the primary recognition result;
305. If the matching fails, set the primary recognition result to recognition failure;
In this embodiment, certain specific words in a wifi name indicate its category: names containing "airport" or "railway station" are generally transportation; names containing "restaurant" are generally catering; names containing a common router brand name are generally home wifi. An expert rule dictionary is built for such special words; it contains the categories and the specific words corresponding to each category. Special dictionaries can also be built for particular application scenarios, for example a dictionary of car brand names to identify the wifi of car sales and car service venues, or a dictionary of KTV, club and equestrian venue names to identify the wifi of high-end consumption venues. The expert rules can classify most wifi names that contain keywords, which reduces the amount of data for subsequent modeling, and the special-rule screening for wifi at specific venues improves recognition accuracy. The wifi name array output by the previous step is traversed: if a rule word in the dictionary is hit, the wifi name is labeled with the corresponding category; if no rule word is hit, the primary recognition is determined to have failed and secondary recognition is required.
306. If the primary recognition result is a recognition failure, input the data to be identified into the pre-trained wifi recognition model to obtain a secondary recognition result, where the secondary recognition result includes the location category of the data to be identified;
307. Slice the time period to be identified according to the wifi connection time to obtain at least one wifi connection time period, and label the wifi connection time period according to the location category of the data to be identified to obtain user location labeling information;
308. Generate the user track of the user according to the user location labeling information, the original wifi data and the GPS information.
Steps 306-308 in this embodiment are similar to steps 105-107 in the first embodiment and are not repeated here.
On the basis of the previous embodiments, this embodiment describes in detail how primary recognition is performed on the data to be identified according to the preset expert rule dictionary to obtain the primary recognition result: the data to be identified is matched against the location words in the expert rule dictionary; if the matching succeeds, the location category corresponding to the matched location word is taken as the primary recognition result; if the matching fails, the primary recognition result is set to recognition failure. The preset expert rule dictionary provides a coarse recognition of the data to be identified, and data that is recognized successfully needs no secondary recognition, which improves the efficiency of user track identification.
Referring to FIG. 4, a fourth embodiment of the user trajectory recognition method in the embodiments of the present application includes:
401. Obtain historical wifi data and a preset neural network model, and initialize the network parameters of the word vector layer, the max pooling layer, and the fully connected hidden layer in the neural network model, where the historical wifi data includes manually labeled location categories;
402. Input the historical wifi data into the neural network model to obtain predicted location categories;
403. Compute a preset loss function from the manually labeled location categories of the historical wifi data and the location categories predicted by the neural network model to obtain a loss value, and determine whether the loss value is less than a preset threshold;
404. If so, determine the wifi recognition model according to the network parameters of the word vector layer, the max pooling layer, and the fully connected hidden layer in the neural network model;
405. If not, update the network parameters of the neural network model by the backpropagation algorithm according to the loss value, and repeat the training process until the loss value is less than the preset threshold, then determine the wifi recognition model according to the network parameters of the word vector layer, the max pooling layer, and the fully connected hidden layer of the trained neural network model;
In this embodiment, a word vector model (word to vector, word2vec) is pre-trained on the historical wifi data to obtain trained word vector weights, and the word vector weights are used to initialize the parameters of the word vector layer, the convolutional network layer, and the fully connected layer of the neural network model. The parameters of the word vector layer are not updated before the nth training round, where n is a positive integer whose value is set by the technician according to the actual situation.
In this embodiment, the parameters of the convolutional network layer and the fully connected hidden layer are adjusted based on the loss value, and the parameters of the word vector layer are adjusted starting from training round n+1. The learning rate of the network layers other than the word vector layer is set to the initial learning rate, the learning rate of the word vector layer is set to a value smaller than the initial learning rate, and training of the neural network model continues until convergence, that is, until the loss value is less than the error threshold. The converged neural network model is taken as the wifi recognition model.
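The staged schedule in the two paragraphs above can be illustrated with a small helper that returns per-layer learning rates for a given round. This is a sketch under stated assumptions: rounds are counted from 1, the word vector layer is treated as frozen through round n inclusive (the text only says it is updated from round n+1), and the reduction factor `embed_scale` is a hypothetical parameter, since the application only requires a rate smaller than the initial one.

```python
def layer_learning_rates(epoch: int, freeze_epochs: int, base_lr: float,
                         embed_scale: float = 0.1) -> dict[str, float]:
    """Return learning rates for the word vector layer and the other layers.

    epoch         -- current training round, counted from 1
    freeze_epochs -- the value n: the word vector layer is not updated
                     through round n (assumption: round n is included)
    base_lr       -- the initial learning rate used by all other layers
    embed_scale   -- hypothetical factor (< 1) applied to the word vector
                     layer once it is unfrozen
    """
    if epoch < 1:
        raise ValueError("epoch is counted from 1")
    if not 0.0 < embed_scale < 1.0:
        raise ValueError("embed_scale must make the rate smaller than base_lr")
    # A rate of 0.0 means the pre-trained word2vec weights stay unchanged.
    embed_lr = 0.0 if epoch <= freeze_epochs else base_lr * embed_scale
    return {"embedding": embed_lr, "other_layers": base_lr}
```

In a real training framework the returned values would be assigned to separate parameter groups of the optimizer; that wiring is omitted here because the application does not name a framework.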
406. Obtain the user's original wifi data and GPS information for the time period to be identified;
407. Perform data preprocessing on the original wifi data to obtain the data to be identified;
408. Perform primary recognition on the data to be identified according to the preset expert rule dictionary to obtain a primary recognition result;
409. If the primary recognition result is recognition success, obtain the location category of the data to be identified;
Steps 406-409 in this embodiment are similar to steps 101-104 in the first embodiment and are not repeated here.
410. If the primary recognition result is recognition failure, input the data to be identified into the word vector layer of the wifi recognition model to convert the data to be identified into a word vector sequence;
411. Input the word vector sequence into the max pooling layer of the wifi recognition model to obtain a max pooling result;
412. Input the max pooling result into the fully connected hidden layer of the wifi recognition model, and classify the output of the fully connected hidden layer with the Softmax function to obtain a secondary recognition result, where the secondary recognition result includes the location category of the data to be identified;
In this embodiment, the wifi recognition model includes a word vector layer, a max pooling layer, fully connected hidden layers, and an output layer. To better represent the semantic relationships between words, the word vector layer first converts each word into a vector of fixed dimension. After training, the semantic similarity between two words can be expressed by the distance between their word vectors: the more similar the words are semantically, the smaller the distance. The max pooling layer removes the differences in word count between corpus samples by taking the maximum value at each index position across the word vectors, so that the vector sequence output by the word vector layer is converted into a single vector of fixed dimension. For example, if the vector sequence before max pooling is [[2,3,5],[7,3,6],[1,4,0]], the max pooling result is [7,4,6]. The pooled vector is fed into two consecutive hidden layers, which are fully connected to each other. The number of neurons in the output layer equals the number of sample categories; in a binary classification problem, for example, the output layer has 2 neurons. With the Softmax activation function, the output is a normalized probability distribution that sums to 1. The present application uses a multi-class model, so the output layer has as many neurons as there are wifi categories to classify, and the output of the x-th neuron can be regarded as the predicted probability that the wifi name data belongs to the x-th wifi category. The wifi category with the largest probability is taken as the category of the wifi name data and as the secondary recognition result.
413. Slice the time period to be identified according to the wifi connection time to obtain at least one wifi connection time period, and label the wifi connection time period according to the location category of the data to be identified to obtain user location labeling information;
414. Generate the user trajectory of the user according to the user location labeling information, the original wifi data, and the GPS information.
Building on the preceding embodiments, this embodiment describes in detail the process of inputting the data to be identified into the pre-trained wifi recognition model to obtain the secondary recognition result. The data to be identified is input into the word vector layer of the wifi recognition model and converted into a word vector sequence; the word vector sequence is input into the max pooling layer of the wifi recognition model to obtain a max pooling result; the max pooling result is input into the fully connected hidden layer of the wifi recognition model, and the output of the fully connected hidden layer is classified with the Softmax function to obtain the secondary recognition result, which includes the location category corresponding to the data to be identified. This embodiment also describes the training process of the wifi recognition model in detail. Recognizing wifi with a model built by deep learning allows the location category of the wifi data to be determined accurately, which in turn improves the efficiency of generating user trajectories.
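The pooling and classification steps summarized above can be checked with a few lines of plain Python. The sketch below covers only max pooling over a word vector sequence and the Softmax output; the word vector lookup and the fully connected hidden layers are left out, and the function names and category labels are assumptions made for illustration.

```python
import math


def max_pool(vectors: list[list[float]]) -> list[float]:
    """Take the maximum at each index position across all word vectors."""
    return [max(column) for column in zip(*vectors)]


def softmax(logits: list[float]) -> list[float]:
    """Normalize scores into a probability distribution that sums to 1."""
    shift = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(logits: list[float], categories: list[str]) -> tuple[str, float]:
    """Return the category with the largest predicted probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return categories[best], probs[best]


# The example sequence from this embodiment pools to [7, 4, 6].
pooled = max_pool([[2, 3, 5], [7, 3, 6], [1, 4, 0]])
```

Because pooling works per index position, the result has the same dimension as a single word vector regardless of how many words the wifi name contains.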
Referring to FIG. 5, a fifth embodiment of the user trajectory recognition method in the embodiments of the present application includes:
501. Obtain the user's original wifi data and GPS information for the time period to be identified;
502. Perform data preprocessing on the original wifi data to obtain the data to be identified;
503. Perform primary recognition on the data to be identified according to the preset expert rule dictionary to obtain a primary recognition result;
504. If the primary recognition result is recognition success, obtain the location category of the data to be identified;
505. If the primary recognition result is recognition failure, input the data to be identified into the pre-trained wifi recognition model to obtain a secondary recognition result;
Steps 501-505 in this embodiment are similar to steps 101-105 in the first embodiment and are not repeated here.
506. Obtain the location categories of all original wifi data in the time period to be identified according to the primary recognition result or the secondary recognition result;
507. Slice the time period to be identified according to the wifi connection time of the original wifi data to obtain at least one wifi connection time period;
508. Generate the user location labeling information for the time period to be identified according to the wifi connection time periods corresponding to all the original wifi data and the location categories corresponding to all the original wifi data;
In this embodiment, the user may connect to multiple wifi networks within the time period to be identified, and there may be gaps between the connections. For example, if the time period to be identified is a single day, the user's wifi connection times might be 0:00-8:00, 9:00, 10:00-12:00, 13:00, 14:00-18:00, 18:00-22:00, and 22:00-24:00. When the user connects to a wifi network, the wifi connection time is stored as part of the original wifi data, and the wifi name data whose type has already been identified is associated with the name of the wifi the user connected to, so that the category of each wifi the user connected to can be identified. Aggregating the categories of the wifi networks the user connected to yields the user location labeling information. The identified user location labeling information is as follows:
0:00-8:00, family residence, wifiid: aaabac, device number: xxxxx;
9:00, transportation, wifiid: erdfhethrh, device number: xYYYY;
10:00-12:00, office/commercial building, wifiid: qegehwr, device number: xYYX;
13:00, dining, wifiid: EHFDrh, device number: xxxyzz;
14:00-18:00, office/commercial building, wifiid: ETHHDF, device number: xxxyzzz;
18:00-22:00, entertainment facility, wifiid: ehdfhR, device number: xxxyzzzz;
22:00-24:00, family residence, wifiid: erhDSG, device number: xxxyyzzzz.
509. Generate the user trajectory of the user according to the user location labeling information, the original wifi data, and the GPS information.
Step 509 in this embodiment is similar to step 107 in the first embodiment and is not repeated here.
Building on the preceding embodiments, this embodiment describes in detail how the user location labeling information for the time period to be identified is generated according to the primary recognition result or the secondary recognition result. The location categories of all original wifi data in the time period to be identified are obtained from the primary recognition result or the secondary recognition result; the time period to be identified is sliced according to the wifi connection time of the original wifi data to obtain at least one wifi connection time period; and the user location labeling information for the time period to be identified is generated according to the wifi connection time periods and the location categories corresponding to all the original wifi data. With this method, the location type of the wifi data can be labeled, which in turn gives the location type of each point in the user trajectory.
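The slicing and labeling described above can be sketched as follows. The record layout (`Session`), the function name `label_sessions`, and the "unknown" fallback label are assumptions introduced for this example; the application does not specify a data structure. Whole hours are used only to mirror the example in this embodiment.

```python
from dataclasses import dataclass


@dataclass
class Session:
    """One wifi connection time period taken from the original wifi data."""
    start_hour: int
    end_hour: int
    wifi_id: str


def label_sessions(sessions: list[Session],
                   categories: dict[str, str]) -> list[tuple[int, int, str, str]]:
    """Order the connection periods by start time and attach a category.

    `categories` maps a wifi id to the location category obtained from
    primary or secondary recognition. Ids without a recognized category
    are labeled "unknown" (an assumption; the source does not cover this).
    """
    labeled = []
    for session in sorted(sessions, key=lambda s: s.start_hour):
        category = categories.get(session.wifi_id, "unknown")
        labeled.append((session.start_hour, session.end_hour,
                        category, session.wifi_id))
    return labeled
```

Each returned tuple corresponds to one line of user location labeling information, for example a period from 0 to 8 labeled as family residence for wifi id `aaabac`.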
The user trajectory recognition method in the embodiments of the present application has been described above; the user trajectory recognition apparatus in the embodiments of the present application is described below. Referring to FIG. 6, one embodiment of the user trajectory recognition apparatus in the embodiments of the present application includes:
An obtaining module 601, configured to obtain the user's original wifi data and GPS information for the time period to be identified, where the original wifi data includes the wifi connection time;
A preprocessing module 602, configured to perform data preprocessing on the original wifi data to obtain the data to be identified;
A primary recognition module 603, configured to perform primary recognition on the data to be identified according to the preset expert rule dictionary to obtain a primary recognition result, and to obtain the location category of the data to be identified when the primary recognition result is recognition success;
A secondary recognition module 604, configured to input the data to be identified into the pre-trained wifi recognition model when the primary recognition result is recognition failure, to obtain a secondary recognition result, where the secondary recognition result includes the location category of the data to be identified;
A labeling module 605, configured to slice the time period to be identified according to the wifi connection time to obtain at least one wifi connection time period, and to label the wifi connection time period according to the location category of the data to be identified to obtain user location labeling information;
A trajectory drawing module 606, configured to generate the user trajectory of the user according to the user location labeling information, the original wifi data, and the GPS information.
It should be emphasized that, to ensure the privacy and security of the data, the original wifi data may be stored in a node of a blockchain.
In this embodiment of the present application, the user trajectory recognition apparatus runs the user trajectory recognition method described above. The apparatus obtains the user's original wifi data and GPS information for the time period to be identified, where the original wifi data includes the wifi connection time; performs data preprocessing on the original wifi data to obtain the data to be identified; performs primary recognition on the data to be identified according to the preset expert rule dictionary to obtain a primary recognition result; if the primary recognition result is recognition success, obtains the location category of the data to be identified; if the primary recognition result is recognition failure, inputs the data to be identified into the pre-trained wifi recognition model to obtain a secondary recognition result, where the secondary recognition result includes the location category of the data to be identified; slices the time period to be identified according to the wifi connection time to obtain at least one wifi connection time period, and labels the wifi connection time period according to the location category of the data to be identified to obtain user location labeling information; and generates the user trajectory of the user according to the user location labeling information, the original wifi data, and the GPS information. Based on deep learning technology, the method generates user location labeling information from wifi data, performs fine-grained, small-range trajectory recognition from the user location labeling information and the wifi data, and combines GPS information for wide-area trajectory recognition to generate the user trajectory. Recognizing the trajectory from multiple kinds of data improves recognition accuracy and allows the trajectory recognition process to be automated.
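The two-stage flow summarized above, dictionary first and model only on failure, can be sketched as a small dispatcher. The function name `recognize` and the convention that a failed primary recognition is signaled by `None` are assumptions for this illustration; the two recognizers are passed in as callables so that no particular dictionary or model implementation is implied.

```python
from typing import Callable, Optional


def recognize(tokens: list[str],
              primary: Callable[[list[str]], Optional[str]],
              secondary: Callable[[list[str]], str]) -> tuple[str, str]:
    """Run primary recognition and fall back to the model on failure.

    Returns the location category together with the stage that produced
    it, so callers can see how often the model is actually needed.
    """
    category = primary(tokens)
    if category is not None:
        # Recognition success: the model is not invoked at all.
        return category, "primary"
    return secondary(tokens), "secondary"
```

Keeping the model call behind the dictionary check reflects the efficiency argument made in the text: data that the dictionary already resolves never reaches the more expensive second stage.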
Referring to FIG. 7, a second embodiment of the user trajectory recognition apparatus in the embodiments of the present application includes:
An obtaining module 601, configured to obtain the user's original wifi data and GPS information for the time period to be identified, where the original wifi data includes the wifi connection time;
A preprocessing module 602, configured to perform data preprocessing on the original wifi data to obtain the data to be identified;
A primary recognition module 603, configured to perform primary recognition on the data to be identified according to the preset expert rule dictionary to obtain a primary recognition result, and to obtain the location category of the data to be identified when the primary recognition result is recognition success;
A secondary recognition module 604, configured to input the data to be identified into the pre-trained wifi recognition model when the primary recognition result is recognition failure, to obtain a secondary recognition result, where the secondary recognition result includes the location category of the data to be identified;
A labeling module 605, configured to slice the time period to be identified according to the wifi connection time to obtain at least one wifi connection time period, and to label the wifi connection time period according to the location category of the data to be identified to obtain user location labeling information;
A trajectory drawing module 606, configured to generate the user trajectory of the user according to the user location labeling information, the original wifi data, and the GPS information.
The preprocessing module 602 includes: a data cleaning unit 6021, configured to perform data cleaning on the wifi name data to obtain a data cleaning result; a word segmentation unit 6022, configured to perform word segmentation on the wifi name data in the data cleaning result to obtain a wifi word segmentation array; and a removal unit 6023, configured to remove the stop words from the wifi word segmentation array to obtain the data to be identified.
Optionally, the word segmentation unit 6022 is specifically configured to: split the wifi name data in the data cleaning result into single characters to obtain a sequence array; construct a directed acyclic graph of the sequence array according to a preset prefix dictionary, and compute the probability of each path in the directed acyclic graph; and obtain the optimal segmentation from the path with the maximum probability in the directed acyclic graph, then segment the wifi name data in the data cleaning result according to the optimal segmentation to obtain the wifi word segmentation array.
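A compact sketch of dictionary-based segmentation over a directed acyclic graph is given below. It is an illustration under several assumptions: the dictionary is a plain word-to-frequency mapping rather than a true prefix dictionary, unknown characters fall back to single-character words with a count of 1, and the maximum-probability path is found by dynamic programming from the end of the string instead of enumerating every path. The function names and the sample frequencies are hypothetical.

```python
import math


def build_dag(text: str, freq: dict[str, int]) -> dict[int, list[int]]:
    """Map each start index to the end indices of dictionary words."""
    dag = {}
    for i in range(len(text)):
        ends = [j for j in range(i, len(text)) if freq.get(text[i:j + 1], 0) > 0]
        dag[i] = ends or [i]  # no dictionary word: keep the single character
    return dag


def segment(text: str, freq: dict[str, int]) -> list[str]:
    """Split text along the maximum-probability path through the DAG."""
    log_total = math.log(sum(freq.values()) or 1)
    n = len(text)
    dag = build_dag(text, freq)
    # best[i] = (log-probability of the best path from i to the end,
    #            end index of the first word on that path)
    best = {n: (0.0, n)}
    for i in range(n - 1, -1, -1):
        best[i] = max(
            (math.log(freq.get(text[i:j + 1], 0) or 1) - log_total + best[j + 1][0], j)
            for j in dag[i]
        )
    words, i = [], 0
    while i < n:
        j = best[i][1]
        words.append(text[i:j + 1])
        i = j + 1
    return words
```

With a dictionary in which "starbucks" and "wifi" are far more frequent than "star" and "bucks", the string "starbuckswifi" is split into "starbucks" and "wifi" rather than three shorter words, because the two-word path has the higher probability.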
Optionally, the primary recognition module 603 is specifically configured to: match the data to be identified against the location words in the expert rule dictionary; if the match succeeds, take the location category corresponding to the matched location word as the primary recognition result; and if the match fails, set the primary recognition result to recognition failure.
Optionally, the secondary recognition module 604 is specifically configured to: input the data to be identified into the word vector layer of the wifi recognition model to convert the data to be identified into a word vector sequence; input the word vector sequence into the max pooling layer of the wifi recognition model to obtain a max pooling result; and input the max pooling result into the fully connected hidden layer of the wifi recognition model, and classify the output of the fully connected hidden layer with the Softmax function to obtain the secondary recognition result.
Optionally, the user trajectory recognition apparatus further includes a model training module 607, which is specifically configured to: obtain historical wifi data and a preset neural network model, and initialize the network parameters of the word vector layer, the max pooling layer, and the fully connected hidden layer in the neural network model, where the historical wifi data includes manually labeled location categories; input the historical wifi data into the neural network model to obtain predicted location categories; compute a preset loss function from the manually labeled location categories of the historical wifi data and the location categories predicted by the neural network model to obtain a loss value, and determine whether the loss value is less than a preset threshold; if so, determine the wifi recognition model according to the network parameters of the word vector layer, the max pooling layer, and the fully connected hidden layer in the neural network model; and if not, update the network parameters of the neural network model by the backpropagation algorithm according to the loss value, and repeat the training process until the loss value is less than the preset threshold, then determine the wifi recognition model according to the network parameters of the word vector layer, the max pooling layer, and the fully connected hidden layer of the trained neural network model.
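The control flow of the training module, compute the loss, stop if it is below the threshold, otherwise update and repeat, can be written independently of any particular network. In the sketch below the loss computation and the parameter update are supplied as callables; the function name `train_until_threshold` and the `max_iterations` guard are assumptions added for the example, since the application does not state what happens if the threshold is never reached.

```python
from typing import Callable


def train_until_threshold(compute_loss: Callable[[], float],
                          update: Callable[[float], None],
                          threshold: float,
                          max_iterations: int = 10_000) -> tuple[int, float]:
    """Repeat loss evaluation and parameter updates until loss < threshold.

    compute_loss -- runs the forward pass and returns the current loss
    update       -- applies one backpropagation update given that loss
    Returns the number of updates performed and the final loss.
    """
    for updates_done in range(max_iterations):
        loss = compute_loss()
        if loss < threshold:
            # The current network parameters define the recognition model.
            return updates_done, loss
        update(loss)
    raise RuntimeError("loss did not fall below the threshold")
```

A toy usage would pass a closure that evaluates a simple quadratic loss and another that takes one gradient step; with a real network, the same two callables would wrap the framework's forward and backward passes.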
Optionally, the labeling module 605 is specifically configured to: slice the time period to be identified according to the wifi connection time of the original wifi data to obtain at least one wifi connection time period; determine the location category corresponding to each piece of original wifi data according to the correspondence between each piece of original wifi data and each piece of data to be identified in the time period to be identified; and label the wifi connection time period according to the location category of the original wifi data and the original wifi data to obtain the user location labeling information.
Building on the previous embodiment, this embodiment describes in detail the specific functions of each module and the unit composition of some modules. With the added module and based on deep learning technology, user location labeling information is generated from wifi data, fine-grained, small-range trajectory recognition is performed from the user location labeling information and the wifi data, and GPS information is combined for wide-area trajectory recognition to generate the user trajectory of the user. Recognizing the trajectory from multiple kinds of data improves recognition accuracy and allows the trajectory recognition process to be automated.
FIG. 6 and FIG. 7 above describe the user trajectory recognition apparatus in the embodiments of the present application in detail from the perspective of modular functional entities; the user trajectory recognition device in the embodiments of the present application is described in detail below from the perspective of hardware processing.
FIG. 8 is a schematic structural diagram of a user trajectory recognition device provided by an embodiment of the present application. The user trajectory recognition device 800 may vary considerably with configuration or performance, and may include one or more processors (central processing units, CPU) 810, a memory 820, and one or more storage media 830 (for example, one or more mass storage devices) storing application programs 833 or data 832. The memory 820 and the storage medium 830 may be transient storage or persistent storage. The program stored in the storage medium 830 may include one or more modules (not shown in the figure), and each module may include a series of instruction operations for the user trajectory recognition device 800. Further, the processor 810 may be configured to communicate with the storage medium 830 and execute the series of instruction operations in the storage medium 830 on the user trajectory recognition device 800 to implement the steps of the user trajectory recognition method described above.
The user trajectory recognition device 800 may also include one or more power supplies 840, one or more wired or wireless network interfaces 850, one or more input/output interfaces 860, and/or one or more operating systems 831, such as Windows Server, Mac OS X, Unix, Linux, or FreeBSD. Those skilled in the art will understand that the device structure shown in FIG. 8 does not limit the user trajectory recognition device provided by the present application, which may include more or fewer components than shown, combine certain components, or arrange the components differently.
本申请所指区块链是分布式数据存储、点对点传输、共识机制、加密算法等计算机技术的新型应用模式。区块链(Blockchain),本质上是一个去中心化的数据库,是一串使用密码学方法相关联产生的数据块,每一个数据块中包含了一批次网络交易的信息,用于验证其信息的有效性(防伪)和生成下一个区块。区块链可以包括区块链底层平台、平台产品服务层以及应用服务层等。The blockchain referred to in this application is a new application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanism, and encryption algorithm. Blockchain (Blockchain), essentially a decentralized database, is a series of data blocks associated with each other using cryptographic methods. Each data block contains a batch of network transaction information, which is used to verify its Validity of information (anti-counterfeiting) and generation of the next block. The blockchain can include the underlying platform of the blockchain, the platform product service layer, and the application service layer.
本申请还提供一种计算机可读存储介质,该计算机可读存储介质可以为非易失性计算机可读存储介质,该计算机可读存储介质也可以为易失性计算机可读存储介质,所述计算机可读存储介质中存储有指令,当所述指令在计算机上运行时,使得计算机执行所述用户轨迹识别方法的步骤。The present application also provides a computer-readable storage medium. The computer-readable storage medium may be a non-volatile computer-readable storage medium. The computer-readable storage medium may also be a volatile computer-readable storage medium. Instructions are stored in the computer-readable storage medium, and when the instructions are run on the computer, the computer is made to execute the steps of the user trajectory identification method.
所属领域的技术人员可以清楚地了解到,为描述的方便和简洁,上述描述的系统或装置、单元的具体工作过程,可以参考前述方法实施例中的对应过程,在此不再赘述。Those skilled in the art can clearly understand that for the convenience and brevity of description, the specific working process of the system, device, and unit described above can refer to the corresponding process in the foregoing method embodiments, and details are not repeated here.
所述集成的单元如果以软件功能单元的形式实现并作为独立的产品销售或使用时,可 以存储在一个计算机可读取存储介质中。基于这样的理解,本申请的技术方案本质上或者说对现有技术做出贡献的部分或者该技术方案的全部或部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括若干指令用以使得一台计算机设备(可以是个人计算机,服务器,或者网络设备等)执行本申请各个实施例所述方法的全部或部分步骤。而前述的存储介质包括:U盘、移动硬盘、只读存储器(read-only memory,ROM)、随机存取存储器(random access memory,RAM)、磁碟或者光盘等各种可以存储程序代码的介质。If the integrated unit is realized in the form of a software function unit and sold or used as an independent product, it can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application is essentially or part of the contribution to the prior art or all or part of the technical solution can be embodied in the form of a software product, and the computer software product is stored in a storage medium , including several instructions to make a computer device (which may be a personal computer, a server, or a network device, etc.) execute all or part of the steps of the methods described in the various embodiments of the present application. The aforementioned storage medium includes: U disk, mobile hard disk, read-only memory (read-only memory, ROM), random access memory (random access memory, RAM), magnetic disk or optical disc and other media that can store program codes. .
In summary, the above embodiments are intended only to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they may still modify the technical solutions described in the foregoing embodiments, or make equivalent replacements for some of the technical features therein, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present application.

Claims (20)

  1. A user trajectory identification method, comprising:
    acquiring original Wi-Fi data and GPS information of a user within a time period to be identified, wherein the original Wi-Fi data comprises a Wi-Fi connection time;
    performing data preprocessing on the original Wi-Fi data to obtain data to be identified;
    performing primary recognition on the data to be identified according to a preset expert rule dictionary to obtain a primary recognition result;
    if the primary recognition result is a recognition success, obtaining a location category of the data to be identified;
    if the primary recognition result is a recognition failure, inputting the data to be identified into a pre-trained Wi-Fi recognition model to obtain a secondary recognition result, wherein the secondary recognition result comprises the location category of the data to be identified;
    slicing the time period to be identified according to the Wi-Fi connection time to obtain at least one Wi-Fi connection time period, and labeling the Wi-Fi connection time period according to the location category of the data to be identified to obtain user location labeling information; and
    generating a user trajectory of the user according to the user location labeling information, the original Wi-Fi data, and the GPS information.
  2. The user trajectory identification method according to claim 1, wherein the original Wi-Fi data comprises Wi-Fi name data, and performing data preprocessing on the original Wi-Fi data to obtain the data to be identified comprises:
    performing data cleaning on the Wi-Fi name data to obtain a data cleaning result;
    performing word segmentation on the Wi-Fi name data in the data cleaning result to obtain a Wi-Fi word segmentation array;
    removing stop words from the Wi-Fi word segmentation array to obtain the data to be identified.
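The three preprocessing steps of claim 2 can be sketched as follows. This is a minimal illustration only: the cleaning rule, the whitespace tokenizer, and the stop-word set `STOP_WORDS` are assumptions, since the claims do not specify them.

```python
import re

# Hypothetical stop words; the claims do not list any.
STOP_WORDS = {"wifi", "5g", "guest"}

def clean_name(name):
    """Lowercase the name and replace every run of characters other than
    ASCII letters, digits and CJK ideographs with a single space."""
    lowered = name.lower()
    return re.sub(r"[^0-9a-z\u4e00-\u9fff]+", " ", lowered).strip()

def tokenize(cleaned):
    """Placeholder segmentation: split on whitespace (an assumption)."""
    return cleaned.split()

def preprocess(name):
    """Clean, segment, then drop stop words."""
    return [t for t in tokenize(clean_name(name)) if t not in STOP_WORDS]
```

For example, `preprocess("Starbucks_WiFi-5G")` keeps only the token that carries location information.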
  3. The user trajectory identification method according to claim 2, wherein performing word segmentation on the Wi-Fi name data in the data cleaning result to obtain the Wi-Fi word segmentation array comprises:
    splitting the Wi-Fi name data in the data cleaning result into single characters to obtain a sequence array;
    constructing a directed acyclic graph of the sequence array according to a preset prefix dictionary, and calculating the probability of each path in the directed acyclic graph;
    obtaining an optimal word segmentation result from the path with the maximum probability in the directed acyclic graph, and segmenting the Wi-Fi name data in the data cleaning result according to the optimal word segmentation result to obtain the Wi-Fi word segmentation array.
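The segmentation of claim 3 can be illustrated with a small dynamic program over a directed acyclic graph of candidate words. This is a sketch under assumptions: the frequency table `FREQ` is hypothetical, and scoring a path by the sum of log relative frequencies is one common choice that the claims do not prescribe.

```python
import math

# Hypothetical dictionary with word frequencies (not from the source).
FREQ = {"星": 5, "巴": 5, "克": 5, "星巴克": 50, "咖": 5, "啡": 5, "咖啡": 40}
TOTAL = sum(FREQ.values())

def build_dag(chars):
    """For each start index i, list every end index j such that
    chars[i..j] is a dictionary word; unknown characters stand alone."""
    n = len(chars)
    dag = {}
    for i in range(n):
        ends = [j for j in range(i, n) if "".join(chars[i:j + 1]) in FREQ]
        dag[i] = ends or [i]
    return dag

def segment(text):
    chars = list(text)  # single-character split (the sequence array)
    n = len(chars)
    dag = build_dag(chars)
    # best[i] = (log-probability of the best path from i to the end,
    #            end index of the first word on that path)
    best = {n: (0.0, n)}
    for i in range(n - 1, -1, -1):
        best[i] = max(
            (math.log(FREQ.get("".join(chars[i:j + 1]), 1) / TOTAL) + best[j + 1][0], j)
            for j in dag[i]
        )
    # Follow the maximum-probability path from left to right.
    words, i = [], 0
    while i < n:
        j = best[i][1]
        words.append("".join(chars[i:j + 1]))
        i = j + 1
    return words
```

Working from right to left means each suffix is scored once, so the best path is found without enumerating every path explicitly.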
  4. The user trajectory identification method according to any one of claims 1-3, wherein performing primary recognition on the data to be identified according to the preset expert rule dictionary to obtain the primary recognition result comprises:
    matching the data to be identified against location words in the expert rule dictionary;
    if the matching succeeds, taking the location category corresponding to the location word matched by the data to be identified as the primary recognition result;
    if the matching fails, setting the primary recognition result to recognition failure.
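The dictionary matching of claim 4 amounts to a lookup with a failure signal. A minimal sketch follows, assuming exact token matching and a hypothetical dictionary `EXPERT_RULES`; the claims do not state the matching rule or the dictionary contents.

```python
# Hypothetical expert rule dictionary: location word -> location category.
EXPERT_RULES = {
    "starbucks": "cafe",
    "hotel": "lodging",
    "酒店": "lodging",
    "airport": "transport",
}

def primary_recognition(tokens):
    """Return the category of the first token found in the dictionary,
    or None to signal recognition failure."""
    for token in tokens:
        category = EXPERT_RULES.get(token)
        if category is not None:
            return category
    return None
```

A `None` result is the case in which the data to be identified would be handed to the trained model for secondary recognition.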
  5. The user trajectory identification method according to claim 4, wherein inputting the data to be identified into the pre-trained Wi-Fi recognition model to obtain the secondary recognition result comprises:
    inputting the data to be identified into a word vector layer of the Wi-Fi recognition model to convert the data to be identified into a word vector sequence;
    inputting the word vector sequence into a max pooling layer of the Wi-Fi recognition model to obtain a max pooling result;
    inputting the max pooling result into a fully connected hidden layer of the Wi-Fi recognition model, and classifying the output of the fully connected hidden layer with a Softmax function to obtain the secondary recognition result.
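The forward pass of claim 5 (word vectors, max pooling, fully connected layer, Softmax) can be sketched in plain Python. The vocabulary, category names, dimensions, and random untrained weights below are all assumptions made for illustration; a real model would load trained parameters.

```python
import math
import random

random.seed(0)  # fixed seed so the untrained toy weights are reproducible

# Hypothetical vocabulary and categories; the claims name neither.
VOCAB = {"<unk>": 0, "starbucks": 1, "hotel": 2, "office": 3}
CATEGORIES = ["cafe", "lodging", "workplace"]
EMBED_DIM = 4

def rand_matrix(rows, cols):
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

E = rand_matrix(len(VOCAB), EMBED_DIM)       # word vector layer
W = rand_matrix(len(CATEGORIES), EMBED_DIM)  # fully connected hidden layer weights
B = [0.0] * len(CATEGORIES)                  # fully connected hidden layer biases

def softmax(logits):
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def secondary_recognition(tokens):
    # Unknown tokens (and an empty input) map to the <unk> vector.
    ids = [VOCAB.get(t, VOCAB["<unk>"]) for t in tokens] or [VOCAB["<unk>"]]
    vectors = [E[i] for i in ids]                        # word vector sequence
    pooled = [max(column) for column in zip(*vectors)]   # max pooling over tokens
    logits = [sum(w * x for w, x in zip(row, pooled)) + b for row, b in zip(W, B)]
    probs = softmax(logits)
    return CATEGORIES[probs.index(max(probs))], probs
```

Max pooling over the token axis yields a fixed-length vector regardless of how many tokens a Wi-Fi name produces, which is what lets a single fully connected layer classify names of any length.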
  6. The user trajectory identification method according to claim 5, wherein the Wi-Fi recognition model is trained through the following steps:
    acquiring historical Wi-Fi data and a preset neural network model, and initializing network parameters of a word vector layer, a max pooling layer, and a fully connected hidden layer in the neural network model, wherein the historical Wi-Fi data comprises manually labeled location categories;
    inputting the historical Wi-Fi data into the neural network model to obtain predicted location categories;
    computing a preset loss function from the manually labeled location categories of the historical Wi-Fi data and the location categories predicted by the neural network model to obtain a loss value, and determining whether the loss value is less than a preset threshold;
    if so, determining the Wi-Fi recognition model from the network parameters of the word vector layer, the max pooling layer, and the fully connected hidden layer in the neural network model;
    if not, updating the network parameters of the neural network model through a backpropagation algorithm according to the loss value, iterating the model training process until the loss value is less than the preset threshold, and determining the Wi-Fi recognition model from the network parameters of the word vector layer, the max pooling layer, and the fully connected hidden layer of the trained neural network model.
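The training procedure of claim 6 (compute the loss, compare it with a threshold, otherwise backpropagate and repeat) can be sketched on a toy model. Everything concrete here is an assumption: the four labeled samples, the cross-entropy loss, plain stochastic gradient descent, the learning rate, and the threshold value are illustrative choices not given in the source.

```python
import math
import random

random.seed(0)

# Hypothetical manually labeled historical data: (tokens, category index).
VOCAB = {"starbucks": 0, "coffee": 1, "hotel": 2, "inn": 3}
CATEGORIES = ["cafe", "lodging"]
DIM = 3
DATA = [(["starbucks", "coffee"], 0), (["hotel", "inn"], 1),
        (["coffee"], 0), (["inn"], 1)]

# Randomly initialised parameters: word vectors E and dense weights W.
E = [[random.uniform(-0.5, 0.5) for _ in range(DIM)] for _ in VOCAB]
W = [[random.uniform(-0.5, 0.5) for _ in range(DIM)] for _ in CATEGORIES]

def forward(tokens):
    ids = [VOCAB[t] for t in tokens]
    # For each dimension, remember which token supplies the maximum,
    # because only that entry receives a gradient through max pooling.
    winners = [max(ids, key=lambda i: E[i][d]) for d in range(DIM)]
    pooled = [E[winners[d]][d] for d in range(DIM)]
    logits = [sum(w * x for w, x in zip(row, pooled)) for row in W]
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return winners, pooled, [e / total for e in exps]

def mean_loss():
    """Mean cross-entropy between predictions and manual labels."""
    return sum(-math.log(forward(t)[2][y]) for t, y in DATA) / len(DATA)

def train(threshold=0.05, lr=0.5, max_iters=5000):
    for iteration in range(max_iters):
        loss = mean_loss()
        if loss < threshold:          # stopping rule from the claim
            return iteration, loss
        for tokens, label in DATA:    # one backpropagation pass
            winners, pooled, probs = forward(tokens)
            d_logits = [p - (1.0 if k == label else 0.0) for k, p in enumerate(probs)]
            # Gradient w.r.t. the pooled vector, taken before W is updated.
            d_pooled = [sum(d_logits[k] * W[k][d] for k in range(len(W)))
                        for d in range(DIM)]
            for k in range(len(W)):
                for d in range(DIM):
                    W[k][d] -= lr * d_logits[k] * pooled[d]
            for d in range(DIM):
                E[winners[d]][d] -= lr * d_pooled[d]
    return max_iters, mean_loss()
```

The `max_iters` cap is an added safeguard: the claim iterates until the loss falls below the threshold, but a practical loop also needs a bound in case that never happens.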
  7. The user trajectory identification method according to claim 5, wherein slicing the time period to be identified according to the Wi-Fi connection time to obtain at least one Wi-Fi connection time period, and labeling the Wi-Fi connection time period according to the location category of the data to be identified to obtain the user location labeling information comprises:
    slicing the time period to be identified according to the Wi-Fi connection time of the original Wi-Fi data to obtain at least one Wi-Fi connection time period;
    determining the location category corresponding to each piece of original Wi-Fi data according to the correspondence between each piece of original Wi-Fi data and each piece of data to be identified in the time period to be identified;
    labeling the Wi-Fi connection time period according to the location category of the original Wi-Fi data and the original Wi-Fi data to obtain the user location labeling information.
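The slicing and labeling of claim 7 can be sketched as follows. The sketch assumes that each Wi-Fi connection lasts until the next connection time (or the end of the period); the claims do not say how a segment's end is determined, and the record layout is hypothetical.

```python
from datetime import datetime

def label_segments(period_start, period_end, connections):
    """Slice [period_start, period_end) at each Wi-Fi connection time.

    connections: list of (connect_time, wifi_name, location_category).
    Assumption: a connection is taken to last until the next connection
    time, or until period_end for the last one.
    """
    ordered = sorted(
        (c for c in connections if period_start <= c[0] < period_end),
        key=lambda c: c[0],
    )
    segments = []
    for idx, (start, name, category) in enumerate(ordered):
        end = ordered[idx + 1][0] if idx + 1 < len(ordered) else period_end
        segments.append(
            {"start": start, "end": end, "wifi": name, "category": category}
        )
    return segments
```

Each resulting segment carries both the original Wi-Fi name and its location category, which is the information the trajectory generation step combines with GPS data.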
  8. A user trajectory identification device, comprising a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, wherein the processor, when executing the computer-readable instructions, implements the following steps:
    acquiring original Wi-Fi data and GPS information of a user within a time period to be identified, wherein the original Wi-Fi data comprises a Wi-Fi connection time;
    performing data preprocessing on the original Wi-Fi data to obtain data to be identified;
    performing primary recognition on the data to be identified according to a preset expert rule dictionary to obtain a primary recognition result;
    if the primary recognition result is a recognition success, obtaining a location category of the data to be identified;
    if the primary recognition result is a recognition failure, inputting the data to be identified into a pre-trained Wi-Fi recognition model to obtain a secondary recognition result, wherein the secondary recognition result comprises the location category of the data to be identified;
    slicing the time period to be identified according to the Wi-Fi connection time to obtain at least one Wi-Fi connection time period, and labeling the Wi-Fi connection time period according to the location category of the data to be identified to obtain user location labeling information; and
    generating a user trajectory of the user according to the user location labeling information, the original Wi-Fi data, and the GPS information.
  9. The user trajectory identification device according to claim 8, wherein the original Wi-Fi data comprises Wi-Fi name data, and performing data preprocessing on the original Wi-Fi data to obtain the data to be identified comprises:
    performing data cleaning on the Wi-Fi name data to obtain a data cleaning result;
    performing word segmentation on the Wi-Fi name data in the data cleaning result to obtain a Wi-Fi word segmentation array;
    removing stop words from the Wi-Fi word segmentation array to obtain the data to be identified.
  10. The user trajectory identification device according to claim 9, wherein performing word segmentation on the Wi-Fi name data in the data cleaning result to obtain the Wi-Fi word segmentation array comprises:
    splitting the Wi-Fi name data in the data cleaning result into single characters to obtain a sequence array;
    constructing a directed acyclic graph of the sequence array according to a preset prefix dictionary, and calculating the probability of each path in the directed acyclic graph;
    obtaining an optimal word segmentation result from the path with the maximum probability in the directed acyclic graph, and segmenting the Wi-Fi name data in the data cleaning result according to the optimal word segmentation result to obtain the Wi-Fi word segmentation array.
  11. The user trajectory identification device according to any one of claims 8-10, wherein performing primary recognition on the data to be identified according to the preset expert rule dictionary to obtain the primary recognition result comprises:
    matching the data to be identified against location words in the expert rule dictionary;
    if the matching succeeds, taking the location category corresponding to the location word matched by the data to be identified as the primary recognition result;
    if the matching fails, setting the primary recognition result to recognition failure.
  12. The user trajectory identification device according to claim 11, wherein inputting the data to be identified into the pre-trained Wi-Fi recognition model to obtain the secondary recognition result comprises:
    inputting the data to be identified into a word vector layer of the Wi-Fi recognition model to convert the data to be identified into a word vector sequence;
    inputting the word vector sequence into a max pooling layer of the Wi-Fi recognition model to obtain a max pooling result;
    inputting the max pooling result into a fully connected hidden layer of the Wi-Fi recognition model, and classifying the output of the fully connected hidden layer with a Softmax function to obtain the secondary recognition result.
  13. The user trajectory identification device according to claim 12, wherein the Wi-Fi recognition model is trained through the following steps:
    acquiring historical Wi-Fi data and a preset neural network model, and initializing network parameters of a word vector layer, a max pooling layer, and a fully connected hidden layer in the neural network model, wherein the historical Wi-Fi data comprises manually labeled location categories;
    inputting the historical Wi-Fi data into the neural network model to obtain predicted location categories;
    computing a preset loss function from the manually labeled location categories of the historical Wi-Fi data and the location categories predicted by the neural network model to obtain a loss value, and determining whether the loss value is less than a preset threshold;
    if so, determining the Wi-Fi recognition model from the network parameters of the word vector layer, the max pooling layer, and the fully connected hidden layer in the neural network model;
    if not, updating the network parameters of the neural network model through a backpropagation algorithm according to the loss value, iterating the model training process until the loss value is less than the preset threshold, and determining the Wi-Fi recognition model from the network parameters of the word vector layer, the max pooling layer, and the fully connected hidden layer of the trained neural network model.
  14. The user trajectory identification device according to claim 12, wherein slicing the time period to be identified according to the Wi-Fi connection time to obtain at least one Wi-Fi connection time period, and labeling the Wi-Fi connection time period according to the location category of the data to be identified to obtain the user location labeling information comprises:
    slicing the time period to be identified according to the Wi-Fi connection time of the original Wi-Fi data to obtain at least one Wi-Fi connection time period;
    determining the location category corresponding to each piece of original Wi-Fi data according to the correspondence between each piece of original Wi-Fi data and each piece of data to be identified in the time period to be identified;
    labeling the Wi-Fi connection time period according to the location category of the original Wi-Fi data and the original Wi-Fi data to obtain the user location labeling information.
  15. A computer-readable storage medium storing computer instructions that, when run on a computer, cause the computer to perform the following steps:
    acquiring original Wi-Fi data and GPS information of a user within a time period to be identified, wherein the original Wi-Fi data comprises a Wi-Fi connection time;
    performing data preprocessing on the original Wi-Fi data to obtain data to be identified;
    performing primary recognition on the data to be identified according to a preset expert rule dictionary to obtain a primary recognition result;
    if the primary recognition result is a recognition success, obtaining a location category of the data to be identified;
    if the primary recognition result is a recognition failure, inputting the data to be identified into a pre-trained Wi-Fi recognition model to obtain a secondary recognition result, wherein the secondary recognition result comprises the location category of the data to be identified;
    slicing the time period to be identified according to the Wi-Fi connection time to obtain at least one Wi-Fi connection time period, and labeling the Wi-Fi connection time period according to the location category of the data to be identified to obtain user location labeling information; and
    generating a user trajectory of the user according to the user location labeling information, the original Wi-Fi data, and the GPS information.
  16. The computer-readable storage medium according to claim 15, wherein the original Wi-Fi data comprises Wi-Fi name data, and performing data preprocessing on the original Wi-Fi data to obtain the data to be identified comprises:
    performing data cleaning on the Wi-Fi name data to obtain a data cleaning result;
    performing word segmentation on the Wi-Fi name data in the data cleaning result to obtain a Wi-Fi word segmentation array;
    removing stop words from the Wi-Fi word segmentation array to obtain the data to be identified.
  17. The computer-readable storage medium according to claim 16, wherein performing word segmentation on the Wi-Fi name data in the data cleaning result to obtain the Wi-Fi word segmentation array comprises:
    splitting the Wi-Fi name data in the data cleaning result into single characters to obtain a sequence array;
    constructing a directed acyclic graph of the sequence array according to a preset prefix dictionary, and calculating the probability of each path in the directed acyclic graph;
    obtaining an optimal word segmentation result from the path with the maximum probability in the directed acyclic graph, and segmenting the Wi-Fi name data in the data cleaning result according to the optimal word segmentation result to obtain the Wi-Fi word segmentation array.
  18. The computer-readable storage medium according to any one of claims 15-17, wherein performing primary recognition on the data to be identified according to the preset expert rule dictionary to obtain the primary recognition result comprises:
    matching the data to be identified against location words in the expert rule dictionary;
    if the matching succeeds, taking the location category corresponding to the location word matched by the data to be identified as the primary recognition result;
    if the matching fails, setting the primary recognition result to recognition failure.
  19. The computer-readable storage medium according to claim 18, wherein inputting the data to be identified into the pre-trained Wi-Fi recognition model to obtain the secondary recognition result comprises:
    inputting the data to be identified into a word vector layer of the Wi-Fi recognition model to convert the data to be identified into a word vector sequence;
    inputting the word vector sequence into a max pooling layer of the Wi-Fi recognition model to obtain a max pooling result;
    inputting the max pooling result into a fully connected hidden layer of the Wi-Fi recognition model, and classifying the output of the fully connected hidden layer with a Softmax function to obtain the secondary recognition result.
  20. A user trajectory identification apparatus, comprising:
    an acquisition module, configured to acquire original Wi-Fi data and GPS information of a user within a time period to be identified, wherein the original Wi-Fi data comprises a Wi-Fi connection time;
    a preprocessing module, configured to perform data preprocessing on the original Wi-Fi data to obtain data to be identified;
    a primary recognition module, configured to perform primary recognition on the data to be identified according to a preset expert rule dictionary to obtain a primary recognition result, and, when the primary recognition result is a recognition success, to obtain a location category of the data to be identified;
    a secondary recognition module, configured to input the data to be identified into a pre-trained Wi-Fi recognition model when the primary recognition result is a recognition failure, to obtain a secondary recognition result, wherein the secondary recognition result comprises the location category of the data to be identified;
    a labeling module, configured to slice the time period to be identified according to the Wi-Fi connection time to obtain at least one Wi-Fi connection time period, and to label the Wi-Fi connection time period according to the location category of the data to be identified to obtain user location labeling information;
    a trajectory drawing module, configured to generate a user trajectory of the user according to the user location labeling information, the original Wi-Fi data, and the GPS information.
PCT/CN2022/071481 2021-06-30 2022-01-12 User track recognition method, apparatus and device, and storage medium WO2023273298A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110732370.4A CN113177101B (en) 2021-06-30 2021-06-30 User track identification method, device, equipment and storage medium
CN202110732370.4 2021-06-30

Publications (1)

Publication Number Publication Date
WO2023273298A1 true WO2023273298A1 (en) 2023-01-05

Family

ID=76927947

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/071481 WO2023273298A1 (en) 2021-06-30 2022-01-12 User track recognition method, apparatus and device, and storage medium

Country Status (2)

Country Link
CN (1) CN113177101B (en)
WO (1) WO2023273298A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113177101B (en) * 2021-06-30 2021-11-12 平安科技(深圳)有限公司 User track identification method, device, equipment and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130165143A1 (en) * 2011-06-24 2013-06-27 Russell Ziskind Training pattern recognition systems for determining user device locations
CN108112026A (en) * 2017-12-13 2018-06-01 北京奇虎科技有限公司 WiFi recognition methods and device
CN111372193A (en) * 2020-03-06 2020-07-03 深圳市和讯华谷信息技术有限公司 Method and device for accurately positioning activity area of user in rest period
CN112653748A (en) * 2020-12-17 2021-04-13 北京三快在线科技有限公司 Information pushing method and device, electronic equipment and readable storage medium
CN113177101A (en) * 2021-06-30 2021-07-27 平安科技(深圳)有限公司 User track identification method, device, equipment and storage medium

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107657335A (en) * 2017-09-06 2018-02-02 武汉科技大学 A kind of spatial and temporal distributions Forecasting Methodology of airport traffic
CN108594170B (en) * 2018-04-04 2021-09-14 合肥工业大学 WIFI indoor positioning method based on convolutional neural network identification technology
US11455383B2 (en) * 2019-04-30 2022-09-27 TruU, Inc. Supervised and unsupervised techniques for motion classification
US10641610B1 (en) * 2019-06-03 2020-05-05 Mapsted Corp. Neural network—instantiated lightweight calibration of RSS fingerprint dataset
CN111757244B (en) * 2019-06-14 2022-03-25 广东小天才科技有限公司 Building positioning method and electronic equipment
CN111757464B (en) * 2019-06-26 2022-03-01 广东小天才科技有限公司 Region contour extraction method and device
CN111050281A (en) * 2019-12-16 2020-04-21 杭州电子科技大学 Indoor and outdoor positioning seamless connection method and system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
LIU XIANXIAN: "Research on User Mobility Model in Opportunistic Networks", CHINESE MASTER'S THESES FULL-TEXT DATABASE, 1 June 2019 (2019-06-01), pages 1 - 90, XP093020057 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116302569A (en) * 2023-05-17 2023-06-23 安世亚太科技股份有限公司 Resource partition intelligent scheduling method based on user request information
CN116302569B (en) * 2023-05-17 2023-08-15 安世亚太科技股份有限公司 Resource partition intelligent scheduling method based on user request information
CN117111540A (en) * 2023-10-25 2023-11-24 南京德克威尔自动化有限公司 Environment monitoring and early warning method and system for IO remote control bus module
CN117111540B (en) * 2023-10-25 2023-12-29 南京德克威尔自动化有限公司 Environment monitoring and early warning method and system for IO remote control bus module

Also Published As

Publication number Publication date
CN113177101A (en) 2021-07-27
CN113177101B (en) 2021-11-12

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 22831134; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 EP: PCT application non-entry in European phase (Ref document number: 22831134; Country of ref document: EP; Kind code of ref document: A1)