CN111125550B - Point-of-interest classification method, device, equipment and storage medium - Google Patents

Publication number
CN111125550B
Authority
CN
China
Prior art keywords
interest
point
information
vector
interest point
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811296498.5A
Other languages
Chinese (zh)
Other versions
CN111125550A (en
Inventor
万程
彭继东
刘鹏
杨胜文
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiments of the invention disclose a method, a device, equipment and a storage medium for classifying points of interest. The method comprises: generating an interest point name vector according to the name information of the interest point; generating an interest point label vector according to the label information of the interest point; and obtaining the category information of the interest point from the interest point name vector and the interest point label vector by using a pre-trained interest point classification model. By combining two dimensions of information, the name and the labels of the interest point, and applying the pre-trained classification model, the technical scheme improves the classification accuracy of interest points. Information can then be pushed based on the category information, so that the pushed information meets the user's needs.

Description

Point-of-interest classification method, device, equipment and storage medium
Technical Field
The embodiment of the invention relates to the technical field of artificial intelligence, in particular to a method, a device, equipment and a storage medium for classifying interest points.
Background
With the development of science and technology, electronic maps have made daily life more convenient. The image displayed in an electronic map basically consists of points, lines and planes, and points of interest, as an important part of the point data, are indispensable to the electronic map; in theory, any building, area or specific point that can be named can be represented as point-of-interest data, such as a restaurant, residential community, parking lot or bus station.
The category information of a point of interest directly affects services built on that point of interest, for example pushing information based on its category, so identifying the category is critical. Existing methods for identifying point-of-interest categories have low accuracy, so information pushed on the basis of the identified categories fails to meet users' actual needs. A new method for accurately identifying the category of a point of interest is therefore needed.
Disclosure of Invention
The embodiment of the invention provides a method, a device, equipment and a storage medium for classifying interest points, which improve the classification accuracy of the interest points.
In a first aspect, an embodiment of the present invention provides a method for classifying points of interest, where the method includes:
generating an interest point name vector according to the name information of the interest point;
generating an interest point label vector according to the label information of the interest point;
and obtaining the category information of the interest points according to the interest point name vector and the interest point label vector by adopting an interest point classification model which is obtained through training in advance.
In a second aspect, an embodiment of the present invention further provides a device for classifying points of interest, where the device includes:
the name vector generation module is used for generating an interest point name vector according to the name information of the interest point;
the tag vector generation module is used for generating a tag vector of the interest point according to the tag information of the interest point;
the category information acquisition module is used for acquiring category information of the interest points according to the interest point name vector and the interest point label vector by adopting an interest point classification model which is obtained through training in advance.
In a third aspect, an embodiment of the present invention further provides an apparatus, including:
one or more processors;
a storage means for storing one or more programs;
the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the interest point classification method according to any embodiment of the first aspect.
In a fourth aspect, an embodiment of the present invention further provides a storage medium having stored thereon a computer program which, when executed by a processor, implements the interest point classification method according to any embodiment of the first aspect.
According to the interest point classification method, device, equipment and storage medium provided by the embodiments of the invention, the name information and the label information of the interest point are processed separately to obtain the interest point name vector and the interest point label vector, and the category information of the interest point is obtained by applying a pre-trained interest point classification model to these two vectors. Because the category is determined from two dimensions of information, the name and the labels of the interest point, using the pre-trained classification model, the classification accuracy of interest points is improved. Information can then be pushed based on the category information, so that the pushed information meets the user's needs.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the embodiments will be briefly described below, it being understood that the following drawings only illustrate some embodiments of the present invention and therefore should not be considered as limiting the scope, and other related drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of a method for classifying points of interest according to a first embodiment of the present invention;
FIG. 2 is a flowchart of a method for classifying points of interest according to a second embodiment of the present invention;
FIG. 3 is a flowchart of a method for classifying points of interest according to a third embodiment of the present invention;
FIG. 4 is a block diagram of a point of interest classification device according to a fourth embodiment of the present invention;
FIG. 5 is a schematic structural diagram of an apparatus according to a fifth embodiment of the present invention.
Detailed Description
Embodiments of the present invention will be described in further detail below with reference to the drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the embodiments of the invention and are not limiting of the invention. It should be further noted that, for convenience of description, only some, but not all of the structures related to the embodiments of the present invention are shown in the drawings.
Example 1
Fig. 1 is a flowchart of a method for classifying points of interest according to the first embodiment of the present invention. This embodiment is applicable to situations in which points of interest need to be classified accurately. The method may be performed by the point of interest classification device provided by the embodiments of the present invention; the device may be implemented in software and/or hardware and may be integrated into a computing device. Referring to Fig. 1, the method specifically includes:
S110, generating an interest point name vector according to the name information of the interest point.
A point of interest is an information point: a point recorded by a navigation software provider through which the corresponding destination can be found directly in navigation software (such as an electronic map). The bubble icons displayed on an electronic map for scenic spots, government agencies, companies, malls, restaurants and the like all represent points of interest. The name information of a point of interest may include the name of the point of interest, for example "Starbucks", keywords in that name, and the like.
The point-of-interest name vector is the representation of a point of interest's name information in a vector space, and can be obtained by processing the text, for example by modeling it. For example, the name of the point of interest may first be segmented into words to extract keywords, and the keywords may then be mapped to N-dimensional real-valued vectors by a trained deep text representation model, word2vector, to obtain the point-of-interest name vector. N is typically a hyperparameter of the model.
For example, generating the interest point name vector according to the name information of the interest point may include: taking the name information of the interest point as the input of a bag-of-words model (BOW), a deep text representation model (word2vector) or a topic model (LDA) to obtain the interest point name vector.
The bag-of-words model (BOW) is a commonly used method for representing documents. It first appeared in the fields of NLP (Natural Language Processing) and IR (Information Retrieval); it ignores the grammar and word order of the text and can represent each document as an N-dimensional vector. The topic model (LDA, Latent Dirichlet Allocation) is a document topic generation model, also called a three-layer Bayesian probability model, which follows the BOW approach of representing each document as a word frequency vector. The word2vector model converts the words in a corpus into vectors.
Specifically, the name information of the interest point can be input into any one of a BOW model, a word2vector model or an LDA model; the model analyzes the input using its own parameters and outputs an interest point name vector.
Generating the interest point name vector according to the name information of the interest point may alternatively be: determining keywords from the name information of the interest point, and matching the keywords against a pre-established word-to-word-vector correspondence table to obtain the interest point name vector. The pre-established table is obtained by training a word2vector model on a massive corpus of encyclopedia entry data.
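As a minimal sketch of the bag-of-words option described above, the following Python snippet turns a pre-segmented name into a count vector over a fixed vocabulary. The function name `bag_of_words`, the vocabulary and the tokens are assumptions made for illustration; the patent does not specify them.

```python
from collections import Counter


def bag_of_words(tokens, vocabulary):
    """Return a word-count vector over a fixed vocabulary, ignoring word order."""
    counts = Counter(tokens)
    return [counts[word] for word in vocabulary]


# Hypothetical vocabulary and pre-segmented names (illustrative only).
vocabulary = ["Jili", "building", "coffee", "hotel"]
vec_a = bag_of_words(["Jili", "building"], vocabulary)
vec_b = bag_of_words(["building", "Jili"], vocabulary)  # same words, different order
```

Both calls produce the same vector, which illustrates that the representation discards word order, and the vector length equals the vocabulary size.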
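The table-lookup variant can be sketched as follows. The text does not say how the vectors of several keywords are combined into one name vector, so averaging is an assumption here, as are the function name `name_vector` and the toy 3-dimensional vectors.

```python
def name_vector(keywords, word_vectors, dim):
    """Look up each keyword in the table and average the vectors that are found.

    Averaging is an illustrative assumption; keywords missing from the table
    are skipped, and an all-zero vector is returned if none are found.
    """
    found = [word_vectors[k] for k in keywords if k in word_vectors]
    if not found:
        return [0.0] * dim
    return [sum(v[i] for v in found) / len(found) for i in range(dim)]


# Hypothetical word-to-vector table (a real table would come from word2vector training).
word_vectors = {"Jili": [0.5, 1.0, 0.0], "building": [1.5, 0.0, 1.0]}
vec = name_vector(["Jili", "building"], word_vectors, dim=3)
```

In practice the table would hold much higher-dimensional vectors; the lookup and combination steps stay the same.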
S120, generating an interest point label vector according to the label information of the interest point.
The tag information of an interest point is descriptive information related to the interest point, and can be obtained from related descriptions such as introductions, comments and business hours. An interest point may have one or more pieces of tag information. For example, the name information of the point of interest is "Jili building", and the tag information of the point of interest may include: "parking space", "variety rich", etc.
The interest point label vector is the representation of an interest point's label information in a vector space, and can be obtained by processing the label information. For example, the label vector of an interest point can be obtained by encoding its label information; for ease of computation, the label information may be encoded as a binary vector.
S130, obtaining the category information of the interest points according to the interest point name vector and the interest point label vector by adopting an interest point classification model obtained through pre-training.
The category is a classification of interest points based on their function, purpose and the like, such as mall, entertainment, accommodation, hospital or school. Correspondingly, the category information is the category to which the interest point belongs. The interest point classification model can be obtained by training a machine learning model in advance, or by training an automatic attribute switching deep belief network; it can also be a two-level or multi-level model that combines a machine learning model with an automatic attribute switching deep belief network.
The automatic attribute switching deep belief network (Attribute Gated Deep Belief Network, AG-DBN) was proposed with the aim of automatically mapping the abstraction level of an attribute to a suitable abstraction layer inside the deep structure. The model includes an adjustable attribute-layer gating mechanism that automatically introduces suitable connections between attributes and hidden layers. Compared with a convolutional neural network (Convolutional Neural Network, CNN) model, it performs well on discrete vector input and achieves higher classification accuracy; for example, the AG-DBN model can classify pictures accurately. Therefore, in this embodiment the interest point classification model is preferably obtained by training an AG-DBN model, or is a two-level or multi-level model formed by combining a machine learning model with an AG-DBN.
If the interest point classification model is obtained by training an automatic attribute switching deep belief network, the training proceeds as follows. Interest points with known category information are used as training sample data, and the name information and label information of each sample interest point are processed to obtain a sample interest point name vector and a sample interest point label vector. The sample interest point name vectors, the sample interest point label vectors and the category information of the sample interest points are input into the automatic attribute switching deep belief network for training, until the model, given the name vector and label vector of an interest point whose category is unknown, can accurately output that interest point's category information from its learned parameters. The model at that point is the interest point classification model.
For example, obtaining the category information of the interest point according to the interest point name vector and the interest point label vector by using the pre-trained interest point classification model may include: taking the interest point name vector and the interest point label vector as the input of the interest point classification model to obtain the category information of the interest point, where the interest point classification model is obtained based on automatic attribute switching deep belief network training.
Specifically, the interest point name vector and the interest point label vector are input into the interest point classification model as input variables; the model processes them using its learned parameters and outputs the category information of the interest point. Because the AG-DBN model performs well on discrete vector input and has higher classification accuracy, it can obtain the category information of the interest point more accurately than a model obtained by conventional machine learning.
According to the technical scheme provided by the embodiment of the invention, the name information and the label information of the interest point are processed separately to obtain the interest point name vector and the interest point label vector, and the category information of the interest point is obtained by applying a pre-trained interest point classification model to these two vectors. Because the category is determined from two dimensions of information, the name and the labels of the interest point, using the pre-trained classification model, the classification accuracy of interest points is improved. Information can then be pushed based on the category information, so that the pushed information meets the user's needs.
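The inference step can be illustrated with a small Python sketch. It is not an AG-DBN: a linear softmax classifier is swapped in as a stand-in so the example stays self-contained, and concatenating the two vectors is an assumption about how they are fed to the model. The weights, vector sizes and category names are all hypothetical.

```python
import math


def classify(name_vec, tag_vec, weights, categories):
    """Score each category on the concatenated input and return the best one.

    Stand-in for the pre-trained classification model (NOT an AG-DBN).
    """
    features = list(name_vec) + list(tag_vec)
    scores = [sum(w * x for w, x in zip(row, features)) for row in weights]
    # Softmax turns scores into probabilities; subtracting the peak keeps exp() stable.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(categories)), key=probs.__getitem__)
    return categories[best], probs


categories = ["mall", "office building"]
weights = [
    [0.5, 0.1, 1.0, 1.0],   # hypothetical learned weights for "mall"
    [0.5, 0.1, -1.0, 0.0],  # hypothetical learned weights for "office building"
]
name_vec = [1.0, 0.5]  # hypothetical 2-dimensional name vector
tag_vec = [1, 1]       # binary tag vector
label, probs = classify(name_vec, tag_vec, weights, categories)
```

The point of the sketch is the data flow: two separately generated vectors go in, fixed learned parameters are applied, and a single category comes out.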
Example 2
Fig. 2 is a flowchart of a method for classifying points of interest according to a second embodiment of the present invention. On the basis of the first embodiment, this embodiment further explains how the category information of the interest point is obtained from the interest point name vector and the interest point label vector by using the pre-trained interest point classification model. Referring to Fig. 2, the method specifically includes:
S210, generating an interest point name vector according to the name information of the interest point.
S220, generating an interest point label vector according to the label information of the interest point.
S230, taking the interest point name vector as the input of the first classification model to obtain preliminary classification information.
The preliminary classification information is a preliminary classification of the interest point, and may include at least one classification result. The first classification model may be trained in advance based on a machine learning model, such as a CNN model or an LSTM model. Specifically, sample interest point name vectors and the corresponding sample preliminary classification information can be used as a training sample set and input into a convolutional neural network for training; once all samples have been used for training, the first classification model is obtained. When an interest point name vector is input into the first classification model, the model evaluates it using its learned parameters and outputs the corresponding preliminary classification information.
For example, the name information of the interest point is "Jili building". Word segmentation and similar processing of "Jili building" yields the two keywords "Jili" and "building"; a word2vector model is then used to obtain the interest point name vector corresponding to "Jili building", and this vector is input into the first classification model to obtain preliminary classification information, which may include: mall, business building (or office building), etc.
Because training the two models one by one is relatively complex, the following training method can be used to reduce the training complexity: interest points with known category information are used as training sample data, and the name information and label information of each sample interest point are processed to obtain sample interest point name vectors and sample interest point label vectors; the sample interest point name vector is input into a convolutional neural network model to obtain preliminary classification information; the preliminary classification information, the sample interest point label vector and the category information of the sample interest point are input into an automatic attribute switching deep belief network model; and the two models are trained together until the automatic attribute switching deep belief network model can accurately output the category information of the interest point. Here the convolutional neural network model corresponds to the first classification model, and the automatic attribute switching deep belief network model is the second classification model. With this method, the intermediate output of the convolutional neural network model does not need to be checked; training stops as soon as the automatic attribute switching deep belief network model can accurately output the final classification result.
S240, taking the preliminary classification information and the interest point label vector as the input of a second classification model to obtain the class information of the interest point.
The second classification model is obtained based on automatic attribute switching deep belief network training. Optionally, the second classification model may be trained together with the first classification model, or trained separately. The second classification model screens the preliminary classification information output by the first classification model to obtain the category information of the interest point.
Specifically, to make the obtained category information more accurate, this scheme first obtains preliminary classification information based on the name of the interest point, and then inputs the preliminary classification information and the interest point label vector into the second classification model, obtained based on automatic attribute switching deep belief network training, which outputs the category information of the interest point.
For example, the name information of the interest point is "Jili building", and its label information includes "parking space available" and "variety rich". Inputting the interest point name vector into the first classification model yields the preliminary classification information: mall, business building (or office building). The preliminary classification information is converted into a corresponding vector, which is input into the second classification model together with the interest point label vector; the model finally outputs "mall" as the category information of the interest point. This matches the actual situation, so recognition accuracy is improved.
According to the technical scheme provided by the embodiment of the invention, the name information and the label information of the interest point are processed separately to obtain the interest point name vector and the interest point label vector, and the category information of the interest point is obtained by applying a two-stage model to these vectors: a first classification model, followed by a second classification model built on an automatic attribute switching deep belief network, which performs well on discrete vector input. Because the category is determined from two dimensions of information, the name and the labels of the interest point, using a classification model built on this network, the classification accuracy of interest points is improved. Information can then be pushed based on the category information, so that the pushed information meets the user's needs.
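The two-stage flow in this example can be sketched in Python. Neither function is a CNN or an AG-DBN: both are simple linear scorers swapped in as stand-ins, and every weight, vector and helper name (`first_stage`, `second_stage`) is hypothetical. The sketch only shows how the candidates from the first model are screened by the second model using the tag vector.

```python
def first_stage(name_vec, name_weights, top_k=2):
    """Stand-in for the first classification model: name vector -> top-k candidate categories."""
    def score(cat):
        return sum(w * x for w, x in zip(name_weights[cat], name_vec))
    ranked = sorted(name_weights, key=score, reverse=True)
    return ranked[:top_k]


def second_stage(candidates, tag_vec, tag_weights):
    """Stand-in for the second classification model: keep the candidate best supported by the tags."""
    def score(cat):
        return sum(w * t for w, t in zip(tag_weights[cat], tag_vec))
    return max(candidates, key=score)


# Hypothetical parameters; tag order is ["parking space available", "variety rich"].
name_weights = {
    "mall": [1.0, 0.0],
    "office building": [0.8, 0.2],
    "hospital": [-1.0, 0.0],
}
tag_weights = {
    "mall": [0.5, 1.0],
    "office building": [0.5, -1.0],
    "hospital": [0.5, 0.0],
}
name_vec = [1.0, 0.5]  # hypothetical name vector for "Jili building"
tag_vec = [1, 1]       # both tags are present

candidates = first_stage(name_vec, name_weights)
category = second_stage(candidates, tag_vec, tag_weights)
```

The name alone leaves two plausible categories; the "variety rich" tag is what separates a mall from an office building in this toy setup.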
Example 3
Fig. 3 is a flowchart of a method for classifying points of interest according to a third embodiment of the present invention. On the basis of the above embodiments, this embodiment further explains how the interest point tag vector is generated from the tag information of the interest point. Referring to Fig. 3, the method specifically includes:
S310, generating an interest point name vector according to the name information of the interest point.
S320, taking the total dimension of the tag information of all interest points as the dimension of the interest point tag vector.
In this embodiment, the number of pieces of tag information of an interest point is the dimension of that interest point's tag information, and the total dimension can be determined from the dimensions of the tag information of all interest points. For example, it may be the sum of those dimensions: given three interest points A, B and C, where A has 5 pieces of tag information, B has 5 and C has 4, the total dimension is 14.
To simplify subsequent calculation, the dimension of the interest point tag vector can be reduced by counting each distinct tag only once; that is, the number of distinct tags across the tag information of all interest points is taken as the total dimension. For example, if 1 tag is shared by interest points A, B and C together, 2 tags are shared by A and B, 1 tag is shared by A and C, and 1 tag is shared by B and C, then by inclusion-exclusion the total dimension is 14 - (2 + 1 + 1) + 1 = 11.
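The two ways of counting the total dimension can be checked with a short Python sketch. The tag names are placeholders chosen only so that the overlaps match the numbers in the example (A and B share 2 tags, A and C share 1, B and C share 1, and 1 tag is common to all three).

```python
# Placeholder tag sets reproducing the overlap counts from the example.
tags_a = {"t1", "t2", "a1", "a2", "a3"}  # 5 tags
tags_b = {"t1", "t2", "b1", "b2", "b3"}  # 5 tags
tags_c = {"t1", "c1", "c2", "c3"}        # 4 tags

# Plain sum of per-point tag counts (shared tags counted repeatedly).
naive_total = len(tags_a) + len(tags_b) + len(tags_c)

# Number of distinct tags: the size of the union.
distinct_total = len(tags_a | tags_b | tags_c)

# The same number via inclusion-exclusion.
inclusion_exclusion = (
    naive_total
    - len(tags_a & tags_b) - len(tags_a & tags_c) - len(tags_b & tags_c)
    + len(tags_a & tags_b & tags_c)
)
```

Here `naive_total` is 14 while `distinct_total` and `inclusion_exclusion` are both 11, matching the figures given in the text.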
Specifically, the dimension total amount can be determined based on the dimension of the tag information of each interest point, and the dimension total amount is used as the dimension of the interest point tag vector corresponding to the tag information of each interest point.
S330, for each dimension in the interest point tag vector, if the interest point has tag information of the dimension, determining that the value in the dimension in the interest point tag vector is a first numerical value; otherwise, determining the value in the dimension as a second numerical value.
The first value is a preset value assigned to a dimension of the interest point tag vector when the interest point has the tag information of that dimension; correspondingly, the second value is a preset value assigned to a dimension when the interest point does not have the tag information of that dimension. Optionally, the first value differs from the second value. For ease of calculation, the first value may be set to 1 and the second value to 0.
For example, let the vector $b$ represent the interest point tag vector, and number all tag information as $b_i$. If a given interest point contains tag information $b_i$, then $b_i = 1$; otherwise $b_i = 0$, where $i$ ranges from 0 to $n$. The tag information may be numbered in ascending order following the order of the interest points, or numbered according to a predetermined order.
It should be noted that processing the tag information of the interest point through steps S320 and S330 yields a binary interest point tag vector, which reduces the complexity of training the model used to obtain the category information of the interest point.
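Steps S320 and S330 amount to a multi-hot encoding, which can be sketched as below. The function name `tag_vector`, the tag list (including the extra tag "open 24 hours") and the numbering order are assumptions for illustration; the first and second values default to 1 and 0 as suggested in the text.

```python
def tag_vector(poi_tags, tag_index, present=1, absent=0):
    """Build a tag vector: `present` where the point of interest has the tag, else `absent`."""
    vec = [absent] * len(tag_index)
    for tag in poi_tags:
        if tag in tag_index:
            vec[tag_index[tag]] = present
    return vec


# Hypothetical global tag list; each distinct tag gets one dimension (S320).
all_tags = ["parking space available", "variety rich", "open 24 hours"]
tag_index = {tag: i for i, tag in enumerate(all_tags)}

# Encode one point of interest against the global numbering (S330).
vec = tag_vector({"parking space available", "variety rich"}, tag_index)
```

Every point of interest is encoded against the same numbering, so all tag vectors have the same length and can be fed to the same model.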
S340, obtaining the category information of the interest points according to the interest point name vector and the interest point label vector by adopting the interest point classification model obtained through pre-training.
According to the technical scheme provided by the embodiment of the invention, the name information and the label information of the interest point are processed separately to obtain the interest point name vector and the interest point label vector, and the category information of the interest point is obtained by applying a pre-trained interest point classification model to these two vectors. Because the category is determined from two dimensions of information, the name and the labels of the interest point, using the pre-trained classification model, the classification accuracy of interest points is improved. Information can then be pushed based on the category information, so that the pushed information meets the user's needs.
Example 4
Fig. 4 is a block diagram of a point of interest classification device according to a fourth embodiment of the present invention. The device can execute the point of interest classification method provided by any embodiment of the present invention, and has the functional modules and beneficial effects corresponding to the executed method. As shown in Fig. 4, the device may include:
a name vector generation module 410, configured to generate a point of interest name vector according to name information of the point of interest;
the tag vector generation module 420 is configured to generate a tag vector of the interest point according to tag information of the interest point;
the category information obtaining module 430 is configured to obtain category information of the point of interest according to the point of interest name vector and the point of interest tag vector by using the point of interest classification model obtained through training in advance.
According to the technical scheme provided by the embodiment of the invention, the name information and the label information of the interest point are respectively processed to obtain the name vector of the interest point and the label vector of the interest point, and the category information of the interest point is obtained by training the name vector of the interest point and the label vector of the interest point by adopting a pre-trained interest point classification model. According to the method, the category information of the interest points is determined by combining the two dimensional information of the names and the labels of the interest points and adopting the interest point classification model obtained through pre-training, so that the classification accuracy of the interest points is improved. And then information pushing is carried out based on the category information of the interest points, so that the pushed information meets the requirement of the user.
Illustratively, the category information acquisition module 430 may be further configured to:
take the point-of-interest name vector and the point-of-interest label vector as input of the point-of-interest classification model to obtain the category information of the point of interest, wherein the point-of-interest classification model is obtained by training based on an automatic attribute switching deep belief network.
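The single-model variant can be pictured as concatenating the two vectors and passing the result to one classifier. The sketch below is only an illustration under stated assumptions: the function name `classify_joint`, the category list and the weight values are hypothetical, and a plain linear scorer stands in for the trained model, which the text describes as based on an automatic attribute switching deep belief network.

```python
# Hypothetical categories and hand-picked weights (not from the patent).
# Each weight row scores one category over [name features..., label features...].
CATEGORIES = ["restaurant", "hotel"]
WEIGHTS = [
    [1.0, 0.0, 0.5, 0.0],  # restaurant
    [0.0, 1.0, 0.0, 2.0],  # hotel
]


def classify_joint(name_vec, label_vec, weights, categories):
    """Concatenate both vectors and return the highest-scoring category.

    A linear scorer is used purely as a stand-in for the trained model.
    """
    features = list(name_vec) + list(label_vec)
    scores = [sum(w * x for w, x in zip(row, features)) for row in weights]
    best = max(range(len(categories)), key=scores.__getitem__)
    return categories[best]


print(classify_joint([1, 0], [0, 1], WEIGHTS, CATEGORIES))  # hotel
```

In this toy input the name features alone favour "restaurant", while the label features shift the decision to "hotel", which is the kind of effect the combination of the two information sources is meant to capture.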
Illustratively, the category information acquisition module 430 may alternatively be configured to:
take the point-of-interest name vector as input of a first classification model to obtain preliminary classification information; and
take the preliminary classification information and the point-of-interest label vector as input of a second classification model to obtain the category information of the point of interest, wherein the second classification model is obtained by training based on an automatic attribute switching deep belief network.
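The two-stage variant can be sketched as a small cascade: the first model sees only the name vector, and the second model combines the preliminary result with the label vector. Everything concrete here is an assumption for illustration: the function names, the categories and the weights are hypothetical, and both stages are simple linear scorers rather than the trained models the text refers to.

```python
CATEGORIES = ["cafe", "hotel"]           # hypothetical categories
NAME_WEIGHTS = [[0.5, 0.4], [0.4, 0.3]]  # hypothetical first-stage weights
LABEL_WEIGHTS = [[0.0, 0.3], [0.5, 0.0]]  # hypothetical second-stage weights


def first_stage(name_vec, name_weights):
    """Stand-in first model: one preliminary score per category from the name vector."""
    return [sum(w * x for w, x in zip(row, name_vec)) for row in name_weights]


def second_stage(preliminary, label_vec, label_weights, categories):
    """Stand-in second model: adjust the preliminary scores using the label vector."""
    refined = [
        p + sum(w * t for w, t in zip(row, label_vec))
        for p, row in zip(preliminary, label_weights)
    ]
    best = max(range(len(categories)), key=refined.__getitem__)
    return categories[best]


preliminary = first_stage([1, 1], NAME_WEIGHTS)  # name alone favours "cafe"
print(second_stage(preliminary, [1, 0], LABEL_WEIGHTS, CATEGORIES))  # hotel
```

The example input is chosen so that the name-only stage prefers one category and the label vector overturns it in the second stage.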
Illustratively, the name vector generation module 410 is specifically configured to:
take the name information of the point of interest as input of a bag-of-words (BOW) model, a word2vector (word2vec) text representation model, or a latent Dirichlet allocation (LDA) topic model to obtain the point-of-interest name vector.
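Of the three options, the bag-of-words model is the simplest to illustrate. The following is a minimal sketch, assuming whitespace-separated names and a vocabulary built from a small set of example names; the function names and sample data are hypothetical, and names in Chinese would first need a word-segmentation step that is not shown.

```python
from collections import Counter


def build_vocabulary(names):
    """Assign a fixed index to every distinct token, in order of first appearance."""
    vocab = {}
    for name in names:
        for token in name.lower().split():
            vocab.setdefault(token, len(vocab))
    return vocab


def name_to_bow_vector(name, vocab):
    """Count occurrences of each vocabulary token in one name; unknown tokens are ignored."""
    counts = Counter(name.lower().split())
    return [counts.get(token, 0) for token in vocab]


names = ["Sunrise Coffee House", "City Coffee", "Sunrise Hotel"]
vocab = build_vocabulary(names)
print(name_to_bow_vector("Sunrise Hotel", vocab))  # [1, 0, 0, 0, 1]
```

The resulting vector has one dimension per vocabulary token, so every name maps to a vector of the same length regardless of how many words it contains.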
Illustratively, the tag vector generation module 420 is specifically configured to:
take the total number of dimensions of the tag information of the points of interest as the dimension of the point-of-interest tag vector; and
for each dimension in the point-of-interest tag vector, if the point of interest has the tag information of that dimension, set the value of that dimension in the tag vector to a first numerical value; otherwise, set the value of that dimension to a second numerical value.
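The tag-vector construction amounts to a fixed-length indicator encoding. A minimal sketch follows, assuming the first and second numerical values are 1 and 0 (the text does not fix them) and using a hypothetical tag list and function name.

```python
def tag_vector(poi_tags, all_tags, first_value=1, second_value=0):
    """One dimension per known tag: first_value if the POI has that tag, else second_value."""
    owned = set(poi_tags)
    return [first_value if tag in owned else second_value for tag in all_tags]


# Hypothetical tag dimensions shared by all points of interest.
ALL_TAGS = ["wifi", "parking", "takeout", "24h"]
print(tag_vector(["parking", "24h"], ALL_TAGS))  # [0, 1, 0, 1]
```

Keeping the order of `ALL_TAGS` fixed ensures that the same dimension always refers to the same tag across points of interest.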
Example V
Fig. 5 is a schematic structural diagram of a device provided in a fifth embodiment of the present invention, showing a block diagram of an exemplary device suitable for implementing embodiments of the present invention. The device 12 shown in Fig. 5 is merely an example and should not be construed as limiting the functionality or scope of use of embodiments of the present invention. As shown in Fig. 5, device 12 takes the form of a general-purpose computing device. Components of device 12 may include, but are not limited to: one or more processors or processing units 16, a system memory 28, and a bus 18 that connects the various system components, including the system memory 28 and the processing units 16.
Bus 18 represents one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, and not limitation, such architectures include the Industry Standard Architecture (ISA) bus, the Micro Channel Architecture (MCA) bus, the Enhanced ISA (EISA) bus, the Video Electronics Standards Association (VESA) local bus, and the Peripheral Component Interconnect (PCI) bus.
Device 12 typically includes a variety of computer system readable media. Such media can be any available media that is accessible by device 12 and includes both volatile and nonvolatile media, removable and non-removable media.
The system memory 28 may include computer system readable media in the form of volatile memory, such as Random Access Memory (RAM) 30 and/or cache memory 32. Device 12 may further include other removable/non-removable, volatile/nonvolatile computer system storage media. By way of example only, storage system 34 may be used to read from or write to non-removable, nonvolatile magnetic media (not shown in FIG. 5, commonly referred to as a "hard disk drive"). Although not shown in fig. 5, a magnetic disk drive for reading from and writing to a removable non-volatile magnetic disk (e.g., a "floppy disk"), and an optical disk drive for reading from or writing to a removable non-volatile optical disk (e.g., a CD-ROM, DVD-ROM, or other optical media) may be provided. In such cases, each drive may be coupled to bus 18 through one or more data medium interfaces. The system memory 28 may include at least one program product having a set (e.g., at least one) of program modules configured to carry out the functions of the embodiments of the invention.
A program/utility 40 having a set (at least one) of program modules 42 may be stored in, for example, system memory 28, such program modules 42 including, but not limited to, an operating system, one or more application programs, other program modules, and program data, each or some combination of which may include an implementation of a network environment. Program modules 42 generally perform the functions and/or methods of the embodiments described herein.
Device 12 may also communicate with one or more external devices 14 (e.g., keyboard, pointing device, display 24, etc.), one or more devices that enable a user to interact with device 12, and/or any devices (e.g., network card, modem, etc.) that enable device 12 to communicate with one or more other computing devices. Such communication may occur through an input/output (I/O) interface 22. Also, device 12 may communicate with one or more networks such as a Local Area Network (LAN), a Wide Area Network (WAN) and/or a public network, such as the Internet, via network adapter 20. As shown, network adapter 20 communicates with other modules of device 12 over bus 18. It should be appreciated that although not shown, other hardware and/or software modules may be used in connection with device 12, including, but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, data backup storage systems, and the like.
The processing unit 16 executes various functional applications and data processing by running programs stored in the system memory 28, for example, to implement the point-of-interest classification method provided by the embodiment of the present invention.
Example VI
The sixth embodiment of the present invention further provides a computer-readable storage medium storing a computer program (also referred to as computer-executable instructions) which, when executed by a processor, implements the point-of-interest classification method of any of the above embodiments.
The computer storage media of embodiments of the invention may take the form of any combination of one or more computer-readable media. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of the computer-readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
The computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, either in baseband or as part of a carrier wave. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations of embodiments of the present invention may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
Note that the above describes only preferred embodiments of the present invention and the technical principles applied. Those skilled in the art will understand that the present invention is not limited to the particular embodiments described herein, and that various obvious changes, rearrangements and substitutions can be made without departing from the scope of the invention. Therefore, although the present invention has been described in some detail through the above embodiments, it is not limited to them and may include many other equivalent embodiments without departing from the spirit of the invention; the scope of the present invention is determined by the scope of the appended claims.

Claims (12)

1. A method of interest point classification, comprising:
generating an interest point name vector according to the name information of the interest point;
generating an interest point label vector according to the label information of the interest point;
obtaining category information of the interest points according to the interest point name vector and the interest point label vector by adopting an interest point classification model obtained through pre-training;
wherein obtaining the category information of the interest point according to the interest point name vector and the interest point label vector by adopting the interest point classification model obtained through pre-training comprises:
taking the point-of-interest name vector as the input of a first classification model to obtain preliminary classification information;
and taking the preliminary classification information and the interest point label vector as inputs of a second classification model to obtain the class information of the interest point, wherein the second classification model is used for screening the preliminary classification information obtained by the first classification model, and the first classification model and the second classification model are obtained through simultaneous training.
2. The method of claim 1, wherein obtaining category information of the point of interest from the point of interest name vector and the point of interest tag vector using a pre-trained point of interest classification model comprises:
taking the interest point name vector and the interest point label vector as input of the interest point classification model to obtain the category information of the interest point; wherein the interest point classification model is obtained by training based on an automatic attribute switching deep belief network.
3. The method of claim 1, wherein obtaining category information of the point of interest from the point of interest name vector and the point of interest tag vector using a pre-trained point of interest classification model comprises:
taking the point-of-interest name vector as the input of a first classification model to obtain preliminary classification information;
the preliminary classification information and the interest point label vector are used as the input of a second classification model, so that the class information of the interest points is obtained;
the second classification model is obtained based on automatic attribute switching deep belief network training.
4. The method of claim 1, wherein generating a point of interest name vector from the name information of the point of interest comprises:
taking the name information of the interest point as input of a bag-of-words (BOW) model, a word2vector (word2vec) text representation model, or a latent Dirichlet allocation (LDA) topic model to obtain the interest point name vector.
5. The method of claim 1, wherein generating a point of interest tag vector from tag information of a point of interest comprises:
taking the total dimension of the label information of each interest point as the dimension of the label vector of the interest point;
for each dimension in the interest point tag vector, if the interest point has tag information of the dimension, determining that the value in the dimension in the interest point tag vector is a first numerical value; otherwise, determining the value in the dimension as a second numerical value.
6. A point of interest classification device, comprising:
the name vector generation module is used for generating an interest point name vector according to the name information of the interest point;
the tag vector generation module is used for generating a tag vector of the interest point according to the tag information of the interest point;
the category information acquisition module is used for acquiring category information of the interest points according to the interest point name vector and the interest point label vector by adopting an interest point classification model obtained through pre-training;
wherein, the category information acquisition module is further used for:
taking the point-of-interest name vector as the input of a first classification model to obtain preliminary classification information;
and taking the preliminary classification information and the interest point label vector as inputs of a second classification model to obtain the class information of the interest point, wherein the second classification model is used for screening the preliminary classification information obtained by the first classification model, and the first classification model and the second classification model are obtained through simultaneous training.
7. The apparatus of claim 6, wherein the category information acquisition module is further configured to:
taking the interest point name vector and the interest point label vector as input of the interest point classification model to obtain the category information of the interest point; wherein the interest point classification model is obtained by training based on an automatic attribute switching deep belief network.
8. The apparatus of claim 6, wherein the category information acquisition module is further configured to:
taking the point-of-interest name vector as the input of a first classification model to obtain preliminary classification information;
the preliminary classification information and the interest point label vector are used as the input of a second classification model, so that the class information of the interest points is obtained;
the second classification model is obtained based on automatic attribute switching deep belief network training.
9. The apparatus of claim 6, wherein the name vector generation module is specifically configured to:
taking the name information of the interest point as input of a bag-of-words (BOW) model, a word2vector (word2vec) text representation model, or a latent Dirichlet allocation (LDA) topic model to obtain the interest point name vector.
10. The apparatus of claim 6, wherein the tag vector generation module is specifically configured to:
taking the total dimension of the label information of the interest points as the dimension of the label vector of the interest points;
for each dimension in the interest point tag vector, if the interest point has tag information of the dimension, determining that the value in the dimension in the interest point tag vector is a first numerical value; otherwise, determining the value in the dimension as a second numerical value.
11. An apparatus, the apparatus comprising:
one or more processors;
a storage means for storing one or more programs;
the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the point of interest classification method of any of claims 1-5.
12. A storage medium having stored thereon a computer program, which when executed by a processor implements the point of interest classification method according to any of claims 1-5.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811296498.5A CN111125550B (en) 2018-11-01 2018-11-01 Point-of-interest classification method, device, equipment and storage medium


Publications (2)

Publication Number Publication Date
CN111125550A CN111125550A (en) 2020-05-08
CN111125550B true CN111125550B (en) 2023-11-24

Family

ID=70494926

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811296498.5A Active CN111125550B (en) 2018-11-01 2018-11-01 Point-of-interest classification method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111125550B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111767359B (en) * 2020-06-30 2023-09-01 北京百度网讯科技有限公司 Point-of-interest classification method, device, equipment and storage medium
CN112328896B (en) * 2020-11-26 2024-03-15 北京百度网讯科技有限公司 Method, apparatus, electronic device, and medium for outputting information
CN112579793B (en) * 2020-12-24 2024-04-30 北京创鑫旅程网络技术有限公司 Model training method, POI label detection method and device

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106682217A (en) * 2016-12-31 2017-05-17 成都数联铭品科技有限公司 Method for enterprise second-grade industry classification based on automatic screening and learning of information
CN107403198A (en) * 2017-07-31 2017-11-28 广州探迹科技有限公司 A kind of official website recognition methods based on cascade classifier
CN107679189A (en) * 2017-09-30 2018-02-09 百度在线网络技术(北京)有限公司 A kind of point of interest update method, device, server and medium
CN108021940A (en) * 2017-11-30 2018-05-11 中国银联股份有限公司 data classification method and system based on machine learning
CN108171276A (en) * 2018-01-17 2018-06-15 百度在线网络技术(北京)有限公司 For generating the method and apparatus of information
CN108363698A (en) * 2018-03-13 2018-08-03 腾讯大地通途(北京)科技有限公司 Point of interest relation recognition method and device

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7756313B2 (en) * 2005-11-14 2010-07-13 Siemens Medical Solutions Usa, Inc. System and method for computer aided detection via asymmetric cascade of sparse linear classifiers
US8295637B2 (en) * 2009-01-07 2012-10-23 Seiko Epson Corporation Method of classifying red-eye objects using feature extraction and classifiers


Non-Patent Citations (10)

* Cited by examiner, † Cited by third party
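Title
"Automatic gating of attributes in deep structure"; Xiaoming Jin et al.; ACM; 2018-07-13; main text, sections 3.3 and 4.2 *
Automatic Gating of Attributes in Deep Structure; Xiaoming Jin et al.; ACM; main text, section 4.2 *
Methods of Combining Multiple Classifiers and Their Applications to Handwriting Recognition; Xu L.; IEEE Transactions on Systems, Man, and Cybernetics; full text *
Serial Combination of Multiple Experts: A Unified Evaluation; A. F. R. Rahman et al.; Pattern Analysis and Applications; main text, sections 4.1 and 6 *
Research on traffic incident detection methods based on hybrid support vector machine multi-classifiers; 刘清泉; China Master's Theses Full-text Database; full text *
Research on multi-layer combined classifiers; 蒋艳凰, 杨学军; Computer Engineering & Science; full text *
康琦. "Short text classification". Imbalanced Classification Methods in Machine Learning. 2017, 163-165. *
柳杨. "Bag-of-words model". Digital Image Object Recognition: Detailed Theory and Practice. 2018, *
柳杨. Digital Image Object Recognition: Detailed Theory and Practice. 2018, 62-63. *
胡志坚. "Natural language processing". Technology Foresight and Evaluation, Vol. 1, No. 2, 2015 edition. 2015, p. 153. *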

Also Published As

Publication number Publication date
CN111125550A (en) 2020-05-08


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant