CN112380847A - Interest point processing method and device, electronic equipment and storage medium - Google Patents

Interest point processing method and device, electronic equipment and storage medium

Info

Publication number
CN112380847A
Authority
CN
China
Prior art keywords
candidate
interest
target article
points
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011293631.9A
Other languages
Chinese (zh)
Other versions
CN112380847B (en)
Inventor
孙王栋
谢红伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202011293631.9A
Publication of CN112380847A
Application granted
Publication of CN112380847B
Legal status: Active (current)
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/20: Natural language analysis
    • G06F40/279: Recognition of textual entities

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The disclosure provides a point of interest processing method and apparatus, an electronic device, and a storage medium, and relates to fields such as deep learning. The specific implementation is as follows: determining relevant features of M dimensions corresponding to each of N candidate interest points contained in a target article, where N and M are integers greater than or equal to 1; inputting the relevant features of M dimensions corresponding to the N candidate interest points into a preset model to obtain the relevance, output by the preset model, between each of the N candidate interest points and the target article; and determining a target interest point from the N candidate interest points based on the relevance between the N candidate interest points and the target article, and associating the target interest point with the target article in a map application.

Description

Interest point processing method and device, electronic equipment and storage medium
Technical Field
The present disclosure relates to the field of computer technology, and in particular to fields such as deep learning.
Background
As map applications become more widely used, there is a growing need to present content related to points of interest in map applications. In the prior art, hooking (that is, associating) content with a point of interest in a map application requires a series of complex algorithm statements. This leads to problems such as high subsequent development cost, and it does not improve the processing efficiency or the accuracy of the association between points of interest and related content in the map application.
Disclosure of Invention
The disclosure provides a point of interest processing method and apparatus, an electronic device, and a storage medium.
According to a first aspect of the present disclosure, there is provided a point of interest processing method, including:
determining relevant features of M dimensions corresponding to each of N candidate interest points contained in a target article, where N and M are integers greater than or equal to 1;
inputting the relevant features of M dimensions corresponding to the N candidate interest points into a preset model to obtain the relevance, output by the preset model, between each of the N candidate interest points and the target article;
determining a target interest point from the N candidate interest points based on the relevance between the N candidate interest points and the target article, and associating the target interest point with the target article in a map application.
According to a second aspect of the present disclosure, there is provided a point of interest processing apparatus, including:
a feature processing module, configured to determine relevant features of M dimensions corresponding to each of N candidate interest points contained in a target article, where N and M are integers greater than or equal to 1;
a relevance analysis module, configured to input the relevant features of M dimensions corresponding to the N candidate interest points into a preset model to obtain the relevance, output by the preset model, between each of the N candidate interest points and the target article;
an interest point association module, configured to determine a target interest point from the N candidate interest points based on the relevance between the N candidate interest points and the target article, and to associate the target interest point with the target article in a map application.
According to a third aspect of the present disclosure, there is provided an electronic device comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the aforementioned method.
According to a fourth aspect of the present disclosure, there is provided a non-transitory computer readable storage medium storing computer instructions for causing a computer to perform the aforementioned method.
With this scheme, multi-dimensional features can be extracted for the candidate interest points contained in an article, a preset model can analyze the relevance of each candidate interest point from that multi-dimensional feature information, and the target interest points strongly related to the article can then be determined and associated with the article in the map application. This improves the efficiency of computing the relevance of candidate interest points, and it avoids the excessive development cost and the inability to process at scale that result from the large number of analysis statements used in the prior art. In addition, it can improve the accuracy of the association between interest points and text content in the map application, enrich the content coverage of the map application, and raise the click-through rate of the text content associated with the map application.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present disclosure, nor do they limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.
Drawings
The drawings are included to provide a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:
FIG. 1 is a schematic flow chart diagram of a point of interest processing method according to an embodiment of the present disclosure;
FIG. 2 is a schematic flow chart of a process for obtaining relevant features according to an embodiment of the present disclosure;
FIG. 3 is an exemplary flow diagram of a point of interest processing method according to an embodiment of the present disclosure;
FIG. 4 is a schematic diagram of a composition structure of a point of interest processing apparatus according to an embodiment of the present disclosure;
FIG. 5 is a block diagram of an electronic device for implementing a point of interest processing method according to an embodiment of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings, in which various details of the embodiments of the disclosure are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
An embodiment of the present disclosure provides a point of interest processing method which, as shown in fig. 1, includes:
S101: determining relevant features of M dimensions corresponding to each of N candidate interest points contained in a target article, where N and M are integers greater than or equal to 1;
S102: inputting the relevant features of M dimensions corresponding to the N candidate interest points into a preset model to obtain the relevance, output by the preset model, between each of the N candidate interest points and the target article;
S103: determining a target interest point from the N candidate interest points based on the relevance between the N candidate interest points and the target article, and associating the target interest point with the target article in a map application.
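The three steps can be read as a small pipeline. The following is only a minimal sketch under stated assumptions, not the implementation of the disclosure: `extract_features`, `score`, the threshold value of 0.5, and the toy inputs are all hypothetical placeholders introduced for illustration.

```python
from typing import Callable, Dict, List, Sequence

def process_article(
    article_id: str,
    candidates: Sequence[str],
    extract_features: Callable[[str], List[float]],  # hypothetical helper for S101
    score: Callable[[List[float]], float],           # stands in for the preset model (S102)
    threshold: float = 0.5,                          # assumed value, not from the source
) -> Dict[str, List[str]]:
    """Return a mapping from each selected interest point to the article ids linked to it."""
    features = {poi: extract_features(poi) for poi in candidates}     # S101
    relevance = {poi: score(vec) for poi, vec in features.items()}    # S102
    targets = [poi for poi, r in relevance.items() if r > threshold]  # S103: select
    return {poi: [article_id] for poi in targets}                     # S103: associate

# Toy usage: one feature per candidate, scored by a made-up linear rule.
links = process_article(
    "article-1",
    ["Museum A", "Cafe B"],
    extract_features=lambda poi: [3.0] if poi == "Museum A" else [0.0],
    score=lambda vec: min(1.0, vec[0] / 3.0),
)
print(links)
```

Passing the feature extractor and the model as callables keeps the sketch independent of any particular feature set or model type.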
The embodiments of the present disclosure can be applied to an electronic device, such as a server or a terminal device.
The target article may be any one of a plurality of articles. The plurality of articles may be all articles currently available to the user, or articles of a certain type selected by the user, such as travel articles or food introduction articles. Since every article can be processed by the aforementioned S101 to S103, only one article is taken as the target article for detailed description; the other articles are processed in the same manner and are not described again.
Before determining the relevant features of M dimensions corresponding to the N candidate interest points contained in the target article, the method may further include: determining the N candidate interest points contained in the target article.
Specifically, the N candidate interest points may be obtained by recognizing the target article with a preset algorithm. The preset algorithm can segment the text content of the target article and label the resulting segments; correspondingly, the text segments that the preset algorithm labels as interest points are taken as the N candidate interest points.
Among the relevant features of M dimensions corresponding to the N candidate interest points, the features for each candidate interest point may include the attribute features of the candidate interest point itself, the features of the candidate interest point in relation to the target article, and the attribute features of the target article.
Inputting the relevant features of M dimensions corresponding to the N candidate interest points into the preset model, to obtain the relevance between the N candidate interest points and the target article, may be done in either of two ways. All candidate interest points currently extracted from the target article, together with their relevant features, may be input into the preset model at once, to obtain the relevance of all candidate interest points output by the preset model. Alternatively, the relevant features of M dimensions may be input into the preset model one candidate interest point at a time, to obtain the relevance between each candidate interest point and the target article.
The preset model may be chosen according to actual conditions. For example, the preset model may be a GBDT (Gradient Boosting Decision Tree) model. Other models built from binary decision trees may also be used as the preset model, for example an XGBoost (eXtreme Gradient Boosting) model; the possibilities are not listed exhaustively here.
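The text names GBDT and XGBoost but gives no training details. To illustrate the general idea only, the sketch below implements a very small gradient boosting regressor for squared loss, built from one-split decision stumps in pure Python. The two features, the toy labels, the number of rounds, and the learning rate are all assumptions; a production system would more likely use an existing library.

```python
from typing import List, Tuple

Stump = Tuple[int, float, float, float]  # (feature index, split, left value, right value)

def fit_stump(X: List[List[float]], residuals: List[float]) -> Stump:
    """Pick the one-feature split that best fits the residuals (squared error)."""
    mean = sum(residuals) / len(residuals)
    best: Stump = (0, float("inf"), mean, mean)  # fallback: no split, predict the mean
    best_err = float("inf")
    for j in range(len(X[0])):
        for split in sorted({row[j] for row in X}):
            left = [r for row, r in zip(X, residuals) if row[j] <= split]
            right = [r for row, r in zip(X, residuals) if row[j] > split]
            if not left or not right:
                continue
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            err = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if err < best_err:
                best_err, best = err, (j, split, lm, rm)
    return best

def predict(base: float, stumps: List[Stump], lr: float, row: List[float]) -> float:
    """Sum the base score and the shrunken output of every stump."""
    out = base
    for j, split, lm, rm in stumps:
        out += lr * (lm if row[j] <= split else rm)
    return out

def fit_gbdt(X: List[List[float]], y: List[float], rounds: int = 20, lr: float = 0.5):
    """Gradient boosting for squared loss: each stump fits the current residuals."""
    base = sum(y) / len(y)
    stumps: List[Stump] = []
    for _ in range(rounds):
        residuals = [t - predict(base, stumps, lr, row) for row, t in zip(X, y)]
        stumps.append(fit_stump(X, residuals))
    return base, stumps

# Toy training rows: [core-word hits in title, address appears in article (0/1)].
X = [[3.0, 1.0], [2.0, 1.0], [0.0, 0.0], [1.0, 0.0]]
y = [1.0, 1.0, 0.0, 0.0]  # 1 = related to the article, 0 = not related
base, stumps = fit_gbdt(X, y)
# predict must be called with the same learning rate used for training (0.5 here).
score_related = predict(base, stumps, 0.5, [3.0, 1.0])
score_unrelated = predict(base, stumps, 0.5, [0.0, 0.0])
```

Each round fits a stump to the residuals of the current ensemble, which is the negative gradient of the squared loss, and adds it with shrinkage; the resulting score plays the role of the relevance output by the preset model.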
The target interest point may be determined from the N candidate interest points, based on the relevance between the N candidate interest points and the target article, in either of the following ways:
selecting the candidate interest points whose relevance is higher than a relevance threshold as the target interest points, where the relevance threshold can be set according to actual conditions;
or selecting the one or more candidate interest points with the highest relevance as the target interest points. Specifically, a preset number of candidate interest points with the highest relevance may be selected. For example, if the preset number is 3, the N candidate interest points may be sorted from high to low by relevance, and the 3 candidate interest points with the highest relevance are selected as the target interest points.
Whichever way is used, the one or more target interest points obtained are interest points strongly related to the target article.
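Both selection strategies are short to express. The sketch below assumes the relevance values are already available as a dictionary; the interest point names, scores, and the threshold of 0.7 are made up for illustration.

```python
from typing import Dict, List

def select_by_threshold(relevance: Dict[str, float], threshold: float) -> List[str]:
    """Keep every candidate whose relevance is strictly above the threshold."""
    return [poi for poi, score in relevance.items() if score > threshold]

def select_top_k(relevance: Dict[str, float], k: int) -> List[str]:
    """Sort candidates from high to low relevance and keep the first k."""
    ranked = sorted(relevance, key=relevance.get, reverse=True)
    return ranked[:k]

relevance = {"Museum A": 0.92, "Cafe B": 0.35, "Park C": 0.71, "Hotel D": 0.64}
print(select_by_threshold(relevance, 0.7))  # ['Museum A', 'Park C']
print(select_top_k(relevance, 3))           # ['Museum A', 'Park C', 'Hotel D']
```

The threshold variant returns a variable number of interest points per article, while the top-k variant always returns at most k, even when some of them have low relevance.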
After the one or more target interest points are determined, they are associated with the target article in the map application. That is, the one or more target interest points are located in the map application, and each of them is hooked to, that is, associated with, the target article. When a user later opens the map application and clicks one of these target interest points, the associated target article can be presented for the user to view.
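The source does not say how the association is stored. One simple possibility, shown here purely as an assumption, is an in-memory index from interest point identifiers to article identifiers; the class name `PoiArticleIndex` and the identifiers are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List

class PoiArticleIndex:
    """Hypothetical store that hooks articles to interest points."""

    def __init__(self) -> None:
        self._articles_by_poi: Dict[str, List[str]] = defaultdict(list)

    def associate(self, poi_id: str, article_id: str) -> None:
        """Link an article to an interest point, ignoring repeated links."""
        if article_id not in self._articles_by_poi[poi_id]:
            self._articles_by_poi[poi_id].append(article_id)

    def articles_for(self, poi_id: str) -> List[str]:
        """Return the articles to show when the user clicks this interest point."""
        return list(self._articles_by_poi.get(poi_id, []))

index = PoiArticleIndex()
for poi in ["poi-museum-a", "poi-park-c"]:  # target interest points of one article
    index.associate(poi, "article-1")
index.associate("poi-museum-a", "article-1")  # a duplicate link is ignored
index.associate("poi-museum-a", "article-2")
print(index.articles_for("poi-museum-a"))
```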
With the above scheme, multi-dimensional features can be extracted for the candidate interest points contained in an article, the preset model can analyze the relevance of each candidate interest point from that multi-dimensional feature information, and the target interest points strongly related to the article can then be determined and associated with the article in the map application. This improves the efficiency of computing the relevance of candidate interest points, and it avoids the excessive development cost and the inability to process at scale that result from the large number of analysis statements used in the prior art. In addition, it can improve the accuracy of the association between interest points and text content in the map application, enrich the content coverage of the map application, and raise the click-through rate of the text content associated with the map application.
In the above S101, determining the relevant features of M dimensions corresponding to the N candidate interest points contained in the target article may include, as shown in fig. 2:
S201: recognizing the content of the target article to obtain the N candidate interest points contained in the target article;
S202: determining the relevant features of M dimensions corresponding to each of the N candidate interest points based on the relevant information of the N candidate interest points and/or the content of the target article.
Recognizing the content of the target article to obtain the N candidate interest points may specifically be: segmenting the content of the target article with a preset algorithm and labeling the text segments obtained; when a text segment is labeled as an interest point, that text segment is taken as a recognized candidate interest point.
The content of the target article may include a title of the target article and a body of the target article. Further, the text of the target article may include all the text and/or pictures contained in the text of the target article.
Here, the preset algorithm may be selected according to actual conditions. Any algorithm that can segment the text of an article into one or more text segments and label each segment falls within the protection scope of this embodiment.
The labeling of the text content may be a labeling of a category or a type of the text content, for example, a plurality of text contents obtained by segmenting a certain target article, where a part of the text contents may be labeled as an interest point, a part of the text contents may be labeled as a title, and a part of the text contents may be labeled as a geographic location. The above labels may be referred to as tags, attribute identifications, or attribute tags in other examples, but whatever the name, the category or type that can be used to characterize the text is within the scope of the present embodiment.
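The segmentation-and-labeling step can be sketched as follows. Because the source leaves the preset algorithm open, this example substitutes a plain dictionary lookup for it; the `GAZETTEER` table, its labels, and the sample sentence are hypothetical, and a real system would more likely use a trained sequence labeling model.

```python
from typing import Dict, List, Tuple

# Hypothetical lookup table standing in for a trained segmentation-and-labeling model.
GAZETTEER: Dict[str, str] = {
    "Summer Palace": "POI",
    "Beijing": "LOCATION",
}

def label_segments(text: str, gazetteer: Dict[str, str]) -> List[Tuple[str, str]]:
    """Return (segment, label) pairs for every gazetteer entry found in the text."""
    found = []
    for phrase, label in gazetteer.items():
        start = text.find(phrase)
        if start != -1:
            found.append((start, phrase, label))
    found.sort()  # keep segments in reading order
    return [(phrase, label) for _, phrase, label in found]

def candidate_pois(text: str, gazetteer: Dict[str, str]) -> List[str]:
    """Keep only the segments labeled as interest points."""
    return [seg for seg, label in label_segments(text, gazetteer) if label == "POI"]

article = "A weekend in Beijing: visiting the Summer Palace"
print(label_segments(article, GAZETTEER))
print(candidate_pois(article, GAZETTEER))
```

Only segments carrying the interest point label become candidates; segments with other labels, such as a geographic location, are kept out of the candidate list.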
Obtaining the relevant features of M dimensions corresponding to each of the N candidate interest points, based on the N candidate interest points and/or the content of the target article, may be: determining, based on the ith candidate interest point of the N candidate interest points and/or the content of the target article, at least one of the attribute features of the ith candidate interest point itself, the attribute features of the target article, and the features of the candidate interest point in relation to the target article, where i is an integer greater than or equal to 1 and less than or equal to N. That is, feature information with the same dimensions as for the ith candidate interest point is determined for every one of the N candidate interest points, and the details are not repeated here.
It should be understood that the attribute features of the target article are feature information shared by all N candidate interest points contained in the target article. The target article is analyzed once to obtain its attribute features, which are then added to the M-dimensional feature information of each of the N candidate interest points.
For example, the M-dimensional relevant features of the ith candidate interest point of the N candidate interest points may specifically be relevant information of 10 dimensions, and may include:
the number of times that the core word of the ith candidate interest point appears in the title of the target article;
the number of times that the non-core word of the ith candidate interest point appears in the title of the target article;
the number of times that the core word of the ith candidate interest point appears in the text of the target article;
the number of times that the non-core word of the ith candidate interest point appears in the text of the target article;
whether address information corresponding to the ith candidate interest point appears in the target article or not;
classification information of the ith candidate interest point;
whether the ith candidate interest point contains branch information or not;
classification of the target article;
the length of the text of the target article;
whether the target article contains preset keywords or not.
The preset keywords may be set according to actual conditions; for example, they may include travel notes, strategies (travel guides), tours, gourmet food, and the like.
The feature information of the above dimensions is only an example; features of more dimensions may be used in actual processing, or only some of these dimensions may be used.
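The ten example dimensions can be held in one record per candidate interest point. The sketch below is an assumed representation: the field names, the integer encoding of the two classification features, and the sample values are hypothetical and are not taken from the source.

```python
from dataclasses import astuple, dataclass
from typing import List

@dataclass
class CandidateFeatures:
    core_in_title: int       # times the core word appears in the title
    non_core_in_title: int   # times the non-core words appear in the title
    core_in_body: int        # times the core word appears in the body
    non_core_in_body: int    # times the non-core words appear in the body
    address_in_article: int  # 1 if the address appears in the article, else 0
    poi_class: int           # class id of the interest point (assumed integer encoding)
    has_branch: int          # 1 if the interest point carries branch information
    article_class: int       # class id of the article (assumed integer encoding)
    body_length: int         # length of the body, for example in words
    has_keyword: int         # 1 if the article contains a preset keyword

    def as_vector(self) -> List[float]:
        """Flatten the record into the M-dimensional input of the preset model."""
        return [float(v) for v in astuple(self)]

vec = CandidateFeatures(1, 0, 3, 2, 1, 4, 0, 2, 850, 1).as_vector()
print(len(vec), vec)
```

Adding or removing a field changes M, which matches the statement that the number of dimensions is not fixed.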
In this way, the candidate interest points are obtained directly from the target article, and the multi-dimensional feature information is determined from the relevant information of the target article and/or of the candidate interest points. The relevant features of M dimensions obtained for each candidate interest point therefore reflect the relationship between the interest point and the target article, providing more accurate information for the subsequent determination of their relevance.
More specifically, the determining, based on the relevant information of the N candidate points of interest and/or the content of the target article, relevant features of M dimensions corresponding to the N candidate points of interest respectively includes at least one of:
determining the attribute characteristics of the ith candidate interest point based on the relevant information of the ith candidate interest point in the N candidate interest points; i is an integer of 1 or more and N or less;
determining feature information of the ith candidate interest point in the N candidate interest points, which is related to the target article, based on the related information of the ith candidate interest point and the content of the target article.
Determining the attribute feature of the ith candidate interest point based on the relevant information of the ith candidate interest point in the N candidate interest points may include:
determining a classification of the ith candidate point of interest based on the relevant information of the ith candidate point of interest;
and determining the branch names corresponding to the ith candidate interest point based on the relevant information of the ith candidate interest point.
That is, the self-attribute feature of the ith candidate interest point may include the classification of the ith candidate interest point itself, whether the ith candidate interest point includes a branch name, and the like.
The classification of the ith candidate interest point may be determined by inputting the name of the ith candidate interest point into a classification model and using the result output by the classification model as the classification of the ith candidate interest point. The classification model may be a pre-trained model.
The branch name corresponding to the ith candidate interest point may be determined by checking, according to the extended information in the relevant information of the ith candidate interest point, whether the ith candidate interest point includes a branch name, and extracting the branch name if it does.
Further, the information related to the ith candidate point of interest may include a name and extension information of the ith candidate point of interest. That is to say, when the ith candidate interest point is obtained from the target article, not only the name of the ith candidate interest point but also the related extended information of the ith candidate interest point in the target article may be obtained; in addition, the extended information may be text content located adjacent to the name of the ith candidate point of interest in the target article, for example, XXX (branch at address a), where XXX may be the name of the ith candidate point of interest, and the branch at address a is the extended information of the ith candidate point of interest. The extended information may include a branch name or address information related to the ith candidate point of interest, and the like, although the extended information may also include other information, which is not exhaustive here.
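For a mention written in the form "XXX (branch at address A)", the name and the extended information can be separated with a regular expression. This is a sketch under assumptions: the pattern accepts both ASCII and full-width parentheses, and the `has_branch` check on the English word "branch" is a hypothetical rule, since the source does not say how a branch name is recognized.

```python
import re
from typing import Optional, Tuple

# Name, optionally followed by extended information in ASCII or full-width parentheses.
_MENTION = re.compile(r"^\s*(?P<name>[^()（）]+?)\s*(?:[(（](?P<ext>[^()（）]*)[)）])?\s*$")

def split_mention(mention: str) -> Tuple[str, Optional[str]]:
    """Split a mention into (name, extended information or None)."""
    m = _MENTION.match(mention)
    if m is None:
        return mention.strip(), None
    return m.group("name"), m.group("ext")

def has_branch(ext: Optional[str]) -> bool:
    """Hypothetical rule: treat extended info containing 'branch' as a branch name."""
    return ext is not None and "branch" in ext.lower()

name, ext = split_mention("XXX (branch at address A)")
print(name, "|", ext, "|", has_branch(ext))
```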
As described above, the content of the target article may include the title of the target article and the body of the target article. And the related information of the ith candidate interest point comprises the name and the extension information of the ith candidate interest point.
Correspondingly, the determining the feature information related to the ith candidate point of interest and the target article based on the related information of the ith candidate point of interest of the N candidate points of interest and the content of the target article may include at least one of:
segmenting the relevant information of the ith candidate interest point to obtain a core word, and detecting the occurrence frequency of the core word in the title of the target article;
segmenting the relevant information of the ith candidate interest point to obtain a core word and a non-core word, and detecting the occurrence frequency of the non-core word in the title of the target article;
segmenting the relevant information of the ith candidate interest point to obtain a core word, and detecting the occurrence frequency of the core word in the text of the target article; wherein the body of the target article may include all content except the title of the target article;
segmenting the relevant information of the ith candidate interest point to obtain a core word and a non-core word, and detecting the occurrence frequency of the non-core word in the text of the target article;
and acquiring address resolution information from the relevant information of the ith candidate interest point, and detecting whether the address resolution information appears in the target article.
The number of times the core word appears in the title of the target article may be represented as the feature information "Core_N_gram_title". Here, Core denotes the core word, title denotes the title of the target article, and N_gram is an algorithm based on a statistical language model: a sliding window of size N is moved over the content of the title byte by byte, forming a sequence of byte fragments of length N. Each byte fragment, for example the core word of the ith candidate interest point, may be called a gram, and the occurrence frequency of every gram in the title of the target article is counted.
The number of times the non-core word appears in the title of the target article may be represented as the feature information "Non_core_N_gram_title", where Non_core denotes the non-core word; title and N_gram have the same meaning as above and are not explained again.
The number of times the core word appears in the body of the target article may be represented as the feature information "Core_N_gram_content", where content denotes the body of the target article; Core and N_gram have the same meaning as above and are not explained again.
The number of times the non-core word appears in the body of the target article may be represented as the feature information "Non_core_N_gram_content", where content denotes the body of the target article; Non_core and N_gram have the same meaning as above and are not explained again.
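The sliding-window count behind these four features can be sketched as below. One assumption to note: the window here moves over characters of a Python string rather than raw bytes, and N is set to the length of the word being counted. The sample title is made up.

```python
from collections import Counter

def ngram_counts(text: str, n: int) -> Counter:
    """Slide a window of size n over the text and count every fragment (gram)."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

def occurrences(term: str, text: str) -> int:
    """Number of windows equal to the term, using a window as long as the term."""
    if not term:
        return 0
    return ngram_counts(text, len(term))[term]

title = "Summer Palace guide: a day at the Summer Palace"
print(ngram_counts("abab", 2))
print(occurrences("Summer Palace", title))
```

Because every window position is counted, overlapping matches are included, which differs from `str.count`; for example, `occurrences("aa", "aaa")` is 2.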
Whether the address resolution information appears in the target article may be represented as the feature information "Poi_name", where POI stands for point of interest. The feature may take a yes-or-no value, such as an output of 1 or 0, or it may record the number of occurrences; the possibilities are not listed exhaustively here.
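Both encodings of this feature, a 0/1 flag or an occurrence count, fit in one small function. The function name, the `as_count` switch, and the sample address are hypothetical; the sketch also assumes that an exact substring match is good enough, which real address matching may not be.

```python
def address_feature(address: str, title: str, body: str, as_count: bool = False) -> int:
    """Return 1/0 for presence of the address, or the number of occurrences."""
    text = f"{title}\n{body}"
    n = text.count(address) if address else 0
    return n if as_count else int(n > 0)

body = "Find it at 12 Lake Road. Parking is behind 12 Lake Road too."
print(address_feature("12 Lake Road", "Lunch by the lake", body))
print(address_feature("12 Lake Road", "Lunch by the lake", body, as_count=True))
```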
It should further be understood that the above process may also include: segmenting the ith candidate interest point to obtain at least one segmented word, determining the core word of the ith candidate interest point from the at least one segmented word, and taking the segmented words other than the core word as the non-core words of the ith candidate interest point. The core word may be identified in various ways, for example by matching against one or more preset core words; the possibilities are not listed exhaustively here.
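The split into core and non-core words, followed by counting in the title and the body, might look like the sketch below. It assumes whitespace segmentation and a hypothetical set of preset core words; Chinese text would need a proper word segmenter, and the interest point name and article text are invented.

```python
from typing import Iterable, List, Set, Tuple

def split_core(tokens: Iterable[str], preset_core: Set[str]) -> Tuple[List[str], List[str]]:
    """Tokens found in the preset list are core words; the rest are non-core words."""
    core: List[str] = []
    non_core: List[str] = []
    for tok in tokens:
        (core if tok in preset_core else non_core).append(tok)
    return core, non_core

def count_in(words: Iterable[str], text: str) -> int:
    """Total number of (non-overlapping) occurrences of the words in the text."""
    return sum(text.count(w) for w in words)

poi_name = "Lakeside Noodle House"
core, non_core = split_core(poi_name.split(), preset_core={"Lakeside"})

title = "A Lakeside food diary"
body = "Lakeside Noodle House serves Noodle soup near the House of Tea"
features = (
    count_in(core, title),      # core word in title
    count_in(non_core, title),  # non-core words in title
    count_in(core, body),       # core word in body
    count_in(non_core, body),   # non-core words in body
)
print(core, non_core, features)
```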
In this way, both the attribute features of the candidate interest points themselves and the features of the candidate interest points in relation to the target article are obtained, which provides richer feature dimensions for the subsequent relevance determination and improves the accuracy of the analysis of the relevance between the candidate interest points and the target article.
The determining, based on the relevant information of the N candidate interest points and/or the content of the target article, relevant features of M dimensions corresponding to the N candidate interest points, further includes:
determining the attribute characteristics of the target article based on the content of the target article, and adding the attribute characteristics of the target article to the M-dimensional related characteristics corresponding to the N candidate interest points respectively.
Specifically, the content of the target article may include a title of the target article and a body of the target article; correspondingly, the determining the attribute characteristics of the target article based on the content of the target article may specifically include:
determining the length of the text of the target article based on the text of the target article; wherein, the length may include at least one of a number of words, a number of lines, a number of pages, and the like;
determining a classification of the target article based on the title of the target article;
and determining whether the target article is matched with a preset keyword or not based on the title of the target article and/or the text of the target article and the preset keyword, and taking a matching result as keyword matching characteristic information of the target article.
Specifically, the length of the text of the target article may be represented by the feature information "Content_length", where "Content" denotes the text and "length" denotes the length; the length may be at least one of a number of words, a number of lines, and a number of pages.
The classification of the target article may be represented by the feature information "Topic_class", where "Topic" denotes the title and "class" denotes the classification.
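The keyword matching feature information of the target article may be represented as the feature information "Youji/gonglv", a name that appears to be the pinyin of two of the preset keywords, travel notes (youji) and travel guide (gonglüe).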
It should be further noted that the classification of the target article may be determined from its title by a preset classification model, or may be implemented with reference to a title classification model. For example, the title of the target article may be input into the classification model, and the result output by the classification model is taken as the classification. The specific training mode of the classification model is not limited in this embodiment.
The attribute features of the target article, namely the length of its body text, its classification, and its keyword matching feature information, may be used as feature information shared by all N candidate interest points contained in the target article. For example, the attribute features of the target article may be obtained first and then added to the M-dimensional related features corresponding to each of the N candidate interest points.
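The article-level attribute features described above can be sketched as follows. This is only an illustrative sketch: the `Article` type, the function names, the whitespace-based word count, and the integer class id returned by the `classify_title` callable are assumptions made for the example, while the feature names "Content_length", "Topic_class" and "Youji/gonglv" are taken from the description.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence


@dataclass
class Article:
    title: str
    body: str


def article_attribute_features(
    article: Article,
    preset_keywords: Sequence[str],
    classify_title: Callable[[str], int],
) -> Dict[str, int]:
    """Compute the three article-level attribute features named in the text."""
    searchable = article.title + "\n" + article.body
    return {
        # Assumption: length is measured in whitespace-separated words.
        "Content_length": len(article.body.split()),
        # Assumption: the (hypothetical) classifier returns an integer class id.
        "Topic_class": classify_title(article.title),
        # 1 if any preset keyword occurs in the title or body, else 0.
        "Youji/gonglv": int(any(k in searchable for k in preset_keywords)),
    }


def add_article_features(
    candidate_features: List[Dict[str, int]],
    article_features: Dict[str, int],
) -> List[Dict[str, int]]:
    """Add the same article features to every candidate's feature dict."""
    return [{**features, **article_features} for features in candidate_features]
```

Because the article features are identical for every candidate interest point of the same article, they only need to be computed once per article and can then be merged into each candidate's feature record.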
For example, the M-dimensional related features of each of the N candidate interest points may include the following 10:
the number of times that the core word of the candidate interest point appears in the title of the target article;
the number of times that the non-core words of the candidate interest point appear in the title of the target article;
the number of times that the core word of the candidate interest point appears in the body text of the target article;
the number of times that the non-core words of the candidate interest point appear in the body text of the target article;
whether address information corresponding to the candidate interest point appears in the target article;
classification information of the candidate interest point;
whether the candidate interest point contains branch information;
the classification of the target article;
the length of the body text of the target article;
whether the target article contains a preset keyword.
The above 10 dimensions of feature information are only an exemplary description. In actual processing, features of more dimensions may be used, or only some of the 10 dimensions may be used; the number of dimensions can be increased or decreased according to actual conditions, and this embodiment does not list them exhaustively.
Therefore, with the above processing, the attribute features of each candidate interest point and the features relating the candidate interest point to the target article can be obtained, and the attribute features of the target article can be added to the multi-dimensional feature information of the candidate interest point. This provides richer feature dimensions for the subsequent determination of the relevance and improves the accuracy of analyzing the relevance between the candidate interest points and the target article.
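As a rough illustration of the exemplary 10 dimensions, the sketch below counts core-word and non-core-word occurrences and then lays the features out in a fixed order. The key names in `FEATURE_ORDER`, the function names, and the use of plain substring counting are assumptions made for the example; how an interest point name is split into core and non-core words is taken as given.

```python
from typing import Dict, List, Sequence

# Hypothetical names for the 10 exemplary dimensions, in a fixed order.
FEATURE_ORDER = [
    "core_in_title", "non_core_in_title", "core_in_body", "non_core_in_body",
    "address_in_article", "poi_class", "has_branch",
    "article_class", "body_length", "keyword_match",
]


def occurrence_counts(
    core_word: str,
    non_core_words: Sequence[str],
    title: str,
    body: str,
) -> Dict[str, int]:
    """Count core / non-core word occurrences in the title and the body."""
    return {
        "core_in_title": title.count(core_word),
        "non_core_in_title": sum(title.count(w) for w in non_core_words),
        "core_in_body": body.count(core_word),
        "non_core_in_body": sum(body.count(w) for w in non_core_words),
    }


def to_vector(features: Dict[str, float]) -> List[float]:
    """Arrange a feature dict as a vector; raises KeyError if a dimension is missing."""
    return [float(features[name]) for name in FEATURE_ORDER]
```

Keeping the order of dimensions fixed matters because the preset model receives the features by position; if dimensions are added or dropped, only `FEATURE_ORDER` needs to change.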
The determining a target point of interest from the N candidate points of interest based on the relevance between the N candidate points of interest and the target article comprises:
selecting candidate interest points of which the correlation exceeds a correlation threshold value from the N candidate interest points; and taking the candidate interest points exceeding the relevance threshold as the target interest points.
Specifically, one or more correlation thresholds may be preset, for example, a correlation threshold of strong correlation, a correlation threshold of weak correlation; or only one correlation threshold may be set, etc.
In one example, a single correlation threshold is preset.
Correspondingly, one or more candidate interest points whose correlation exceeds the preset correlation threshold may be selected from the N candidate interest points. That is, all candidate interest points among the N candidate interest points whose correlation scores are greater than the preset correlation threshold are taken as target interest points. These target interest points are strongly related to the target article.
In this example, after the target interest points included in the target article are obtained, the strongly related target interest points in the map application are associated with the target article.
For example, suppose the target article contains 10 candidate interest points, namely interest point 1 to interest point 10. After correlation calculation is performed by the preset model, 10 correlation scores are obtained, one for each of interest points 1 to 10. Assuming that the preset correlation threshold is 1.4 and that only interest points 1, 3 and 4 have correlation scores greater than the threshold, interest points 1, 3 and 4 are taken as the target interest points. In the map application, interest points 1, 3 and 4 are each linked with the target article.
Subsequently, when the user opens the map application and wants to view articles related to interest point 1, the target article can be provided to the user, so that the user can directly click and view it.
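The single-threshold selection in this example can be sketched as below. The function name and the identifiers such as `poi_1` are hypothetical, and any scores passed in are illustrative values rather than results from the embodiment; only the threshold value 1.4 and the strict "greater than" comparison come from the description.

```python
from typing import Dict, List


def select_target_points(scores: Dict[str, float], threshold: float = 1.4) -> List[str]:
    """Return the candidates whose correlation score is strictly above the threshold."""
    return [poi for poi, score in scores.items() if score > threshold]
```

For instance, with illustrative scores of 2.1, 0.9, 1.7 and 1.5 for `poi_1` to `poi_4`, the function returns `poi_1`, `poi_3` and `poi_4`; a score exactly equal to the threshold is not selected.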
In another example, two correlation thresholds are preset, namely a correlation threshold with strong correlation and a correlation threshold with weak correlation; wherein the correlation threshold for strong correlations may be 1.4 and the correlation threshold for weak correlations may be 0.4. The specific values in the actual processing may be the same as or different from those described above, and it is within the protection scope of the present embodiment as long as the correlation threshold of the strong correlation is greater than the correlation threshold of the weak correlation.
Selecting candidate interest points of which the correlation exceeds a correlation threshold from the N candidate interest points may include:
taking the candidate interest points, among the N candidate interest points, whose correlation is greater than the correlation threshold of strong correlation as strongly correlated target interest points;
taking the candidate interest points, among the N candidate interest points, whose correlation is greater than the correlation threshold of weak correlation and less than or equal to the correlation threshold of strong correlation as weakly correlated target interest points;
and taking the candidate interest points, among the N candidate interest points, whose correlation is smaller than the correlation threshold of weak correlation as irrelevant interest points.
Correspondingly, taking the candidate interest points exceeding the correlation threshold as the target interest points may mean taking both the strongly correlated target interest points and the weakly correlated target interest points as the target interest points. That is, the target interest points include the strongly correlated target interest points and the weakly correlated target interest points.
In the map application, both the strongly correlated target interest points and the weakly correlated target interest points are associated, that is, linked, with the target article.
As an example, with reference to fig. 3, five candidate interest points are identified from the target article, namely interest point 1 to interest point 5; the 10-dimensional related features corresponding to each of interest points 1 to 5 are acquired; and after correlation calculation is performed by a preset model (which may be a GBDT model), the correlations (also referred to as correlation scores) corresponding to interest points 1 to 5 are obtained.
Assume that the correlation threshold of strong correlation is 1.4 and the correlation threshold of weak correlation is 0.4. If the correlations of interest points 1 and 3 are greater than the correlation threshold of strong correlation, interest points 1 and 3 are taken as strongly correlated target interest points; if the correlation of interest point 2 is greater than the correlation threshold of weak correlation and less than or equal to the correlation threshold of strong correlation, interest point 2 is taken as a weakly correlated target interest point. Interest points 1, 3 and 2 are then each associated with the target article in the map application; specifically, this association may be implemented by linking interest points 1, 3 and 2 with the target article in the map application.
Subsequently, when the user opens the map application and wants to view articles related to interest point 1, the target article can be provided to the user for direct viewing.
With the above scheme, multi-dimensional features can be extracted for the candidate interest points contained in the target article, relevance analysis can be performed on the multi-dimensional feature information of the candidate interest points, the candidate interest points whose relevance is higher than a relevance threshold are taken as the target interest points, and finally the target interest points are associated with the article in the map application. As a result, the articles linked to an interest point in the map application are highly relevant to that interest point, the accuracy of the association between interest points and text content in the map is improved, and the content coverage of the map application is enriched. A user who clicks an interest point to obtain related text content is spared a large amount of invalid information, which can further improve the click rate of the text content associated in the map application.
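The two-threshold grading can be sketched as follows. The enum and function names are hypothetical. The description does not say how a score exactly equal to the weak threshold is treated, so the sketch assumes it counts as irrelevant; the check that the strong threshold exceeds the weak threshold reflects the only constraint the description places on the two values.

```python
from enum import Enum


class Relevance(Enum):
    STRONG = "strong"
    WEAK = "weak"
    IRRELEVANT = "irrelevant"


def grade_relevance(score: float, strong: float = 1.4, weak: float = 0.4) -> Relevance:
    """Grade one correlation score against the strong and weak thresholds."""
    if strong <= weak:
        raise ValueError("strong threshold must be greater than weak threshold")
    if score > strong:
        return Relevance.STRONG
    if score > weak:  # weak < score <= strong
        return Relevance.WEAK
    # Assumption: a score exactly equal to the weak threshold is irrelevant.
    return Relevance.IRRELEVANT


def is_target(score: float, strong: float = 1.4, weak: float = 0.4) -> bool:
    """Both strongly and weakly correlated points count as target interest points."""
    return grade_relevance(score, strong, weak) is not Relevance.IRRELEVANT
```

Keeping the grade separate from the target decision lets a map application treat strongly and weakly correlated articles differently later, for example in ranking, while still linking both.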
According to an embodiment of the present application, there is also provided a point of interest processing apparatus, as shown in fig. 4, including:
the feature processing module 401 is configured to determine M-dimensional related features corresponding to N candidate interest points included in the target article, respectively; wherein N and M are integers greater than or equal to 1;
a relevance analysis module 402, configured to input, into a preset model, the M-dimensional relevant features corresponding to the N candidate interest points, respectively, to obtain relevance between the N candidate interest points output by the preset model and the target article;
an interest point associating module 403, configured to determine a target interest point from the N candidate interest points based on the correlations between the N candidate interest points and the target article, and associate the target interest point with the target article in a map application.
The feature processing module 401 is configured to identify the content of the target article, so as to obtain the N candidate interest points included in the target article; and determining M dimensionality relevant characteristics corresponding to the N candidate interest points respectively based on the relevant information of the N candidate interest points and/or the content of the target article.
The feature processing module 401 is configured to perform at least one of the following:
determining the attribute characteristics of the ith candidate interest point based on the relevant information of the ith candidate interest point in the N candidate interest points; i is an integer of 1 or more and N or less;
determining feature information of the ith candidate interest point in the N candidate interest points, which is related to the target article, based on the related information of the ith candidate interest point and the content of the target article.
The feature processing module 401 is configured to determine an attribute feature of the target article based on the content of the target article, and add the attribute feature of the target article to M-dimensional related features corresponding to the N candidate interest points, respectively.
The interest point associating module 403 is configured to select candidate interest points, of which the correlation exceeds a correlation threshold, from the N candidate interest points; and taking the candidate interest points exceeding the relevance threshold as the target interest points.
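How the three modules fit together can be sketched as a small pipeline. The class name, the callables standing in for feature extraction and for the preset model, and the in-memory dictionary standing in for the map application's association store are all assumptions made for illustration; in this sketch the candidate interest points are passed in rather than recognized from the article content.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Sequence

FeatureVector = Sequence[float]


class PointOfInterestPipeline:
    """Hypothetical wiring of modules 401 (features), 402 (relevance), 403 (association)."""

    def __init__(
        self,
        extract_features: Callable[[str, str], FeatureVector],
        score: Callable[[FeatureVector], float],
        threshold: float,
    ) -> None:
        self.extract_features = extract_features
        self.score = score
        self.threshold = threshold
        # Stand-in for the map application's store: POI -> linked article ids.
        self.articles_by_poi: Dict[str, List[str]] = defaultdict(list)

    def process(self, article_id: str, article_text: str, candidates: Sequence[str]) -> List[str]:
        # Module 401: M-dimensional related features per candidate.
        features = {poi: self.extract_features(poi, article_text) for poi in candidates}
        # Module 402: relevance of each candidate to the article.
        relevance = {poi: self.score(vec) for poi, vec in features.items()}
        # Module 403: keep candidates above the threshold and link them.
        targets = [poi for poi in candidates if relevance[poi] > self.threshold]
        for poi in targets:
            self.articles_by_poi[poi].append(article_id)
        return targets
```

A lookup such as `pipeline.articles_by_poi["West Lake"]` then plays the role of retrieving the articles linked to an interest point when a user views that point in the map application.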
According to an embodiment of the present application, an electronic device and a readable storage medium are also provided.
Fig. 5 is a block diagram of an electronic device according to an embodiment of the present application. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be examples only, and are not meant to limit implementations of the present application that are described and/or claimed herein.
As shown in fig. 5, the electronic device includes: one or more processors 701, a memory 702, and interfaces for connecting the various components, including a high-speed interface and a low-speed interface. The various components are interconnected using different buses and may be mounted on a common motherboard or in other manners as desired. The processor may process instructions for execution within the electronic device, including instructions stored in or on the memory to display graphical information of a GUI on an external input/output apparatus (such as a display device coupled to the interface). In other embodiments, multiple processors and/or multiple buses may be used, along with multiple memories, as desired. Also, multiple electronic devices may be connected, with each device providing portions of the necessary operations (e.g., as a server array, a group of blade servers, or a multi-processor system). In fig. 5, one processor 701 is taken as an example.
The memory 702 is a non-transitory computer readable storage medium as provided herein. Wherein the memory stores instructions executable by the at least one processor to cause the at least one processor to perform the point of interest processing methods provided herein. The non-transitory computer-readable storage medium of the present application stores computer instructions for causing a computer to perform the point-of-interest processing method provided by the present application.
The memory 702, which is a non-transitory computer readable storage medium, may be used to store non-transitory software programs, non-transitory computer executable programs, and modules, such as program instructions/modules (e.g., the feature processing module, the correlation analysis module, the interest point association module shown in fig. 4) corresponding to the interest point processing method in the embodiments of the present application. The processor 701 executes various functional applications of the server and data processing, i.e., implementing the point-of-interest processing method in the above-described method embodiments, by executing non-transitory software programs, instructions, and modules stored in the memory 702.
The memory 702 may include a storage program area and a storage data area, wherein the storage program area may store an operating system and an application program required for at least one function; the storage data area may store data created through use of the electronic device for point-of-interest processing, and the like. Further, the memory 702 may include high speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, the memory 702 may optionally include memory located remotely from the processor 701, which may be connected to the electronic device for point-of-interest processing over a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The electronic device for point-of-interest processing may further comprise: an input device 703 and an output device 704. The processor 701, the memory 702, the input device 703 and the output device 704 may be connected by a bus or other means, and fig. 5 illustrates an example of a connection by a bus.
The input device 703 may receive input numeric or character information and generate key signal inputs related to user settings and function control of the electronic device for point-of-interest processing; examples of the input device include a touch screen, keypad, mouse, track pad, touch pad, pointer stick, one or more mouse buttons, track ball, joystick, or other input device. The output device 704 may include a display device, auxiliary lighting devices (e.g., LEDs), and tactile feedback devices (e.g., vibrating motors), among others. The display device may include, but is not limited to, a Liquid Crystal Display (LCD), a Light Emitting Diode (LED) display, and a plasma display. In some implementations, the display device can be a touch screen.
Various implementations of the systems and techniques described here can be realized in digital electronic circuitry, integrated circuitry, ASICs (application-specific integrated circuits), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, coupled to receive data and instructions from, and to transmit data and instructions to, a storage system, at least one input device, and at least one output device.
These computer programs (also known as programs, software applications, or code) include machine instructions for a programmable processor, and may be implemented using high-level procedural and/or object-oriented programming languages, and/or assembly/machine languages. As used herein, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, apparatus, and/or device (e.g., magnetic disks, optical disks, memory, Programmable Logic Devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, also called a cloud computing server or a cloud host, which is a host product in a cloud computing service system and addresses the drawbacks of high management difficulty and weak service scalability found in traditional physical hosts and Virtual Private Server (VPS) services. The server may also be a server of a distributed system, or a server incorporating a blockchain.
According to the technical scheme of the embodiment of the application, the candidate interest points contained in the article can be subjected to multi-dimensional feature extraction, relevance analysis is performed on multi-dimensional feature information of the candidate interest points on the basis of the preset model, the target interest points strongly related to the article are finally determined, and the target interest points are associated with the article in map application. Therefore, the efficiency of calculating the relevance of the candidate interest points can be improved, and the problems that the development cost is too high and the large-scale processing cannot be carried out due to the fact that a large number of sentences are adopted for analysis in the prior art are solved. In addition, the accuracy of the association between the interest points and the text contents in the map application can be improved, the content coverage of the map application is enriched, and the click rate of the text contents related to the map application is also improved.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present application may be executed in parallel, sequentially, or in different orders, as long as the desired results of the technical solutions disclosed in the present application can be achieved, and the present invention is not limited herein.
The above-described embodiments should not be construed as limiting the scope of the present application. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present application shall be included in the protection scope of the present application.

Claims (12)

1. A method of interest point processing, comprising:
determining M dimensionality relevant features respectively corresponding to N candidate interest points contained in a target article; wherein N and M are integers greater than or equal to 1;
inputting the M dimensionality relevant features corresponding to the N candidate interest points into a preset model to obtain the relevance between the N candidate interest points output by the preset model and the target article;
determining a target interest point from the N candidate interest points based on the correlation between the N candidate interest points and the target article, and associating the target interest point with the target article in a map application.
2. The method of claim 1, wherein the determining M-dimensional relevant features corresponding to the N candidate points of interest contained in the target article comprises:
identifying the content of the target article to obtain the N candidate interest points contained in the target article;
and determining M dimensionality relevant characteristics corresponding to the N candidate interest points respectively based on the relevant information of the N candidate interest points and/or the content of the target article.
3. The method of claim 2, wherein the determining, based on the relevant information of the N candidate points of interest and/or the content of the target article, relevant features of M dimensions corresponding to the N candidate points of interest respectively comprises at least one of:
determining the attribute characteristics of the ith candidate interest point based on the relevant information of the ith candidate interest point in the N candidate interest points; i is an integer of 1 or more and N or less;
determining feature information of the ith candidate interest point in the N candidate interest points, which is related to the target article, based on the related information of the ith candidate interest point and the content of the target article.
4. The method of claim 3, wherein the determining M-dimensional relevant features corresponding to the N candidate points of interest based on the relevant information of the N candidate points of interest and/or the content of the target article further comprises:
determining the attribute characteristics of the target article based on the content of the target article, and adding the attribute characteristics of the target article to the M-dimensional related characteristics corresponding to the N candidate interest points respectively.
5. The method of any of claims 1-4, wherein the determining a target point of interest from the N candidate points of interest based on the relevance between the N candidate points of interest and the target article comprises:
selecting candidate interest points of which the correlation exceeds a correlation threshold value from the N candidate interest points; and taking the candidate interest points exceeding the relevance threshold as the target interest points.
6. A point of interest processing apparatus, comprising:
the characteristic processing module is used for determining M dimensionality relevant characteristics corresponding to N candidate interest points contained in the target article; wherein N and M are integers greater than or equal to 1;
the relevance analysis module is used for inputting the M-dimension relevant features corresponding to the N candidate interest points into a preset model to obtain the relevance between the N candidate interest points output by the preset model and the target article;
and the interest point association module is used for determining a target interest point from the N candidate interest points based on the relevance between the N candidate interest points and the target article, and associating the target interest point with the target article in a map application.
7. The apparatus of claim 6, wherein the feature processing module is configured to identify content of the target article, and obtain the N candidate points of interest included in the target article; and determining M dimensionality relevant characteristics corresponding to the N candidate interest points respectively based on the relevant information of the N candidate interest points and/or the content of the target article.
8. The apparatus of claim 7, wherein the feature processing module is configured to perform at least one of:
determining the attribute characteristics of the ith candidate interest point based on the relevant information of the ith candidate interest point in the N candidate interest points; i is an integer of 1 or more and N or less;
determining feature information of the ith candidate interest point in the N candidate interest points, which is related to the target article, based on the related information of the ith candidate interest point and the content of the target article.
9. The apparatus of claim 8, wherein the feature processing module is configured to determine an attribute feature of the target article based on the content of the target article, and add the attribute feature of the target article to M-dimensional related features corresponding to the N candidate points of interest respectively.
10. The apparatus according to any one of claims 6 to 9, wherein the interest point associating module is configured to select, from the N candidate interest points, a candidate interest point whose relevance exceeds a relevance threshold; and taking the candidate interest points exceeding the relevance threshold as the target interest points.
11. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-5.
12. A non-transitory computer readable storage medium having stored thereon computer instructions for causing the computer to perform the method of any one of claims 1-5.
CN202011293631.9A 2020-11-18 2020-11-18 Point-of-interest processing method and device, electronic equipment and storage medium Active CN112380847B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011293631.9A CN112380847B (en) 2020-11-18 2020-11-18 Point-of-interest processing method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011293631.9A CN112380847B (en) 2020-11-18 2020-11-18 Point-of-interest processing method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN112380847A true CN112380847A (en) 2021-02-19
CN112380847B CN112380847B (en) 2024-03-29

Family

ID=74585097

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011293631.9A Active CN112380847B (en) 2020-11-18 2020-11-18 Point-of-interest processing method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112380847B (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113553421A (en) * 2021-06-22 2021-10-26 北京百度网讯科技有限公司 Comment text generation method and device, electronic equipment and storage medium
CN113792230A (en) * 2021-08-24 2021-12-14 北京百度网讯科技有限公司 Service linking method and device, electronic equipment and storage medium
CN114896967A (en) * 2022-06-06 2022-08-12 山东浪潮爱购云链信息科技有限公司 Processing method, equipment and storage medium for forum problems in purchasing platform
CN115396813A (en) * 2021-05-24 2022-11-25 北京三快在线科技有限公司 Interest point portrait generation method and system, electronic device and storage medium

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150187107A1 (en) * 2012-07-18 2015-07-02 Google Inc. Highlighting related points of interest in a geographical region
US20180143998A1 (en) * 2016-11-21 2018-05-24 Google Inc. Electronic map interfaces
CN110119511A (en) * 2019-05-17 2019-08-13 网易传媒科技(北京)有限公司 Prediction technique, medium, device and the calculating equipment of article hot spot score
CN111506675A (en) * 2019-01-11 2020-08-07 阿里巴巴集团控股有限公司 Method, apparatus, device and medium for determining points of interest
CN111767359A (en) * 2020-06-30 2020-10-13 北京百度网讯科技有限公司 Interest point classification method, device, equipment and storage medium
CN111782977A (en) * 2020-06-29 2020-10-16 北京百度网讯科技有限公司 Interest point processing method, device, equipment and computer readable storage medium
CN111831935A (en) * 2019-09-17 2020-10-27 北京嘀嘀无限科技发展有限公司 Interest point ordering method and device, electronic equipment and storage medium
CN111949820A (en) * 2020-06-24 2020-11-17 北京百度网讯科技有限公司 Video associated interest point processing method and device and electronic equipment

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115396813A (en) * 2021-05-24 2022-11-25 北京三快在线科技有限公司 Interest point portrait generation method and system, electronic device and storage medium
CN113553421A (en) * 2021-06-22 2021-10-26 北京百度网讯科技有限公司 Comment text generation method and device, electronic equipment and storage medium
CN113553421B (en) * 2021-06-22 2023-05-05 北京百度网讯科技有限公司 Comment text generation method and device, electronic equipment and storage medium
CN113792230A (en) * 2021-08-24 2021-12-14 北京百度网讯科技有限公司 Service linking method and device, electronic equipment and storage medium
CN113792230B (en) * 2021-08-24 2024-04-09 北京百度网讯科技有限公司 Service linking method, device, electronic equipment and storage medium
CN114896967A (en) * 2022-06-06 2022-08-12 山东浪潮爱购云链信息科技有限公司 Processing method, equipment and storage medium for forum problems in purchasing platform
CN114896967B (en) * 2022-06-06 2024-01-19 山东浪潮爱购云链信息科技有限公司 Method, equipment and storage medium for processing forum problem in purchasing platform

Also Published As

Publication number Publication date
CN112380847B (en) 2024-03-29

Similar Documents

Publication Publication Date Title
CN112560912B (en) Classification model training method and device, electronic equipment and storage medium
CN111221983B (en) Time sequence knowledge graph generation method, device, equipment and medium
CN111709247B (en) Data set processing method and device, electronic equipment and storage medium
CN111125435B (en) Video tag determination method and device and computer equipment
CN111967262A (en) Method and device for determining entity tag
CN112380847A (en) Interest point processing method and device, electronic equipment and storage medium
CN112507068A (en) Document query method and device, electronic equipment and storage medium
CN111522967A (en) Knowledge graph construction method, device, equipment and storage medium
CN111858905B (en) Model training method, information identification device, electronic equipment and storage medium
CN111078878A (en) Text processing method, device and equipment and computer readable storage medium
CN111522944A (en) Method, apparatus, device and storage medium for outputting information
CN111241810A (en) Punctuation prediction method and device
CN113220835A (en) Text information processing method and device, electronic equipment and storage medium
CN111090991A (en) Scene error correction method and device, electronic equipment and storage medium
CN112989235A (en) Knowledge base-based internal link construction method, device, equipment and storage medium
CN114244795B (en) Information pushing method, device, equipment and medium
CN111563198A (en) Material recall method, device, equipment and storage medium
CN113516491A (en) Promotion information display method and device, electronic equipment and storage medium
CN112699237B (en) Label determination method, device and storage medium
CN111832313A (en) Method, device, equipment and medium for generating emotion collocation set in text
CN111666417A (en) Method and device for generating synonyms, electronic equipment and readable storage medium
CN112101012B (en) Interactive domain determining method and device, electronic equipment and storage medium
CN111125445A (en) Community theme generation method and device, electronic equipment and storage medium
CN114201607B (en) Information processing method and device
CN115688802A (en) Text risk detection method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant