CN110232156B - Information recommendation method and device based on long text - Google Patents

Publication number
CN110232156B
Authority
CN
China
Prior art keywords
text
short
participles
recommendation information
long
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910473094.7A
Other languages
Chinese (zh)
Other versions
CN110232156A (en)
Inventor
王卓然
亓超
马宇驰
陈华荣
温泉
范彦革
梁伟
岳媛媛
刁德纯
曹圣明
李宇舰
王东亮
赵巍
林梓悦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN201910473094.7A
Publication of CN110232156A
Application granted
Publication of CN110232156B
Active (current legal status)
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/953Querying, e.g. by the use of web search engines
    • G06F16/9535Search customisation based on user profiles and personalisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/289Phrasal analysis, e.g. finite state techniques or chunking
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/289Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295Named entity recognition
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/30Semantic analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiment of the invention discloses a long text-based information recommendation method, which comprises the following steps: performing intention recognition on a long text to obtain a plurality of short texts, wherein the short texts are texts expanded based on the participles in the long text; performing intention recognition on each short text to obtain recommendation information of each short text; acquiring, from the recommendation information of each short text, the recommendation information associated with the participles in the long text; and recommending the recommendation information associated with the participles in the long text to a user. Because the recommendation information recommended to the user is selected from the recommendation information of each short text according to the participles in the long text, the amount of recommendation information can be reduced, which makes searching easier for the user; and because each short text is obtained by performing intention recognition on the long text, the recommended information can be more accurate.

Description

Information recommendation method and device based on long text
Technical Field
The embodiment of the invention relates to the technical field of information processing, in particular to a long text-based information recommendation method and device.
Background
A long text is a text that contains multiple participles (i.e., words that carry actual meaning). In the prior art, when information needs to be recommended to a user according to a long text, the long text is generally segmented, and hot content, or recommended content derived from analysis of the user's preferences, is obtained for each participle. Because a long text contains many participles, a large amount of content is recommended, and the user has to search through a large amount of information to find the content of interest, which is very inconvenient.
Disclosure of Invention
In view of this, embodiments of the present invention provide a method and an apparatus for recommending information based on a long text, which can reduce the number of information recommended to a user and make the recommended information more accurate.
In order to solve the above problems, embodiments of the present invention mainly provide the following technical solutions:
in a first aspect, an embodiment of the present invention provides a method for recommending information based on a long text, where the method includes: performing intention recognition on a long text to obtain a plurality of short texts, wherein the short texts are texts expanded based on word segmentation in the long text; identifying intentions of each short text to obtain recommendation information of each short text; acquiring recommendation information associated with the word segmentation in the long text from the recommendation information of each short text; recommending information associated with the participles in the long text to a user.
In a second aspect, an embodiment of the present invention further provides an information recommendation apparatus based on a long text, where the apparatus includes: the acquisition module is used for performing intention recognition on a long text to obtain a plurality of short texts, wherein the short texts are texts expanded based on word segmentation in the long text; the identification module is used for identifying the intention of each short text to obtain the recommendation information of each short text; the determining module is used for acquiring recommendation information associated with the participles in the long text from the recommendation information of each short text; and the recommending module is used for recommending the recommending information associated with the participles in the long text to the user.
In a third aspect, an embodiment of the present invention provides an electronic device, including: at least one processor; and at least one memory, bus connected with the processor; the processor and the memory complete mutual communication through the bus; the processor is configured to call the program instructions in the memory to perform the method in one or more of the above-mentioned embodiments.
In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium, where the storage medium includes a stored program, and when the program runs, a device in which the storage medium is located is controlled to perform a method in one or more of the above technical solutions.
The information recommendation method and device based on the long text provided by the embodiment of the invention have the advantages that firstly, intention recognition is carried out on the long text to obtain a plurality of short texts, wherein the short texts are texts expanded based on word segmentation in the long text; then, performing intention identification on each short text to obtain recommendation information of each short text; then, acquiring recommendation information associated with the word segmentation in the long text from the recommendation information of each short text; and finally, recommending information associated with the word segmentation in the long text to the user. Because the recommendation information recommended to the user is obtained from the recommendation information of each short text according to the word segmentation in the long text, the number of recommendation information can be reduced so as to be convenient for the user to search, and because each short text is obtained by performing intention recognition on the long text, the recommended information can be more accurate.
The foregoing is only an overview of the technical solutions of the embodiments of the present invention. So that the technical means of the embodiments can be understood more clearly and implemented according to the content of the description, and so that the foregoing and other objects, features, and advantages of the embodiments are easier to understand, the detailed description of the embodiments of the present invention is provided below.
Drawings
Various additional advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the embodiments of the invention. Also, like reference numerals are used to refer to like parts throughout the drawings. In the drawings:
Fig. 1 is a first schematic flowchart of a long text-based information recommendation method in an embodiment of the present invention;
Fig. 2 is a second schematic flowchart of a long text-based information recommendation method in an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of a long text-based information recommendation apparatus in an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of an electronic device in an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited by the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
In practical application, when a user chats on an electronic device using instant messaging software such as SMS or WeChat, or browses news in a webpage or an application, recommendation information sometimes needs to be obtained based on a piece of text in the chat interface or the webpage. In particular, when the text selected by the user is a long text, the method can avoid recommending a large amount of information to the user and achieve accurate recommendation of information.
The information recommendation method based on the long text provided by the embodiment of the invention is explained in detail below.
Fig. 1 is a first flowchart illustrating a long text-based information recommendation method in an embodiment of the present invention, and referring to fig. 1, the method may include:
S101: Perform intention recognition on the long text to obtain a plurality of short texts.
Wherein, the short text is a text expanded based on the word segmentation in the long text.
Performing intention recognition on a long text means recognizing what the long text means, that is, what further information the user may want to know after seeing a text such as a news headline. A long text may have multiple meanings, each of which is represented by one short text.
For example, for the long text "two kings are covered in korea of the 1500 mi fanoku novelties of the short track world championship", intention recognition may yield a plurality of short texts, for example "short track world championship gold medal list", "short track world championship korean celebration", and "fangkong gold medal". These three short texts are only examples; other short texts may also be obtained by performing intention recognition on the long text, which is not limited herein.
S102: Perform intention recognition on each short text to obtain the recommendation information of each short text.
After the plurality of short texts is obtained, intention recognition is performed on each short text, that is, the intention of each short text is recognized. For example, if the obtained short text is "national trade building", intention recognition shows that the short text has an address-based intention. For another example, if the obtained short text is "nine am tomorrow", intention recognition shows that the short text has a time-based intention. For another example, if the obtained short text is "gold medal list", intention recognition shows that the intention of the short text is to obtain recommendations based on the number of gold medals.
After the intention of each short text is recognized, depending on the intention, information is either recommended to the user directly based on the intention of the short text, or a search is performed based on the intention of the short text and the search result is recommended to the user. For example, the short text "nine am tomorrow" is recognized as having a time-based intention; the user may need to make a note, so a notepad is used as the recommendation information. For another example, the short text "gold medal list" is recognized as having an intention to recommend based on the number of gold medals, so "gold medal list" is input into a search engine, and the gold medal standings of various events returned by the search are used as the recommendation information. Here, the search engine may be Baidu, Google, Bing, or another search engine, which is not limited herein.
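The choice between recommending directly and searching can be sketched as a small dispatch function. This is only an illustrative sketch: the intent labels, the `DIRECT_RECOMMENDATIONS` table, and the `fake_search` stand-in are assumptions introduced here, not names from the patent, and a real system would call an actual search engine.

```python
from typing import Callable, List

# Hypothetical table of intents that can be served without a search.
# The labels and targets are illustrative assumptions.
DIRECT_RECOMMENDATIONS = {
    "time": "notepad",
    "address": "map",
}


def recommend_for_short_text(
    short_text: str,
    intent: str,
    search: Callable[[str], List[str]],
) -> List[str]:
    """Return the recommendation information for one short text."""
    if intent in DIRECT_RECOMMENDATIONS:
        # The intent alone determines the recommendation, so no search is run.
        return [DIRECT_RECOMMENDATIONS[intent]]
    # Otherwise the short text is used as the search query.
    return search(short_text)


def fake_search(query: str) -> List[str]:
    """Stand-in for a real search engine call (assumption)."""
    return [f"{query}: result {i}" for i in range(1, 4)]


print(recommend_for_short_text("nine am tomorrow", "time", fake_search))
print(recommend_for_short_text("gold medal list", "ranking", fake_search))
```

Passing the search function as a parameter keeps the sketch independent of any particular search engine.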
S103: Acquire, from the recommendation information of each short text, the recommendation information associated with the participles in the long text.
Because each short text may have a large amount of recommendation information, which makes searching inconvenient for the user, the long text is consulted again to screen out, from the recommendation information of each short text, the recommendation information associated with the participles in the long text.
For example, the short text "gold medal list" yields a large amount of recommendation information, such as the Olympic Games gold medal list, the Asian Games gold medal list, and so on. By combining it with the long text "two kings are covered in korea of the 1500 mi fanoku novelties of the short track world championship", the short track world championship gold medal list can be screened out from the numerous gold medal lists, which reduces the amount of recommendation information and makes searching easier for the user.
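One possible way to screen recommendation information against the long text is sketched below. The patent does not specify how the association is computed; case-insensitive substring matching and the function name `filter_by_long_text` are assumptions made purely for illustration.

```python
from typing import List


def filter_by_long_text(
    recommendations: List[str],
    long_text_participles: List[str],
    short_text: str,
) -> List[str]:
    """Keep recommendations that mention a long-text participle
    which is not already part of the short text."""
    short_lower = short_text.lower()
    # Participles already contained in the short text cannot narrow the results.
    context = [p.lower() for p in long_text_participles if p.lower() not in short_lower]
    return [r for r in recommendations if any(p in r.lower() for p in context)]


recommendations = [
    "Olympic Games gold medal list",
    "Asian Games gold medal list",
    "short track world championship gold medal list",
]
participles = ["short track world championship", "1500 m", "gold medal list"]

print(filter_by_long_text(recommendations, participles, "gold medal list"))
```

In this toy run only the entry that mentions "short track world championship" survives, mirroring the gold medal list example above.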
S104: Recommend the recommendation information associated with the participles in the long text to the user.
In practical application, recommendation information can be inserted into the lower half part of the interface currently used by the user, so that the use of the current interface by the user is not influenced, and the information can be recommended to the user based on the long text selected by the user.
As can be seen from the above, in the information recommendation method based on the long text provided in the embodiment of the present invention, first, intent recognition is performed on the long text to obtain a plurality of short texts, where the short texts are texts expanded based on word segmentation in the long text; then, performing intention identification on each short text to obtain recommendation information of each short text; then, acquiring recommendation information associated with the word segmentation in the long text from the recommendation information of each short text; and finally, recommending recommendation information associated with the participles in the long text to the user. Because the recommendation information recommended to the user is obtained from the recommendation information of each short text according to the word segmentation in the long text, the number of recommendation information can be reduced so as to be convenient for the user to search, and because each short text is obtained by performing intention recognition on the long text, the recommended information can be more accurate.
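The four steps S101 to S104 can be sketched as one small orchestration function. The three callables (`expand`, `recognize`, `screen`) and the toy stand-ins below are hypothetical placeholders for the components described in the text; this is a sketch of the control flow only, not the patented implementation.

```python
from typing import Callable, List


def recommend_from_long_text(
    long_text: str,
    expand: Callable[[str], List[str]],
    recognize: Callable[[str], List[str]],
    screen: Callable[[str, List[str]], List[str]],
) -> List[str]:
    """Run the S101-S104 flow and return the information to recommend."""
    selected: List[str] = []
    for short_text in expand(long_text):            # S101: long text -> short texts
        candidates = recognize(short_text)          # S102: short text -> recommendations
        for item in screen(long_text, candidates):  # S103: keep items tied to the long text
            if item not in selected:                # avoid recommending duplicates
                selected.append(item)
    return selected                                 # S104: recommend to the user


# Toy stand-ins (assumptions) so the sketch runs end to end.
def toy_expand(long_text: str) -> List[str]:
    return ["gold medal list"]


def toy_recognize(short_text: str) -> List[str]:
    return [
        "Olympic Games gold medal list",
        "short track world championship gold medal list",
    ]


def toy_screen(long_text: str, candidates: List[str]) -> List[str]:
    return [c for c in candidates if "short track" in c and "short track" in long_text]


print(recommend_from_long_text(
    "short track world championship 1500 m", toy_expand, toy_recognize, toy_screen
))
```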
Further, as a refinement and extension of the method shown in fig. 1, the embodiment of the present invention further provides a long text-based information recommendation method. Fig. 2 is a schematic flowchart of a second method for recommending information based on a long text in an embodiment of the present invention, and as shown in fig. 2, the method may include:
S201: Acquire a plurality of participles in the long text.
Because a plurality of participles exist in the long text and the content recommended to the user is obtained based on the participles in the long text, after the long text selected by the user is determined, the plurality of participles in the long text need to be obtained first.
In practical application, the participles in the long text may be obtained by word segmentation or by other methods, which is not specifically limited herein.
After a plurality of segmented words in the long text are obtained, a plurality of short texts need to be generated based on the plurality of segmented words. In a specific implementation process, a plurality of short texts corresponding to the long text may be obtained by using any one of the methods S202 and S203.
S202: Generate one short text based on each participle, and take all of the generated short texts as the plurality of short texts corresponding to the long text.
Specifically, intention recognition is performed on each participle in the long text to obtain a short text corresponding to each participle, and the short texts corresponding to the participles form the plurality of short texts corresponding to the long text.
For example, suppose that the two participles "short track world championship" and "two kings" are obtained from a long text. Intention recognition is performed on each of them: the short text corresponding to the participle "short track world championship" is "short track world championship", and the short text corresponding to the participle "two kings" is "gold medal list". The two short texts "short track world championship" and "gold medal list" are the plurality of short texts corresponding to the long text.
S203: Generate one short text based on at least two participles, and take all of the generated short texts as the plurality of short texts corresponding to the long text.
Here, depending on the number of participles in the long text, one short text may be generated from two participles, or from three participles, or some short texts may be generated from two participles and others from three participles, which is not limited herein. It should be noted that two and three are merely examples; four, five, six, or more participles may also be used, as determined by the actual number of participles in the long text and the actual situation.
For example, assume that there are five participles in a long text: participle A, participle B, participle C, participle D, and participle E. One way to generate multiple short texts is: generating a short text a based on participle A and participle B, generating a short text b based on participle B and participle C, generating a short text c based on participle A and participle C, and generating a short text d based on participle A, participle B, and participle C. Thus, based on at least two of the participles in the long text, four short texts are generated, namely short text a, short text b, short text c, and short text d.
Here, it should be noted that, in the process of generating short texts based on at least two participles in the long text, a participle in the long text may be used only once or multiple times, which is not limited herein. To obtain more short texts, so that the recommendation information of the short texts is more comprehensive and the recommended information is more accurate, the participles in the long text may be merged in the manner of combinations: two at a time, three at a time, and so on, until all of the participles are merged together to generate one short text.
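Enumerating the merged participles in the manner of combinations can be sketched with the standard library. The function name `merge_participles` and the `min_size` parameter are illustrative assumptions; the text only requires that at least two participles be merged.

```python
from itertools import combinations
from typing import List, Tuple


def merge_participles(participles: List[str], min_size: int = 2) -> List[Tuple[str, ...]]:
    """Return every combination of at least `min_size` participles,
    from pairs up to the combination of all participles."""
    merged: List[Tuple[str, ...]] = []
    for size in range(min_size, len(participles) + 1):
        merged.extend(combinations(participles, size))
    return merged


for candidate in merge_participles(["A", "B", "C"]):
    print(candidate)
```

For five participles this produces 10 + 10 + 5 + 1 = 26 merged candidates, which is one reason the weight-based pruning described in the text is useful.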
When at least two participles in the long text are merged to obtain a plurality of merged participles, some merged participles have no corresponding recommendation information. For example: the four participles of the short track world championship, the standard championship and the golden board are combined, and the standard championship does not lose money in the short track world championship of 1500 m, so that the four participles do not have corresponding recommendation information after being combined. To avoid searching with merged participles that have no corresponding recommendation information, and thereby improve recommendation efficiency, a weight value may be calculated for each merged participle. The smaller the weight value, the less information corresponds to the merged participle; when the weight value of a merged participle is smaller than a preset threshold, the merged participle is considered to have no corresponding information. Therefore, after the weight values are calculated, merged participles whose weight value is smaller than the preset threshold are deleted and those whose weight value is greater than or equal to the preset threshold are retained. The retained merged participles are used as the plurality of short texts corresponding to the long text, or they are expanded into short texts and the expanded short texts are used as the plurality of short texts corresponding to the long text.
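The threshold-based pruning can be sketched as a simple filter. The patent does not say how the weight value is calculated, so the `weight_of` callable, the `TOY_WEIGHTS` table, and the threshold of 0.5 are all hypothetical values chosen only to make the sketch runnable.

```python
from typing import Callable, List, Tuple

Merged = Tuple[str, ...]


def prune_merged_participles(
    merged: List[Merged],
    weight_of: Callable[[Merged], float],
    threshold: float,
) -> List[Merged]:
    """Keep merged participles whose weight is at least the preset threshold."""
    return [m for m in merged if weight_of(m) >= threshold]


# Hypothetical weights; a real system would compute them from data.
TOY_WEIGHTS = {
    ("short track world championship", "gold medal list"): 0.9,
    ("short track world championship", "1500 m"): 0.6,
    ("1500 m", "gold medal list"): 0.1,
}


def toy_weight(merged: Merged) -> float:
    # Unknown combinations are treated as having no corresponding information.
    return TOY_WEIGHTS.get(merged, 0.0)


kept = prune_merged_participles(list(TOY_WEIGHTS), toy_weight, threshold=0.5)
print(kept)
```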
Merging at least two participles in the long text and generating a short text based on the merged participles makes the intention of the short text clearer, reduces the amount of recommendation information of the short text, improves the efficiency of screening that recommendation information based on the long text, and ultimately improves the efficiency of information recommendation. For example, merging the two participles "short track world championship" and "two kings" yields the short text "short track world championship gold medal list". The short text "gold medal list" would yield recommendation information such as the Olympic Games gold medal list and the gold medal lists of other world championships, whereas the short text generated from the merged participles yields only the short track world championship gold medal list, so the amount of recommendation information of the short text is reduced.
Thus, a plurality of short texts can be obtained by either one of the methods S202 or S203.
S204: Perform intention recognition on each short text through one or more of word segmentation processing, named entity recognition, and semantic analysis to obtain the recommendation information of each short text.
In the above embodiment, the specific process of performing intent recognition on each short text to obtain recommendation information of each short text has been described, and thus is not described herein again. Next, description will be mainly made on how to perform intent recognition on each short text.
Taking the intention recognition of a short text as an example, in a specific implementation process, the intention recognition of the short text can be performed in any one of the following two ways. Of course, the method may be implemented in other ways, and is not limited in particular.
The first mode is as follows: word segmentation processing is first performed on the short text to obtain word segmentation results, named entity recognition is then performed based on the word segmentation results, and the intention of the short text is thereby recognized.
For example, for the short text "go to national trade building 302 for a meeting at 3 pm tomorrow", word segmentation yields the results "tomorrow", "afternoon", "3 o'clock", "to", "national trade building", "302", and "meeting". Named entity recognition on these results then yields a time-based named entity, "3 pm tomorrow", a location-based named entity, "national trade building 302", and an event-based named entity, "meeting", from which it is recognized that a notepad and a map need to be recommended. Named entity recognition is abbreviated as NER.
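The first mode can be illustrated with a toy lexicon lookup over already-segmented tokens. Production systems use trained segmentation and NER models; the `ENTITY_TYPES` lexicon, the `TOOLS` mapping, and the token spellings below are assumptions that paraphrase the meeting example for illustration only.

```python
from typing import Dict, List

# Hypothetical lexicon mapping tokens to entity types.
ENTITY_TYPES = {
    "tomorrow": "time",
    "afternoon": "time",
    "3 o'clock": "time",
    "national trade building": "location",
    "302": "location",
    "meeting": "event",
}

# Hypothetical mapping from entity types to what should be recommended.
TOOLS = {"time": "notepad", "event": "notepad", "location": "map"}


def recognize_entities(tokens: List[str]) -> Dict[str, List[str]]:
    """Group the tokens found in the lexicon by entity type."""
    entities: Dict[str, List[str]] = {}
    for token in tokens:
        entity_type = ENTITY_TYPES.get(token)
        if entity_type is not None:
            entities.setdefault(entity_type, []).append(token)
    return entities


def tools_for(entities: Dict[str, List[str]]) -> List[str]:
    """Return the distinct recommendations implied by the entity types."""
    return sorted({TOOLS[t] for t in entities if t in TOOLS})


tokens = ["tomorrow", "afternoon", "3 o'clock", "to", "national trade building", "302", "meeting"]
entities = recognize_entities(tokens)
print(entities)
print(tools_for(entities))
```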
The second mode is as follows: semantic analysis is performed on the short text to recognize the intention of the short text.
For example: for the short text of 'I want to go to a national trade building', a map needing to be recommended to go to the national trade building can be identified through semantic analysis.
Here, it should be noted that, after the intention of the short text is recognized, if a recommendation can be made to the user directly based on the intention, no search is needed. For example, if the intention is a time-based intention, a notepad is recommended to the user as the recommendation information. If a recommendation cannot be made directly based on the intention, the short text is input into a search engine based on the intention, and the search result is used as the recommendation information. For example, if the intention is to recommend based on the number of gold medals, "gold medal list" is input into a search engine based on the intention, and the returned information on gold medal counts is recommended to the user as the recommendation information.
Through any one of the two modes, the intention recognition can be carried out on the short text, and the recommendation information of the short text can be obtained.
S205: Determine the number of recommendation information of each short text.
Each piece of recommendation information counts as 1, and the number of recommendation information of a short text is the total number of pieces of recommendation information obtained for that short text.
S206: If the number of recommendation information of each short text is greater than or equal to a first preset number, screen the recommendation information of each short text based on the participles in the long text to obtain the recommendation information associated with the participles in the long text.
Here, the long text corresponds to a plurality of short texts, and if the number of recommendation information of a certain short text is greater than or equal to the first preset number, it is indicated that the recommendation information of the short text is more, which is not beneficial for the user to search.
For example, the short text "gold medal list" has many pieces of recommendation information, including the Olympic Games gold medal list, world championship gold medal lists, the Asian Games gold medal list, and the gold medal lists of various domestic sports events, so the number of pieces of recommendation information easily exceeds the first preset number.
It should be noted here that, when the recommendation information of a short text is screened based on the participles in the long text, to make the screening effective it must be ensured that the participles in the long text and the participles in the short text differ by two participles.
Further, it should be noted that: the first preset number may be set according to actual requirements, and may be two or three or four, and is not specifically limited herein.
Further, the following case may arise: the number of recommendation information of each short text is smaller than the first preset number, but there are many short texts, so the total amount of recommendation information is still large. In this case, it is still necessary to screen the recommendation information of the plurality of short texts according to the participles in the long text. Specifically, the number of short texts may be determined first, and if the number of short texts is greater than or equal to a second preset number, the recommendation information of the plurality of short texts is screened according to the participles in the long text, so as to further avoid recommending too much information to the user. Here, the second preset number may be the same as or different from the first preset number, and is not limited herein.
Finally, if the number of the short texts is smaller than the second preset number, it indicates that the number of the recommendation information of the short texts is not large, so that the recommendation information of the short texts can be directly recommended to the user without screening, and the information recommendation efficiency can be further improved.
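The two thresholds can be combined into one selection routine, sketched below under one reading of the text: a short text's recommendation information is screened when it has at least `first_preset` items, or when there are at least `second_preset` short texts overall; otherwise it is passed through unchanged. The function names and the toy `keep_short_track` screen are assumptions for illustration.

```python
from typing import Callable, Dict, List


def select_recommendations(
    per_short_text: Dict[str, List[str]],
    first_preset: int,
    second_preset: int,
    screen: Callable[[str, List[str]], List[str]],
) -> List[str]:
    """Apply the S205/S206 checks and return what is recommended in S207."""
    many_short_texts = len(per_short_text) >= second_preset
    selected: List[str] = []
    for short_text, items in per_short_text.items():
        # S205: each piece of recommendation information counts as 1.
        if len(items) >= first_preset or many_short_texts:
            items = screen(short_text, items)  # S206: screen using the long text
        selected.extend(items)
    return selected


def keep_short_track(short_text: str, items: List[str]) -> List[str]:
    """Toy screen (assumption): keep items that mention 'short track'."""
    return [item for item in items if "short track" in item]


per_short_text = {
    "gold medal list": [
        "Olympic Games gold medal list",
        "Asian Games gold medal list",
        "short track world championship gold medal list",
    ],
    "nine am tomorrow": ["notepad"],
}

print(select_recommendations(per_short_text, 3, 5, keep_short_track))
print(select_recommendations(per_short_text, 3, 2, keep_short_track))
```

In the first call only the oversized list is screened; in the second call the number of short texts reaches the second preset number, so every list is screened.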
S207: Recommend the recommendation information associated with the participles in the long text to the user.
At this point, the whole information recommendation process is completed.
As can be seen from the above, in the information recommendation method based on the long text provided in the embodiment of the present invention, first, a plurality of segmented words in the long text are obtained; then, generating a short text based on each word segmentation, and taking all the finally generated short texts as a plurality of short texts corresponding to the long text; or generating a short text based on at least two word segmentations, and taking all the finally generated short texts as a plurality of short texts corresponding to the long text; then, performing intention recognition on each short text to obtain a semantic fragment of each short text; searching based on the semantic fragments of each short text to obtain the search result of each short text, and taking the search result of each short text as the recommendation information of each short text; then, determining the quantity of the recommendation information of each short text; if the number of the recommendation information of each short text is larger than or equal to a first preset number, screening the recommendation information of each short text based on the word segmentation in the long text to obtain recommendation information associated with the word segmentation in the long text; and finally, recommending information associated with the word segmentation in the long text to the user. Because the recommendation information recommended to the user is obtained from the recommendation information of each short text according to the word segmentation in the long text, the number of recommendation information can be reduced so as to be convenient for the user to search, and each short text is obtained by performing intention recognition on the long text, so that the recommendation information can be more accurate.
Based on the same inventive concept, and as an implementation of the above method, an embodiment of the present invention further provides a long-text-based information recommendation apparatus. Fig. 3 is a schematic structural diagram of the long-text-based information recommendation apparatus in an embodiment of the present invention. Referring to Fig. 3, the apparatus 30 may include: an obtaining module 301, configured to perform intention recognition on a long text to obtain a plurality of short texts, where the short texts are texts expanded based on the participles in the long text; an identification module 302, configured to perform intention recognition on each short text to obtain recommendation information of each short text; a determining module 303, configured to obtain, from the recommendation information of each short text, recommendation information associated with the participles in the long text; and a recommending module 304, configured to recommend the recommendation information associated with the participles in the long text to the user.
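The division of the apparatus into four modules can be illustrated with a minimal Python sketch. This is only an illustration under stated assumptions, not the implementation of the embodiment: the class name `LongTextRecommender`, the field names, and the callable signatures are hypothetical, and each module is passed in as a plain function so that any concrete word segmentation, search, or screening logic can be substituted.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class LongTextRecommender:
    """Hypothetical wiring of modules 301 to 304; all names are illustrative."""

    # Module 301: long text -> plurality of short texts.
    obtain: Callable[[str], List[str]]
    # Module 302: one short text -> its recommendation information.
    identify: Callable[[str], List[str]]
    # Module 303: (long text, per-short-text results) -> selected information.
    determine: Callable[[str, Dict[str, List[str]]], List[str]]
    # Module 304: deliver the selected information to the user.
    recommend: Callable[[List[str]], None]

    def run(self, long_text: str) -> List[str]:
        short_texts = self.obtain(long_text)
        per_short_text = {s: self.identify(s) for s in short_texts}
        selected = self.determine(long_text, per_short_text)
        self.recommend(selected)
        return selected
```

In use, the four callables would be supplied by whatever segmentation, search, and display components a deployment provides; the sketch only fixes the order in which the modules hand data to one another.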
Based on the foregoing embodiment, the obtaining module includes: a first acquisition unit, configured to acquire a plurality of participles in the long text; and a first identification unit, configured to perform intention recognition on each participle in the long text to obtain a short text corresponding to each participle, and to take the short texts corresponding to the participles as the plurality of short texts.
Based on the foregoing embodiment, the obtaining module includes: a second acquisition unit, configured to acquire a plurality of participles in the long text; a merging unit, configured to merge at least two participles in the long text to obtain a plurality of merged participles; and a second identification unit, configured to calculate the weighted values of the plurality of merged participles and to take, as the plurality of short texts, either the merged participles whose weighted values are greater than or equal to a preset threshold or short texts expanded from those merged participles.
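The merging and weight-filtering behaviour of these units can be sketched as follows. The sketch assumes that merging means forming every two-element and three-element combination of the participles, which is how claim 1 describes it, and that merged participles are joined with a space. The weighting function `weight_of` is a hypothetical placeholder supplied by the caller, because the text does not specify how the weighted value is computed.

```python
from itertools import combinations
from typing import Callable, List, Sequence


def merge_participles(participles: Sequence[str]) -> List[str]:
    """Form all pairwise and three-way combinations of the participles."""
    merged = []
    for size in (2, 3):
        for combo in combinations(participles, size):
            # Joining with a space is an assumption made for readability.
            merged.append(" ".join(combo))
    return merged


def select_short_texts(
    participles: Sequence[str],
    weight_of: Callable[[str], float],  # hypothetical weighting function
    threshold: float,
) -> List[str]:
    """Keep merged participles whose weight is at least the threshold."""
    return [m for m in merge_participles(participles) if weight_of(m) >= threshold]
```

For three participles this produces three pairs and one triple. Any expansion of the retained merged participles into longer short texts would be a further step that is not shown here.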
Based on the foregoing embodiment, the identification module is specifically configured to perform intent identification on each short text through one or more of word segmentation processing, named entity identification, and semantic analysis, so as to obtain recommendation information of each short text.
Based on the foregoing embodiments, the determining module includes: a first determining unit, configured to determine the number of pieces of recommendation information of each short text; and the first selecting unit is used for screening the recommendation information of each short text based on the participles in the long text to obtain recommendation information associated with the participles in the long text if the number of the recommendation information of each short text is greater than or equal to a first preset number.
Based on the foregoing embodiment, the determining module further includes: a second determining unit, configured to determine the number of the plurality of short texts if the number of the recommendation information of each short text is smaller than a first preset number; and the second selecting unit is used for screening the recommendation information of the plurality of short texts based on the participles in the long text to obtain recommendation information associated with the participles in the long text when the number of the plurality of short texts is greater than or equal to a second preset number.
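The two-threshold decision made by the determining module can be sketched as follows. Several points are assumptions made for illustration only: the quantity of recommendation information is taken as the total across all short texts, "associated with a participle" is approximated by a substring match, and the function and parameter names are hypothetical. When neither threshold is reached, the sketch returns everything unscreened, following the method description.

```python
from typing import Dict, List, Sequence


def flatten(recs: Dict[str, List[str]]) -> List[str]:
    """Merge the per-short-text result lists, dropping repeated items."""
    merged: List[str] = []
    for items in recs.values():
        for item in items:
            if item not in merged:
                merged.append(item)
    return merged


def screen(recs: Dict[str, List[str]], participles: Sequence[str]) -> List[str]:
    """Keep items that mention at least one participle of the long text."""
    return [item for item in flatten(recs) if any(p in item for p in participles)]


def determine(
    recs: Dict[str, List[str]],
    participles: Sequence[str],
    first_preset: int,
    second_preset: int,
) -> List[str]:
    total = sum(len(items) for items in recs.values())
    if total >= first_preset:
        # Enough recommendation information: screen by the participles.
        return screen(recs, participles)
    if len(recs) >= second_preset:
        # Few results but many short texts: still screen.
        return screen(recs, participles)
    # Few results and few short texts: recommend everything directly.
    return flatten(recs)
```

Both threshold branches end in the same screening call; they are kept separate only to mirror the two conditions described in the text.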
It should be noted here that the above description of the apparatus embodiments is similar to the description of the method embodiments, and the apparatus embodiments have beneficial effects similar to those of the method embodiments. For technical details not disclosed in the apparatus embodiments of the present invention, refer to the description of the method embodiments of the present invention.
Based on the same inventive concept, an embodiment of the present invention further provides an electronic device. Fig. 4 is a schematic structural diagram of the electronic device in an embodiment of the present invention. Referring to Fig. 4, the electronic device 40 may include: at least one processor 401; and at least one memory 402 and a bus 403 connected to the processor 401. The processor 401 and the memory 402 communicate with each other through the bus 403, and the processor 401 is configured to call program instructions in the memory 402 to perform the method in one or more of the embodiments described above.
Here, it should be noted that: the above description of the embodiments of the electronic device is similar to the description of the embodiments of the method described above, and has similar advantageous effects to the embodiments of the method. For technical details not disclosed in the embodiments of the electronic device according to the embodiments of the present invention, please refer to the description of the method embodiments of the present invention.
Based on the same inventive concept, the embodiment of the present invention further provides a computer-readable storage medium, where the computer-readable storage medium includes a stored program, and when the program runs, the apparatus on which the storage medium is located is controlled to execute the method in one or more embodiments described above.
Here, it should be noted that the above description of the computer-readable storage medium embodiments is similar to the description of the method embodiments, and has beneficial effects similar to those of the method embodiments. For technical details not disclosed in the computer-readable storage medium embodiments of the present invention, refer to the description of the method embodiments of the present invention.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and so forth) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). The memory is an example of a computer-readable medium.
Computer-readable media include both permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media, such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed, or elements inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article, or apparatus that comprises the element.
The above are merely examples of the present application and are not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the scope of the claims of the present application.

Claims (6)

1. A long text-based information recommendation method is characterized by comprising the following steps:
acquiring a plurality of participles in the long text;
combining at least two participles in the long text in pairs and/or in triplex combination by adopting a combination mode in permutation and combination to obtain a plurality of combined participles;
calculating the weighted values of the plurality of merged participles, deleting the merged participles with weighted values smaller than a preset threshold value, keeping the merged participles with weighted values larger than or equal to the preset threshold value, and taking the merged participles with weighted values larger than or equal to the preset threshold value as a plurality of short texts corresponding to the long text; or expanding the reserved combined word segmentation to expand a short text, and taking the expanded short text as a plurality of short texts corresponding to the long text; the weight value of the merged participle is used for identifying the amount of recommendation information corresponding to the merged participle;
performing word segmentation processing on each short text to obtain word segmentation results; carrying out named entity recognition based on the word segmentation result, and recognizing the intention of the short text; inputting the short text into a search engine for searching based on the intention of the short text, and taking a search result as recommendation information to obtain the recommendation information of each short text;
determining the quantity of the recommendation information of each short text;
if the number of the recommendation information of each short text is larger than or equal to a first preset number, screening the recommendation information of each short text based on the word segmentation in the long text to obtain recommendation information associated with the word segmentation in the long text;
if the quantity of the recommendation information of each short text is smaller than a first preset quantity, determining the quantity of the plurality of short texts;
when the number of the short texts is larger than or equal to a second preset number, screening recommendation information of the short texts based on the word segmentation in the long text to obtain recommendation information associated with the word segmentation in the long text; the participles in the long text are different from the participles in the short text;
recommending information associated with the participles in the long text to a user.
2. The method of claim 1, further comprising:
obtaining a plurality of word segments in the long text;
and performing intention recognition on each word in the long text to obtain a short text corresponding to each word, and taking the short text corresponding to each word as the plurality of short texts.
3. An information recommendation device based on long text, comprising:
the acquisition module is used for acquiring a plurality of word segments in the long text; combining at least two participles in the long text in pairs and/or in triplex combination by adopting a combination mode in permutation and combination to obtain a plurality of combined participles; calculating the weighted values of the plurality of merged participles, deleting the merged participles with weighted values smaller than a preset threshold value, keeping the merged participles with weighted values larger than or equal to the preset threshold value, and taking the merged participles with weighted values larger than or equal to the preset threshold value as a plurality of short texts corresponding to the long text; or expanding the reserved combined participles to expand short texts, and taking the expanded short texts as a plurality of short texts corresponding to the long texts; the weight value of the merged participle is used for identifying the amount of recommendation information corresponding to the merged participle;
the recognition module is used for performing word segmentation processing on each short text to obtain word segmentation results; carrying out named entity recognition based on the word segmentation result, and recognizing the intention of the short text; inputting the short text into a search engine for searching based on the intention of the short text, and taking a search result as recommendation information to obtain the recommendation information of each short text;
the determining module is used for determining the quantity of the recommendation information of each short text; if the number of the recommendation information of each short text is larger than or equal to a first preset number, screening the recommendation information of each short text based on the word segmentation in the long text to obtain recommendation information associated with the word segmentation in the long text; if the quantity of the recommendation information of each short text is smaller than a first preset quantity, determining the quantity of the plurality of short texts; when the number of the short texts is larger than or equal to a second preset number, screening recommendation information of the short texts based on the word segmentation in the long text to obtain recommendation information associated with the word segmentation in the long text; the participles in the long text are different from the participles in the short text;
and the recommending module is used for recommending the recommending information associated with the participles in the long text to the user.
4. The apparatus of claim 3, further comprising:
the first acquisition unit is used for acquiring a plurality of participles in the long text;
the first identification unit is used for performing intention identification on each word in the long text to obtain a short text corresponding to each word, and taking the short text corresponding to each word as the plurality of short texts.
5. An electronic device, comprising:
at least one processor;
and at least one memory, bus connected with the processor;
the processor and the memory complete mutual communication through the bus; the processor is configured to invoke program instructions in the memory to perform the method of any of claims 1 to 2.
6. A computer-readable storage medium, characterized in that the storage medium comprises a stored program, wherein the program, when executed, controls an apparatus on which the storage medium is located to perform the method according to any of claims 1-2.
CN201910473094.7A 2019-05-31 2019-05-31 Information recommendation method and device based on long text Active CN110232156B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910473094.7A CN110232156B (en) 2019-05-31 2019-05-31 Information recommendation method and device based on long text

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910473094.7A CN110232156B (en) 2019-05-31 2019-05-31 Information recommendation method and device based on long text

Publications (2)

Publication Number Publication Date
CN110232156A CN110232156A (en) 2019-09-13
CN110232156B true CN110232156B (en) 2022-08-19

Family

ID=67858961

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910473094.7A Active CN110232156B (en) 2019-05-31 2019-05-31 Information recommendation method and device based on long text

Country Status (1)

Country Link
CN (1) CN110232156B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113360613A (en) * 2021-05-31 2021-09-07 维沃移动通信有限公司 Text processing method and device and electronic equipment
CN113887235A (en) * 2021-09-24 2022-01-04 北京三快在线科技有限公司 Information recommendation method and device
CN116126197B (en) * 2021-11-12 2024-06-14 荣耀终端有限公司 Application program recommendation method

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100076991A1 (en) * 2008-09-09 2010-03-25 Kabushiki Kaisha Toshiba Apparatus and method product for presenting recommended information
CN109285030A (en) * 2018-08-29 2019-01-29 深圳壹账通智能科技有限公司 Products Show method, apparatus, terminal and computer readable storage medium
CN109800352A (en) * 2018-12-30 2019-05-24 上海触乐信息科技有限公司 Method, system and the terminal device of information push are carried out based on clipbook

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104142990A (en) * 2014-07-28 2014-11-12 百度在线网络技术(北京)有限公司 Search method and device

Also Published As

Publication number Publication date
CN110232156A (en) 2019-09-13

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20200724

Address after: 35th Floor, Tencent Building, Kejizhongyi Road, Hi-tech Zone, Nanshan District, Shenzhen, Guangdong 518000

Applicant after: TENCENT TECHNOLOGY (SHENZHEN) Co.,Ltd.

Address before: 100029, Beijing, Chaoyang District new East Street, building No. 2, -3 to 25, 101, 8, 804 rooms

Applicant before: Tricorn (Beijing) Technology Co.,Ltd.

GR01 Patent grant