CN115730596A - Object recommendation method and device and computer equipment - Google Patents


Publication number
CN115730596A
Authority
CN
China
Prior art keywords
text
array
current
entry
double
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211510311.3A
Other languages
Chinese (zh)
Inventor
李宏斌
何雯
何剑斌
郑鸿
王�华
汪倩
刘少君
胡建
陈杰
樊智坤
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Southern Power Grid Internet Service Co ltd
Original Assignee
China Southern Power Grid Internet Service Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Southern Power Grid Internet Service Co ltd
Priority to CN202211510311.3A
Publication of CN115730596A


Abstract

The embodiment of the application provides an object recommendation method comprising the following steps: acquiring a search text for an object to be recommended; segmenting the search text to obtain text characters corresponding to the search text; performing prefix indexing on the text characters in a pre-constructed array to obtain a first matching result corresponding to the text characters; performing maximum matching on the text characters in a pre-constructed double-array Trie tree according to the first matching result to obtain text participles of the search text; and acquiring, from a recommended object database, the object to be recommended that matches the text participles. In the method provided by the embodiment of the application, part of the data in the original double-array Trie tree, such as the single-child-node tree structure, is stored in the array instead. This prevents the formation of a huge trunk that consumes server memory, and thereby addresses the problem of large memory consumption.

Description

Object recommendation method and device and computer equipment
Technical Field
The present application relates to the field of information search, and in particular, to an object recommendation method, an object recommendation apparatus, and a computer device.
Background
When purchasing on a shopping platform, a user usually shops around, comparing several similar commodities before selecting a suitable one and placing an order. Because a shopping platform holds massive commodity data and manually screening similar commodities is inefficient, intelligent recommendation technology is now generally used to screen and recommend similar commodities. The system thus helps the user find similar commodities quickly, makes the purchasing reference data more comprehensive, and improves overall purchasing efficiency.
For intelligent recommendation technology, word segmentation has always been a very important and fundamental step. Word segmentation decomposes long texts such as sentences, paragraphs and articles into data structures whose units are words, which facilitates subsequent processing and analysis. Existing word segmentation technology mostly uses a double-array Trie tree algorithm to segment the search text. However, if a word shares no prefix with the other words, a new trunk is expanded to store it; a huge trunk is thus formed, which consumes server memory. The existing technology therefore suffers from large memory consumption.
Disclosure of Invention
Therefore, it is necessary to provide an object recommendation method, an object recommendation device and a computer device for optimizing memory usage in view of the above technical problems.
In a first aspect, the present application provides an object recommendation method. The method comprises the following steps:
acquiring a search text aiming at an object to be recommended;
segmenting the search text to obtain text characters corresponding to the search text;
performing prefix index on the text characters in a pre-constructed array to obtain a first matching result corresponding to the text characters; the array comprises a single child node tree structure;
performing maximum matching on the text characters in a pre-constructed double-array Trie tree according to the first matching result to obtain text participles of the search text; the double-array Trie tree comprises a multi-child node tree structure;
and acquiring the object to be recommended matched with the text participles from a recommended object database by using the text participles.
In one embodiment, performing prefix indexing on the text characters in a pre-constructed array to obtain a first matching result corresponding to the text characters includes: combining a first character and a second character in the text characters to obtain a first entry; matching the first entry in the array; under the condition that matching is successful, determining that the first matching result is that the array comprises the first entry; and under the condition of failed matching, determining that the first matching result is that the array does not comprise the first entry.
In one embodiment, the performing maximum matching on the text characters in a pre-constructed double-array Trie according to the first matching result to obtain text segmentation of the search text includes: taking the last character of the first entry as the current character under the condition that the array comprises the first entry as the first matching result; acquiring a next character of the current character in the text characters, and combining the first entry and the next character to obtain a second entry; taking the second entry as the current entry; inquiring the current entry in the double-array Trie tree to obtain a current matching result corresponding to the current entry; and under the condition that the current matching result is that the double-array Trie tree does not comprise the current entry and the current entry is not the prefix of the word in the double-array Trie tree, taking the current entry as one word in the text word segmentation.
In one embodiment, after querying the current term in the double-array Trie to obtain a current matching result corresponding to the current term, the method further includes: taking the last character of the current entry as the current character under the condition that the current matching result is that the double-array Trie tree comprises the current entry or the current entry is the prefix of the word in the double-array Trie tree; and acquiring a next character of the current character in the text characters, acquiring a new current entry by combining the current entry and the next character, and returning to execute the step of inquiring the current entry in the double-array Trie tree to acquire a current matching result corresponding to the current entry until the current matching result is that the double-array Trie tree does not comprise the current entry and the current entry is not a prefix of a word in the double-array Trie tree.
In one embodiment, the performing maximum matching on the text characters in a pre-constructed double-array Trie according to the first matching result to obtain text segmentation of the search text includes: taking a first character in the first entry as a current character under the condition that the array does not comprise the first entry as a first matching result; acquiring a next character of the current character in the text characters, and combining the current character and the next character to obtain a current entry; inquiring the current entry in the double-array Trie tree to obtain a current matching result corresponding to the current entry; and under the condition that the current matching result is that the double-array Trie tree does not comprise the current entry and the current entry is not the prefix of the word in the double-array Trie tree, taking the current entry as one word in the text word segmentation.
In one embodiment, the method further comprises: querying a data pool for the search text; the data pool comprises historical participled text; and under the condition that the search text does not exist in the data pool, segmenting the search text to obtain text characters corresponding to the search text.
In one embodiment, the obtaining, from a recommendation object database, an object to be recommended that matches the text participle by using the text participle includes: acquiring the word segmentation number of the text word segmentation; determining the product of a preset repetition rate threshold and the word segmentation number as the repeated word segmentation number of the search text and the matched text in the recommended object database; and under the condition that the number of the repeated word segments is greater than or equal to a preset matching threshold value, determining the object corresponding to the matched text as the object to be recommended.
In a second aspect, the present application further provides an object recommendation apparatus, where the apparatus includes:
the first acquisition module is used for acquiring a search text for an object to be recommended;
the segmentation module is used for segmenting the search text to obtain text characters corresponding to the search text;
the first matching module is used for carrying out prefix index on the text characters in a pre-constructed array to obtain a first matching result; the array comprises a single child node tree structure;
the second matching module is used for carrying out maximum matching on the text characters in a pre-constructed double-array Trie tree according to the first matching result to obtain text participles of the search text; the dual array Trie tree comprises a multi-child node tree structure;
and the second acquisition module is used for acquiring the object to be recommended matched with the text participle from a recommended object database by using the text participle.
In a third aspect, the present application also provides a computer device. The computer device comprises a memory storing a computer program and a processor implementing the following steps when executing the computer program:
acquiring a search text aiming at an object to be recommended;
segmenting the search text to obtain text characters corresponding to the search text;
performing prefix index on the text characters in a pre-constructed array to obtain a first matching result corresponding to the text characters; the array comprises a single child node tree structure;
performing maximum matching on the text characters in a pre-constructed double-array Trie tree according to the first matching result to obtain text participles of the search text; the dual array Trie tree comprises a multi-child node tree structure;
and acquiring an object to be recommended matched with the text participles from a recommended object database by using the text participles.
In a fourth aspect, the present application further provides a computer-readable storage medium. The computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of:
acquiring a search text aiming at an object to be recommended;
segmenting the search text to obtain text characters corresponding to the search text;
performing prefix index on the text characters in a pre-constructed array to obtain a first matching result corresponding to the text characters; the array comprises a single child node tree structure;
performing maximum matching on the text characters in a pre-constructed double-array Trie tree according to the first matching result to obtain text participles of the search text; the dual array Trie tree comprises a multi-child node tree structure;
and acquiring an object to be recommended matched with the text participles from a recommended object database by using the text participles.
In a fifth aspect, the present application further provides a computer program product. The computer program product comprising a computer program which when executed by a processor performs the steps of:
acquiring a search text aiming at an object to be recommended;
segmenting the search text to obtain text characters corresponding to the search text;
performing prefix index on the text characters in a pre-constructed array to obtain a first matching result corresponding to the text characters; the array comprises a single child node tree structure;
performing maximum matching on the text characters in a pre-constructed double-array Trie tree according to the first matching result to obtain text participles of the search text; the dual array Trie tree comprises a multi-child node tree structure;
and acquiring the object to be recommended matched with the text participles from a recommended object database by using the text participles.
In the object recommendation method, device and computer equipment, the dictionary used for segmenting the search text can be split in advance and stored in the array and the double-array Trie tree respectively. Prefix indexing can then be performed on the text characters in the pre-constructed array to obtain a first matching result corresponding to the text characters; next, according to the first matching result, maximum matching is performed on the text characters in the pre-constructed double-array Trie tree to obtain text participles of the search text; finally, the object to be recommended that matches the text participles is acquired from a recommended object database. In the method provided by the embodiment of the application, part of the data in the original double-array Trie tree, such as the single-child-node tree structure, is stored in the array instead. This prevents the formation of a huge trunk that consumes server memory, and thereby addresses the problem of large memory consumption.
Drawings
FIG. 1 is a diagram of an application environment of a method for object recommendation in one embodiment;
FIG. 2 is a flowchart illustrating a method for object recommendation in one embodiment;
FIG. 3 is a schematic flow chart illustrating that in one embodiment, prefix indexing is performed on text characters in a pre-constructed array to obtain a first matching result corresponding to the text characters;
FIG. 4 is a schematic flow chart illustrating a process of performing maximum matching on text characters in a pre-constructed double-array Trie according to a first matching result to obtain text participles of a search text in one embodiment;
FIG. 5 is a schematic flow chart illustrating a process of performing maximum matching on text characters in a pre-constructed double-array Trie according to a first matching result to obtain text participles of a search text in one embodiment;
FIG. 6 is a schematic flow chart illustrating the steps of obtaining an object to be recommended from a recommended object database by using text participles in another embodiment;
FIG. 7 is a flowchart illustrating a method for object recommendation in another embodiment;
FIG. 8 is a block diagram showing the construction of an object recommending apparatus according to an embodiment;
FIG. 9 is a diagram illustrating an internal structure of a computer device according to an embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more clearly understood, the present application is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
The object recommendation method provided in the embodiments of the present application can be applied to the application environment shown in FIG. 1, in which the terminal 101 and the server 102 communicate through a network. The server 102 may be a backend server. A data storage system may store the data that the server 102 needs to process; it may be integrated on the server 102, or may be located on a cloud server or another network server.
In the embodiment of the present application, the terminal 101 may display a search text and display an object, that is, a commodity, corresponding to the search text. The user can click a button for searching similar products on the display screen of the terminal 101.
The server 102 may split the dictionary for implementing word segmentation for the search text in advance, and store the data therein in the array and the dual-array Trie respectively. In response to a search instruction of a user, the server 102 may obtain a search text for an object to be recommended, and segment the search text to obtain text characters corresponding to the search text; furthermore, prefix index can be carried out on the text characters in a pre-constructed array, and a first matching result corresponding to the text characters is obtained; performing maximum matching on text characters in a pre-constructed double-array Trie tree according to a first matching result to obtain text participles of a search text; and acquiring the object to be recommended matched with the text word segmentation from the recommended object database by using the text word segmentation.
The terminal 101 may display the object to be recommended; the object to be recommended may include one or more objects, the user may click one of the objects to be recommended on the display screen of the terminal 101, and in response to a click instruction of the user, the server 102 may obtain relevant introduction information of the object to be recommended and display the information through the display screen of the terminal 101. In the embodiment of the present application, the object may be a commodity.
The terminal 101 may be, but not limited to, various personal computers, notebook computers, smart phones, tablet computers, internet of things devices and portable wearable devices, and the internet of things devices may be smart speakers, smart televisions, smart air conditioners, smart car-mounted devices, cameras, sensors, and the like. The portable wearable device can be a smart watch, a smart bracelet, a head-mounted device, and the like. The server 102 may be implemented as a stand-alone server or as a server cluster comprised of multiple servers.
In an embodiment, as shown in FIG. 2, an object recommendation method is provided. The method is described by taking its application to the terminal 101 in FIG. 1 as an example, and may include the following steps:
Step S201, a search text for an object to be recommended is acquired.
Generally, when a user browsing commodities on a shopping platform has settled on a target commodity, the user can click a button for searching similar commodities on the target commodity page. In response to this search instruction, the server first acquires the search text of the target commodity, where the search text can be text information corresponding to the target commodity. Recommended objects can then be displayed on the display screen of the terminal; they can comprise one or more commodities of the same brand and the same type as the target commodity. The objects to be recommended include the recommended objects.
Step S202, the search text is segmented to obtain text characters corresponding to the search text.
The search text can be text information such as a sentence, a paragraph or an article. In the embodiment of the present application, the search text may be segmented by a data segmentation algorithm into single characters; the text characters may therefore include multiple characters.
Step S203, performing prefix index on the text characters in a pre-constructed array to obtain a first matching result corresponding to the text characters; the array includes a single child node tree structure.
In an embodiment of the present application, the pre-constructed array may include a single child node tree structure. In one possible implementation, the array may be considered as part of data in the original double-array Trie, i.e., a single-child node tree structure.
For the specific steps of prefix indexing and the first matching result, refer to the description of FIG. 3 below; they are not repeated here.
For an introduction to the double-array Trie, see the following:
The Double-Array Trie structure is a compressed form of the Trie structure that represents the Trie tree using only two linear arrays; it offers fast reads and high search efficiency. In essence, a double-array Trie is a finite-state automaton: each node represents one state of the automaton, state transitions are made according to the input variable, and a query operation completes when an end state is reached or no transition is possible.
A Double-Array Trie is a Trie with low space complexity composed of two linear integer arrays, base[] and check[]. Each element of the base[] array corresponds to one node of the Trie, and its value serves as the base value for transitioning to the next state; the check[] value records the previous state of the current state.
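The role of base[] and check[] can be illustrated with a minimal Python sketch. It is a toy construction, not the implementation used in the application: the class name `DoubleArrayTrie`, the character coding scheme, the `FREE` marker and the separate `terminal` set are all assumptions made for illustration. A transition from state s on character c goes to t = base[s] + code(c), and is accepted only if check[t] == s.

```python
FREE = -1  # marks an unused slot in check[] (illustrative convention)


class DoubleArrayTrie:
    """Toy double-array Trie: t = base[s] + code(c) is valid iff check[t] == s."""

    def __init__(self, words):
        # Character codes start at 1, so no transition can land on the root (slot 0).
        chars = sorted({ch for word in words for ch in word})
        self.code = {ch: i + 1 for i, ch in enumerate(chars)}
        self.base = [0]
        self.check = [0]        # the root is its own predecessor
        self.terminal = set()   # states at which a dictionary word ends
        self._build(words)

    def _build(self, words):
        # Step 1: an ordinary nested-dict trie, used only during construction.
        root = {}
        for word in words:
            node = root
            for ch in word:
                node = node.setdefault(ch, {})
            node[None] = True   # end-of-word marker
        # Step 2: place the children of every node into base[] and check[].
        stack = [(root, 0)]
        while stack:
            node, state = stack.pop()
            if None in node:
                self.terminal.add(state)
            children = [(self.code[ch], child)
                        for ch, child in node.items() if ch is not None]
            if not children:
                continue
            b = 0
            # Find the smallest base value whose target slots are all free.
            while any(self._used(b + c) for c, _ in children):
                b += 1
            self.base[state] = b
            for c, child in children:
                t = b + c
                self._reserve(t + 1)
                self.check[t] = state   # remember the previous state
                stack.append((child, t))

    def _used(self, slot):
        return slot < len(self.check) and self.check[slot] != FREE

    def _reserve(self, size):
        while len(self.check) < size:
            self.base.append(0)
            self.check.append(FREE)

    def walk(self, text):
        """Follow text from the root; return the final state, or None on failure."""
        state = 0
        for ch in text:
            c = self.code.get(ch)
            if c is None:
                return None
            t = self.base[state] + c
            if t >= len(self.check) or self.check[t] != state:
                return None
            state = t
        return state

    def contains(self, word):
        return self.walk(word) in self.terminal

    def is_prefix(self, text):
        return self.walk(text) is not None
```

Here `contains` answers whether an entry is a dictionary word and `is_prefix` answers whether it is a prefix of one, which are the two questions asked of the double-array Trie during maximum matching. Production implementations usually encode word endings inside the arrays rather than in a separate set.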
Step S204, performing maximum matching on the text characters in the pre-constructed double-array Trie tree according to the first matching result to obtain the text participles of the search text.
Maximum matching algorithms include the forward maximum matching algorithm and the reverse maximum matching algorithm, both of which scan the text character by character; it should be understood that Chinese word segmentation is performed here using a bidirectional segmentation method based on the maximum matching principle. The maximum matching principle is to find the longest matching character string in the double-array Trie tree dictionary and take it as a word. For example, segmenting "I love the People's Republic of China" according to the maximum matching principle yields "I", "love", "People's Republic of China" rather than "I", "love", "China", "people" and "republic".
The description of the double-array Trie may refer to the related description in step S203, and is not described herein again.
The text participles are the word segments obtained by segmenting the search text. For example, if "I love the People's Republic of China" is the search text, the array "I", "love", "People's Republic of China" is the text participles.
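The maximum matching principle can be sketched in a few lines of Python. This is a minimal forward maximum matching example that uses a plain set as the dictionary in place of the double-array Trie tree; the function name `forward_max_match` and the sample dictionary are assumptions for illustration, and the sample text is the Chinese sentence meaning "I love the People's Republic of China".

```python
def forward_max_match(text, dictionary):
    """Greedy forward maximum matching: at each position take the longest dictionary word."""
    max_len = max(map(len, dictionary), default=1)
    segments = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until a dictionary word is found.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            # A single character is always accepted so the scan cannot stall.
            if size == 1 or candidate in dictionary:
                segments.append(candidate)
                i += size
                break
    return segments


# Hypothetical dictionary for illustration only.
dictionary = {"中华", "人民", "共和国", "中华人民共和国"}
print(forward_max_match("我爱中华人民共和国", dictionary))
# → ['我', '爱', '中华人民共和国']
```

Although the shorter words "中华", "人民" and "共和国" are also in the dictionary, the longest match wins, which is the behavior the example in the text describes.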
Step S205, acquiring an object to be recommended matched with the text participles from the recommended object database by using the text participles.
The recommendation object database can be a massive commodity library of a shopping platform and is used for acquiring recommendation objects.
The objects to be recommended may include one or more target commodities of the same brand and the same category. In some possible implementation manners, a part of objects to be recommended acquired may be displayed through a display screen of the terminal, that is, the recommended objects are obtained.
In the object recommendation method, device and computer equipment, the dictionary used for segmenting the search text can be split in advance and stored in the array and the double-array Trie tree respectively. Prefix indexing can then be performed on the text characters in the pre-constructed array to obtain a first matching result corresponding to the text characters; next, according to the first matching result, maximum matching is performed on the text characters in the pre-constructed double-array Trie tree to obtain text participles of the search text; finally, the object to be recommended that matches the text participles is acquired from a recommended object database. In the method provided by the embodiment of the application, part of the data in the original double-array Trie tree, such as the single-child-node tree structure, is stored in the array instead. This prevents the formation of a huge trunk that consumes server memory, and thereby addresses the problem of large memory consumption.
In some embodiments, as shown in FIG. 3, performing prefix indexing on the text characters in the pre-constructed array, which includes a single-child-node tree structure, to obtain a first matching result corresponding to the text characters may include:
Step S301, combining a first character and a second character in the text characters to obtain a first entry.
For example, after segmentation the search text "我爱中华人民共和国" ("I love the People's Republic of China") becomes the characters "我", "爱", "中", "华", "人", "民", "共", "和", "国"; this array is the text characters. Combining the first character "我" ("I") with the second character "爱" ("love") gives the first entry "我爱".
In the embodiment of the application, each character in the text characters can be mapped to a numerical value through a hash algorithm, that is, hash values are used for word segmentation. The hash value corresponding to the first character "我" can include the address information of the second character "爱", so that during prefix indexing in the array the second character can be found from the first character.
Step S302, a first entry is matched in the array.
That is, the second character is looked up in the array using the first character of the text characters and the address information of the second character that the first character contains.
Step S303, in case of successful matching, determining that the first matching result is that the array includes the first entry.
If the second character in the text characters can be searched in the array, the matching is successful, and the array is indicated to comprise the first entry.
Step S304, under the condition of failed matching, determining that the first matching result is that the array does not comprise the first entry.
If the second character in the text characters cannot be searched in the array, the matching is failed, and the array can be described as not including the first entry.
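Steps S301 to S304 can be sketched as follows. The source does not specify how the array is laid out, so this example assumes, purely for illustration, that it is a sorted list of two-character entries searched with binary search; the hash-based address lookup described above is not modeled. The name `first_match` and the sample data are hypothetical.

```python
from bisect import bisect_left


def first_match(text_chars, prefix_array):
    """Combine the first two characters into the first entry and look it up in the array.

    prefix_array is assumed to be a sorted list of entries (an illustrative layout).
    Returns (first_entry, found); first_entry is None if fewer than two characters exist.
    """
    if len(text_chars) < 2:
        return None, False
    first_entry = text_chars[0] + text_chars[1]          # step S301
    pos = bisect_left(prefix_array, first_entry)         # step S302
    found = pos < len(prefix_array) and prefix_array[pos] == first_entry
    return first_entry, found                            # steps S303 / S304


# Hypothetical array contents for illustration only.
prefix_array = sorted(["中华", "我爱"])
print(first_match(list("我爱中华"), prefix_array))   # → ('我爱', True)
print(first_match(list("爱我中华"), prefix_array))   # → ('爱我', False)
```

The boolean plays the role of the first matching result: it decides whether maximum matching in the double-array Trie tree continues from the first entry or restarts from the first character.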
In some embodiments, as shown in FIG. 4, performing maximum matching on the text characters in the pre-constructed double-array Trie tree according to the first matching result to obtain the text participles of the search text may include:
Step S401, taking the last character of the first entry as the current character when the first matching result is that the array includes the first entry.
Because the array only contains the data of the single-child-node tree structure, prefix indexing needs to continue in the double-array Trie tree after the first matching result is obtained. When the array includes the first entry, the third character may be queried in the double-array Trie tree using the first entry and the address information of the third character contained in the second character.
Step S402, acquiring a next character of a current character in the text characters, and combining the first entry and the next character to obtain a second entry; and the second entry is taken as the current entry.
The address information of the next character exists in the current character, so that when the next character is indexed in the double-array Trie, the next character can be queried through the address information.
Step S403, querying the current entry in the double-array Trie to obtain a current matching result corresponding to the current entry.
Step S404, when the current matching result is that the double-array Trie does not include the current entry and the current entry is not a prefix of a word in the double-array Trie, taking the current entry as a word in the text word segmentation.
And under the condition that the current matching result is that the double-array Trie tree does not comprise the current entry and the current entry is not the prefix of the word in the double-array Trie tree, indicating that the current entry reaches the maximum matching in the double-array Trie tree, so that the word can be segmented out and used as a word in the text word segmentation.
In some embodiments, querying the double-array Trie for the current term to obtain a current matching result corresponding to the current term may further include:
taking the last character of the current entry as the current character under the condition that the current matching result is that the double-array Trie tree comprises the current entry or the current entry is the prefix of the word in the double-array Trie tree;
acquiring a next character of a current character in text characters, and combining the current entry and the next character to obtain a new current entry; and returning to execute the step of inquiring the current entry in the double-array Trie tree to obtain a current matching result corresponding to the current entry until the current matching result is that the double-array Trie tree does not comprise the current entry and the current entry is not the prefix of the word in the double-array Trie tree, and at the moment, taking the current entry as one word in the text word segmentation.
In some embodiments, as shown in FIG. 5, performing maximum matching on the text characters in the pre-constructed double-array Trie tree according to the first matching result to obtain the text participles of the search text may include:
Step S501, when the first matching result is that the array does not include the first entry, taking a first character in the first entry as a current character.
In the case that the first entry is not included in the array, the query of the second character in the double-array Trie needs to be continued.
Step S502, obtaining the next character of the current character in the text characters, and obtaining the current entry by the combination of the current character and the next character.
Step S503, inquiring the current entry in the double-array Trie tree to obtain the current matching result corresponding to the current entry.
Step S504, under the condition that the current matching result is that the double-array Trie tree does not comprise the current entry and the current entry is not the prefix of the word in the double-array Trie tree, the current entry is used as one word segmentation in the text word segmentation.
In some embodiments, querying the double-array Trie for the current term to obtain a current matching result corresponding to the current term may further include:
taking the last character of the current entry as the current character under the condition that the current matching result is that the double-array Trie tree comprises the current entry or the current entry is the prefix of the word in the double-array Trie tree;
acquiring a next character of a current character in text characters, and combining the current entry and the next character to obtain a new current entry; and returning to execute the step of inquiring the current entry in the double-array Trie tree to obtain a current matching result corresponding to the current entry until the current matching result is that the double-array Trie tree does not comprise the current entry and the current entry is not the prefix of the word in the double-array Trie tree, and at the moment, taking the current entry as one word in the text word segmentation.
In this method, the search text is segmented with the aid of the double-array Trie tree, which improves word segmentation efficiency and therefore search efficiency. Because each character carries a pointer to the character that follows it, search efficiency is improved further.
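The matching loop described above can be sketched as follows. This is a minimal illustration, not the patented implementation: a Python set of words plus a set of prefixes stands in for the double-array Trie tree, the names `build_prefixes` and `max_match` are hypothetical, and the sketch follows conventional forward maximum matching, so when the extended entry is neither a word nor a prefix it emits the longest dictionary word found so far (or a single character if none was found). The Chinese sample string is an assumed original of the translated example used in this description.

```python
def build_prefixes(words):
    """Collect every proper prefix of every dictionary word."""
    return {word[:i] for word in words for i in range(1, len(word))}


def max_match(chars, words, prefixes):
    """Segment a list of single characters by forward maximum matching.

    `words` and `prefixes` are plain sets standing in for the
    double-array Trie tree (an assumption made for illustration).
    """
    segments = []
    i = 0
    while i < len(chars):
        longest = chars[i]  # fall back to a single character
        j = i + 1
        while j < len(chars):
            # Combine the current entry with the next character.
            entry = "".join(chars[i:j + 1])
            if entry in words:
                longest = entry  # a complete word: remember it, keep extending
            elif entry not in prefixes:
                break  # neither a word nor a prefix: maximum match reached
            j += 1
        segments.append(longest)
        i += len(longest)
    return segments


# Usage with a one-word dictionary (assumed sample data).
words = {"中华人民共和国"}
print(max_match(list("我爱中华人民共和国"), words, build_prefixes(words)))
# prints ['我', '爱', '中华人民共和国']
```

A real double-array Trie answers the "is a word" and "is a prefix" questions with its base and check arrays instead of two hash sets; the control flow of the loop stays the same.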
In some embodiments, the object recommendation method provided in the embodiments of the present application may further include:
Step 1, querying the search text in a data pool; the data pool includes historical participled text.
The data pool stores historical participled text. If the search text can be found in the historical participled text, the word segmentation operation does not need to be performed on it. That is, the following steps can be skipped: segmenting the search text to obtain text characters corresponding to the search text; performing prefix indexing on the text characters in a pre-constructed array to obtain a first matching result corresponding to the text characters, the array comprising a single child node tree structure; and performing maximum matching on the text characters in a pre-constructed double-array Trie tree according to the first matching result to obtain text participles of the search text, the double-array Trie tree comprising a multi-child-node tree structure. The object to be recommended that matches the text participles can then be obtained directly from the recommended object database.
Step 2, in the case that the search text does not exist in the data pool, segmenting the search text to obtain text characters corresponding to the search text.
If the search text cannot be found in the historical participled text, the word segmentation operation needs to be performed on it. That is, the search text needs to be segmented to obtain text characters corresponding to the search text; prefix indexing is performed on the text characters in a pre-constructed array to obtain a first matching result corresponding to the text characters, the array comprising a single child node tree structure; and maximum matching is performed on the text characters in a pre-constructed double-array Trie tree according to the first matching result to obtain text participles of the search text, the double-array Trie tree comprising a multi-child-node tree structure. The object to be recommended that matches the text participles can then be obtained from the recommended object database by using the text participles.
In the method, the search text is firstly inquired in the historical participled texts in the data pool, if the search text exists, the participled operation is not needed, and the search efficiency is improved.
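The data pool lookup can be sketched as a simple cache in front of the segmenter. This is a minimal sketch under stated assumptions: the data pool is modeled as a Python dict keyed by the search text, `segment_with_pool` is a hypothetical name, and writing a fresh result back into the pool is an assumption, since the description only says that the pool holds historical participled text.

```python
def segment_with_pool(search_text, data_pool, segment):
    """Return the participles for `search_text`, consulting the data pool first.

    `data_pool` maps previously seen search texts to their participles.
    `segment` is any callable that performs the full word segmentation.
    """
    cached = data_pool.get(search_text)
    if cached is not None:
        # Found in the historical participled text: skip segmentation.
        return cached
    result = segment(search_text)
    # Assumption: store the new result so later queries can reuse it.
    data_pool[search_text] = result
    return result
```

With this arrangement the comparatively expensive array and Trie lookups run only once per distinct search text.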
In one embodiment, as shown in fig. 6, obtaining the object to be recommended matching with the text participle from the recommendation object database by using the text participle may include the following steps:
Step S601, acquiring the number of word segments in the text participles.
For example, for the text participles {"I", "love", "the People's Republic of China"}, the number of word segments is 3.
Step S602, determining a product of a preset repetition rate threshold and the number of the segmented words as the number of the repeated segmented words of the search text and the matching text in the recommendation object database.
For example, the preset repetition rate threshold may be 60%; the number of repeated participles is then 3 × 60% = 1.8, which is rounded down to 1.
Step S603, determining the object corresponding to the matching text as the object to be recommended when the number of repeated word segmentations is greater than or equal to a preset matching threshold.
For example, if the preset matching threshold is 3 and the number of repeated participles obtained in step S602 is 1, the object corresponding to the matching text is not determined as the object to be recommended.
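The arithmetic of steps S601 to S603 can be checked with a few lines of code. This is a sketch only: the function name `is_recommended` is hypothetical, and the repetition rate is passed as an integer percentage so that rounding down can be done with integer division rather than floating-point arithmetic.

```python
def is_recommended(num_segments, repetition_percent, matching_threshold):
    """Apply steps S601 to S603 to one candidate matching text.

    Returns the number of repeated participles (rounded down) and whether
    the candidate's object qualifies as an object to be recommended.
    """
    # S602: product of the repetition rate threshold and the segment count.
    repeated = num_segments * repetition_percent // 100
    # S603: compare against the preset matching threshold.
    return repeated, repeated >= matching_threshold


# The worked example from the text: 3 segments, 60%, matching threshold 3.
print(is_recommended(3, 60, 3))  # prints (1, False)
```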
In some embodiments, as shown in fig. 7, another object recommendation method is provided, which may include the steps of:
Step S701, splitting the dictionary data and storing it in the array and the double-array Trie tree respectively.
Step S702, acquiring a search text aiming at the object to be recommended.
Step S703, search texts are inquired in a data pool; the data pool includes historical participled text.
Step S704, under the condition that no search text exists in the data pool, segmenting the search text to obtain text characters corresponding to the search text.
Step S705, prefix index is carried out on the text characters in the pre-constructed array, and a first matching result corresponding to the text characters is obtained.
In some embodiments, step S705 may include:
Step 1, combining the first character and the second character in the text characters to obtain a first entry.
For example, after segmentation, "I love the People's Republic of China" can be split into individual characters such as "I", "love", "Zhong", "Hua", and so on through "Guo"; this sequence of characters is the text characters. The first character "I" and the second character "love" are combined to obtain the first entry, "I love".
In the embodiment of the application, each character in the text characters can be mapped to a numerical value, namely a hash value, through a hash algorithm, which facilitates word segmentation. The hash value corresponding to the first character "I" can include address information of the second character "love", so that during prefix indexing in the array the second character can be found from the first character.
Step 2, matching the first entry in the array.
In this process, the array is queried for the second character by using the first character in the text characters and the address information of the second character contained in the first character.
Step 3, in the case of successful matching, determining that the first matching result is that the array includes the first entry.
If the second character in the text characters can be found in the array, the matching is successful, which indicates that the array includes the first entry.
Step 4, in the case of failed matching, determining that the first matching result is that the array does not include the first entry.
If the second character in the text characters cannot be found in the array, the matching fails, which indicates that the array does not include the first entry.
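Steps 1 to 4 of the prefix index can be sketched as below. The sketch makes several assumptions that are not stated in the source: the array is modeled as a sorted Python list of dictionary entries, the hash and address lookup is replaced by a binary search, and a match is declared when some stored entry starts with the first entry. The name `prefix_index` is hypothetical.

```python
from bisect import bisect_left


def prefix_index(chars, array):
    """Combine the first two characters and look the result up in `array`.

    `array` must be a sorted list of dictionary entries. Returns the first
    entry and a flag telling whether the array includes it as a prefix.
    """
    if len(chars) < 2:
        return None, False
    # Step 1: combine the first and second characters.
    first_entry = chars[0] + chars[1]
    # Step 2: binary search for the first stored entry >= first_entry.
    pos = bisect_left(array, first_entry)
    # Steps 3 and 4: success if that stored entry starts with first_entry.
    matched = pos < len(array) and array[pos].startswith(first_entry)
    return first_entry, matched


array = sorted(["ab", "abc", "xy"])
print(prefix_index(list("abz"), array))  # prints ('ab', True)
print(prefix_index(list("xz"), array))   # prints ('xz', False)
```

Because the list is sorted, any stored entry that begins with the first entry sits at or immediately after the position returned by `bisect_left`, so a single comparison is enough.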
Step S706, according to the first matching result, performing maximum matching on the text characters in the pre-constructed double-array Trie tree to obtain the text participles of the search text.
In some embodiments, step S706 may include:
Step 1, in the case that the first matching result is that the array includes the first entry, taking the last character of the first entry as the current character.
Because the array only contains data of the single child node tree structure, prefix indexing needs to continue in the double-array Trie tree after the first matching result is obtained. In the case that the array includes the first entry, the third character can be queried in the double-array Trie tree by using the first entry and the address information of the third character contained in the second character.
Step 2, acquiring the next character of the current character in the text characters, and combining the first entry and the next character to obtain a second entry; the second entry is taken as the current entry.
The current character holds the address information of the next character, so when the next character is indexed in the double-array Trie tree, it can be queried through that address information.
Step 3, querying the current entry in the double-array Trie tree to obtain a current matching result corresponding to the current entry.
Step 4, in the case that the current matching result is that the double-array Trie tree does not include the current entry and the current entry is not a prefix of a word in the double-array Trie tree, taking the current entry as one word segment in the text participles.
When the double-array Trie tree does not include the current entry and the current entry is not a prefix of a word in the double-array Trie tree, the current entry has reached its maximum match in the double-array Trie tree, so it can be split off and used as one word segment in the text participles.
In some embodiments, querying the double-array Trie for the current term to obtain a current matching result corresponding to the current term may further include:
taking the last character of the current entry as the current character under the condition that the current matching result is that the double-array Trie tree comprises the current entry or the current entry is the prefix of the word in the double-array Trie tree;
acquiring a next character of a current character in the text characters, and combining the current entry and the next character to obtain a new current entry; and returning to execute the step of inquiring the current entry in the double-array Trie tree to obtain a current matching result corresponding to the current entry until the current matching result is that the double-array Trie tree does not comprise the current entry and the current entry is not the prefix of the word in the double-array Trie tree, and at the moment, taking the current entry as one word in the text word segmentation.
In some embodiments, step S706 may further include:
Step 1, in the case that the first matching result is that the array does not include the first entry, taking the first character in the first entry as the current character.
If the array does not include the first entry, the second character still needs to be queried in the double-array Trie tree.
Step 2, acquiring the next character of the current character in the text characters, and combining the current character and the next character to obtain a current entry.
Step 3, querying the current entry in the double-array Trie tree to obtain a current matching result corresponding to the current entry.
Step 4, in the case that the current matching result is that the double-array Trie tree does not include the current entry and the current entry is not a prefix of a word in the double-array Trie tree, taking the current entry as one word segment in the text participles.
In some embodiments, querying the double-array Trie for the current term to obtain a current matching result corresponding to the current term may further include:
taking the last character of the current entry as the current character under the condition that the current matching result is that the double-array Trie tree comprises the current entry or the current entry is the prefix of the word in the double-array Trie tree;
acquiring a next character of a current character in text characters, and combining the current entry and the next character to obtain a new current entry; and returning to execute the step of inquiring the current entry in the double-array Trie tree to obtain a current matching result corresponding to the current entry until the current matching result is that the double-array Trie tree does not comprise the current entry and the current entry is not the prefix of the word in the double-array Trie tree, and at the moment, taking the current entry as one word in the text word segmentation.
Step S707, acquiring the number of word segments in the text participles.
For example, for the text participles {"I", "love", "the People's Republic of China"}, the number of word segments is 3.
Step S708, determining the product of the preset repetition rate threshold and the number of word segments as the number of repeated participles between the search text and the matching text in the recommendation object database.
For example, the preset repetition rate threshold may be 60%; the number of repeated participles is then 3 × 60% = 1.8, which is rounded down to 1.
Step S709, determining the object corresponding to the matching text as the object to be recommended in the case that the number of repeated participles is greater than or equal to a preset matching threshold.
For example, if the preset matching threshold is 3 and the number of repeated participles obtained in step S708 is 1, the object corresponding to the matching text is not determined as the object to be recommended.
In the object recommendation method, device and computer equipment, the dictionary used for segmenting the search text can be split in advance and stored in the array and the double-array Trie tree respectively. Prefix indexing can then be performed on the text characters in the pre-constructed array to obtain a first matching result corresponding to the text characters; next, according to the first matching result, maximum matching is performed on the text characters in the pre-constructed double-array Trie tree to obtain the text participles of the search text; finally, the object to be recommended that matches the text participles is acquired from the recommended object database by using the text participles. In the method provided by the embodiment of the application, part of the data in the original double-array Trie tree, such as the single child node tree structure, is stored in the array. This prevents a huge trunk from forming and consuming server memory, and thereby solves the problem of large memory consumption.
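The description does not spell out the rule used to split the dictionary, so the following is only one hypothetical reading, offered to make the idea concrete. Here a word goes to the array when the whole subtree under its first character is a single chain (every node has at most one child), and to the Trie otherwise. The names `Node`, `build_trie`, `is_chain` and `split_dictionary` are illustrative, and a plain pointer-based trie stands in for the double-array Trie tree.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    children: dict = field(default_factory=dict)
    is_word: bool = False


def build_trie(words):
    """Build an ordinary pointer-based trie from non-empty words."""
    root = Node()
    for word in words:
        node = root
        for ch in word:
            node = node.children.setdefault(ch, Node())
        node.is_word = True
    return root


def is_chain(node):
    """Return True if no node at or below `node` has more than one child."""
    while node.children:
        if len(node.children) > 1:
            return False
        node = next(iter(node.children.values()))
    return True


def split_dictionary(words):
    """Split words into (array_part, trie_part) under the assumed rule."""
    root = build_trie(words)
    array_part, trie_part = [], []
    for word in words:
        subtree = root.children[word[0]]
        (array_part if is_chain(subtree) else trie_part).append(word)
    return array_part, trie_part


print(split_dictionary(["abc", "ab", "xy", "xz"]))
# prints (['abc', 'ab'], ['xy', 'xz'])
```

Under this reading, long unbranched runs never enter the Trie at all, which is one way the memory saving described above could arise.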
It should be understood that, although the steps in the flowcharts of the embodiments described above are shown in the sequence indicated by the arrows, they are not necessarily performed in that sequence. Unless explicitly stated herein, the execution of these steps is not strictly limited to the order shown, and the steps may be performed in other orders. Moreover, at least some of the steps in the flowcharts may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times; these sub-steps or stages are also not necessarily performed sequentially, and may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
Based on the same inventive concept, the embodiment of the present application further provides an object recommending apparatus for implementing the object recommending method. The implementation scheme for solving the problem provided by the device is similar to the implementation scheme recorded in the method, so the specific limitations in one or more embodiments of the object recommendation device provided below can be referred to the limitations of the object recommendation method in the above, and are not described herein again.
In one embodiment, as shown in fig. 8, there is provided an object recommending apparatus including: a first obtaining module 810, a segmentation module, a first matching module 820, a second matching module 830, and a second obtaining module 840, wherein:
a first obtaining module 810, configured to obtain a search text for an object to be recommended;
the segmentation module is used for segmenting the search text to obtain text characters corresponding to the search text;
a first matching module 820, configured to perform prefix indexing on text characters in a pre-constructed array to obtain a first matching result; the array comprises a single child node tree structure;
the second matching module 830 is configured to perform maximum matching on text characters in a pre-constructed double-array Trie according to the first matching result, so as to obtain text segments of the search text; the double-array Trie tree comprises a multi-child node tree structure;
the second obtaining module 840 is configured to obtain, from the recommended object database, an object to be recommended that matches the text participle by using the text participle.
The modules in the object recommending apparatus can be wholly or partially implemented by software, hardware and a combination thereof. The modules can be embedded in a hardware form or independent from a processor in the computer device, and can also be stored in a memory in the computer device in a software form, so that the processor can call and execute operations corresponding to the modules.
In one embodiment, a computer device is provided, which may be a terminal, and its internal structure diagram may be as shown in fig. 9. The computer device includes a processor, a memory, a communication interface, a display screen, and an input device connected by a system bus. The processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The communication interface of the computer device is used for wired or wireless communication with an external terminal, and the wireless communication can be realized through WIFI, a mobile cellular network, NFC (near field communication) or other technologies. The computer program is executed by the processor to implement an object recommendation method. The display screen of the computer device can be a liquid crystal display screen or an electronic ink display screen, and the input device of the computer device can be a touch layer covering the display screen, a key, a track ball or a touch pad arranged on the housing of the computer device, or an external keyboard, touch pad or mouse.
Those skilled in the art will appreciate that the architecture shown in fig. 9 is merely a block diagram of some of the structures associated with the disclosed aspects and is not intended to limit the computing devices to which the disclosed aspects apply, as particular computing devices may include more or less components than those shown, or may combine certain components, or have a different arrangement of components.
In one embodiment, a computer device is provided, comprising a memory having a computer program stored therein and a processor that when executing the computer program performs the steps of:
acquiring a search text for an object to be recommended;
segmenting the search text to obtain text characters corresponding to the search text;
performing prefix index on the text characters in a pre-constructed array to obtain a first matching result corresponding to the text characters; the array comprises a single child node tree structure;
performing maximum matching on text characters in a pre-constructed double-array Trie tree according to a first matching result to obtain text participles of a search text; the double-array Trie tree comprises a multi-child node tree structure;
and acquiring the object to be recommended matched with the text participles from the recommended object database by using the text participles.
In one embodiment, a computer program product is provided, comprising a computer program which, when executed by a processor, performs the steps of:
acquiring a search text aiming at an object to be recommended;
segmenting the search text to obtain text characters corresponding to the search text;
performing prefix index on the text characters in a pre-constructed array to obtain a first matching result corresponding to the text characters; the array comprises a single child node tree structure;
performing maximum matching on text characters in a pre-constructed double-array Trie tree according to a first matching result to obtain text participles of a search text; the double-array Trie tree comprises a multi-child node tree structure;
and acquiring the object to be recommended matched with the text word segmentation from the recommended object database by using the text word segmentation.
It should be noted that, the user information (including but not limited to user device information, user personal information, etc.) and data (including but not limited to data for analysis, stored data, presented data, etc.) referred to in the present application are information and data authorized by the user or sufficiently authorized by each party.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above may be implemented by hardware that is instructed by a computer program, and the computer program may be stored in a non-volatile computer-readable storage medium, and when executed, may include the processes of the embodiments of the methods described above. Any reference to memory, databases, or other media used in the embodiments provided herein can include at least one of non-volatile and volatile memory. The nonvolatile Memory may include Read-Only Memory (ROM), magnetic tape, floppy disk, flash Memory, optical Memory, high-density embedded nonvolatile Memory, resistive Random Access Memory (ReRAM), magnetic Random Access Memory (MRAM), ferroelectric Random Access Memory (FRAM), phase Change Memory (PCM), graphene Memory, and the like. Volatile Memory can include Random Access Memory (RAM), external cache Memory, and the like. By way of illustration and not limitation, RAM can take many forms, such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM), among others. The databases referred to in various embodiments provided herein may include at least one of relational and non-relational databases. The non-relational database may include, but is not limited to, a block chain based distributed database, and the like. The processors referred to in the various embodiments provided herein may be, without limitation, general purpose processors, central processing units, graphics processors, digital signal processors, programmable logic devices, quantum computing-based data processing logic devices, or the like.
All possible combinations of the technical features in the above embodiments may not be described for the sake of brevity, but should be considered as being within the scope of the present disclosure as long as there is no contradiction between the combinations of the technical features.
The above examples only express several embodiments of the present application, and the description thereof is more specific and detailed, but not construed as limiting the scope of the present application. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the concept of the present application, which falls within the scope of protection of the present application. Therefore, the protection scope of the present application should be subject to the appended claims.

Claims (10)

1. An object recommendation method, characterized in that the method comprises:
acquiring a search text aiming at an object to be recommended;
segmenting the search text to obtain text characters corresponding to the search text;
performing prefix index on the text characters in a pre-constructed array to obtain a first matching result corresponding to the text characters; the array comprises a single child node tree structure;
performing maximum matching on the text characters in a pre-constructed double-array Trie tree according to the first matching result to obtain text word segmentation of the search text; the dual array Trie tree comprises a multi-child node tree structure;
and acquiring an object to be recommended matched with the text participles from a recommended object database by using the text participles.
2. The method of claim 1, wherein the prefix indexing of the text characters in a pre-constructed array to obtain a first matching result corresponding to the text characters comprises:
combining a first character and a second character in the text characters to obtain a first entry;
matching the first term in the array;
under the condition that matching is successful, determining that the first matching result is that the array comprises the first entry;
and under the condition of failed matching, determining that the first matching result is that the array does not comprise the first vocabulary entry.
3. The method of claim 2, wherein the performing the maximum matching on the text characters in a pre-constructed double-array Trie according to the first matching result to obtain the text segmentation of the search text comprises:
taking the last character of the first entry as the current character under the condition that the array comprises the first entry as the first matching result;
acquiring a next character of the current character in the text characters, and combining the first entry and the next character to obtain a second entry; taking the second entry as a current entry;
inquiring the current entry in the double-array Trie tree to obtain a current matching result corresponding to the current entry;
and under the condition that the current matching result is that the double-array Trie tree does not comprise the current entry and the current entry is not the prefix of the word in the double-array Trie tree, taking the current entry as one word segmentation in the text word segmentation.
4. The method of claim 3, wherein after querying the double-array Trie tree for the current term and obtaining a current matching result corresponding to the current term, the method further comprises:
taking the last character of the current entry as the current character under the condition that the current matching result is that the double-array Trie tree comprises the current entry or the current entry is the prefix of the word in the double-array Trie tree;
and acquiring a next character of the current character in the text characters, acquiring a new current entry by combining the current entry and the next character, and returning to execute the step of inquiring the current entry in the double-array Trie tree to acquire a current matching result corresponding to the current entry until the current matching result is that the double-array Trie tree does not comprise the current entry and the current entry is not a prefix of a word in the double-array Trie tree.
5. The method according to claim 2, wherein the performing maximum matching on the text characters in a pre-constructed double-array Trie according to the first matching result to obtain the text participles of the search text comprises:
taking a first character in the first entry as a current character under the condition that the array does not comprise the first entry as a first matching result;
acquiring a next character of the current character in the text characters, and combining the current character and the next character to obtain a current entry;
inquiring the current entry in the double-array Trie tree to obtain a current matching result corresponding to the current entry;
and under the condition that the current matching result is that the double-array Trie tree does not comprise the current entry and the current entry is not the prefix of the word in the double-array Trie tree, taking the current entry as one word in the text word segmentation.
6. The method of claim 1, further comprising:
querying a data pool for the search text; the data pool comprises historical participled text;
and under the condition that the search text does not exist in the data pool, segmenting the search text to obtain text characters corresponding to the search text.
7. The method according to claim 1, wherein the obtaining, by using the text participles, an object to be recommended that matches the text participles from a recommendation object database comprises:
acquiring the word segmentation number of the text word segmentation;
determining the product of a preset repetition rate threshold and the word segmentation number as the repeated word segmentation number of the search text and the matched text in the recommended object database;
and under the condition that the number of the repeated word segments is greater than or equal to a preset matching threshold value, determining the object corresponding to the matched text as the object to be recommended.
8. An object recommendation apparatus, characterized in that the apparatus comprises:
a first acquisition module, configured to acquire a search text for an object to be recommended;
the segmentation module is used for segmenting the search text to obtain text characters corresponding to the search text;
the first matching module is used for carrying out prefix index on the text characters in a pre-constructed array to obtain a first matching result; the array comprises a single child node tree structure;
the second matching module is used for performing maximum matching on the text characters in a pre-constructed double-array Trie tree according to the first matching result to obtain text word segmentation of the search text; the double-array Trie tree comprises a multi-child node tree structure;
and the second acquisition module is used for acquiring the object to be recommended matched with the text participle from a recommended object database by using the text participle.
9. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor, when executing the computer program, implements the steps of the method according to any one of claims 1 to 7.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 7.
CN202211510311.3A 2022-11-29 2022-11-29 Object recommendation method and device and computer equipment Pending CN115730596A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211510311.3A CN115730596A (en) 2022-11-29 2022-11-29 Object recommendation method and device and computer equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211510311.3A CN115730596A (en) 2022-11-29 2022-11-29 Object recommendation method and device and computer equipment

Publications (1)

Publication Number Publication Date
CN115730596A true CN115730596A (en) 2023-03-03

Family

ID=85298975

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211510311.3A Pending CN115730596A (en) 2022-11-29 2022-11-29 Object recommendation method and device and computer equipment

Country Status (1)

Country Link
CN (1) CN115730596A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117033563A (en) * 2023-10-10 2023-11-10 北京轻松怡康信息技术有限公司 Text retrieval method and device, electronic equipment and storage medium

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117033563A (en) * 2023-10-10 2023-11-10 北京轻松怡康信息技术有限公司 Text retrieval method and device, electronic equipment and storage medium
CN117033563B (en) * 2023-10-10 2024-04-26 北京轻松怡康信息技术有限公司 Text retrieval method and device, electronic equipment and storage medium

Similar Documents

Publication Publication Date Title
US9454580B2 (en) Recommendation system with metric transformation
CN107786943A (en) A kind of tenant group method and computing device
CN112988980B (en) Target product query method and device, computer equipment and storage medium
CN115730596A (en) Object recommendation method and device and computer equipment
CN114328798B (en) Processing method, device, equipment, storage medium and program product for searching text
CN116522003B (en) Information recommendation method, device, equipment and medium based on embedded table compression
CN111161009B (en) Information pushing method, device, computer equipment and storage medium
CN115687560A (en) Mass keyword searching method based on finite automaton
CN116975359A (en) Resource processing method, resource recommending method, device and computer equipment
CN115687350A (en) Index construction method and device, computer equipment and storage medium
CN114595389A (en) Address book query method, device, equipment, storage medium and program product
KR102062139B1 (en) Method and Apparatus for Processing Data Based on Intelligent Data Structure
CN116303405B (en) Data duplicate checking method and device and computer equipment
CN116414778A (en) File integration method, apparatus, computer device, storage medium and program product
CN113051302B (en) Overall design-oriented multi-dimensional data matching method and device and computer storage medium
CN117633395A (en) Multi-website structured data acquisition method, device and computer equipment
CN115659968A (en) Professional term recognition method, device, computer equipment and storage medium
CN114780681A (en) Audit scheme recommendation method and device, computer equipment and storage medium
CN116049350A (en) Data retrieval method, device, computer equipment and storage medium
CN117391057A (en) Table cell processing method and apparatus
CN117033451A (en) Searching method, searching device, computer equipment and storage medium
CN114168787A (en) Music recommendation method and device, computer equipment and storage medium
CN114547066A (en) Nuclear power business data standardization method and device and computer equipment
CN116383508A (en) Searching method, searching device, computer equipment and storage medium
CN114662480A (en) Synonym label judging method and device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination