CN115907928A - Commodity recommendation method, commodity recommendation device, commodity recommendation equipment and commodity recommendation medium - Google Patents


Publication number
CN115907928A
Authority
CN
China
Prior art keywords
commodity
word
phrase
grammatical
candidate
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211737299.XA
Other languages
Chinese (zh)
Inventor
黄丕帅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Huanju Shidai Information Technology Co Ltd
Original Assignee
Guangzhou Huanju Shidai Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Huanju Shidai Information Technology Co Ltd
Priority to CN202211737299.XA
Publication of CN115907928A

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00: Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30: Computing systems specially adapted for manufacturing

Abstract

The application relates to a commodity recommendation method, device, equipment and medium in the technical field of e-commerce. The method comprises the following steps: acquiring a commodity title of a commodity, segmenting the commodity title into words, and constructing a corresponding candidate phrase set; judging whether a grammatical relation exists between the word elements in each candidate phrase in the candidate phrase set and, when a grammatical relation exists, determining the grammatical structure to which that relation belongs; determining the similarity between the commodity title and each candidate phrase belonging to the same grammatical structure; and, for each grammatical structure, screening out the candidate phrases whose similarity meets a preset condition to serve as matching keywords of the commodity for commodity recommendation. By combining the grammatical information of phrases, the application deeply mines multi-dimensional keywords of the commodity for use in commodity recommendation, which can improve recommendation precision.

Description

Commodity recommendation method, commodity recommendation device, commodity recommendation equipment and commodity recommendation medium
Technical Field
The present application relates to the field of e-commerce technologies, and in particular, to a method for recommending a commodity and a corresponding apparatus, computer device, and computer-readable storage medium.
Background
The commodity conversion rate is one of the core indicators of commodity sales profit, and commodity recommendation is an effective tool for improving it. For consumers, commodity recommendation draws attention directly to the recommended commodities without any searching, which reduces time cost and improves the shopping experience. For sellers, commodity recommendation increases the exposure of commodities, promotes transactions, and increases revenue.
Generally, a keyword-based commodity recommendation method determines which commodities to recommend by checking whether their keywords are hit. However, in the e-commerce scenario of independent sites, commodity titles are freely edited by merchants and are not bound by explicit editing specifications, so extracting the corresponding keywords from a commodity title becomes particularly difficult.
In view of this difficulty in commodity recommendation, the applicant has carried out corresponding research.
Disclosure of Invention
A primary object of the present application is to solve at least one of the above problems and to provide a commodity recommendation method and a corresponding apparatus, computer device, and computer-readable storage medium.
To meet the various objects of the present application, the following technical solutions are adopted:
A commodity recommendation method adapted to one of the objects of the present application includes the following steps:
acquiring a commodity title of a commodity, segmenting the commodity title and constructing a corresponding candidate phrase set;
judging whether a grammatical relation exists between the word elements in each candidate phrase in the candidate phrase set, and determining a grammatical structure to which the corresponding grammatical relation belongs when the grammatical relation exists;
determining the similarity between the commodity title and each candidate phrase belonging to the same grammatical structure;
and screening out the candidate phrases with the similarity meeting the preset conditions for each grammatical structure, and taking the candidate phrases as matching keywords of the commodity for commodity recommendation.
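The screening step above can be sketched as follows. This is a minimal illustration only: it assumes that the "preset condition" is a simple similarity threshold, and the function name `screen_keywords`, the threshold value and the grammatical structure labels are hypothetical and do not come from the application.

```python
from collections import defaultdict


def screen_keywords(scored_phrases, threshold=0.5):
    """Group candidate phrases by grammatical structure and keep those whose
    similarity to the title meets the threshold (an assumed preset condition).

    scored_phrases: iterable of (phrase, structure, similarity) tuples.
    Returns a dict mapping each structure to its kept phrases, best first.
    """
    by_structure = defaultdict(list)
    for phrase, structure, similarity in scored_phrases:
        if similarity >= threshold:
            by_structure[structure].append((phrase, similarity))
    # Within each structure, order the kept phrases by descending similarity.
    return {
        structure: [p for p, _ in sorted(items, key=lambda x: x[1], reverse=True)]
        for structure, items in by_structure.items()
    }


# Hypothetical structure labels and scores, for illustration only.
scored = [
    ("red dress", "modifier-head", 0.9),
    ("dress for", "modifier-head", 0.2),
    ("wear dress", "verb-object", 0.7),
]
keywords = screen_keywords(scored, threshold=0.5)
```

Keeping the result keyed by grammatical structure preserves the "multi-dimensional" character of the matching keywords, since each structure contributes its own keyword list.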
In a further embodiment, acquiring the commodity title of the commodity, segmenting the commodity title and constructing the corresponding candidate phrase set comprises the following steps:
acquiring a commodity title of a commodity, and performing sliding word extraction on the commodity title with a preset word-taking window at each of multiple moving step lengths to obtain corresponding multiple word segmentation phrases;
summarizing the word segmentation phrases to form a candidate phrase set.
In a further embodiment, acquiring the commodity title of the commodity and performing sliding word extraction on the commodity title with a preset word-taking window at each of multiple moving step lengths to obtain the corresponding multiple word segmentation phrases comprises the following steps:
acquiring a commodity title of a commodity;
moving a word-taking window with a first step length to take words from the commodity title to obtain a first word segmentation sequence, wherein the first step length is smaller than the window size of the word-taking window;
moving a word-taking window with a second step length to take words from the commodity title to obtain a second word segmentation sequence, wherein the second step length is equal to the window size of the word-taking window;
and moving a word-taking window with a third step length to take words from the commodity title to obtain a third word segmentation sequence, wherein the third step length is larger than the window size of the word-taking window.
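The three sliding passes described above can be illustrated with a short sketch. It assumes whitespace tokenization and a window of two tokens, and picks step lengths of window - 1, window and window + 1 as one possible choice of "smaller than", "equal to" and "larger than" the window size; the function names are hypothetical.

```python
def slide_window(tokens, window, step):
    """Return the phrases covered by a window of `window` tokens moved by `step`."""
    return [
        " ".join(tokens[i:i + window])
        for i in range(0, len(tokens) - window + 1, step)
    ]


def build_candidates(title, window=2):
    """Run three passes (step < window, step == window, step > window) and
    merge the resulting phrases, dropping duplicates but keeping order."""
    tokens = title.split()  # assumption: words are separated by spaces
    steps = (window - 1, window, window + 1)  # requires window >= 2
    candidates = []
    for step in steps:
        for phrase in slide_window(tokens, window, step):
            if phrase not in candidates:
                candidates.append(phrase)
    return candidates


tokens = "women summer cotton dress long sleeve".split()
overlapping = slide_window(tokens, window=2, step=1)
adjacent = slide_window(tokens, window=2, step=2)
skipping = slide_window(tokens, window=2, step=3)
```

With a step smaller than the window the phrases overlap, with an equal step they tile the title, and with a larger step some tokens are skipped, so the three passes see the title at different granularities.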
In a further embodiment, determining whether a grammatical relationship exists between the lemmas in each candidate phrase in the candidate phrase set, and when the grammatical relationship exists, determining the grammar structure to which the corresponding grammatical relationship belongs includes the following steps:
extracting deep semantic information of the candidate phrases by using a preset multi-path classification model and taking a single candidate phrase in the candidate phrase set as input to obtain a corresponding semantic feature vector;
determining whether a grammatical relation exists between the word elements in each candidate phrase in the candidate phrase set through a first classifier of the multi-path classification model based on the semantic feature vector;
when the prediction result of the first classifier is that a grammatical relation exists, determining, through a second classifier of the multi-path classification model and based on the semantic feature vector of the corresponding candidate phrase, the grammatical structure to which the lemmas in the candidate phrase belong.
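The two-stage decision above (a shared feature vector, a first classifier that gates a second classifier) can be sketched as plain control flow. The encoder and classifiers are passed in as callables; the toy stand-ins below are hypothetical placeholders rather than trained models, and the structure label they return is invented for illustration.

```python
def classify_phrase(phrase, encode, first_classifier, second_classifier):
    """Return the predicted grammatical structure, or None if the first
    classifier predicts that no grammatical relation exists."""
    features = encode(phrase)            # shared semantic feature vector
    if not first_classifier(features):   # binary decision: relation or not
        return None
    return second_classifier(features)   # multi-class decision: which structure


# Toy stand-ins (hypothetical, for illustration only).
def toy_encode(phrase):
    return phrase.split()


def toy_first(features):
    return "for" not in features


def toy_second(features):
    return "modifier-head"


related = classify_phrase("red dress", toy_encode, toy_first, toy_second)
unrelated = classify_phrase("dress for", toy_encode, toy_first, toy_second)
```

Because both classifiers read the same feature vector, the phrase is encoded once and the second classifier runs only for phrases that pass the first.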
In a further embodiment, determining the similarity between the commodity title and each candidate phrase belonging to the same grammatical structure comprises the following steps:
extracting deep semantic information corresponding to each candidate phrase and the commodity title which belong to the same syntactic structure by adopting a preset text coding model, and obtaining a text coding vector corresponding to each candidate phrase and a text coding vector of the commodity title;
and calculating the similarity between the text code vector corresponding to each candidate phrase and the text code vector of the commodity title.
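The text above does not fix a particular similarity measure at this point. As one common choice, and purely as an assumption, the similarity between two text coding vectors could be computed as cosine similarity:

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors.

    Returns 0.0 if either vector has zero length, to avoid division by zero.
    """
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical 3-dimensional vectors standing in for model outputs.
title_vector = [0.2, 0.7, 0.1]
phrase_vector = [0.1, 0.8, 0.0]
score = cosine_similarity(phrase_vector, title_vector)
```

Cosine similarity depends only on the direction of the vectors, so a short phrase and a long title can still score highly when their encodings point the same way.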
In a further embodiment, after the candidate phrases with the similarity satisfying the preset condition are screened out for each grammar structure as the matching keywords of the commodity, the method further includes the following steps:
acquiring matching keywords corresponding to the target commodity and other commodities;
for the target commodity and each other commodity, computing a weighted sum in which the number of matching keywords that the two commodities share under the same grammatical structure is weighted by the weight of that grammatical structure, to obtain an intermediate value, and calculating the ratio of the intermediate value to the total number of matching keywords of the target commodity and the other commodity, to obtain the similarity score between the target commodity and the other commodity;
and screening other commodities with similar scores meeting preset conditions as recommended commodities of the target commodity.
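A minimal sketch of the similarity score follows. It reads the step above as a weighted count of shared keywords per grammatical structure divided by the combined keyword count of the two commodities; treating the "total number" as the sum of both counts, as well as the structure labels and weights shown, are assumptions made for illustration.

```python
def similarity_score(target_kw, other_kw, weights):
    """Score two commodities from their matching keywords.

    target_kw, other_kw: dict mapping grammatical structure -> set of keywords.
    weights: dict mapping grammatical structure -> weight of that structure.
    """
    intermediate = 0.0
    for structure, weight in weights.items():
        shared = target_kw.get(structure, set()) & other_kw.get(structure, set())
        intermediate += weight * len(shared)
    # Assumption: the denominator is the sum of both commodities' keyword counts.
    total = sum(len(v) for v in target_kw.values()) + sum(len(v) for v in other_kw.values())
    return intermediate / total if total else 0.0


# Hypothetical structures, keywords and weights.
target = {"modifier-head": {"red dress", "cotton dress"}, "verb-object": {"wear dress"}}
other = {"modifier-head": {"red dress"}, "verb-object": {"wear dress", "buy dress"}}
weights = {"modifier-head": 2.0, "verb-object": 1.0}
score = similarity_score(target, other, weights)
```

In this example one keyword is shared under each structure, giving an intermediate value of 2.0 + 1.0 = 3.0 over six keywords in total.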
In a further embodiment, before a preset multi-path classification model is adopted and a single candidate phrase in the candidate phrase set is used as an input, the method further comprises the following steps:
obtaining a single phrase sample and its supervision label from a prepared training set, wherein the phrase sample comprises a plurality of word elements, the supervision label characterizes whether a grammatical relation exists among the word elements in the phrase sample, and, when it characterizes that a grammatical relation exists, the supervision label further characterizes the grammatical structure to which the word elements in the phrase sample belong;
inputting the phrase samples into a multi-path classification model, extracting deep semantic information corresponding to the phrase samples, and obtaining corresponding semantic feature vectors;
predicting a corresponding first inference result through a first classifier based on the semantic feature vector;
predicting a corresponding second inference result through a second classifier based on the semantic feature vector when the supervision label characterizes that a grammatical relation exists among the word elements in the phrase sample;
and, using the supervision label of the phrase sample, determining a first loss value corresponding to the first inference result and the second inference result when the supervision label characterizes that a grammatical relation exists among the word elements in the phrase sample, and otherwise determining a second loss value of the first inference result; when the first loss value or the second loss value has not reached the corresponding preset threshold, updating the weights of the multi-path classification model and continuing to call other phrase samples for iterative training until the multi-path classification model converges.
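The label-dependent loss described above can be sketched as follows. The text does not state the loss function or how the two inference results are combined, so cross-entropy and a plain sum are assumptions here, and the function names are hypothetical.

```python
import math


def cross_entropy(probs, target_index):
    """Negative log-probability of the labelled class (clamped for stability)."""
    return -math.log(max(probs[target_index], 1e-12))


def sample_loss(first_probs, second_probs, has_relation, structure_index=None):
    """Loss for one phrase sample.

    first_probs: [P(no relation), P(relation)] from the first classifier.
    second_probs: class probabilities from the second classifier.
    The second classifier contributes only when the label says a
    grammatical relation exists (assumed here to be a plain sum).
    """
    loss = cross_entropy(first_probs, 1 if has_relation else 0)
    if has_relation:
        loss += cross_entropy(second_probs, structure_index)
    return loss


# Label without a grammatical relation: only the first classifier is penalized.
loss_without = sample_loss([0.5, 0.5], [0.25, 0.75], has_relation=False)
# Label with a relation of structure class 1: both classifiers are penalized.
loss_with = sample_loss([0.5, 0.5], [0.25, 0.75], has_relation=True, structure_index=1)
```

Skipping the second term for samples without a grammatical relation means the second classifier is never trained on phrases that have no structure label.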
In another aspect, a commodity recommendation device adapted to one of the objects of the present application includes a set construction module, a grammar determination module, a similarity determination module, and a keyword screening module. The set construction module is configured to acquire a commodity title of a commodity, segment the commodity title, and construct a corresponding candidate phrase set; the grammar determination module is configured to judge whether a grammatical relation exists between the word elements in each candidate phrase in the candidate phrase set and, when a grammatical relation exists, determine the grammatical structure to which that relation belongs; the similarity determination module is configured to determine the similarity between the commodity title and each candidate phrase belonging to the same grammatical structure; and the keyword screening module is configured to screen out, for each grammatical structure, the candidate phrases whose similarity meets the preset condition to serve as matching keywords of the commodity for commodity recommendation.
In a further embodiment, the set construction module includes: a sliding word-taking submodule, configured to acquire the commodity title of the commodity and perform sliding word extraction on the commodity title with a preset word-taking window at each of multiple moving step lengths to obtain corresponding multiple word segmentation phrases; and a phrase summarizing submodule, configured to summarize the word segmentation phrases to form a candidate phrase set.
In a further embodiment, the sliding word-taking submodule includes: a title acquisition unit, configured to acquire the commodity title of a commodity; a first word-taking unit, configured to move the word-taking window with a first step length to take words from the commodity title to obtain a first word segmentation sequence, wherein the first step length is smaller than the window size of the word-taking window; a second word-taking unit, configured to move the word-taking window with a second step length to take words from the commodity title to obtain a second word segmentation sequence, wherein the second step length is equal to the window size of the word-taking window; and a third word-taking unit, configured to move the word-taking window with a third step length to take words from the commodity title to obtain a third word segmentation sequence, wherein the third step length is larger than the window size of the word-taking window.
In a further embodiment, the grammar determination module includes: an inference feature representation submodule, configured to adopt a preset multi-path classification model, take a single candidate phrase in the candidate phrase set as input, and extract deep semantic information of the candidate phrase to obtain a corresponding semantic feature vector; a binary classification submodule, configured to determine, through a first classifier of the multi-path classification model and based on the semantic feature vector, whether a grammatical relation exists among the word elements in each candidate phrase in the candidate phrase set; and a multi-classification submodule, configured to determine, when the prediction result of the first classifier is that a grammatical relation exists, the grammatical structure to which the word elements in the candidate phrase belong, through a second classifier of the multi-path classification model and based on the semantic feature vector of the corresponding candidate phrase.
In a further embodiment, the similarity determination module includes: a text coding submodule, configured to adopt a preset text coding model to extract deep semantic information of the commodity title and of each candidate phrase belonging to the same grammatical structure, obtaining a text coding vector for each candidate phrase and a text coding vector for the commodity title; and a vector similarity submodule, configured to calculate the similarity between the text coding vector of each candidate phrase and the text coding vector of the commodity title.
In a further embodiment, the device further includes, following the keyword screening module: a keyword acquisition submodule, configured to acquire the matching keywords corresponding to the target commodity and to other commodities; a similarity score calculation submodule, configured to compute, for the target commodity and each other commodity, a weighted sum in which the number of matching keywords that the two commodities share under the same grammatical structure is weighted by the weight of that grammatical structure, to obtain an intermediate value, and to calculate the ratio of the intermediate value to the total number of matching keywords of the target commodity and the other commodity, to obtain the similarity score between the target commodity and the other commodity; and a commodity screening submodule, configured to screen out the other commodities whose similarity scores meet a preset condition as recommended commodities of the target commodity.
In a further embodiment, the device further includes, preceding the inference feature representation submodule: a sample acquisition submodule, configured to obtain a single phrase sample and its supervision label from a prepared training set, wherein the phrase sample comprises a plurality of word elements, the supervision label characterizes whether a grammatical relation exists among the word elements in the phrase sample, and, when it characterizes that a grammatical relation exists, the supervision label further characterizes the grammatical structure to which the word elements in the phrase sample belong; a training feature representation submodule, configured to input the phrase sample into the multi-path classification model and extract deep semantic information of the phrase sample to obtain a corresponding semantic feature vector; a first classification submodule, configured to predict a corresponding first inference result through the first classifier based on the semantic feature vector; a second classification submodule, configured to predict a corresponding second inference result through the second classifier based on the semantic feature vector when the supervision label characterizes that a grammatical relation exists among the word elements in the phrase sample; and an iterative training submodule, configured to use the supervision label of the phrase sample to determine a first loss value corresponding to the first inference result and the second inference result when the supervision label characterizes that a grammatical relation exists among the word elements in the phrase sample, and otherwise to determine a second loss value of the first inference result, to update the weights of the multi-path classification model when the first loss value or the second loss value has not reached the corresponding preset threshold, and to continue calling other phrase samples for iterative training until the multi-path classification model converges.
In yet another aspect, a computer device adapted for one of the purposes of the present application includes a central processing unit and a memory, the central processing unit being configured to invoke execution of a computer program stored in the memory to perform the steps of the merchandise recommendation method described in the present application.
In still another aspect, a computer-readable storage medium is provided, which stores, in the form of computer-readable instructions, a computer program implementing the commodity recommendation method; when the computer program is called and run by a computer, the steps included in the method are executed.
The technical solution of the present application has various advantages, including but not limited to the following aspects:
By segmenting the commodity title of a commodity, the present application constructs a corresponding candidate phrase set, judges whether a grammatical relation exists among the word elements in each candidate phrase, determines the grammatical structure to which the relation belongs when one exists, and then determines the similarity between the commodity title and each candidate phrase belonging to the same grammatical structure, so that, for each grammatical structure, the candidate phrases whose similarity meets a preset condition are screened out as matching keywords of the commodity for commodity recommendation. Candidate phrases of multiple grammatical structures are thereby mined in depth from the commodity title as matching keywords, so that the multi-dimensional matching keywords accurately represent the characteristics of the corresponding commodity and support accurate commodity recommendation based on those keywords.
Drawings
The foregoing and/or additional aspects and advantages of the present application will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
FIG. 1 is a schematic flowchart of an exemplary embodiment of the commodity recommendation method of the present application;
FIG. 2 is a schematic flowchart of constructing a candidate phrase set in an embodiment of the present application;
FIG. 3 is a schematic flowchart of determining whether a grammatical relation exists between the lemmas in candidate phrases, and the corresponding grammatical structure, in an embodiment of the present application;
FIG. 4 is a diagram of a multi-path classification model in an embodiment of the present application;
FIG. 5 is a schematic flowchart of determining the similarity between the candidate phrases in a candidate phrase set and the commodity title in an embodiment of the present application;
FIG. 6 is a schematic flowchart of screening recommended commodities for a target commodity in an embodiment of the present application;
FIG. 7 is a diagram of a training process for the multi-path classification model in an embodiment of the present application;
FIG. 8 is a schematic block diagram of a commodity recommendation device of the present application;
FIG. 9 is a schematic structural diagram of a computer device used in the present application.
Detailed Description
Reference will now be made in detail to embodiments of the present application, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the accompanying drawings are exemplary only for explaining the present application and are not construed as limiting the present application.
As used herein, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. It will be understood that when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element or intervening elements may also be present. Further, "connected" or "coupled" as used herein may include wirelessly connected or wirelessly coupled. As used herein, the term "and/or" includes all or any element and all combinations of one or more of the associated listed items.
It will be understood by those within the art that, unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the prior art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
As will be understood by those skilled in the art, "client," "terminal," and "terminal device" as used herein include both devices that have only a wireless signal receiver without transmit capability and devices that have receiving and transmitting hardware capable of two-way communication over a two-way communication link. Such a device may include: cellular or other communication devices, such as personal computers and tablets, with or without a single-line or multi-line display; a PCS (personal communications system), which may combine voice, data processing, facsimile and/or data communications capabilities; a PDA (personal digital assistant), which may include a radio frequency receiver, a pager, internet/intranet access, a web browser, a notepad, a calendar, and/or a GPS (global positioning system) receiver; and a conventional laptop and/or palmtop computer or other device having and/or including a radio frequency receiver. As used herein, a "client," "terminal," or "terminal device" may be portable, transportable, installed in a vehicle (aeronautical, maritime, and/or land-based), or situated and/or configured to operate locally and/or in a distributed fashion at any other location(s) on earth and/or in space. The "client", "terminal" and "terminal device" used herein may also be a communication terminal, an internet access terminal, or a music/video playing terminal, for example a PDA, an MID (mobile internet device) and/or a mobile phone with a music/video playing function, or a smart television, a set-top box, or another such device.
The hardware referred to by the names "server", "client", "service node", etc. is essentially an electronic device with the capabilities of a personal computer: a hardware device having the necessary components described by the von Neumann principle, such as a central processing unit (including an arithmetic unit and a controller), a memory, an input device, and an output device. A computer program is stored in the memory, and the central processing unit loads the program from external memory into internal memory, runs it, executes the instructions in the program, and interacts with the input and output devices, thereby completing a specific function.
It should be noted that the concept of "server" in the present application can be extended to the case of server cluster. According to the network deployment principle understood by those skilled in the art, the servers should be logically divided, and in physical space, the servers may be independent from each other but can be called through an interface, or may be integrated into one physical computer or a set of computer clusters. Those skilled in the art will appreciate this variation and should not be so limited as to restrict the implementation of the network deployment of the present application.
Unless expressly specified otherwise, one or more technical features of the present application may be deployed on a server and accessed by a client that remotely invokes an online service interface provided by the server, or may be deployed and run directly on the client.
Unless expressly specified otherwise, a neural network model referred to, or possibly referred to, in the present application may be deployed on a remote server and called remotely from a client, or may be deployed on a client with adequate device capability and called directly.
Unless expressly specified otherwise, the various data referred to in the present application may be stored remotely on a server or locally on a terminal device, as long as the data can be called by the technical solution of the present application.
Those skilled in the art will understand that, although the various methods of the present application are described on the basis of the same concept and therefore share common elements, they may be performed independently unless otherwise specified. Likewise, the embodiments disclosed in the present application are proposed on the basis of the same inventive concept, so concepts expressed in the same terms are to be understood identically, and concepts expressed in different terms differ only as a matter of convenience.
Unless a mutually exclusive relationship between the related technical features is expressly stated, the embodiments disclosed herein can be flexibly constructed by combining related technical features across embodiments, as long as the combination does not depart from the inventive spirit of the present application and can meet a need in the prior art or remedy a deficiency of the prior art. Those skilled in the art will appreciate such variations.
The commodity recommendation method can be programmed into a computer program product and deployed in a client or a server to run, for example, in an exemplary application scenario of the present application, the commodity recommendation method can be deployed and implemented in a server of an e-commerce platform, so that the method can be executed by accessing an open interface after the computer program product runs and performing human-computer interaction with a process of the computer program product through a graphical user interface.
Referring to fig. 1, in an exemplary embodiment, a method for recommending a commodity includes the following steps:
Step S1100, acquiring a commodity title of a commodity, segmenting the commodity title, and constructing a corresponding candidate phrase set;
In the application scenario of an e-commerce platform, each listed commodity can be treated as a relatively independent information unit, which the merchant user of an online shop on the platform is responsible for publishing, maintaining and updating, and which is offered to consumer users for browsing, ordering and the like. The online shop may be an independent site that maintains its own commodity database of listed commodities. When a user of the online shop needs to list a commodity, the commodity information for that commodity is entered on the corresponding commodity publishing page of the e-commerce platform and then submitted to the background server, which stores the commodity information in the commodity database in association with the unique identification code of the commodity. The commodity information comprises commodity pictures and commodity texts: the commodity pictures visually present the effect and appearance of the commodity, and the commodity texts comprise text information describing the commodity, such as the commodity title and commodity detail text.
The commodity title is stored in association with the listed commodity and serves as commodity description information provided in text form. In practice, the commodity title can describe, in concise language, specific information about the listed commodity such as its name, brand, material, function, application and selling points.
In one embodiment, the product title of the product can be obtained from a product database of the online shop according to a unique identification code of the product, wherein the unique identification code is a unique identification set by software engineering personnel for distinguishing each product of the e-commerce platform, so that the product information of the product can be stored and called conveniently.
When segmenting the commodity title, tools and algorithms such as jieba, the Stanford word segmenter, HanLP, the KCWS word segmenter, THULAC, N-Gram and deep learning can be adopted for commodity titles in Chinese. For natural languages such as English, French and German, in which words are separated by spaces, the commodity title can be segmented simply by splitting on spaces, or with tools and algorithms such as Keras, spaCy, Gensim, NLTK, N-Gram and deep learning.
As an illustrative example, the search engine mode of the jieba word segmentation algorithm is adopted to segment the commodity title and obtain the corresponding word segmentation texts. A person skilled in the art will understand that the precise mode of jieba segments the commodity title as accurately as possible, so that the resulting word segmentation texts contain no redundant data, whereas the search engine mode further segments the long words on the basis of the precise mode, making the segmentation granularity finer.
In one embodiment, segmenting the commodity title yields a corresponding word segmentation sequence, which comprises a plurality of word segmentation texts ordered according to the word order of the commodity title; each word segmentation text is a relatively minimal, indivisible unit with complete semantics. Further, each word segmentation text in the sequence is combined with one or more word segmentation texts that follow it to form a candidate phrase. The number of word segmentation texts combined into a candidate phrase can be set flexibly by a person skilled in the art; 2 is recommended. All the combined candidate phrases are then collected to form the candidate phrase set. It can be understood that combining the word segmentation texts obtained from the commodity title in this way ensures that the candidate phrase set contains a sufficient number of candidate phrases.
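The combination of word segmentation texts described above can be sketched as follows. This is a minimal illustration, not the implementation of the present application: the function name `build_candidate_phrases`, the reading of the recommended value 2 as the total number of tokens per phrase, the space separator and the removal of duplicates are all assumptions made for the example, and the title is segmented by whitespace instead of by a word segmenter.

```python
from typing import List


def build_candidate_phrases(tokens: List[str], max_len: int = 2, sep: str = " ") -> List[str]:
    """Combine each token with the token(s) that follow it (hypothetical helper).

    max_len is the largest number of tokens in one phrase; 2 is assumed here
    to correspond to the recommended setting.
    """
    phrases: List[str] = []
    seen = set()
    for start in range(len(tokens)):
        # Try phrase lengths from 2 tokens up to max_len tokens.
        for length in range(2, max_len + 1):
            end = start + length
            if end > len(tokens):
                break  # not enough following tokens left
            phrase = sep.join(tokens[start:end])
            if phrase not in seen:  # de-duplication is an assumption of this sketch
                seen.add(phrase)
                phrases.append(phrase)
    return phrases


# Whitespace splitting stands in for a real word segmenter here.
title_tokens = "women summer cotton dress".split()
candidate_set = build_candidate_phrases(title_tokens)
print(candidate_set)
```

Raising `max_len` produces longer phrases in addition to the two-token ones, which enlarges the candidate phrase set at the cost of more phrases to classify in step S1200.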
Step S1200, judging whether a grammatical relation exists between the word elements in each candidate phrase in the candidate phrase set, and determining a grammatical structure to which the corresponding grammatical relation belongs when the grammatical relation exists;
A deep learning model suitable for extracting text semantic features in the NLP (Natural Language Processing) field is adopted, followed by a first classifier suitable for a binary classification task and a second classifier suitable for a multi-class classification task. The deep learning model extracts the deep semantic information of the word elements in each candidate phrase in the candidate phrase set to obtain the semantic feature vector of each candidate phrase. Based on the first classifier and the semantic feature vector of each candidate phrase, it is determined whether a grammatical relation exists between the word elements in that candidate phrase. The grammatical relation refers to the relation between the constituent units of a grammatical structure, such as the subject-predicate relation, the coordinate relation, the modifier-head relation, the verb-object relation and the verb-complement relation.
When it is determined that a grammatical relation exists between the word elements in a candidate phrase in the candidate phrase set, the semantic feature vector of that candidate phrase is obtained, and the grammatical structure to which the grammatical relation belongs is determined based on this semantic feature vector and the second classifier connected to the deep learning model. The grammatical structure refers to the subject-predicate, coordinate, modifier-head, verb-object or verb-complement structure and the like.
The deep learning model may be, for example, LSTM or BiLSTM, or a model from the open-source framework Sentence-Transformers, which provides a number of Transformer models pre-trained to convergence, such as BERT, RoBERTa, XLM-RoBERTa and MPNet. The specific choice can be made flexibly by a person skilled in the art.
It can be understood that the deep learning model is followed by the first classifier and the second classifier. Phrases are prepared as samples with reference to step S1100, and a supervision label marks whether a grammatical relation exists between the word elements in each phrase sample; when a grammatical relation exists, the supervision label further indicates the grammatical structure to which the word elements in the phrase sample belong. The deep learning model under this model structure is trained under supervision with the phrase samples and their supervision labels. After being trained to convergence, the model acquires the ability to produce a feature representation of an input phrase, to perform binary classification to determine whether a grammatical relation exists between the word elements in the phrase, and to perform multi-class classification to determine the grammatical structure to which the word elements of a phrase having a grammatical relation belong.
Step S1300, determining the similarity between the commodity title and each candidate phrase belonging to the same grammatical structure;
A model suitable for extracting text features in the NLP field is used as the text coding model. Each candidate phrase belonging to the same grammatical structure and the commodity title are taken as inputs of the text coding model, the corresponding deep semantic information is extracted, and the vectorized feature representation of each candidate phrase and that of the commodity title are obtained.
Because each candidate phrase and the commodity title have corresponding vectorized feature representations, that is, they have been converted into comparable numerical form, the similarity between the vectorized feature representation of each candidate phrase and that of the commodity title can be calculated and used to represent the similarity between the candidate phrase and the commodity title. The similarity calculation can be implemented with any large-scale vector retrieval engine such as Faiss, Elasticsearch or Milvus, or with any ready-made algorithm such as cosine similarity, inner product, Manhattan distance or Euclidean distance.
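The four ready-made measures named above can be written out in a few lines. The sketch below uses plain Python lists as stand-ins for the vectorized feature representations; the function names and the toy vectors are assumptions for illustration, and a real deployment would more likely rely on a vector retrieval engine.

```python
import math
from typing import Sequence


def inner_product(u: Sequence[float], v: Sequence[float]) -> float:
    """Sum of element-wise products; larger means more similar."""
    return sum(a * b for a, b in zip(u, v))


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Inner product of the two vectors after length normalisation."""
    norm_u = math.sqrt(inner_product(u, u))
    norm_v = math.sqrt(inner_product(v, v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen for this sketch when a vector is all zeros
    return inner_product(u, v) / (norm_u * norm_v)


def manhattan_distance(u: Sequence[float], v: Sequence[float]) -> float:
    """Sum of absolute coordinate differences; smaller means more similar."""
    return sum(abs(a - b) for a, b in zip(u, v))


def euclidean_distance(u: Sequence[float], v: Sequence[float]) -> float:
    """Straight-line distance; smaller means more similar."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Toy vectors standing in for the encoded commodity title and one candidate phrase.
title_vec = [0.2, 0.7, 0.1]
phrase_vec = [0.1, 0.8, 0.0]
print(cosine_similarity(phrase_vec, title_vec))
```

Note that cosine similarity and inner product grow with similarity, while the two distances shrink, so a screening condition such as "greater than a preset threshold" has to be inverted when a distance is used.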
The text coding model, such as the BERT model, is among the better neural network models to date for processing the sequential information of text, and is suitable for the text feature extraction work in the present application. The text coding model used in the present application is pre-trained to convergence before being put into use; since the training processes of models such as Transformer and BERT are known to those skilled in the art, they are not detailed here.
And S1400, screening the candidate phrases with the similarity meeting the preset conditions for each grammar structure, and using the candidate phrases as matching keywords of the commodity for commodity recommendation.
In one embodiment, for each grammatical structure, the candidate phrases whose similarity is greater than a preset threshold are screened out as the matching keywords of the commodity; the preset threshold can be set flexibly by a person skilled in the art. It can be understood that screening by a preset threshold yields a certain number of matching keywords for each grammatical structure of the commodity, which facilitates the subsequent pairwise matching of commodities based on their matching keywords and the determination of a suitable number of commodities to recommend.
In another embodiment, for each grammatical structure, the candidate phrase with the maximum similarity is screened out as the matching keyword of the commodity. It can be understood that matching keywords screened out by maximum similarity are highly accurate, which facilitates the subsequent pairwise matching of commodities based on their matching keywords and the accurate determination of the commodities to recommend.
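The two screening embodiments can be compared side by side in a short sketch. The data layout (a dictionary from grammatical structure label to a list of phrase and similarity pairs), the function names, the label strings and the similarity values are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

# (candidate phrase, similarity to the commodity title)
Scored = Tuple[str, float]


def screen_by_threshold(scored: Dict[str, List[Scored]], threshold: float) -> Dict[str, List[str]]:
    """Keep, per grammatical structure, every phrase whose similarity exceeds the threshold."""
    return {
        structure: [phrase for phrase, sim in items if sim > threshold]
        for structure, items in scored.items()
    }


def screen_by_maximum(scored: Dict[str, List[Scored]]) -> Dict[str, str]:
    """Keep, per grammatical structure, only the phrase with the highest similarity."""
    return {
        structure: max(items, key=lambda item: item[1])[0]
        for structure, items in scored.items()
        if items  # structures without candidates are skipped
    }


# Illustrative labels and scores only.
scored_phrases = {
    "modifier-head": [("cotton dress", 0.9), ("summer cotton", 0.6)],
    "verb-object": [("wear dress", 0.4)],
}
print(screen_by_threshold(scored_phrases, threshold=0.5))
print(screen_by_maximum(scored_phrases))
```

With a threshold, a grammatical structure may end up with several matching keywords or none; with the maximum, every structure that has at least one candidate contributes exactly one keyword.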
As can be appreciated from the exemplary embodiments of the present application, the technical solution of the present application has various advantages, including but not limited to the following aspects:
the method includes the steps of constructing a corresponding candidate phrase set by segmenting a commodity title of a commodity, judging whether grammatical relations exist among word elements in each candidate phrase, determining a grammatical structure to which the corresponding grammatical relations belong when the grammatical relations exist, further determining the similarity between each candidate phrase and the commodity title which belong to the same grammatical structure, and accordingly screening out the candidate phrases of which the similarity of each grammatical structure meets a preset condition as matching keywords of the commodity for commodity recommendation. Candidate phrases with multiple grammatical structures are effectively and deeply mined from the commodity titles as matching keywords of the commodities, so that the characteristics of the corresponding commodities can be accurately represented by the multi-dimensional matching keywords, and the accuracy of commodity recommendation according to the matching keywords of the commodities is guaranteed.
Referring to fig. 2, in a further embodiment, the step S1100 of obtaining a title of a commodity, performing word segmentation on the title of the commodity, and constructing a corresponding candidate phrase set includes the following steps:
step S1110, acquiring a commodity title of a commodity, and respectively performing sliding word extraction on the commodity title by adopting a preset word extraction window in multiple moving step lengths to obtain corresponding multiple word segmentation phrases;
The commodity title of the commodity is obtained from the commodity database of the online shop according to the unique identification code of the commodity, and the commodity title is then segmented with the N-Gram algorithm. Specifically, a word-taking window of a preset size is slid step by step over the commodity title according to a preset moving step length, and the word segmentation phrase inside the window is taken out at each position. It can be understood that the moving step length controls how far the word-taking window moves at each step, and the size of the word-taking window controls the number of word elements in each word segmentation phrase taken out.
In one embodiment, the method for obtaining the corresponding multiple word segmentation phrases by sliding the commodity title in multiple moving steps through a preset word extraction window comprises the following steps:
s1111, moving a word-taking window by a first step length to take words from the commodity title to obtain a first word segmentation sequence, wherein the first step length is smaller than the window size of the word-taking window;
The size of the word-taking window can be set flexibly by a person skilled in the art so that the window contains two or more word elements; a window containing two word elements is recommended. For example, the window can be sized to hold four or five Chinese characters, or two or three English words.
It can be understood that, since the first step size is smaller than the window size of the word extraction window, word-segmentation phrases in the first word-segmentation sequence obtained by corresponding word extraction overlap with each other.
Step S1112, moving the word-taking window by a second step length to take words from the commodity title to obtain a second word segmentation sequence, wherein the second step length is equal to the window size of the word-taking window;
it can be understood that, since the second step size is equal to the window size of the word fetching window, word-segmented phrases in the second word-segmented sequence obtained by corresponding word fetching do not overlap and are relatively continuous.
And S1113, moving a word-taking window to take words from the commodity title by a third step length to obtain a third word-dividing sequence, wherein the third step length is larger than the window size of the word-taking window.
It can be understood that, since the third step size is larger than the window size of the word extraction window, word segmentation phrases in the third word segmentation sequence obtained by corresponding word extraction are relatively discontinuous.
Step S1120, summarizing each participle phrase to form a candidate phrase set;
Sliding word extraction is performed on the commodity title with the preset word-taking window at the first, second and third step lengths respectively, yielding overlapping, non-overlapping contiguous, and non-contiguous word segmentation phrases, which are then collected to form the candidate phrase set.
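Steps S1111 to S1120 can be sketched as one small routine. In this illustration the window size and step lengths are counted in tokens, the title is split on spaces, and the function names `slide_window` and `build_candidate_set` are hypothetical; the present application leaves the unit of the window (characters or words) and the concrete step lengths to the implementer.

```python
from typing import List, Sequence


def slide_window(tokens: Sequence[str], window: int, step: int) -> List[str]:
    """Slide a fixed-size window over the tokens and take out one phrase per position."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    return [
        " ".join(tokens[i:i + window])
        for i in range(0, len(tokens) - window + 1, step)
    ]


def build_candidate_set(tokens: Sequence[str], window: int, steps: Sequence[int]) -> List[str]:
    """Collect the phrases from every step length, keeping first-seen order without duplicates."""
    candidates: List[str] = []
    for step in steps:
        for phrase in slide_window(tokens, window, step):
            if phrase not in candidates:
                candidates.append(phrase)
    return candidates


tokens = "a b c d e f".split()
first_seq = slide_window(tokens, window=2, step=1)   # step < window: overlapping phrases
second_seq = slide_window(tokens, window=2, step=2)  # step == window: contiguous, no overlap
third_seq = slide_window(tokens, window=2, step=3)   # step > window: tokens are skipped
print(first_seq, second_seq, third_seq)
print(build_candidate_set(tokens, window=2, steps=(1, 2, 3)))
```

When all three sequences start at the same position and use the same window, as in this sketch, the phrases of the second and third sequences are a subset of the first; different starting offsets or window sizes per step length would be needed for them to add new phrases.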
In the embodiment, the word extracting window is adopted to respectively perform sliding word extraction on the commodity title in multiple moving step lengths, so that multiple corresponding word segmentation phrases are obtained and collected to form a candidate phrase set, and rich word segmentation phrases can be efficiently and conveniently extracted from the commodity title.
Referring to fig. 3, in a further embodiment, in step S1200, determining whether a grammatical relationship exists between the lemmas in each candidate phrase in the candidate phrase set, and when the grammatical relationship exists, determining a grammatical structure to which the corresponding grammatical relationship belongs includes the following steps:
step S1210, a preset multi-channel classification model is adopted, a single candidate phrase in the candidate phrase set is taken as input, deep semantic information of the candidate phrase is extracted, and a corresponding semantic feature vector is obtained;
Referring to fig. 4, the multi-path classification model prepared in the present application includes a feature representation layer, a first classifier and a second classifier. The feature representation layer may be a deep semantic learning model from the NLP (Natural Language Processing) field, such as BiLSTM or BERT. The first classifier and the second classifier are each connected to the feature representation layer, so that they share the feature representation it outputs. The first classifier is suitable for a binary classification task and may use one or more fully connected layers or an MLP (multi-layer perceptron); the second classifier is suitable for a multi-class classification task and may likewise use one or more fully connected layers or an MLP.
The preset multi-path classification model takes a single candidate phrase in the candidate phrase set as input. In one embodiment, the feature representation layer of the multi-path classification model is a BiLSTM, which extracts the deep semantic information of the candidate phrase in the forward and backward directions respectively, obtains a forward feature representation and a backward feature representation for each word element in the candidate phrase, and splices them into the semantic feature vector.
Step S1220, determining whether a grammatical relation exists between the word elements in each candidate phrase in the candidate phrase set through a first classifier of the multi-path classification model based on the semantic feature vector;
The first classifier of the multi-path classification model takes the feature representation output by the feature representation layer, namely the semantic feature vector, as input and performs binary classification mapping on the semantic feature vector of each candidate phrase in the candidate phrase set, obtaining a first classification probability that a grammatical relation exists between the word elements in the candidate phrase and a second classification probability that no grammatical relation exists. Whether a grammatical relation exists is determined by taking the larger of the first classification probability and the second classification probability.
Step S1230, when the prediction result of the first classifier is that a grammatical relation exists, determining, through a second classifier of the multi-path classification model and based on the semantic feature vector of the corresponding candidate phrase, the grammatical structure to which the word elements in the candidate phrase belong;
The second classifier of the multi-path classification model takes as input the feature representation, namely the semantic feature vector, output by the feature representation layer for a candidate phrase whose word elements have a grammatical relation, and performs multi-class classification mapping on this semantic feature vector to obtain a classification probability for each grammatical structure. The grammatical structure with the maximum classification probability is determined as the grammatical structure to which the word elements in the candidate phrase belong.
In this embodiment, a semantic feature vector corresponding to each candidate phrase in the candidate phrase set is extracted through a multi-channel classification model, two-class mapping is performed on the semantic feature vector corresponding to each candidate phrase, whether a grammatical relation exists between the morphemes in each candidate phrase is determined, when the grammatical relation exists, multi-class mapping is performed on the semantic feature vector of the corresponding candidate phrase, and a grammatical structure that the morphemes in the candidate phrase belong to is determined. The candidate phrases with grammatical relations among corresponding word elements and the grammatical structures among the word elements can be rapidly and accurately determined by adopting a multi-path classification model.
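The two-stage decision made by the first and second classifiers can be sketched without any deep learning framework. The logits below are hypothetical numbers standing in for the raw outputs of the two classifiers on the shared semantic feature vector; the structure labels are illustrative English names, and treating a tie between the two binary probabilities as "no relation" is an assumption of this sketch.

```python
import math
from typing import List, Optional, Sequence

# Illustrative label set for the second classifier.
STRUCTURES = ["subject-predicate", "coordinate", "modifier-head", "verb-object", "verb-complement"]


def softmax(logits: Sequence[float]) -> List[float]:
    """Turn raw classifier outputs into probabilities that sum to 1."""
    peak = max(logits)  # subtracting the maximum keeps exp() numerically stable
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify_phrase(relation_logits: Sequence[float], structure_logits: Sequence[float]) -> Optional[str]:
    """Return the predicted grammatical structure, or None when no relation is predicted."""
    p_relation, p_no_relation = softmax(relation_logits)
    if p_relation <= p_no_relation:
        return None  # the second classifier is not consulted
    probs = softmax(structure_logits)
    return STRUCTURES[probs.index(max(probs))]


# Hypothetical outputs for two candidate phrases.
print(classify_phrase([2.0, 0.1], [0.1, 0.2, 3.0, 0.0, 0.5]))
print(classify_phrase([0.1, 2.0], [0.1, 0.2, 3.0, 0.0, 0.5]))
```

Because the second classifier is consulted only after the first one predicts a relation, phrases without a grammatical relation are discarded early and never receive a structure label.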
Referring to fig. 5, in a further embodiment, the step S1300 of determining the similarity between the commodity title and each candidate phrase belonging to the same grammatical structure includes the following steps:
Step S1310, extracting, with a preset text coding model, the deep semantic information of the commodity title and of each candidate phrase belonging to the same grammatical structure, and obtaining the text coding vector of each candidate phrase and the text coding vector of the commodity title;
A model suitable for extracting text features in the NLP field is adopted as the text coding model. Each candidate phrase belonging to the same grammatical structure and the commodity title are taken as inputs of the text coding model, the corresponding deep semantic information is extracted, and the vectorized feature representation of each candidate phrase and that of the commodity title are obtained.
The text coding model, such as the BERT model, is among the better neural network models to date for processing the sequential information of text, and is suitable for the text feature extraction work in the present application. The text coding model adopted in the present application is pre-trained to convergence before being put into use; since the training processes of models such as Transformer and BERT are known to those skilled in the art, they are not detailed here.
Step S1320, calculating the similarity between the text code vector corresponding to each candidate phrase and the text code vector of the commodity title.
Since each candidate phrase and the commodity title have corresponding vectorized feature representations, that is, they have been converted into comparable numerical form, the similarity between the text coding vector of each candidate phrase and that of the commodity title can be calculated and used to represent the similarity between the candidate phrase and the commodity title. The similarity calculation can be implemented with any large-scale vector retrieval engine such as Faiss, Elasticsearch or Milvus, or with any ready-made algorithm such as cosine similarity, inner product, Manhattan distance or Euclidean distance.
In the embodiment, by adopting the text coding model, the text coding vector corresponding to each candidate phrase and the text coding vector corresponding to the commodity title can be accurately coded based on the depth semantic information corresponding to each candidate phrase and the commodity title, so that the similarity between the text coding vectors corresponding to each candidate phrase and the commodity title is calculated, the similarity between each candidate phrase and the commodity title is represented, the execution is efficient and convenient, and the accuracy of the similarity is ensured.
Referring to fig. 6, in a further embodiment, after the step S1400, screening out the candidate phrases with the similarity satisfying the preset condition for each grammar structure as the matching keywords of the commodity, the method further includes the following steps:
Step S1500, acquiring the matching keywords of the target commodity and of the other commodities;
The target commodity can be a commodity that a user has searched for, a commodity added to the user's shopping cart, a commodity purchased by the user, and the like; any commodity for which corresponding commodities need to be recommended can serve as the target commodity. Of course, in order to be able to make commodity recommendations for any commodity on the shelf at any time, each commodity may in turn be used as the target commodity.
The matching keywords of each commodity can be obtained with reference to steps S1100 to S1400 and stored in the commodity database in association with the unique identification code of the corresponding commodity, so that they can be retrieved whenever commodity recommendation is needed.
Step S1510, for each grammatical structure, multiplying the number of matching keywords that the target commodity and another commodity have in common under that grammatical structure by the weight of the grammatical structure, and summing the products to obtain an intermediate value; then calculating the ratio of the intermediate value to the total number of matching keywords of the target commodity and the other commodity to obtain the similarity score between the target commodity and the other commodity;
it can be understood that matching keywords corresponding to a plurality of grammatical structures of a commodity can accurately represent the commodity in a multi-dimensional manner, and accordingly, whether the two commodities are similar or not can be determined by comparing the similarity degree of the corresponding matching keywords between every two commodities.
An exemplary formula for calculating the similarity score between the target item and the other items is as follows:
$$\mathrm{Score} = \frac{\sum_{s} w_s \,\lvert A_s \cap B_s \rvert}{\lvert A \cup B \rvert}$$
wherein: Score is the similarity score between the target commodity and another commodity; $w_s$ is the weight of grammatical structure $s$; $\lvert A_s \cap B_s \rvert$ is the number of matching keywords that the target commodity and the other commodity have in common under the same grammatical structure $s$; and $\lvert A \cup B \rvert$ is the total number of matching keywords of the target commodity and the other commodity.
A weight is preset for each grammatical structure according to its importance. For example, the attributive-head (modifier-head) relation mainly arises from product words and their modifiers and can be regarded as quite important, so it is given a higher weight. The specific values of the weights can be set by a person skilled in the art as required.
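The similarity score of step S1510 can be illustrated with Python sets. The sketch follows the textual description (weighted count of shared keywords per grammatical structure, divided by the total number of matching keywords of both commodities); the function name, the default weight of 1.0 for structures without a preset weight, counting the total as the size of the set union, and the sample keywords and weights are all assumptions.

```python
from typing import Dict, Set

Keywords = Dict[str, Set[str]]  # grammatical structure label -> matching keywords


def similarity_score(target: Keywords, other: Keywords, weights: Dict[str, float]) -> float:
    """Weighted number of shared keywords per structure over the total keyword count."""
    all_target = set().union(*target.values())
    all_other = set().union(*other.values())
    total = len(all_target | all_other)
    if total == 0:
        return 0.0  # neither commodity has any matching keyword
    weighted_shared = sum(
        weights.get(structure, 1.0) * len(keywords & other.get(structure, set()))
        for structure, keywords in target.items()
    )
    return weighted_shared / total


# Hypothetical keywords and weights.
target_item = {"modifier-head": {"cotton dress", "summer dress"}, "verb-object": {"wear dress"}}
other_item = {"modifier-head": {"cotton dress"}, "verb-object": {"wear dress", "buy dress"}}
structure_weights = {"modifier-head": 2.0, "verb-object": 1.0}

# Shared: 1 keyword (weight 2.0) + 1 keyword (weight 1.0) = 3.0; four distinct keywords in total.
score = similarity_score(target_item, other_item, structure_weights)
print(score)
```

Because weights may be larger than 1, the score is not bounded by 1; only its relative size across the other commodities matters for the screening in step S1520.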
And step S1520, screening out other commodities with similar scores meeting preset conditions as recommended commodities of the target commodity.
In one embodiment, when a plurality of commodities can be recommended as recommended commodities of a target commodity, other commodities with similarity scores exceeding a preset threshold value are screened out as recommended commodities of the target commodity. The preset threshold value can be flexibly set by those skilled in the art.
In another embodiment, when a commodity needs to be accurately recommended as a recommended commodity of a target commodity, other commodities with the largest similarity score are screened out as recommended commodities of the target commodity.
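The two screening embodiments of step S1520 can be sketched as follows. The commodity identifiers and scores are made up for the example, the function names are hypothetical, and returning the threshold results sorted from highest to lowest score is an assumption added for convenience.

```python
from typing import Dict, List, Optional


def recommend_by_threshold(scores: Dict[str, float], threshold: float) -> List[str]:
    """Return every other commodity whose similarity score exceeds the threshold."""
    ranked = sorted(scores.items(), key=lambda pair: pair[1], reverse=True)
    return [item_id for item_id, value in ranked if value > threshold]


def recommend_top(scores: Dict[str, float]) -> Optional[str]:
    """Return the single other commodity with the largest similarity score, if any."""
    if not scores:
        return None
    return max(scores, key=lambda item_id: scores[item_id])


# Hypothetical similarity scores between the target commodity and three other commodities.
scores_to_target = {"sku-1": 0.75, "sku-2": 0.2, "sku-3": 0.9}
print(recommend_by_threshold(scores_to_target, threshold=0.5))
print(recommend_top(scores_to_target))
```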
In this embodiment, the recommended commodity of the target commodity is determined by calculating the similarity score between the matching keywords of the target commodity and those of the other commodities. The calculation is based on the matching keywords of the multiple grammatical structures of the target commodity and of the other commodities, and the number of matching keywords that the target commodity and another commodity have in common under the same grammatical structure is weighted by the weight of that grammatical structure, so that the resulting similarity score accurately reflects the similarity between the target commodity and the other commodities.
Referring to fig. 7, in a further embodiment, before the step S1210 of using a preset multi-channel classification model to input a single candidate phrase in the candidate phrase set, the method further includes the following steps:
step S2200, obtaining a single phrase sample and a supervision label thereof from a prepared training set, wherein the phrase sample comprises a plurality of word elements, the supervision label characterizes whether grammatical relations exist among the word elements in the phrase sample, and when the supervision label characterizes the grammatical relations among the word elements in the phrase sample, the supervision label also comprises a grammatical structure which characterizes the word elements in the corresponding phrase sample;
referring to step S1100, a candidate phrase set of multiple goods may be obtained, each candidate phrase in the candidate phrase set is used as a phrase sample, and a supervision label of each phrase sample is manually labeled, so that each phrase sample and its supervision label may be summarized to construct a training set.
Step S2210, inputting the phrase samples into a multi-path classification model, extracting deep semantic information corresponding to the phrase samples, and obtaining corresponding semantic feature vectors;
The preset multi-path classification model takes a single phrase sample as input. In one embodiment, the feature representation layer of the multi-path classification model is a BiLSTM, which extracts the deep semantic information of the phrase sample in the forward and backward directions respectively, obtains a forward feature representation and a backward feature representation for each word element in the phrase sample, and splices them into the semantic feature vector.
Step S2220, predicting a corresponding first inference result through a first classifier based on the semantic feature vector;
the first classifier of the multi-path classification model takes the feature representation output by the feature representation layer, namely semantic feature vectors, as input, and carries out classification mapping on the semantic feature vectors respectively, so as to obtain a first classification probability corresponding to grammatical relations existing between the word elements in the phrase samples and a second classification probability corresponding to no grammatical relations existing between the word elements in the phrase samples, and correspondingly determines whether grammatical relations exist or not by determining the maximum one of the first classification probability and the second classification probability as a first reasoning result.
Step S2230, predicting a corresponding second inference result through a second classifier based on the semantic feature vector when a grammatical relation exists between the word elements in the supervision label representation phrase sample;
The second classifier of the multi-path classification model takes as input the feature representation, namely the semantic feature vector, output by the feature representation layer for a phrase sample whose word elements have a grammatical relation, and performs multi-class classification mapping on this semantic feature vector to obtain a classification probability for each grammatical structure. The grammatical structure with the largest classification probability is determined as the second inference result.
Step S2240, using the supervision label of the phrase sample: when the supervision label indicates that a grammatical relation exists between the word elements in the phrase sample, determining a first loss value corresponding to the first inference result and the second inference result; otherwise, determining a second loss value of the first inference result; and when the first loss value or the second loss value does not reach the corresponding preset threshold, updating the weights of the multi-path classification model and continuing to call other phrase samples for iterative training until the multi-path classification model converges.
A preset cross-entropy loss function is called; it can be set flexibly by a person skilled in the art according to prior knowledge or experimental experience. The corresponding cross-entropy loss values are calculated against the supervision label of the phrase sample. When the supervision label indicates that a grammatical relation exists between the word elements in the phrase sample, the sum of the cross-entropy loss values of the first inference result and the second inference result is taken as the first loss value; otherwise, the cross-entropy loss value of the first inference result is taken as the second loss value. When the first loss value and the second loss value reach their corresponding preset thresholds, the multi-path classification model has been trained to a convergence state and model training can be terminated. When the first loss value or the second loss value does not reach its corresponding preset threshold, the model has not converged; a gradient update is then performed on the model according to the corresponding first or second loss value, the weight parameters of each part of the model are corrected through back propagation to bring the model closer to convergence, and the next phrase sample in the training set is called to continue iterative training of the multi-path classification model until it is trained to a convergence state.
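The way the first and second loss values are formed can be shown on predicted probabilities alone. The sketch below omits the model and the gradient update; the function names, the convention that index 0 of the binary output means "a grammatical relation exists", the clipping constant and the reading of "reaching the preset threshold" as the loss falling to or below it are all assumptions for illustration.

```python
import math
from typing import Optional, Sequence


def cross_entropy(probs: Sequence[float], target_index: int) -> float:
    """Negative log-probability assigned to the labelled class."""
    return -math.log(max(probs[target_index], 1e-12))  # clip to avoid log(0)


def sample_loss(
    relation_probs: Sequence[float],
    structure_probs: Sequence[float],
    has_relation: bool,
    structure_index: Optional[int] = None,
) -> float:
    """First loss value (both classifiers) or second loss value (first classifier only)."""
    # Assumed convention: index 0 = relation exists, index 1 = no relation.
    loss = cross_entropy(relation_probs, 0 if has_relation else 1)
    if has_relation:
        if structure_index is None:
            raise ValueError("a labelled grammatical structure is required")
        loss += cross_entropy(structure_probs, structure_index)
    return loss


def has_converged(loss: float, threshold: float) -> bool:
    """Assumed reading of 'reaching the preset threshold'."""
    return loss <= threshold


# Labelled as having a relation with structure index 2: both losses are summed.
first_loss = sample_loss([0.9, 0.1], [0.05, 0.05, 0.8, 0.05, 0.05], True, 2)
# Labelled as having no relation: only the first classifier contributes.
second_loss = sample_loss([0.2, 0.8], [], False)
print(first_loss, second_loss, has_converged(first_loss, threshold=0.5))
```

Summing the two cross-entropy terms lets a single backward pass update the shared feature representation layer with the signal from both classifiers, while samples without a grammatical relation only train the first classifier.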
This embodiment discloses the training process of the multi-path classification model. After being trained to convergence, the multi-path classification model produces a feature representation of an input phrase, performs binary classification to determine whether grammatical relations exist among the word elements in the phrase, and performs multi-class classification to determine the grammatical structure to which the word elements of a phrase having grammatical relations belong.
Please refer to fig. 8, which shows a commodity recommendation device provided according to one of the purposes of the present application as a functional embodiment of the commodity recommendation method. The device includes a set construction module 1100, a grammar determination module 1200, a similarity determination module 1300, and a keyword screening module 1400. The set construction module 1100 is configured to obtain a commodity title of a commodity, perform word segmentation on the commodity title, and construct a corresponding candidate phrase set. The grammar determination module 1200 is configured to determine whether grammatical relations exist among the word elements in each candidate phrase in the candidate phrase set and, when grammatical relations exist, to determine the grammatical structure to which the corresponding grammatical relations belong. The similarity determination module 1300 is configured to determine the similarity between the commodity title and each candidate phrase belonging to the same grammatical structure. The keyword screening module 1400 is configured to screen out, for each grammatical structure, the candidate phrases whose similarity satisfies a preset condition as matching keywords of the commodity for commodity recommendation.
In a further embodiment, the set construction module 1100 includes: a sliding word-taking submodule, configured to obtain a commodity title of a commodity and to perform sliding word-taking on the commodity title with a preset word-taking window at multiple moving step lengths, obtaining multiple corresponding word-segmentation phrases; and a phrase summarizing submodule, configured to summarize the word-segmentation phrases to form a candidate phrase set.
In a further embodiment, the sliding word-taking submodule includes: a title acquisition unit, configured to acquire a commodity title of a commodity; a first word-taking unit, configured to move the word-taking window by a first step length to take words from the commodity title and obtain a first word-segmentation sequence, the first step length being smaller than the window size of the word-taking window; a second word-taking unit, configured to move the word-taking window by a second step length to take words from the commodity title and obtain a second word-segmentation sequence, the second step length being equal to the window size of the word-taking window; and a third word-taking unit, configured to move the word-taking window by a third step length to take words from the commodity title and obtain a third word-segmentation sequence, the third step length being larger than the window size of the word-taking window.
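The three word-taking units can be illustrated with a short sketch. It assumes whitespace tokenisation, a window of two word elements, and step lengths of window - 1, window, and window + 1; the function names `slide` and `build_candidates` and the de-duplication when summarizing are likewise illustrative choices, not details stated in the application.

```python
def slide(tokens, window, step):
    """Phrases covered by a window of `window` tokens moved `step` tokens at a time."""
    return [
        " ".join(tokens[i:i + window])
        for i in range(0, len(tokens) - window + 1, step)
    ]


def build_candidates(title, window=2):
    """Summarize the phrases taken with three step lengths into one candidate set."""
    if window < 2:
        raise ValueError("window must be at least 2 so the smallest step is positive")
    tokens = title.split()  # assumption: word elements are whitespace-separated
    # First step < window, second step == window, third step > window.
    steps = (window - 1, window, window + 1)
    candidates = []
    for step in steps:
        for phrase in slide(tokens, window, step):
            if phrase not in candidates:  # keep first occurrence, preserve order
                candidates.append(phrase)
    return candidates
```

With a step smaller than the window the phrases overlap, with an equal step they tile the title, and with a larger step some word elements are skipped, so the three sequences together give phrases at different granularities.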
In a further embodiment, the grammar determination module 1200 includes: an inference feature representation submodule, configured to use a preset multi-path classification model, with a single candidate phrase in the candidate phrase set as input, to extract deep semantic information of the candidate phrase and obtain a corresponding semantic feature vector; a binary classification submodule, configured to determine, through a first classifier of the multi-path classification model and based on the semantic feature vector, whether grammatical relations exist among the word elements in each candidate phrase in the candidate phrase set; and a multi-classification submodule, configured to determine, when the prediction result of the first classifier is that grammatical relations exist, the grammatical structure to which the word elements of the candidate phrase belong, through a second classifier of the multi-path classification model and based on the semantic feature vector of the corresponding candidate phrase.
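The two-stage decision made by these submodules can be sketched as below. The encoder and both classifiers are passed in as plain callables because the application does not fix their implementation in this passage; the names `classify_phrase`, `group_by_structure`, `encode`, `has_relation`, `structure_scores`, and the structure labels supplied by the caller are all hypothetical.

```python
from typing import Callable, Dict, List, Optional, Sequence

Vector = Sequence[float]


def classify_phrase(
    phrase: str,
    encode: Callable[[str], Vector],
    has_relation: Callable[[Vector], bool],
    structure_scores: Callable[[Vector], Sequence[float]],
    structures: Sequence[str],
) -> Optional[str]:
    """Return the grammatical structure of a phrase, or None if it has no relation."""
    vector = encode(phrase)            # shared semantic feature vector
    if not has_relation(vector):       # first classifier: binary decision
        return None
    scores = structure_scores(vector)  # second classifier: one score per structure
    best = max(range(len(scores)), key=lambda i: scores[i])
    return structures[best]


def group_by_structure(
    phrases: Sequence[str],
    encode: Callable[[str], Vector],
    has_relation: Callable[[Vector], bool],
    structure_scores: Callable[[Vector], Sequence[float]],
    structures: Sequence[str],
) -> Dict[str, List[str]]:
    """Group candidate phrases by predicted structure, dropping unrelated phrases."""
    groups: Dict[str, List[str]] = {}
    for phrase in phrases:
        label = classify_phrase(phrase, encode, has_relation, structure_scores, structures)
        if label is not None:
            groups.setdefault(label, []).append(phrase)
    return groups
```

Both classifiers read the same feature vector, so the second classifier is only consulted for phrases that pass the first one; phrases without a grammatical relation never reach the similarity stage.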
In a further embodiment, the similarity determination module 1300 includes: a text coding submodule, configured to use a preset text coding model to extract the deep semantic information of the commodity title and of each candidate phrase belonging to the same grammatical structure, obtaining a text coding vector for each candidate phrase and a text coding vector for the commodity title; and a vector similarity submodule, configured to calculate the similarity between the text coding vector of each candidate phrase and the text coding vector of the commodity title.
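A possible form of the vector similarity step, followed by per-structure screening, is sketched here. Cosine similarity and a fixed threshold are assumptions chosen for illustration, since this passage only speaks of "similarity" and "a preset condition"; the names `cosine` and `select_keywords` and the dictionary layout of the inputs are also hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def select_keywords(vectors_by_structure, title_vector, threshold=0.5):
    """Keep, per grammatical structure, the phrases similar enough to the title.

    vectors_by_structure: {structure: {phrase: text coding vector}}
    threshold: assumed form of the preset condition (similarity >= threshold).
    """
    selected = {}
    for structure, phrase_vectors in vectors_by_structure.items():
        kept = [
            phrase
            for phrase, vector in phrase_vectors.items()
            if cosine(vector, title_vector) >= threshold
        ]
        if kept:
            selected[structure] = kept
    return selected
```

Screening within each structure separately means every grammatical structure can contribute its own keywords, rather than one dominant structure crowding out the rest.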
In a further embodiment, the device further includes, after the keyword screening module 1400: a keyword acquisition submodule, configured to acquire the matching keywords corresponding to a target commodity and to other commodities; a similarity score calculation submodule, configured to weight the number of matching keywords of the target commodity and another commodity that belong to the same grammatical structure by the weight of the corresponding grammatical structure and to sum the results into an intermediate value, and then to calculate the ratio of the intermediate value to the total number of matching keywords of the target commodity and the other commodity, obtaining the similarity score between the target commodity and the other commodity; and a commodity screening submodule, configured to screen out the other commodities whose similarity scores satisfy a preset condition as recommended commodities of the target commodity.
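One way to read the similarity score calculation is sketched below. It assumes that the keywords counted under each grammatical structure are those shared by both commodities, that the total in the denominator is the sum of both commodities' matching keyword counts, and that the preset condition is a minimum score; these readings, the names `similarity_score` and `recommend`, and the example weights are assumptions for illustration only.

```python
def similarity_score(target, other, weights):
    """Weighted shared-keyword count divided by the total keyword count.

    target, other: {grammatical structure: set of matching keywords}
    weights: {grammatical structure: weight}
    """
    intermediate = 0.0
    for structure, weight in weights.items():
        shared = set(target.get(structure, ())) & set(other.get(structure, ()))
        intermediate += weight * len(shared)
    total = sum(len(set(k)) for k in target.values()) + sum(
        len(set(k)) for k in other.values()
    )
    return intermediate / total if total else 0.0


def recommend(target, others, weights, min_score=0.3):
    """Return (commodity id, score) pairs meeting the assumed threshold, best first."""
    scored = [
        (commodity_id, similarity_score(target, keywords, weights))
        for commodity_id, keywords in others.items()
    ]
    return sorted(
        [pair for pair in scored if pair[1] >= min_score],
        key=lambda pair: pair[1],
        reverse=True,
    )
```

Giving each grammatical structure its own weight lets keywords from more informative structures count more toward the score than keywords from less informative ones.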
In a further embodiment, the device further includes, before the inference feature representation submodule: a sample acquisition submodule, configured to acquire a single phrase sample and its supervision label from a prepared training set, wherein the phrase sample comprises a plurality of word elements, the supervision label indicates whether grammatical relations exist among the word elements in the phrase sample, and, when it indicates that grammatical relations exist, the supervision label further indicates the grammatical structure to which the word elements in the corresponding phrase sample belong; a training feature representation submodule, configured to input the phrase sample into the multi-path classification model, extract the deep semantic information of the phrase sample, and obtain a corresponding semantic feature vector; a first classification submodule, configured to predict a corresponding first inference result through the first classifier based on the semantic feature vector; a second classification submodule, configured to predict a corresponding second inference result through the second classifier based on the semantic feature vector when the supervision label indicates that grammatical relations exist among the word elements in the phrase sample; and an iterative training submodule, configured to use the supervision label of the phrase sample to determine a first loss value from the first inference result and the second inference result when the supervision label indicates that grammatical relations exist among the word elements in the phrase sample and otherwise to determine a second loss value from the first inference result, to update the weights of the multi-path classification model when the first loss value or the second loss value does not reach the corresponding preset threshold, and to continue calling other phrase samples for iterative training until the multi-path classification model converges.
In order to solve the above technical problem, an embodiment of the present application further provides a computer device, whose internal structure is schematically illustrated in fig. 9. The computer device includes a processor, a computer-readable storage medium, a memory, and a network interface connected by a system bus. The computer-readable storage medium of the computer device stores an operating system, a database, and computer-readable instructions; the database can store control information sequences, and the computer-readable instructions, when executed by the processor, can cause the processor to implement a commodity recommendation method. The processor of the computer device provides calculation and control capability and supports the operation of the whole computer device. The memory of the computer device may store computer-readable instructions that, when executed by the processor, cause the processor to perform the commodity recommendation method of the present application. The network interface of the computer device is used for connecting to and communicating with a terminal. Those skilled in the art will appreciate that the architecture shown in fig. 9 is merely a block diagram of some of the structures associated with the disclosed solution and does not limit the computer devices to which the disclosed solution applies; a particular computer device may include more or fewer components than those shown, may combine certain components, or may have a different arrangement of components.
In this embodiment, the processor is configured to execute the specific functions of each module and its submodules in fig. 8, and the memory stores the program codes and the various data required for executing the modules or submodules. The network interface is used for data transmission to and from a user terminal or a server. The memory in this embodiment stores the program codes and data necessary for executing all modules and submodules in the commodity recommendation device of the present application, and the server can call its program codes and data to execute the functions of all submodules.
The present application also provides a storage medium storing computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the steps of the commodity recommendation method of any of the embodiments of the present application.
Those skilled in the art will understand that all or part of the processes of the methods in the embodiments of the present application can be implemented by a computer program. The program can be stored in a computer-readable storage medium and, when executed, can carry out the processes of the method embodiments described above. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), or a Random Access Memory (RAM).
In summary, the present application deeply mines multi-dimensional keywords of commodities by combining the grammatical information of phrases and uses these keywords for commodity recommendation, which can improve the accuracy of the recommended commodities.
Those of skill in the art will appreciate that the various operations, methods, steps in the processes, acts, or solutions discussed in this application can be interchanged, modified, combined, or eliminated. Further, other steps, measures, or schemes in various operations, methods, or flows that have been discussed in this application can be alternated, altered, rearranged, broken down, combined, or deleted. Further, steps, measures, schemes in the prior art having various operations, methods, procedures disclosed in the present application may also be alternated, modified, rearranged, decomposed, combined, or deleted.
The foregoing describes only some of the embodiments of the present application. It should be noted that those skilled in the art can make several improvements and refinements without departing from the principle of the present application, and these improvements and refinements should also be regarded as falling within the protection scope of the present application.

Claims (10)

1. A commodity recommendation method is characterized by comprising the following steps:
acquiring a commodity title of a commodity, segmenting the commodity title, and constructing a corresponding candidate phrase set;
judging whether a grammatical relation exists between the word elements in each candidate phrase in the candidate phrase set, and determining a grammatical structure to which the corresponding grammatical relation belongs when the grammatical relation exists;
determining the similarity between the commodity title and each candidate phrase belonging to the same grammatical structure;
and screening out the candidate phrases with the similarity meeting the preset conditions for each grammatical structure, and taking the candidate phrases as matching keywords of the commodity for commodity recommendation.
2. The commodity recommendation method according to claim 1, wherein the step of obtaining a commodity title of a commodity, performing word segmentation on the commodity title, and constructing a corresponding candidate phrase set comprises the following steps:
acquiring a commodity title of a commodity, and respectively performing sliding word extraction on the commodity title by adopting a preset word extraction window in multiple moving step lengths to obtain corresponding multiple word segmentation phrases;
and summarizing each word segmentation phrase to form a candidate phrase set.
3. The commodity recommendation method according to claim 2, wherein acquiring a commodity title of a commodity and respectively performing sliding word extraction on the commodity title by adopting a preset word-taking window in multiple moving step lengths to obtain corresponding multiple word segmentation phrases comprises the following steps:
acquiring a commodity title of a commodity;
moving a word-taking window with a first step length to take words from the commodity title to obtain a first word-dividing sequence, wherein the first step length is smaller than the window size of the word-taking window;
moving a word-taking window with a second step length to take words from the commodity title to obtain a second word segmentation sequence, wherein the second step length is equal to the window size of the word-taking window;
and moving a word-taking window by a third step length to take words from the commodity title to obtain a third word-dividing sequence, wherein the third step length is larger than the window size of the word-taking window.
4. The commodity recommendation method according to claim 1, wherein judging whether a grammatical relation exists between the word elements in each candidate phrase in the candidate phrase set, and determining the grammatical structure to which the corresponding grammatical relation belongs when the grammatical relation exists, comprises the following steps:
extracting deep semantic information of the candidate phrases by using a preset multi-path classification model and taking a single candidate phrase in the candidate phrase set as input to obtain a corresponding semantic feature vector;
determining whether a grammatical relation exists between the word elements in each candidate phrase in the candidate phrase set through a first classifier of the multi-path classification model based on the semantic feature vector;
and when the prediction result of the first classifier is that a grammatical relation exists, determining the grammatical structure to which the word elements of the candidate phrase belong through a second classifier of the multi-path classification model based on the semantic feature vector of the corresponding candidate phrase.
5. The commodity recommendation method according to claim 1, wherein determining the similarity between the commodity title and each candidate phrase belonging to the same grammatical structure comprises the following steps:
extracting deep semantic information corresponding to each candidate phrase and the commodity title belonging to the same grammatical structure by adopting a preset text coding model, and obtaining a text coding vector corresponding to each candidate phrase and a text coding vector of the commodity title;
and calculating the similarity between the text coding vector corresponding to each candidate phrase and the text coding vector of the commodity title.
6. The commodity recommendation method according to claim 1, wherein after the candidate phrases with the similarity satisfying the preset condition are screened out for each grammar structure as the matching keywords of the commodity, the method further comprises the following steps:
acquiring matching keywords corresponding to the target commodity and other commodities;
weighting the number of matching keywords of the target commodity and the other commodities that belong to the same grammatical structure by the weight of the corresponding grammatical structure and summing the results to obtain an intermediate value, calculating the ratio of the intermediate value to the total number of matching keywords of the target commodity and the other commodities, and obtaining the similarity score between the target commodity and the other commodities;
and screening out the other commodities whose similarity scores meet preset conditions as recommended commodities of the target commodity.
7. The commodity recommendation method according to claim 4, wherein before a preset multi-path classification model is adopted and a single candidate phrase in the candidate phrase set is taken as an input, the method further comprises the following steps:
obtaining a single phrase sample and a supervision label thereof from a prepared training set, wherein the phrase sample comprises a plurality of word elements, the supervision label indicates whether grammatical relations exist among the word elements in the phrase sample, and when the supervision label indicates that grammatical relations exist among the word elements in the phrase sample, the supervision label further indicates the grammatical structure to which the word elements in the corresponding phrase sample belong;
inputting the phrase samples into a multi-path classification model, extracting deep semantic information corresponding to the phrase samples, and obtaining corresponding semantic feature vectors;
predicting a corresponding first inference result through a first classifier based on the semantic feature vector;
predicting a corresponding second inference result through a second classifier based on the semantic feature vector when the supervision label indicates that grammatical relations exist among the word elements in the phrase sample;
and adopting the supervision label of the phrase sample, determining a first loss value corresponding to the first inference result and the second inference result when the supervision label indicates that grammatical relations exist among the word elements in the phrase sample, otherwise determining a second loss value of the first inference result, and, when the first loss value or the second loss value does not reach a corresponding preset threshold value, updating the weights of the multi-path classification model and continuing to call other phrase samples for iterative training until the multi-path classification model converges.
8. A commodity recommendation device, comprising:
the set construction module is used for acquiring a commodity title of a commodity, segmenting the commodity title and constructing a corresponding candidate phrase set;
the grammar determining module is used for judging whether grammar relations exist among the word elements in each candidate phrase in the candidate phrase set, and determining the grammar structure to which the corresponding grammar relations belong when the grammar relations exist;
the similarity determining module is used for determining the similarity between the commodity title and each candidate phrase belonging to the same grammatical structure;
and the keyword screening module is used for screening the candidate phrases with the similarity meeting the preset conditions aiming at each grammar structure to serve as matching keywords of the commodity for commodity recommendation.
9. A computer device comprising a central processing unit and a memory, characterized in that the central processing unit is adapted to invoke the execution of a computer program stored in the memory to perform the steps of the method according to any one of claims 1 to 7.
10. A computer-readable storage medium, characterized in that it stores, in the form of computer-readable instructions, a computer program implemented according to the method of any one of claims 1 to 7, which, when invoked by a computer, performs the steps comprised by the corresponding method.
CN202211737299.XA 2022-12-30 2022-12-30 Commodity recommendation method, commodity recommendation device, commodity recommendation equipment and commodity recommendation medium Pending CN115907928A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211737299.XA CN115907928A (en) 2022-12-30 2022-12-30 Commodity recommendation method, commodity recommendation device, commodity recommendation equipment and commodity recommendation medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211737299.XA CN115907928A (en) 2022-12-30 2022-12-30 Commodity recommendation method, commodity recommendation device, commodity recommendation equipment and commodity recommendation medium

Publications (1)

Publication Number Publication Date
CN115907928A true CN115907928A (en) 2023-04-04

Family

ID=86480888

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211737299.XA Pending CN115907928A (en) 2022-12-30 2022-12-30 Commodity recommendation method, commodity recommendation device, commodity recommendation equipment and commodity recommendation medium

Country Status (1)

Country Link
CN (1) CN115907928A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116521906A (en) * 2023-04-28 2023-08-01 广州商研网络科技有限公司 Meta description generation method, device, equipment and medium thereof

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116521906A (en) * 2023-04-28 2023-08-01 广州商研网络科技有限公司 Meta description generation method, device, equipment and medium thereof
CN116521906B (en) * 2023-04-28 2023-10-24 广州商研网络科技有限公司 Meta description generation method, device, equipment and medium thereof

Similar Documents

Publication Publication Date Title
US10803055B2 (en) Cognitive searches based on deep-learning neural networks
KR101778679B1 (en) Method and system for classifying data consisting of multiple attribues represented by sequences of text words or symbols using deep learning
US9910930B2 (en) Scalable user intent mining using a multimodal restricted boltzmann machine
EP3717984B1 (en) Method and apparatus for providing personalized self-help experience
US20210390609A1 (en) System and method for e-commerce recommendations
WO2019133506A1 (en) Intelligent routing services and systems
CN116521906B (en) Meta description generation method, device, equipment and medium thereof
CN115731425A (en) Commodity classification method, commodity classification device, commodity classification equipment and commodity classification medium
CN111737559A (en) Resource sorting method, method for training sorting model and corresponding device
CN114663197A (en) Commodity recommendation method and device, equipment, medium and product thereof
CN114186013A (en) Entity recognition model hot updating method and device, equipment, medium and product thereof
CN114065750A (en) Commodity information matching and publishing method and device, equipment, medium and product thereof
CN116976920A (en) Commodity shopping guide method and device, equipment and medium thereof
CN115018549A (en) Method for generating advertisement file, device, equipment, medium and product thereof
CN116797280A (en) Advertisement document generation method and device, equipment and medium thereof
CN115129913A (en) Sensitive word mining method and device, equipment and medium thereof
CN114818674A (en) Commodity title keyword extraction method and device, equipment, medium and product thereof
CN115907928A (en) Commodity recommendation method, commodity recommendation device, commodity recommendation equipment and commodity recommendation medium
CN114626926A (en) Commodity search category identification method and device, equipment, medium and product thereof
CN114218948A (en) Keyword recognition method and device, equipment, medium and product thereof
CN113971599A (en) Advertisement putting and selecting method and device, equipment, medium and product thereof
CN116823404A (en) Commodity combination recommendation method, device, equipment and medium thereof
CN116796027A (en) Commodity picture label generation method and device, equipment, medium and product thereof
CN115936805A (en) Commodity recommendation method, commodity recommendation device, commodity recommendation equipment and commodity recommendation medium
CN116029793A (en) Commodity recommendation method, device, equipment and medium thereof

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination