CN111723566B - Product information reconstruction method and device - Google Patents

Publication number
CN111723566B
Authority
CN
China
Legal status
Active
Application number
CN201910219171.6A
Other languages
Chinese (zh)
Other versions
CN111723566A (en
Inventor
张珮
吴胜兰
Current Assignee
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9538 Presentation of query results

Abstract

The invention discloses a method and a device for reconstructing product information. The method comprises: acquiring original content of product information in a preset platform; identifying key information from the original content, wherein the key information includes at least one word characterizing a feature of the product; and obtaining reconstructed product information at least by adjusting the order of the key information in the original content. The invention solves the technical problem in the prior art that disordered product titles on shopping websites lead to low search efficiency when a user searches for products.

Description

Product information reconstruction method and device
Technical Field
The invention relates to the field of data processing, in particular to a method and a device for reconstructing product information.
Background
On e-commerce websites, in order to increase the exposure and sales of their goods, sellers pile up repeated product words when editing product titles, add marketing words unrelated to the product, stack similar descriptive words, and so on. Over time this has produced a distinctive "e-commerce style". In the example shown in Fig. 1, the part outlined by the dotted line is a product title in this style.
The format of product titles in this e-commerce style is irregular, and product, attribute, and postage information is piled up redundantly. For mobile app users, whose display space is limited, the list-page title shows at most 60 characters, so the key information of the product may not be exposed. As a result, a title presents little useful information and products are hard to tell apart, which makes searching for products inefficient. The problem is worse for English titles, whose average length is longer, so even less information fits within the limited length. In the product title shown in Fig. 1, for example, information such as "Freeshiping!" is exposed at the front of the title, so that the key information "building-in …" of the product cannot be displayed in full.
No effective solution has yet been proposed for the problem in the prior art that disordered product titles on shopping websites lead to low search efficiency when a user searches for products.
Disclosure of Invention
The embodiment of the invention provides a method and a device for reconstructing product information, which at least solve the technical problem that in shopping websites in the prior art, the title of a product is disordered, so that the searching efficiency is low when a user searches the product.
According to an aspect of an embodiment of the present invention, there is provided a method for reconstructing product information, including: acquiring original content of product information in a preset platform; identifying key information from the original content, wherein the key information comprises at least one word for characterizing a feature of a product; and obtaining the reconstructed product information at least by adjusting the sequence of the key information in the original content.
According to another aspect of the embodiments of the present invention, there is also provided a method for reconstructing product information, including: displaying original content of product information describing a product; displaying key information identified in the original content, wherein the key information comprises at least one word characterizing a feature of the product; and displaying the reconstructed product information, wherein the reconstructed product information is obtained by adjusting the display order of the key information in the original content.
According to another aspect of the embodiments of the present invention, there is also provided a device for reconstructing product information, including: an acquisition module for acquiring the original content of the product information in the preset platform; an identification module for identifying key information from the original content, wherein the key information comprises at least one word characterizing a feature of the product; and an adjusting module for obtaining the reconstructed product information at least by adjusting the order of the key information in the original content.
In the embodiments of the invention, key information is extracted from the original title and its position in the original content is adjusted, which improves the information value and readability of the displayed title, further improves the user's search efficiency, and brings higher returns to the website. The approach also has a significant advantage over the seq2seq method: the new title is reconstructed from phrases in the original title, so there is no distortion problem. The embodiments of the application therefore solve the problem in the prior art that disordered product titles on shopping websites lead to low search efficiency when a user searches for products.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiments of the invention and together with the description serve to explain the invention and do not constitute a limitation on the invention. In the drawings:
Fig. 1 is a schematic diagram of a product title according to the prior art;
Fig. 2 shows a hardware block diagram of a computer terminal (or mobile device) for implementing a method for reconstructing product information;
Fig. 3 is a flowchart of a method for reconstructing product information according to embodiment 1 of the present invention;
Fig. 4 is a schematic diagram of a product theme reconfiguration according to embodiment 1 of the present application;
Fig. 5 is a schematic diagram of an example of product title reconstruction according to embodiment 1 of the present application;
Fig. 6 is a schematic diagram of a device for reconstructing product information according to embodiment 2 of the present application;
Fig. 7 is a flowchart of a method for reconstructing product information according to embodiment 3 of the present application;
Fig. 8 is a schematic diagram of a device for reconstructing product information according to embodiment 4 of the present application; and
Fig. 9 is a block diagram of a computer terminal according to embodiment 6 of the present invention.
Detailed Description
In order that those skilled in the art may better understand the present invention, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are evidently only some, not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art based on the embodiments of the present invention without inventive effort shall fall within the scope of protection of the present invention.
It should be noted that the terms "first," "second," and the like in the description and the claims of the present invention and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the invention described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
First, some of the terms appearing in the description of the embodiments of the present application are explained as follows:
Product words: the name of the commodity sold by the seller.
Marketing words: words unrelated to the specific information of the commodity, e.g. "HOT SALE!", "NEW ORIGINAL", "hot sell", "hot explosion", etc.
NER: Named Entity Recognition, a technique that can be used to recognize person names and place names in a sentence, or commodity names, medicine names, etc. in an e-commerce search term.
Chunking: a basic NLP (Natural Language Processing) technique used to segment text along semantic boundaries.
Applicable object words: words that refer to the object for which the commodity is suitable, e.g. "for iphone6" in "phone case for Iphone6" and "for 2-4year girl" in "a dress for 2-4year baby girl" are both applicable object words.
CRF: Conditional Random Field, a sequence labeling model based on feature templates that is commonly used for Chinese word segmentation, part-of-speech tagging, entity recognition, and other tasks.
Bi-LSTM-CRF: a neural-network-based sequence labeling model consisting mainly of three layers: a lookup layer, a bidirectional LSTM layer, and a CRF layer.
Example 1
According to an embodiment of the present invention, an embodiment of a method for reconstructing product information is provided. It should be noted that the steps illustrated in the flowchart of the drawings may be performed in a computer system, such as one executing a set of computer-executable instructions, and that, although a logical order is shown in the flowchart, in some cases the steps shown or described may be performed in a different order.
The method embodiment provided in embodiment 1 of the present application may be executed in a mobile terminal, a computer terminal, or a similar computing device. Fig. 2 shows a hardware block diagram of a computer terminal (or mobile device) for implementing the method for reconstructing product information. As shown in Fig. 2, the computer terminal 20 (or mobile device 20) may include one or more processors 202 (shown as 202a, 202b, …, 202n; the processors 202 may include, but are not limited to, a processing device such as a microprocessor (MCU) or a programmable logic device (FPGA)), a memory 204 for storing data, and a transmission module 206 for communication functions. In addition, the computer terminal may further include: a display, an input/output interface (I/O interface), a Universal Serial Bus (USB) port (which may be included as one of the ports of the I/O interface), a network interface, a power supply, and/or a camera. Those of ordinary skill in the art will appreciate that the configuration shown in Fig. 2 is merely illustrative and does not limit the configuration of the electronic device described above. For example, the computer terminal 20 may include more or fewer components than shown in Fig. 2, or have a different configuration from that shown in Fig. 2.
It should be noted that the one or more processors 202 and/or other data processing circuits described above may be referred to herein generally as "data processing circuits". A data processing circuit may be embodied in whole or in part in software, hardware, firmware, or any combination thereof. Furthermore, the data processing circuit may be a single stand-alone processing module, or may be incorporated, in whole or in part, into any of the other elements in the computer terminal 20 (or mobile device). As referred to in the embodiments of the present application, the data processing circuit acts as a kind of processor control (e.g., selection of the path of the variable resistor termination to interface).
The memory 204 may be used to store software programs and modules of application software, such as program instructions/data storage devices corresponding to the method for reconstructing product information in the embodiments of the present invention, and the processor 202 executes the software programs and modules stored in the memory 204 to perform various functional applications and data processing, i.e., implement the method for reconstructing product information described above. Memory 204 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 204 may further include memory located remotely from the processor 202, which may be connected to the computer terminal 20 via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The transmission module 206 is used for receiving or transmitting data via a network. Specific examples of the network may include a wireless network provided by the communication provider of the computer terminal 20. In one example, the transmission module 206 includes a network adapter (Network Interface Controller, NIC) that can connect to other network devices through a base station in order to communicate with the internet. In another example, the transmission module 206 may be a Radio Frequency (RF) module for communicating with the internet wirelessly.
The display may be, for example, a touch screen type Liquid Crystal Display (LCD) that may enable a user to interact with a user interface of the computer terminal 20 (or mobile device).
It should be noted here that, in some alternative embodiments, the computer device (or mobile device) shown in Fig. 2 may include hardware elements (including circuitry), software elements (including computer code stored on a computer-readable medium), or a combination of both. Fig. 2 is only one specific example, intended to illustrate the types of components that may be present in the computer device (or mobile device) described above.
In the above-described operating environment, the present application provides a method for reconstructing product information as shown in fig. 3. Fig. 3 is a flowchart of a method of reconstructing product information according to embodiment 1 of the present invention.
Step S31, obtaining the original content of the product information in a preset platform.
Specifically, the preset platform may be a shopping platform, the product information may include attributes such as a name, a model, a use, and the like of the product, and the original content may be a title of the product displayed in the shopping platform. The original content here is used to represent a title before the title of the product is reconstructed, and may be a title set in advance for the product by the seller.
In an alternative embodiment, before a product is put on the shelf, the merchant can set a title for it. When a user searches for the product on the shopping platform, the title is displayed in the list page alongside the image of the product for the user to view. To increase the exposure of a product, a merchant may pile up words describing it. For a down jacket, for example, the title may be: "new ultrathin white duck down women's fashionable thickened fur-collar lovely down jacket 100-130 jin". This title is the original content corresponding to the product.
Step S33, identifying key information from the original content, wherein the key information includes at least one word for characterizing the feature of the product.
Specifically, the features of the product may include the product name, quantity, specification, model, applicable object, and the like; the key information consists of the words that express these features.
In an alternative embodiment, the original content corresponding to the product, i.e., the product title before reconstruction, may be obtained first. Words describing the features of the product are then identified in the original content using named entity recognition, which yields the key information corresponding to the product.
Step S35, obtaining the reconstructed product information at least by adjusting the order of the key information in the original content.
Specifically, after determining the characteristics of the product represented by the key information, the order of the key information in the original content can be adjusted according to the characteristics of the product represented by the key information.
In the above scheme, the order of the key information in the original content may be adjusted according to a preset unified order, for example, the preset unified order is: quantity word, specification word, product modifier word, model word, applicable object word and other words.
For the original content "new ultrathin white duck down women's fashionable thickened slim-fit fur-collar lovely down jacket 100-130 jin", the key information is extracted as follows: there is no quantity word; the specification word is "100-130 jin"; the product word is "down jacket"; the product modifiers are "new", "ultrathin", "fashionable", "thickened", "slim-fit", and "lovely"; there is no model word; the applicable object word is "women"; and the other words are "white duck down" and "fur collar". According to the preset unified order, the reconstructed product information is: "100-130 jin new ultrathin fashionable thickened slim-fit lovely down jacket women white duck down fur collar".
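The reordering in this example can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: the category labels, the function name `reconstruct_title`, and the English phrases (paraphrased from the down jacket example) are assumptions, and the product word is assumed to sit directly after its modifiers because the preset order does not list it separately.

```python
# Illustrative sketch only: the category names and the slot given to the
# product word are assumptions, not taken from the description.
PRESET_ORDER = [
    "quantity",
    "specification",
    "modifier",
    "product",            # assumed to follow its modifiers
    "model",
    "applicable_object",
    "other",
]


def reconstruct_title(key_info, order=PRESET_ORDER):
    """Reorder (phrase, category) pairs by the preset category order.

    sorted() is stable, so phrases in the same category keep the
    relative order they had in the original title.
    """
    rank = {category: i for i, category in enumerate(order)}
    ordered = sorted(key_info, key=lambda item: rank[item[1]])
    return " ".join(phrase for phrase, _ in ordered)


# Key information in original-title order, labelled by hand for this example.
key_info = [
    ("new", "modifier"),
    ("ultrathin", "modifier"),
    ("white duck down", "other"),
    ("women", "applicable_object"),
    ("fashionable", "modifier"),
    ("thickened", "modifier"),
    ("slim-fit", "modifier"),
    ("fur collar", "other"),
    ("lovely", "modifier"),
    ("down jacket", "product"),
    ("100-130 jin", "specification"),
]

title = reconstruct_title(key_info)
print(title)
```

Because only the order changes, every phrase in the output also appears in the input, which is the property the description relies on when it says the reconstructed title is not distorted.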
However, the importance of the different features may vary between product domains. For electronic products, for example, model words are more important, while for household articles applicable object words are more important. In an alternative embodiment, therefore, different orders can be set for commodities in different domains.
In this embodiment, after the key information of the product is identified, the domain to which the product belongs is also determined, and the order corresponding to that domain is looked up. The order of the key information in the original content is then adjusted according to the order corresponding to the domain, thereby obtaining the reconstructed product information.
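A domain-specific order can be modelled as a lookup table with a fallback. The sketch below is hypothetical throughout: the domain names, the per-domain orders, the sample phrases, and the function names are invented for illustration, since the description only states that model words matter more for electronics and applicable object words more for household articles.

```python
# Hypothetical orders; the description does not give concrete per-domain orders.
DEFAULT_ORDER = ["quantity", "specification", "modifier", "product",
                 "model", "applicable_object", "other"]

DOMAIN_ORDERS = {
    # Model words first for electronic products.
    "electronics": ["model", "product", "specification", "quantity",
                    "modifier", "applicable_object", "other"],
    # Applicable object words first for household articles.
    "household": ["applicable_object", "product", "modifier",
                  "specification", "quantity", "model", "other"],
}


def order_for_domain(domain):
    """Look up the order for a domain, falling back to the unified order."""
    return DOMAIN_ORDERS.get(domain, DEFAULT_ORDER)


def reorder_for_domain(key_info, domain):
    """Arrange (phrase, category) pairs in the order configured for the domain."""
    rank = {category: i for i, category in enumerate(order_for_domain(domain))}
    # Categories missing from the configured order are placed last.
    ordered = sorted(key_info, key=lambda item: rank.get(item[1], len(rank)))
    return " ".join(phrase for phrase, _ in ordered)


items = [("slim", "modifier"), ("smartphone", "product"),
         ("X100", "model"), ("64GB", "specification")]

electronics_title = reorder_for_domain(items, "electronics")
fallback_title = reorder_for_domain(items, "unknown-domain")
```

Keeping the orders in data rather than in code means a new domain can be supported by adding one table entry.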
In the embodiments of the application, key information is extracted from the original title and its position in the original content is adjusted, which improves the information value and readability of the displayed title, further improves the user's search efficiency, and brings higher returns to the website. The approach also has a significant advantage over the seq2seq method: the new title is reconstructed from phrases in the original title, so there is no distortion problem.
Therefore, the embodiment of the application solves the problem that in the shopping website in the prior art, the titles of products are disordered, so that the searching efficiency is low when a user searches the products.
As an alternative embodiment, obtaining the reconstructed product information at least by adjusting the order of the key information in the original content includes: classifying the key information to obtain first key information and second key information other than the first key information, wherein the first key information comprises words describing unique attributes of the product and marketing words; determining an attention parameter of a target object for the second key information; sorting the second key information according to the attention parameter to obtain a local sorting result; arranging the local sorting result and the first key information according to a preset reconstruction rule; and determining the arrangement result to be the reconstructed product information.
Specifically, the unique attributes of a product are attributes whose values are fixed for that product. For a down jacket, for example, information such as color, material, and applicable object are unique attributes, and the key information representing them is the first key information. The product word and product modifiers of a down jacket, by contrast, may be expressed with a variety of descriptive words (e.g., lovely, slim-fit, commuter), and the key information representing these non-unique attributes is the second key information.
In the above scheme, the attention degree parameter of the second key information is obtained, the local ordering result corresponding to the second key information is determined according to the attention degree parameter of the second key information, and then the local ordering result is combined with the first key information according to the preset reconstruction rule, so that the final reconstructed product title is obtained.
In this process, the attention parameter represents how much attention users pay to the second key information. It can be calculated from the historical data of the shopping platform. In an alternative embodiment, the number or frequency of searches for the second key information in the preset platform may be obtained, and the attention parameter of the second key information determined from that number or frequency. In another alternative embodiment, an attention parameter model may be constructed from historical data of users accessing the shopping platform, and the attention parameter of each piece of second key information predicted with that model.
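The search-frequency variant can be sketched as a simple count over a query log. This is only an illustration under stated assumptions: the function name `attention_from_search_log` and the sample queries are hypothetical, and plain substring matching stands in for whatever matching a real platform would use.

```python
def attention_from_search_log(queries, phrases):
    """Return {phrase: share of logged queries that contain the phrase}.

    Assumption: case-insensitive substring matching is used as a crude
    stand-in for proper query analysis.
    """
    normalized = [query.lower() for query in queries]
    total = len(normalized)
    attention = {}
    for phrase in phrases:
        needle = phrase.lower()
        hits = sum(1 for query in normalized if needle in query)
        attention[phrase] = hits / total if total else 0.0
    return attention


# Hypothetical search history of the platform.
search_log = [
    "lovely down jacket",
    "slim-fit down jacket women",
    "down jacket",
    "slim-fit coat",
]

attention = attention_from_search_log(
    search_log, ["down jacket", "slim-fit", "lovely"]
)
```

A count of this kind cannot score phrases that never appeared in the log, which is the generalization limit that motivates the model-based alternative.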
In the above process, the second key information may be sequenced according to the order of the attention degree parameter from high to low, to obtain a local sequencing result, and then the local sequencing result is used as a whole to participate in the sequencing with the first key information, to obtain the final reconstructed product title.
As an alternative embodiment, the first key information includes at least one of: brand, model, quantity, specification, marketing, and applicable object words.
The first key information is the key information that does not need to take part in the attention parameter calculation. It includes key information describing unique attributes of the product; because this information is unique to a product, it is excluded from the calculation. The first key information also includes marketing words. Marketing words carry no practical meaning and take up considerable space, so they can be deleted or placed at the end of the title, and no attention parameter needs to be calculated for them.
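Putting the classification, the local sorting, and the treatment of marketing words together gives roughly the following. The sketch makes several assumptions that are not in the description: the category labels, the attention values, the function name `rebuild`, and in particular the reconstruction rule itself (first key information in its original order, then the local sorting result as one block, then marketing words either dropped or appended).

```python
# Categories treated as first key information (no attention calculation).
FIRST_CATEGORIES = {"brand", "model", "quantity", "specification",
                    "marketing", "applicable_object"}


def rebuild(key_info, attention, keep_marketing=False):
    """Rebuild a title from (phrase, category) pairs.

    Assumed reconstruction rule: first key information, then the second
    key information sorted by attention, then (optionally) marketing words.
    """
    marketing = [p for p, c in key_info if c == "marketing"]
    first = [p for p, c in key_info
             if c in FIRST_CATEGORIES and c != "marketing"]
    second = [p for p, c in key_info if c not in FIRST_CATEGORIES]

    # Local sorting result: highest attention first; ties keep title order.
    local = sorted(second, key=lambda p: attention.get(p, 0.0), reverse=True)

    tail = marketing if keep_marketing else []
    return " ".join(first + local + tail)


key_info = [
    ("HOT SALE", "marketing"),
    ("lovely", "modifier"),
    ("down jacket", "product"),
    ("slim-fit", "modifier"),
    ("100-130 jin", "specification"),
    ("women", "applicable_object"),
]
# Hypothetical attention parameters for the second key information.
attention = {"down jacket": 0.75, "slim-fit": 0.5, "lovely": 0.25}

without_marketing = rebuild(key_info, attention)
with_marketing = rebuild(key_info, attention, keep_marketing=True)
```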
As an alternative embodiment, identifying key information from the original content includes: performing named entity recognition on the original content according to the entity features of the product to obtain entity feature information; segmenting the original content according to semantics to obtain semantic information; combining the entity feature information and the semantic information to obtain combined information; and performing at least one preset process on the combined information to obtain the key information, wherein the preset processes include fusion, verification, and disambiguation.
Specifically, the named entity recognition may use NER technology: the original content corresponding to the product is processed with NER to obtain the entity feature information. Entity recognition on the original content is implemented as sequence labeling, and the models that can be used include a CRF model based on feature templates or a Bi-LSTM-CRF model based on a neural network. In an alternative embodiment, the labels used for the different components may be defined and the training data produced by manual labeling.
When the entity characteristic information of the product is acquired, the product attribute recorded by the seller in the shopping platform can be acquired, and the product attribute recorded by the seller is also used as the entity characteristic information.
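To make the sequence labeling view concrete, the sketch below turns a per-token label sequence into entity phrases. It assumes a BIO labeling scheme (B- begins an entity, I- continues it, O is outside any entity); the description only says that labels are defined for the different components, so the scheme, the label names, and the function name `decode_bio` are illustrative assumptions. In practice the tags would come from a trained CRF or Bi-LSTM-CRF model rather than being written by hand.

```python
def decode_bio(tokens, tags):
    """Group tokens into (phrase, label) entities from B-/I-/O tags."""
    entities = []
    current_tokens, current_label = [], None
    for token, tag in zip(tokens, tags):
        prefix, _, label = tag.partition("-")
        # An I- tag with the same label extends the entity being built.
        if prefix == "I" and current_tokens and label == current_label:
            current_tokens.append(token)
            continue
        # Anything else closes the entity being built, if there is one.
        if current_tokens:
            entities.append((" ".join(current_tokens), current_label))
        if prefix in ("B", "I"):  # a stray I- tag starts a new entity
            current_tokens, current_label = [token], label
        else:                     # an O tag is outside any entity
            current_tokens, current_label = [], None
    if current_tokens:
        entities.append((" ".join(current_tokens), current_label))
    return entities


# Hand-written tags for illustration only.
tokens = ["HOT", "SALE", "!", "phone", "case", "for", "Iphone6"]
tags = ["B-MARKETING", "I-MARKETING", "O",
        "B-PRODUCT", "I-PRODUCT", "B-OBJECT", "I-OBJECT"]

entities = decode_bio(tokens, tags)
```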
Segmenting the original content according to semantics means cutting it along semantic boundaries with a chunking model, which yields the semantic information. Identifying semantic boundaries with a chunking model addresses the poor title quality of rule-based methods. In an alternative embodiment, the semantic granularity used for segmentation is set first; in this embodiment, semantic boundaries may be defined at the granularity of noun phrases. For example, for the original content "Free Shipping CANCA 32inch multimedia HD LED LCD flat panel TV Display monitor Full HD HDMI/USB/AV/RF/VGA", the result of semantic segmentation is: Free Shipping | CANCA | 32inch | multimedia HD LED LCD | flat panel | TV Display monitor | Full HD HDMI/USB/AV/RF/VGA. The model for semantic segmentation may be a CRF or a Bi-LSTM-CRF.
Named entity recognition alone is often not accurate enough for extracting key information. For example, for the applicable object word "applicable to iphone8 Plus" in the original content of a mobile phone case, NER may identify the applicable object word as "applicable to iphone8" and fail to recognize "iphone8 Plus" as a whole. Therefore, in addition to performing named entity recognition on the original content, the original content is segmented according to semantics, and the key information corresponding to the product is determined from both the named entity recognition result and the semantic segmentation result.
It should be noted that, the object for identifying the named entity and for performing semantic segmentation is the original content corresponding to the product, so that the entity feature information and the semantic information may include the same or similar content. Therefore, after combining the entity characteristic information and the semantic information to obtain combined information, various processes are required to be performed on the combined information.
In an alternative embodiment, the corresponding processing rule can be constructed according to the processing required by the combined information, and the key information corresponding to the product can be obtained by passing the combined information through the set processing rule.
Specifically, fusion removes redundancy: where the combined information contains identical entity feature information and semantic information, only one copy is retained. The rule corresponding to fusion may therefore be: if identical entity feature information and semantic information are detected, delete one of them.
Verification checks the accuracy of the entity feature information and the semantic information. The corresponding rule may be: map the entity feature information and the semantic information to a vocabulary and, if an item cannot be mapped to the vocabulary, correct it.
Disambiguation unifies entity feature information and semantic information in the combined information whose similarity is higher than a preset value, so as to remove words that were recognized or segmented incorrectly. The rule corresponding to disambiguation may therefore be: if entity feature information and semantic information with similarity higher than the preset value are detected, obtain the confidence of each and delete the one with lower confidence. Taking the applicable object word "applicable to iphone8 Plus" as an example again, the result obtained by NER is "applicable to iphone8" and the result obtained by semantic segmentation is "applicable to iphone8 Plus". The similarity of the two is higher than the preset value, so their confidences are obtained; "applicable to iphone8 Plus" has the higher confidence, so "applicable to iphone8" is deleted and "applicable to iphone8 Plus" is retained.
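The three rules can be sketched with Python's standard `difflib` module. Everything concrete here is an assumption made for illustration: the confidence values, the vocabulary, the misspelled candidate "shockprof", the 0.8 thresholds, and the choice of `SequenceMatcher.ratio()` as the similarity measure; the description does not specify how similarity or correction is computed.

```python
from difflib import SequenceMatcher, get_close_matches


def fuse(candidates):
    """Fusion: keep one copy of identical phrases (the most confident one)."""
    best = {}
    for phrase, confidence in candidates:
        if phrase not in best or confidence > best[phrase]:
            best[phrase] = confidence
    return list(best.items())


def verify(candidates, vocabulary, cutoff=0.8):
    """Verification: correct phrases that miss the vocabulary but nearly match it.

    Assumption: phrases with no close vocabulary entry are kept unchanged.
    """
    checked = []
    for phrase, confidence in candidates:
        if phrase not in vocabulary:
            close = get_close_matches(phrase, vocabulary, n=1, cutoff=cutoff)
            if close:
                phrase = close[0]
        checked.append((phrase, confidence))
    return checked


def disambiguate(candidates, threshold=0.8):
    """Disambiguation: among highly similar phrases keep the most confident.

    The result is ordered by descending confidence.
    """
    kept = []
    for phrase, confidence in sorted(candidates, key=lambda c: c[1],
                                     reverse=True):
        if all(SequenceMatcher(None, phrase, other).ratio() <= threshold
               for other, _ in kept):
            kept.append((phrase, confidence))
    return kept


# Hypothetical (phrase, confidence) outputs of the two models.
ner_result = [("phone case", 0.95), ("applicable to iphone8", 0.60)]
chunk_result = [("phone case", 0.90), ("applicable to iphone8 Plus", 0.85),
                ("shockprof", 0.70)]
vocabulary = ["phone case", "shockproof"]

combined = ner_result + chunk_result
key_information = disambiguate(verify(fuse(combined), vocabulary))
```

With these inputs the duplicate "phone case" is fused, "shockprof" is corrected against the vocabulary, and "applicable to iphone8" is dropped in favour of the more confident "applicable to iphone8 Plus".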
In this scheme, the key information is obtained by combining the named entity recognition result with the semantic segmentation result; that is, the named entity recognition result is corrected by means of semantic segmentation. This avoids the misrecognition or missed recognition that may occur in named entity recognition, avoids errors in the key information corresponding to the product, and improves the accuracy of product title reconstruction.
It should be noted that there are two main types of methods for rewriting product titles: rule-based methods and methods based on a seq2seq model. The traditional rule-based method mainly uses statistics combined with a vocabulary to select the important components of the title and delete the unimportant ones. Because semantic boundaries are not considered, titles reconstructed in this way have fluency problems. In addition, user attention weights are calculated by vocabulary matching, which generalizes poorly: the weights of new search words that have not appeared in the history are calculated inaccurately. In the seq2seq method, the input is the original title and the output is the newly constructed title. This method has two drawbacks. First, it needs a large amount of manually produced data as training corpus, which is costly. Second, although the seq2seq method can produce fairly fluent titles and works well for the women's clothing category, it extracts key information with poor accuracy in categories that contain many brands, models, series, specifications, and attributes (for example, key attributes such as size and memory capacity for goods in 3C categories such as mobile phones and computers).
The method provided by the embodiment of the application only needs a small amount of artificial production data to train the NER model and the Chunking model, has great advantages compared with the seq2seq method, and the newly generated title is reconstructed based on the phrase in the original title, so that the problem of distortion does not exist.
As an alternative embodiment, determining the attention parameter of the target object to the second key information includes: acquiring a focus degree model, wherein the focus degree model is obtained based on query history training of a target object; and determining a degree of interest parameter of the target object on the second key information based on the degree of interest model.
Specifically, the target object may be a user accessing the shopping platform, and training data may be obtained from the target object in the query history of the shopping platform to train the attention model.
In an alternative embodiment, the terms in the user's query log from the last month may be selected, and the terms together with their corresponding scores may be used as training data. When scoring the terms, one assumption that may be used is that if the user directly performs one or more purchase operations from the query results of a term, the user pays more attention to that term; another is that the longer a user stays on the query results corresponding to a term, the higher the user's attention to that term.
After training to obtain the attention model, the second key information can be input into the attention model, so that attention parameters of second key information predicted by the attention model are obtained.
According to the scheme, the buyer attention model is trained on the history log, so that user habits are incorporated and the information the buyer cares about is exposed earlier in the reconstructed title, which improves the buyer's search efficiency.
As an optional embodiment, the attention model is a language model, and determining an attention parameter of the target object to the second key information based on the attention model includes: inputting the second key information into the language model to obtain scoring of the second key information by the language model; and determining the scoring as the attention parameter of the second key information.
Specifically, a language model is an abstract mathematical model of language built on objective linguistic facts. In an alternative embodiment, the attention model is a language model. Buyer attention may be modeled by obtaining the historical Query of all buyers and training a bi-gram language model. For a piece of second key information to be tested, the buyer's attention can be obtained as the score given by the language model.
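As a rough illustration of scoring a phrase with a bi-gram language model trained on historical queries, the sketch below may help. The add-one smoothing, the per-token averaging, the whitespace tokenization and the sample queries are all assumptions made here; the application only states that a bi-gram language model is trained on buyer historical Query.

```python
import math
from collections import Counter

def train_bigram(queries):
    """Count unigrams and bigrams over whitespace-tokenized queries."""
    unigrams, bigrams = Counter(), Counter()
    for query in queries:
        tokens = ["<s>"] + query.lower().split()
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams

def attention_score(phrase, unigrams, bigrams):
    """Average add-one-smoothed bigram log-probability per token."""
    tokens = ["<s>"] + phrase.lower().split()
    if len(tokens) == 1:
        return float("-inf")  # an empty phrase gets the lowest score
    vocab_size = len(unigrams)
    log_prob = 0.0
    for prev, cur in zip(tokens, tokens[1:]):
        log_prob += math.log((bigrams[(prev, cur)] + 1) / (unigrams[prev] + vocab_size))
    return log_prob / (len(tokens) - 1)

# Hypothetical buyer query history.
history = ["hd hdmi tv", "hd hdmi monitor", "flat panel tv", "led tv"]
uni, bi = train_bigram(history)
print(attention_score("hd hdmi", uni, bi))
print(attention_score("multimedia lcd", uni, bi))
```

A phrase that buyers search for often ("hd hdmi") scores higher than one that never appears in the history ("multimedia lcd"), and the score can then serve as the attention parameter.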
As an optional embodiment, sorting the second key information according to the attention parameter, to obtain a local sorting result includes: and ordering the second key information according to the order of the attention degree parameters from high to low to obtain a local ordering result.
In the scheme, the second key information is arranged according to the order of the attention degree parameters from high to low, so that the second key information with higher attention degree is arranged forward, the exposure rate of the product on the shopping platform can be guaranteed to the greatest extent, and the user is attracted to the greatest extent.
As an optional embodiment, the second key information includes a plurality of product words and a plurality of product modifier words, and before the second key information is ranked in order of from high to low according to the attention parameter, the method further includes: reserving the product word with the largest attention degree parameter in the product words, and deleting other product words; or reserving the product modifier with the biggest attention degree parameter in the product modifiers, and deleting other product modifiers.
In a practical shopping platform, in order to increase the exposure rate of a product, a seller typically piles up a lot of product words and product modifier words in the title of the product.
Taking the original content "100-130 jin new ultrathin white duck down women fashionable thickened slimming fur collar lovely down jacket" as an example, new, fashionable, slimming, lovely and the like are all product modifier words. Taking the original content "dress long skirt gentlewoman Qianlike summer suspender skirt" as an example, dress, long skirt and suspender skirt are all product words.
Such stacking can prevent the useful information of the product from being exposed, which affects the user's accurate judgment of the product. Therefore, when there are many product words or product modifier words, the one with the highest attention parameter is kept and the other product words or product modifier words are deleted, which reduces useless stacking of product words and product modifier words.
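The pruning and local ordering described above can be sketched as follows. The category labels, the attention values and the choice to prune both product words and product modifier words in one pass are assumptions for illustration.

```python
def local_sort(second_key_info, attention):
    """Keep the top product word and top modifier, then sort by attention.

    second_key_info: list of (phrase, kind) pairs.
    attention: dict mapping phrase -> attention parameter (hypothetical values).
    """
    pruned_kinds = {"product_word", "product_modifier"}
    best = {}
    # Find the phrase with the largest attention parameter in each pruned kind.
    for phrase, kind in second_key_info:
        if kind in pruned_kinds:
            if kind not in best or attention[phrase] > attention[best[kind]]:
                best[kind] = phrase
    # Drop every other product word / modifier; keep all remaining phrases.
    kept = [p for p, k in second_key_info if k not in pruned_kinds or best[k] == p]
    # Local ordering result: attention parameter from high to low.
    return sorted(kept, key=lambda p: attention[p], reverse=True)

items = [
    ("dress", "product_word"),
    ("long skirt", "product_word"),
    ("suspender skirt", "product_word"),
    ("summer", "product_modifier"),
    ("elegant", "product_modifier"),
]
scores = {"dress": 0.9, "long skirt": 0.6, "suspender skirt": 0.4,
          "summer": 0.7, "elegant": 0.3}
print(local_sort(items, scores))
```

With these hypothetical scores the stacked product words collapse to "dress" and the modifiers to "summer".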
As an alternative embodiment, the preset reconfiguration rule includes: arranged in a first order, the first order being: brand words, model words, quantity words, specification words, product modifier words, applicable object words and partial ordering results; or arranged in a second order, the second order being: quantity words, specification words, product modifier words, model words, applicable object words and local ordering results.
Specifically, the heuristic title reconstruction rule may be a reconstruction rule compiled by linguists and verified by comparative online experiments. The first order represents the reconstruction rule that includes brand words, and the second order represents the reconstruction rule that does not include brand words.
As an optional embodiment, arranging the local ordering result and the first key information according to a preset reconfiguration rule includes: detecting whether the key information of the product comprises brand information or not; and if the key information of the product does not comprise brand information, arranging the first key information and the partial ordering result according to the second order.
In the above scheme, different reconstruction rules are used for different types of products. Products of which the key information does not include brand information show that the brand information is low in importance to the products, and for the key information of the category, the key information is rearranged by adopting the second sequence arrangement as a preset reconstruction rule so as to reconstruct original content.
As an alternative embodiment, if the key information of the product includes a word for representing brand information, the method further includes: acquiring confidence level for brand information; if the confidence coefficient is larger than a preset confidence coefficient threshold value, the first key information and the local ordering result are arranged according to a first sequence; and if the confidence coefficient is smaller than or equal to the preset confidence coefficient threshold value, the first key information and the local ordering result are arranged according to the second sequence.
For products whose key information includes a brand, whether the brand is authentic becomes the criterion for selecting the reconstruction rule. For a real brand, the brand name usually has a positive effect in the sales process, while for a false brand, the brand may be hidden from the reconstructed title.
Thus, in the above-described scheme, for a product whose key information includes brand information, the reconstruction rule is selected according to the confidence of the brand information. If the confidence of the brand information is greater than a preset confidence threshold, the product title is reconstructed in the first order, which contains the brand information; if the confidence of the brand information is less than or equal to the preset confidence threshold, the product title is reconstructed in the second order, which does not contain the brand information.
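The choice between the first order and the second order can be sketched as below. The slot names, the data layout (a dict from category to phrases) and the threshold of 0.5 are assumptions for illustration; only the two orders themselves come from the text.

```python
FIRST_ORDER = ["brand", "model", "quantity", "specification",
               "product_modifier", "applicable_object", "local"]
SECOND_ORDER = ["quantity", "specification", "product_modifier",
                "model", "applicable_object", "local"]

def arrange(first_key_info, local_result, brand_confidence=None, threshold=0.5):
    """Arrange first key information and the local ordering result.

    first_key_info: dict mapping category -> list of phrases.
    local_result: list of phrases already sorted by attention.
    threshold: hypothetical preset confidence threshold.
    """
    brand_trusted = (
        bool(first_key_info.get("brand"))
        and brand_confidence is not None
        and brand_confidence > threshold
    )
    order = FIRST_ORDER if brand_trusted else SECOND_ORDER
    parts = []
    for slot in order:
        if slot == "local":
            parts.extend(local_result)
        else:
            parts.extend(first_key_info.get(slot, []))
    return " ".join(parts)

info = {"brand": ["CANCA"], "model": ["32inch"]}
local = ["TV Display monitor", "flat panel"]
print(arrange(info, local, brand_confidence=0.9))
print(arrange(info, local, brand_confidence=0.2))
```

In the second call the low-confidence brand falls through to the second order, so the brand word does not appear in the output.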
As an alternative embodiment, before determining the arrangement result as the reconstructed product information, the method includes: deleting key information used for representing marketing words in the arrangement result.
In the above scheme, after the first key information and the local ordering result are arranged according to the preset reconstruction rule, the marketing words in the arrangement result need to be deleted. Marketing words are promotional words unrelated to the product itself; deleting them reduces their influence on the user's purchase judgment, so that the user pays more attention to the attributes of the product itself.
In an alternative embodiment, a marketing word list may be preset, after the sorting result is obtained, each word in the sorting result is mapped to the marketing word list, and the successfully mapped word is determined to be the marketing word, so that the marketing word to be deleted may be found out from the sorting result. After the marketing words in the ordering result are found, deleting the marketing words, and obtaining the title of the reconstructed product.
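A minimal sketch of the marketing word filter follows. The contents of the word list and the case-insensitive exact match are assumptions; the text only says that each word is mapped against a preset marketing word list.

```python
# Hypothetical preset marketing word list.
MARKETING_WORDS = {"free shipping", "hot sale", "best price"}

def remove_marketing_words(phrases, marketing_words=MARKETING_WORDS):
    """Drop every phrase that maps to an entry in the marketing word list."""
    return [p for p in phrases if p.lower() not in marketing_words]

arranged = ["Free Shipping", "CANCA", "32inch", "TV Display monitor"]
title = " ".join(remove_marketing_words(arranged))
print(title)
```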
Fig. 4 is a schematic diagram of product title reconstruction according to embodiment 1 of the present application, and the above embodiment is described below with reference to fig. 4:
Step S41, obtaining a component information set S (entity feature information) of the product title from the product attributes entered by the seller and the result of performing NER on the product title (original content), the components including: brand word, model word, product word, quantity word, specification word, product modifier word, applicable object word, and other words;
Step S42, segmenting the product title according to semantic boundaries by using Chunking to obtain a segmented ordered phrase set C (semantic information);
Step S43, fusing, verifying and disambiguating the component information set S and the semantic segmentation phrase set C to further repair the component information of the product title, obtaining a title component set S1 (key information), and performing title component analysis based on the title component set S1;
Step S44, modeling buyer behavior from the historical Query of users to obtain a model M (attention model), which is used for calculating buyer attention;
Step S45, using the model M to calculate buyer attention for the components in the title component set S1, wherein the first key information (brand word, model word, quantity word, specification word, marketing word and applicable object word) does not participate in the calculation; the remaining components are sorted in order of buyer attention from high to low to obtain a product component set S2 (local sorting result), and, among a plurality of product words and a plurality of product modifier words, the phrase with the highest buyer attention is selected;
Step S46, reconstructing the sorting result obtained in step S45 and the component information not participating in the attention calculation according to the following heuristic rules (reconstruction rules):
With a brand word: brand word, model word, quantity word, specification word, product modifier word, applicable object word and other words;
Without a brand word: quantity word, specification word, product modifier word, model word, applicable object word and other words;
Step S47, deleting marketing words in the reconstructed title based on a preset marketing word list to form a new title for List page display.
Fig. 5 is a schematic diagram of an example of the reconstruction of a product title according to embodiment 1 of the present application. As shown in fig. 5, the original content before reconstruction is "Free Shipping CANCA 32inch multimedia HD LED LCD flat panel TV Display monitor Full HD HDMI/USB/AV/RF/VGA". The result after NER is: freeO shippingO cancaB_B 32S_B inchS_E multimediaO hdO ledO lcdO flatO panelO tvO displayP_M monitorP_C fullO hdO hdmiB_B/O usbO/O avO/O rfO/O vgaO, wherein "O", "B_B", "S_E", "P_M" and "P_C" are the labels assigned by NER. The result of Chunking the original content is: Free Shipping || CANCA || 32inch || multimedia HD LED LCD || flat panel || TV Display monitor || Full HD HDMI/USB/AV/RF/VGA, where the symbol "||" marks the segmentation boundaries.
After the NER result and the Chunking result are fused, verified and disambiguated, the first key information and the second key information are extracted, wherein the first key information comprises: brand word "CANCA", model word "32inch", product word "TV Display monitor"; the second key information includes: other words "multimedia HD LED LCD" and "flat panel", product modifier word "HD HDMI/USB/AV/RF/VGA". The attention parameters of the other words and the product modifier word are calculated with the bi-gram language model trained on buyer historical Query; the product modifier with the highest attention (HD HDMI) is kept and sorted together with the other words in order of attention parameter from high to low; the result is then reconstructed with the brand word, model word and product word according to the preset heuristic rules to obtain the reconstructed title: CANCA 32inch TV Display monitor flat panel multimedia HD LED LCD HD HDMI.
It should be noted that, for simplicity of description, the foregoing method embodiments are all described as a series of acts, but it should be understood by those skilled in the art that the present invention is not limited by the order of acts described, as some steps may be performed in other orders or concurrently in accordance with the present invention. Further, those skilled in the art will also appreciate that the embodiments described in the specification are all preferred embodiments, and that the acts and modules referred to are not necessarily required for the present invention.
From the description of the above embodiments, it will be clear to a person skilled in the art that the method according to the above embodiments may be implemented by means of software plus the necessary general hardware platform, but of course also by means of hardware, but in many cases the former is a preferred embodiment. Based on such understanding, the technical solution of the present invention may be embodied essentially or in a part contributing to the prior art in the form of a software product stored in a storage medium (e.g. ROM/RAM, magnetic disk, optical disk) comprising several instructions for causing a terminal device (which may be a mobile phone, a computer, a server, or a network device, etc.) to perform the method of the various embodiments of the present invention.
Example 2
According to an embodiment of the present invention, there is further provided a product information reconstruction device for implementing the above product information reconstruction method, and fig. 6 is a schematic diagram of a product information reconstruction device according to embodiment 2 of the present application, as shown in fig. 6, where the device 600 includes:
the obtaining module 602 is configured to obtain an original content of the product information in a preset platform;
an identification module 604 for identifying key information from the original content, wherein the key information includes at least one word for characterizing a feature of the product;
The adjusting module 606 is configured to obtain the reconstructed product information at least by adjusting the order of the key information in the original content.
It should be noted that the obtaining module 602, the identifying module 604 and the adjusting module 606 correspond to steps S31 to S35 in embodiment 1. The modules share the examples and application scenarios of the corresponding steps, but are not limited to what is disclosed in embodiment 1. It should also be noted that the above modules may run as part of the apparatus in the computer terminal 10 provided in embodiment 1.
As an alternative embodiment, the adjustment module comprises: the classification sub-module is used for classifying the key information to obtain first key information and second key information except the first key information, wherein the first key information comprises words and marketing words for describing unique attributes of products; the first determining submodule is used for determining an attention parameter of the target object to the second key information; the first ordering sub-module is used for ordering the second key information according to the attention parameter to obtain a local ordering result; the second ordering sub-module is used for arranging the local ordering result and the first key information according to a preset reconstruction rule; and the second determining submodule is used for determining the arrangement result as the reconstructed product information.
As an alternative embodiment, the first key information includes at least one of: brand, model, quantity, specification, marketing, and applicable object words.
As an alternative embodiment, the identification module comprises: the identification sub-module is used for carrying out named entity recognition on the original content according to the entity characteristics of the product to obtain entity feature information; the segmentation sub-module is used for segmenting the original content according to semantics to obtain semantic information; the combination sub-module is used for combining the entity feature information and the semantic information to obtain combined information; the processing sub-module is used for carrying out at least one preset process on the combined information to obtain key information, wherein the preset process comprises: fusion, verification and disambiguation processing.
As an alternative embodiment, the first determining submodule includes: the acquisition unit is used for acquiring a focus degree model, wherein the focus degree model is obtained based on query history training of the target object; and the determining unit is used for determining the attention degree parameter of the target object to the second key information based on the attention degree model.
As an alternative embodiment, the attention model is a language model, and the determining unit includes: the scoring subunit is used for inputting the second key information into the language model to obtain the scoring of the second key information by the language model; and the determining subunit is used for determining the scoring as the attention parameter of the second key information.
As an alternative embodiment, the first sorting sub-module comprises: and the ordering unit is used for ordering the second key information according to the order of the attention degree parameters from high to low to obtain a local ordering result.
As an alternative embodiment, the second key information includes a plurality of product words and a plurality of product modifier words, and the first ranking sub-module further includes: the deleting unit is used for reserving the product word with the largest attention degree parameter in the plurality of product words and deleting other product words before ordering the second key information according to the order of the attention degree parameters from high to low to obtain a local ordering result; or reserving the product modifier with the biggest attention degree parameter in the product modifiers, and deleting other product modifiers.
As an alternative embodiment, the preset reconfiguration rule includes: arranged in a first order, the first order being: brand words, model words, quantity words, specification words, product modifier words, applicable object words and partial ordering results; or arranged in a second order, the second order being: quantity words, specification words, product modifier words, model words, applicable object words and local ordering results.
As an alternative embodiment, the second sorting sub-module includes: the detection unit is used for detecting whether the key information of the product comprises brand information or not; and the first sorting unit is used for sorting the first key information and the partial sorting result according to the second sequence if the key information of the product does not comprise brand information.
As an alternative embodiment, the second sorting sub-module further comprises: an obtaining unit, configured to obtain a confidence level for brand information if the key information of the product includes the brand information; the second ordering unit is used for ordering the first key information and the local ordering result according to the first sequence if the confidence coefficient is larger than a preset confidence coefficient threshold value; and the third ordering unit is used for ordering the first key information and the local ordering result according to the second sequence if the confidence coefficient is smaller than or equal to a preset confidence coefficient threshold value.
As an alternative embodiment, the adjustment module further comprises: and the deleting sub-module is used for deleting key information used for representing the marketing word in the arrangement result before determining the arrangement result to be the reconstructed product information.
Example 3
According to an embodiment of the present invention, there is further provided a method for reconstructing product information, and fig. 7 is a flowchart of a method for reconstructing product information according to embodiment 3 of the present application, as shown in fig. 7, where the method includes:
step S71, the original content of the product information describing the product is displayed.
Specifically, the preset platform may be a shopping platform, the product information may include attributes such as a name, a model, a use, and the like of the product, and the original content may be a title of the product displayed in the shopping platform. The original content here is used to represent a title before the title of the product is reconstructed, and may be a title set in advance for the product by the seller.
In an alternative embodiment, before the product is put on the shelf, the merchant may set a corresponding title for the product, and when the user searches for the product using the shopping platform, the title of the product may be displayed in the list page for the user to view. To increase the exposure of a product, a merchant may pile up words describing the product. For example, for a down jacket, the title may be: "100-130 jin new ultrathin white duck down women fashionable thickened slimming fur collar lovely down jacket". This title is the original content corresponding to the product.
Step S73, displaying the key information identified in the original content, wherein the key information includes at least one term for characterizing the product.
Specifically, the product features may include the product name, quantity, specification, model, applicable object and the like, and the key information is the words that express these features of the product.
In an alternative embodiment, the original content corresponding to the product, i.e., the product title before reconstruction, may be obtained first. And identifying words for describing the characteristics of the product from the original content by using a named entity identification technology, and obtaining the key information corresponding to the product.
Step S75, displaying the reconstructed product information, wherein the reconstructed product information is obtained by adjusting the display order of the key information in the original content.
Specifically, after determining the characteristics of the product represented by the key information, the order of the key information in the original content can be adjusted according to the characteristics of the product represented by the key information.
In the above scheme, the order of the key information in the original content may be adjusted according to a preset unified order, for example, the preset unified order is: quantity word, specification word, product modifier word, model word, applicable object word and other words.
For the original content "100-130 jin new ultrathin white duck down women fashionable thickened slimming fur collar lovely down jacket", the result of extracting key information is: there is no quantity word; the specification word is 100-130 jin; the product word is down jacket; the product modifier words include: new, ultrathin, fashionable, thickened, slimming and lovely; there is no model word; the applicable object word is women; other words include: white duck down and fur collar. According to the preset unified order, the reconstructed product information can be obtained: 100-130 jin new ultrathin fashionable thickened slimming lovely women white duck down fur collar.
But the importance of different features may be different for different fields of products, for example: for electronic products, the model words are more important; the applicable object words are more important for household articles, so that in an alternative embodiment, different arrangement sequences can be set for commodities in different fields.
In the above embodiment, after the key information of the product is identified, it is also necessary to determine the domain to which the product belongs, and search the order corresponding to the domain according to the domain to which the product belongs. And then, according to the sequence corresponding to the field, adjusting the sequence of the key information in the original content, thereby obtaining the reconstructed product information.
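The domain-specific lookup can be sketched as follows. The domain names and the two domain orders are hypothetical; only the default order is taken from the preset unified order given in the text.

```python
# Preset unified order from the text; used when the domain has no own order.
DEFAULT_ORDER = ["quantity", "specification", "product_modifier",
                 "model", "applicable_object", "other"]

# Hypothetical per-domain orders: model first for electronics,
# applicable object first for household articles.
DOMAIN_ORDERS = {
    "electronics": ["model", "quantity", "specification",
                    "product_modifier", "applicable_object", "other"],
    "household": ["applicable_object", "quantity", "specification",
                  "product_modifier", "model", "other"],
}

def reorder(key_info, domain):
    """Reorder (phrase, kind) pairs by the order registered for the domain."""
    order = DOMAIN_ORDERS.get(domain, DEFAULT_ORDER)
    rank = {kind: i for i, kind in enumerate(order)}
    # sorted() is stable, so phrases of the same kind keep their original order.
    ranked = sorted(key_info, key=lambda pk: rank.get(pk[1], len(order)))
    return [phrase for phrase, _ in ranked]

key_info = [("women", "applicable_object"),
            ("100-130 jin", "specification"),
            ("X1", "model")]
print(reorder(key_info, "electronics"))
print(reorder(key_info, "household"))
print(reorder(key_info, "clothing"))
```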
According to the method and the device for displaying the title, the key information of the original title is extracted, and the position of the key information in the original content is adjusted, so that the information value and the understandability of title display are improved, the searching efficiency of a user is further improved, and higher benefits are brought to websites. And has great advantages over the seq2seq method in that newly generated titles are reconstructed based on phrases in the original title, and no distortion problem exists.
Therefore, the embodiment of the application solves the problem that in the shopping website in the prior art, the titles of products are disordered, so that the searching efficiency is low when a user searches the products.
As an alternative embodiment, before displaying the reconstructed product information, the method further comprises: obtaining reconstructed product information at least by adjusting the order of the key information in the original content, wherein the obtaining of the reconstructed product information at least by adjusting the order of the key information in the original content comprises: classifying the key information to obtain first key information and second key information except the first key information, wherein the first key information comprises words and marketing words for describing unique attributes of products; determining a degree of attention parameter of the target object to the second key information; sorting the second key information according to the attention degree parameters to obtain a local sorting result; arranging the local sequencing result and the first key information according to a preset reconstruction rule; and determining the product information after the arrangement result is reconstructed.
Specifically, taking a down jacket as an example, information such as color, material and applicable object are unique attributes of the down jacket, and key information used for representing such unique attributes is first key information; the product words and product modifier words of the down jacket, by contrast, may include various descriptive words (e.g., lovely, slimming, commuter), and key information used for representing these non-unique attributes is second key information.
In the above scheme, the attention degree parameter of the second key information is obtained, the local ordering result corresponding to the second key information is determined according to the attention degree parameter of the second key information, and then the local ordering result is combined with the first key information according to the preset reconstruction rule, so that the final reconstructed product title is obtained.
In this process, the attention parameter may be used to represent the attention of the user to the second key information. The attention parameter of the second key information can be calculated from the historical data of the shopping platform. In an alternative embodiment, the number or frequency of searches for the second key information in the preset platform may be obtained, and the attention parameter of the second key information may be determined according to that number or frequency. In another alternative embodiment, an attention parameter model may be constructed according to historical data of users accessing the shopping platform, and the attention parameter of each piece of second key information may be predicted based on the attention parameter model.
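The search-frequency variant can be sketched as below. Counting a phrase once per query that contains it, normalizing by the number of queries, and the sample log are assumptions for illustration.

```python
from collections import Counter

def attention_by_frequency(query_log, phrases):
    """Attention parameter = share of logged queries that contain the phrase."""
    counts = Counter()
    for query in query_log:
        lowered = query.lower()
        for phrase in phrases:
            if phrase.lower() in lowered:
                counts[phrase] += 1
    total = len(query_log) or 1  # avoid division by zero on an empty log
    return {phrase: counts[phrase] / total for phrase in phrases}

# Hypothetical platform search log.
log = ["lovely down jacket", "slimming down jacket women",
       "lovely coat", "commuter jacket"]
params = attention_by_frequency(log, ["lovely", "slimming", "warm"])
print(params)
```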
In the above process, the second key information may be sequenced according to the order of the attention degree parameter from high to low, to obtain a local sequencing result, and then the local sequencing result is used as a whole to participate in the sequencing with the first key information, to obtain the final reconstructed product title.
Example 4
According to an embodiment of the present invention, there is also provided a product information reconstruction device for implementing the product information reconstruction method in embodiment 3. Fig. 8 is a schematic diagram of a product information reconstruction device according to embodiment 4 of the present application. As shown in fig. 8, the device 800 includes:
a first display module 802 for displaying original content of product information describing a product.
A second display module 804 is configured to display key information identified in the original content, where the key information includes at least one term for characterizing a feature of the product.
A third display module 806 is configured to display the reconstructed product information, where the reconstructed product information is obtained by adjusting the display order of the key information in the original content.
Here, it should be noted that the first display module 802, the second display module 804, and the third display module 806 correspond to steps S71 to S75 in embodiment 1. The modules share the implementation examples and application scenarios of their corresponding steps, but are not limited to the disclosure of the first embodiment. It should be noted that the above modules may operate as part of the apparatus in the computer terminal 10 provided in the first embodiment.
As an alternative embodiment, the above device further comprises a reconstruction module, configured to obtain the reconstructed product information at least by adjusting the order of the key information in the original content before the reconstructed product information is displayed, where the reconstruction module includes: a classification sub-module, used for classifying the key information to obtain first key information and second key information other than the first key information, wherein the first key information comprises words describing unique attributes of the product and marketing words; a first determining sub-module, used for determining an attention degree parameter of the target object for the second key information; a sorting sub-module, used for sorting the second key information according to the attention degree parameters to obtain a local ordering result; an arrangement sub-module, used for arranging the local ordering result and the first key information according to a preset reconstruction rule; and a second determining sub-module, used for determining the arrangement result to be the reconstructed product information.
Example 5
According to an embodiment of the present invention, there is also provided a product information reconstruction system, including:
a processor; and
a memory, coupled to the processor, for providing instructions to the processor to process the following processing steps:
Acquiring original content of product information in a preset platform;
identifying key information from the original content, wherein the key information comprises at least one word for characterizing a feature of a product;
and obtaining the reconstructed product information at least by adjusting the sequence of the key information in the original content.
It should be noted that, the above memory is further used to provide instructions for the processor to process other steps in embodiment 1, which is not described herein.
Example 6
Embodiments of the present invention may provide a computer terminal, which may be any one of a group of computer terminals. Alternatively, in the present embodiment, the above-described computer terminal may be replaced with a terminal device such as a mobile terminal.
Alternatively, in this embodiment, the above-mentioned computer terminal may be located in at least one network device among a plurality of network devices of the computer network.
In this embodiment, the computer terminal may execute the program code of the following steps in the method for reconstructing product information: acquiring original content of product information in a preset platform; identifying key information from the original content, wherein the key information comprises at least one word for characterizing a feature of a product; and obtaining the reconstructed product information at least by adjusting the sequence of the key information in the original content.
Optionally, Fig. 9 is a block diagram of a computer terminal according to embodiment 6 of the present invention. As shown in Fig. 9, the computer terminal A may include: one or more processors 902 (only one is shown), a memory 904, and a peripheral interface 906.
The memory may be used to store software programs and modules, such as the program instructions/modules corresponding to the method and apparatus for reconstructing product information in the embodiments of the present invention. The processor executes the software programs and modules stored in the memory, thereby executing various functional applications and data processing, that is, implementing the method for reconstructing product information described above. The memory may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory may further include memory located remotely from the processor, which may be connected to terminal A through a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The processor may call the information and the application program stored in the memory through the transmission device to perform the following steps: acquiring original content of product information in a preset platform; identifying key information from the original content, wherein the key information comprises at least one word for characterizing a feature of a product; and obtaining the reconstructed product information at least by adjusting the sequence of the key information in the original content.
Optionally, the above processor may further execute program code for: classifying the key information to obtain first key information and second key information other than the first key information, wherein the first key information comprises words describing unique attributes of the product and marketing words; determining an attention degree parameter of the target object for the second key information; sorting the second key information according to the attention degree parameters to obtain a local ordering result; arranging the local ordering result and the first key information according to a preset reconstruction rule; and determining the arrangement result to be the reconstructed product information.
Optionally, the first key information includes at least one of: brand, model, quantity, specification, marketing, and applicable object words.
Optionally, the above processor may further execute program code for: performing named entity recognition on the original content according to the entity characteristics of the product to obtain entity characteristic information; segmenting the original content according to semantics to obtain semantic information; merging the entity characteristic information and the semantic information to obtain merged information; and performing at least one preset process on the merged information to obtain the key information, wherein the preset process comprises: fusion, verification, and disambiguation processing.
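The identification steps above can be illustrated with a deliberately simplified sketch. A dictionary lookup stands in for named entity recognition, a whitespace split stands in for semantic segmentation, and an overlap check stands in for the merging step; all three substitutions, as well as the names `tag_entities`, `segment`, `merge`, the lexicon, and the sample title, are assumptions for illustration and not the patent's actual models. The fusion, verification, and disambiguation processes are not modeled here.

```python
def tag_entities(title, lexicon):
    """Dictionary lookup standing in for named entity recognition."""
    found = []
    lowered = title.lower()
    for phrase, label in lexicon.items():
        start = lowered.find(phrase.lower())
        if start != -1:
            found.append((start, start + len(phrase), label))
    return sorted(found)


def segment(title):
    """Whitespace split standing in for semantic segmentation.

    Returns (start, end) character spans of each token.
    """
    spans, pos = [], 0
    for token in title.split():
        start = title.index(token, pos)
        spans.append((start, start + len(token)))
        pos = start + len(token)
    return spans


def merge(title, entities, segments):
    """Combine both views: entity spans win, other segments are kept as 'other'."""
    merged = list(entities)
    for s, e in segments:
        overlaps = any(s < ent_end and ent_start < e for ent_start, ent_end, _ in entities)
        if not overlaps:
            merged.append((s, e, "other"))
    merged.sort()
    return [(title[s:e], label) for s, e, label in merged]


# Hypothetical title and lexicon.
title = "Acme X200 wireless mouse 2 pack"
lexicon = {"Acme": "brand", "X200": "model", "2 pack": "quantity"}

key_info = merge(title, tag_entities(title, lexicon), segment(title))
print(key_info)
```

In this sketch the multi-token entity "2 pack" survives as one unit because the entity span takes priority over the two whitespace tokens it covers.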
Optionally, the above processor may further execute program code for: acquiring a focus degree model, wherein the focus degree model is obtained based on query history training of a target object; and determining a degree of interest parameter of the target object on the second key information based on the degree of interest model.
Optionally, the attention degree model is a language model, and the above processor may further execute program code for: inputting the second key information into the language model to obtain the score given by the language model to the second key information; and determining the score to be the attention degree parameter of the second key information.
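As a rough illustration of using a language model score as the attention degree parameter, the sketch below trains a unigram model with add-one smoothing on a query history and scores candidate terms by mean log-probability. The choice of a unigram model, the smoothing scheme, the class name `UnigramModel`, and the sample queries are all assumptions; the patent does not specify which language model is used.

```python
import math
from collections import Counter


class UnigramModel:
    """Toy language model trained on query history (illustrative assumption)."""

    def __init__(self, queries):
        self.counts = Counter(tok for q in queries for tok in q.lower().split())
        self.total = sum(self.counts.values())
        self.vocab = len(self.counts)

    def score(self, phrase):
        """Mean log-probability per token, with add-one smoothing."""
        tokens = phrase.lower().split()
        if not tokens:
            return float("-inf")
        # The extra +1 in the denominator reserves probability mass for unseen tokens.
        denom = self.total + self.vocab + 1
        logp = sum(math.log((self.counts[t] + 1) / denom) for t in tokens)
        return logp / len(tokens)


# Hypothetical query history of the target object.
history = ["wireless mouse", "gaming mouse", "wireless keyboard", "mouse pad"]
model = UnigramModel(history)

second_key_info = ["ergonomic", "wireless", "mouse"]
attention = {term: model.score(term) for term in second_key_info}
ranked = sorted(second_key_info, key=attention.get, reverse=True)
print(ranked)
```

Terms that appear often in the query history receive higher scores, and a term never seen in the history still receives a finite score because of the smoothing.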
Optionally, the above processor may further execute program code for: and ordering the second key information according to the order of the attention degree parameters from high to low to obtain a local ordering result.
Optionally, the second key information includes a plurality of product words and a plurality of product modifier words, and the processor may further execute program code for: before sorting the second key information in order of the attention degree parameters from high to low to obtain a local ordering result, retaining the product word with the largest attention degree parameter among the plurality of product words and deleting the other product words; or retaining the product modifier word with the largest attention degree parameter among the plurality of product modifier words and deleting the other product modifier words.
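The pruning step above can be sketched as follows, here applied to the product words; the same call could be applied to the product modifier words for the other alternative. The helper name `keep_top` and the attention values are hypothetical and chosen only for the example.

```python
def keep_top(words, attention):
    """Keep only the word with the largest attention degree parameter."""
    if not words:
        return []
    return [max(words, key=lambda w: attention[w])]


# Hypothetical attention degree parameters.
attention = {"mouse": 4.0, "mice": 1.5, "trackball": 0.5, "wireless": 3.0, "silent": 1.0}
product_words = ["mice", "mouse", "trackball"]
modifiers = ["silent", "wireless"]

# Prune the product words first, then sort what remains from high to low.
pruned_products = keep_top(product_words, attention)
remaining = pruned_products + modifiers
local_result = sorted(remaining, key=lambda w: attention[w], reverse=True)
print(local_result)
```

Pruning before sorting keeps near-synonymous product words from repeating in the reconstructed title.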
Optionally, the preset reconstruction rule includes: arranging in a first order, the first order being: brand words, model words, quantity words, specification words, product modifier words, applicable object words, and the local ordering result; or arranging in a second order, the second order being: quantity words, specification words, product modifier words, model words, applicable object words, and the local ordering result.
Optionally, the above processor may further execute program code for: detecting whether the key information of the product includes brand information; and if the key information of the product does not include brand information, arranging the first key information and the local ordering result according to the second order.
Optionally, the above processor may further execute program code for: acquiring confidence level for brand information; if the confidence coefficient is larger than a preset confidence coefficient threshold value, the first key information and the local ordering result are arranged according to a first sequence; and if the confidence coefficient is smaller than or equal to the preset confidence coefficient threshold value, the first key information and the local ordering result are arranged according to the second sequence.
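The selection between the first order and the second order can be sketched as below. The category keys, the function names `choose_order` and `arrange`, the threshold value 0.8, and the sample data are assumptions for illustration; only the two category sequences follow the orders listed above.

```python
FIRST_ORDER = ("brand", "model", "quantity", "specification",
               "modifier", "applicable_object", "local")
SECOND_ORDER = ("quantity", "specification", "modifier",
                "model", "applicable_object", "local")


def choose_order(brand, confidence, threshold=0.8):
    """Pick the first order only for a brand recognized with high confidence."""
    if brand is None:
        return SECOND_ORDER
    return FIRST_ORDER if confidence > threshold else SECOND_ORDER


def arrange(slots, local_result, order):
    """Concatenate first key information and the local ordering result."""
    parts = []
    for category in order:
        if category == "local":
            parts.extend(local_result)  # inserted as one pre-sorted unit
        else:
            parts.extend(slots.get(category, []))
    return " ".join(parts)


# Hypothetical classified key information.
slots = {"brand": ["Acme"], "model": ["X200"], "quantity": ["2 pack"]}
local_result = ["mouse", "wireless"]

confident = arrange(slots, local_result, choose_order("Acme", 0.95))
doubtful = arrange(slots, local_result, choose_order("Acme", 0.50))
no_brand = arrange(slots, local_result, choose_order(None, 0.0))
print(confident)
print(doubtful)
```

Because the second order as listed contains no brand slot, a low-confidence brand word is left out of the title in this sketch; whether the original method drops or repositions it is not stated in the text.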
Optionally, the above processor may further execute program code for: and deleting key information used for representing the marketing word in the arrangement result before determining the arrangement result to be the reconstructed product information.
By adopting the embodiment of the invention, a scheme of a product information reconstruction method is provided. By extracting the key information of the original title and adjusting its position in the original content, the information value and understandability of the displayed title are improved, which in turn improves the user's search efficiency and brings higher benefits to the website. The scheme also has a great advantage over the seq2seq method: newly generated titles are reconstructed from phrases in the original title, so no distortion problem arises. Therefore, the embodiment of the application solves the problem in the prior art that product titles on shopping websites are disordered, which makes searching for products inefficient.
It will be appreciated by those skilled in the art that the configuration shown in Fig. 9 is only illustrative, and the computer terminal may be a smart phone (such as an Android phone, an iOS phone, etc.), a tablet computer, a palm computer, a mobile internet device (MID), a PAD, etc. Fig. 9 does not limit the structure of the electronic device. For example, the computer terminal A may also include more or fewer components (such as a network interface, a display device, etc.) than shown in Fig. 9, or have a different configuration from that shown in Fig. 9.
Those of ordinary skill in the art will appreciate that all or part of the steps in the various methods of the above embodiments may be implemented by a program instructing the hardware associated with a terminal device. The program may be stored in a computer-readable storage medium, and the storage medium may include: a flash disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, an optical disk, and the like.
Example 7
The embodiment of the invention also provides a storage medium. Optionally, in this embodiment, the storage medium may be used to store the program code executed by the method for reconstructing product information provided in the first embodiment.
Optionally, in this embodiment, the storage medium may be located in any one of the computer terminals in a computer terminal group in the computer network, or in any one of the mobile terminals in a mobile terminal group.
Optionally, in this embodiment, the storage medium is configured to store program code for performing the following steps: acquiring original content of product information in a preset platform; identifying key information from the original content, wherein the key information includes at least one term for characterizing a feature of the product; and obtaining the reconstructed product information at least by adjusting the order of the key information in the original content.
The foregoing embodiment numbers of the present invention are merely for the purpose of description, and do not represent the advantages or disadvantages of the embodiments.
In the foregoing embodiments of the present invention, each embodiment is described with its own emphasis; for a portion that is not described in detail in one embodiment, reference may be made to the related descriptions of other embodiments.
In the several embodiments provided in the present application, it should be understood that the disclosed technical content may be implemented in other manners. The above-described embodiments of the apparatus are merely exemplary. For example, the division of the units is merely a division by logical function and may be implemented in another manner: multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the coupling, direct coupling, or communication connection shown or discussed between parts may be through some interfaces, units, or modules, and may be in electrical or other forms.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present invention may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention, in essence or in whole or in part, may be embodied in the form of a software product stored in a storage medium, including instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, read-only memory (ROM), random access memory (RAM), a removable hard disk, a magnetic disk, an optical disk, or other media capable of storing program code.
The foregoing is merely a preferred embodiment of the present invention and it should be noted that modifications and adaptations to those skilled in the art may be made without departing from the principles of the present invention, which are intended to be comprehended within the scope of the present invention.

Claims (12)

1. A method of reconstructing product information, comprising:
acquiring original content of product information in a preset platform;
identifying key information from the original content, wherein the key information comprises at least one word for characterizing a feature of a product;
obtaining reconstructed product information at least by adjusting the sequence of the key information in the original content;
wherein the obtaining the reconstructed product information at least by adjusting the order of the key information in the original content includes: classifying the key information to obtain first key information and second key information other than the first key information, wherein the first key information comprises words describing unique attributes of the product and marketing words, and the second key information comprises a plurality of product words and a plurality of product modifier words; determining an attention degree parameter of the target object for the second key information; sorting the second key information in order of the attention degree parameter from high to low to obtain a local ordering result; arranging the local ordering result and the first key information according to a preset reconstruction rule; and determining the arrangement result to be the reconstructed product information.
2. The method of claim 1, wherein the first key information comprises at least one of: brand, model, quantity, specification, marketing, and applicable object words.
3. The method of claim 1, wherein identifying key information from the original content comprises:
carrying out named entity identification on the original content according to the entity characteristics of the product to obtain entity characteristic information;
dividing the original content according to semantics to obtain semantic information;
combining the entity characteristic information and the semantic information to obtain combined information;
performing at least one preset process on the combined information to obtain the key information, wherein the preset process comprises: fusion, verification, and disambiguation processing.
4. The method of claim 1, wherein determining a degree of interest parameter of a target object for the second key information comprises:
acquiring a degree of interest model, wherein the degree of interest model is obtained based on query history training of the target object;
and determining a degree of attention parameter of the target object to the second key information based on the degree of attention model.
5. The method of claim 4, wherein the attention model is a language model, and determining the attention parameter of the target object to the second key information based on the attention model comprises:
inputting the second key information into the language model to obtain scoring of the second key information by the language model;
and determining the score to be the attention degree parameter of the second key information.
6. The method of claim 1, wherein prior to ranking the second key information in order of the attention parameter from high to low, the method further comprises:
retaining the product word with the largest attention degree parameter among the plurality of product words, and deleting the other product words; or
retaining the product modifier word with the largest attention degree parameter among the plurality of product modifier words, and deleting the other product modifier words.
7. The method of claim 1, wherein the preset reconstruction rule comprises:
arranging in a first order, the first order being: brand words, model words, quantity words, specification words, product modifier words, applicable object words, and the local ordering result; or
arranging in a second order, the second order being: quantity words, specification words, product modifier words, model words, applicable object words, and the local ordering result.
8. The method of claim 7, wherein arranging the partial ordering result and the first key information according to the preset reconstruction rule comprises:
detecting whether the key information of the product comprises brand information or not;
and if the key information of the product does not comprise brand information, arranging the first key information and the local ordering result according to the second order.
9. The method of claim 8, wherein if the key information of the product includes brand information, the method further comprises:
acquiring a confidence level for the brand information;
if the confidence coefficient is larger than a preset confidence coefficient threshold value, the first key information and the local ordering result are arranged according to the first sequence;
and if the confidence coefficient is smaller than or equal to a preset confidence coefficient threshold value, the first key information and the local ordering result are arranged according to the second sequence.
10. The method of claim 1, further comprising, before determining the arrangement result to be the reconstructed product information: deleting key information used for representing marketing words in the arrangement result.
11. A method of reconstructing product information, comprising:
displaying original content of product information describing a product;
displaying key information identified in the original content, wherein the key information comprises at least one word and sentence for representing the characteristics of the product;
displaying the reconstructed product information, wherein the reconstructed product information is that the display sequence of the key information in the original content is adjusted;
wherein the method further comprises: before the reconstructed product information is displayed, classifying the key information to obtain first key information and second key information other than the first key information, wherein the first key information comprises words describing unique attributes of the product and marketing words, and the second key information comprises a plurality of product words and a plurality of product modifier words; determining an attention degree parameter of the target object for the second key information; sorting the second key information in order of the attention degree parameter from high to low to obtain a local ordering result; arranging the local ordering result and the first key information according to a preset reconstruction rule; and determining the arrangement result to be the reconstructed product information.
12. A device for reconstructing product information, comprising:
the acquisition module is used for acquiring the original content of the product information in the preset platform;
an identification module for identifying key information from the original content, wherein the key information comprises at least one word for characterizing a feature describing a product;
the adjusting module is used for obtaining the reconstructed product information at least by adjusting the sequence of the key information in the original content;
the adjusting module is configured to obtain the reconstructed product information at least by adjusting the sequence of the key information in the original content through the following steps:
classifying the key information to obtain first key information and second key information other than the first key information, wherein the first key information comprises words describing unique attributes of the product and marketing words, and the second key information comprises a plurality of product words and a plurality of product modifier words; determining an attention degree parameter of the target object for the second key information; sorting the second key information in order of the attention degree parameter from high to low to obtain a local ordering result; arranging the local ordering result and the first key information according to a preset reconstruction rule; and determining the arrangement result to be the reconstructed product information.
CN201910219171.6A 2019-03-21 2019-03-21 Product information reconstruction method and device Active CN111723566B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910219171.6A CN111723566B (en) 2019-03-21 2019-03-21 Product information reconstruction method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910219171.6A CN111723566B (en) 2019-03-21 2019-03-21 Product information reconstruction method and device

Publications (2)

Publication Number Publication Date
CN111723566A CN111723566A (en) 2020-09-29
CN111723566B true CN111723566B (en) 2024-01-23

Family

ID=72562816

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910219171.6A Active CN111723566B (en) 2019-03-21 2019-03-21 Product information reconstruction method and device

Country Status (1)

Country Link
CN (1) CN111723566B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113033190A (en) * 2021-04-19 2021-06-25 北京有竹居网络技术有限公司 Subtitle generating method, device, medium and electronic equipment

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102193936A (en) * 2010-03-09 2011-09-21 阿里巴巴集团控股有限公司 Data classification method and device
CN103310343A (en) * 2012-03-15 2013-09-18 阿里巴巴集团控股有限公司 Commodity information issuing method and device
CN106708813A (en) * 2015-07-14 2017-05-24 阿里巴巴集团控股有限公司 Title processing method and equipment
WO2018029852A1 (en) * 2016-08-12 2018-02-15 楽天株式会社 Information processing device, information processing method, program, and storage medium
CN109190123A (en) * 2018-09-14 2019-01-11 北京字节跳动网络技术有限公司 Method and apparatus for output information

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7716056B2 (en) * 2004-09-27 2010-05-11 Robert Bosch Corporation Method and system for interactive conversational dialogue for cognitively overloaded device users
CN110147483B (en) * 2017-09-12 2023-09-29 阿里巴巴集团控股有限公司 Title reconstruction method and device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
面向商务信息抽取的产品命名实体识别研究;刘非凡;赵军;吕碧波;徐波;于浩;夏迎炬;;中文信息学报(01);全文 *

Also Published As

Publication number Publication date
CN111723566A (en) 2020-09-29

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant