CN114663164A - E-commerce site popularization and configuration method and device, equipment, medium and product thereof - Google Patents

E-commerce site popularization and configuration method and device, equipment, medium and product thereof

Publication number
CN114663164A
Authority
CN
China
Prior art keywords
tail
search
words
long
word
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210383174.5A
Other languages
Chinese (zh)
Inventor
方兵
叶朝鹏
王�锋
郭东波
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Huanju Shidai Information Technology Co Ltd
Original Assignee
Guangzhou Huanju Shidai Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Huanju Shidai Information Technology Co Ltd filed Critical Guangzhou Huanju Shidai Information Technology Co Ltd
Priority to CN202210383174.5A priority Critical patent/CN114663164A/en
Publication of CN114663164A publication Critical patent/CN114663164A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G06Q30/0601 Electronic shopping [e-shopping]
    • G06Q30/0623 Item investigation
    • G06Q30/0625 Directed, with specific intent or strategy

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Finance (AREA)
  • Accounting & Taxation (AREA)
  • Physics & Mathematics (AREA)
  • Development Economics (AREA)
  • General Physics & Mathematics (AREA)
  • Strategic Management (AREA)
  • Computational Linguistics (AREA)
  • Economics (AREA)
  • Data Mining & Analysis (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Artificial Intelligence (AREA)
  • General Business, Economics & Management (AREA)
  • Marketing (AREA)
  • General Engineering & Computer Science (AREA)
  • Game Theory and Decision Science (AREA)
  • Databases & Information Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application discloses an e-commerce site promotion configuration method, and a corresponding device, equipment, medium, and product. The method comprises the following steps: constructing a search text from the product words and attribute words of the commodity title in a commodity display page; acquiring statistical data matching the search text, the statistical data comprising a plurality of candidate long-tail words and their historical search statistical indexes; determining, according to the statistical indexes, a single candidate long-tail word that semantically matches the commodity title as the target long-tail word; and configuring the target long-tail word as the page title of the commodity display page. The application can determine long-tail keywords based on the commodity title of each commodity display page and automatically configure them as that page's title, thereby achieving search engine keyword optimization. By exploiting the long-tail effect, the many long-tail keywords spread across individual commodity display pages help those pages rank higher in search results, which improves the ability of the whole independent site of the e-commerce platform to attract search engine traffic.

Description

E-commerce site popularization and configuration method and device, equipment, medium and product thereof
Technical Field
The present application relates to the field of e-commerce information technologies, and in particular, to an e-commerce site popularization and configuration method, and a corresponding apparatus, computer device, computer-readable storage medium, and computer program product.
Background
With the rapid development of cross-border e-commerce and of e-commerce websites built on the independent-site model, it is increasingly difficult for goods to be found through a search engine and ranked near the top of its results, so Search Engine Optimization (SEO) for independent e-commerce sites is increasingly important.
The title of a web page is a highly condensed summary of the content the page provides, and is one of the main bases for search engine retrieval. A well-designed page title can greatly improve the page's ranking in search results. For an e-commerce website, designing a multi-dimensional, well-founded SEO page title based on search volume, degree of competition, commodity content, and the like can greatly improve the probability that a commodity is retrieved, and in turn improve exposure and conversion rates.
At present, SEO page titles for commodity pages are mainly produced by the seller's staff, who set them using various query tools and their own experience; this is inefficient, and the results are hard to guarantee.
Disclosure of Invention
A primary object of the present application is to solve at least one of the above problems and provide an e-commerce site promotion configuration method, and a corresponding apparatus, computer device, computer readable storage medium, and computer program product.
To achieve the objects of the application, the following technical solutions are adopted:
An e-commerce site promotion configuration method, provided in accordance with one of the objects of the application, comprises the following steps:
constructing a search text from the product words and attribute words of the commodity title in a commodity display page;
acquiring statistical data matching the search text, the statistical data comprising a plurality of candidate long-tail words and their historical search statistical indexes;
determining, according to the statistical indexes, a single candidate long-tail word that semantically matches the commodity title as the target long-tail word;
and configuring the target long-tail word as the page title of the commodity display page.
In a further embodiment, constructing the search text from the product words and attribute words of the commodity title in the commodity display page comprises the following steps:
acquiring the commodity title entered in the commodity display page;
performing word segmentation and part-of-speech recognition on the commodity title to obtain a set of segmented words, the set comprising segmented words that are product words and segmented words that are attribute words;
performing keyword extraction on the set of segmented words with reference to the commodity title, and determining a search weight for each segmented word, the search weight representing the potential search value of that segmented word;
and selecting the single product word with the highest search weight together with a preset number of attribute words, and splicing them to form the search text.
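By way of illustration only, the selection and splicing steps above may be sketched as follows in Python. The `Token` type, the role labels, the numeric weights, the choice of the highest-weighted attribute words, and the attribute-before-product splicing order are all assumptions made for this sketch; the application does not specify them, and segmentation and weighting are taken as already done.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    text: str
    role: str      # assumed labels: "product" or "attribute"
    weight: float  # assumed search weight produced by keyword extraction


def build_search_text(tokens, max_attributes=2):
    """Splice the top-weighted product word with a preset number of attribute words."""
    products = [t for t in tokens if t.role == "product"]
    if not products:
        raise ValueError("no product word found in the commodity title")
    # Keep only the single product word with the highest search weight.
    top_product = max(products, key=lambda t: t.weight)
    # Keep the preset number of attribute words, highest weight first (an assumption).
    attributes = sorted(
        (t for t in tokens if t.role == "attribute"),
        key=lambda t: t.weight,
        reverse=True,
    )[:max_attributes]
    # Assumed splicing rule: attribute words first, product word last.
    return " ".join([a.text for a in attributes] + [top_product.text])


tokens = [
    Token("dress", "product", 0.9),
    Token("skirt", "product", 0.4),
    Token("summer", "attribute", 0.7),
    Token("floral", "attribute", 0.8),
    Token("women", "attribute", 0.3),
]
search_text = build_search_text(tokens)
```

With these hypothetical tokens, `search_text` is "floral summer dress": the lower-weighted product word and the third attribute word are dropped.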
In a further embodiment, acquiring the statistical data matching the search text, the statistical data comprising a plurality of candidate long-tail words and their historical search statistical indexes, comprises the following step:
calling a search interface to obtain statistical data for candidate long-tail words matching the search text, wherein the statistical data comprises keywords and their statistical indexes derived from the historical search behavior data of a large number of users, the statistical indexes comprise the average search volume of each keyword and the degree of competition for the keyword among different websites, and the keywords are the long-tail words.
In a further embodiment, determining, according to the statistical indexes, a single candidate long-tail word that semantically matches the commodity title as the target long-tail word comprises the following steps:
performing data cleaning on the statistical data according to the statistical indexes to obtain valid candidate long-tail words;
quantifying the semantic similarity between each valid candidate long-tail word and the commodity title;
using the average search volume as the weight applied to the semantic similarity, and calculating a composite score for each valid candidate long-tail word;
and determining the valid candidate long-tail word with the highest composite score as the target long-tail word.
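As a minimal sketch of the scoring and selection steps above, the following Python assumes that the composite score is simply the product of average search volume and semantic similarity. The text says only that the average search volume acts as the weight of the similarity, so the exact formula, the tuple layout, and the sample figures are assumptions.

```python
def composite_score(avg_searches, similarity):
    """Assumed formula: average search volume weights the semantic similarity."""
    return avg_searches * similarity


def pick_target_long_tail_word(candidates):
    """candidates: iterable of (phrase, avg_searches, similarity) tuples."""
    best = max(candidates, key=lambda c: composite_score(c[1], c[2]))
    return best[0]


# Hypothetical candidates that have already passed data cleaning.
candidates = [
    ("floral summer dress for women", 1200, 0.82),  # score 984.0
    ("cheap dress online", 5000, 0.15),             # score 750.0
    ("summer floral maxi dress", 900, 0.91),        # score 819.0
]
target = pick_target_long_tail_word(candidates)
```

Here the first phrase is selected: it is neither the most searched nor the most similar, but it has the highest weighted score.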
In a further embodiment, performing data cleaning on the statistical data according to the statistical indexes to obtain valid candidate long-tail words comprises any one or more of the following steps:
deleting candidate long-tail words in the statistical data whose word count is less than a preset value;
deleting candidate long-tail words in the statistical data that fall within a preset time range;
deleting candidate long-tail words in the statistical data whose degree of competition is higher than a preset level;
and deleting candidate long-tail words in the statistical data whose average search volume is higher than a preset threshold.
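The cleaning rules above can be pictured as a chain of filters. The Python sketch below covers the word-count, competition, and search-volume rules; the time-range rule is left out because the text does not say which timestamp it applies to. The `Candidate` fields, the three competition levels, and all default thresholds are hypothetical.

```python
from dataclasses import dataclass

# Assumed ordering of competition levels; the text only mentions a "preset level".
COMPETITION_RANK = {"low": 0, "medium": 1, "high": 2}


@dataclass(frozen=True)
class Candidate:
    phrase: str
    avg_searches: int
    competition: str  # "low", "medium" or "high" (assumed labels)


def clean_candidates(candidates, min_words=3, max_competition="medium",
                     max_avg_searches=10_000):
    """Drop candidates that violate any of the three illustrated rules."""
    limit = COMPETITION_RANK[max_competition]
    kept = []
    for c in candidates:
        if len(c.phrase.split()) < min_words:
            continue  # too few words to count as a long-tail word
        if COMPETITION_RANK[c.competition] > limit:
            continue  # competition above the preset level
        if c.avg_searches > max_avg_searches:
            continue  # average search volume above the preset threshold
        kept.append(c)
    return kept


raw = [
    Candidate("dress", 50_000, "high"),
    Candidate("floral summer dress", 1_200, "low"),
    Candidate("summer dress women", 800, "high"),
    Candidate("long floral summer dress", 20_000, "low"),
]
valid = clean_candidates(raw)
```

Only "floral summer dress" survives in this example; each of the other three phrases trips exactly one rule.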
In a further embodiment, quantifying the semantic similarity between each valid candidate long-tail word and the commodity title comprises the following steps:
encoding the valid candidate long-tail words and the commodity title to obtain their embedding vectors;
extracting high-level semantic information from the embedding vectors of the valid candidate long-tail words and of the commodity title using a pre-trained text feature extraction model, to obtain their respective semantic feature vectors;
and calculating, using a preset data distance algorithm, the data distance between the semantic feature vector of the commodity title and the semantic feature vector of each valid candidate long-tail word as the corresponding semantic similarity.
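The final distance step may be illustrated with cosine similarity, which is one common choice; the text only calls for "a preset data distance algorithm", so cosine is an assumption here. The short vectors below are placeholders standing in for the semantic feature vectors that a pre-trained text feature extraction model would produce.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarities_to_title(title_vec, candidate_vecs):
    """Map each candidate phrase to its similarity with the commodity title vector."""
    return {
        phrase: cosine_similarity(title_vec, vec)
        for phrase, vec in candidate_vecs.items()
    }


# Placeholder vectors; real ones would come from the feature extraction model.
title_vec = [1.0, 1.0, 0.0]
candidate_vecs = {
    "floral summer dress": [1.0, 1.0, 0.0],
    "winter coat": [0.0, 0.0, 1.0],
}
scores = similarities_to_title(title_vec, candidate_vecs)
```

A candidate whose vector points in the same direction as the title scores close to 1, while an orthogonal one scores 0.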
In a further embodiment, configuring the target long-tail word as the page title of the commodity display page comprises the following steps:
displaying a search optimization page corresponding to the commodity display page, so as to display a page title input box;
configuring the target long-tail word as the content data of the page title input box;
and publishing the commodity display page and the search optimization page in response to a submission instruction from the user.
The e-commerce site promotion configuration device comprises a search construction module, an index acquisition module, a target determination module, and a search optimization module. The search construction module is used for constructing a search text from the product words and attribute words of the commodity title in a commodity display page; the index acquisition module is used for acquiring statistical data matching the search text, the statistical data comprising a plurality of candidate long-tail words and their historical search statistical indexes; the target determination module is used for determining, according to the statistical indexes, a single candidate long-tail word that semantically matches the commodity title as the target long-tail word; and the search optimization module is used for configuring the target long-tail word as the page title of the commodity display page.
In a further embodiment, the search construction module comprises: a title acquisition unit, used for acquiring the commodity title entered in the commodity display page; a word segmentation recognition unit, used for performing word segmentation and part-of-speech recognition on the commodity title to obtain a set of segmented words, the set comprising segmented words that are product words and segmented words that are attribute words; a weight quantification unit, used for performing keyword extraction on the set of segmented words with reference to the commodity title and determining a search weight for each segmented word, the search weight representing the potential search value of that segmented word; and a search expression unit, used for selecting the single product word with the highest search weight together with a preset number of attribute words, and splicing them to form the search text.
In a further embodiment, the index acquisition module is used for calling a search interface to obtain statistical data for candidate long-tail words matching the search text, wherein the statistical data comprises keywords and their statistical indexes derived from the historical search behavior data of a large number of users, the statistical indexes comprise the average search volume of each keyword and the degree of competition for the keyword among different websites, and the keywords are the long-tail words.
In a further embodiment, the target determination module comprises: a data cleaning unit, used for performing data cleaning on the statistical data according to the statistical indexes to obtain valid candidate long-tail words; a similarity quantification unit, used for quantifying the semantic similarity between each valid candidate long-tail word and the commodity title; a score quantification unit, used for using the average search volume as the weight applied to the semantic similarity and calculating a composite score for each valid candidate long-tail word; and a target selection unit, used for determining the valid candidate long-tail word with the highest composite score as the target long-tail word.
In a further embodiment, the data cleaning unit comprises any one or more of the following submodules: a word count cleaning submodule, used for deleting candidate long-tail words in the statistical data whose word count is less than a preset value; a time cleaning submodule, used for deleting candidate long-tail words in the statistical data that fall within a preset time range; a competition cleaning submodule, used for deleting candidate long-tail words in the statistical data whose degree of competition is higher than a preset level; and a search volume cleaning submodule, used for deleting candidate long-tail words in the statistical data whose average search volume is higher than a preset threshold.
In a further embodiment, the similarity quantification unit comprises: a vector encoding subunit, used for encoding the valid candidate long-tail words and the commodity title to obtain their embedding vectors; a semantic extraction subunit, used for extracting high-level semantic information from the embedding vectors of the valid candidate long-tail words and of the commodity title using a pre-trained text feature extraction model, to obtain their respective semantic feature vectors; and a similarity calculation subunit, used for calculating, using a preset data distance algorithm, the data distance between the semantic feature vector of the commodity title and the semantic feature vector of each valid candidate long-tail word as the corresponding semantic similarity.
In a further embodiment, the search optimization module comprises: a page display unit, used for displaying a search optimization page corresponding to the commodity display page, so as to display a page title input box; an automatic editing unit, used for configuring the target long-tail word as the content data of the page title input box; and an optimization publishing unit, used for publishing the commodity display page and the search optimization page in response to a submission instruction from the user.
A computer device adapted for one of the purposes of the present application includes a central processing unit and a memory, the central processing unit being configured to invoke execution of a computer program stored in the memory to perform the steps of the e-commerce site promotion configuration method described herein.
A computer-readable storage medium, provided in accordance with another object of the application, stores, in the form of computer-readable instructions, a computer program implementing the e-commerce site promotion configuration method; when called and run by a computer, the computer program executes the steps included in the method.
A computer program product, provided in accordance with another object of the application, comprises computer programs/instructions which, when executed by a processor, implement the steps of the method described in any embodiment of the application.
Compared with the prior art, the technical scheme of the application at least comprises the following technical advantages:
First, a search text is constructed from the product words and attribute words in the commodity title used by a commodity display page; candidate long-tail words and their statistical indexes are obtained by searching with the search text; and the best candidate long-tail word is then determined according to the statistical indexes and used as the target long-tail word, which becomes the page title of the commodity display page. Search engine optimization is thus performed for commodity display pages in a personalized, commodity-by-commodity manner. Commodity display pages serving an independent e-commerce site can be produced without manual involvement, which improves the efficiency of configuring the information required by a large number of commodity display pages.
Second, the long-tail words in the application are, as the name suggests, keywords exhibiting a long-tail effect, and are also called long-tail keywords. Although their search volume is relatively small, long-tail keywords have the advantage of being highly targeted. Where an e-commerce platform holds a large number of commodities, each with its own commodity display page, configuring a preferred long-tail keyword matching the commodity title as the page title of each commodity display page helps raise the probability that each page ranks near the top of search engine results, and thereby increases the overall traffic of the whole independent site of the e-commerce platform.
In addition, when the target long-tail word is determined for the page title of a commodity display page, both the semantic match between each candidate long-tail word and the commodity title and the candidate's historical search statistical indexes are considered. The statistical indexes are popularity information accumulated over the history of the long-tail word being searched, which helps in selecting among the candidates. With the semantics of the commodity title used as a reference, the long-tail word finally determined is not only highly consistent with the commodity title semantically but also more effective.
Drawings
The foregoing and/or additional aspects and advantages of the present application will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
FIG. 1 is a schematic flowchart of an exemplary embodiment of the e-commerce site promotion configuration method of the present application;
FIG. 2 is a schematic flowchart of the process of constructing a search text from a commodity title in an embodiment of the present application;
FIG. 3 is an illustrative graphical user interface of the present application, in which a commodity display page and a search engine optimization page are displayed on the two sides of the interface;
FIG. 4 is a schematic flowchart of the process of determining a target long-tail word in an embodiment of the present application;
FIG. 5 is a schematic flowchart of the process of data cleaning of the statistical data in an embodiment of the present application;
FIG. 6 is a schematic flowchart of the process of calculating the semantic similarity between a commodity title and the valid candidate long-tail words in an embodiment of the present application;
FIG. 7 is a schematic flowchart of the process of configuring a page title to complete publication of a commodity display page in an embodiment of the present application;
FIG. 8 is a schematic block diagram of the e-commerce site promotion configuration device of the present application;
FIG. 9 is a schematic structural diagram of a computer device used in the present application.
Detailed Description
Reference will now be made in detail to embodiments of the present application, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the drawings are exemplary only for the purpose of explaining the present application and are not to be construed as limiting the present application.
As used herein, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. It will be understood that when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element or intervening elements may also be present. Further, "connected" or "coupled" as used herein may include wirelessly connected or wirelessly coupled. As used herein, the term "and/or" includes any one of, and all combinations of, one or more of the associated listed items.
It will be understood by those within the art that, unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the prior art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
As will be appreciated by those skilled in the art, "client," "terminal," and "terminal device" as used herein include both devices that are wireless signal receivers, which are devices having only wireless signal receivers without transmit capability, and devices that are receive and transmit hardware, which have receive and transmit hardware capable of two-way communication over a two-way communication link. Such a device may include: cellular or other communication devices such as personal computers, tablets, etc. having single or multi-line displays or cellular or other communication devices without multi-line displays; PCS (Personal Communications Service), which may combine voice, data processing, facsimile and/or data communication capabilities; a PDA (Personal Digital Assistant), which may include a radio frequency receiver, a pager, internet/intranet access, a web browser, a notepad, a calendar and/or a GPS (Global Positioning System) receiver; a conventional laptop and/or palmtop computer or other device having and/or including a radio frequency receiver. As used herein, a "client," "terminal device" can be portable, transportable, installed in a vehicle (aeronautical, maritime, and/or land-based), or situated and/or configured to operate locally and/or in a distributed fashion at any other location(s) on earth and/or in space. The "client", "terminal Device" used herein may also be a communication terminal, a web terminal, a music/video playing terminal, such as a PDA, an MID (Mobile Internet Device) and/or a Mobile phone with music/video playing function, and may also be a smart tv, a set-top box, and the like.
The hardware referred to by names such as "server", "client", and "service node" is essentially an electronic device with the capabilities of a personal computer: a hardware device having the necessary components described by the von Neumann architecture, such as a central processing unit (including an arithmetic unit and a controller), a memory, an input device, and an output device. A computer program is stored in the memory; the central processing unit loads the program from external memory into internal memory, runs it, executes its instructions, and interacts with the input and output devices to accomplish specific functions.
It should be noted that the concept of "server" as referred to in this application can be extended to the case of a server cluster. According to the network deployment principle understood by those skilled in the art, the servers should be logically divided, and in physical space, the servers may be independent from each other but can be called through an interface, or may be integrated into one physical computer or a set of computer clusters. Those skilled in the art will appreciate this variation and should not be so limited as to restrict the implementation of the network deployment of the present application.
Unless expressly specified otherwise, one or more technical features of the present application may be deployed on a server and accessed by a client remotely calling an online service interface provided by the server, or may be deployed and run directly on the client.
Unless expressly stated otherwise, a neural network model referred to, or possibly referred to, in the application may be deployed on a remote server and called remotely from a client, or may be deployed on a client with sufficient device capability and called directly.
Unless expressly stated otherwise, the various data referred to in the present application may be stored remotely on a server or locally on a terminal device, as long as the data can be called by the technical solution of the present application.
Those skilled in the art will understand that, although the various methods of the present application are described on the basis of the same concept and are therefore common to each other, they may be performed independently unless otherwise specified. Likewise, the embodiments disclosed in the present application are all proposed on the basis of the same inventive concept; identically expressed concepts, and concepts whose expression differs only because it has been adapted for convenience, should be understood in the same way.
Unless a mutually exclusive relationship between the relevant technical features is expressly stated, the embodiments disclosed herein may be flexibly constructed by cross-combining the relevant technical features of the embodiments, as long as the combination does not depart from the inventive spirit of the present application and can meet the needs of, or remedy the deficiencies in, the prior art. Those skilled in the art will appreciate such variations.
The e-commerce site promotion configuration method may be programmed as a computer program product and deployed to run on a client or a server. In e-commerce platform application scenarios, including live-streaming e-commerce, it is generally deployed on a server, so that once the computer program product is running, the method can be executed by accessing its open interface and interacting with its process through a graphical user interface.
Referring to FIG. 1, in an exemplary embodiment, the e-commerce site promotion configuration method of the present application comprises the following steps:
Step S1100, constructing a search text from the product words and attribute words of the commodity title in the commodity display page:
When an independent site of an e-commerce platform needs to publish a commodity, commodity information must be entered. This information generally includes, but is not limited to, a commodity title, a commodity summary, and a commodity description; a commodity display page for the commodity is then generated from the commodity information for end users to request and browse. To improve the probability that the commodity is found by search, search engine optimization parameters may be configured at the same time, so that the corresponding commodity display page is more readily indexed and displayed by search engines.
Once the content of the commodity title has been entered in the commodity title input box of the commodity display page, the commodity title can be obtained. Any of various conventional word segmentation methods can then be used to segment the commodity title and determine the part of speech of each segmented word, yielding a set of segmented words. The segmented words in the set are usually partly nouns and partly adjectives and/or adverbs. The nouns usually include product words, which indicate what the commodity is, while the adjectives and/or adverbs usually describe some attribute of the commodity and are therefore called attribute words. The product words and attribute words in the commodity title are thus determined.
To obtain candidate long-tail words from the commodity title, each product word may be spliced with the attribute words in arbitrary combinations to form one or more search texts, and the search texts may be submitted to a search interface to retrieve candidate long-tail words. When constructing the search texts, the product words and attribute words may be filtered and selected so as to reduce the number of search texts and speed up processing. Likewise, a single splicing rule may be set so that the product words and attribute words are always spliced in the same way.
Step S1200, obtaining statistical data matched with the search text, wherein the statistical data comprises a plurality of candidate long-tail words and their historical search statistical indexes:
According to the search text, the corresponding candidate long-tail words and the statistical indexes derived from their search history can be obtained in any of several ways.
In one mode, the candidate long-tail words and their corresponding statistical indexes can be obtained directly through a search interface provided by a search engine. When called, the interface retrieves from a preset long-tail word ranking table the long-tail words that semantically match the search text, returns them as candidate long-tail words, and provides their statistical indexes. Generally, this search engine is the target search engine in which the commodity display page is expected to be found.
In another mode, the long-tail word ranking table can be compiled by the e-commerce platform from its own statistics or by other conventional data retrieval means, and the candidate long-tail words and their corresponding statistical indexes can then be obtained from this self-built long-tail word ranking table by calling a search interface provided by the e-commerce platform.
A long-tail word, i.e. a long-tail keyword, generally refers to a search string that contains several words and has a small average search volume per unit time. Long-tail words exhibit the long-tail effect: although their average search volume is small, they hit results precisely. When a search engine user enters a long-tail word, a page that uses that long-tail word as its page title can be ranked near the top. Because each independent site of the e-commerce platform often has a large number of commodity display pages, configuring long-tail words as the page titles required for search engine optimization across these pages turns the per-page advantage of being found precisely and ranked near the top into a scale effect, which raises the user traffic of the whole independent site and thereby promotes it.
The statistical indexes generally include the average search volume and the competition degree of each long-tail word. Generally speaking, each search engine analyzes users' search expressions on its own and counts both how often users use the keywords in those expressions and how often website pages use the same keywords. On the user side, the user usage frequency can be aggregated periodically, for example monthly, to give the average search volume of a keyword; on the website side, the page usage frequency can be aggregated periodically, for example monthly, to give the competition degree of the keyword among pages. The average search volume may be numerical data, and the competition degree may be graded data, for example options {1,2,3} representing high, medium and low, depending on how the respective search engine quantifies the related data. The average search volume and the competition degree of each keyword form the statistical indexes of that keyword. In other embodiments, the statistical indexes may further include other information, such as the usage time corresponding to the long-tail word. Each keyword, together with its average search volume and competition degree, is organized as mapping data to form a keyword ranking table. Keywords containing more than a preset number of words, for example more than two words, are treated as long-tail words, so the long-tail word ranking table can be constructed by screening the keyword ranking table.
Therefore, the long-tail word ranking table actually stores statistical data obtained by counting long-tail words corresponding to historical search behavior data of massive users, the statistical data comprises each long-tail word and various corresponding statistical indexes of the long-tail word, and the statistical indexes comprise average search quantity of the corresponding long-tail words and competitiveness of the long-tail words adopted by different websites.
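By way of a non-limiting illustration, the screening of a keyword ranking table into a long-tail word ranking table described above can be sketched in Python as follows. The `KeywordStat` record, the function name `build_long_tail_table`, the sample rows and the integer encoding of the competition degree are all hypothetical and chosen only for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KeywordStat:
    keyword: str
    avg_search_volume: float  # e.g. average monthly searches
    competition: int          # assumed encoding: larger value = more competitive


def build_long_tail_table(keyword_table, min_words=3):
    """Keep only keywords that contain at least `min_words` words.

    With min_words=3 this matches the "more than two words" example.
    """
    return [row for row in keyword_table if len(row.keyword.split()) >= min_words]


# Hypothetical keyword ranking table.
keyword_table = [
    KeywordStat("dress", 90000, 3),
    KeywordStat("summer dress", 40000, 3),
    KeywordStat("red floral summer dress", 700, 1),
    KeywordStat("long sleeve cotton dress", 1200, 2),
]

long_tail_table = build_long_tail_table(keyword_table)
for row in long_tail_table:
    print(row.keyword, row.avg_search_volume, row.competition)
```

Counting words by splitting on whitespace is an assumption that suits space-delimited languages; a title in another language would first need the word segmentation discussed elsewhere in this description.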
Following the same principle by which a search engine constructs its long-tail word ranking table, the e-commerce platform can also collect the corresponding data and construct the long-tail word ranking table itself, or directly retrieve the long-tail word ranking table published by the search engine and store it locally for later use.
When the long-tail words matching the search text are obtained from the long-tail word ranking table as candidate long-tail words, their corresponding statistical indexes are obtained along with them. The search text and the long-tail words can be matched by rules, including fuzzy matching or exact matching, or by semantics, and those skilled in the art can apply either flexibly according to the principle disclosed herein. The matched candidate long-tail words and their corresponding statistical indexes thus form a candidate long-tail word subset.
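As one possible illustration of rule-based fuzzy matching, the following Python sketch selects candidate long-tail words by word overlap with the search text. The overlap rule, the threshold `min_overlap`, the function name `match_candidates` and the sample table are assumptions made for this example only; exact or semantic matching could be substituted.

```python
def match_candidates(search_text, long_tail_table, min_overlap=0.5):
    """Return (long_tail_word, stats) pairs whose words cover enough of the search text.

    `long_tail_table` maps each long-tail word to its statistical indexes.
    """
    query = set(search_text.lower().split())
    matched = []
    for keyword, stats in long_tail_table.items():
        words = set(keyword.lower().split())
        # Fraction of the search-text words that also occur in the long-tail word.
        overlap = len(query & words) / len(query) if query else 0.0
        if overlap >= min_overlap:
            matched.append((keyword, stats))
    return matched


# Hypothetical long-tail word ranking table.
long_tail_table = {
    "red floral summer dress": {"avg_search": 700, "competition": 1},
    "long sleeve cotton dress": {"avg_search": 1200, "competition": 2},
    "wireless gaming mouse pad": {"avg_search": 900, "competition": 2},
}

matched = match_candidates("red cotton dress", long_tail_table)
for keyword, stats in matched:
    print(keyword, stats)
```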
Step S1300, determining, according to the statistical indexes, a single candidate long-tail word semantically matched with the commodity title as the target long-tail word:
Because each commodity display page needs only one page title, one candidate long-tail word must be selected as the target long-tail word from the candidate long-tail word subset determined in the previous step. To that end, the candidate long-tail words can be ranked by reference to their statistical indexes, including the average search volume and/or the competition degree, and to the degree of semantic match between each candidate long-tail word and the commodity title.
For example, the average search volume and the semantic similarity between the candidate long-tail word and the commodity title can be used as the primary and secondary sort fields respectively; after multi-field sorting in descending order, the candidate long-tail word ranked first is determined as the target long-tail word used as the page title.
As another example, the competition degree and the semantic similarity between the candidate long-tail word and the commodity title may be used as the primary and secondary sort fields respectively; after multi-field sorting, the candidate long-tail word with the lower competition degree and the highest similarity is determined as the target long-tail word used as the page title.
As a further example, the average search volume, the competition degree and the semantic similarity between the candidate long-tail word and the commodity title may be combined in a multi-field comprehensive ranking, and the candidate long-tail word that ranks first overall on these fields is determined as the target long-tail word used as the page title.
In addition, other modes, such as a mode of performing semantic judgment based on a deep learning model trained to a convergence state in advance, can be flexibly utilized to determine that the only candidate long-tail word which is highly similar to the commodity title and has a high average search amount and/or a low competitive degree is used as the target long-tail word.
Therefore, by means of the statistical indexes, only one target long-tail word can be determined for the commodity title in various modes, and the target long-tail word has the characteristics of being matched with the commodity title in a semantic mode and being associated with one or more of the statistical indexes, so that the optimization of the candidate long-tail words is achieved.
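The multi-field sorting variants described above can be sketched, purely for illustration, with Python's built-in `sorted`. The candidate records, field names and values are hypothetical, and the competition degree is assumed to be encoded so that a smaller number means less competition.

```python
# Hypothetical candidate long-tail word subset with statistical indexes and
# a precomputed semantic similarity to the commodity title.
candidates = [
    {"word": "red floral summer dress", "avg_search": 700, "competition": 1, "similarity": 0.82},
    {"word": "long sleeve cotton dress", "avg_search": 1200, "competition": 2, "similarity": 0.64},
    {"word": "red cotton summer dress", "avg_search": 700, "competition": 1, "similarity": 0.91},
]

# Variant 1: average search volume as primary field, similarity as secondary
# field, both in descending order.
by_volume = sorted(
    candidates, key=lambda c: (c["avg_search"], c["similarity"]), reverse=True
)

# Variant 2: lowest competition degree first, then highest similarity.
by_competition = sorted(
    candidates, key=lambda c: (c["competition"], -c["similarity"])
)

target_v1 = by_volume[0]["word"]
target_v2 = by_competition[0]["word"]
print(target_v1)
print(target_v2)
```

The two variants can select different target long-tail words from the same subset, which is why the choice of sort fields is left to the implementer.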
Step S1400, configuring the target long-tail word as the page title of the commodity display page:
After the target long-tail word is determined, it can be configured as the page title of the commodity display page. After the commodity display page is published, each search engine crawls and indexes the page according to its own implementation logic. A later search for the target long-tail word in the search engine can then hit the commodity display page more precisely, which achieves the purpose of page popularization.
Through the exemplary embodiment of the present application and a plurality of corresponding modified embodiments thereof, it can be seen that, compared with the prior art, the technical solution of the present application at least includes the following technical advantages:
First, the application constructs a search text from the product words and attribute words in the commodity title used by a commodity display page, retrieves candidate long-tail words and their statistical indexes with the search text, determines the best candidate long-tail word as the target long-tail word according to the statistical indexes, and uses the target long-tail word as the page title of the commodity display page. Search engine optimization of commodity display pages is thus performed in a per-commodity, personalized manner, which serves the production of commodity display pages for independent e-commerce sites, avoids manual involvement, and improves the efficiency of configuring the information required by a large number of commodity display pages.
Secondly, the long-tail words in the application are, as the name suggests, keywords with a long-tail effect, also called long-tail keywords. Although their search volume is relatively small, they are highly targeted. Where an e-commerce platform holds a large number of commodities and each commodity corresponds to one commodity display page, configuring a preferred long-tail keyword that matches the commodity title as the page title of each commodity display page helps raise the probability that each page ranks near the top of search engine results, and thereby raises the overall traffic of the whole independent site of the e-commerce platform.
In addition, when the target long-tail word is determined for the page title of the commodity display page, both the semantic match between the candidate long-tail word and the commodity title and the historical search statistical indexes of the candidate long-tail word are considered. The statistical indexes are popularity information accumulated over the history of the long-tail word being searched, which helps rank the candidate long-tail words. With the semantics of the commodity title as a reference, the long-tail word so determined is both highly consistent semantically with the commodity title and more effective in search.
Referring to fig. 2, in a further embodiment, the step S1100 of constructing a search text according to the product words and the attribute words of the product titles in the product display page includes the following steps:
Step S1110, acquiring the commodity title entered in the commodity display page:
As shown in Fig. 3, when the user edits the commodity display page, a search engine optimization area is displayed. The user enters the chosen commodity title in the commodity title input box on the left side of the figure, and the background obtains the text data of the commodity title.
Step S1120, performing word segmentation and part-of-speech recognition on the product title to obtain a word segmentation set composed of a plurality of word segmentations, wherein the word segmentation set includes both the word segmentations belonging to the product word and the word segmentations belonging to the attribute word:
To segment the commodity title, any conventional statistics-based word segmentation algorithm, such as N-Gram, may first be applied to the commodity title to obtain the corresponding word segmentation set. The word segmentation set generally includes a plurality of product words and a plurality of attribute words. The product words are generally used to describe or refer to product names and are mostly proper nouns; the attribute words are usually used to describe attributes of the commodity and are mostly adjectives or adverbs.
To analyze the part of speech of each segmented word in the word segmentation set, a neural network model trained in advance to a convergence state can be used. The recommended model can adopt a basic network architecture such as LSTM+CRF or BERT+CRF, and the choice can be made flexibly by a person skilled in the art. The person skilled in the art can fine-tune the pre-trained neural network model to a convergence state, so that the model learns to perform part-of-speech analysis on the embedded vector of the input word segmentation set of the commodity title and thereby identifies the product words and attribute words in the commodity title.
Step S1130, extracting keywords from the participle set in association with the commodity title, and determining the search weight corresponding to each participle, wherein the search weight represents the potential search value of the corresponding participle:
As mentioned above, the word segmentation set may contain a plurality of product words and attribute words, but the search text should follow natural-language usage habits and therefore needs to be as compact as possible. A search weight is accordingly determined for each segmented word so that the most valuable ones can be selected.
In one embodiment, a TF-IDF algorithm may be used to determine the TF-IDF value of each segmented word in the word segmentation set as its search weight. In another embodiment, a TextRank algorithm may be used to construct a word graph over the segmented words, and the weight of each segmented word in the graph is taken as its search weight. In addition, those skilled in the art can flexibly adopt other alternative implementations according to the principles disclosed herein, as long as the potential search value of each segmented word can be quantified.
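As a minimal sketch of the TF-IDF variant, the following Python code computes a search weight for each segmented word of a commodity title. It assumes that document frequencies are taken from a reference corpus of other segmented titles and that a smoothed IDF is used; the corpus, the smoothing and the function name `tfidf_weights` are illustrative assumptions rather than details given in this description.

```python
import math
from collections import Counter


def tfidf_weights(tokens, corpus):
    """Return {word: TF-IDF weight} for one segmented title.

    tokens: segmented words of the commodity title.
    corpus: list of segmented titles used to estimate document frequency.
    """
    term_counts = Counter(tokens)
    n_docs = len(corpus)
    weights = {}
    for term, count in term_counts.items():
        doc_freq = sum(1 for doc in corpus if term in doc)
        # Smoothed inverse document frequency (assumed variant).
        idf = math.log((1 + n_docs) / (1 + doc_freq)) + 1.0
        weights[term] = (count / len(tokens)) * idf
    return weights


# Hypothetical reference corpus and commodity title.
corpus = [
    ["red", "cotton", "dress"],
    ["blue", "cotton", "shirt"],
    ["red", "leather", "bag"],
]
title_tokens = ["red", "floral", "cotton", "dress"]

weights = tfidf_weights(title_tokens, corpus)
for word, weight in sorted(weights.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{word}: {weight:.3f}")
```

Words that are rare in the reference corpus receive a higher weight, which is one way of quantifying the potential search value of a segmented word.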
Step S1140, determining the single product word with the highest search weight and a preset number of attribute words, and splicing them into a search text:
The segmented words in the word segmentation set are sorted in descending order of search weight, so that the product words and the attribute words are each ordered by their search weights. The single product word with the highest search weight can then be taken as the core product word of the commodity title, and, according to a preset number such as 3, the corresponding number of top-ranked attribute words is selected. The core product word is spliced with the selected attribute words, for example in a fixed order given by a preset rule in which the attribute words come first and the core product word comes last, to construct the search text. Of course, the splicing order can be varied to obtain a plurality of search texts, and this can be implemented flexibly by the skilled person.
In this embodiment, the commodity title is subjected to word segmentation and part-of-speech recognition, the search weight representing the potential search value of each segmented word is calculated, the core product word and several preferred attribute words are selected according to the search weights, and the search text is constructed from them. The search text therefore has a higher potential search value, and using it for the preliminary screening of candidate long-tail words can improve matching accuracy.
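The selection and splicing of step S1140 can be illustrated with the following Python sketch. It assumes that each segmented word has already been labelled as a product word or an attribute word and given a search weight; the tuple layout, the labels, the weights and the function name `build_search_text` are hypothetical.

```python
def build_search_text(weighted_tokens, num_attributes=3):
    """Splice the top attribute words and the core product word into a search text.

    weighted_tokens: list of (word, role, weight) with role "product" or "attribute".
    """
    products = [t for t in weighted_tokens if t[1] == "product"]
    attributes = [t for t in weighted_tokens if t[1] == "attribute"]
    if not products:
        raise ValueError("the title contains no product word")
    # The single product word with the highest search weight is the core product word.
    core = max(products, key=lambda t: t[2])
    # The preset number of attribute words with the highest search weights.
    top_attributes = sorted(attributes, key=lambda t: t[2], reverse=True)[:num_attributes]
    # Preset rule used here: attribute words first, core product word last.
    return " ".join([word for word, _, _ in top_attributes] + [core[0]])


# Hypothetical labelled and weighted word segmentation set.
tokens = [
    ("dress", "product", 0.42),
    ("skirt", "product", 0.18),
    ("floral", "attribute", 0.60),
    ("red", "attribute", 0.32),
    ("cotton", "attribute", 0.35),
    ("new", "attribute", 0.05),
]

search_text = build_search_text(tokens)
print(search_text)
```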
Referring to Fig. 4, in a further embodiment, the step S1300 of determining, according to the statistical indexes, a single candidate long-tail word semantically matched with the commodity title as the target long-tail word includes the following steps:
Step S1310, cleaning the statistical data according to the statistical indexes to obtain effective candidate long-tail words:
As described above, the statistical data includes the statistical indexes, which include the average search volume and/or the competition degree and may include other information. The statistical data can therefore be cleaned by applying constraint conditions to the average search volume and/or the competition degree of the candidate long-tail words, so as to filter out the invalid candidate long-tail words; the remaining ones are determined as effective candidate long-tail words. The constraint conditions can be set flexibly by those skilled in the art in accordance with the principles disclosed herein.
Step S1320, quantitatively determining the semantic similarity between each effective candidate long-tail word and the commodity title:
To examine how close each effective candidate long-tail word is semantically to the commodity title, the semantic similarity between the two can be calculated. To calculate it, the deep semantic information of the effective candidate long-tail word and of the commodity title is determined respectively; on that basis, a data distance algorithm is used to determine the data distance between each effective candidate long-tail word and the commodity title, and the data distance is normalized into a numerical value representing the semantic similarity.
Step S1330, calculating a composite score for each effective candidate long-tail word by using the average search volume as the matching weight of the semantic similarity:
To make it easier to rank the effective candidate long-tail words, a composite score can be determined for each of them from its statistical indexes and its similarity to the commodity title. An exemplary formula is as follows:
final_score=ln(2.001-(search_score/max_search_score))*sim_score
Here, search_score is the average search volume of the current effective candidate long-tail word, and max_search_score is the maximum average search volume among all effective candidate long-tail words. The ratio of the two is subtracted from the constant 2.001, the natural logarithm of the result is taken, and this logarithm is multiplied by the corresponding semantic similarity sim_score. Because the ratio lies between 0 and 1, the argument of the logarithm stays between 1.001 and 2.001, so the weight is always positive. This weight matching keeps the data stable while still distinguishing the candidates, and the composite score final_score is thereby determined quantitatively.
It should be understood that the above formula is only an example. It illustrates the principle of weighting the semantic similarity between an effective candidate long-tail word and the commodity title by the average search volume of that word in order to quantify it. Those skilled in the art can construct other formulas according to the principle disclosed herein as long as the same purpose is achieved, and such formulas remain within the inventive spirit of the present application.
After the composite score of each effective candidate long-tail word is determined, the composite score reflects both the average search popularity of the word and its closeness to the commodity title, so the strengths and weaknesses of the effective candidate long-tail words are placed on a single scale and the best one can be selected by composite score.
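The exemplary formula can be evaluated as in the following Python sketch, which also selects the candidate with the highest composite score. Only the formula itself comes from this description; the function name `final_score`, the candidate words, the search volumes and the similarity values are invented for illustration.

```python
import math


def final_score(search_score, max_search_score, sim_score):
    """Composite score: ln(2.001 - search_score / max_search_score) * sim_score."""
    ratio = search_score / max_search_score
    return math.log(2.001 - ratio) * sim_score


# Hypothetical effective candidates: (long-tail word, average search volume, similarity).
candidates = [
    ("red floral summer dress", 700, 0.82),
    ("long sleeve cotton dress", 1200, 0.64),
    ("red cotton summer dress", 300, 0.91),
]

max_search = max(volume for _, volume, _ in candidates)
scores = {
    word: final_score(volume, max_search, sim) for word, volume, sim in candidates
}
target = max(scores, key=scores.get)

for word, score in scores.items():
    print(f"{word}: {score:.4f}")
print("target:", target)
```

Under this example formula the logarithmic weight shrinks as a candidate's average search volume approaches the maximum, so the candidate with the largest volume receives a weight of only ln(1.001).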
Step S1340, determining the effective candidate long-tail word with the highest composite score as the target long-tail word:
To obtain the best effective candidate long-tail word as the target long-tail word, the effective candidate long-tail words can be sorted in descending order of composite score, and the one ranked first, i.e. the one with the highest composite score, is determined as the target long-tail word required for the page title generated by the application.
In summary, after the statistical data matched with the search text is cleaned to obtain the effective candidate long-tail words, the closeness of each effective candidate long-tail word to the commodity title and the average search volume that reflects its search popularity are combined by a preset formula into a composite score, which quantifies how good each effective candidate long-tail word is. The effective candidate long-tail word with the highest composite score is then selected as the target long-tail word required for the page title.
The target long-tail word determined in this embodiment is semantically close to the commodity title and has comparatively good search popularity. Because the effective candidate long-tail words result from data cleaning that filters out extreme cases, the target long-tail word is the best overall, and once the commodity display page is published it gains an advantage in long-tail word searches, which helps raise site-wide traffic.
Referring to Fig. 5, in some embodiments, to achieve high-quality data cleaning when screening out the effective candidate long-tail words, the step S1310 of cleaning the statistical data according to the statistical indexes to obtain effective candidate long-tail words includes any one or more of the following steps:
Step S1311, deleting from the statistical data the candidate long-tail words whose number of words is less than a preset value:
A long-tail word, as the name implies, is generally a phrase of more than two or three words. For example, a long-tail word may be required to contain no fewer than 3 words; the statistical data matched with the search text is filtered accordingly, and candidate long-tail words with fewer than 3 words are deleted from it, which constrains what counts as a long-tail word.
Step S1312, deleting from the statistical data the candidate long-tail words that fall within a preset time range:
The statistical data obtained through the search interface sometimes includes candidate long-tail words generated over a long period, while some commodities are time-sensitive in their sales. Therefore, when the statistical data is obtained, the time information corresponding to each candidate long-tail word is obtained as well. The candidate long-tail words are then filtered against a preset time range, and those falling within the preset time range are deleted from the statistical data, which cleans out the expired candidate long-tail words.
Step S1313, deleting from the statistical data the candidate long-tail words whose competition degree is higher than a preset level:
The competition degree is generally expressed as one of the levels "high, medium and low", although it can also be expressed as a numerical value as the case may be. In either case, an excessively high competition degree means that a large number of websites and pages already use the candidate long-tail word. If the word is used anyway, the commodity display page is unlikely to rank near the top even when it is found. Candidate long-tail words with a high competition degree can therefore be deleted from the statistical data so that excessive competition is avoided.
Step S1314, deleting from the statistical data the candidate long-tail words whose average search volume is higher than a preset threshold:
If the average search volume of a candidate long-tail word is too high, the word is popular and is likely to be used by a large number of websites and pages, so the same competition problem arises and using the word in the commodity display page is unlikely to improve its search ranking. On this principle, a threshold on the average search volume can be preset, and candidate long-tail words whose average search volume exceeds the threshold are deleted from the statistical data so that excessive competition is avoided.
The embodiments disclosed herein can be arbitrarily combined by those skilled in the art, for example, all embodiments corresponding to steps S1311 to S1314 are adopted in full, and in any case, those skilled in the art can flexibly apply the embodiments disclosed herein based on different purposes, so as to implement the cleaning of the statistical data matched with the search text, and ensure the validity of the valid candidate long-tail words depended on when the target long-tail word is subsequently determined, thereby ensuring the quality of the finally determined target long-tail word.
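The four cleaning steps S1311 to S1314 can be combined as in the following illustrative Python sketch. The record fields, the default thresholds and the function name `clean_candidates` are hypothetical, the competition degree is assumed to be an integer where a larger value means more competition, and the preset time range of step S1312 is modelled, as an assumption, as everything before a cutoff date.

```python
from datetime import date


def clean_candidates(rows, min_words=3, expired_before=None,
                     max_competition=2, max_avg_search=10000):
    """Return the effective candidate long-tail words after data cleaning."""
    kept = []
    for row in rows:
        # S1311: drop candidates with fewer words than the preset value.
        if len(row["word"].split()) < min_words:
            continue
        # S1312: drop candidates whose time information falls in the preset
        # (here: expired) time range.
        if expired_before is not None and row["last_seen"] < expired_before:
            continue
        # S1313: drop candidates whose competition degree exceeds the preset level.
        if row["competition"] > max_competition:
            continue
        # S1314: drop candidates whose average search volume exceeds the threshold.
        if row["avg_search"] > max_avg_search:
            continue
        kept.append(row)
    return kept


# Hypothetical statistical data matched with a search text.
rows = [
    {"word": "summer dress", "last_seen": date(2022, 3, 1), "competition": 1, "avg_search": 500},
    {"word": "red floral summer dress", "last_seen": date(2022, 3, 1), "competition": 1, "avg_search": 700},
    {"word": "christmas party red dress", "last_seen": date(2020, 12, 20), "competition": 1, "avg_search": 400},
    {"word": "cheap black evening dress", "last_seen": date(2022, 2, 1), "competition": 3, "avg_search": 900},
    {"word": "women long sleeve dress", "last_seen": date(2022, 2, 1), "competition": 2, "avg_search": 50000},
]

valid = clean_candidates(rows, expired_before=date(2021, 1, 1))
print([row["word"] for row in valid])
```

Each filter is independent, so any subset of the steps can be applied by relaxing the corresponding parameter.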
Referring to fig. 6, in some embodiments, the step S1320 of quantitatively determining semantic similarity between each of the valid candidate long-tail words and the product title includes the following steps:
Step S1321, encoding each effective candidate long-tail word and the commodity title to obtain their embedded vectors:
To calculate the semantic similarity between the effective candidate long-tail words and the commodity title, the commodity title and each effective candidate long-tail word are segmented by a conventional method, and the respective word segmentation sets are then converted into embedded vectors according to a vocabulary, which completes the encoding.
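As a simplified illustration of vocabulary-based encoding, the following Python sketch maps segmented words to integer indices, which is the form an embedding layer of a text feature extraction model would typically consume. The vocabulary construction, the `[UNK]` placeholder for out-of-vocabulary words and the function names are assumptions for this example; a real model would ship its own vocabulary and tokenizer.

```python
def build_vocab(token_lists):
    """Assign an integer index to every distinct word; index 0 is reserved for unknown words."""
    vocab = {"[UNK]": 0}
    for tokens in token_lists:
        for token in tokens:
            vocab.setdefault(token, len(vocab))
    return vocab


def encode(tokens, vocab):
    """Convert segmented words into their vocabulary indices."""
    return [vocab.get(token, vocab["[UNK]"]) for token in tokens]


# Hypothetical segmented commodity title and candidate long-tail word.
title_tokens = ["red", "floral", "cotton", "dress"]
candidate_tokens = ["red", "cotton", "summer", "dress"]

vocab = build_vocab([title_tokens])
title_ids = encode(title_tokens, vocab)
candidate_ids = encode(candidate_tokens, vocab)
print(title_ids)
print(candidate_ids)
```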
Step S1322, extracting high-level semantic information from the embedded vectors of the effective candidate long-tail words and the commodity title by using a pre-trained text feature extraction model to obtain their respective semantic feature vectors:
A text feature extraction model is then used to perform representation learning on the embedded vectors of the commodity title and of each effective candidate long-tail word involved in the calculation, extracting their deep semantic information to obtain the corresponding semantic feature vectors, i.e. sentence vectors. The text feature extraction model may be a pre-trained model, or a model obtained by a person skilled in the art through fine-tuning of a pre-trained model, as long as it is able to extract the corresponding semantic feature vector from the embedded vector of a text.
The text feature extraction model can be any commonly used basic model, such as one based on LSTM, BERT or Sentence-Transformer, and can be selected flexibly by a person skilled in the art.
Step S1323, calculating, by a preset data distance algorithm, the data distance between the semantic feature vector of the commodity title and the semantic feature vector of each effective candidate long-tail word as the corresponding semantic similarity:
After the semantic feature vectors of the commodity title and of each effective candidate long-tail word are determined, a preset data distance algorithm is used to calculate the similarity between the semantic feature vector of the commodity title and that of each effective candidate long-tail word. The data distance algorithm may be any conventional algorithm for calculating the distance between data points, such as the cosine similarity algorithm, the Euclidean distance algorithm, the Minkowski distance algorithm, the Pearson correlation coefficient algorithm, or the Jaccard coefficient algorithm.
Taking cosine similarity algorithm as an example, the corresponding formula is as follows:
$$\cos(A, B) = \frac{\sum_{i=1}^{n} A_i B_i}{\sqrt{\sum_{i=1}^{n} A_i^{2}} \, \sqrt{\sum_{i=1}^{n} B_i^{2}}}$$
where A is the semantic feature vector of the commodity title, B is the semantic feature vector of a single effective candidate long-tail word, and n is the number of elements in each semantic feature vector.
After the data distance between the commodity title and each effective candidate long-tail word is calculated by the preset data distance algorithm, the results can be normalized to the same scale as needed, such that a higher value represents a higher semantic similarity. The numerical semantic similarity of each effective candidate long-tail word is thus obtained, and the target long-tail word can then be selected on that basis.
In this embodiment, the semantic similarity between the effective candidate long-tail words and the commodity title is calculated with a deep-learning text feature extraction model. The resulting semantic similarity better represents how close the two are semantically, so the subsequent ranking of the effective candidate long-tail words gives more accurate results.
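Taking the cosine similarity algorithm as the data distance algorithm, the calculation and a subsequent normalization can be sketched in Python as follows. The semantic feature vectors are represented as plain lists of numbers; the mapping from [-1, 1] to [0, 1] in `normalize` is one assumed normalization among many, and the function names are illustrative.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two semantic feature vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of elements")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


def normalize(cosine):
    """Map a cosine value from [-1, 1] to [0, 1]; higher means more similar."""
    return (cosine + 1.0) / 2.0


# Hypothetical semantic feature vectors.
title_vector = [1.0, 2.0, 3.0]
candidate_vector = [2.0, 4.0, 6.0]

similarity = normalize(cosine_similarity(title_vector, candidate_vector))
print(similarity)
```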
Referring to fig. 7, in a further embodiment, step S1400 of configuring the target long-tail word into the page title of the commodity display page includes the following steps:
Step S1410, displaying a search optimization page corresponding to the commodity display page, so as to display a page title input box:
Referring to fig. 3, in the interface shown there, a search optimization page is displayed in advance corresponding to the commodity display page. The search optimization page includes a page title input box, which receives the target long-tail word determined by any of the above embodiments of the present application.
Step S1420, configuring the target long-tail word as the content data of the page title input box:
After the target long-tail word is obtained through any of the above embodiments, it can be automatically filled into the page title input box of the search optimization page, that is, configured as the content data of the page title. The operating user can edit the long-tail word further or simply accept the automatically generated target long-tail word.
Step S1430, in response to a user submission instruction, publishing the commodity display page and the search optimization page:
After the operating user has finished entering the data in the commodity display page and the search optimization page, the user can trigger a submission instruction by operating the submit control. The commodity display page and the search optimization page are then submitted to the back end of the independent site, which publishes the commodity display page together with the data entered in the search optimization page, including the page title, completing the publishing process of the commodity display page.
The page title of every commodity display page can be determined in this way, so that the long-tail effect of the keywords drives traffic to the independent e-commerce site.
This embodiment illustrates one usage scenario for the target long-tail word automatically generated by the present application. With the technical solution of the present application, an independent site does not need to devote excessive attention to search engine optimization when publishing each commodity.
Referring to fig. 8, an e-commerce site popularization and configuration apparatus adapted to one of the purposes of the present application is a functional embodiment of the method described above. The apparatus includes a search construction module 1100, an index acquisition module 1200, a target determination module 1300, and a search optimization module 1400. The search construction module 1100 is configured to construct a search text from the product words and attribute words of the commodity title in a commodity display page. The index acquisition module 1200 is configured to acquire statistical data matched with the search text, where the statistical data includes a plurality of candidate long-tail words and their historical search statistical indexes. The target determination module 1300 is configured to determine, according to the statistical indexes, a single candidate long-tail word that semantically matches the commodity title as the target long-tail word. The search optimization module 1400 is configured to configure the target long-tail word into the page title of the commodity display page.
In some further embodiments, the search construction module 1100 includes: a title acquisition unit, configured to acquire the commodity title entered in the commodity display page; a word segmentation recognition unit, configured to perform word segmentation and part-of-speech recognition on the commodity title to obtain a word segment set consisting of a plurality of word segments, including word segments that are product words and word segments that are attribute words; a weight quantification unit, configured to perform keyword extraction on the word segment set in association with the commodity title and determine the search weight of each word segment, where the search weight represents the potential search value of that word segment; and a search expression unit, configured to determine the single product word with the highest search weight and a preset number of attribute words, and splice them into a search text.
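The work of the search expression unit can be sketched as follows, assuming that word segmentation, part-of-speech recognition, and weight quantification have already produced labeled, weighted word segments. The `Segment` type, the "product"/"attribute" labels, the choice of the highest-weighted attribute words, and the attribute-before-product splicing order are all illustrative assumptions, not details given in the application.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Segment:
    text: str
    kind: str      # hypothetical labels: "product" or "attribute"
    weight: float  # search weight from the keyword extraction step


def build_search_text(segments: List[Segment], max_attributes: int = 2) -> str:
    """Splice the top product word and the top attribute words into a search text."""
    products = [s for s in segments if s.kind == "product"]
    if not products:
        raise ValueError("no product word found in the commodity title")
    # The single product word with the highest search weight.
    top_product = max(products, key=lambda s: s.weight)
    # Assumption: the preset number of attribute words are the highest-weighted ones.
    attributes = sorted(
        (s for s in segments if s.kind == "attribute"),
        key=lambda s: s.weight,
        reverse=True,
    )[:max_attributes]
    # Assumption: attribute words precede the product word in the spliced text.
    return " ".join([a.text for a in attributes] + [top_product.text])
```

For example, given the product words "dress" (0.9) and "skirt" (0.4) and the attribute words "cotton" (0.7), "summer" (0.6), and "red" (0.5), a preset number of two attribute words gives the search text "cotton summer dress".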
In some further embodiments, the index acquisition module 1200 is configured to call a search interface to obtain statistical data of candidate long-tail words matched with the search text. The statistical data includes keywords, which are the long-tail words, and their statistical indexes, compiled from the historical search behavior data of a large number of users. The statistical indexes include the average search volume of each keyword and the degree of competition for the keyword among the different websites that adopt it.
In some further embodiments, the target determination module 1300 includes: a data cleaning unit, configured to clean the statistical data according to the statistical indexes to obtain effective candidate long-tail words; a similarity quantization unit, configured to quantitatively determine the semantic similarity between each effective candidate long-tail word and the commodity title; a score quantification unit, configured to calculate a comprehensive score for each effective candidate long-tail word, using the average search volume as the matching weight of the semantic similarity; and a target selection unit, configured to determine the effective candidate long-tail word with the highest comprehensive score as the target long-tail word.
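The score quantification and target selection units can be sketched as below. The application says only that the average search volume serves as the matching weight of the semantic similarity. Multiplying the two values is one straightforward reading and is an assumption here, as are the `Candidate` type and the function names.

```python
from typing import Dict, NamedTuple


class Candidate(NamedTuple):
    avg_search_volume: float  # statistical index from the search interface
    similarity: float         # semantic similarity to the commodity title


def comprehensive_score(candidate: Candidate) -> float:
    """Assumed scoring rule: the search volume weights the similarity."""
    return candidate.avg_search_volume * candidate.similarity


def pick_target(candidates: Dict[str, Candidate]) -> str:
    """Return the effective candidate long-tail word with the highest score."""
    if not candidates:
        raise ValueError("no effective candidate long-tail words")
    return max(candidates, key=lambda word: comprehensive_score(candidates[word]))
```

With this rule, a candidate with a moderate search volume and a high similarity can outrank one with a large search volume but a weak semantic match to the title.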
In some embodiments, the data cleaning unit includes any one or more of the following sub-modules: a word count cleaning sub-module, configured to delete from the statistical data the candidate long-tail words whose word count is less than a preset value; a time cleaning sub-module, configured to delete from the statistical data the candidate long-tail words that fall within a preset time range; a competition degree cleaning sub-module, configured to delete from the statistical data the candidate long-tail words whose degree of competition is higher than a preset level; and a search volume cleaning sub-module, configured to delete from the statistical data the candidate long-tail words whose average search volume is higher than a preset threshold.
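Three of the four cleaning rules above can be sketched as a simple filter. The time-based rule is omitted because the application does not say which timestamp it applies to. The `CandidateStat` type, the 0-to-1 competition scale, the default thresholds, and whitespace-based word counting (which suits space-delimited languages only) are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class CandidateStat:
    keyword: str
    avg_search_volume: float
    competition: float  # hypothetical scale from 0.0 (low) to 1.0 (high)


def clean_candidates(
    stats: Iterable[CandidateStat],
    min_words: int = 3,
    max_competition: float = 0.7,
    max_avg_search_volume: float = 10000.0,
) -> List[CandidateStat]:
    """Keep only the candidates that pass every cleaning rule."""
    kept = []
    for stat in stats:
        # Word count rule: drop keywords with fewer words than the preset value.
        if len(stat.keyword.split()) < min_words:
            continue
        # Competition rule: drop keywords whose competition exceeds the preset level.
        if stat.competition > max_competition:
            continue
        # Search volume rule: drop keywords above the preset threshold, as stated.
        if stat.avg_search_volume > max_avg_search_volume:
            continue
        kept.append(stat)
    return kept
```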
In some embodiments, the similarity quantization unit includes: a vector coding subunit, configured to encode the effective candidate long-tail words and the commodity title to obtain their embedded vectors; a semantic extraction subunit, configured to extract high-level semantic information from the embedded vectors of the effective candidate long-tail words and the commodity title using a pre-trained text feature extraction model, to obtain their respective semantic feature vectors; and a similarity calculation subunit, configured to calculate, using a preset data distance algorithm, the data distance between the semantic feature vector of the commodity title and the semantic feature vector of each effective candidate long-tail word as the corresponding semantic similarity.
In some further embodiments, the search optimization module 1400 includes: a page display unit, configured to display a search optimization page corresponding to the commodity display page, so as to display a page title input box; an automatic editing unit, configured to configure the target long-tail word as the content data of the page title input box; and an optimization publishing unit, configured to publish the commodity display page and the search optimization page in response to a user submission instruction.
To solve the above technical problem, an embodiment of the present application further provides a computer device, whose internal structure is shown schematically in fig. 9. The computer device includes a processor, a computer-readable storage medium, a memory, and a network interface connected by a system bus. The computer-readable storage medium stores an operating system, a database, and computer-readable instructions; the database may store control information sequences, and the computer-readable instructions, when executed by the processor, cause the processor to implement an e-commerce site popularization configuration method. The processor provides computing and control capability and supports the operation of the whole computer device. The memory may store computer-readable instructions that, when executed by the processor, cause the processor to perform the e-commerce site popularization configuration method of the present application. The network interface is used to connect to and communicate with a terminal. Those skilled in the art will appreciate that the structure shown in fig. 9 is merely a block diagram of the parts relevant to the solution of the present application and does not limit the computer devices to which the solution applies; a particular computer device may include more or fewer components than shown, combine certain components, or arrange the components differently.
In this embodiment, the processor executes the specific functions of each module and its sub-modules in fig. 8, and the memory stores the program codes and data required to execute those modules or sub-modules. The network interface is used for data transmission to and from a user terminal or a server. The memory in this embodiment stores the program codes and data required to execute all modules and sub-modules of the e-commerce site popularization configuration apparatus, and the server can call these program codes and data to execute the functions of all sub-modules.
The present application also provides a storage medium having stored thereon computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the steps of the e-commerce site promotional configuration method of any of the embodiments of the present application.
The present application also provides a computer program product comprising computer programs/instructions which, when executed by one or more processors, implement the steps of the method as described in any of the embodiments of the present application.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments of the present application can be implemented by a computer program, which can be stored in a computer-readable storage medium, and when the computer program is executed, the processes of the embodiments of the methods can be included. The storage medium may be a computer-readable storage medium such as a magnetic disk, an optical disk, a Read-Only Memory (ROM), or a Random Access Memory (RAM).
In summary, the present application can determine a long-tail keyword from the commodity title of the commodity corresponding to a commodity display page and automatically configure that keyword into the page title of the commodity display page, achieving search engine keyword optimization. By exploiting the long-tail effect, the long-tail keywords of a large number of individual commodity display pages raise their search rankings, which improves the ability of the independent site, and of the e-commerce platform as a whole, to attract traffic from search engines.
Those of skill in the art will understand that various operations, methods, steps in the flow, measures, schemes discussed in this application can be alternated, modified, combined, or deleted. Further, other steps, measures, or schemes in various operations, methods, or flows that have been discussed in this application can be alternated, altered, rearranged, broken down, combined, or deleted. Further, steps, measures, schemes in the prior art having various operations, methods, procedures disclosed in the present application may also be alternated, modified, rearranged, decomposed, combined, or deleted.
The foregoing describes only some embodiments of the present application. It should be noted that those skilled in the art can make various improvements and modifications without departing from the principle of the present application, and these improvements and modifications should also be regarded as falling within the protection scope of the present application.

Claims (10)

1. An e-commerce site popularization configuration method, characterized by comprising the following steps:
constructing a search text according to the product words and the attribute words of the commodity titles in the commodity display page;
acquiring statistical data matched with a search text, wherein the statistical data comprises a plurality of candidate long-tailed words and historical search statistical indexes thereof;
determining only one candidate long-tail word semantically matched with the commodity title as a target long-tail word according to the statistical index;
and configuring the target long tail word into a page title of the commodity display page.
2. The e-commerce site popularization configuration method of claim 1, wherein a search text is constructed according to product words and attribute words of a commodity title in a commodity display page, comprising the steps of:
acquiring a commodity title input in a commodity display page;
performing word segmentation and part-of-speech recognition on the commodity title to obtain a word segmentation set consisting of a plurality of word segmentations, wherein the word segmentation set comprises the word segmentations belonging to the product words and the word segmentations belonging to the attribute words;
performing keyword extraction on the word segmentation set in association with the commodity title, and determining a search weight corresponding to each word segmentation, wherein the search weight represents the potential search value of the corresponding word segmentation;
and determining the unique product word with the highest search weight and a plurality of attribute words with a preset number, and splicing to form a search text.
3. The e-commerce site promotion configuration method according to claim 1, wherein statistical data matched with a search text is obtained, the statistical data comprises a plurality of candidate long-tailed words and historical search statistical indexes thereof, and the method comprises the following steps:
and calling a search interface to obtain statistical data of candidate long-tail words matched with the search text, wherein the statistical data comprises keywords and statistical indexes thereof which are obtained by statistics according to historical search behavior data of massive users, the statistical indexes comprise average search quantity of the corresponding keywords and competitiveness of the keywords adopted by different websites, and the keywords are the long-tail words.
4. The e-commerce site promotion configuration method according to claim 3, wherein the step of determining, as the target long-tail word, the only one candidate long-tail word semantically matched with the commodity title according to the statistical index comprises the following steps:
performing data cleaning on the statistical data according to the statistical indexes to obtain effective candidate long-tail words;
determining the semantic similarity between each effective candidate long tail word and the commodity title in a quantitative mode;
taking the average search amount as the matching weight of the semantic similarity, and calculating the comprehensive score of each effective candidate long-tail word;
and determining the effective candidate long-tail word with the highest comprehensive score as the target long-tail word.
5. The e-commerce site promotion configuration method according to claim 4, wherein the step of performing data cleaning on the statistical data according to the statistical indexes to obtain effective candidate long-tail words comprises any one or more of the following steps:
deleting the candidate long-tail words with the word number less than a preset value in the statistical data;
deleting the candidate long-tail words in the statistical data within a preset time range;
deleting the candidate long-tail words with the competitive degree higher than the preset level in the statistical data;
and deleting the candidate long-tail words with the average search quantity higher than a preset threshold value in the statistical data.
6. The e-commerce site promotion configuration method according to claim 4, wherein the semantic similarity between each effective candidate long-tail word and the commodity title is quantitatively determined, and the method comprises the following steps of:
coding to obtain effective candidate long-tail words and embedded vectors of the commodity titles;
extracting high-level semantic information of the effective candidate long-tail words and the embedded vectors of the commodity titles by adopting a pre-trained text feature extraction model to obtain respective semantic feature vectors;
and calculating the data distance between the semantic feature vector of the commodity title and the semantic feature vector of each effective candidate long-tail word by adopting a preset data distance algorithm to serve as the corresponding semantic similarity.
7. The e-commerce site promotion configuration method according to any one of claims 1 to 6, wherein the configuration of the target long-tail word into the page title of the commodity display page comprises the following steps:
displaying a search optimization page corresponding to the commodity display page to display a page title input box;
configuring the target long tail word into content data of the page title input box;
and responding to the instruction submitted by the user, and issuing the commodity display page and the search optimization page.
8. An e-commerce site popularization configuration device, characterized by comprising:
the search construction module is used for constructing a search text according to the product words and the attribute words of the commodity titles in the commodity display page;
the index acquisition module is used for acquiring statistical data matched with the search text, wherein the statistical data comprises a plurality of candidate long-tailed words and historical search statistical indexes thereof;
the target determining module is used for determining the only candidate long tail word which is semantically matched with the commodity title as a target long tail word according to the statistical index;
and the search optimization module is used for configuring the target long tail words into the page title of the commodity display page.
9. A computer device comprising a central processor and a memory, characterized in that the central processor is adapted to invoke execution of a computer program stored in the memory to perform the steps of the method according to any one of claims 1 to 7.
10. A computer-readable storage medium, characterized in that it stores, in the form of computer-readable instructions, a computer program implemented according to the method of any one of claims 1 to 7, which, when invoked by a computer, performs the steps comprised by the corresponding method.
CN202210383174.5A 2022-04-12 2022-04-12 E-commerce site popularization and configuration method and device, equipment, medium and product thereof Pending CN114663164A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210383174.5A CN114663164A (en) 2022-04-12 2022-04-12 E-commerce site popularization and configuration method and device, equipment, medium and product thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210383174.5A CN114663164A (en) 2022-04-12 2022-04-12 E-commerce site popularization and configuration method and device, equipment, medium and product thereof

Publications (1)

Publication Number Publication Date
CN114663164A true CN114663164A (en) 2022-06-24

Family

ID=82035724

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210383174.5A Pending CN114663164A (en) 2022-04-12 2022-04-12 E-commerce site popularization and configuration method and device, equipment, medium and product thereof

Country Status (1)

Country Link
CN (1) CN114663164A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115470322A (en) * 2022-10-21 2022-12-13 深圳市快云科技有限公司 Keyword generation system and method based on artificial intelligence
CN115719066A (en) * 2022-11-18 2023-02-28 北京百度网讯科技有限公司 Search text understanding method, device, equipment and medium based on artificial intelligence
CN117151082A (en) * 2023-10-30 2023-12-01 量子数科科技有限公司 Commodity title SPU keyword extraction method based on large language model

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115470322A (en) * 2022-10-21 2022-12-13 深圳市快云科技有限公司 Keyword generation system and method based on artificial intelligence
CN115719066A (en) * 2022-11-18 2023-02-28 北京百度网讯科技有限公司 Search text understanding method, device, equipment and medium based on artificial intelligence
CN117151082A (en) * 2023-10-30 2023-12-01 量子数科科技有限公司 Commodity title SPU keyword extraction method based on large language model
CN117151082B (en) * 2023-10-30 2024-01-02 量子数科科技有限公司 Commodity title SPU keyword extraction method based on large language model


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination