CN110689407A - Price comparison method for selected products and computer readable storage medium - Google Patents
- Publication number
- CN110689407A (application CN201910929807.6A)
- Authority
- CN
- China
- Prior art keywords
- commodity
- information
- data
- price
- price comparison
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/06—Buying, selling or leasing transactions
- G06Q30/0601—Electronic shopping [e-shopping]
- G06Q30/0623—Item investigation
- G06Q30/0625—Directed, with specific intent or strategy
- G06Q30/0629—Directed, with specific intent or strategy for generating comparisons
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9535—Search customisation based on user profiles and personalisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/955—Retrieval from the web using information identifiers, e.g. uniform resource locators [URL]
- G06F16/9558—Details of hyperlinks; Management of linked annotations
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Business, Economics & Management (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Accounting & Taxation (AREA)
- Finance (AREA)
- General Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Economics (AREA)
- General Business, Economics & Management (AREA)
- Strategic Management (AREA)
- Marketing (AREA)
- Development Economics (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The invention provides a price comparison method for selected products and a computer-readable storage medium. The method comprises a commodity information data acquisition process, a commodity information data cleaning process, and a comparative information display process, and defines a dedicated cleaning rule so that similar commodities are screened out and price comparison results are updated in real time. The invention displays information about a commodity and its similar commodities visually, accurately, and automatically. It also shows the historical price of the queried commodity, together with its historical price trend, sales volume, and comments, so that a purchaser or e-commerce operator can understand the commodity in more depth, buy better and cheaper commodities, and set commodity prices more accurately.
Description
Technical field:
the invention relates to the technical field of data processing, and in particular to a price comparison method for selected products and a computer-readable storage medium.
Background art:
when a consumer shops for a selected product, the consumer generally opens the websites of several e-commerce vendors at the same time and enters the same commodity name on each site in order to compare prices.
In addition, with the development of e-commerce, a great many shopping platforms of various sizes have sprung up in the market, such as the well-known Taobao and Jingdong. Because most consumers pay close attention to price, a platform that wants to attract consumers must stand out among many platforms and build strong competitiveness. Commodity pricing is therefore particularly important, and an e-commerce operator needs to know, as far as possible, how the same commodities are priced on other platforms.
At present, price comparison is mostly done manually, which is costly, tedious, and prone to omissions.
Summary of the invention:
to improve the effectiveness of price comparison and save time, the invention provides a price comparison method for selected products and a computer-readable storage medium.
A price comparison method for selected products, applied to computer equipment, comprises a commodity information data acquisition process, a commodity information data cleaning process, and a comparative information display process;
the commodity information data acquisition process acquires the links of the commodities to be price-compared, parses the pages where the commodities are located, extracts useful information, and stores it in a database;
in the commodity information data cleaning process, an index engine builds an index over the acquired information, and data is obtained for similar commodities that satisfy both a similarity threshold and a similarity ranking threshold;
in the comparative information display process, data is collected periodically from the source URLs corresponding to the final data obtained by the commodity information data cleaning process, so that the relevant information of the corresponding commodities is displayed comparatively.
Further, the commodity information data acquisition process specifically comprises:
step one: importing the link URL of each commodity to be price-compared into a task list to be collected;
step two: loading the link URLs of the commodities to be collected and putting them into a request pool;
step three: taking the corresponding request from step two and sending it;
step four: downloading the page corresponding to the link URL in step two and returning the page information;
step five: processing the page from step four: parsing the page, extracting useful information, putting the useful information into a list, and handing the list to a pipeline;
step six: storing the data in the list in a database;
step seven: repeating steps three to six until all the requests in the request pool have been processed.
Further, if a new link is found while the commodity information data acquisition process is running, the URL of the new link is also put into the request pool.
Further, in step five of the commodity information data acquisition process, the page is parsed with XPath.
Further, the commodity information data cleaning process specifically comprises:
step one: establishing a corresponding index in a search server;
step two: loading all the data collected in the commodity information data acquisition process and writing it into the search server;
step three: searching the search server for each commodity to be price-compared, calculating commodity similarity, and, from the top N results by similarity score for each commodity, taking out the data whose similarity is greater than X for judgment, where N is the similarity ranking threshold and X is the similarity threshold;
step four: establishing a corresponding binding relationship between the data filtered out in step three and the commodity to be price-compared.
Further, the similarity ranking threshold is 5.
Further, when the similarity is calculated with the matching algorithm of an Elasticsearch server, the similarity threshold is 200.
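The two thresholds can be illustrated with Elasticsearch request bodies. The following Python sketch only builds the JSON bodies and does not contact a server. The field names (name, source_url, price), the analyzer and tokenizer names, and the choice of character bigrams are assumptions for illustration. The patent does not say whether the similarity threshold is applied inside the search request or afterwards in application code; using the search parameter min_score is one possible reading.

```python
import json

SIMILARITY_RANK_THRESHOLD = 5   # N in the text
SIMILARITY_THRESHOLD = 200      # X in the text


def build_index_body():
    """Index definition whose name field is analyzed as character bigrams.

    All field, analyzer, and tokenizer names are hypothetical.
    """
    return {
        "settings": {
            "analysis": {
                "tokenizer": {
                    "name_ngram": {"type": "ngram", "min_gram": 2, "max_gram": 2}
                },
                "analyzer": {
                    "name_ngram_analyzer": {
                        "type": "custom",
                        "tokenizer": "name_ngram",
                        "filter": ["lowercase"],
                    }
                },
            }
        },
        "mappings": {
            "properties": {
                "name": {"type": "text", "analyzer": "name_ngram_analyzer"},
                "source_url": {"type": "keyword"},
                "price": {"type": "double"},
            }
        },
    }


def build_search_body(commodity_name,
                      n=SIMILARITY_RANK_THRESHOLD,
                      x=SIMILARITY_THRESHOLD):
    """Search body returning at most n hits whose score is at least x."""
    return {
        "size": n,        # similarity ranking threshold
        "min_score": x,   # similarity threshold (hits scoring below x are dropped)
        "query": {"match": {"name": commodity_name}},
    }


if __name__ == "__main__":
    print(json.dumps(build_search_body("Example Kettle 1.7L"), indent=2))
```

Note that min_score keeps hits whose score equals the threshold, whereas the text asks for a similarity strictly greater than X; a caller that needs the strict comparison would filter the returned hits once more.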
Further, in the comparative information display process, the relevant information of the corresponding commodity includes price, sales volume and/or number of comments and/or price fluctuation trend.
Further, the method also comprises a step of calculating and storing the price comparison result.
One or more non-transitory computer-readable storage media containing computer-executable instructions that, when executed by one or more processors, cause the processors to perform the above price comparison method for selected products.
The invention has the beneficial effects that:
with the invention, information about a commodity and its similar commodities can be displayed visually, accurately, and automatically, and the historical price of the queried commodity can be viewed along with its historical price trend, sales volume, and comments. Purchasers or e-commerce operators can thus understand commodities in more depth, purchasing staff can more easily select better commodities, and commodity prices can be set more accurately. The platform's commodities keep a strong advantage, the platform's competitiveness improves, and consumers can buy better commodities.
Detailed description:
the design concept of the invention is as follows: to address the shortcomings of the prior art, the method comprises a commodity information data acquisition process, a commodity information data cleaning process, and a comparative information display process, and defines a dedicated cleaning rule so that similar commodities are screened out and price comparison results are updated in real time. Each process is described in detail below.
Commodity data acquisition process
The data acquisition process relies on four major components: a downloader (Downloader), a page processor (PageProcessor), a scheduler (Scheduler), and a pipeline (Pipeline). The terms Downloader, PageProcessor, Scheduler, and Pipeline have the same meaning as in the prior art.
The downloader is configured to download pages from the internet.
The page processor is configured to parse pages, extract useful information, and discover new links. The useful information mainly includes the price, sales volume, and number of comments of the commodity. New links are found by manually analyzing the page and then writing a parsing program and a script that extracts the page information.
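To illustrate the page processor's role, the following minimal Python sketch extracts the commodity name, price, sales volume, and number of comments from a page with XPath expressions and collects new links. The page markup, the class names, and the function name process_page are assumptions for illustration and are not taken from the patent. The standard library module xml.etree.ElementTree supports only a subset of XPath and requires well-formed markup, so a real crawler would likely need an HTML-tolerant parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed product page used only for this sketch.
SAMPLE_PAGE = """
<html>
  <body>
    <div class="product">
      <h1 class="title">Example Kettle 1.7L</h1>
      <span class="price">199.00</span>
      <span class="sales">1520</span>
      <span class="comments">87</span>
      <a class="related" href="https://shop.example/item/2">Similar item</a>
    </div>
  </body>
</html>
"""


def process_page(page_text):
    """Parse one page; return (useful_info, new_links)."""
    root = ET.fromstring(page_text.strip())

    def text_of(path):
        # Return the stripped text of the first node matching the XPath, or None.
        node = root.find(path)
        return node.text.strip() if node is not None and node.text else None

    def number_of(path, convert):
        # Convert the matched text to a number, or return None if it is missing.
        text = text_of(path)
        return convert(text) if text is not None else None

    info = {
        "name": text_of(".//h1[@class='title']"),
        "price": number_of(".//span[@class='price']", float),
        "sales": number_of(".//span[@class='sales']", int),
        "comments": number_of(".//span[@class='comments']", int),
    }
    # Newly discovered links are returned so the scheduler can queue them.
    new_links = [a.get("href") for a in root.findall(".//a[@class='related']")]
    return info, new_links


if __name__ == "__main__":
    print(process_page(SAMPLE_PAGE))
```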
The scheduler is configured to manage the URLs to be collected and to remove some duplicate URLs.
The pipeline is responsible for processing the extracted results, including computation and persistence to files, databases, and the like.
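A pipeline that persists extracted results to a database might look like the following sketch. The patent does not name a database, so SQLite is used here purely as a stand-in, and the table name, column names, and function names are hypothetical.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) the snapshot store; schema is an assumption."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS commodity_snapshot (
               source_url   TEXT NOT NULL,
               name         TEXT NOT NULL,
               price        REAL,
               sales        INTEGER,
               comments     INTEGER,
               collected_at TEXT NOT NULL
           )"""
    )
    return conn


def run_pipeline(conn, items):
    """Persist a list of extracted items; return the number of rows written."""
    rows = [
        (i["source_url"], i["name"], i["price"],
         i["sales"], i["comments"], i["collected_at"])
        for i in items
    ]
    with conn:  # commits on success, rolls back if an insert fails
        conn.executemany(
            "INSERT INTO commodity_snapshot VALUES (?, ?, ?, ?, ?, ?)", rows
        )
    return len(rows)


if __name__ == "__main__":
    store = open_store()
    written = run_pipeline(store, [{
        "source_url": "https://shop.example/item/1",
        "name": "Example Kettle 1.7L",
        "price": 199.0, "sales": 1520, "comments": 87,
        "collected_at": "2020-01-14T00:00:00",
    }])
    print(written)
```

Storing one row per collection run, rather than overwriting the previous values, keeps the price history that the display process later needs for trend views.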
The commodity information data acquisition process comprises the following steps:
step one: importing the link URL of each commodity to be price-compared into the task list to be collected; this list is used solely to store link URLs.
Step two: loading the link URLs of the commodities to be collected and putting them into a request pool;
step three: taking a request from the request pool and sending it to the downloader; specifically, the request is the corresponding request from step two;
step four: the downloader downloads the page and returns the page information;
step five: the page processor processes the page: it parses the page (XPath is used in this embodiment), extracts useful information, puts it into a list, and hands the list to the pipeline.
Step six: in the pipeline, storing the data in the list in a database to facilitate subsequent cleaning;
step seven: repeating steps three to six until all the requests in the request pool have been processed, at which point data acquisition is complete.
In the above process, if a new link URL is found, it is imported into the task list to be collected (here a queue that follows the FIFO principle) and then put into the request pool.
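The seven steps above can be sketched as a single loop. This is a minimal illustration, not the patent's implementation: the function name crawl and the callable parameters download, process_page, and pipeline are hypothetical stand-ins for the downloader, page processor, and pipeline, and the duplicate check stands in for the scheduler's removal of duplicate URLs.

```python
from collections import deque


def crawl(seed_urls, download, process_page, pipeline):
    """Run the acquisition loop; return the URLs in the order processed."""
    task_list = deque(seed_urls)   # step one: FIFO task list of link URLs
    request_pool = deque()
    seen = set()                   # scheduler: remembers URLs already queued
    processed = []

    def schedule():
        # Step two: move URLs from the task list into the request pool,
        # skipping duplicates.
        while task_list:
            url = task_list.popleft()
            if url not in seen:
                seen.add(url)
                request_pool.append(url)

    schedule()
    while request_pool:                                # step seven: repeat
        url = request_pool.popleft()                   # step three
        page = download(url)                           # step four
        items, new_links = process_page(url, page)     # step five
        pipeline(items)                                # step six
        processed.append(url)
        task_list.extend(new_links)                    # newly found links
        schedule()
    return processed


if __name__ == "__main__":
    # Fake site: each URL maps to (item, links found on that page).
    pages = {"u1": ("item1", ["u2", "u1"]), "u2": ("item2", [])}
    stored = []
    order = crawl(["u1"],
                  download=lambda url: pages[url],
                  process_page=lambda url, page: ([page[0]], page[1]),
                  pipeline=stored.extend)
    print(order, stored)
```

Passing the components in as callables keeps the loop testable without network access, as the fake site in the usage example shows.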
Cleaning process of commodity information data
In this embodiment, the search server Elasticsearch (hereinafter referred to as es) and an N-gram word segmentation algorithm are used for data cleaning. The process mainly comprises the following steps:
step one: establishing a corresponding index in the search server; at present, the commodity name is the main index field;
step two: loading all the collected data and writing it into the search server;
step three: searching the search server for each supplied commodity to be price-compared, taking out the top N results with the highest similarity scores for each commodity, and judging the data.
N is the similarity ranking threshold and can be chosen as needed. In this embodiment it is set to 5; that is, the top five results are treated as the most similar.
The similarity score is the score produced by the matching algorithm in the search server. It is calculated as follows: the phrases are segmented with the N-gram algorithm and then compared; each phrase receives a score after matching; the scores are accumulated and multiplied by a coefficient to give the final similarity score. The coefficient takes different values depending on the number of phrases after segmentation and follows an internally defined standard. In this embodiment, a commodity with a similarity score greater than 200 is considered a good match for the desired result, so the data with similarity greater than 200 is further selected from the top N results, and data with similarity below 200 is considered a poor match and is not used.
Step four: the data filtered out in step three is further reviewed manually; the data that passes review is the final data, and a corresponding binding relationship is established between the final data and the commodity to be price-compared;
manual review is a preferred way to further confirm commodity similarity. This step can be skipped, in which case the data obtained in step three is used directly as the final data and bound to the commodity to be price-compared.
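The selection rule in steps three and four (rank by similarity, keep the top N, keep only scores above X, then bind) can be sketched in plain Python. The scoring function below is a simplified stand-in and does not reproduce Elasticsearch's scoring: the per-gram score of 10 and the coefficient 100 / (number of grams) are arbitrary assumptions chosen so that scores fall between 0 and 1000, and the function and field names are hypothetical. Manual review is omitted.

```python
def char_ngrams(text, n=2):
    """Split text into overlapping character n-grams, ignoring case and spaces."""
    text = "".join(text.lower().split())
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def similarity_score(query, candidate, per_gram_score=10.0):
    """Accumulate a score per matched gram, then multiply by a coefficient."""
    query_grams = char_ngrams(query)
    if not query_grams:
        return 0.0
    candidate_grams = set(char_ngrams(candidate))
    accumulated = sum(per_gram_score for g in query_grams if g in candidate_grams)
    coefficient = 100.0 / len(query_grams)   # hypothetical; depends on gram count
    return accumulated * coefficient


def select_similar(target_name, candidates, n=5, x=200.0):
    """Keep the top n candidates by score, then only those scoring above x."""
    scored = [(similarity_score(target_name, c["name"]), c) for c in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for score, c in scored[:n] if score > x]


def bind(target_id, similar):
    """Record the binding between a target commodity and its similar commodities."""
    return {target_id: [c["source_url"] for c in similar]}


if __name__ == "__main__":
    candidates = [
        {"name": "Example Kettle 1.7L Steel", "source_url": "https://a.example/1"},
        {"name": "Garden Hose", "source_url": "https://b.example/2"},
    ]
    similar = select_similar("Example Kettle 1.7L", candidates)
    print(bind("sku-001", similar))
```

Applying the rank cut before the score cut matches the order described in the text: the threshold removes weak matches from the top N rather than widening the result set.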
Comparative information display process
In this process, the source URLs of the final data obtained in the commodity information data cleaning process are added to the task list to be collected and are collected periodically (for example, every three days). Through a human-machine interaction page, a user can view the price, sales volume, number of comments, price fluctuation trend, and other related information of the price-compared commodities, which provides accurate guidance on how the company's platform should price its commodities or which e-commerce vendor's commodities to purchase. The human-machine interface is designed by those skilled in the art according to actual needs using an interface design system.
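Periodic collection can be illustrated by a small helper that decides which source URLs are due for another collection run. The three-day interval comes from the example in the text; the function name due_urls and the mapping from URL to last collection time are assumptions for illustration.

```python
from datetime import datetime, timedelta

COLLECTION_INTERVAL = timedelta(days=3)   # "every three days" in the text


def due_urls(last_collected, now, interval=COLLECTION_INTERVAL):
    """Return the URLs never collected or last collected at least `interval` ago.

    last_collected maps each source URL to a datetime, or to None if the URL
    has not been collected yet.
    """
    return [
        url for url, collected_at in last_collected.items()
        if collected_at is None or now - collected_at >= interval
    ]


if __name__ == "__main__":
    now = datetime(2020, 1, 14)
    history = {
        "https://a.example/1": datetime(2020, 1, 10),   # 4 days ago: due
        "https://b.example/2": datetime(2020, 1, 13),   # 1 day ago: not due
        "https://c.example/3": None,                    # never collected: due
    }
    print(due_urls(history, now))
```

The returned URLs would then be imported into the task list to be collected, reusing the acquisition process described earlier.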
The system is also responsible for processing the extracted results, including calculation and storage. Calculation means computing the price difference relative to the commodity to be price-compared and the percentage that this difference represents of that commodity's price. Storage means persisting the results to files, databases, and the like.
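The calculation can be sketched as follows. The text does not state which price serves as the base of the percentage, so this sketch assumes the price of the commodity to be price-compared is the base; the function name and the returned keys are hypothetical.

```python
def compare_price(own_price, other_price):
    """Return the price difference and its percentage of own_price.

    A negative difference means the similar commodity is cheaper.
    """
    if own_price <= 0:
        raise ValueError("own_price must be positive")
    difference = other_price - own_price
    percent = difference / own_price * 100
    return {"difference": round(difference, 2), "percent": round(percent, 2)}


if __name__ == "__main__":
    # A similar commodity at 180.00 against our own commodity at 200.00.
    print(compare_price(200.0, 180.0))
```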
It will be understood by those skilled in the art that all or part of the processes of the methods of the above embodiments may be implemented by a computer program, which can be stored in a non-volatile computer readable storage medium, and when executed, can include the processes of the above embodiments of the methods. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), or the like.
The above description is only one of the preferred embodiments of the present invention, and should not be taken as limiting the invention in any way, and modifications made by those skilled in the art by using the above description are within the scope of the present invention.
Claims (10)
1. A price comparison method for selected products, characterized by comprising a commodity information data acquisition process, a commodity information data cleaning process, and a comparative information display process;
the commodity information data acquisition process acquires the links of the commodities to be price-compared, parses the pages where the commodities are located, extracts useful information, and stores it in a database;
in the commodity information data cleaning process, an index engine builds an index over the acquired information, and data is obtained for similar commodities that satisfy both a similarity threshold and a similarity ranking threshold;
in the comparative information display process, data is collected periodically from the source URLs corresponding to the final data obtained by the commodity information data cleaning process, so that the relevant information of the corresponding commodities is displayed comparatively.
2. The price comparison method for selected products according to claim 1, wherein the commodity information data acquisition process specifically comprises:
step one: importing the link URL of each commodity to be price-compared into a task list to be collected;
step two: loading the link URLs of the commodities to be collected and putting them into a request pool;
step three: taking the corresponding request from step two and sending it;
step four: downloading the page corresponding to the link URL in step two and returning the page information;
step five: processing the page from step four: parsing the page, extracting useful information, putting the useful information into a list, and handing the list to a pipeline;
step six: storing the data in the list in a database;
step seven: repeating steps three to six until all the requests in the request pool have been processed.
3. The price comparison method for selected products according to claim 2, wherein if a new link is found while the commodity information data acquisition process is running, the URL of the new link is also put into the request pool.
4. The price comparison method for selected products according to claim 2, wherein in step five of the commodity information data acquisition process, the page is parsed with XPath.
5. The price comparison method for selected products according to claim 2, wherein the commodity information data cleaning process specifically comprises:
step one: establishing a corresponding index in a search server;
step two: loading all the data collected in the commodity information data acquisition process and writing it into the search server;
step three: searching the search server for each commodity to be price-compared, calculating commodity similarity, and, from the top N results by similarity score for each commodity, taking out the data whose similarity is greater than X for judgment, where N is the similarity ranking threshold and X is the similarity threshold;
step four: establishing a corresponding binding relationship between the data filtered out in step three and the commodity to be price-compared.
6. The price comparison method for selected products according to claim 5, wherein the similarity ranking threshold is 5.
7. The price comparison method for selected products according to claim 5 or 6, wherein the similarity threshold is 200 when the similarity is calculated with the matching algorithm of an Elasticsearch server.
8. The price comparison method for selected products according to claim 5 or 6, wherein in the comparative information display process, the relevant information of the corresponding commodity comprises price, sales volume and/or number of comments and/or price fluctuation trend.
9. The price comparison method for selected products according to claim 1, further comprising a step of calculating and storing the price comparison result.
10. One or more non-transitory computer-readable storage media containing computer-executable instructions that, when executed by one or more processors, cause the processors to perform the price comparison method for selected products according to any one of claims 1 to 9.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910929807.6A CN110689407A (en) | 2019-09-29 | 2019-09-29 | Price comparison method for selected products and computer readable storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910929807.6A CN110689407A (en) | 2019-09-29 | 2019-09-29 | Price comparison method for selected products and computer readable storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110689407A (en) | 2020-01-14 |
Family
ID=69108936
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910929807.6A Pending CN110689407A (en) | 2019-09-29 | 2019-09-29 | Price comparison method for selected products and computer readable storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110689407A (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103258280A (en) * | 2012-02-17 | 2013-08-21 | 盛趣信息技术(上海)有限公司 | Price comparative method and system |
US20150112840A1 (en) * | 2013-10-23 | 2015-04-23 | Toshiba Tec Kabushiki Kaisha | Shopping support device and shopping support method |
CN107248098A (en) * | 2017-05-25 | 2017-10-13 | 深圳市思域网络技术有限公司 | A kind of commodity Auto-matching and the method and system of displaying |
CN107464162A (en) * | 2017-07-28 | 2017-12-12 | 腾讯科技(深圳)有限公司 | Commodity association method, apparatus and computer-readable recording medium |
CN107808325A (en) * | 2017-10-26 | 2018-03-16 | 广州供电局有限公司 | The concurrent real-time price comparing method of more electric business merchandise news real-time acquisition systems and more electric business |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111353847A (en) * | 2020-02-11 | 2020-06-30 | 北京加立技术有限公司 | Multi-platform multi-dimensional price comparison method and device |
CN113781148A (en) * | 2020-11-24 | 2021-12-10 | 北京沃东天骏信息技术有限公司 | Method and device for determining displayed articles |
CN117522534A (en) * | 2024-01-08 | 2024-02-06 | 深圳市卖点科技股份有限公司 | Intelligent commodity display method and system based on Internet of things |
CN117522534B (en) * | 2024-01-08 | 2024-03-29 | 深圳市卖点科技股份有限公司 | Intelligent commodity display method and system based on Internet of things |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107391493B (en) | Public opinion information extraction method and device, terminal equipment and storage medium | |
WO2017000513A1 (en) | Information pushing method and apparatus based on user search behavior, storage medium, and device | |
CN104699725B (en) | data search processing method and system | |
CN109635198B (en) | Method, device, medium and electronic equipment for presenting user search results on commodity display platform | |
US20130166544A1 (en) | Generating ranked search results using linear and nonlinear ranking models | |
CN110689407A (en) | Price comparison method for selected products and computer readable storage medium | |
US20140067786A1 (en) | Enhancing product search engine results using user click history | |
CN112907333B (en) | Intelligent matching method, device and equipment based on block chain and storage medium | |
JP2020504879A (en) | System and method for collecting data related to malicious content in a networked environment | |
CN112380457A (en) | Accurate personalized recommendation method based on purchase information | |
KR20210032691A (en) | Method and apparatus of recommending goods based on network | |
CN111858922A (en) | Service side information query method and device, electronic equipment and storage medium | |
KR20220101326A (en) | System for increasing open market product sales and efficient operation | |
JPH06119309A (en) | Purchase prospect degree predicting method and customer management system | |
CN118193806A (en) | Target retrieval method, target retrieval device, electronic equipment and storage medium | |
CN108984777B (en) | Customer service method, apparatus and computer-readable storage medium | |
US20040210335A1 (en) | Generating a sampling plan for testing generated content | |
JP6489340B1 (en) | Comparison target company selection system | |
JP5670490B2 (en) | Category determination device, search device, category determination method, category determination program, and computer-readable recording medium storing the program | |
JP2005100221A (en) | Investment judgement support information providing device and method | |
CN111639274B (en) | Online commodity intelligent sorting method, device, computer equipment and storage medium | |
JP6932680B2 (en) | Trading Examination Equipment, Trading Examination Method and Trading Examination Program | |
CN113051392A (en) | Knowledge pushing method and device | |
CN108182608B (en) | Electronic device, product recommendation method, and computer-readable storage medium | |
CN110019700B (en) | Data processing method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20200114 |