CN108776911A - A kind of Commodity Competition relationship analysis method based on machine learning - Google Patents

Publication number
CN108776911A
CN108776911A CN201810706947.2A CN201810706947A CN108776911A CN 108776911 A CN108776911 A CN 108776911A CN 201810706947 A CN201810706947 A CN 201810706947A CN 108776911 A CN108776911 A CN 108776911A
Authority
CN
China
Prior art keywords
commodity
data
machine learning
similarity
method based
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810706947.2A
Other languages
Chinese (zh)
Inventor
张帅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inspur Software Co Ltd
Original Assignee
Inspur Software Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inspur Software Co Ltd
Priority to CN201810706947.2A
Publication of CN108776911A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G06Q30/0202 Market predictions or forecasting for commercial activities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G06Q30/0601 Electronic shopping [e-shopping]

Landscapes

  • Business, Economics & Management (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Strategic Management (AREA)
  • Development Economics (AREA)
  • Engineering & Computer Science (AREA)
  • General Business, Economics & Management (AREA)
  • Physics & Mathematics (AREA)
  • Marketing (AREA)
  • General Physics & Mathematics (AREA)
  • Economics (AREA)
  • Theoretical Computer Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Data Mining & Analysis (AREA)
  • Game Theory and Decision Science (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a machine-learning-based method for analyzing commodity competition relationships. The method quantifies the similarity of commodities by their attributes, filters the original commodity data, and filters out abnormal business data, so as to meet the business requirements of network analysis. By using visualization tools to make a linear comparison of the sales volumes of the commodities in these experiments, the invention finds that two commodities show a certain competitive relationship within a certain period of time. This makes the work of professionals more convenient and provides decision support for commodity competition investigation and evidence collection. It ensures the reliability of the data while greatly reducing the data survey workload of product marketing staff and decision-making managers, and it provides a sound basis for the future planning of commodities.

Description

A kind of Commodity Competition relationship analysis method based on machine learning
Technical field
The present invention relates to the field of data analysis technology, and in particular to a machine-learning-based method for analyzing commodity competition relationships. It involves the application of the KNN machine learning algorithm, Pearson correlation matrix analysis, data visualization analysis, mathematical statistics and related techniques.
Background art
With the rapid development of e-commerce, marketing professionals are eager to gain the ability to discover commodities that compete with one another. Existing similarity-based competition discrimination methods can, to some extent, help customers find competing commodities, but their analysis results are often not accurate enough. Combined with misjudgments arising from prior knowledge and defects in the analysis methods themselves, the final results are often unsuitable for practical commodity-competition mining scenarios.
Since its inception, SWOT, a common competitiveness analysis method, has been widely used in strategic research and competitive analysis and has become an important analytical tool for strategic management and competitive intelligence. Its main advantages are that the analysis is intuitive and simple to use: persuasive conclusions can be reached even without precise supporting data or more specialized analysis tools. However, precisely because it is intuitive and simple, SWOT inevitably suffers from insufficient precision. For example, SWOT analysis uses a qualitative approach: by listing the various manifestations of Strengths, Weaknesses, Opportunities and Threats, it forms a vague description of an enterprise's competitive position. Judgments made on this basis inevitably carry a degree of subjective assumption. When using SWOT, attention must therefore be paid to the limitations of the method: the facts listed as the basis for judgment should be as true, objective and accurate as possible, and some quantitative data should be provided to make up for the deficiencies of SWOT's qualitative analysis and to lay the foundation for higher-level qualitative analysis. Because the channels of information gathering and the points of focus are often subjective, data analysts tend to subconsciously collect data that supports their expectations, so the analysis results are often not accurate enough.
In the era of the commodity economy, fierce competition between similar commodities puts enormous pressure on enterprises. Accurately and quickly finding pairs of commodities with a competitive relationship provides an important reference for expanding the market for an industry's products and for reducing costs. In big-data scenarios, researchers often face massive data-processing demands; analysis results are often not objective or accurate enough, and the validity of traditional analysis cannot be guaranteed.
At present, when major companies investigate commodities with a competitive relationship, they generally rely on field investigation: the relevant sales departments go to the market to collect evidence and apply some basic graphical statistical methods. This approach lacks a complete theoretical foundation, its precision is not high enough, and its results often carry strong subjective bias.
Summary of the invention
The technical problem to be solved by the present invention is as follows. In view of the above problems, the invention provides a machine-learning-based method for analyzing commodity competition relationships. By applying data visualization analysis, Gaussian modeling, the NearestNeighbors machine learning algorithm, the Pearson matrix and similar techniques, commodities with a competitive relationship can be identified more reliably.
The technical solution adopted by the present invention is as follows:
A machine-learning-based method for analyzing commodity competition relationships, in which the method quantifies the similarity of commodities by their attributes, filters the original commodity data, and filters out abnormal business data, so as to meet the business requirements of network analysis.
The commodity data is the business data of the commodities within a specified time period; this portion of the data can support a relatively complete further linear analysis.
The commodity data is selected as follows:
From at least one candidate commodity data object, obtain at least one target data object that is most similar to the commodity data object to be analyzed over at least one period, including: determining the target segment of a commodity data object and selecting the period threshold for the commodity sales data.
One method of quantifying commodity similarity is to plot pairs of commodity attribute values on a two-dimensional chart for comparison.
Another method of quantifying commodity similarity is to represent each commodity as a point and carry out visual analysis with a scatter plot, which is more intuitive and effective.
Commodity similarity may also be quantified through kernel-function fitting: Gaussian kernel functions are used to accurately fit the distribution of the data attribute values. Highly similar commodities gather in the same Gaussian distribution, while commodities of differing similarity fall into different Gaussian kernel functions. This yields visual measures such as the category boundaries of different groups and the density and color depth of the commodities within each group.
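The Gaussian-kernel idea can be illustrated with a minimal sketch. The patent does not specify the fitting procedure, so the following assumes a plain one-dimensional Gaussian kernel density estimate written with NumPy; the function name `gaussian_kde_1d`, the bandwidth and the price values are all hypothetical. Peaks of the estimated density mark the groups in which similar commodities gather.

```python
import numpy as np

def gaussian_kde_1d(samples, grid, bandwidth):
    """Evaluate a Gaussian kernel density estimate of `samples` on `grid`."""
    samples = np.asarray(samples, dtype=float)
    grid = np.asarray(grid, dtype=float)
    # Standardized differences between every grid point and every sample,
    # shape (len(grid), len(samples)).
    z = (grid[:, None] - samples[None, :]) / bandwidth
    kernels = np.exp(-0.5 * z ** 2) / (bandwidth * np.sqrt(2.0 * np.pi))
    # Average the per-sample kernels to obtain the density at each grid point.
    return kernels.mean(axis=1)

# Hypothetical attribute values (for example unit price) of nine commodities.
prices = np.array([9.8, 10.1, 10.4, 9.9, 10.2, 29.7, 30.3, 30.0, 29.9])
grid = np.linspace(0.0, 40.0, 401)
density = gaussian_kde_1d(prices, grid, bandwidth=1.0)

# Interior local maxima of the density: one per group of similar commodities.
is_peak = (density[1:-1] > density[:-2]) & (density[1:-1] > density[2:])
peak_locations = grid[1:-1][is_peak]
print(peak_locations)
```

The bandwidth controls how coarsely commodities are grouped: a larger value merges nearby groups, a smaller one splits them, so it would need tuning on real attribute data.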
When the commodities have many attributes (enough to make the data high-dimensional), commodity similarity is quantified by using Andrews curves to compare the differences between commodities across attributes. An Andrews curve converts a high-dimensional data point into a finite Fourier series expressed as a sum of trigonometric functions; the similarity between commodities is judged by how close their curves are.
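As a rough illustration of the Andrews-curve comparison, the sketch below uses the standard Andrews form f_x(t) = x1/sqrt(2) + x2 sin t + x3 cos t + x4 sin 2t + x5 cos 2t + ... and measures how close two curves are by their root-mean-square gap. The helper names, the RMS closeness measure and the attribute vectors are assumptions, not taken from the patent, and the attributes are assumed to be pre-scaled to comparable ranges.

```python
import numpy as np

def andrews_curve(x, t):
    """Evaluate the Andrews curve of feature vector `x` at angles `t`."""
    x = np.asarray(x, dtype=float)
    t = np.asarray(t, dtype=float)
    # Constant term x1 / sqrt(2).
    curve = np.full_like(t, x[0] / np.sqrt(2.0))
    for i, coeff in enumerate(x[1:], start=1):
        harmonic = (i + 1) // 2  # 1, 1, 2, 2, 3, 3, ...
        if i % 2 == 1:
            curve += coeff * np.sin(harmonic * t)
        else:
            curve += coeff * np.cos(harmonic * t)
    return curve

def curve_distance(a, b, n_points=1000):
    """Root-mean-square gap between the Andrews curves of `a` and `b`."""
    t = np.linspace(-np.pi, np.pi, n_points, endpoint=False)
    gap = andrews_curve(a, t) - andrews_curve(b, t)
    return float(np.sqrt(np.mean(gap ** 2)))

# Hypothetical, already-scaled attribute vectors of three commodities.
item_a = [0.20, 0.50, 0.10, 0.90, 0.40]
item_b = [0.25, 0.45, 0.15, 0.85, 0.40]
item_c = [0.90, 0.10, 0.80, 0.20, 0.70]

print(curve_distance(item_a, item_b))  # small gap: similar commodities
print(curve_distance(item_a, item_c))  # larger gap: dissimilar commodities
```

Because the trigonometric terms are orthogonal over a full period, this RMS gap is proportional to the Euclidean distance between the attribute vectors, which is why closeness of the curves is a reasonable proxy for similarity.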
After a set of highly similar commodities has been obtained, a further similarity calculation is performed using the NearestNeighbors machine learning algorithm.
The similarity calculation proceeds as follows:
A training sample set is prepared in which every sample has a label; that is, the correspondence between each sample in the training set and its class is known;
When new, unlabeled data is input, each feature of the new data is compared with the corresponding feature of each sample in the training set, and the class label of the samples most similar to the new data is then extracted. Usually only the k most similar samples in the data set are selected, and the class label that occurs most often among them is taken as the class label of the new data.
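The labeling procedure described above is the classic k-nearest-neighbors vote, which can be sketched as follows. This is a minimal NumPy version that assumes Euclidean distance; the function name `knn_classify`, the feature values and the class labels are hypothetical.

```python
from collections import Counter

import numpy as np

def knn_classify(train_x, train_labels, new_x, k=3):
    """Label `new_x` by majority vote among its k nearest training samples."""
    train_x = np.asarray(train_x, dtype=float)
    new_x = np.asarray(new_x, dtype=float)
    # Euclidean distance from the new sample to every training sample.
    distances = np.linalg.norm(train_x - new_x, axis=1)
    # Indices of the k closest training samples.
    nearest = np.argsort(distances)[:k]
    votes = Counter(train_labels[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical labeled commodities described by two scaled attributes.
train_x = [[1.0, 1.1], [1.2, 0.9], [0.9, 1.0],
           [5.0, 5.2], [5.1, 4.9], [4.8, 5.0]]
train_labels = ["snack", "snack", "snack",
                "beverage", "beverage", "beverage"]

print(knn_classify(train_x, train_labels, [1.1, 1.0]))
print(knn_classify(train_x, train_labels, [4.9, 5.1]))
```

An odd k avoids tied votes when there are two classes; attributes on very different scales should be normalized first so that no single attribute dominates the distance.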
Using the NearestNeighbors nearest-neighbor method, the method computes the minimum similarity distance between pairs of commodities and thereby accurately determines, for each commodity, the other commodity most similar to it. The commodities identified in this way provide an important basis for further analysis.
The smaller the similarity distance between two commodities, the higher their similarity. Practical tests show that the corresponding attributes of two commodity rows matched in this way are very close.
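To make the nearest-neighbor step concrete, the sketch below finds, for each commodity, the other commodity at the smallest distance. It is a brute-force NumPy stand-in rather than the NearestNeighbors implementation the patent refers to, and it assumes Euclidean distance; the function name and the attribute rows are hypothetical.

```python
import numpy as np

def most_similar_commodity(features):
    """For each row, return the index of and distance to its nearest other row."""
    features = np.asarray(features, dtype=float)
    # Pairwise Euclidean distances, shape (n, n).
    diffs = features[:, None, :] - features[None, :, :]
    dist = np.sqrt((diffs ** 2).sum(axis=2))
    # A commodity must not be matched with itself.
    np.fill_diagonal(dist, np.inf)
    nearest = dist.argmin(axis=1)
    return nearest, dist[np.arange(len(features)), nearest]

# Hypothetical attribute rows of four commodities.
features = [[1.0, 2.0], [1.1, 2.1], [5.0, 5.0], [5.2, 4.9]]
nearest, distances = most_similar_commodity(features)
print(nearest)
print(distances)
```

The pairwise matrix costs memory quadratic in the number of commodities, so for large catalogs an indexed neighbor search would be the more practical choice.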
After a pair of competing commodities is launched, the market generally needs some time to accept them. As word of mouth spreads and the demonstration effect diffuses, sales tend to rise gradually; once the market is saturated, a pair of competing commodities often behaves distinctively in terms of sales. For example, rapid growth in the sales of one commodity may cause the sales of another to decline; or, while one commodity shows weakness over a short period, with stagnant sales and feeble growth, the other begins to expand its market and its sales gradually rise. A pair of commodities exhibiting this behavior often has a competitive relationship. Finally, to improve the reliability of the data, the commodities found to be in a competitive relationship must be filtered further. When both commodities hold a certain market share, competition may keep their sales relatively close for a period of time, so these data need to be screened further. In the end, multiple commodity pairs with a competitive relationship are obtained.
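The sales pattern described above (one commodity rising while the other falls) can be screened with the Pearson correlation matrix mentioned in the technical field. The patent does not give a formula or threshold, so the following is only one plausible sketch: it flags pairs whose sales series are strongly negatively correlated. The function name, the threshold of -0.8 and the weekly sales figures are assumptions.

```python
import numpy as np

def competing_pairs(sales, names, threshold=-0.8):
    """Return pairs whose sales series have Pearson correlation <= threshold."""
    # Each row of `sales` is one commodity's sales over the same periods.
    corr = np.corrcoef(np.asarray(sales, dtype=float))
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if corr[i, j] <= threshold:
                pairs.append((names[i], names[j], float(corr[i, j])))
    return pairs

# Hypothetical weekly sales of three similar commodities.
names = ["A", "B", "C"]
sales = [
    [100, 110, 125, 140, 160, 175],  # A: steadily growing
    [170, 160, 150, 130, 115, 100],  # B: declining as A grows
    [80, 82, 79, 81, 80, 83],        # C: roughly flat
]

pairs = competing_pairs(sales, names)
print(pairs)
```

A negative correlation alone does not prove competition, which matches the text's point that candidate pairs still need further screening, for instance by restricting the comparison to commodities already found to be highly similar.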
The beneficial effects of the present invention are as follows:
By using visualization tools to make a linear comparison of the sales volumes of the commodities in these experiments, the invention finds that two commodities show a certain competitive relationship within a certain period of time. This makes the work of professionals more convenient and provides decision support for commodity competition investigation and evidence collection. It ensures the reliability of the data while greatly reducing the data survey workload of product marketing staff and decision-making managers, and it provides a sound basis for the future planning of commodities.
Description of the drawings
Fig. 1 is a distribution map of different commodities on two attributes.
Detailed description
The present invention is further described below with reference to the accompanying drawing and specific embodiments:
Embodiment 1:
A machine-learning-based method for analyzing commodity competition relationships, in which the method quantifies the similarity of commodities by their attributes, filters the original commodity data, and filters out abnormal business data, so as to meet the business requirements of network analysis.
The commodity data is the business data of the commodities within a specified time period; this portion of the data can support a relatively complete further linear analysis.
The commodity data is selected as follows:
From at least one candidate commodity data object, obtain at least one target data object that is most similar to the commodity data object to be analyzed over at least one period, including: determining the target segment of a commodity data object and selecting the period threshold for the commodity sales data.
Embodiment 2
Commodity similarity is quantified by plotting pairs of commodity attribute values on a two-dimensional chart for comparison.
Embodiment 3
As shown in Fig. 1, commodity similarity is quantified by representing each commodity as a point and carrying out visual analysis with a scatter plot, which is more intuitive and effective.
Each point in the figure represents one commodity, and the two coordinate axes represent two attributes of the commodities. Points that cluster together indicate commodities that behave similarly on attribute 3 and attribute 4 and are therefore relatively similar.
Embodiment 4
Commodity similarity is quantified through kernel-function fitting: Gaussian kernel functions are used to accurately fit the distribution of the data attribute values. Highly similar commodities gather in the same Gaussian distribution, while commodities of differing similarity fall into different Gaussian kernel functions. This yields visual measures such as the category boundaries of different groups and the density and color depth of the commodities within each group.
Embodiment 5
When the commodities have many attributes (enough to make the data high-dimensional), commodity similarity is quantified by using Andrews curves to compare the differences between commodities across attributes. An Andrews curve converts a high-dimensional data point into a finite Fourier series expressed as a sum of trigonometric functions; the similarity between commodities is judged by how close their curves are.
Embodiment 6
After a set of highly similar commodities has been obtained, a further similarity calculation is performed using the NearestNeighbors machine learning algorithm.
The similarity calculation proceeds as follows:
A training sample set is prepared in which every sample has a label; that is, the correspondence between each sample in the training set and its class is known;
When new, unlabeled data is input, each feature of the new data is compared with the corresponding feature of each sample in the training set, and the class label of the samples most similar to the new data is then extracted. Usually only the k most similar samples in the data set are selected, and the class label that occurs most often among them is taken as the class label of the new data.
Using the NearestNeighbors nearest-neighbor method, the method computes the minimum similarity distance between pairs of commodities and thereby accurately determines, for each commodity, the other commodity most similar to it. The commodities identified in this way provide an important basis for further analysis.
The smaller the similarity distance between two commodities, the higher their similarity. Practical tests show that the corresponding attributes of two commodity rows matched in this way are very close.
The embodiments serve only to illustrate the present invention and do not limit it. Persons of ordinary skill in the relevant technical field may make various changes and modifications without departing from the spirit and scope of the invention; all equivalent technical solutions therefore also fall within the scope of the invention, and the scope of patent protection of the invention shall be defined by the claims.

Claims (10)

1. A machine-learning-based method for analyzing commodity competition relationships, characterized in that the method quantifies the similarity of commodities by their attributes, filters the original commodity data, and filters out abnormal business data, so as to meet the business requirements of network analysis.
2. The machine-learning-based method for analyzing commodity competition relationships according to claim 1, characterized in that the commodity data is the business data of the commodities within a specified time period.
3. The machine-learning-based method for analyzing commodity competition relationships according to claim 2, characterized in that the commodity data is selected as follows:
From at least one candidate commodity data object, at least one target data object most similar to the commodity data object to be analyzed over at least one period is obtained, including the target segment of the commodity data object and the period threshold of the commodity sales data.
4. The machine-learning-based method for analyzing commodity competition relationships according to claim 3, characterized in that commodity similarity is quantified by plotting pairs of commodity attribute values on a two-dimensional chart for comparison.
5. The machine-learning-based method for analyzing commodity competition relationships according to claim 3, characterized in that commodity similarity is quantified by representing each commodity as a point and carrying out visual analysis with a scatter plot.
6. The machine-learning-based method for analyzing commodity competition relationships according to claim 3, characterized in that commodity similarity is quantified through kernel-function fitting, in which Gaussian kernel functions are used to fit the distribution of the data attribute values.
7. The machine-learning-based method for analyzing commodity competition relationships according to claim 3, characterized in that the commodities have multiple attributes, and commodity similarity is quantified by using Andrews curves to compare the differences between commodities across attributes.
8. The machine-learning-based method for analyzing commodity competition relationships according to any one of claims 4 to 7, characterized in that, after a set of highly similar commodities has been obtained, a further similarity calculation is performed using the NearestNeighbors machine learning algorithm.
9. The machine-learning-based method for analyzing commodity competition relationships according to claim 8, characterized in that the similarity calculation proceeds as follows:
A training sample set is prepared in which every sample has a label;
When new, unlabeled data is input, each feature of the new data is compared with the corresponding feature of each sample in the training set, and the class label of the samples most similar to the new data is then extracted.
10. The machine-learning-based method for analyzing commodity competition relationships according to claim 9, characterized in that the method uses the NearestNeighbors nearest-neighbor method to compute the minimum similarity distance between pairs of commodities and thereby determine, for each commodity, the other commodity most similar to it; the commodities identified in this way provide an important basis for further analysis.
CN201810706947.2A 2018-07-02 2018-07-02 A kind of Commodity Competition relationship analysis method based on machine learning Pending CN108776911A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810706947.2A CN108776911A (en) 2018-07-02 2018-07-02 A kind of Commodity Competition relationship analysis method based on machine learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810706947.2A CN108776911A (en) 2018-07-02 2018-07-02 A kind of Commodity Competition relationship analysis method based on machine learning

Publications (1)

Publication Number Publication Date
CN108776911A true CN108776911A (en) 2018-11-09

Family

ID=64030809

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810706947.2A Pending CN108776911A (en) 2018-07-02 2018-07-02 A kind of Commodity Competition relationship analysis method based on machine learning

Country Status (1)

Country Link
CN (1) CN108776911A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111275511A (en) * 2018-12-05 2020-06-12 北京京东尚科信息技术有限公司 Method, device, electronic equipment and medium for identifying competitive commodities
CN113129105A (en) * 2021-04-23 2021-07-16 北京沃东天骏信息技术有限公司 Object data processing method, device, equipment, storage medium and program

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106815347A (en) * 2017-01-13 2017-06-09 沈阳工学院 Improvement slope one Collaborative Filtering Recommendation Algorithms based on commodity similarity
CN106919619A (en) * 2015-12-28 2017-07-04 阿里巴巴集团控股有限公司 A kind of commercial articles clustering method, device and electronic equipment
CN107392644A (en) * 2017-06-19 2017-11-24 华南理工大学 A kind of commodity purchasing predicts modeling method

Legal Events

Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20181109)