CN116664173B - Big data model-based bid analysis method, terminal and storage medium - Google Patents

Info

Publication number
CN116664173B
Authority
CN
China
Prior art keywords
bid
evaluation
determining
evaluation index
cluster
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310960482.4A
Other languages
Chinese (zh)
Other versions
CN116664173A (en)
Inventor
高渐朋
邱洪涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengdu Ict Information Technology Co ltd
Original Assignee
Chengdu Ict Information Technology Co ltd
Application filed by Chengdu Ict Information Technology Co ltd
Priority to CN202310960482.4A
Publication of CN116664173A
Application granted
Publication of CN116664173B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0282 Rating or review of business operators or products
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30 Computing systems specially adapted for manufacturing

Abstract

The application discloses a bid analysis method, a terminal and a storage medium based on a big data model. The method comprises the following steps: crawling commodity descriptions and outputting commodity types through a large language model; screening a plurality of possible bid products; crawling the evaluation indexes of all possible bid products for a plurality of consecutive days; determining the periodic evaluation indexes of each possible bid product, grading the periodic evaluation indexes, and determining their weights; determining a composite score of each possible bid product according to the weights; calculating and determining the number of clusters and the cluster centers; and outputting the cluster centers as the final bid products. By using crawler technology and a large language model, the application can automatically collect and analyze large-scale data. By continuously crawling the evaluation indexes of the possible bid products and calculating the composite score of each of them, it can accurately classify and position the bid products with a K-means clustering algorithm and screen the final bid products from the possible bid products.

Description

Big data model-based bid analysis method, terminal and storage medium
Technical Field
The application relates to the field of big data analysis, in particular to a bid analysis method, a terminal and a storage medium based on a big data model.
Background
In a cross-border trade business environment, bid analysis is a common business strategy. It involves in-depth research on competitors' products, services, sales strategies and the like, in order to better understand the market environment and formulate effective market strategies. Traditional bid analysis methods mainly rely on manual collection and analysis of data. The collection process can be influenced by human factors, which affects the quality and integrity of the data; manual processing of large-scale data is also time consuming, so both efficiency and accuracy suffer.
Conventional bid analysis methods generally can only perform static analysis, and dynamic changes of markets are difficult to capture. For example, competitor's sales strategies may change over time, and conventional bid analysis methods may not be able to capture these changes in time.
Disclosure of Invention
The application aims to solve the technical problem that the traditional bid analysis method mainly relies on manual collection and analysis of data, and aims to provide a bid analysis method, a terminal and a storage medium based on a big data model, so that automatic analysis of bids of a sales platform is realized through the big data model.
The application is realized by the following technical scheme:
a bid analysis method based on a big data model comprises the following steps:
selecting a sales platform needing to perform bid analysis, crawling commodity description set by a merchant, inputting the commodity description into a large language model, and outputting commodity types through the large language model;
determining self commodities to be analyzed, and screening a plurality of possible bidding products from commodity types output by the large language model;
and crawling the evaluation indexes of all possible bid products for a plurality of continuous days, wherein the evaluation indexes comprise: sales quantity, customer evaluation content, store score, and price;
data cleaning is carried out on the collected evaluation indexes;
determining the periodic evaluation index of each possible bid item, grading the periodic evaluation index, and determining the weight of the periodic evaluation index;
determining a composite score of each possible bid according to the weights;
introducing the comprehensive score into a K-means algorithm, and calculating and determining the number of clusters and a cluster center;
and outputting the clustering center as a final bid product.
Specifically, a period T is set, and the periodic evaluation index is the change in the evaluation index within the period T. Let the periodic evaluation index of the $i$-th possible bid product be $x_i$, comprising: newly added sales quantity $x_{i1}$, newly added customer evaluation quantity $x_{i2}$, newly added customer evaluation content $x_{i3}$, store score $x_{i4}$ and price $x_{i5}$, i.e. $x_i = (x_{i1}, x_{i2}, x_{i3}, x_{i4}, x_{i5})$.
The method for grading the periodic evaluation index comprises the following steps:
the newly added sales quantity is divided into level 1, level 2, level 3, level 4 and level 5 from small to large;
the number of newly added customer evaluations is divided into level 1, level 2, level 3, level 4 and level 5 from small to large;
the prices are divided into level 1, level 2, level 3, level 4 and level 5 from high to low;
the store scores are divided into level 1, level 2, level 3, level 4 and level 5 from low to high;
and the newly added customer evaluation content is identified through a large language model and divided into level 1, level 2, level 3, level 4 and level 5 according to negative, somewhat negative, neutral, somewhat positive and positive evaluations.
Specifically, the method for determining the weight of the periodic evaluation index comprises the following steps:
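normalizing the periodic evaluation index $x_{ij}$ (the level of the $j$-th index of the $i$-th possible bid product): $p_{ij} = x_{ij} / \sum_{i=1}^{n} x_{ij}$, where $n$ is the number of possible bid products and $p_{ij}$ is the normalized data;
calculating the information entropy of each periodic evaluation index: $e_j = -\frac{1}{\ln n} \sum_{i=1}^{n} p_{ij} \ln p_{ij}$, where $e_j$ is the information entropy of the $j$-th evaluation index, and if $p_{ij} = 0$, $p_{ij} \ln p_{ij} = 0$ is defined;
calculating the weight of each periodic evaluation index: $w_j = (1 - e_j) / \sum_{j=1}^{5} (1 - e_j)$, where $w_j$ is the weight of the $j$-th evaluation index.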
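The three steps above can be sketched in a few lines of Python. This is a minimal illustration of the standard entropy weight method, not the applicant's implementation: the function name `entropy_weights`, the sum normalization of each column, and the fallback to equal weights when no index shows any variation are assumptions made for the example.

```python
import math

def entropy_weights(levels):
    """levels: n rows (possible bid products) x m columns (index levels 1-5), n >= 2."""
    n, m = len(levels), len(levels[0])
    col_sums = [sum(row[j] for row in levels) for j in range(m)]
    # Sum-normalise every column so that its entries add up to 1.
    p = [[row[j] / col_sums[j] for j in range(m)] for row in levels]
    divergence = []
    for j in range(m):
        # Convention: a term with p = 0 contributes 0 to the entropy.
        h = sum(p[i][j] * math.log(p[i][j]) for i in range(n) if p[i][j] > 0)
        entropy = -h / math.log(n)
        # Clamp tiny negative values caused by floating-point rounding.
        divergence.append(max(0.0, 1.0 - entropy))
    total = sum(divergence)
    if total == 0:
        # Assumption: if no index varies at all, fall back to equal weights.
        return [1.0 / m] * m
    return [d / total for d in divergence]

# Column 0 varies between products, column 1 is identical for all of them.
weights = entropy_weights([[1, 3], [5, 3], [3, 3]])
```

An index on which every possible bid product has the same level carries no discriminating information, so it receives a weight of (approximately) zero, while indexes that vary more between products receive larger weights.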
Specifically, the method for determining the composite score of each possible bid comprises the following steps:
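constructing a data matrix $X = (x_{ij})_{n \times 5}$, where $x_{ij}$ is the level of the $j$-th index of the $i$-th possible bid product, and constructing the positive and negative ideal solutions: $x_j^{+} = \max_i x_{ij}$, $x_j^{-} = \min_i x_{ij}$;
determining the weighted distance between each possible bid product and the positive and negative ideal solutions: $D_i^{+} = \sqrt{\sum_{j=1}^{5} w_j (x_{ij} - x_j^{+})^2}$, $D_i^{-} = \sqrt{\sum_{j=1}^{5} w_j (x_{ij} - x_j^{-})^2}$, where $w_j$ is the weight of the $j$-th evaluation index;
calculating the composite score of each possible bid product: $C_i = D_i^{-} / (D_i^{+} + D_i^{-})$.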
As one embodiment, the method for determining the number of clusters and the cluster centers comprises:
determining the sample set $S = \{s_1, s_2, \dots, s_n\}$ formed by the composite scores of all possible bid products, and calculating the average Euclidean distance between the $i$-th sample $s_i$ and the remaining samples in the set: $\bar{d}_i = \frac{1}{n-1} \sum_{j \neq i} d(s_i, s_j)$, where $s_j$ is any sample in the set other than $s_i$, $d(s_i, s_j)$ is the distance from $s_i$ to $s_j$, and $n$ is the number of possible bid products;
calculating the number of samples lying within the Euclidean distance $\bar{d}_i$ of the center $s_i$: $\rho_i = \sum_{j \neq i} u(\bar{d}_i - d(s_i, s_j))$, where the value of the function $u(\cdot)$ is 1 if its argument is not less than 0 and 0 if its argument is less than 0;
calculating the number of samples $\rho_i$ within the Euclidean distance for all samples and arranging them in descending order;
setting the number of clusters $k$, selecting the first $k$ samples as cluster centers, and calculating the average distance from the samples in each cluster to its cluster center: $r_a = \frac{1}{N_a} \sum_{s \in G_a} d(s, c_a)$, $a = 1, 2, \dots, k$, where $N_a$ is the total number of samples in the $a$-th cluster $G_a$, $c_a$ is the cluster center of the $a$-th cluster, and $s$ ranges over the other samples in the cluster;
calculating the evaluation value $E_{ab} = (r_a + r_b) / d(c_a, c_b)$, where $c_a$ and $c_b$ are any two cluster centers, $r_a$ and $r_b$ are the average distances from the samples within the respective clusters to their cluster centers, and $d(c_a, c_b)$ is the distance between the two cluster centers;
setting an evaluation threshold $\varepsilon$: if, after all cluster centers have been traversed, the evaluation value of every pair of cluster centers satisfies $E_{ab} \le \varepsilon$, the $k$ cluster centers are output; if the evaluation value of some pair of cluster centers satisfies $E_{ab} > \varepsilon$, $k$ is corrected and the evaluation is repeated.
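The density-based ordering used to pick candidate cluster centers in this embodiment can be illustrated as follows. This is a hedged sketch: because each sample is a single composite score, the Euclidean distance is taken to be the absolute difference between two scores, and the function name `density_order` is hypothetical.

```python
def density_order(scores):
    """Return (densities, order): per-sample neighbour counts and indexes sorted by density."""
    n = len(scores)
    densities = []
    for i, s in enumerate(scores):
        # Distances from sample i to every other sample (1-D Euclidean distance).
        dists = [abs(s - t) for j, t in enumerate(scores) if j != i]
        mean_dist = sum(dists) / (n - 1)
        # Count the samples lying within the average distance of sample i.
        densities.append(sum(1 for d in dists if mean_dist - d >= 0))
    # Descending density; Python's sort is stable, so ties keep their input order.
    order = sorted(range(n), key=lambda i: densities[i], reverse=True)
    return densities, order

densities, order = density_order([0.0, 1.0, 2.0, 3.0, 20.0])
```

In the toy data the isolated score 20.0 has the fewest neighbours within its average distance, so it is ranked last and would be the least preferred candidate for a cluster center.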
As another embodiment, a method of determining the number of clusters and a cluster center includes:
constructing a K-means mathematical model, determining the number of clusters by the elbow method, and randomly selecting the initial cluster centers;
assigning each of the remaining B possible bid products to the nearest initial cluster center to form clusters, where B is the difference between the total number of possible bid products and the number of clusters;
calculating new cluster centers based on the formed clusters, and repeating the previous step;
and judging whether the clustering end condition is met; if not, repeating the previous step, and if so, outputting the number of clusters and the cluster centers.
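The assign-and-update loop of this embodiment corresponds to the classic Lloyd iteration for K-means. The sketch below applies it to one-dimensional composite scores; the function name `kmeans_1d`, the fixed random seed, the tolerance and the iteration cap are assumptions chosen for illustration, and the number of clusters `k` is passed in rather than derived by the elbow method.

```python
import random

def kmeans_1d(scores, k, max_iter=100, tol=1e-9, seed=0):
    """Cluster 1-D scores into k groups; returns (centers, clusters)."""
    rng = random.Random(seed)
    centers = rng.sample(scores, k)  # random initial cluster centers
    clusters = [[] for _ in range(k)]
    for _ in range(max_iter):
        # Assignment step: each score joins its nearest center.
        clusters = [[] for _ in range(k)]
        for s in scores:
            nearest = min(range(k), key=lambda c: abs(s - centers[c]))
            clusters[nearest].append(s)
        # Update step: each center moves to the mean of its cluster.
        new_centers = [
            sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)
        ]
        shift = max(abs(a - b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:  # end condition: centers have stopped moving
            break
    return centers, clusters

centers, clusters = kmeans_1d([0.10, 0.12, 0.11, 0.90, 0.88, 0.92], k=2)
```

A computed center is a mean of scores and need not coincide with an actual product; one way to report a concrete bid product would be to take the product whose score is closest to each center, which is an interpretation rather than something the text specifies.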
Optionally, the method for cleaning the collected evaluation indexes comprises: detecting and removing noise data and irrelevant data; and removing blank data fields and white noise in the knowledge background.
Optionally, the period T is a set arbitrary time interval.
A big data model based bid analysis terminal comprising a memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing the steps of a big data model based bid analysis method as described above when executing the computer program.
A computer readable storage medium storing a computer program which, when executed by a processor, implements the steps of the method as claimed in any one of the preceding claims.
Compared with the prior art, the application has the following advantages and beneficial effects:
By using crawler technology and a large language model, the application can identify commodity types from the names set by merchants, so that large-scale data can be collected and analyzed automatically; this greatly improves the efficiency of data analysis and reduces the influence of human factors on data quality.
By crawling the evaluation indexes of the possible bid products for a plurality of consecutive days and calculating the composite score of each of them, the bid products can be accurately classified and positioned with the K-means clustering algorithm, and the final bid products can be screened from the possible bid products.
The bid analysis method based on the big data model can effectively solve the problems. By using a large language model and a K-means clustering algorithm, large-scale data can be automatically collected and analyzed, and the efficiency and accuracy of data analysis are improved. Meanwhile, dynamic changes of the market can be captured by continuously crawling evaluation indexes of the bid product for a plurality of days. In addition, by determining the periodic evaluation index and weight of each possible bid item, the strategy of the competitor can be deeply understood, thereby developing a more effective market strategy.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this specification, illustrate exemplary embodiments of the application and together with the description serve to explain the principles of the application.
Fig. 1 is a schematic flow chart of a bid analysis method based on a big data model according to the present application.
Fig. 2 is a flow chart of a method for determining the weight of a periodic evaluation index according to the present application.
FIG. 3 is a flow chart of a method of determining the number of clusters and cluster centers according to the present application.
Detailed Description
The present application will be described in further detail with reference to the drawings and embodiments, for the purpose of making the objects, technical solutions and advantages of the present application more apparent. It is to be understood that the specific embodiments described herein are merely illustrative of the application and do not restrict it.
It should be further noted that, for convenience of description, only the portions related to the present application are shown in the drawings.
Embodiments of the present application and features of the embodiments may be combined with each other without conflict. The present application will be described in detail below with reference to the accompanying drawings in conjunction with embodiments.
Example 1
As shown in fig. 1, a bid analysis method based on a big data model includes:
First, selecting a sales platform on which bid analysis is to be performed, crawling the commodity descriptions set by merchants, inputting the commodity descriptions into a large language model, and outputting commodity types through the large language model.
A web crawler is a program for automatically browsing the internet, and can automatically collect information on web pages.
A large language model is an artificial intelligence model that can understand and process natural language.
The main objective of this step is to determine the scope and target of the analysis and to collect relevant data. First, one or more sales platforms are selected for analysis. These platforms may include Taobao, JD.com, magnosis, Amazon, eBay, Temu, etc. The commodity descriptions set by merchants are then crawled using web crawler technology.
To improve a commodity's exposure, a merchant may fill the commodity description with various informal terms that increase the probability of being retrieved. In this embodiment, the crawled commodity description is therefore input into the large language model, which interprets the description and summarizes the actual commodity type it corresponds to.
The main technical problem of this step is how to efficiently crawl and process large amounts of commodity description data, and how to accurately identify commodity categories using large language models.
Second, determining the own commodity to be analyzed, and screening a plurality of possible bid products from the commodity types output by the large language model.
In this step, the own commodity to be analyzed is determined first. Commodities similar or related to it are then selected, from the commodity types output by the large language model, as possible bid products.
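As a rough illustration of this screening step, the sketch below keeps every crawled listing whose commodity type matches that of the own commodity. The record layout (keys `id` and `type`), the sample data and the exact-match rule are hypothetical; the text only requires that similar or related commodities be selected, and the types are assumed to have already been produced by the large language model.

```python
def screen_possible_bids(listings, own_id):
    """Return listings that share the own commodity's type, excluding the own commodity."""
    own = next(item for item in listings if item["id"] == own_id)
    return [
        item
        for item in listings
        if item["id"] != own_id and item["type"] == own["type"]
    ]

# Hypothetical listings whose "type" field was assigned by the language model.
listings = [
    {"id": "own-1", "type": "wireless earbuds"},
    {"id": "a", "type": "wireless earbuds"},
    {"id": "b", "type": "phone case"},
    {"id": "c", "type": "wireless earbuds"},
]
possible_bids = screen_possible_bids(listings, "own-1")
```

A looser rule, such as matching on a parent category, could be substituted where related rather than identical types should count as possible bid products.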
Third, crawling the evaluation indexes of all possible bid products for a plurality of consecutive days, wherein the evaluation indexes comprise: sales quantity, customer evaluation content, store score, and price.
By crawling this information for a plurality of consecutive days, the sales condition of each product during the period can be derived in the subsequent processing, so that potential hot-selling products can be observed.
After crawling, data cleaning is carried out on the collected evaluation indexes, comprising: detecting and removing noise data and irrelevant data; and removing blank data fields and white noise in the knowledge background.
Data cleaning mainly comprises removing duplicate data, filling in missing data, correcting erroneous data and the like, so as to ensure the quality and accuracy of the data.
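A minimal cleaning pass over crawled records might look like the sketch below. It is an illustration under stated assumptions: the field names (`product_id`, `date`, `sales`, `price`) are hypothetical, and records with blank required fields are simply dropped here rather than filled in, which is a simplification of the cleaning described above.

```python
def clean_records(records, required=("product_id", "date", "sales", "price")):
    """Drop records with blank required fields and repeated (product, date) snapshots."""
    seen = set()
    cleaned = []
    for rec in records:
        # Remove records with missing or blank data fields.
        if any(rec.get(field) in (None, "") for field in required):
            continue
        # Remove repeated snapshots of the same product on the same day.
        key = (rec["product_id"], rec["date"])
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(rec)
    return cleaned

raw = [
    {"product_id": "a", "date": "2023-08-01", "sales": 120, "price": 19.9},
    {"product_id": "a", "date": "2023-08-01", "sales": 120, "price": 19.9},  # duplicate
    {"product_id": "b", "date": "2023-08-01", "sales": None, "price": 9.9},  # blank field
    {"product_id": "b", "date": "2023-08-02", "sales": 0, "price": 9.9},
]
cleaned = clean_records(raw)
```

Note that a legitimate value of zero, such as no sales on a given day, is kept; only `None` and empty strings are treated as blank.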
Fourth, determining the periodic evaluation indexes of each possible bid product, grading the periodic evaluation indexes, and determining their weights.
That is, a period T is set; the period T may be a day, a week, a month, a quarter or a year. The individual indexes within the evaluation indexes carry different evaluation weights for the commodity as a whole.
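Because the periodic evaluation index is the change of an index over the period T, it can be derived from the daily crawl results by differencing. The sketch below assumes that the platform exposes cumulative totals (for example total units sold) and that one snapshot is stored per consecutive day; the function name `periodic_increase` is hypothetical.

```python
def periodic_increase(daily_totals, period_days=7):
    """Change of a cumulative index over the last period_days days.

    daily_totals: cumulative values, one per consecutive day, oldest first.
    """
    if len(daily_totals) <= period_days:
        raise ValueError("need at least period_days + 1 daily snapshots")
    return daily_totals[-1] - daily_totals[-1 - period_days]

# Eight daily snapshots of cumulative sales give one week of newly added sales.
new_sales = periodic_increase([100, 110, 125, 130, 150, 170, 200, 260], period_days=7)
```

The same differencing applies to the cumulative number of customer evaluations, whereas store score and price are read directly from the snapshots.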
Fifth, determining the composite score of each possible bid product according to the weights.
Sixth, introducing the composite scores into the K-means algorithm, and calculating and determining the number of clusters and the cluster centers. Introducing the composite scores into the K-means algorithm classifies the bid products, so that their distribution and characteristics can be better understood.
Seventh, outputting the cluster centers as the final bid products. This helps an enterprise determine its bid products more accurately and formulate a better market strategy.
Through this embodiment, the efficiency and accuracy of data analysis can be improved: automatic data collection and processing, together with in-depth data analysis, make it possible to handle large-scale data and greatly reduce labor and operating costs.
Example 2
This embodiment describes the fourth step and the fifth step. A period of one week is taken as an example, i.e. the sales condition of the possible bid products within one week is determined.
The periodic evaluation index is the change in the evaluation index within the period T. Let the periodic evaluation index of the $i$-th possible bid product be $x_i$, comprising: newly added sales quantity $x_{i1}$, newly added customer evaluation quantity $x_{i2}$, newly added customer evaluation content $x_{i3}$, store score $x_{i4}$ and price $x_{i5}$, i.e. $x_i = (x_{i1}, x_{i2}, x_{i3}, x_{i4}, x_{i5})$.
The method for grading the periodic evaluation index comprises the following steps:
The newly added sales quantity is divided into level 1, level 2, level 3, level 4 and level 5 from small to large; this reflects the sales condition and market acceptance of the bid product. In practice, level 1 can be set to 10 sales or fewer, level 2 to 100 or fewer, level 3 to 1000 or fewer, level 4 to 10000 or fewer, and level 5 to more than 10000.
The number of newly added customer evaluations is divided into level 1, level 2, level 3, level 4 and level 5 from small to large; this reflects customer feedback on the bid product, but in practice care must be taken to identify fake-order ("brushing") behaviour that may appear in the newly added customer evaluations.
The grading of the number of evaluations may follow the grading of the newly added sales quantity, with the thresholds lowered as appropriate because some customers choose not to leave an evaluation.
The prices are divided into level 1, level 2, level 3, level 4 and level 5 from high to low; this reflects the price positioning of the bid product. The price positioning of the own commodity must also be taken into account: a price higher than that of the own commodity is set to level 1 and a price lower than that of the own commodity is set to level 5, with the intermediate levels divided according to the specific situation.
The store scores are divided into level 1, level 2, level 3, level 4 and level 5 from low to high; this reflects the store service quality of the bid product. Common platforms have a corresponding store scoring mechanism, whose scores can be used directly or corrected according to the specific situation.
The newly added customer evaluation content is identified through a large language model and divided into level 1, level 2, level 3, level 4 and level 5 according to negative, somewhat negative, neutral, somewhat positive and positive evaluations. The customer evaluation content is input into the large language model, which recognizes the natural language and analyzes and summarizes the degree of satisfaction expressed.
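The threshold-based grading of the newly added sales quantity can be expressed with a sorted list of upper bounds. The sketch below uses the example thresholds given above (10, 100, 1000, 10000); treating each bound as inclusive, so that exactly 10000 sales still falls in level 4, is an assumption, as are the names `SALES_BOUNDS` and `level_from_bounds`.

```python
from bisect import bisect_left

# Inclusive upper bounds of levels 1 to 4; anything above the last bound is level 5.
SALES_BOUNDS = [10, 100, 1000, 10000]

def level_from_bounds(value, upper_bounds=SALES_BOUNDS):
    """Map a non-negative count to a level from 1 to len(upper_bounds) + 1."""
    return bisect_left(upper_bounds, value) + 1

levels = [level_from_bounds(v) for v in (0, 10, 11, 1000, 10000, 10001)]
```

The number of newly added evaluations can be graded with the same helper by passing a lower set of bounds, for example `level_from_bounds(count, [5, 50, 500, 5000])`, where those bounds are purely illustrative.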
After the grading of the periodic evaluation indexes is completed, as shown in fig. 2, the method for determining the weights of the periodic evaluation indexes comprises:
A1, normalizing the periodic evaluation index $x_{ij}$: $p_{ij} = x_{ij} / \sum_{i=1}^{n} x_{ij}$, where $n$ is the number of possible bid products, $p_{ij}$ is the normalized data, and $x_{ij}$ is the level of the periodic evaluation index, i.e. $x_{ij} \in \{1, 2, 3, 4, 5\}$. Normalization converts the data to the same scale, which eliminates differences in dimension and magnitude so that the data can be compared on a common scale.
A2, calculating the information entropy of each periodic evaluation index: $e_j = -\frac{1}{\ln n} \sum_{i=1}^{n} p_{ij} \ln p_{ij}$, where $e_j$ is the information entropy of the $j$-th evaluation index, and if $p_{ij} = 0$, $p_{ij} \ln p_{ij} = 0$ is defined. Information entropy measures the complexity of data: the larger the value, the higher the complexity, and vice versa. Calculating the information entropy of each periodic evaluation index therefore reflects the complexity of each evaluation index.
A3, in entropy weight theory, the coefficient of variation can be used to measure the weight: the larger the coefficient of variation of an index, the larger its weight. The weights are obtained by transforming the entropy values, and the weight of each periodic evaluation index is calculated as $w_j = (1 - e_j) / \sum_{j=1}^{5} (1 - e_j)$, where $w_j$ is the weight of the $j$-th evaluation index.
A4, after the weights of all indexes are obtained, the evaluation score of each possible bid product is calculated using the TOPSIS evaluation method, which scores and ranks the possible bid products by comparing their distances to the positive and negative ideal solutions. Because the 5 indexes have the same polarity (larger is better for all of them), a data matrix $X = (x_{ij})_{n \times 5}$ is constructed together with the positive and negative ideal solutions: $x_j^{+} = \max_i x_{ij}$, $x_j^{-} = \min_i x_{ij}$.
A5, determining the weighted distance between each possible bid product and the positive and negative ideal solutions: $D_i^{+} = \sqrt{\sum_{j=1}^{5} w_j (x_{ij} - x_j^{+})^2}$, $D_i^{-} = \sqrt{\sum_{j=1}^{5} w_j (x_{ij} - x_j^{-})^2}$. The weighted distance measures the similarity of data: the smaller the value, the higher the similarity between the data and the ideal solution, and vice versa. Calculating the weighted distance between each possible bid product and the positive and negative ideal solutions therefore reflects how good each possible bid product is.
A6, calculating the composite score of each possible bid product: $C_i = D_i^{-} / (D_i^{+} + D_i^{-})$.
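The scoring by distance to the positive and negative ideal solutions can be sketched as below. This follows the usual TOPSIS relative-closeness form and is only an illustration: the function name `topsis_scores` is hypothetical, and placing each weight directly on the squared difference inside the square root is an assumption about how the weighted distance is defined.

```python
import math

def topsis_scores(levels, weights):
    """levels: n x m index levels (larger is better); weights: m index weights."""
    m = len(weights)
    best = [max(row[j] for row in levels) for j in range(m)]   # positive ideal solution
    worst = [min(row[j] for row in levels) for j in range(m)]  # negative ideal solution
    scores = []
    for row in levels:
        d_pos = math.sqrt(sum(w * (x - b) ** 2 for w, x, b in zip(weights, row, best)))
        d_neg = math.sqrt(sum(w * (x - c) ** 2 for w, x, c in zip(weights, row, worst)))
        total = d_pos + d_neg
        # Relative closeness: 1 at the positive ideal, 0 at the negative ideal.
        scores.append(d_neg / total if total > 0 else 0.0)
    return scores

scores = topsis_scores([[5, 5], [1, 1], [3, 3]], [0.5, 0.5])
```

In this toy input the first product coincides with the positive ideal solution, the second with the negative ideal solution, and the third lies exactly halfway between them.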
Example 3
As shown in fig. 3, the method for determining the number of clusters and the cluster center includes:
B1, determining the sample set $S = \{s_1, s_2, \dots, s_n\}$ formed by the composite scores of all possible bid products, and calculating the average Euclidean distance between the $i$-th sample $s_i$ and the remaining samples in the set: $\bar{d}_i = \frac{1}{n-1} \sum_{j \neq i} d(s_i, s_j)$, where $s_j$ is any sample in the set other than $s_i$, $d(s_i, s_j)$ is the distance from $s_i$ to $s_j$, and $n$ is the number of possible bid products. This calculation reflects the similarity between samples.
B2, calculating the number of samples lying within the Euclidean distance $\bar{d}_i$ of the center $s_i$: $\rho_i = \sum_{j \neq i} u(\bar{d}_i - d(s_i, s_j))$, where the value of the function $u(\cdot)$ is 1 if its argument is not less than 0 and 0 if its argument is less than 0.
The more samples lie within the average Euclidean distance of the sample $s_i$, the more clearly $s_i$ is the centre of a region of the sample set, i.e. the larger the sample density $\rho_i$. Taking such an $s_i$ as a cluster center makes it easier for the constraint function to converge.
B3, calculating the number of samples $\rho_i$ within the Euclidean distance for all samples and arranging them in descending order, i.e. ordering the samples by density.
B4, setting the number of clusters $k$, selecting the first $k$ samples as cluster centers, and calculating the average distance from the samples in each cluster to its cluster center: $r_a = \frac{1}{N_a} \sum_{s \in G_a} d(s, c_a)$, $a = 1, 2, \dots, k$, where $N_a$ is the total number of samples in the $a$-th cluster $G_a$, $c_a$ is the cluster center of the $a$-th cluster, and $s$ ranges over the other samples in the cluster.
In this step $k$ is a specific value: a value of $k$ is assumed to give the optimal clustering effect and is then checked in the following steps.
B5, calculating the evaluation value $E_{ab} = (r_a + r_b) / d(c_a, c_b)$, where $c_a$ and $c_b$ are any two cluster centers, $r_a$ and $r_b$ are the average distances from the samples within the respective clusters to their cluster centers, and $d(c_a, c_b)$ is the distance between the two cluster centers.
The evaluation value $E_{ab}$ measures the clustering effect. It is a ratio whose numerator is the average distance of the samples within the clusters to their cluster centers and whose denominator is the distance between the two cluster centers. A smaller $E_{ab}$ indicates a better clustering effect: when $E_{ab}$ takes its minimum value, the data in each cluster are closest to their cluster center and the cluster centers are farthest from each other.
B6, setting an evaluation threshold $\varepsilon$: if, after all cluster centers have been traversed, the evaluation value of every pair of cluster centers satisfies $E_{ab} \le \varepsilon$, the $k$ cluster centers are output; if the evaluation value of some pair of cluster centers satisfies $E_{ab} > \varepsilon$, $k$ is corrected and the evaluation is repeated.
The main objective of this step is to determine the optimal number of clusters: by setting the evaluation threshold $\varepsilon$ and comparing the evaluation value $E_{ab}$ with it, the optimal number of clusters can be determined.
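The pairwise evaluation in steps B4 to B6 can be sketched as follows. Several details are assumptions made for the illustration: each sample is assigned to its nearest candidate center, the evaluation value of a pair is taken as the sum of the two clusters' average within-cluster distances divided by the distance between their centers, and the function returns the worst (largest) pairwise value so that it can be compared with a threshold. The name `worst_evaluation` is hypothetical.

```python
def worst_evaluation(scores, centers):
    """Largest pairwise evaluation value over all pairs of candidate centers."""
    k = len(centers)
    clusters = [[] for _ in range(k)]
    for s in scores:
        nearest = min(range(k), key=lambda c: abs(s - centers[c]))
        clusters[nearest].append(s)
    # Average distance from the samples of each cluster to its center.
    radii = [
        sum(abs(s - c) for s in cl) / len(cl) if cl else 0.0
        for cl, c in zip(clusters, centers)
    ]
    worst = 0.0
    for a in range(k):
        for b in range(a + 1, k):
            gap = abs(centers[a] - centers[b])
            value = (radii[a] + radii[b]) / gap if gap > 0 else float("inf")
            worst = max(worst, value)
    return worst

data = [0.0, 1.0, 2.0, 10.0, 11.0, 12.0]
good = worst_evaluation(data, [1.0, 11.0])  # one center per natural group
bad = worst_evaluation(data, [1.0, 2.0])    # both centers in the same group
```

With a threshold between the two values, the first choice of centers would be accepted and the second would trigger a correction of the number or choice of centers; how the number of clusters is corrected is not specified in the text.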
Example 5
This embodiment provides another method for determining the number of clusters and the cluster centers, comprising:
C1, constructing a K-means mathematical model, determining the number of clusters by the elbow method, and randomly selecting the initial cluster centers.
First, a K-means mathematical model is constructed. K-means is a common clustering algorithm that iteratively partitions data into K categories. The number of clusters is then determined by the elbow method, a common technique that plots the clustering effect against the number of clusters and selects the number of clusters at the "elbow" of the curve. Finally, the initial cluster centers are selected at random.
C2, assigning each of the remaining possible bid products (the total number of possible bid products minus the number of clusters) to the nearest initial cluster center to form clusters; i.e. an initial clustering is performed.
C3, calculating new cluster centers based on the formed clusters and repeating step C2; i.e. the possible bid products are assigned to the nearest new cluster centers to form new clusters.
C4, judging whether the clustering end condition is met; if not, repeating step C3, and if so, outputting the number of clusters and the cluster centers. The end condition is typically that the change in the cluster centers is less than a certain threshold or that a maximum number of iterations is reached. The specific threshold can be determined according to the actual situation.
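The elbow method is usually applied by eye to a plot, but a simple numeric stand-in can make the idea concrete. The sketch below measures the clustering effect by the sum of squared errors (SSE) and picks the number of clusters at which the improvement drops most sharply. Both function names and the "largest bend" rule are assumptions for illustration; the text does not prescribe how the elbow is located.

```python
def sse_1d(scores, centers):
    """Sum of squared distances from each score to its nearest center."""
    return sum(min((s - c) ** 2 for c in centers) for s in scores)

def elbow_k(sse_by_k):
    """sse_by_k: {k: SSE} for consecutive k. Returns the k with the sharpest bend."""
    ks = sorted(sse_by_k)
    best_k, best_bend = None, float("-inf")
    for prev, cur, nxt in zip(ks, ks[1:], ks[2:]):
        # Bend = improvement gained on reaching cur minus improvement gained after it.
        bend = (sse_by_k[prev] - sse_by_k[cur]) - (sse_by_k[cur] - sse_by_k[nxt])
        if bend > best_bend:
            best_k, best_bend = cur, bend
    return best_k

# Hypothetical SSE values obtained by running K-means for k = 1..4.
chosen_k = elbow_k({1: 100.0, 2: 20.0, 3: 15.0, 4: 12.0})
```

In the hypothetical values, going from one to two clusters removes most of the error while further clusters add little, so two is selected.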
Example 6
A bid analysis terminal based on a big data model comprises a memory, a processor and a computer program stored in the memory and capable of running on the processor, wherein the processor realizes the steps of the bid analysis method based on the big data model when executing the computer program.
The memory may be used to store software programs and modules, and the processor executes various functional applications of the terminal and data processing by running the software programs and modules stored in the memory. The memory may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an execution program required for at least one function, and the like.
The storage data area may store data created according to the use of the terminal, etc. In addition, the memory may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid-state storage device.
A computer readable storage medium storing a computer program which, when executed by a processor, performs the steps of any one of the methods described above.
Computer readable media may include computer storage media and communication media without loss of generality. Computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data. Computer storage media includes RAM, ROM, EPROM, EEPROM, flash memory or other solid state memory technology, CD-ROM, DVD or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices. Of course, those skilled in the art will recognize that computer storage media are not limited to the ones described above. The above-described system memory and mass storage devices may be collectively referred to as memory.
In the description of the present specification, reference to the terms "one embodiment/manner," "some embodiments/manner," "example," "a particular example," "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment/manner or example is included in at least one embodiment/manner or example of the application. In this specification, the schematic representations of the above terms do not necessarily refer to the same embodiment/manner or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments/modes or examples. In addition, the various embodiments/modes or examples described in this specification, and their features, can be combined by persons skilled in the art provided they do not contradict one another.
Furthermore, the terms "first," "second," and the like, are used for descriptive purposes only and are not to be construed as indicating or implying a relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defining "a first" or "a second" may explicitly or implicitly include at least one such feature. In the description of the present application, the meaning of "plurality" means at least two, for example, two, three, etc., unless specifically defined otherwise.
It will be appreciated by persons skilled in the art that the above embodiments are provided for clarity of illustration only and are not intended to limit the scope of the application. Other variations or modifications of the above-described application will be apparent to those of skill in the art, and are still within the scope of the application.

Claims (7)

1. A bid analysis method based on a big data model is characterized by comprising the following steps:
selecting a sales platform needing to perform bid analysis, crawling commodity description set by a merchant, inputting the commodity description into a large language model, and outputting commodity types through the large language model;
determining one's own commodity to be analyzed, and screening a plurality of possible bid products from the commodity types output by the large language model;
crawling the evaluation indexes of all possible bid products over a plurality of consecutive days, wherein the evaluation indexes comprise: sales quantity, customer evaluation content, store score, and price;
data cleaning is carried out on the collected evaluation indexes;
determining the periodic evaluation index of each possible bid item, grading the periodic evaluation index, and determining the weight of the periodic evaluation index;
determining a composite score of each possible bid according to the weights;
inputting the composite scores into a K-means algorithm, and calculating and determining the number of clusters and the cluster centers;
outputting the cluster centers as the final bid products;
wherein a period T is set, and the periodic evaluation index is the change of the evaluation index within the period T; the periodic evaluation indexes x_ij of the i-th possible bid product comprise: newly added sales quantity x_i1, number of newly added customer evaluations x_i2, newly added customer evaluation content x_i3, store score x_i4, and price x_i5, where j = 1, 2, 3, 4, 5 and each x_ij takes a value in {1, 2, 3, 4, 5};
The method for grading the periodic evaluation index comprises the following steps:
the newly added sales quantity is divided into 1 level, 2 level, 3 level, 4 level and 5 level according to the number from small to large;
the number of the newly added customer evaluations is divided into 1 grade, 2 grade, 3 grade, 4 grade and 5 grade according to the number from small to large;
the prices are classified into 1 level, 2 level, 3 level, 4 level and 5 level according to the high to low price;
the store scores are divided into level 1, level 2, level 3, level 4 and level 5 from low to high;
identifying the newly added customer evaluation content through a large language model, and classifying it into level 1, level 2, level 3, level 4 and level 5, corresponding respectively to negative, somewhat negative, neutral, somewhat positive and positive evaluations;
the method for determining the weight of the periodic evaluation index comprises the following steps:
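normalizing the periodic evaluation index x_ij: [formula not reproduced in this text], wherein m is the number of possible bid products and the result is the normalized data;
calculating the information entropy of each periodic evaluation index: [formula not reproduced in this text], wherein H_j is the information entropy of the j-th evaluation index, and a defined value is substituted under a stated condition (both the condition and the defined value are missing from this text);
calculating the weight of each periodic evaluation index: [formula not reproduced in this text], wherein W_j is the weight of the j-th evaluation index;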
the method for determining the comprehensive score of each possible bid comprises the following steps:
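constructing a data matrix [matrix not reproduced in this text] and constructing the positive and negative ideal solutions: [formula not reproduced in this text];
determining the weighted distance between each possible bid product and the positive and negative ideal solutions: [formula not reproduced in this text];
calculating the composite score of each possible bid product: [formula not reproduced in this text].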
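The weighting and scoring steps of claim 1 can be illustrated with a short Python sketch. The formulas are not legible in this text, so the sketch assumes the textbook entropy weight method (column-proportion normalization, entropy scaled by ln m, the convention 0·ln 0 = 0) and the textbook TOPSIS closeness score with a weighted Euclidean distance; the patent's exact formulas may differ. The function names and the sample grade matrix are hypothetical. Because every index is already graded so that level 5 is the most favorable, all columns are treated as benefit-type.

```python
import numpy as np

def entropy_weights(X):
    """Entropy weight method (assumed textbook form) on an m x n grade matrix."""
    X = np.asarray(X, dtype=float)
    m = X.shape[0]
    P = X / X.sum(axis=0)  # assumed normalization: share of each product per index
    # p * ln(p), using the usual convention that 0 * ln(0) = 0
    plogp = np.where(P > 0, P * np.log(np.where(P > 0, P, 1.0)), 0.0)
    H = -plogp.sum(axis=0) / np.log(m)  # information entropy H_j in [0, 1]
    d = 1.0 - H                         # degree of divergence of each index
    return d / d.sum()                  # weights W_j; undefined if every column is uniform

def topsis_scores(X, w):
    """TOPSIS closeness (assumed textbook form) with weighted Euclidean distances."""
    X = np.asarray(X, dtype=float)
    best = X.max(axis=0)    # positive ideal solution: best grade per index
    worst = X.min(axis=0)   # negative ideal solution: worst grade per index
    d_pos = np.sqrt((w * (X - best) ** 2).sum(axis=1))
    d_neg = np.sqrt((w * (X - worst) ** 2).sum(axis=1))
    return d_neg / (d_pos + d_neg)  # undefined if all products are identical

# Hypothetical grades (1-5) of three possible bid products on the five indexes
grades = np.array([[5, 5, 5, 5, 5],
                   [1, 1, 1, 1, 1],
                   [3, 3, 3, 3, 3]])
weights = entropy_weights(grades)
composite = topsis_scores(grades, weights)
```

In this toy matrix all five columns are identical, so the weights come out equal, and the three products score at the top, bottom and midpoint of the 0 to 1 range.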
2. the method for analyzing a bid based on a big data model according to claim 1, wherein the method for determining the number of clusters and the cluster center comprises:
determining a sample set M consisting of the composite scores of all possible bid products, and calculating the average Euclidean distance S(y_i) between the i-th sample y_i and the remaining samples in the sample set: [formula not reproduced in this text], wherein y_r is any sample in the sample set other than y_i, d(y_i, y_r) is the Euclidean distance from y_i to y_r, and m is the number of possible bid products;
calculating the number of samples N(y_i, S(y_i)) lying within Euclidean distance S(y_i) of y_i: [formula not reproduced in this text], wherein u(·) is a decision function whose value is 1 if its argument is not less than 0 and 0 if its argument is less than 0;
calculating N(y_i, S(y_i)) for all samples and arranging the values in descending order;
setting a clustering number k, selecting the first k samples as cluster centers, and calculating the average distance from the samples in each cluster to its cluster center: [formula not reproduced in this text], p = 1, 2, …, k, wherein G_p is the total number of samples in the p-th cluster, c_p is the cluster center of the p-th cluster, and g_q denotes the other samples in the cluster;
calculating an evaluation value A: [formula not reproduced in this text], wherein a and b are any two cluster centers, L_a and L_b are the average distances from the samples in the respective clusters to their cluster centers, and d(c_a, c_b) is the distance between the two cluster centers;
setting an evaluation threshold A', and, after traversing all the cluster centers, outputting the k cluster centers if the evaluation value of every pair of cluster centers is less than or equal to A'; if the evaluation value of any pair of cluster centers is greater than A', correcting the value of k and repeating the evaluation.
3. The method for analyzing a bid based on a big data model according to claim 1, wherein the method for determining the number of clusters and the cluster center comprises:
constructing a K-means mathematical model, determining the number of clusters by the elbow method, and randomly selecting the initial cluster centers;
assigning the remaining B possible bid products to the nearest initial cluster center to form clusters, wherein B is the difference between the total number of possible bid products and the number of clusters;
calculating a new cluster center based on the formed cluster, and repeating the previous step;
and judging whether the clustering ending condition is met, if not, repeating the previous step, and if so, outputting the clustering number and the clustering center.
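The elbow method named in claim 3 can be illustrated as follows. The claim does not say how the elbow is located, so this sketch assumes one common heuristic: compute the within-cluster sum of squared distances for each candidate k and pick the k with the largest second difference. The function names, the restart count `n_init`, the seed and the sample data are assumptions for illustration.

```python
import numpy as np

def kmeans_inertia(X, k, rng, n_init=10, max_iter=100):
    """Best within-cluster sum of squares over several random initializations."""
    best = np.inf
    for _ in range(n_init):
        # random initial centers drawn from the samples, without repetition
        centers = X[rng.choice(len(X), size=k, replace=False)]
        for _ in range(max_iter):
            d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
            labels = d.argmin(axis=1)
            new = np.array([X[labels == p].mean(axis=0) if np.any(labels == p)
                            else centers[p] for p in range(k)])
            if np.allclose(new, centers):
                break
            centers = new
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        best = min(best, float((d.min(axis=1) ** 2).sum()))
    return best

def elbow_k(inertias):
    """inertias[i] belongs to k = i + 1; return the k with the largest second difference."""
    second_diff = np.diff(inertias, n=2)   # entry i is centered on k = i + 2
    return int(second_diff.argmax()) + 2

# Hypothetical composite scores forming two well-separated groups
X = np.array([[-0.1], [0.0], [0.1], [9.9], [10.0], [10.1]])
rng = np.random.default_rng(0)
inertias = [kmeans_inertia(X, k, rng) for k in range(1, 5)]
k_elbow = elbow_k(inertias)
```

The second-difference rule is only one way to read the elbow; in practice the curve is often inspected visually, and the rule needs at least three candidate values of k.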
4. The method for analyzing a bid based on a big data model according to claim 1, wherein the method for data cleaning of the collected evaluation indexes comprises: detecting and removing noise data and irrelevant data in the data; and removing blank data fields and white noise in the knowledge background.
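A minimal sketch of such a cleaning pass is given below. The claim does not define what counts as noise or irrelevant data, so the rules here are assumptions: records with a blank required field are dropped, values outside plausible ranges (negative sales, non-positive price, store score outside 0 to 5) are treated as noise, and fields other than the four evaluation indexes are treated as irrelevant. The field names and sample records are hypothetical.

```python
REQUIRED = ("sales", "review", "store_score", "price")

def clean(records):
    """Drop records with blank fields or implausible values; keep only the index fields."""
    cleaned = []
    for rec in records:
        # blank data fields: a required value is missing or an empty string
        if any(rec.get(f) is None or str(rec.get(f)).strip() == "" for f in REQUIRED):
            continue
        # noise data (assumed rule): values outside plausible ranges
        if rec["sales"] < 0 or rec["price"] <= 0 or not (0 <= rec["store_score"] <= 5):
            continue
        # irrelevant data (assumed rule): discard every field that is not an evaluation index
        cleaned.append({f: rec[f] for f in REQUIRED})
    return cleaned

raw = [
    {"sales": 120, "review": "works well", "store_score": 4.8, "price": 59.0, "ad_id": "x1"},
    {"sales": 35, "review": "", "store_score": 4.5, "price": 61.0},      # blank review
    {"sales": -4, "review": "ok", "store_score": 4.1, "price": 55.0},    # negative sales
    {"sales": 80, "review": "fine", "store_score": 4.6, "price": None},  # missing price
]
cleaned = clean(raw)
```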
5. The method for analyzing a bid based on a big data model according to claim 1, wherein the period T is a set arbitrary time interval.
6. A big data model based bid analysis terminal comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the steps of a big data model based bid analysis method according to any of the claims 1-5 when executing the computer program.
7. A computer readable storage medium storing a computer program, wherein the computer program when executed by a processor implements the steps of a big data model based bid analysis method according to any of claims 1-5.
CN202310960482.4A 2023-08-02 2023-08-02 Big data model-based bid analysis method, terminal and storage medium Active CN116664173B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310960482.4A CN116664173B (en) 2023-08-02 2023-08-02 Big data model-based bid analysis method, terminal and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310960482.4A CN116664173B (en) 2023-08-02 2023-08-02 Big data model-based bid analysis method, terminal and storage medium

Publications (2)

Publication Number Publication Date
CN116664173A CN116664173A (en) 2023-08-29
CN116664173B CN116664173B (en) 2023-11-14

Family

ID=87722857

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310960482.4A Active CN116664173B (en) 2023-08-02 2023-08-02 Big data model-based bid analysis method, terminal and storage medium

Country Status (1)

Country Link
CN (1) CN116664173B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117252632A (en) * 2023-11-17 2023-12-19 北京北清博育信息技术研究有限公司 Commodity price analysis system and method based on computer

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110443290A (en) * 2019-07-23 2019-11-12 广东数鼎科技有限公司 A kind of product competition relationship quantization generation method and device based on big data
CN111340516A (en) * 2020-03-13 2020-06-26 安图实验仪器(郑州)有限公司 Satisfaction evaluation system and method based on information entropy and variation coefficient fusion algorithm
GB202015382D0 (en) * 2020-09-29 2020-11-11 Smith Graeme Signal processing systems
CN112561730A (en) * 2020-12-02 2021-03-26 国网浙江省电力有限公司营销服务中心 Power supply service analysis method based on double-layer clustering and fuzzy comprehensive evaluation
CN115222276A (en) * 2022-07-29 2022-10-21 智己汽车科技有限公司 Bidding analysis and evaluation method and device
CN115775110A (en) * 2022-12-05 2023-03-10 中邮信息科技(北京)有限公司 Service quality assessment method and device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
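Research on an evaluation method for product appearance schemes based on variable weight-VIKOR theory; Li Fenqiang et al.; Journal of Lanzhou University of Technology; Vol. 48 (No. 04); 56-63 *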

Also Published As

Publication number Publication date
CN116664173A (en) 2023-08-29

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant