CN112418941A - Resource popularity calculation method, system and storage medium based on real-time flow - Google Patents
- Publication number
- CN112418941A (application number CN202011349407.7A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/02—Marketing; Price estimation or determination; Fundraising
- G06Q30/0201—Market modelling; Market analysis; Collecting market data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2458—Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
- G06F16/2471—Distributed queries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/248—Presentation of query results
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/27—Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/02—Marketing; Price estimation or determination; Fundraising
- G06Q30/0201—Market modelling; Market analysis; Collecting market data
- G06Q30/0203—Market surveys; Market polls
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
- H04L67/1097—Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/50—Network services
- H04L67/535—Tracking the activity of the user
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Business, Economics & Management (AREA)
- Strategic Management (AREA)
- Finance (AREA)
- Development Economics (AREA)
- Accounting & Taxation (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Physics & Mathematics (AREA)
- Databases & Information Systems (AREA)
- Entrepreneurship & Innovation (AREA)
- General Engineering & Computer Science (AREA)
- Marketing (AREA)
- General Business, Economics & Management (AREA)
- Signal Processing (AREA)
- Computational Linguistics (AREA)
- Game Theory and Decision Science (AREA)
- Economics (AREA)
- Computer Networks & Wireless Communication (AREA)
- Computing Systems (AREA)
- Computer Hardware Design (AREA)
- Fuzzy Systems (AREA)
- Mathematical Physics (AREA)
- Probability & Statistics with Applications (AREA)
- Software Systems (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The invention relates to a method, system, and storage medium for calculating resource popularity based on real-time streaming. The method comprises: a behavior data acquisition step, in which user behavior data for interaction events between each client and the server is collected into a message middleware by server-side event tracking; a transaction data acquisition step, in which transaction information is synchronized to the message middleware in real time by log synchronization; and a data layering step, in which a source data layer, a public data layer, and an application data layer are constructed in the message middleware by a data layering method, and a real-time computing engine performs aggregation to obtain resource popularity index values. Compared with the prior art, the method avoids information loss, ensures that data is real-time, and offers high reliability.
Description
Technical Field
The invention relates to the technical field of big data processing, and in particular to a method, system, and storage medium for calculating resource popularity based on real-time streaming.
Background
With advances in network communication technology and the growth of broadband networks, online retail platforms are increasingly widely developed and used. For sellers on such a platform, it is very important to know promptly how their store resources are being accessed. However, as the number of online buyers grows, the original approach of analyzing the popularity of store resources offline, from information such as searches and transactions, can no longer meet sellers' need to adjust store resources in a timely manner. Moreover, user behavior data has been collected by event tracking embedded in front-end code, which captures user behavior only on the PC client and cannot cover more complex channels such as mobile apps and applets (mini programs), so information is lost.
Disclosure of Invention
The invention aims to overcome the defects of the prior art by providing a method, system, and storage medium for calculating resource popularity based on real-time streaming that are suitable for an online retail platform, avoid information loss, ensure that data is real-time, and improve the reliability of data processing.
The purpose of the invention can be achieved by the following technical solution:
A resource popularity calculation method based on real-time streaming performs its calculation over the user transaction data and user behavior data of an online retail platform according to the display dimension of the front end, and specifically comprises the following steps:
Behavior data acquisition step:
User behavior data for interaction events between each client and the server of the online retail platform is collected into the message middleware by server-side event tracking.
Transaction data acquisition step:
Transaction information is synchronized to the message middleware in real time by log synchronization. The transaction information includes order data and resource number information.
Data layering step:
A source data layer, a public data layer, and an application data layer are constructed in the message middleware by a data layering method, and a real-time computing engine performs aggregation to obtain resource popularity index values. The source data layer comprises a click behavior unit, a detail-browsing behavior unit, a warranty-viewing behavior unit, a bidding behavior unit, and an add-to-cart behavior unit.
In the data layering step, the user behavior data pushed by server-side event tracking is stored in the click, detail-browsing, warranty-viewing, bidding, and add-to-cart behavior units of the source data layer of the message middleware, respectively.
Further, the message middleware is the distributed publish-subscribe messaging system Kafka.
In the data layering step, stream processing is used to parse the log information of the source data layer, records in the user behavior data whose resource number is empty are filtered out, and a transaction wide table and a user behavior wide table matching the display dimension of the front end are generated as the public data layer data. Specifically, the real-time computing engine parses the user behavior data and the order data and checks whether the resource number in each user behavior record is empty. If it is empty, the record is excluded from the statistics; otherwise, a transaction wide table and a user behavior wide table are constructed in the message middleware for the transaction domain and the behavior domain, respectively.
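The empty-resource-number filter described above can be sketched in plain Python. This is only an illustration under stated assumptions: the field name `resource_no` and the function names are hypothetical, and a real deployment would apply the same predicate inside the stream processing job rather than over an in-memory list.

```python
def has_resource_number(record):
    """Return True when the record carries a non-empty resource number.

    The field name "resource_no" is an assumption made for this sketch.
    """
    value = record.get("resource_no")
    return value is not None and str(value).strip() != ""


def clean_behavior_stream(records):
    """Yield only the behavior records that should enter the statistics."""
    for record in records:
        if has_resource_number(record):
            yield record
```

Records with a missing, null, or blank resource number are dropped before any wide table is built, so they never contribute to a popularity score.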
The transaction wide table is constructed as follows:
The real-time computing engine first joins the records of the order main table and the order sub-table on the order ID, computes statistics according to the display dimension of the front end, labels the source, assigns the corresponding resource popularity score, and generates a public data layer table.
The user behavior wide table is constructed as follows:
The real-time computing engine parses the detail data of each behavior (click, detail browsing, warranty viewing, bidding, and add-to-cart), groups it according to the display dimension of the front end, and aggregates the resource popularity. The resource popularity is aggregated as follows:
A scoring rule is set according to the depth of each behavior, and the user's different behaviors are given resource popularity scores that increase with depth, with the add-to-cart behavior scoring highest. The resource popularity scores of all the listed user behaviors are then combined along the same dimensions to calculate the resource popularity and generate the behavior wide table.
Further, the real-time computing engine is the open-source stream processing framework Flink.
The data layering step further includes a result output step, in which real-time aggregation is performed over different dimensions, the results are written to the distributed storage system HBase, and data services are provided to the front end by querying HBase. Intermediate aggregation results are stored in HBase or in the message middleware, and the final results are provided to the front end for display through a generated data service API.
Further, the clients include, but are not limited to, PC clients, mobile clients, and applets.
In another aspect, the invention provides a resource popularity calculation system based on real-time streaming, comprising:
a data acquisition module, which collects user behavior data for interaction events between the client and the server by server-side event tracking;
a data layering construction module, which classifies and layers the collected data from sources with different attributes by a data layering method, the layers comprising a source data layer, a public data layer, and an application data layer;
a data processing module, which parses the source data layer logs by stream processing, filters out data with empty resource numbers, and generates a transaction wide table and a user behavior wide table with the corresponding dimensions;
a resource popularity calculation module, which parses the detail data of each behavior in the source data layer with the real-time computing engine and calculates a resource popularity value from the parsed data;
the distributed storage system HBase, which stores the data processing results and provides a data query service;
and a front end, which sets the display dimension of the resource popularity and displays the data served from HBase.
Further, the distributed storage system HBase provides a data service API for display by the front end.
Another aspect of the invention provides a computer-readable storage medium storing a computer program executable by at least one processor to implement the steps of the resource popularity calculation method based on real-time streaming described above.
Compared with the prior art, the resource popularity calculation method, system, and storage medium based on real-time streaming provided by the invention have at least the following beneficial effects:
1) The invention moves the collection of user behavior data on the online retail platform from front-end event tracking to server-side event tracking, which improves data accuracy and also captures behavior data from complex mobile clients, thereby avoiding information loss.
2) Real-time data storage is built on a message middleware, namely the stable message queue Kafka, which offers high throughput, low latency, high concurrency, high fault tolerance, and high scalability.
3) Processing uses a real-time computing engine, namely Flink, a unified stream and batch engine with high throughput, low latency, highly flexible streaming windows, and a lightweight fault-tolerance mechanism, which ensures that data is real-time.
4) Data layering is used to classify and layer data from sources with different attributes, which increases data reuse and gives good extensibility. When new behaviors or transaction types are to be analyzed, the existing public data layer data can be reused; when new behaviors or dimensions are added, only the public data layer computation task needs to be reworked and no new task is required, which reduces computing resource overhead.
5) Result data is stored in the column-oriented distributed storage system HBase, which also provides the query service. This supports massive data volumes and high concurrency, with lower latency and higher reliability for the data service.
Drawings
Fig. 1 is a schematic flow chart of a resource popularity calculation method based on real-time streaming in an embodiment.
Detailed Description
The invention is described in detail below with reference to the figures and specific embodiments. It is to be understood that the embodiments described are only some of the embodiments of the invention, not all of them. All other embodiments obtained by a person skilled in the art on the basis of these embodiments without inventive effort fall within the scope of protection of the invention.
Examples
The invention relates to a resource popularity calculation method based on real-time streaming. It uses multiple kinds of user behavior and transaction data collected in real time from an online retail platform, calculates a resource popularity value by real-time statistical processing on the message middleware, and finally supplies that value to the platform as a real-time data service. As shown in Fig. 1, the method comprises the following steps:
Step one, data acquisition:
Behavior data acquisition:
User behavior data for interaction events between the clients associated with the online retail platform (PC, mobile, and applet) and the server is collected into the message middleware by server-side event tracking. The user behavior data includes click behavior data, detail-browsing behavior data, warranty-viewing behavior data, bidding behavior data, and add-to-cart behavior data.
Transaction data acquisition:
Transaction information is synchronized to the message middleware in real time by log synchronization. The transaction information includes order data, resource numbers, and the like.
Real-time data storage is built on the message middleware, namely the stable message queue Kafka, which offers high throughput, low latency, high concurrency, high fault tolerance, and high scalability.
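To illustrate why server-side event tracking covers every client type, the following minimal sketch records the behavior event inside a request handler, so PC, mobile, and applet requests all pass through the same code path. The in-memory `middleware` dictionary is a hypothetical stand-in for Kafka, and the handler name, field names, and client labels are assumptions made for this example, not part of the patent text.

```python
import json
import time
from collections import defaultdict

# Hypothetical in-memory stand-in for the message middleware (Kafka in the
# embodiment): topic name -> list of serialized messages.
middleware = defaultdict(list)


def track_behavior(topic, client_type, user_id, resource_no):
    """Build one behavior event on the server and publish it to a topic."""
    event = {
        "event_time": int(time.time() * 1000),  # server-side timestamp, ms
        "client_type": client_type,             # e.g. "pc", "mobile", "applet"
        "user_id": user_id,
        "resource_no": resource_no,             # may be None for some requests
    }
    middleware[topic].append(json.dumps(event))
    return event


def handle_view_detail(client_type, user_id, resource_no):
    """Hypothetical request handler: log the behavior, then serve the request."""
    track_behavior("TPXHB02", client_type, user_id, resource_no)
    return {"status": "ok", "resource_no": resource_no}
```

Because the event is emitted where the request is served, no client needs its own tracking code, which is the information-loss problem the method sets out to avoid.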
Step two, data layering:
After data acquisition, a data layering method is used to construct a source data layer, a public data layer, and an application data layer in the message middleware, and a real-time computing engine then performs aggregation to obtain resource popularity index values. That is, the real-time data warehouse as a whole is divided into three layers: a source data layer, a public layer (the DWD and DWS layers), and an application layer.
The computing engine is Flink. The main reasons for choosing Flink as the stream computing engine are: high throughput, low latency, and high performance; highly flexible streaming windows; exactly-once semantics for stateful computation; a lightweight fault-tolerance mechanism; support for event time and out-of-order events; and unified stream and batch processing. Online traffic data goes directly into Kafka or another message storage system, and Flink consumes the data in real time for computation.
Specifically:
2.1) Source data layer:
The user behavior data pushed by event tracking is stored in TPXHB01 (click behavior), TPXHB02 (detail-browsing behavior), TPXHB03 (warranty-viewing behavior), TPXHB04 (bidding behavior), and TPXHB05 (add-to-cart behavior) in the source data layer of the message middleware. At the same time, the transaction tables, namely the order main table TPXH_ORDER_M, the order sub-table TPXH_ORDER_D, and the bidding process table XH TPBID_RECORD, are synchronized to the source data layer of the message middleware by log synchronization.
2.2) Public data layer, divided by data domain:
The source layer logs are parsed by stream processing, dirty data with empty resource numbers is filtered out, and a transaction wide table and a user behavior wide table of the corresponding dimensions are generated as the public data layer data. Specifically, the real-time computing engine parses the behavior data and the order data. Both the transaction information and the behavior data carry a resource number, but the resource number in the behavior data may be empty, so the engine first checks whether it is empty. If it is, the record is excluded from the statistics; otherwise, the public layer transaction wide table and user behavior wide table are constructed in the message middleware for the transaction domain and the behavior domain, respectively, as follows:
2.3) Constructing the transaction wide table:
The real-time computing engine first joins the records of the order main table TPXH_ORDER_M and the order sub-table TPXH_ORDER_D on the order ID. In this embodiment, the join uses the existing Flink stream join technique, which yields the detail records directly. Statistics are then computed along certain dimensions, the source is labeled, and the corresponding score is assigned. For example, from the resulting details, the transaction volume, transaction amount, and resource popularity are computed by the rule "deal time + seller + buyer + resource number + bundle number + variety + brand + specification + quality grade + region + source" (note: the source is the order transaction, and each transaction scores 6 points of resource popularity), and the public layer transaction table DW_JY_ORDER is generated.
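The join-then-score logic can be sketched as a simple hash join in plain Python. Note that this is a batch illustration, not the Flink stream join used in the embodiment; the field names (`order_id`, `deal_time`, `quantity`, `amount`) and the source label `"order_deal"` are assumptions, while the 6-point score comes from the example above.

```python
DEAL_SCORE = 6  # popularity points per transaction, per the example above


def build_transaction_rows(order_main, order_detail):
    """Join order sub-table rows to the order main table on order_id.

    Rows whose order_id has no match in the main table are skipped
    (inner-join behavior).
    """
    main_by_id = {m["order_id"]: m for m in order_main}
    rows = []
    for d in order_detail:
        m = main_by_id.get(d["order_id"])
        if m is None:
            continue
        rows.append({
            "deal_time": m["deal_time"],
            "seller": m["seller"],
            "buyer": m["buyer"],
            "resource_no": d["resource_no"],
            "quantity": d["quantity"],
            "amount": d["amount"],
            "source": "order_deal",   # hypothetical label for the source mark
            "popularity": DEAL_SCORE,
        })
    return rows
```

Each joined detail row becomes one wide-table row carrying both the order-level fields and the resource-level fields, ready to be grouped by whichever dimensions the front end needs.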
2.4) Constructing the behavior wide table:
The real-time computing engine parses the behavior detail data of TPXHB01, TPXHB02, TPXHB03, TPXHB04, and TPXHB05. In this embodiment, the Flink engine parses the JSON-format records directly, using existing techniques, to extract the key information required. The records are grouped along certain dimensions according to the actual transaction rules, for example "behavior time + seller + buyer + resource number + bundle number + variety + brand + specification + quality grade + region + behavior name", and the resource popularity is aggregated.
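As a rough illustration of the parse-and-group step, the sketch below decodes JSON messages and groups them by a tuple of dimension fields. It runs in plain Python rather than Flink, uses a shortened dimension list for readability, and all field names are assumptions chosen for this example.

```python
import json
from collections import defaultdict

# Shortened, hypothetical subset of the grouping dimensions.
GROUP_FIELDS = ("behavior_time", "seller", "buyer", "resource_no", "behavior_name")


def parse_event(raw):
    """Parse one JSON message; return None if it is not a JSON object."""
    try:
        obj = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(obj, dict):
        return None
    # Keep only the key information needed for grouping.
    return {field: obj.get(field) for field in GROUP_FIELDS}


def group_events(raw_messages):
    """Group parsed events by the tuple of their dimension values."""
    groups = defaultdict(list)
    for raw in raw_messages:
        event = parse_event(raw)
        if event is not None:
            groups[tuple(event[f] for f in GROUP_FIELDS)].append(event)
    return dict(groups)
```

Malformed messages are skipped rather than raising, which mirrors the usual choice in a streaming job where one bad record should not stop the pipeline.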
Because each behavior reflects how deeply the user wants to understand the product, the invention defines the scoring rule according to behavior depth and scores the behaviors in increasing order, with the add-to-cart behavior scoring highest. For example, TPXHB01 (click behavior) scores 1 point of resource popularity, TPXHB02 (detail-browsing behavior) 2 points, TPXHB03 (warranty-viewing behavior) 3 points, TPXHB04 (bidding behavior) 4 points, and TPXHB05 (add-to-cart behavior) 5 points. The resource popularity of the listed user behaviors is combined under this rule along the same dimensions, and the statistics produce the behavior wide table DW_XW_BEHAVIOR.
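The scoring rule lends itself to a small worked example. The sketch below uses the five scores given in the text; the function name and the shape of the input (pairs of topic and dimension key) are assumptions made for illustration.

```python
from collections import Counter

# Scores per behavior topic, taken from the example in the text.
BEHAVIOR_SCORES = {
    "TPXHB01": 1,  # click
    "TPXHB02": 2,  # browse details
    "TPXHB03": 3,  # view warranty
    "TPXHB04": 4,  # bid
    "TPXHB05": 5,  # add to cart
}


def score_behaviors(events):
    """Sum popularity per dimension key.

    events: iterable of (topic, dimension_key) pairs, where dimension_key
    is any hashable value identifying the group (hypothetical shape).
    """
    totals = Counter()
    for topic, key in events:
        totals[key] += BEHAVIOR_SCORES[topic]
    return totals
```

For instance, if resource R001 receives one click, one detail view, and one add-to-cart, its popularity is 1 + 2 + 5 = 8, while a single bid on R002 gives it 4.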
2.5) Merging to generate the application data layer resource popularity table:
Using stream processing, the transaction wide table DW_JY_ORDER and the behavior wide table DW_XW_BEHAVIOR are merged along the resource popularity detail dimensions, namely time, seller, buyer, resource number, bundle number, variety, brand, specification, quality grade, region, and source (behavior name), to generate the application layer resource popularity table DM_ZY_POP.
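One way to picture the merge is as a union of the two wide tables onto a shared column layout, where the behavior name fills the source column for behavior rows. The sketch below assumes this reading; the column names are hypothetical, the dimension list is shortened, and the patent text does not specify the exact merge operator.

```python
# Shortened, hypothetical subset of the shared detail dimensions.
SHARED_DIMS = ("time", "seller", "buyer", "resource_no", "variety", "brand")


def _to_record(row, source):
    """Project a wide-table row onto the shared layout of the merged table."""
    record = {dim: row.get(dim) for dim in SHARED_DIMS}
    record["source"] = source
    record["popularity"] = row["popularity"]
    return record


def merge_popularity(transaction_rows, behavior_rows):
    """Union transaction and behavior rows into one popularity table."""
    merged = [_to_record(r, r["source"]) for r in transaction_rows]
    merged += [_to_record(r, r["behavior_name"]) for r in behavior_rows]
    return merged
```

Keeping a single layout means the output step can aggregate transaction and behavior popularity together without knowing which table a row came from.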
Step three, outputting the calculation result:
The calculation follows the dimension displayed by the front end. For example, if the front end displays by "variety + brand", the data is aggregated by variety and brand. This embodiment aggregates the resource popularity along the detail dimensions, that is, "date + seller + buyer + resource number + bundle number + variety + brand + specification + quality grade + region". After real-time aggregation over the different dimensions, the results of the application layer resource popularity table are written to HBase, a scalable, column-oriented distributed storage system, and data services are finally provided to the front end for display by querying HBase.
In this embodiment, as a preferred scheme, the calculation results are stored in HBase, and a data service API is generated and provided to the front end for display. That is, a unified interface service layer (such as OneService) exposes a Dubbo interface through which service consumers obtain the index data for display on the front end.
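Aggregating by a front-end display dimension amounts to a group-by-and-sum over the popularity table. A minimal sketch, assuming rows are dictionaries with a `popularity` field and that the display dimensions are passed as a tuple of column names (both assumptions for this example):

```python
from collections import defaultdict


def aggregate_popularity(rows, display_dims):
    """Sum popularity per combination of the requested display dimensions."""
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[dim] for dim in display_dims)
        totals[key] += row["popularity"]
    return dict(totals)
```

With rows for (coil, A) scoring 6 and 5 and a row for (plate, B) scoring 1, calling `aggregate_popularity(rows, ("variety", "brand"))` gives 11 for (coil, A) and 1 for (plate, B). Changing the tuple to another set of columns re-aggregates the same detail rows for a different front-end view.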
In addition, the invention provides a resource popularity calculation system based on real-time streaming, which comprises a data acquisition module, a data layering construction module, a data processing module, a resource popularity calculation module, the distributed storage system HBase, and a front end, wherein:
The data acquisition module collects user behavior data for interaction events between the client and the server by server-side event tracking, and synchronizes transaction information to the message middleware in real time by log synchronization.
The data layering construction module classifies and layers the collected data from sources with different attributes by a data layering method; the layers are a source data layer, a public data layer, and an application data layer.
The data processing module parses the source data layer logs by stream processing, filters out dirty data with empty resource numbers, and generates a transaction wide table and a user behavior wide table with the corresponding dimensions. For the transaction wide table, the real-time computing engine first joins the records of the order main table and the order sub-table on the order ID, computes statistics according to the display dimension of the front end, labels the source, and assigns the corresponding resource popularity score. For the user behavior wide table, a scoring rule is set according to behavior depth, the user's different behaviors are given resource popularity scores, and the scores of all the listed user behaviors are combined along the same dimensions to calculate the resource popularity.
The resource popularity calculation module parses the detail data of each behavior in the source data layer with the real-time computing engine and calculates a resource popularity value from the parsed data.
The distributed storage system HBase stores the data processing results and provides a data query service.
The front end sets the display dimension of the resource popularity and displays the data served from HBase.
The invention further provides a computer-readable storage medium, which is a non-volatile readable storage medium storing a computer program executable by at least one processor to implement the operations of the resource popularity calculation method or system based on real-time streaming.
While the invention has been described with reference to specific embodiments, the invention is not limited thereto, and those skilled in the art can easily conceive of various equivalent modifications or substitutions within the technical scope of the invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.
Claims (17)
1. A resource popularity calculation method based on real-time streaming, characterized in that the method performs its calculation over user transaction data and user behavior data of an online retail platform according to the display dimension of a front end, and comprises the following steps:
a behavior data acquisition step: collecting user behavior data for interaction events between each client and the server of the online retail platform into a message middleware by server-side event tracking;
a transaction data acquisition step: synchronizing transaction information to the message middleware in real time by log synchronization;
a data layering step: constructing a source data layer, a public data layer, and an application data layer in the message middleware by a data layering method, and performing aggregation with a real-time computing engine to obtain resource popularity index values.
2. The resource popularity calculation method based on real-time streaming according to claim 1, wherein the transaction information includes order data and resource number information.
3. The resource popularity calculation method based on real-time streaming according to claim 1, wherein the source data layer includes a click behavior unit, a detail-browsing behavior unit, a warranty-viewing behavior unit, a bidding behavior unit, and an add-to-cart behavior unit.
4. The resource popularity calculation method based on real-time streaming according to claim 3, wherein in the data layering step, the user behavior data pushed by server-side event tracking is stored in the click behavior unit, the detail-browsing behavior unit, the warranty-viewing behavior unit, the bidding behavior unit, and the add-to-cart behavior unit of the source data layer of the message middleware, respectively.
5. The resource popularity calculation method based on real-time streaming according to claim 4, wherein the message middleware is the distributed publish-subscribe messaging system Kafka.
6. The resource popularity calculation method based on real-time streaming according to claim 1, wherein in the data layering step, stream processing is used to parse the log information of the source data layer, records in the user behavior data whose resource number is empty are filtered out, and a transaction wide table and a user behavior wide table matching the display dimension of the front end are generated as the public data layer data.
7. The resource popularity calculation method based on real-time streaming according to claim 6, wherein the real-time computing engine parses the user behavior data and the order data and checks whether the resource number in the user behavior data is empty; if it is empty, the record is excluded from the statistics; otherwise, a transaction wide table and a user behavior wide table are constructed in the message middleware for the transaction domain and the behavior domain, respectively.
8. The resource popularity calculation method based on real-time streams according to claim 7, wherein the transaction wide table is constructed as follows:
a real-time calculation engine first joins the records of the order main table and the order sub-table on the order ID, aggregates the data according to the display dimensions of the front end, tags the source, assigns the corresponding resource popularity score, and generates a public data layer data table.
9. The resource popularity calculation method based on real-time streams according to claim 6, wherein the user behavior wide table is constructed as follows:
a real-time calculation engine parses the detail data of each behavior (click, browse details, view warranty certificate, bid, and add to cart), groups the data according to the display dimensions of the front end, and aggregates them to calculate the resource popularity.
10. The resource popularity calculation method based on real-time streams according to claim 9, wherein the resource popularity is aggregated and calculated as follows:
a scoring rule is set according to the depth of the behavior, and the different behaviors of the user are assigned resource popularity scores, with the add-to-cart behavior receiving the highest score; the resource popularity scores of all listed user behaviors are then combined along the same dimension, and the resource popularity is calculated to generate the behavior wide table.
11. The resource popularity calculation method based on real-time streams according to claim 8 or 10, wherein the real-time calculation engine is the open-source stream processing framework Flink.
12. The resource popularity calculation method based on real-time streams according to claim 1, wherein the data layering step further includes a calculation result output step, in which real-time aggregation is performed along different dimensions, the results are output to the distributed storage system HBase, and data services are provided to the front end by querying HBase.
13. The resource popularity calculation method based on real-time streams according to claim 12, wherein intermediate results of the aggregation calculation are stored in the distributed storage system HBase or in the message middleware, and the final calculation results are provided to the front end for display through a generated data service API.
14. The resource popularity calculation method based on real-time streams according to claim 1, wherein the client includes a PC client, a mobile client, and a mini program.
15. A resource popularity calculation system based on real-time streams, the system comprising:
a data acquisition module, configured to collect, by server-side event tracking, user behavior data for interaction events between a client and a server;
a data layering construction module, configured to classify and layer the collected data of different attributes and sources by a data layering method, the layers comprising a source data layer, a public data layer, and an application data layer;
a data processing module, configured to parse the source data layer logs through stream processing, filter out data whose resource number is empty, and generate a transaction wide table and a user behavior wide table with the corresponding dimensions;
a resource popularity calculation module, configured to parse the detail data of each behavior in the source data layer through a real-time calculation engine and calculate a resource popularity value from the parsed data;
a distributed storage system HBase, configured to store the data processing results and provide a data query service; and
a front end, configured to set the resource popularity display dimensions and display the data served by the distributed storage system HBase.
16. The resource popularity calculation system based on real-time streams according to claim 15, wherein the distributed storage system HBase provides a data service API for display by the front end.
17. A computer-readable storage medium having stored thereon a computer program executable by at least one processor to perform the steps of the resource popularity calculation method based on real-time streams according to any one of claims 1-14.
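Illustrative sketch (not part of the claims): the filtering, joining, and scoring steps recited in claims 6 to 10 can be pictured with a small in-memory example. The sketch below is plain Python rather than a Flink job, and it only assumes what the claims state: records with an empty resource number are excluded, the order main table and sub-table are joined on the order ID, and deeper behaviors score higher, with add-to-cart scoring highest. The field names (`resource_no`, `order_id`, `behavior`), the numeric scores, and the function names are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical scoring rule. The claims only require that deeper behaviors
# score higher and that add-to-cart scores highest; the numbers are assumed.
BEHAVIOR_SCORES = {
    "click": 1,
    "browse_details": 2,
    "view_warranty_certificate": 3,
    "bid": 4,
    "add_to_cart": 5,
}
ORDER_SCORE = 10  # assumed popularity score for one ordered item


def filter_events(events):
    """Exclude behavior records whose resource number is empty or missing."""
    return [e for e in events if e.get("resource_no")]


def build_behavior_wide_table(events, dimension="resource_no"):
    """Group behavior records by a display dimension and sum their scores."""
    table = defaultdict(lambda: {"popularity": 0, "counts": defaultdict(int)})
    for event in filter_events(events):
        row = table[event[dimension]]
        row["popularity"] += BEHAVIOR_SCORES[event["behavior"]]
        row["counts"][event["behavior"]] += 1
    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {
        key: {"popularity": row["popularity"], "counts": dict(row["counts"])}
        for key, row in table.items()
    }


def build_transaction_wide_table(order_main, order_items, dimension="resource_no"):
    """Join order main and sub-table records on order ID, then aggregate."""
    main_by_id = {order["order_id"]: order for order in order_main}
    table = defaultdict(
        lambda: {"popularity": 0, "order_count": 0, "source": "transaction"}
    )
    for item in order_items:
        main = main_by_id.get(item["order_id"])
        if main is None:
            continue  # sub-table record without a matching main record
        joined = {**main, **item}
        row = table[joined[dimension]]
        row["popularity"] += ORDER_SCORE
        row["order_count"] += 1
    return dict(table)
```

Calling `build_behavior_wide_table` on a list of event dictionaries returns one row per value of the chosen dimension, and `build_transaction_wide_table` does the same for joined order records tagged with their source. In a deployment along the lines of claims 5, 11, and 12, the same logic would presumably run as keyed aggregations in the stream engine, with the rows written to the storage layer instead of being returned.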
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011349407.7A CN112418941A (en) | 2020-11-26 | 2020-11-26 | Resource popularity calculation method, system and storage medium based on real-time flow |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112418941A (en) | 2021-02-26 |
Family
ID=74842948
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011349407.7A Pending CN112418941A (en) | 2020-11-26 | 2020-11-26 | Resource popularity calculation method, system and storage medium based on real-time flow |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112418941A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113852664A (en) * | 2021-08-19 | 2021-12-28 | 天津市普迅电力信息技术有限公司 | Energy commodity and energy demand accurate pushing method based on distributed real-time calculation |
CN115361418A (en) * | 2022-08-18 | 2022-11-18 | 中国第一汽车股份有限公司 | Vehicle-mounted distributed dynamic data embedded point acquisition method, vehicle and cloud server |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108427711A (en) * | 2018-01-31 | 2018-08-21 | 北京三快在线科技有限公司 | Real-time data warehouse, real-time data processing method, electronic equipment and storage medium |
CN110909063A (en) * | 2019-11-28 | 2020-03-24 | 蜂助手股份有限公司 | User behavior analysis method and device, application server and storage medium |
CN111026801A (en) * | 2019-12-25 | 2020-04-17 | 焦点科技股份有限公司 | Method and system for assisting operation quick decision-making work of insurance type e-commerce |
CN111651510A (en) * | 2020-05-14 | 2020-09-11 | 拉扎斯网络科技(上海)有限公司 | Data processing method and device, electronic equipment and computer readable storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108416620B (en) | Portrait data intelligent social advertisement putting platform based on big data | |
US20210374324A1 (en) | Resource size-based content item selection | |
US10410258B2 (en) | Graphical user interface for high volume data analytics | |
US10178430B2 (en) | Intelligent processing method and system for video data | |
CN109189951B (en) | Multimedia resource recommendation method, equipment and storage medium | |
US8856113B1 (en) | Method and device for ranking video embeds | |
US7318056B2 (en) | System and method for performing click stream analysis | |
CN103620601B (en) | Joining tables in a mapreduce procedure | |
US20160162582A1 (en) | Method and system for conducting an opinion search engine and a display thereof | |
CN105765573B (en) | Improvements in website traffic optimization | |
US20110282860A1 (en) | Data collection, tracking, and analysis for multiple media including impact analysis and influence tracking | |
US20090319365A1 (en) | System and method for assessing marketing data | |
EP3564828A1 (en) | Method of data query based on evaluation and device | |
CN109889891B (en) | Method, device and storage medium for acquiring target media file | |
CN110647512B (en) | Data storage and analysis method, device, equipment and readable medium | |
WO2020037917A1 (en) | User behavior data recommendation method, server and computer readable medium | |
US9634909B2 (en) | Methods and systems of detection of most relevant insights for large volume query-based social data stream | |
TW201224972A (en) | Sorting method and apparatus of query results | |
US11748365B2 (en) | Multi-dimensional search | |
KR20110009198A (en) | Search results with most clicked next objects | |
Marcus et al. | Tweets as data: demonstration of tweeql and twitinfo | |
WO2014056369A1 (en) | Method and system for sorting online videos of search | |
US20140280133A1 (en) | Structured Data to Aggregate Analytics | |
CN108921734A (en) | One real estate information visualization system based on multi-source heterogeneous data | |
US20160034553A1 (en) | Hybrid aggregation of data sets |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20210226 |