CN112418941A - Resource popularity calculation method, system and storage medium based on real-time flow - Google Patents

Info

Publication number
CN112418941A
Authority
CN
China
Prior art keywords
data
real
behavior
resource
time
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011349407.7A
Other languages
Chinese (zh)
Inventor
万仕龙
顾永兴
仲跻炜
朱彭生
冯若寅
梁东梅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ouye Yunshang Co ltd
Original Assignee
Ouye Yunshang Co ltd
Application filed by Ouye Yunshang Co ltd
Priority to CN202011349407.7A
Publication of CN112418941A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2471 Distributed queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/248 Presentation of query results
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G06Q30/0203 Market surveys; Market polls
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1097 Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/50 Network services
    • H04L67/535 Tracking the activity of the user

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Strategic Management (AREA)
  • Finance (AREA)
  • Development Economics (AREA)
  • Accounting & Taxation (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Entrepreneurship & Innovation (AREA)
  • General Engineering & Computer Science (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Signal Processing (AREA)
  • Computational Linguistics (AREA)
  • Game Theory and Decision Science (AREA)
  • Economics (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Computing Systems (AREA)
  • Computer Hardware Design (AREA)
  • Fuzzy Systems (AREA)
  • Mathematical Physics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Software Systems (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention relates to a resource popularity calculation method, system and storage medium based on real-time streams. The method comprises: a behavior data acquisition step, in which user behavior data for interaction events between each client and the server is collected into a message middleware through server-side event tracking (buried points); a transaction data acquisition step, in which transaction information is synchronized to the message middleware in real time through log synchronization; and a data layering step, in which a source data layer, a public data layer and an application data layer are constructed in the message middleware using a data layering method, and aggregation is performed by a real-time computing engine to obtain resource popularity index values. Compared with the prior art, the method avoids information loss, keeps the data real-time, and offers high reliability.

Description

Resource popularity calculation method, system and storage medium based on real-time flow
Technical Field
The invention relates to the technical field of big data processing, in particular to a resource popularity calculation method and system based on real-time flow and a storage medium.
Background
With advances in network communication technology and the growth of broadband networks, online retail platforms are increasingly widely developed and used. For sellers on such a platform, it is important to know promptly how their store resources are being accessed. However, as the number of online buyers grows, the original approach of analyzing store resource popularity from search, transaction and similar information obtained offline can no longer meet sellers' need to adjust store resources in time. In addition, user behavior data has been collected through tracking code embedded in the front end, which captures only PC-side user behavior; it cannot cover more complex channels such as mobile apps and mini programs, so information is lost.
Disclosure of Invention
The purpose of the invention is to overcome the above defects of the prior art by providing a real-time-stream-based resource popularity calculation method, system and storage medium that are suited to online retail platforms, avoid information loss, keep data real-time and improve the reliability of data processing.
The purpose of the invention can be realized by the following technical scheme:
A resource popularity calculation method based on real-time streams computes over the user transaction data and user behavior data of an online retail platform according to the display dimensions of the front end, and specifically comprises the following steps:
Behavior data acquisition step:
User behavior data for interaction events between each client of the online retail platform and the server is collected into the message middleware through server-side event tracking (buried points).
Transaction data acquisition step:
Transaction information is synchronized to the message middleware in real time through log synchronization. The transaction information includes order data and resource number information.
Data layering step:
A source data layer, a public data layer and an application data layer are constructed in the message middleware using a data layering method, and aggregation is performed by a real-time computing engine to obtain resource popularity index values. The source data layer comprises a click behavior unit, a browse-details behavior unit, a view-warranty behavior unit, a bid behavior unit and an add-to-cart behavior unit.
In the data layering step, the user behavior data pushed by server-side event tracking is stored, according to behavior type, in the click behavior unit, the browse-details behavior unit, the view-warranty behavior unit, the bid behavior unit and the add-to-cart behavior unit of the source data layer of the message middleware.
Further, the message middleware is the distributed publish-subscribe messaging system Kafka.
In the data layering step, stream processing is used to parse the log information of the source data layer, records whose resource number is empty are filtered out of the user behavior data, and a transaction wide table and a user behavior wide table matching the display dimensions of the front end are generated as public data layer data. Specifically, the real-time computing engine parses the user behavior data and the order data and checks whether the resource number in each user behavior record is empty. If it is empty, the record is excluded from the statistics; otherwise, a transaction wide table and a user behavior wide table are built in the message middleware for the transaction domain and the behavior domain respectively.
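The empty-resource-number check can be illustrated with a minimal Python sketch. It is not the patent's Flink job: the JSON layout and the field names `behavior` and `resource_no` are assumptions made for illustration, and dropping malformed lines is an extra assumption that the text does not state.

```python
import json


def parse_behavior(raw):
    """Parse one JSON log line; return None if it is malformed or has no resource number."""
    try:
        record = json.loads(raw)
    except json.JSONDecodeError:
        return None  # assumption: unparseable lines are also treated as dirty data
    if not isinstance(record, dict):
        return None
    resource_no = record.get("resource_no")
    if resource_no is None or str(resource_no).strip() == "":
        return None  # empty resource number: excluded from the statistics
    return record


raw_logs = [
    '{"behavior": "click", "resource_no": "R001"}',
    '{"behavior": "click", "resource_no": ""}',
    '{"behavior": "bid", "resource_no": null}',
    '{"behavior": "bid", "resource_no": "R002"}',
]
clean = [r for r in map(parse_behavior, raw_logs) if r is not None]
print([r["resource_no"] for r in clean])  # ['R001', 'R002']
```

In a streaming job the same predicate would typically sit in a filter operator ahead of the wide-table construction, so that later aggregations never see records without a resource number.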
The transaction wide table is constructed as follows:
The real-time computing engine first joins the records of the order main table and the order sub-table on order ID, then computes statistics according to the display dimensions of the front end, labels the source, assigns the corresponding resource popularity score, and generates a public data layer data table.
The user behavior wide table is constructed as follows:
The real-time computing engine parses the detail data of each behavior (click, browse details, view warranty, bid and add to cart), groups it according to the display dimensions of the front end, and aggregates the resource popularity. The resource popularity is aggregated as follows:
A scoring rule is set according to the depth of each behavior, and the user's different behaviors are scored for resource popularity, with the add-to-cart behavior scoring highest. The resource popularity scores of the listed user behaviors are then combined along the same dimensions to compute the resource popularity and generate the behavior wide table.
Further, the real-time computing engine is the open-source stream processing framework Flink.
The data layering step further includes a result output step: after real-time aggregation over the different dimensions, the results are written to the distributed storage system HBase, and data services are provided to the front end by querying HBase. Intermediate aggregation results are stored in HBase or in the message middleware, and the final results are provided to the front end for display through a generated data service API.
Further, the clients include, but are not limited to, PC clients, mobile clients and mini programs.
In another aspect, the present invention provides a resource popularity calculation system based on real-time streams, including:
a data acquisition module, which collects user behavior data for interaction events between the client and the server through server-side event tracking;
a data layering construction module, which classifies and layers the collected data from sources with different attributes using a data layering method, the layers comprising a source data layer, a public data layer and an application data layer;
a data processing module, which parses the source data layer logs through stream processing, filters out records whose resource number is empty, and generates a transaction wide table and a user behavior wide table of the corresponding dimensions;
a resource popularity calculation module, which parses the detail data of each behavior in the source data layer through the real-time computing engine and computes a resource popularity value from the parsed data;
the distributed storage system HBase, which stores the data processing results and provides a data query service;
and a front end, which sets the resource popularity display dimensions and displays the data served from HBase.
Further, the distributed storage system HBase provides a data service API whose results are displayed by the front end.
Another aspect of the present invention provides a computer-readable storage medium having stored therein a computer program executable by at least one processor to implement the steps of the real-time streaming based resource popularity calculation method as described above.
Compared with the prior art, the real-time-stream-based resource popularity calculation method, system and storage medium provided by the invention have at least the following beneficial effects:
1) Collection of the platform's user behavior data is moved from front-end tracking to server-side tracking, which improves data accuracy and also captures behavior data from complex mobile clients, thereby avoiding information loss.
2) Real-time data storage is built on a message middleware, for which the stable message queue Kafka is used, giving high throughput, low latency, high concurrency, high fault tolerance and high scalability.
3) Processing is performed by a real-time computing engine, Flink, a unified stream and batch engine with high throughput, low latency, highly flexible streaming windows and a lightweight fault-tolerance mechanism, which keeps the data real-time.
4) Data layering classifies and layers data from sources with different attributes, which increases data reuse and gives good extensibility: when a new behavior or transaction type is to be analyzed, the existing public data layer data can be reused; when a new behavior or dimension appears, only the public data layer computation task needs to be modified rather than a new task created, which reduces computing resource overhead.
5) Result data is stored in the column-oriented distributed storage system HBase, which also provides the query service; this supports massive data volumes and high concurrency, with lower data service latency and higher reliability.
Drawings
Fig. 1 is a schematic flow chart of a resource popularity calculation method based on real-time streaming in an embodiment.
Detailed Description
The invention is described in detail below with reference to the figures and specific embodiments. It is to be understood that the embodiments described are only a few embodiments of the present invention, and not all embodiments. All other embodiments, which can be obtained by a person skilled in the art without any inventive step based on the embodiments of the present invention, shall fall within the scope of protection of the present invention.
Examples
The invention relates to a resource popularity calculation method based on real-time streams. It uses multiple kinds of user behavior data and transaction data collected in real time from an online retail platform, computes a resource popularity value through real-time statistical processing on a message middleware, and finally provides that value to the platform as a real-time data service. As shown in Fig. 1, the method comprises the following steps:
Step one, data acquisition:
Behavior data acquisition:
User behavior data for interaction events between the platform's clients (PC, mobile and mini program) and the associated server is collected into the message middleware through server-side event tracking (buried points). The user behavior data includes click behavior data, browse-details behavior data, view-warranty behavior data, bid behavior data and add-to-cart behavior data.
Transaction data acquisition:
Transaction information is synchronized to the message middleware in real time through log synchronization. The transaction information includes order data, resource numbers and the like.
Real-time data storage is built on the message middleware, for which the stable message queue Kafka is used, giving high throughput, low latency, high concurrency, high fault tolerance and high scalability.
Step two, data layering:
After data acquisition, a data layering method is used to construct a source data layer, a public data layer and an application data layer in the message middleware, and aggregation is then performed by a real-time computing engine to obtain resource popularity index values. That is, the real-time data warehouse as a whole is divided into three layers: the source data layer, the public layer (DWD and DWS layers) and the application layer.
Flink is adopted as the computing engine. The main reasons for choosing it as the stream computing engine are: high throughput, low latency and high performance; highly flexible streaming windows; exactly-once semantics for stateful computation; a lightweight fault-tolerance mechanism; support for event time and out-of-order events; and a unified stream and batch engine. Online traffic data goes directly into Kafka or another message storage system, and Flink consumes the data in real time for computation.
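To make the event-time and out-of-order point concrete, here is a small plain-Python sketch of tumbling-window counting keyed by event time. It does not use Flink and does not model watermarks, which a real engine relies on to decide when a window may be closed; the event tuples and the 60-second window are hypothetical.

```python
from collections import defaultdict


def tumbling_window_counts(events, window_seconds):
    """Count events per (window start, resource number) using the event's own timestamp."""
    counts = defaultdict(int)
    for event_time, resource_no in events:
        # The window is derived from event time, so arrival order does not matter.
        window_start = event_time - event_time % window_seconds
        counts[(window_start, resource_no)] += 1
    return dict(counts)


# (event_time_seconds, resource_no); the third event arrives late but still
# lands in the window its timestamp belongs to.
events = [(100, "R001"), (125, "R001"), (61, "R001"), (119, "R002")]
print(tumbling_window_counts(events, 60))
# {(60, 'R001'): 2, (120, 'R001'): 1, (60, 'R002'): 1}
```

Because windows are assigned from the timestamp carried by the event rather than from arrival time, a late click is still counted in the interval in which the user actually acted.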
Specifically, the method comprises the following steps:
2.1) Source data layer:
The user behavior data pushed by event tracking is stored in TPXHB01 (click behavior), TPXHB02 (browse-details behavior), TPXHB03 (view-warranty behavior), TPXHB04 (bid behavior) and TPXHB05 (add-to-cart behavior) of the source data layer of the message middleware. At the same time, the order main table TPXH_ORDER_M, the order sub-table TPXH_ORDER_D and the bidding process table XH TPBID_RECORD from the transaction system are synchronized to the source data layer of the message middleware through log synchronization.
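The routing of behavior events to the five source-layer units might be sketched as follows. The unit names TPXHB01 to TPXHB05 come from the text above; everything else is an assumption for illustration, including the event field names and the `InMemoryBroker` class, which stands in for the message middleware so that the example runs without a Kafka cluster.

```python
import json
from collections import defaultdict

# Unit names are taken from the embodiment; the behavior keys are hypothetical.
BEHAVIOR_TOPICS = {
    "click": "TPXHB01",
    "browse_detail": "TPXHB02",
    "view_warranty": "TPXHB03",
    "bid": "TPXHB04",
    "add_to_cart": "TPXHB05",
}


class InMemoryBroker:
    """Stand-in for the message middleware: one list of messages per topic."""

    def __init__(self):
        self.topics = defaultdict(list)

    def send(self, topic, value):
        self.topics[topic].append(value)


def track(broker, event):
    """Server-side tracking hook: publish one behavior event to its source-layer unit."""
    topic = BEHAVIOR_TOPICS.get(event.get("behavior"))
    if topic is None:
        return None  # unknown behavior types are not published in this sketch
    broker.send(topic, json.dumps(event, ensure_ascii=False))
    return topic


broker = InMemoryBroker()
track(broker, {"behavior": "click", "resource_no": "R001", "channel": "app"})
track(broker, {"behavior": "add_to_cart", "resource_no": "R001", "channel": "pc"})
print({topic: len(messages) for topic, messages in broker.topics.items()})
# {'TPXHB01': 1, 'TPXHB05': 1}
```

Because the hook runs on the server, the same code path handles requests from PC, mobile and mini program clients; the `channel` field here only records where the request came from.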
2.2) Public data layer, divided by data domain:
The source data layer logs are parsed through stream processing, dirty data whose resource number is empty is filtered out, and a transaction wide table and a user behavior wide table of the corresponding dimensions are generated as public data layer data. Specifically, the real-time computing engine parses the behavior data and the order data. Both the transaction information and the behavior data carry a resource number, but the resource number in the behavior data may be empty. The engine therefore first checks whether the resource number in each behavior record is empty; if so, the record is excluded from the statistics. Otherwise, a public layer transaction wide table and a user behavior wide table are built in the message middleware for the transaction domain and the behavior domain respectively, as follows:
2.3) Constructing the transaction wide table:
The real-time computing engine first joins the records of the order main table TPXH_ORDER_M and the order sub-table TPXH_ORDER_D on order ID. In this embodiment the join uses Flink's existing stream join capability, which yields the detail records directly. Statistics are then computed along chosen dimensions, the source is labeled, and the corresponding score is assigned. For example, from the resulting detail records, the transaction volume, transaction amount and resource popularity are computed by the rule "transaction time + seller + buyer + resource number + bundle number + variety + brand + specification + quality grade + region + source" (the source is order transaction, and each transacted resource scores 6 popularity points), and the public layer transaction data table DW_JY_ORDER is generated.
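The join-and-score logic can be sketched in plain Python over small in-memory lists. This is a batch illustration under stated assumptions, not a Flink stream join (which would need keyed state and a window or interval to bound it); the column names are hypothetical and only a few of the listed dimensions are carried through.

```python
TRANSACTION_SCORE = 6  # popularity points per transacted resource, per the embodiment


def build_transaction_rows(order_main, order_detail):
    """Join detail lines to their order header on order_id and attach source and score."""
    headers = {m["order_id"]: m for m in order_main}
    rows = []
    for d in order_detail:
        header = headers.get(d["order_id"])
        if header is None:
            continue  # no matching header: skipped in this sketch
        rows.append({
            "deal_time": header["deal_time"],
            "seller": header["seller"],
            "buyer": header["buyer"],
            "resource_no": d["resource_no"],
            "volume": d["volume"],
            "amount": d["amount"],
            "source": "order",
            "popularity": TRANSACTION_SCORE,
        })
    return rows


order_main = [{"order_id": "O1", "deal_time": "2020-11-26", "seller": "S1", "buyer": "B1"}]
order_detail = [
    {"order_id": "O1", "resource_no": "R001", "volume": 10, "amount": 45000},
    {"order_id": "O1", "resource_no": "R002", "volume": 5, "amount": 21000},
    {"order_id": "O9", "resource_no": "R003", "volume": 1, "amount": 4000},
]
rows = build_transaction_rows(order_main, order_detail)
print(len(rows), sum(r["popularity"] for r in rows))  # 2 12
```

The third detail line has no header in this sample, so it is dropped, which corresponds to inner-join behavior.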
2.4) Constructing the behavior wide table:
The real-time computing engine parses the behavior detail data of TPXHB01, TPXHB02, TPXHB03, TPXHB04 and TPXHB05. In this embodiment the Flink engine parses the JSON-format data directly, using existing techniques, to extract the key fields required. The records are then grouped by dimensions taken from the actual transaction rules, for example "behavior time + seller + buyer + resource number + bundle number + variety + brand + specification + quality grade + region + behavior name", and the resource popularity is aggregated.
Since each behavior reflects how deeply the user wants to learn about the product, the invention defines the scoring rule according to the depth of the behavior and scores the behaviors accordingly, with the add-to-cart behavior scoring highest. For example, TPXHB01 (click behavior) scores 1 point of resource popularity, TPXHB02 (browse-details behavior) 2 points, TPXHB03 (view-warranty behavior) 3 points, TPXHB04 (bid behavior) 4 points and TPXHB05 (add-to-cart behavior) 5 points. The scores of the listed user behaviors are combined along the same dimensions according to this rule, and the behavior wide table DW_XW_BEHAVIOR is generated.
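The scoring rule lends itself to a short worked example. The scores below are the ones given in the text; the grouping keys are a shortened, hypothetical subset of the listed dimensions, and the function is a plain-Python sketch rather than the Flink aggregation itself.

```python
from collections import defaultdict

# Scores per source-layer unit, as given in the embodiment.
BEHAVIOR_SCORES = {"TPXHB01": 1, "TPXHB02": 2, "TPXHB03": 3, "TPXHB04": 4, "TPXHB05": 5}
# Shortened, hypothetical subset of the grouping dimensions.
GROUP_KEYS = ("date", "seller", "resource_no", "behavior")


def build_behavior_rows(events):
    """Score each event by behavior depth and sum the scores per group."""
    totals = defaultdict(int)
    for event in events:
        key = tuple(event[k] for k in GROUP_KEYS)
        totals[key] += BEHAVIOR_SCORES[event["behavior"]]
    return [dict(zip(GROUP_KEYS, key), popularity=total) for key, total in totals.items()]


events = [
    {"date": "2020-11-26", "seller": "S1", "resource_no": "R001", "behavior": "TPXHB01"},
    {"date": "2020-11-26", "seller": "S1", "resource_no": "R001", "behavior": "TPXHB01"},
    {"date": "2020-11-26", "seller": "S1", "resource_no": "R001", "behavior": "TPXHB05"},
]
behavior_rows = build_behavior_rows(events)
for row in behavior_rows:
    print(row["behavior"], row["popularity"])
# TPXHB01 2
# TPXHB05 5
```

Two clicks and one add-to-cart on the same resource therefore contribute 2 + 5 = 7 popularity points in total, spread over two rows because the behavior name is one of the grouping dimensions.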
2.5) Merging to generate the application data layer resource popularity table:
Using stream processing, the transaction wide table DW_JY_ORDER and the behavior wide table DW_XW_BEHAVIOR are merged along the resource popularity detail dimensions, namely time, seller, buyer, resource number, bundle number, variety, brand, specification, quality grade, region and source (behavior name), to generate the application layer resource popularity table DM_ZY_POP.
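One possible reading of this merge is a union of the two wide tables after mapping them onto a shared set of columns, with the behavior name filling the source column for behavior rows. The sketch below follows that reading; it is an interpretation, and the column names and sample rows are hypothetical.

```python
def merge_popularity(transaction_rows, behavior_rows):
    """Union both wide tables into one detail table with a shared column set."""
    merged = []
    for row in transaction_rows:
        merged.append({
            "date": row["deal_time"],
            "seller": row["seller"],
            "resource_no": row["resource_no"],
            "source": row["source"],
            "popularity": row["popularity"],
        })
    for row in behavior_rows:
        merged.append({
            "date": row["date"],
            "seller": row["seller"],
            "resource_no": row["resource_no"],
            "source": row["behavior"],  # the behavior name plays the role of the source
            "popularity": row["popularity"],
        })
    return merged


dw_jy_order = [{"deal_time": "2020-11-26", "seller": "S1", "resource_no": "R001",
                "source": "order", "popularity": 6}]
dw_xw_behavior = [{"date": "2020-11-26", "seller": "S1", "resource_no": "R001",
                   "behavior": "TPXHB05", "popularity": 5}]
dm_zy_pop = merge_popularity(dw_jy_order, dw_xw_behavior)
print([(r["source"], r["popularity"]) for r in dm_zy_pop])
# [('order', 6), ('TPXHB05', 5)]
```

Keeping the source column on every row lets later aggregations either sum popularity across all sources or break it down by transaction versus individual behaviors.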
Step three, result output:
Computation follows the dimensions displayed by the front end; for example, if the front end displays the "variety + brand" dimension, the data is aggregated by variety and brand. In this embodiment, resource popularity is aggregated along the detail dimensions "date + seller + buyer + resource number + bundle number + variety + brand + specification + quality grade + region". After real-time aggregation over the different dimensions, the results of the application layer resource popularity table are written to HBase, a column-oriented, scalable distributed storage system, and data services are finally provided to the front end for display by querying HBase.
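Aggregating by whichever dimensions the front end displays can be sketched as a single function parameterized by the dimension list. This is an illustrative Python sketch with hypothetical rows and placeholder values, not the embodiment's streaming aggregation.

```python
from collections import defaultdict


def aggregate_popularity(rows, dimensions):
    """Sum popularity over the given display dimensions."""
    totals = defaultdict(int)
    for row in rows:
        totals[tuple(row[d] for d in dimensions)] += row["popularity"]
    return dict(totals)


# Hypothetical detail rows from the application layer table.
detail_rows = [
    {"variety": "V1", "brand": "B1", "region": "East", "popularity": 6},
    {"variety": "V1", "brand": "B1", "region": "West", "popularity": 5},
    {"variety": "V2", "brand": "B9", "region": "East", "popularity": 1},
]
print(aggregate_popularity(detail_rows, ("variety", "brand")))
# {('V1', 'B1'): 11, ('V2', 'B9'): 1}
print(aggregate_popularity(detail_rows, ("region",)))
# {('East',): 7, ('West',): 5}
```

Storing the detail rows once and deriving each display view from them is what allows a new front-end dimension to be served without re-collecting source data.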
In this embodiment, as a preferred scheme, the computed results are stored in HBase, and a data service API is generated and provided to the front end for display. That is, a unified interface service layer (such as OneService) provides a Dubbo interface through which service consumers obtain the index data, and the front end displays it.
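A data service over the stored results might look like the following sketch. `InMemoryTable` is a stand-in for an HBase table so that the example is runnable; the "variety|brand" row key layout and the `popularity_service` function are assumptions for illustration and are not described in the source, and no Dubbo or HBase client API is used.

```python
class InMemoryTable:
    """Stand-in for an HBase table: maps a row key to a dict of column values."""

    def __init__(self):
        self._rows = {}

    def put(self, row_key, columns):
        self._rows.setdefault(row_key, {}).update(columns)

    def scan(self, prefix=""):
        # Yield rows in sorted key order, restricted to keys with the given prefix.
        for key in sorted(self._rows):
            if key.startswith(prefix):
                yield key, dict(self._rows[key])


def popularity_service(table, variety, top_n=10):
    """Hypothetical data-service call: rank the brands of one variety by popularity."""
    rows = [
        {"brand": key.split("|", 1)[1], "popularity": cols["popularity"]}
        for key, cols in table.scan(prefix=variety + "|")
    ]
    return sorted(rows, key=lambda r: r["popularity"], reverse=True)[:top_n]


table = InMemoryTable()
table.put("V1|B1", {"popularity": 11})
table.put("V1|B2", {"popularity": 17})
table.put("V2|B9", {"popularity": 3})
print(popularity_service(table, "V1"))
# [{'brand': 'B2', 'popularity': 17}, {'brand': 'B1', 'popularity': 11}]
```

Putting the leading display dimension first in the row key means one prefix scan returns everything a "variety" view needs, which suits a store whose rows are sorted by key.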
In addition, the invention provides a resource popularity calculation system based on real-time streams, comprising a data acquisition module, a data layering construction module, a data processing module, a resource popularity calculation module, the distributed storage system HBase and a front end, wherein:
The data acquisition module collects user behavior data for interaction events between the client and the server through server-side event tracking, and synchronizes transaction information to the message middleware in real time through log synchronization.
The data layering construction module classifies and layers the collected data from sources with different attributes using a data layering method; the layers are a source data layer, a public data layer and an application data layer.
The data processing module parses the source data layer logs through stream processing, filters out dirty data whose resource number is empty, and generates a transaction wide table and a user behavior wide table of the corresponding dimensions. For the transaction wide table, the real-time computing engine first joins the records of the order main table and the order sub-table on order ID, computes statistics according to the display dimensions of the front end, labels the source, and assigns the corresponding resource popularity score. For the user behavior wide table, a scoring rule is set according to the depth of each behavior, the user's different behaviors are scored for resource popularity, and the scores of the listed user behaviors are combined along the same dimensions to compute the resource popularity.
The resource popularity calculation module parses the detail data of each behavior in the source data layer through the real-time computing engine and computes a resource popularity value from the parsed data.
The distributed storage system HBase stores the data processing results and provides a data query service.
The front end sets the resource popularity display dimensions and displays the data served from the distributed storage system HBase.
The present invention further provides a computer-readable storage medium, which is a non-volatile readable storage medium and stores a computer program, where the computer program is executable by at least one processor to implement the operation of the resource popularity calculation method or system based on real-time streaming.
While the invention has been described with reference to specific embodiments, the invention is not limited thereto, and those skilled in the art can easily conceive of various equivalent modifications or substitutions within the technical scope of the invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (17)

1. A resource popularity calculation method based on real-time flow is characterized in that the method calculates user transaction data and user behavior data of a network retail platform according to the display dimension of a front end, and comprises the following steps:
behavior data acquisition: collecting user behavior data with interaction events between each client and a server of the network retail platform to a message middleware in a server embedded point mode;
a transaction data acquisition step: synchronizing transaction information to a message middleware in real time in a log synchronization mode;
data layering processing: and constructing a source data layer, a public data layer and an application data layer in the message middleware by adopting a data layering method, and performing aggregation calculation through a real-time calculation engine to obtain resource popularity index values.
2. The real-time streaming based resource popularity computation method of claim 1, wherein the transaction information includes order data and resource number information.
3. The real-time stream-based resource popularity computation method according to claim 1, wherein the source data layer includes a click behavior unit, a browse details behavior unit, a view warranty behavior unit, a bid behavior unit, and a car-filling behavior unit.
4. The resource popularity calculation method based on the real-time stream according to claim 3, wherein in the data layering step, user behavior data collected by pushing of a server-side buried point are respectively stored in a click behavior unit, a browse details behavior unit, a quality insurance book viewing behavior unit, a bid behavior unit and a car adding behavior unit of a source data layer of a message middleware.
5. The resource popularity computation method based on the real-time streaming according to claim 4, wherein the message middleware adopts a distributed publish-subscribe message system kafka.
6. The resource popularity calculation method based on the real-time stream according to claim 1, wherein in the data layering step, stream processing is adopted to analyze log information of a source data layer, data with empty resource numbers in user behavior data are filtered, and a transaction width table and a user behavior width table which are adaptive to a presentation dimension of a front end are generated to serve as public data layer data.
7. The resource popularity calculation method based on the real-time flow according to claim 6, wherein the real-time calculation engine analyzes the user behavior data and the order data, judges whether the resource number in the user behavior data is empty, if the resource number is empty, the statistical calculation is not included, otherwise, according to the two data fields of transaction and behavior, a transaction width table and a user behavior width table of the message middleware are respectively constructed in the transaction field and the behavior field.
8. The real-time stream-based resource popularity calculation method according to claim 7, wherein the transaction wide table is constructed as follows:
the real-time calculation engine first associates the records of the order main table and the order sub-table by order ID, then aggregates statistics according to the display dimensions of the front end, marks the source, assigns the corresponding resource popularity score, and generates a public data layer data table.
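A minimal sketch of the construction in claim 8, assuming in-memory lists in place of streams. The field names, the `ORDER_ITEM_SCORE` weight, and the `"transaction"` source mark are hypothetical; the claim does not specify score values or column names.

```python
from collections import defaultdict
from typing import Dict, List

ORDER_ITEM_SCORE = 10  # hypothetical popularity score per ordered line item


def build_transaction_wide_table(
    order_main: List[Dict], order_items: List[Dict], dimension: str = "resource_id"
) -> Dict[str, Dict]:
    """Join order sub-table rows to the main table by order ID, then
    aggregate per display dimension with a source mark and a score."""
    main_by_id = {row["order_id"]: row for row in order_main}
    wide: Dict[str, Dict] = defaultdict(
        lambda: {"source": "transaction", "item_count": 0, "score": 0}
    )
    for item in order_items:
        main = main_by_id.get(item["order_id"])
        if main is None:
            continue  # inner join: skip items without a matching main record
        joined = {**main, **item}
        row = wide[joined[dimension]]
        row["item_count"] += 1
        row["score"] += ORDER_ITEM_SCORE
    return dict(wide)
```

The `dimension` parameter stands for whichever display dimension the front end requests; any column present after the join can be used.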
9. The real-time stream-based resource popularity calculation method according to claim 6, wherein the user behavior wide table is constructed as follows:
the real-time calculation engine parses the detail data of each of the click behavior, browse-details behavior, warranty-certificate viewing behavior, bid behavior, and add-to-cart behavior, groups the data according to the display dimensions of the front end, and aggregates the data to calculate the resource popularity.
10. The real-time stream-based resource popularity calculation method according to claim 9, wherein the resource popularity is aggregated and calculated as follows:
a scoring rule is set according to the depth of each behavior, and the different user behaviors are assigned resource popularity scores that increase with depth, the add-to-cart behavior receiving the highest score; the resource popularity scores of all the listed user behaviors are then merged along the same dimension, and the resource popularity is calculated to generate the behavior wide table.
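The scoring rule of claim 10 can be illustrated with a small sketch. The numeric weights below are assumptions chosen only so that deeper behaviors score higher and add-to-cart scores highest; the claim does not give actual values, and the field names are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable

# Hypothetical weights: deeper behaviors score higher, add-to-cart highest.
BEHAVIOR_SCORES = {
    "click": 1,
    "browse_details": 2,
    "view_warranty": 3,
    "bid": 4,
    "add_to_cart": 5,
}


def popularity_by_dimension(
    events: Iterable[Dict], dimension: str = "resource_id"
) -> Dict[str, int]:
    """Sum behavior scores per value of the chosen display dimension."""
    totals: Counter = Counter()
    for event in events:
        totals[event[dimension]] += BEHAVIOR_SCORES[event["behavior"]]
    return dict(totals)
```

For example, a resource with one click and one add-to-cart event totals 1 + 5 = 6 under these assumed weights.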
11. The real-time stream-based resource popularity calculation method according to claim 8 or 10, wherein the real-time calculation engine is the open-source stream processing framework Flink.
12. The real-time stream-based resource popularity calculation method according to claim 1, wherein the data layering processing step further includes a calculation result output step, in which real-time aggregation is performed along different dimensions, the results are written to the distributed storage system HBase, and data services are provided to the front end by querying HBase.
13. The real-time stream-based resource popularity calculation method according to claim 12, wherein intermediate results of the aggregation calculation are stored in the distributed storage system HBase or in the message middleware, and the final calculation results are provided to the front end for display through a generated data service API.
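The output step of claims 12 and 13 can be sketched with an in-memory stand-in for the result table. Nothing here is HBase client code: the `InMemoryResultStore` class, the `dimension:key` row-key layout, and the `top_resources` query are assumptions made purely to show how per-dimension results might be written and then served to a front end.

```python
from typing import Dict, List, Tuple


class InMemoryResultStore:
    """Stand-in for the result table; not an HBase client."""

    def __init__(self) -> None:
        self._rows: Dict[str, int] = {}

    @staticmethod
    def row_key(dimension: str, key: str) -> str:
        # Hypothetical row-key layout: dimension name, colon, dimension value.
        return f"{dimension}:{key}"

    def put(self, dimension: str, key: str, score: int) -> None:
        """Write (or overwrite) the aggregated score for one dimension value."""
        self._rows[self.row_key(dimension, key)] = score

    def scan(self, dimension: str) -> List[Tuple[str, int]]:
        """Return (value, score) pairs for every row of one dimension."""
        prefix = f"{dimension}:"
        return [
            (k[len(prefix):], v) for k, v in self._rows.items() if k.startswith(prefix)
        ]


def top_resources(
    store: InMemoryResultStore, dimension: str, limit: int
) -> List[Tuple[str, int]]:
    """Data-service style query: highest scores first, ties broken by key."""
    rows = store.scan(dimension)
    rows.sort(key=lambda kv: (-kv[1], kv[0]))
    return rows[:limit]
```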
14. The real-time stream-based resource popularity calculation method according to claim 1, wherein the client comprises a PC client, a mobile client, and a mini program.
15. A real-time stream-based resource popularity calculation system, the system comprising:
a data acquisition module, configured to acquire, by server-side event tracking, user behavior data for interaction events between the client and the server;
a data layering construction module, configured to classify and layer the collected data from sources with different attributes by a data layering method, the layers comprising a source data layer, a public data layer, and an application data layer;
a data processing module, configured to parse the source data layer logs through stream processing, filter out data whose resource number is empty, and generate a transaction wide table and a user behavior wide table with the corresponding dimensions;
a resource popularity calculation module, configured to parse the detail data of each behavior in the source data layer through a real-time calculation engine and calculate a resource popularity value from the parsed data;
a distributed storage system HBase, configured to store the data processing results and provide a data query service; and
a front end, configured to set the resource popularity display dimensions and display the data served by the distributed storage system HBase.
16. The real-time stream-based resource popularity calculation system of claim 15, wherein the distributed storage system HBase provides a data service API for display by the front end.
17. A computer-readable storage medium having stored thereon a computer program executable by at least one processor to perform the steps of the real-time stream-based resource popularity calculation method according to any one of claims 1-14.
CN202011349407.7A 2020-11-26 2020-11-26 Resource popularity calculation method, system and storage medium based on real-time flow Pending CN112418941A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011349407.7A CN112418941A (en) 2020-11-26 2020-11-26 Resource popularity calculation method, system and storage medium based on real-time flow

Publications (1)

Publication Number Publication Date
CN112418941A true CN112418941A (en) 2021-02-26

Family

ID=74842948

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011349407.7A Pending CN112418941A (en) 2020-11-26 2020-11-26 Resource popularity calculation method, system and storage medium based on real-time flow

Country Status (1)

Country Link
CN (1) CN112418941A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113852664A (en) * 2021-08-19 2021-12-28 天津市普迅电力信息技术有限公司 Energy commodity and energy demand accurate pushing method based on distributed real-time calculation
CN115361418A (en) * 2022-08-18 2022-11-18 中国第一汽车股份有限公司 Vehicle-mounted distributed dynamic data embedded point acquisition method, vehicle and cloud server

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108427711A (en) * 2018-01-31 2018-08-21 北京三快在线科技有限公司 Real-time data warehouse, real-time data processing method, electronic equipment and storage medium
CN110909063A (en) * 2019-11-28 2020-03-24 蜂助手股份有限公司 User behavior analysis method and device, application server and storage medium
CN111026801A (en) * 2019-12-25 2020-04-17 焦点科技股份有限公司 Method and system for assisting operation quick decision-making work of insurance type e-commerce
CN111651510A (en) * 2020-05-14 2020-09-11 拉扎斯网络科技(上海)有限公司 Data processing method and device, electronic equipment and computer readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210226