CN113590343A - Method for solving uneven information ratio - Google Patents

Info

Publication number
CN113590343A
Authority
CN
China
Prior art keywords
data
hotspot
message queue
hot
processing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010367166.2A
Other languages
Chinese (zh)
Inventor
邹海文
刘大力
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hainan Palm Energy Media Co ltd
Original Assignee
Hainan Palm Energy Media Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hainan Palm Energy Media Co ltd
Priority to CN202010367166.2A
Publication of CN113590343A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/54 Interprogram communication
    • G06F9/546 Message passing systems or structures, e.g. queues
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00 Indexing scheme relating to G06F9/00
    • G06F2209/54 Indexing scheme relating to G06F9/54
    • G06F2209/548 Queue

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The method addresses low program efficiency caused by an uneven information ratio. The distributed message queue it adopts provides low coupling, reliable delivery, broadcasting, flow control, and eventual consistency, which helps decouple the service system and improves development efficiency and system stability. Hotspot data receives dedicated processing, and non-hotspot data is processed without being affected by the volume of hotspot data. Therefore, even if the hotspot data processing program becomes abnormal or crashes because of a sudden data burst, the normal flow of non-hotspot data is not affected, and business processes such as data warehousing, data statistics, and data analysis continue normally for non-hotspot data.

Description

Method for solving uneven information ratio
Technical Field
The invention belongs to the field of computer information processing, and particularly relates to a method for solving the problem of low program efficiency caused by uneven information ratio.
Background
Before the present invention, the usual approach in the industry was to process all data with a single project, which runs into the problems of data hotspots and uneven data heat. A hotspot arises when a large number of clients directly access one node, or a small number of nodes, of a cluster; the access may be a read, a write, or another operation. The heavy access can push the single machine hosting the hotspot region beyond its capacity, degrading performance or even making the hotspot region unavailable. This also affects other regions on the same region server and wastes resources, because the host cannot serve requests for those other regions. For example, 20% of the data types in the processed information may account for 80% of the data traffic, and when the hardware running the program cannot support the whole system, the flow of all data is affected.
Disclosure of Invention
To solve the above problems, the object of the present invention is to provide a method for solving low program efficiency caused by an uneven information ratio. To achieve this object, the invention adopts the following technical solution:
A method for solving low program efficiency caused by an uneven information ratio, the method comprising:
S1, connecting the data source to a distributed message queue (MQ for short), where a judging module of the distributed message queue identifies hotspot data and non-hotspot data in the data source;
S2, the judging module of the distributed message queue sends the hotspot data to the topic of the hotspot data message queue and sends the non-hotspot data to the topic of the non-hotspot data message queue;
S3, the hotspot data is processed by a hotspot data processing program to carry out business operations, and the non-hotspot data is processed by a non-hotspot data processing program to carry out business operations.
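Steps S1 to S3 can be sketched as follows. This is only a minimal illustration under stated assumptions, not the patented implementation: `InMemoryBroker` is a hypothetical in-process stand-in for a distributed message queue, the topic names are invented, and hot data is recognized here by a set of hot data types (`hot_types`), which the text does not prescribe.

```python
from collections import defaultdict, deque

# Hypothetical topic names; the text only speaks of a hot and a non-hot topic.
HOT_TOPIC = "hot-data"
NON_HOT_TOPIC = "non-hot-data"


class InMemoryBroker:
    """Stand-in for a distributed MQ: one FIFO queue per topic."""

    def __init__(self):
        self.topics = defaultdict(deque)

    def publish(self, topic, message):
        self.topics[topic].append(message)

    def consume(self, topic):
        """Pop the oldest message of a topic, or return None if it is empty."""
        pending = self.topics[topic]
        return pending.popleft() if pending else None


def route(broker, message, hot_types):
    """S1 + S2: decide whether a message is hot, then publish it to the matching topic."""
    topic = HOT_TOPIC if message["type"] in hot_types else NON_HOT_TOPIC
    broker.publish(topic, message)
    return topic


def drain(broker, topic, handler):
    """S3: process every pending message of one topic with that topic's own handler."""
    results = []
    while (message := broker.consume(topic)) is not None:
        results.append(handler(message))
    return results


broker = InMemoryBroker()
source = [
    {"type": "video", "id": 1},
    {"type": "article", "id": 2},
    {"type": "video", "id": 3},
]
for msg in source:
    route(broker, msg, hot_types={"video"})

hot_ids = drain(broker, HOT_TOPIC, lambda m: m["id"])
non_hot_ids = drain(broker, NON_HOT_TOPIC, lambda m: m["id"])
```

Because each topic has its own queue and its own handler, the two processing programs never compete for the same backlog, which is the separation the method relies on.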
Further, the distributed message queue comprises: ActiveMQ (Apache ActiveMQ, open-source message middleware developed by the Apache Software Foundation), RabbitMQ (open-source message broker software written in Erlang, also called message-oriented middleware), RocketMQ (Apache RocketMQ message middleware), or ZeroMQ (a simple, easy-to-use transport layer, ZMQ for short).
Further, the hotspot data processing program needs to be deployed on a server with a higher hardware configuration; the non-hotspot data processing program can be deployed on a server with a lower hardware configuration.
Further, the business operations comprise data warehousing, data statistics, data analysis, and data cleaning.
The method further comprises: deriving a processing result from the number of clicks in the processing record, the processing result being either hotspot data or non-hotspot data;
judging the processing result in the message queue, treating a message whose processing result is hotspot data as a hotspot data message and a message whose processing result is non-hotspot data as a non-hotspot data message;
acquiring the click counts of the hotspot data messages, sorting them in a sorting table by click count, and updating the sorting table; and acquiring the click counts of the non-hotspot data messages, sorting them in a sorting table by click count, and updating the sorting table.
The methods for identifying the processing result in the message queue include "sequential storage + sequential access" and "random storage + query access". The "random storage + query access" mode is implemented on top of various distributed systems and is organized and managed by a file system, without regard to the order in which the information was generated. The "sequential storage + sequential access" mode is generally implemented with a queue such as Kafka (an open-source stream-processing platform developed by the Apache Software Foundation) or ActiveMQ, and the information to be exchanged is written to the topic of a message queue in the order in which it was generated.
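The difference between the two access modes can be illustrated with a small sketch. Both classes are hypothetical, in-memory simplifications introduced only for illustration: `SequentialLog` mimics an append-only topic that is read by offset, and `KeyedStore` mimics a store that is queried by key regardless of arrival order.

```python
class SequentialLog:
    """'Sequential storage + sequential access': append in arrival order, read by offset."""

    def __init__(self):
        self._entries = []

    def append(self, message):
        self._entries.append(message)
        return len(self._entries) - 1  # offset of the entry just written

    def read_from(self, offset):
        """Return all entries from `offset` onward, in the order they were written."""
        return self._entries[offset:]


class KeyedStore:
    """'Random storage + query access': entries are looked up by key, not by time."""

    def __init__(self):
        self._by_key = {}

    def put(self, key, message):
        self._by_key[key] = message

    def query(self, key):
        """Return the entry stored under `key`, or None if there is none."""
        return self._by_key.get(key)


log = SequentialLog()
store = KeyedStore()
for key, result in [("m1", "hot"), ("m2", "non-hot"), ("m3", "hot")]:
    log.append((key, result))
    store.put(key, result)

replayed = log.read_from(1)    # everything written at offset 1 or later, in order
looked_up = store.query("m2")  # direct lookup, independent of write order
```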
The click counts of the hotspot data messages are sorted in the sorting table from high to low.
When the number of clicks in a processing record exceeds a set threshold, the record is classified as hotspot data; when the number of clicks is lower than the set threshold, the record is classified as non-hotspot data.
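The threshold rule can be written as a one-line classifier. The sketch below rests on two assumptions that the text leaves open: the threshold value of 1000 is invented, and a click count exactly equal to the threshold is treated as non-hotspot data, since the text only defines "exceeds" and "is lower than".

```python
def classify(click_count, threshold):
    """Label a processing record as hot or non-hot by its click count.

    Assumption: a count exactly equal to the threshold is treated as non-hot,
    because the text only defines the "exceeds" and "lower than" cases.
    """
    return "hot" if click_count > threshold else "non-hot"


THRESHOLD = 1000  # hypothetical value; the source does not give one
records = {"a": 5000, "b": 1000, "c": 12}
labels = {key: classify(clicks, THRESHOLD) for key, clicks in records.items()}
```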
A system for solving low program efficiency caused by an uneven information ratio, the system comprising: a recording module, a judging module, a sorting module, message queue topics, and a data processing program module. The recording module records the number of clicks according to a processing result, the processing result comprising hotspot data messages and non-hotspot data messages. The judging module judges the processing result of the message queue according to the number of clicks, classifying messages whose click count exceeds a set threshold as hotspot data messages and messages whose click count is lower than the set threshold as non-hotspot data messages. The sorting module acquires the click counts of the hotspot data messages and the non-hotspot data messages and sorts them in a sorting table from high to low. The message queue topics comprise a topic of the hotspot data message queue, which receives hotspot data messages, and a topic of the non-hotspot data message queue, which receives non-hotspot data messages. The data processing program module comprises a hotspot data processing program, which performs data warehousing, data statistics, data analysis, and data cleaning on hotspot data, and a non-hotspot data processing program, which performs the same operations on non-hotspot data.
A hotspot arises when a large number of clients directly access one node, or a small number of nodes, of the cluster (the access may be a read, a write, or another operation). The heavy access can push the single machine hosting the hotspot region beyond its capacity, degrading performance or even making the hotspot region unavailable. This also affects other regions on the same region server and wastes resources, because the host cannot serve requests for those other regions.
Compared with the prior art, the invention has the following technical effects:
The distributed message queue adopted by the method provides low coupling, reliable delivery, broadcasting, flow control, and eventual consistency, which helps decouple the service system and improves development efficiency and system stability. Hotspot data receives dedicated processing, and non-hotspot data is processed without being affected by the volume of hotspot data. Therefore, even if the hotspot data processing program becomes abnormal or crashes because of a sudden data burst, the normal flow of non-hotspot data is not affected, and business processes such as data warehousing, data statistics, and data analysis continue normally for non-hotspot data.
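The isolation effect can be demonstrated with two independent consumers. The following is a simplified sketch, assuming in-process `queue.Queue` objects in place of real MQ topics and a hypothetical `"burst"` message that makes the hot-data handler fail; it only shows that a crash on the hot side leaves the non-hot side untouched.

```python
import queue
import threading


def run_consumer(topic_queue, handler, processed):
    """Consume one topic until a None sentinel arrives; a crash ends only this consumer."""
    try:
        while (message := topic_queue.get()) is not None:
            processed.append(handler(message))
    except Exception:
        # Simulated crash of this processing program; the other consumer keeps running.
        pass


def hot_handler(message):
    """Hypothetical hot-data handler that fails when a burst message arrives."""
    if message == "burst":
        raise RuntimeError("hot-data program overloaded")
    return message


hot_q, non_hot_q = queue.Queue(), queue.Queue()
hot_done, non_hot_done = [], []

for item in ["h1", "burst", "h2", None]:
    hot_q.put(item)
for item in ["n1", "n2", "n3", None]:
    non_hot_q.put(item)

workers = [
    threading.Thread(target=run_consumer, args=(hot_q, hot_handler, hot_done)),
    threading.Thread(target=run_consumer, args=(non_hot_q, lambda m: m, non_hot_done)),
]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()
```

After both threads finish, the hot consumer has stopped at the failing message while the non-hot consumer has processed its whole queue.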
Drawings
FIG. 1 is a flow chart showing the method of solving the problem of low efficiency of the program caused by uneven information ratio according to the present invention;
FIG. 2 is a schematic diagram of a system for solving the problem of program efficiency reduction caused by uneven information ratio according to the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in further detail with reference to the accompanying drawings in conjunction with the following detailed description. It is to be understood that this description is made only by way of example and not as a limitation on the scope of the invention. Moreover, in the following description, descriptions of well-known structures and techniques are omitted so as to not unnecessarily obscure the concepts of the present invention.
FIG. 1 is a flow chart of the method for solving low program efficiency caused by an uneven information ratio, which comprises the following steps:
Step 1: Connect a data source to a distributed message queue, the distributed message queue comprising ActiveMQ, RabbitMQ, RocketMQ, or ZeroMQ; the judging module of the distributed message queue identifies hotspot data and non-hotspot data in the data source.
Step 2: The judging module of the distributed message queue sends the hotspot data to the topic of the hotspot data message queue and sends the non-hotspot data to the topic of the non-hotspot data message queue. Hotspot and non-hotspot data are distinguished by click count: when the number of clicks in a processing record exceeds a set threshold, the record is classified as hotspot data; when it is lower than the set threshold, the record is classified as non-hotspot data.
Step 3: The hotspot data is processed by the hotspot data processing program, which then carries out business operations, including data warehousing, data statistics, data analysis, or data cleaning; the non-hotspot data is processed by the non-hotspot data processing program to carry out business operations. The hotspot data processing program needs to be deployed on a server with a higher hardware configuration; the non-hotspot data processing program can be deployed on a server with a lower hardware configuration.
Step 4: Judge the processing result in the message queue, the processing result being hotspot data or non-hotspot data according to the number of clicks in the processing record; treat a message whose processing result is hotspot data as a hotspot data message, and a message whose processing result is non-hotspot data as a non-hotspot data message.
Step 5: Acquire the click counts of the hotspot data messages, sort them in a sorting table by click count, and update the sorting table; acquire the click counts of the non-hotspot data messages, sort them in a sorting table by click count, and update the sorting table. Preferably, the click counts of the hotspot data messages are sorted in the sorting table from high to low. In addition to sorting, the methods for identifying the processing result in the message queue include "sequential storage + sequential access" and "random storage + query access".
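The sorting table of Step 5 might look like the sketch below. `RankingTable` is a hypothetical name, the click counts are invented, and breaking ties by message id is an assumption added so that the order is deterministic; the text only requires sorting from high to low.

```python
class RankingTable:
    """Keeps (message_id, clicks) pairs and returns them from highest to lowest clicks."""

    def __init__(self):
        self._clicks = {}

    def update(self, message_id, clicks):
        """Record the latest click count for a message, replacing any earlier value."""
        self._clicks[message_id] = clicks

    def ranked(self):
        """Return entries sorted by clicks, high to low; ties are broken by message id."""
        return sorted(self._clicks.items(), key=lambda item: (-item[1], item[0]))


hot_table = RankingTable()
for message_id, clicks in [("m1", 1200), ("m2", 9800), ("m3", 4300)]:
    hot_table.update(message_id, clicks)

hot_table.update("m1", 5000)  # a later reading replaces the earlier one
top = hot_table.ranked()
```

A second `RankingTable` instance would hold the non-hotspot messages, so that the two rankings are updated independently.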
FIG. 2 is a schematic diagram of a system for solving low program efficiency caused by an uneven information ratio, the system comprising: a recording module, a judging module, a sorting module, message queue topics, and a data processing program module. The recording module records the number of clicks according to a processing result, the processing result comprising hotspot data messages and non-hotspot data messages. The judging module judges the processing result of the message queue according to the number of clicks, classifying messages whose click count exceeds a set threshold as hotspot data messages and messages whose click count is lower than the set threshold as non-hotspot data messages. The sorting module acquires the click counts of the hotspot data messages and the non-hotspot data messages and sorts them in a sorting table from high to low. The message queue topics comprise a topic of the hotspot data message queue, which receives hotspot data messages, and a topic of the non-hotspot data message queue, which receives non-hotspot data messages. The data processing program module comprises a hotspot data processing program, which performs data warehousing, data statistics, data analysis, and data cleaning on hotspot data, and a non-hotspot data processing program, which performs the same operations on non-hotspot data.
It is to be understood that the above-described embodiments of the present invention are merely illustrative of or explaining the principles of the invention and are not to be construed as limiting the invention. Therefore, any modification, equivalent replacement, improvement and the like made without departing from the spirit and scope of the present invention should be included in the protection scope of the present invention. Further, it is intended that the appended claims cover all such variations and modifications as fall within the scope and boundaries of the appended claims or the equivalents of such scope and boundaries.

Claims (9)

1. A method for solving low program efficiency caused by an uneven information ratio, the method comprising:
S1, connecting a data source to a distributed message queue, a judging module of the distributed message queue identifying hotspot data and non-hotspot data in the data source;
S2, the judging module of the distributed message queue sending the hotspot data to the topic of the hotspot data message queue and sending the non-hotspot data to the topic of the non-hotspot data message queue;
S3, the hotspot data being processed by a hotspot data processing program to carry out business operations, and the non-hotspot data being processed by a non-hotspot data processing program to carry out business operations.
2. The method of claim 1, wherein the distributed message queue comprises: ActiveMQ, RabbitMQ, RocketMQ, or ZeroMQ.
3. The method of claim 1, wherein the hotspot data processing program needs to be deployed on a server with a higher hardware configuration, and the non-hotspot data processing program can be deployed on a server with a lower hardware configuration.
4. The method of claim 1, wherein the business operations comprise data warehousing, data statistics, data analysis, and data cleaning.
5. The method of claim 1, further comprising:
deriving a processing result from the number of clicks in the processing record, the processing result being either hotspot data or non-hotspot data;
judging the processing result in the message queue, treating a message whose processing result is hotspot data as a hotspot data message and a message whose processing result is non-hotspot data as a non-hotspot data message;
acquiring the click counts of the hotspot data messages, sorting them in a sorting table by click count, and updating the sorting table; and acquiring the click counts of the non-hotspot data messages, sorting them in a sorting table by click count, and updating the sorting table.
6. The method of claim 5, wherein the methods for identifying the processing result in the message queue comprise "sequential storage + sequential access" and "random storage + query access".
7. The method of claim 5, wherein the click counts of the hotspot data messages are sorted in the sorting table from high to low.
8. The method of claim 5, wherein a processing record is classified as hotspot data when its number of clicks exceeds a set threshold, and as non-hotspot data when its number of clicks is lower than the set threshold.
9. A system using the method of any of claims 1-8 for solving low program efficiency caused by an uneven information ratio, the system comprising: a recording module, a judging module, a sorting module, message queue topics, and a data processing program module; the recording module records the number of clicks according to a processing result, the processing result comprising hotspot data messages and non-hotspot data messages; the judging module judges the processing result of the message queue according to the number of clicks, classifying messages whose click count exceeds a set threshold as hotspot data messages and messages whose click count is lower than the set threshold as non-hotspot data messages; the sorting module acquires the click counts of the hotspot data messages and the non-hotspot data messages and sorts them in a sorting table from high to low; the message queue topics comprise a topic of the hotspot data message queue, which receives hotspot data messages, and a topic of the non-hotspot data message queue, which receives non-hotspot data messages; the data processing program module comprises a hotspot data processing program, which performs data warehousing, data statistics, data analysis, and data cleaning on hotspot data, and a non-hotspot data processing program, which performs data warehousing, data statistics, data analysis, and data cleaning on non-hotspot data.
CN202010367166.2A 2020-04-30 2020-04-30 Method for solving uneven information ratio Pending CN113590343A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010367166.2A CN113590343A (en) 2020-04-30 2020-04-30 Method for solving uneven information ratio

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010367166.2A CN113590343A (en) 2020-04-30 2020-04-30 Method for solving uneven information ratio

Publications (1)

Publication Number Publication Date
CN113590343A true CN113590343A (en) 2021-11-02

Family

ID=78237849

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010367166.2A Pending CN113590343A (en) 2020-04-30 2020-04-30 Method for solving uneven information ratio

Country Status (1)

Country Link
CN (1) CN113590343A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination