CN113590343A - Method for solving uneven information ratio - Google Patents
Method for solving uneven information ratio
- Publication number
- CN113590343A (application CN202010367166.2A)
- Authority
- CN
- China
- Prior art keywords
- data
- hotspot
- message queue
- hot
- processing
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/54—Interprogram communication
- G06F9/546—Message passing systems or structures, e.g. queues
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2209/00—Indexing scheme relating to G06F9/00
- G06F2209/54—Indexing scheme relating to G06F9/54
- G06F2209/548—Queue
Landscapes
- Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention provides a method for solving the problem of low program efficiency caused by an uneven information ratio. The method uses a distributed message queue, which offers low coupling, reliable delivery, broadcasting, flow control, eventual consistency and similar functions, and which helps decouple the service system and improve development efficiency and system stability. Hotspot data receives dedicated processing, and non-hotspot data is processed without being affected by the volume of hotspot data. Therefore, even if the hotspot data processing program becomes abnormal or crashes because of a sudden burst of data, the normal flow of non-hotspot data is not affected, and business processes such as data warehousing, data statistics and data analysis continue to run normally on the non-hotspot data.
Description
Technical Field
The invention belongs to the field of computer information processing, and particularly relates to a method for solving the problem of low program efficiency caused by uneven information ratio.
Background
Before the present invention, the usual industry solution was to process all data with a single program, which leads to the problems of data hotspots and uneven data heat. A hotspot arises when a large number of clients directly access one node, or a small number of nodes, of a cluster; the access may be a read, a write or another operation. The heavy access can push the single machine hosting the hotspot region beyond its capacity, degrading performance or even making the hotspot region unavailable. This also affects other regions on the same region server, and resources are wasted because the host cannot serve requests for those other regions. For example, 20% of the data types in the processed information may account for 80% of the data traffic, and when the hardware running the program cannot support the whole system, the flow of all data is affected.
Disclosure of Invention
In order to solve the above problems, an object of the present invention is to provide a method for solving the problem of low program efficiency caused by an uneven information ratio. To achieve this object, the invention adopts the following technical solution:
A method for solving the problem of low program efficiency caused by an uneven information ratio, the method comprising:
S1, connecting the data source to a distributed message queue (Message Queue, MQ for short), a judging module of the distributed message queue identifying hotspot data and non-hotspot data in the data source;
S2, the judging module of the distributed message queue sending the hotspot data to the topic of the hotspot data message queue, and sending the non-hotspot data to the topic of the non-hotspot data message queue;
S3, the hotspot data being processed by a hotspot data processing program to carry out business operations, and the non-hotspot data being processed by a non-hotspot data processing program to carry out business operations.
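Steps S1 to S3 can be illustrated with a minimal in-memory sketch. It is not the patented implementation: Python's standard `queue.Queue` stands in for the two message queue topics, and the names `route`, `drain`, `SAMPLE` and the rule "records of type video are hot" are hypothetical, chosen only for illustration.

```python
import queue
from typing import Callable, Iterable


def route(source: Iterable[dict],
          is_hot: Callable[[dict], bool],
          hot_topic: queue.Queue,
          non_hot_topic: queue.Queue) -> None:
    """S1/S2: judge each record and publish it to the matching topic."""
    for record in source:
        target = hot_topic if is_hot(record) else non_hot_topic
        target.put(record)


def drain(topic: queue.Queue, handler: Callable[[dict], None]) -> int:
    """S3: a processing program consumes only its own topic."""
    handled = 0
    while True:
        try:
            record = topic.get_nowait()
        except queue.Empty:
            return handled
        handler(record)
        handled += 1


# Hypothetical sample data; the "type" field and the hot rule are assumptions.
SAMPLE = [
    {"type": "video", "payload": 1},
    {"type": "log", "payload": 2},
    {"type": "video", "payload": 3},
]

hot_topic, non_hot_topic = queue.Queue(), queue.Queue()
route(SAMPLE, lambda r: r["type"] == "video", hot_topic, non_hot_topic)

stored_hot, stored_non_hot = [], []          # stand-ins for "data warehousing"
hot_count = drain(hot_topic, stored_hot.append)
non_hot_count = drain(non_hot_topic, stored_non_hot.append)
```

Because each processing program reads only its own topic, the two programs could be scaled or deployed separately, which is the property the method relies on.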
Further, the distributed message queue comprises: ActiveMQ (Apache ActiveMQ, open-source message middleware developed by the Apache Software Foundation), RabbitMQ (open-source message broker software written in Erlang, also called message-oriented middleware), RocketMQ (the Apache RocketMQ message middleware), or ZeroMQ (a simple, easy-to-use transport layer, ZMQ for short).
Further, the hotspot data processing program needs to be deployed on a server with a higher hardware configuration, while the non-hotspot data processing program can be deployed on a server with a lower hardware configuration.
Further, the business operation comprises data storage, data statistics, data analysis and data cleaning.
The method further comprises:
determining a processing result according to the number of clicks in the processing record, the processing result being either hotspot data or non-hotspot data;
judging the processing result in the message queue, taking a message whose processing result is hotspot data as a hotspot data message, and taking a message whose processing result is non-hotspot data as a non-hotspot data message;
acquiring the number of clicks of the hotspot data messages, sorting them in a sorting list by number of clicks, and updating the sorting list; and acquiring the number of clicks of the non-hotspot data messages, sorting them in a sorting list by number of clicks, and updating the sorting list.
The methods for identifying the processing result in the message queue include "sequential storage + sequential access" and "random storage + query access". The "random storage + query access" mode is implemented on top of various distributed systems; the information is organized and managed by a file system, without regard to the order in which it was generated. The "sequential storage + sequential access" mode is generally implemented with a queue such as Kafka (an open-source stream-processing platform developed by the Apache Software Foundation) or ActiveMQ, and the information to be exchanged is written to the topic of the message queue in the order in which it was generated.
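The difference between the two access modes can be sketched in a few lines. This is only an illustrative model under stated assumptions: `SequentialLog` and `KeyedStore` are hypothetical classes, not the API of Kafka, ActiveMQ or any distributed file system.

```python
class SequentialLog:
    """Sequential storage + sequential access: an append-only log read by offset."""

    def __init__(self):
        self._entries = []

    def append(self, message) -> int:
        self._entries.append(message)
        return len(self._entries) - 1        # offset of the new entry

    def read_from(self, offset: int) -> list:
        # Consumers see messages in exactly the order they were written.
        return list(self._entries[offset:])


class KeyedStore:
    """Random storage + query access: lookup by key or predicate, order not relied on."""

    def __init__(self):
        self._by_id = {}

    def put(self, key, message) -> None:
        self._by_id[key] = message

    def query(self, predicate) -> list:
        return [m for m in self._by_id.values() if predicate(m)]


# Hypothetical messages carrying a processing result ("hot" flag).
messages = [
    {"id": "a", "hot": True},
    {"id": "b", "hot": False},
    {"id": "c", "hot": True},
]

log, store = SequentialLog(), KeyedStore()
for m in messages:
    log.append(m)
    store.put(m["id"], m)

tail = log.read_from(1)                                   # sequential read
hot_ids = sorted(m["id"] for m in store.query(lambda m: m["hot"]))  # query read
```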
The hotspot data messages are sorted in the sorting list by number of clicks, from high to low.
When the number of clicks in the processing record exceeds a set threshold, the processing record is classified as hot data; and when the number of clicks in the processing record is lower than a set threshold value, classifying the processing record as non-hotspot data.
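The threshold rule and the high-to-low ordering described above can be sketched as follows. The function names, the sample click counts and the threshold of 100 are hypothetical. The text does not say how a count exactly equal to the threshold is classified; this sketch assumes it is non-hotspot.

```python
from typing import Dict, List, Tuple


def classify(clicks: int, threshold: int) -> str:
    # Above the threshold -> hotspot; below -> non-hotspot.
    # Assumption: a count equal to the threshold is treated as non-hotspot.
    return "hot" if clicks > threshold else "non-hot"


def build_sorting_lists(click_counts: Dict[str, int],
                        threshold: int) -> Dict[str, List[Tuple[str, int]]]:
    """Split messages by class, then sort each list by clicks from high to low."""
    lists: Dict[str, List[Tuple[str, int]]] = {"hot": [], "non-hot": []}
    for message_id, clicks in click_counts.items():
        lists[classify(clicks, threshold)].append((message_id, clicks))
    for entries in lists.values():
        entries.sort(key=lambda entry: entry[1], reverse=True)
    return lists


# Hypothetical click counts per message.
counts = {"a": 120, "b": 5, "c": 300, "d": 40}
sorting_lists = build_sorting_lists(counts, threshold=100)
```

Rebuilding the lists after each batch of new click counts is the simplest way to "update the sorting list"; an incremental structure would be a design choice outside what the text specifies.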
A system for solving the problem of low program efficiency caused by an uneven information ratio, the system comprising: a recording module, a judging module, a sorting module, message queue topics and a data processing program module. The recording module records the number of clicks according to a processing result, the processing result comprising hotspot data messages and non-hotspot data messages. The judging module judges the processing result of the message queue according to the number of clicks, classifying messages whose number of clicks exceeds a set threshold as hotspot data messages and messages whose number of clicks is below the set threshold as non-hotspot data messages. The sorting module acquires the number of clicks of the hotspot data messages and of the non-hotspot data messages and sorts them in a sorting list from high to low. The message queue topics comprise a topic of the hotspot data message queue, which receives hotspot data messages, and a topic of the non-hotspot data message queue, which receives non-hotspot data messages. The data processing program module comprises a hotspot data processing program, which performs data warehousing, data statistics, data analysis and data cleaning on hotspot data, and a non-hotspot data processing program, which performs the same operations on non-hotspot data.
A hotspot arises when a large number of clients directly access one node, or a small number of nodes, of a cluster (the access may be a read, a write or another operation). The heavy access can push the single machine hosting the hotspot region beyond its capacity, degrading performance or even making the hotspot region unavailable. This also affects other regions on the same region server, and resources are wasted because the host cannot serve requests for those other regions.
Compared with the prior art, the invention has the following technical effects:
The method for solving the problem of low program efficiency caused by an uneven information ratio uses a distributed message queue, which offers low coupling, reliable delivery, broadcasting, flow control, eventual consistency and similar functions, and which helps decouple the service system and improve development efficiency and system stability. Hotspot data receives dedicated processing, and non-hotspot data is processed without being affected by the volume of hotspot data. Therefore, even if the hotspot data processing program becomes abnormal or crashes because of a sudden burst of data, the normal flow of non-hotspot data is not affected, and business processes such as data warehousing, data statistics and data analysis continue to run normally on the non-hotspot data.
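The isolation effect described above can be demonstrated with a small sketch in which the hotspot consumer fails while the non-hotspot consumer keeps working. Everything here is an assumption made for illustration: two threads stand in for the two processing programs, `queue.Queue` stands in for the topics, and the crash is simulated by raising an exception.

```python
import queue
import threading


def run_consumer(name, topic, handler, results):
    """Drain one topic and record whether this consumer finished or crashed."""
    processed = []
    try:
        while True:
            item = topic.get_nowait()
            handler(item)
            processed.append(item)
    except queue.Empty:
        results[name] = ("ok", processed)
    except Exception:
        # The failure stays inside this consumer; the other topic is untouched.
        results[name] = ("crashed", processed)


def overloaded_hot_handler(item):
    raise RuntimeError("simulated crash of the hotspot data processing program")


def store(item):
    pass  # stand-in for warehousing, statistics, analysis, cleaning


hot_topic, non_hot_topic = queue.Queue(), queue.Queue()
for item in ("h1", "h2", "h3"):
    hot_topic.put(item)
for item in ("n1", "n2"):
    non_hot_topic.put(item)

results = {}
threads = [
    threading.Thread(target=run_consumer,
                     args=("hot", hot_topic, overloaded_hot_handler, results)),
    threading.Thread(target=run_consumer,
                     args=("non-hot", non_hot_topic, store, results)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

In this sketch the unprocessed hotspot messages simply remain in their own topic; how they are retried or recovered is not covered by the text.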
Drawings
FIG. 1 is a flow chart showing the method of solving the problem of low efficiency of the program caused by uneven information ratio according to the present invention;
FIG. 2 is a schematic diagram of a system for solving the problem of program efficiency reduction caused by uneven information ratio according to the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in further detail with reference to the accompanying drawings in conjunction with the following detailed description. It is to be understood that this description is made only by way of example and not as a limitation on the scope of the invention. Moreover, in the following description, descriptions of well-known structures and techniques are omitted so as to not unnecessarily obscure the concepts of the present invention.
FIG. 1 is a flow chart of the method for solving the problem of low program efficiency caused by an uneven information ratio, which comprises the following steps:
Step 1: connect the data source to a distributed message queue, the distributed message queue comprising ActiveMQ, RabbitMQ, RocketMQ or ZeroMQ; the judging module of the distributed message queue identifies hotspot data and non-hotspot data in the data source.
Step 2: the judging module of the distributed message queue sends the hotspot data to the topic of the hotspot data message queue and sends the non-hotspot data to the topic of the non-hotspot data message queue. Hotspot data and non-hotspot data are divided according to the number of clicks: when the number of clicks in the processing record exceeds a set threshold, the record is classified as hotspot data; when the number of clicks in the processing record is below the set threshold, the record is classified as non-hotspot data.
Step 3: the hotspot data is processed by a hotspot data processing program to carry out business operations, the business operations comprising data warehousing, data statistics, data analysis or data cleaning; the non-hotspot data is processed by a non-hotspot data processing program to carry out business operations. The hotspot data processing program needs to be deployed on a server with a higher hardware configuration, while the non-hotspot data processing program can be deployed on a server with a lower hardware configuration.
Step 4: judge the processing result in the message queue, the processing result being hotspot data or non-hotspot data according to the number of clicks in the processing record; a message whose processing result is hotspot data is taken as a hotspot data message, and a message whose processing result is non-hotspot data is taken as a non-hotspot data message.
Step 5: acquire the number of clicks of the hotspot data messages, sort them in a sorting list by number of clicks, and update the sorting list; acquire the number of clicks of the non-hotspot data messages, sort them in a sorting list by number of clicks, and update the sorting list. Preferably, the hotspot data messages are sorted in the sorting list by number of clicks from high to low. In addition to sorting, the methods for identifying the processing result in the message queue include "sequential storage + sequential access" and "random storage + query access".
FIG. 2 is a schematic diagram of a system for solving the problem of low program efficiency caused by an uneven information ratio, the system comprising: a recording module, a judging module, a sorting module, message queue topics and a data processing program module. The recording module records the number of clicks according to a processing result, the processing result comprising hotspot data messages and non-hotspot data messages. The judging module judges the processing result of the message queue according to the number of clicks, classifying messages whose number of clicks exceeds a set threshold as hotspot data messages and messages whose number of clicks is below the set threshold as non-hotspot data messages. The sorting module acquires the number of clicks of the hotspot data messages and of the non-hotspot data messages and sorts them in a sorting list from high to low. The message queue topics comprise a topic of the hotspot data message queue, which receives hotspot data messages, and a topic of the non-hotspot data message queue, which receives non-hotspot data messages. The data processing program module comprises a hotspot data processing program, which performs data warehousing, data statistics, data analysis and data cleaning on hotspot data, and a non-hotspot data processing program, which performs the same operations on non-hotspot data.
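One way the modules of FIG. 2 could fit together is sketched below. This is an assumed arrangement, not the structure disclosed in the figure: the class `HotspotSystem`, its method names, the use of `collections.Counter` for the recording module and `collections.deque` for the topics are all hypothetical, and a click count equal to the threshold is assumed to be non-hotspot.

```python
from collections import Counter, deque
from dataclasses import dataclass, field


@dataclass
class HotspotSystem:
    threshold: int
    clicks: Counter = field(default_factory=Counter)     # recording module state
    hot_topic: deque = field(default_factory=deque)      # hotspot topic
    non_hot_topic: deque = field(default_factory=deque)  # non-hotspot topic

    def record(self, message_id: str) -> None:
        """Recording module: count one click for a message."""
        self.clicks[message_id] += 1

    def judge(self, message_id: str) -> bool:
        """Judging module: True if clicks exceed the set threshold."""
        return self.clicks[message_id] > self.threshold

    def dispatch(self, message_id: str) -> None:
        """Send the message to the topic that matches its class."""
        topic = self.hot_topic if self.judge(message_id) else self.non_hot_topic
        topic.append(message_id)

    def sorting_list(self, hot: bool) -> list:
        """Sorting module: ids of one class, ordered by clicks from high to low."""
        ids = [m for m in self.clicks if self.judge(m) == hot]
        return sorted(ids, key=lambda m: self.clicks[m], reverse=True)

    def process(self, hot: bool, handler) -> int:
        """Data processing program module: drain one topic with its own handler."""
        topic = self.hot_topic if hot else self.non_hot_topic
        handled = 0
        while topic:
            handler(topic.popleft())
            handled += 1
        return handled


# Hypothetical click stream.
system = HotspotSystem(threshold=2)
for message_id in ["x", "x", "x", "y", "z", "z"]:
    system.record(message_id)
for message_id in ["x", "y", "z"]:
    system.dispatch(message_id)

warehouse = []                                  # stand-in for data warehousing
hot_handled = system.process(hot=True, handler=warehouse.append)
```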
It is to be understood that the above-described embodiments of the present invention are merely illustrative of or explaining the principles of the invention and are not to be construed as limiting the invention. Therefore, any modification, equivalent replacement, improvement and the like made without departing from the spirit and scope of the present invention should be included in the protection scope of the present invention. Further, it is intended that the appended claims cover all such variations and modifications as fall within the scope and boundaries of the appended claims or the equivalents of such scope and boundaries.
Claims (9)
1. A method for resolving program inefficiency caused by uneven information ratios, the method comprising:
S1, connecting the data source to a distributed message queue, a judging module of the distributed message queue identifying hotspot data and non-hotspot data in the data source;
S2, the judging module of the distributed message queue sending the hotspot data to the topic of the hotspot data message queue, and sending the non-hotspot data to the topic of the non-hotspot data message queue;
S3, the hotspot data being processed by a hotspot data processing program to carry out business operations, and the non-hotspot data being processed by a non-hotspot data processing program to carry out business operations.
2. The method of claim 1, wherein the distributed message queue comprises: ActiveMQ, RabbitMQ, RocketMQ, or ZeroMQ.
3. The method of claim 1, wherein the hot spot data program needs to be deployed on a server with a higher hardware configuration; the non-hotspot data programs can be deployed on servers with lower hardware configurations.
4. The method of claim 1, wherein the business operations comprise data warehousing, data statistics, data analysis, and data cleaning.
5. The method of claim 1, further comprising:
determining a processing result according to the number of clicks in the processing record, the processing result being either hotspot data or non-hotspot data;
judging the processing result in the message queue, taking a message whose processing result is hotspot data as a hotspot data message, and taking a message whose processing result is non-hotspot data as a non-hotspot data message;
acquiring the number of clicks of the hotspot data messages, sorting them in a sorting list by number of clicks, and updating the sorting list; and acquiring the number of clicks of the non-hotspot data messages, sorting them in a sorting list by number of clicks, and updating the sorting list.
6. The method of resolving program inefficiency due to uneven information ratios as recited in claim 5, wherein the methods of identifying processing results in the message queue comprise "sequential storage + sequential access" and "random storage + query access".
7. The method of resolving program inefficiency due to uneven information ratios as recited in claim 5, wherein the hot data message clicks are sorted in the sorted list by a high-to-low criterion.
8. The method according to claim 5, wherein the processing records are classified as hot data when the number of clicks in the processing records exceeds a set threshold; and when the number of clicks in the processing record is lower than a set threshold value, classifying the processing record as non-hotspot data.
9. A system for using the method of any of claims 1-8 to address program inefficiency due to uneven information ratios, the system comprising: a recording module, a judging module, a sorting module, message queue topics and a data processing program module; the recording module is used for recording the number of clicks according to a processing result, the processing result comprising hotspot data messages and non-hotspot data messages; the judging module is used for judging the processing result of the message queue according to the number of clicks, classifying messages whose number of clicks exceeds a set threshold as hotspot data messages, and classifying messages whose number of clicks is below the set threshold as non-hotspot data messages; the sorting module is used for acquiring the number of clicks of the hotspot data messages and the non-hotspot data messages and sorting them in a sorting list from high to low; the message queue topics comprise a topic of the hotspot data message queue and a topic of the non-hotspot data message queue, wherein the topic of the hotspot data message queue is used for receiving hotspot data messages, and the topic of the non-hotspot data message queue is used for receiving non-hotspot data messages; the data processing program module comprises a hotspot data processing program and a non-hotspot data processing program, wherein the hotspot data processing program is used for performing data warehousing, data statistics, data analysis and data cleaning on hotspot data, and the non-hotspot data processing program is used for performing data warehousing, data statistics, data analysis and data cleaning on non-hotspot data.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010367166.2A CN113590343A (en) | 2020-04-30 | 2020-04-30 | Method for solving uneven information ratio |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113590343A (en) | 2021-11-02 |
Family
ID=78237849
Family Applications (1)
Application Number | Status | Publication | Priority Date | Filing Date | Title |
---|---|---|---|---|---|
CN202010367166.2A | Pending | CN113590343A (en) | 2020-04-30 | 2020-04-30 | Method for solving uneven information ratio |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113590343A (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106462360A (en) * | 2014-12-23 | 2017-02-22 | 华为技术有限公司 | Resource scheduling method and related apparatus |
CN110247855A (en) * | 2019-07-26 | 2019-09-17 | 中国工商银行股份有限公司 | Method for interchanging data, client and server |
CN110569406A (en) * | 2019-07-25 | 2019-12-13 | 北京明朝万达科技股份有限公司 | Configurable hot spot data automatic analysis method, device, system and medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||