CN113282578B - Message processing method, device, message processing equipment and storage medium

Message processing method, device, message processing equipment and storage medium

Info

Publication number
CN113282578B
Authority
CN
China
Prior art keywords
target
message
data
invalid
leaf node
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010106534.8A
Other languages
Chinese (zh)
Other versions
CN113282578A (en)
Inventor
林兆祥
易卉芹
蔡毅超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN202010106534.8A priority Critical patent/CN113282578B/en
Publication of CN113282578A publication Critical patent/CN113282578A/en
Application granted granted Critical
Publication of CN113282578B publication Critical patent/CN113282578B/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22Indexing; Data structures therefor; Storage structures
    • G06F16/221Column-oriented storage; Management thereof
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28Databases characterised by their database models, e.g. relational or object models
    • G06F16/284Relational databases

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Information Transfer Between Computers (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiment of the invention discloses a message processing method, a device, message processing equipment and a storage medium, wherein the method comprises the following steps: acquiring a target message to be stored and a message tree corresponding to the target message, wherein the target message comprises a plurality of data information, the message tree comprises a plurality of leaf nodes, and each leaf node is used for storing one data information; if the message tree comprises a target leaf node corresponding to a target data column, performing columnar storage encoding on the data information in the target leaf node to obtain an encoding result, and storing the encoding result at a target storage position in the target data column, wherein the target storage position corresponds to the target message; and if the message tree does not comprise a target leaf node corresponding to the target data column, determining an invalid mark position in the target data column according to the target storage position, and setting an invalid data mark at the invalid mark position. The embodiment of the invention can improve the encoding efficiency.

Description

Message processing method, device, message processing equipment and storage medium
Technical Field
The present application relates to the field of internet technologies, and in particular, to a message processing method, a device, a message processing apparatus, and a storage medium.
Background
With the continuous development of big data, and in order to facilitate data queries, the storage mode in relational databases has gradually changed from row storage to column storage. A relational database is a database that organizes data using a two-dimensional table model and stores the data in the form of rows and columns. Column storage means that the data in a plurality of messages is stored in separate corresponding data columns; for example, the name data in a plurality of messages is stored in one data column, and the identification data in a plurality of messages is stored in another data column.
Currently, a commonly adopted columnar storage format is Parquet. Specifically, the columnar storage process under the Parquet format can be summarized as: sequentially encoding the data information included in each leaf node of the message tree corresponding to the message to be stored, and storing it in the corresponding data column. Practical application shows that this columnar storage format encodes and stores considerable redundant information, which leads to problems such as low encoding efficiency. Therefore, how to improve encoding efficiency has become a hot research topic in the field of columnar storage.
Disclosure of Invention
The embodiment of the invention provides a message processing method, a device, message processing equipment and a storage medium, which can improve coding efficiency.
In one aspect, an embodiment of the present invention provides a message processing method, including:
Acquiring a target message to be stored and a message tree corresponding to the target message, wherein the target message comprises a plurality of data information, the message tree comprises a plurality of leaf nodes, and each leaf node is used for storing one data information;
If the message tree comprises a target leaf node corresponding to a target data column, performing columnar storage encoding on the data information in the target leaf node to obtain an encoding result, and storing the encoding result at a target storage position in the target data column, wherein the target storage position corresponds to the target message;
And if the message tree does not comprise the target leaf node corresponding to the target data column, determining an invalid mark position in the target data column according to the target storage position, and setting an invalid data mark at the invalid mark position.
In one aspect, an embodiment of the present invention provides another message processing method, including:
Receiving a decoding instruction for decoding a target message, and acquiring a plurality of data columns related to the target message, wherein the target message comprises a plurality of data information, and each data column stores one or more of the following: an encoding result related to data information and an invalid data mark corresponding to data information, the encoding result and the invalid data mark having been stored in the corresponding data column by the message processing method described above;
traversing related storage locations in a plurality of data columns related to the target message in sequence;
If the coding result exists in the relevant storage position of the current traversal, decoding the coding result to obtain data information corresponding to the relevant storage position of the current traversal in the target message;
and if invalid data marks exist in the related storage position of the current traversal, skipping the related storage position of the current traversal, and continuing to traverse the next related storage position.
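The decoding traversal in the steps above can be illustrated with a minimal Python sketch. It is not the implementation of the embodiment: it assumes that each data column is a plain list, that the invalid data mark is a hypothetical sentinel object named `INVALID_MARK`, and that `decode` is supplied by the caller. None of these names come from the embodiment.

```python
# Hypothetical sentinel standing in for the invalid data mark.
INVALID_MARK = object()


def decode_message(related_positions, columns, decode):
    """Traverse the related storage positions of one message in order.

    related_positions: list of (column_name, index) pairs (assumed layout).
    columns: dict mapping a column name to a list of stored entries.
    decode: function turning a stored encoding result back into data.
    """
    recovered = []
    for column_name, index in related_positions:
        entry = columns[column_name][index]
        if entry is INVALID_MARK:
            # Invalid data mark: skip this position and continue traversing.
            continue
        recovered.append((column_name, decode(entry)))
    return recovered


# Usage with a toy "encoding" that simply upper-cases strings.
columns = {"name": ["ALICE"], "place": [INVALID_MARK]}
positions = [("name", 0), ("place", 0)]
decoded = decode_message(positions, columns, str.lower)
```

In this sketch a position holding the mark costs only an identity comparison, and `decode` is never called for it.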
In one aspect, an embodiment of the present invention provides a message processing apparatus, including:
an acquisition unit configured to acquire a target message to be stored, where the target message includes a plurality of data information;
the acquisition unit is further used for acquiring a message tree corresponding to the target message, wherein the message tree comprises a plurality of leaf nodes, and each leaf node is used for storing one data message;
The processing unit is used for performing columnar storage encoding on the data information in the target leaf node to obtain an encoding result if the message tree comprises a target leaf node corresponding to a target data column, and storing the encoding result at a target storage position in the target data column, wherein the target storage position corresponds to the target message;
The processing unit is further configured to determine an invalid flag location in the target data column according to the target storage location if the message tree does not include a target leaf node corresponding to the target data column, and set an invalid data flag at the invalid flag location.
In one aspect, an embodiment of the present invention provides a processing apparatus, including:
A receiving unit, configured to receive a decoding instruction for decoding a target message, and acquire a plurality of data columns related to the target message, where the target message includes a plurality of data information, and each data column stores one or more of the following: an encoding result related to data information and an invalid data mark corresponding to data information, the encoding result and the invalid data mark having been stored in the corresponding data column by the message processing method described above;
the processing unit is used for traversing related storage positions related to the target message in a plurality of data columns in sequence;
The processing unit is further used for decoding the coding result if the coding result exists in the relevant storage position of the current traversal, so as to obtain data information corresponding to the relevant storage position of the current traversal in the target message;
and the processing unit is further used for skipping the relevant storage position of the current traversal if the invalid data mark exists in the relevant storage position of the current traversal, and continuing to traverse the next relevant storage position.
In one aspect, an embodiment of the present invention provides a message processing apparatus, including: a processor adapted to implement one or more instructions; and
A computer storage medium storing one or more instructions adapted to be loaded by the processor and to perform the steps of:
Acquiring a target message to be stored and a message tree corresponding to the target message, wherein the target message comprises a plurality of data information, the message tree comprises a plurality of leaf nodes, and each leaf node is used for storing one data information; if the message tree comprises a target leaf node corresponding to a target data column, performing columnar storage encoding on the data information in the target leaf node to obtain an encoding result, and storing the encoding result at a target storage position in the target data column, wherein the target storage position corresponds to the target message; and if the message tree does not comprise a target leaf node corresponding to the target data column, determining an invalid mark position in the target data column according to the target storage position, and setting an invalid data mark at the invalid mark position.
Or the computer storage medium stores one or more instructions adapted to be loaded by the processor and to perform the steps of:
Receiving a decoding instruction for decoding a target message, and acquiring a plurality of data columns related to the target message, wherein the target message comprises a plurality of data information, and each data column stores one or more of the following: an encoding result related to data information and an invalid data mark corresponding to data information, the encoding result and the invalid data mark having been stored in the corresponding data column by the message processing method described above;
traversing related storage locations in a plurality of data columns related to the target message in sequence;
If the coding result exists in the relevant storage position of the current traversal, decoding the coding result to obtain data information corresponding to the relevant storage position of the current traversal in the target message;
and if invalid data marks exist in the related storage position of the current traversal, skipping the related storage position of the current traversal, and continuing to traverse the next related storage position.
In one aspect, an embodiment of the present invention provides a computer storage medium, where computer program instructions are stored, where the computer program instructions are used to perform a message processing method as described above when executed by a processor.
In the embodiment of the invention, after a target message to be stored and a message tree corresponding to the target message are acquired, and before the plurality of data information included in the target message is encoded and stored in the data columns, it is judged whether a target leaf node corresponding to the target data column exists in the message tree corresponding to the target message; if so, columnar storage encoding is performed on the data information in the target leaf node to obtain an encoding result, and the encoding result is stored at the target storage position in the target data column; if not, an invalid mark position is determined in the target data column, and an invalid data mark is added at the invalid mark position. Therefore, if the message tree does not include a target leaf node corresponding to the target data column, no encoding is performed for the target message with respect to that data column and the invalid data mark is added directly, so that part of the encoding resources can be saved and the encoding efficiency is improved.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1a is a schematic diagram of a column-wise stored data column according to an embodiment of the present invention;
FIG. 1b is a diagram of a message tree provided by an embodiment of the present invention;
FIG. 2 is a schematic diagram of a message processing system according to an embodiment of the present invention;
FIG. 3 is a flow chart of a message processing method according to an embodiment of the present invention;
FIG. 4a is a schematic diagram of another message tree according to an embodiment of the present invention;
FIG. 4b is a schematic diagram of a message tree according to an embodiment of the present invention;
FIG. 5a is a schematic diagram of a columnar storage encoding according to an embodiment of the present invention;
FIG. 5b is a schematic diagram of another columnar storage encoding provided by an embodiment of the present invention;
FIG. 5c is a schematic diagram of yet another columnar storage encoding provided by an embodiment of the present invention;
FIG. 6 is a flow chart of another message processing method according to an embodiment of the present invention;
FIG. 7 is a flow chart of another message processing method according to an embodiment of the present invention;
FIG. 8 is an application scenario diagram of a message processing method provided in an embodiment of the present invention;
FIG. 9 is a schematic structural diagram of a message processing apparatus according to an embodiment of the present invention;
FIG. 10 is a schematic structural diagram of another message processing apparatus according to an embodiment of the present invention;
FIG. 11 is a schematic structural diagram of a message processing apparatus according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention.
Columnar storage is a data storage form commonly used in relational databases because it supports efficient queries. Columnar storage means that data information of the same data type included in a plurality of messages is stored in the same data column, so that when data information of a certain data type needs to be queried from the database, only the data column corresponding to that data type needs to be traversed. For example, assume that there are 3 messages to be stored and that the data information in each message includes a name, a gender, and a place. After the data information in the 3 messages is stored in columnar form, the names in the 3 messages are stored in one data column, the genders in the 3 messages are stored in another data column, and the places in the 3 messages are stored in a third data column. To count how many of the messages have the gender female, only the data column corresponding to gender needs to be traversed.
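The three-message example above can be sketched in Python as follows. This is only an illustration under the assumption that each message is a flat dictionary and each data column is a list; the sample values and the helper name `to_columns` are hypothetical.

```python
from collections import defaultdict


def to_columns(messages):
    """Store same-typed data information of many messages in one column each."""
    columns = defaultdict(list)
    for message in messages:
        for data_type, value in message.items():
            columns[data_type].append(value)
    return dict(columns)


# Three hypothetical messages, each with a name, a gender, and a place.
messages = [
    {"name": "A", "gender": "female", "place": "X"},
    {"name": "B", "gender": "male", "place": "Y"},
    {"name": "C", "gender": "female", "place": "Z"},
]
columns = to_columns(messages)

# Counting messages with gender female touches only the gender column.
female_count = sum(1 for g in columns["gender"] if g == "female")
```

The query reads one list of three entries rather than all nine values, which is the query advantage the paragraph describes.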
Briefly, the core idea of columnar storage can be summarized as: mapping a nested message to a plurality of data columns; and when the message is required to be read or queried later, the message can be recovered according to the data stored in each data column. In the column-type storage process, a common column-type storage mode is Parquet, and the process of adopting Parquet to store a message to be stored in a column-type mode can be summarized as follows: determining the type of data information included in a message to be stored, and distributing a data column for each type of data information; then obtaining a message tree corresponding to the message to be stored, wherein the message tree comprises a plurality of leaf nodes, and each leaf node comprises a data message; for any one of a plurality of data columns, the target message is subjected to column-type storage encoding for the data column, and the obtained encoding result is stored in a storage position corresponding to the data column.
The encoding scheme adopted in Parquet is Dremel. When Dremel performs columnar encoding, for any data column: if a leaf node corresponding to the data column exists, the data information in that leaf node and the position information of the target message with respect to the data column are encoded into a triplet, and the triplet is stored at a storage position in the data column; if no leaf node corresponding to the data column exists, the position information of the target message with respect to the data column is still encoded into a triplet and stored at a storage position in that data column.
Optionally, the message tree further includes a root node and a plurality of intermediate nodes. Each occurrence of a root node indicates that a new message to be stored has been acquired, and, as described above, leaf nodes are used for storing the specific data information. An intermediate node is a node that contains leaf nodes and does not itself store any data.
In the following, the process of storing a message in columnar form using Parquet is described in detail, assuming that the message to be stored is expressed as follows:
After the message to be stored is obtained, it is determined that the message includes the data information pageview_id, info_id and click_id, and a data column corresponding to each data information is obtained. As shown in fig. 1a, 101 represents the data column corresponding to pageview_id, 102 represents the data column corresponding to info_id, and 103 represents the data column corresponding to click_id.
Then, the message tree corresponding to the message to be stored is obtained, as shown in fig. 1b, in which 11 represents the root node, 12 represents a leaf node, and 13 represents an intermediate node. As can be seen from fig. 1b, each intermediate node 13 and each leaf node 12 corresponds to a node type, which may be a necessary type, an optional type, or a repeatable type. As the names imply, a node of the necessary type must be present; a node of the optional type may or may not be present; and a node of the repeatable type may occur multiple times and is typically used to store an array.
After the message tree corresponding to the message to be stored and the data columns corresponding to the various types of data information are obtained, columnar encoding and storage are performed according to the method described above, and the corresponding encoding results are stored in each data column.
In practical applications, the message tree of some messages may not include a leaf node corresponding to a certain data column, that is, some messages may not include data information corresponding to that data column. When Parquet is used for columnar storage as described above, encoding information for such a message is nevertheless stored in the data column. Decoding that encoding information yields no actual data information; at most, some parent nodes in the message tree can be recovered, and these parent nodes have no practical significance. Such an encoding scheme spends encoding resources on redundant information, which results in lower encoding efficiency.
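A message tree of this kind can be modeled with a small Python data structure. The sketch below is illustrative only: the class names, the nesting of `info_id` and `click_id` under a repeatable intermediate node named `infos`, and the type assigned to the root are assumptions, since the actual layout is given by fig. 1b.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class NodeType(Enum):
    NECESSARY = "necessary"    # the node must be present
    OPTIONAL = "optional"      # the node may or may not be present
    REPEATABLE = "repeatable"  # the node may occur multiple times


@dataclass
class Node:
    name: str
    node_type: NodeType
    value: Optional[object] = None  # only leaf nodes store data information
    children: List["Node"] = field(default_factory=list)

    def is_leaf(self) -> bool:
        return not self.children


def leaves(node: Node) -> List[Node]:
    """Collect leaf nodes in depth-first traversal order."""
    if node.is_leaf():
        return [node]
    found: List[Node] = []
    for child in node.children:
        found.extend(leaves(child))
    return found


# Hypothetical tree; the root is given the necessary type for simplicity.
root = Node("pageview", NodeType.NECESSARY, children=[
    Node("pageview_id", NodeType.NECESSARY, value=1),
    Node("infos", NodeType.REPEATABLE, children=[
        Node("info_id", NodeType.OPTIONAL, value=2),
        Node("click_id", NodeType.OPTIONAL, value=3),
    ]),
])
leaf_names = [leaf.name for leaf in leaves(root)]
```

Each leaf name in `leaf_names` would then identify the data column in which that leaf's data information is stored.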
In order to improve encoding efficiency, the embodiment of the present invention proposes a message processing scheme. The scheme also implements columnar storage based on Parquet's columnar storage mode, but, unlike the foregoing, when storing a target message in the data columns it first determines whether the target message includes a target leaf node corresponding to a target data column (the target data column being any one of the plurality of data columns). If so, columnar storage encoding is performed on the data information in the target leaf node, and the resulting encoding result is stored in the target data column; if not, no columnar encoding operation is performed on the target message for that data column, and an invalid data mark is added to the target data column instead. This reduces the amount of data information that needs to be encoded, so the encoding efficiency can be improved to a certain extent.
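The branch described above can be sketched in Python as follows. This is a simplified illustration, not the embodiment's implementation: it assumes one storage position per message per column, represents the invalid data mark with a hypothetical sentinel `INVALID_MARK`, places the mark directly at the target storage position, and uses a toy `toy_encode` function that records how often encoding is actually performed.

```python
# Hypothetical sentinel standing in for the invalid data mark.
INVALID_MARK = object()


def store_message(leaf_values, columns, encode):
    """Store one message into every data column.

    leaf_values: dict mapping a column name to the data information held by
        the leaf node for that column (absent key = no such leaf node).
    columns: dict mapping a column name to a list used as the data column.
    encode: columnar storage encoding function (assumed, caller-supplied).
    """
    for column_name, column in columns.items():
        if column_name in leaf_values:
            # Target leaf node exists: encode and store the result.
            column.append(encode(leaf_values[column_name]))
        else:
            # No target leaf node: skip encoding, set the invalid data mark.
            column.append(INVALID_MARK)


encode_calls = []


def toy_encode(value):
    encode_calls.append(value)
    return ("encoded", value)


columns = {"pageview_id": [], "info_id": [], "click_id": []}
store_message({"pageview_id": 7, "info_id": 42}, columns, toy_encode)
```

Here `toy_encode` runs twice for three columns; the column without a leaf node receives only the mark.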
Based on the above-mentioned message processing scheme, the embodiment of the present invention provides a message processing system, and referring to fig. 2, a schematic structural diagram of the message processing system is provided in the embodiment of the present invention. The message processing system shown in fig. 2 may include a terminal device 201 and a message processing device 202, where the terminal device 201 may be a mobile phone, a tablet, a notebook, an intelligent wearable device, and the message processing device 202 may be a server, or may be a terminal device such as a mobile phone, a notebook, a tablet, or the like.
In one embodiment, the number of the terminal devices 201 is at least one, and the terminal devices 201 may send a message to be stored to the message processing device 202, for example, each terminal device 201 obtains an advertisement exposure message cached in the terminal, and sends the obtained advertisement exposure message to the message processing device 202. After the message processing device 202 obtains the messages to be stored sent by the respective terminal devices 201, the message processing scheme is adopted to process each message to be stored, so that each message to be stored is stored in the columnar storage database in a columnar storage manner.
Alternatively, the message processing device 202 may start processing the message to be stored after receiving the message to be stored sent by one terminal device 201; or the message processing device 202 may store the message to be stored in the buffer after each time the message to be stored sent by one terminal device 201 is acquired, and then process the message to be stored in the buffer for a period of time according to the receiving sequence.
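The second option, buffering messages and processing them later in the order received, can be sketched with a first-in-first-out queue. The class name `MessageBuffer` and the `process` callback are hypothetical and serve only to illustrate the ordering.

```python
from collections import deque


class MessageBuffer:
    """Cache messages to be stored and process them in receiving order."""

    def __init__(self):
        self._queue = deque()

    def receive(self, message):
        # Called each time a terminal device sends a message to be stored.
        self._queue.append(message)

    def drain(self, process):
        # Process every buffered message, oldest first.
        results = []
        while self._queue:
            results.append(process(self._queue.popleft()))
        return results


buffer = MessageBuffer()
for msg in ["exposure-1", "exposure-2", "exposure-3"]:
    buffer.receive(msg)
processed = buffer.drain(lambda m: m.upper())
```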
Based on the above-mentioned message processing system, the embodiment of the present invention provides a message processing method, and referring to fig. 3, a flow chart of the message processing method provided by the embodiment of the present invention is shown. The message processing method shown in fig. 3 may be performed by a message processing device, in particular by a processor of the message processing device. The message processing method shown in fig. 3 may include the steps of:
Step S301, obtaining a target message to be stored and a message tree corresponding to the target message.
The target message may be any type of message, such as an exposure message in advertisement data, an employee basic information statistics message of a certain enterprise, and the like. In one embodiment, the implementation of obtaining the target message to be stored may include: the message processing equipment receives the messages to be stored sent by the plurality of terminal equipment in advance, and stores the received messages to be stored in a cache; and acquiring any message to be stored from the cache as a target message.
In other embodiments, the implementation manner of obtaining the target message to be stored may further include: and the message processing equipment receives the target message to be stored sent by the terminal equipment in real time. The target message may be any message sent by any terminal device.
In one embodiment, the target message may include a plurality of data information, where each data information may include a data name and a data value. For example, if the target message is an exposure message of advertisement data, the plurality of data information may include platform information on where the advertisement was delivered, such as browser-XX and social application-XXA, where browser and social application are the data names included in the platform information, and XX and XXA are the corresponding data values. For another example, if the target message is a statistical message of employee basic information, the plurality of data information may include age information, which may be expressed as: age 26 years, where age is the data name in the age information and 26 years is the data value in the age information.
In one embodiment, the plurality of data information included in the target message belongs to at least one data type. As described above, each data type corresponds to one data column and each data information belongs to one data type, so each data information corresponds to one data column, and a plurality of data information of the same data type is stored in the same data column. For example, if the plurality of data information included in the target message includes two pieces of location information, and the data type to which both pieces of location information belong is address data, the two pieces of location information need to be stored in the same data column.
In one embodiment, the obtaining the message tree corresponding to the target message includes: and generating a message tree corresponding to the target message according to the nested structure in the target message. For example, the target message may be represented as follows:
Where MESSAGE PAGEVIEW denotes receipt of one Pageview message, { } denotes a nested relationship between a plurality of data information included in the target message. The message tree generated for the target message may be described with reference to fig. 1b in terms of nested relationships between the individual data information.
As can be seen from the foregoing, in the message tree, one leaf node includes one data information of a message; therefore, if a message includes a plurality of data information, the message tree corresponding to the message should include a plurality of leaf nodes. Since each data information corresponds to one data column and each leaf node includes one data information, each leaf node corresponds to one data column. If the data information stored in two leaf nodes belongs to the same data type, the two leaf nodes correspond to the same data column. In the embodiment of the invention, the target message comprises a plurality of data information, and the message tree corresponding to the target message comprises a plurality of leaf nodes. Each leaf node is used for storing one data information of the target message, and each leaf node corresponds to one node type, which is a necessary type, an optional type, or a repeatable type. The data value of the data information in a necessary type leaf node is a non-null value; the data value of the data information in an optional type leaf node is a null value or a non-null value; and the data value of the data information in a repeatable type leaf node is a null value or a non-null value.
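The value constraints per node type stated above can be expressed as a short check. The sketch below is an assumption-laden illustration: node types are plain strings, a null value is modeled as Python `None`, and the function name `value_allowed` is hypothetical.

```python
def value_allowed(node_type, value):
    """Return True if a leaf of the given node type may hold this value.

    necessary  -> the data value must be non-null
    optional   -> the data value may be null or non-null
    repeatable -> the data value may be null or non-null
    """
    if node_type == "necessary":
        return value is not None
    if node_type in ("optional", "repeatable"):
        return True
    raise ValueError(f"unknown node type: {node_type!r}")


checks = [
    value_allowed("necessary", 7),
    value_allowed("necessary", None),
    value_allowed("optional", None),
    value_allowed("repeatable", None),
]
```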
In one embodiment, a message may correspond to one or more storage locations in a data column, and if two data information of the same data type are included in a message, then two storage locations corresponding to the message are included in the data column storing the data type. For example, for the target data column, a plurality of storage locations may be included in the target data column, each storage location corresponding to one data information of the target message corresponding to the target data column, and in the following description, the storage location corresponding to the data information included in the target message in the target data column is taken as an example of the target storage location.
It should be appreciated that if the target message includes at least two data information of the same data type, and data information of that data type is stored in the target data column, then the target message has at least two corresponding target storage locations in the target data column. In general, the order of the at least two target storage locations in the target data column may be determined according to the order in which the leaf nodes corresponding to the at least two data information are traversed: data information in a leaf node that is traversed earlier has an earlier target storage location in the target data column, and conversely, data information in a leaf node that is traversed later has a later target storage location in the target data column.
Step S302, if the message tree comprises a target leaf node corresponding to the target data column, performing columnar storage encoding on the data information in the target leaf node to obtain an encoding result, and storing the encoding result at a target storage position in the target data column.
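The ordering rule above, where earlier-traversed leaf nodes receive earlier storage locations in the same data column, can be sketched as follows. The function name, the `(column_name, value)` pair layout, and the `column_lengths` argument are assumptions made for illustration.

```python
def assign_positions(traversed_leaves, column_lengths):
    """Assign a storage location to each traversed leaf node.

    traversed_leaves: list of (column_name, value) pairs in traversal order.
    column_lengths: dict giving the number of entries already in each column.
    Returns a list of (column_name, position) pairs in the same order.
    """
    next_free = dict(column_lengths)
    positions = []
    for column_name, _value in traversed_leaves:
        position = next_free.get(column_name, 0)
        positions.append((column_name, position))
        next_free[column_name] = position + 1
    return positions


# A message with two pieces of place information and one name.
assigned = assign_positions(
    [("place", "X"), ("name", "A"), ("place", "Y")],
    {"place": 3},
)
```

The place traversed first lands at position 3 and the one traversed later at position 4, so traversal order is preserved within the column.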
As can be seen from the foregoing, when the message processing apparatus receives a target message to be stored, it needs to store the data information included in each leaf node in the message tree corresponding to the target message in the corresponding data column. In a specific implementation, in order to save the number of encoded data and improve the encoding efficiency, for any data column, the message processing device needs to determine whether the target message has data information to be stored in the data column, and if so, performs column-type encoding storage; if not, an invalid data flag without corresponding data information in the target message is added in the data column. In the following, taking a target data column as an example, how the message processing device stores data information included in each leaf node in the message tree in the target data column, where the target data column is any one data column.
Specifically, after the message tree corresponding to the target message is obtained, firstly judging whether the message tree includes a target leaf node corresponding to the target data column, and if so, executing step S302; if not, step S303 is performed.
As can be seen from the foregoing, the encoding method adopted in Parquet is Dremel. When performing columnar encoding, Dremel encodes the data information in each leaf node together with the location information of the target message with respect to the corresponding data column to obtain a triplet. Based on this, in this embodiment, performing columnar storage encoding on the data information in the target leaf node in step S302 to obtain an encoding result may include: acquiring the position information of the target message relative to the target data column, where the position information comprises a target definition depth and a target repetition depth; and performing columnar storage encoding on the data information in the target leaf node and the position information of the target message to obtain the encoding result.
In one embodiment, the target repetition depth is a number of repeatable nodes in a second path in the message tree, the second path being a path between a second repeated node in a first path in the message tree associated with the target leaf node and a root node, wherein the first path in the message tree associated with the target leaf node may include: if the message tree includes a target leaf node corresponding to the target data column, a first path in the message tree associated with the target leaf node may include a path from the root node to the target leaf node; if the target leaf node corresponding to the target data column is not included in the message tree, the first path in the message tree associated with the target leaf node may include a path from the root node to a parent node of the target leaf node.
In a specific implementation, the implementation of obtaining the target repetition depth may be: traversing from the first repeated node to obtain a second repeated node in a first path related to the target leaf node; the number of repeatable nodes in the second path of the second repeated node to the root node is calculated and the number is determined as the repetition depth of the target leaf node. In brief, when calculating the target repetition depth, it is first determined from which node the target leaf node is generated, and then the number of repeatable nodes between the node and the root node is calculated, and the number is taken as the target repetition depth of the target leaf node.
For example, assume that the target data column is a data column for storing click_id, and that the received target message is expressed as follows: Pageview: { Positions: { Impressions: { click_id: 100 } } }. The message tree corresponding to the target message may be as shown in fig. 4a, and assume that a leaf node 401 including click_id is a target leaf node, where the target message includes a target leaf node corresponding to a target data column, and the first path related to the target leaf node is a path between a root node and the target leaf node. In the first path related to the target leaf node, the first repeated node is a root node, and the second repeated node obtained by traversing from the first repeated node is also the root node, and no repeatable node exists between the root node and the root node, so that the target repetition depth is equal to 0. In other words, the target leaf node is generated by the root node in the message tree shown in fig. 4a, and there is no repeatable node from root node to root node, so the target repetition depth of the target leaf node is equal to 0.
As another example, assume that the received target message is represented as follows: Pageview: { Positions: { } Positions: { Impressions: { } } }. The message tree corresponding to the target message may be as shown in fig. 4B, where the target message does not include the target leaf node corresponding to the target data column, but includes two parent nodes of the target leaf node, positions and impressions, respectively, so that the first path related to the target leaf node in the message tree shown in fig. 4B includes path A and path B. For path A, the first repeated node is the root node, the second repeated node obtained by traversing from the root node is positions, and the second path from positions to the root node is pageview- > positions. Positions is a repeatable node on the second path, and thus the target repetition depth of the target leaf node is equal to 1 in this case. In other words, in path A, the target leaf node is generated by the positions node, and the number of repeatable nodes in the second path from the root node to positions is one, namely positions.
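The repetition-depth rule described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `repetition_depth`, the `(name, is_repeatable)` path representation and the choice of which nodes are repeatable are assumptions made for the example, not part of the described method.

```python
from typing import List, Tuple

# Each path entry is (node_name, is_repeatable). Assumption for this sketch:
# the root is not repeatable, while positions and impressions are.
Path = List[Tuple[str, bool]]


def repetition_depth(path: Path, repeated_index: int) -> int:
    """Count the repeatable nodes between the root and the node that repeated.

    `path` is the first path related to the target leaf node (root first) and
    `repeated_index` is the index of the node from which the leaf is generated.
    """
    return sum(1 for _, repeatable in path[: repeated_index + 1] if repeatable)


# Case of fig. 4a: click_id appears in a new message, so the node that
# "repeated" is the root itself (index 0).
full_path = [("pageview", False), ("positions", True),
             ("impressions", True), ("click_id", False)]
r_new_message = repetition_depth(full_path, 0)

# Case where the leaf is generated by the positions node (index 1).
path_a = [("pageview", False), ("positions", True)]
r_positions = repetition_depth(path_a, 1)

print(r_new_message, r_positions)
```

With these inputs the sketch yields 0 for the new-message case and 1 for the case where positions is the repeating node, matching the two examples in the text.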
In one embodiment, the target definition depth may be defined as follows: if the message tree includes the target leaf node, the target definition depth is the number of nodes that may be empty in a third path in the message tree, where the third path is the path from the root node to the target leaf node; if the message tree does not include the target leaf node, the target definition depth is the number of nodes that may be empty in the first path in the message tree related to the target leaf node. In a specific implementation, the target definition depth may be obtained as follows: if the target leaf node is included in the message tree of the target message, the number of nodes that may be empty on the path from the root node to the target leaf node is taken as the target definition depth; if the target leaf node is not included in the message tree of the target message, the number of nodes that may be empty on the first path in the message tree related to the target leaf node is taken as the target definition depth.
For example, in the message tree shown in fig. 4a, the target leaf node is included in the message tree of the target message, the path from the root node to the target node is pageview- > positions- > impressions- > click_id, and the nodes that may be empty in this path include 3, positions, impressions and click_id, respectively (assuming that positions, impressions and click_id may be empty), so the target definition depth is 3 in fig. 4 a.
For another example, in the message tree shown in fig. 4B, the message tree of the target message does not include the target leaf node, the first path related to the target leaf node in the message tree includes a path a and a path B, and a node that may be empty in the path a is positions, so for the path a, the target definition depth is 1; nodes that may be empty in path B are positions and impressions, so the target definition depth is 2 for path B.
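The two definition-depth examples above reduce to counting the nodes that may be empty along the path that is actually present. The following Python sketch illustrates this; the function name `definition_depth` and the `(name, may_be_empty)` tuples are hypothetical, and the sketch assumes, as the text does, that positions, impressions and click_id may all be empty.

```python
from typing import List, Tuple


def definition_depth(path: List[Tuple[str, bool]]) -> int:
    """Count the nodes on the present path that are allowed to be empty.

    `path` lists the nodes actually present in the message tree, root first,
    as (node_name, may_be_empty) pairs.
    """
    return sum(1 for _, may_be_empty in path if may_be_empty)


# fig. 4a: the full path down to click_id is present.
fig_4a = [("pageview", False), ("positions", True),
          ("impressions", True), ("click_id", True)]

# fig. 4B: path A stops at positions, path B stops at impressions.
path_a = [("pageview", False), ("positions", True)]
path_b = [("pageview", False), ("positions", True), ("impressions", True)]

print(definition_depth(fig_4a), definition_depth(path_a), definition_depth(path_b))
```

The three calls give 3, 1 and 2, the values stated for fig. 4a, path A and path B.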
In one embodiment, performing columnar storage encoding on the data information in the target leaf node in step S302 to obtain an encoding result may include: acquiring position information of the target message about the target data column, where the position information comprises a target definition depth and a target repetition depth; and performing columnar storage encoding on the data information and the position information in the target leaf node to obtain the encoding result. Optionally, the columnar storage encoding of the data information and the position information in the target leaf node may be performed as follows: the target definition depth, the target repetition depth, and the data value are encoded into a triple, which is stored as the encoding result for the target leaf node at the target storage location of the target data column.
Step S303, if the message tree does not include the target leaf node, acquiring an invalid mark position in the target data column according to the target storage position, and setting an invalid data mark at the invalid mark position.
The invalid data flag may be an invalid repetition depth; or the invalid data flag may be an invalid definition depth; still alternatively, the invalid data flag may be collectively represented by an invalid repetition depth, an invalid definition depth, and an invalid data value. In one embodiment, the invalid flag location in the target data column may be determined from the target storage location as follows: the target storage location is taken as the invalid flag location. The invalid repetition depth may be determined according to a repetition depth range corresponding to the target data column, such that the invalid repetition depth lies outside the repetition depth range. The invalid definition depth may be any value set in advance.
Based on the above, the acquiring the invalid flag location from the target data column according to the target storage location in step S303, and setting the invalid data flag at the invalid flag location, includes: acquiring a repeated depth range corresponding to the target data column; acquiring an invalid definition depth for setting an invalid mark, and acquiring an invalid repetition depth for setting an invalid mark based on the repetition depth range, the invalid repetition depth exceeding the repetition depth range; generating an invalid data tag based on the invalid defined depth and the invalid repeat depth; the target storage location is taken as an invalid tag location, and the invalid data tag is added at the target storage location.
The repetition depth range corresponding to the target data column is bounded by a maximum repetition depth and a minimum repetition depth, where the maximum repetition depth refers to the number of repeatable nodes on the path from the root node to the target leaf node in the case that the message tree includes the target leaf node, and the minimum repetition depth is 0. Optionally, acquiring, based on the repetition depth range, an invalid repetition depth for setting an invalid flag includes: obtaining the maximum repetition depth or the minimum repetition depth in the repetition depth range; and acquiring a preset value, and processing the maximum repetition depth or the minimum repetition depth based on the preset value to obtain the invalid repetition depth. In brief, a preset value is applied to the maximum repetition depth or the minimum repetition depth corresponding to the target data column to obtain an invalid repetition depth that falls outside the range. The invalid definition depth may be any preset value, for example, 0, 100 or any other value. The invalid data value may be denoted null. Based on the above, one invalid data flag can be expressed as: [invalid repetition depth, invalid definition depth, null]; or as: [invalid repetition depth, invalid definition depth]; or as: [invalid repetition depth, null]; or as: [invalid definition depth, null]; and so on.
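As a minimal sketch of how such an invalid data flag might be built and later recognized, the Python below adds a preset value to the maximum repetition depth so that the result falls outside the valid range. The names `Triple`, `make_invalid_flag` and `is_invalid_flag`, the default preset of 1, and the maximum repetition depth of 3 used in the demonstration are assumptions chosen for illustration.

```python
from typing import NamedTuple, Optional


class Triple(NamedTuple):
    r_level: int            # repetition depth
    d_level: int            # definition depth
    value: Optional[int]    # data value, None stands for null


def make_invalid_flag(max_repetition_depth: int,
                      preset: int = 1,
                      invalid_definition_depth: int = 0) -> Triple:
    """Build an invalid data flag whose repetition depth exceeds the valid range."""
    if preset <= 0:
        raise ValueError("preset must push the depth above the maximum")
    invalid_r = max_repetition_depth + preset
    return Triple(invalid_r, invalid_definition_depth, None)


def is_invalid_flag(triple: Triple, max_repetition_depth: int) -> bool:
    """A triple is an invalid flag if its repetition depth is out of range."""
    return triple.r_level < 0 or triple.r_level > max_repetition_depth


# Illustrative values only: maximum repetition depth 3, preset 1.
flag = make_invalid_flag(3)
print(flag, is_invalid_flag(flag, 3), is_invalid_flag(Triple(0, 3, 100), 3))
```

Because a valid repetition depth never exceeds the maximum, a decoder can tell a flag from an ordinary triple without any extra marker bit.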
The following is a comparison of the message processing method provided by way of example in the embodiment of fig. 3 with the prior art method:
Assume that the message processing device acquires 5 messages to be stored and that the target data column is the data column storing click_id, so that the target leaf node is click_id. The 5 messages are expressed in sequence as:
1. Pageview:{Positions:{Impressions:{click_id:100}}};
2. Pageview:{Positions:{}};
3. Pageview:{Positions:{}Positions:{Impressions:{}}};
4. Pageview:{Positions:{Impressions:{}Impressions:{}}};
5. Pageview:{Positions:{Impressions:{click_id:200}}}.
Of the 5 messages, only the first and fifth messages include the target leaf node in the message tree, and the other three messages do not include the target leaf node in the message tree. When the prior art is adopted for processing the messages, columnar storage encoding is performed for the target leaf node of all 5 messages, and the encoding results are stored in the target storage positions of the target data column. (In the following description, Pageview is abbreviated pv, positions is abbreviated pos, and impressions is abbreviated imp.)
Specifically, a schematic diagram of performing column-type encoding storage on the above 5 messages may be referred to as fig. 5a, where the first row in fig. 5a represents a column-type storage encoding process for a first message, the second row represents a column-type storage encoding process for a second message, and the third and fourth rows represent column-type storage encoding processes for a third message; the fifth and sixth rows represent column store encoding procedures for the fourth message; the seventh row represents the columnar store encoding process for the fifth message:
In the first row, the target leaf node corresponds to a target repetition depth R-level=0, a target definition depth D-level=3, and a data Value value=100. The target leaf node is present in a new message, that is, the target leaf node is generated from the root node, so that the first repeated node in the first path from the root node to the target leaf node is the root node, the second repeated node is also the root node, and the number of repeatable nodes between the root node and the root node is 0, so that R-level=0 (hereinafter, where the target leaf node is generated from the root node, R of the target leaf node is equal to 0). The message tree of the first message includes the target leaf node, so the target definition depth is defined as the number of possible empty nodes in the path from the root node to the target leaf node, the path from the root node to the target leaf node is pageview- > positions- > impressions- > click_id, there are 3 nodes that may be empty in the above path, and thus the D-level=3 of the target leaf node.
In the second row, the target leaf node corresponds to a target repetition depth R-level=0, a target definition depth D-level=1, and a data Value value=null. The target leaf node is generated from the root node, and R-level=0 can be derived based on the above statement. In the second target message, the message tree does not include target leaf nodes, the target definition depth is defined as the number of empty nodes on the first path related to the target leaf nodes, the path on which the target leaf nodes are located is only defined to positions nodes, the first path related to the target leaf nodes in the second row is pageview- > positions, and the node on the path which can be empty is positions, so D-level=1. Since the click_id is not defined inside the second target message, value=null.
In the third row, the target leaf node corresponds to a target repetition depth R-level=0, a target definition depth D-level=1, and a data Value value=null. R-level=0 because, in the path from the current root node to the target leaf node, the target leaf node is generated from the root node, and the target repetition depth in this case is equal to 0, as known from the foregoing. The third row corresponds to the third message, whose message tree does not include the target leaf node; the first path associated with the target leaf node in the third row is pageview- > positions, on which there is 1 node that may be empty (positions is repeated, and when the repeat number is 0, the node is empty), so D-level=1. Since click_id is not defined at this position, value is null.
In the fourth row, the target leaf node corresponds to a target repetition depth R-level=1, a target definition depth D-level=2, and a data Value value=null. The fourth row is for the third message, where the message tree for the third message does not include the target leaf node, and in the fourth row, the first path associated with the target leaf node is pageview- > positions- > impressions, with two nodes that may be empty (positions and impressions are repeated, and when the repeat number is 0, the node is empty), so D-level=2. The target leaf node click_id is not defined, so value is null.
The encoding process for the fifth to seventh rows can be derived in the same way as for the rows above, and is not described in detail here.
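The row-by-row walk-through above can be reproduced with a small Dremel-style encoder for the click_id column. The Python below is a sketch under stated assumptions: messages are modelled as nested dicts and lists, positions and impressions are repeated, click_id is a single optional value, and the function name `encode_click_id` is hypothetical. The values it produces for the fifth to seventh rows follow from these rules and are not taken from fig. 5a.

```python
from typing import Dict, List, Optional, Tuple

Triple = Tuple[int, int, Optional[int]]  # (R-level, D-level, value)


def encode_click_id(message: Dict) -> List[Triple]:
    """Emit one (R, D, value) triple per storage location of the click_id column."""
    rows: List[Triple] = []
    positions = message.get("positions", [])
    if not positions:                      # nothing below the root is defined
        return [(0, 0, None)]
    for i, pos in enumerate(positions):
        r_pos = 0 if i == 0 else 1         # a repeated positions node gives R=1
        impressions = pos.get("impressions", [])
        if not impressions:                # path ends at positions: D=1
            rows.append((r_pos, 1, None))
            continue
        for j, imp in enumerate(impressions):
            r = r_pos if j == 0 else 2     # a repeated impressions node gives R=2
            if "click_id" in imp:
                rows.append((r, 3, imp["click_id"]))
            else:                          # path ends at impressions: D=2
                rows.append((r, 2, None))
    return rows


messages = [
    {"positions": [{"impressions": [{"click_id": 100}]}]},
    {"positions": [{}]},
    {"positions": [{}, {"impressions": [{}]}]},
    {"positions": [{"impressions": [{}, {}]}]},
    {"positions": [{"impressions": [{"click_id": 200}]}]},
]

column = [row for m in messages for row in encode_click_id(m)]
for row in column:
    print(row)
```

Running the sketch on the 5 messages yields 7 triples, of which only the first and the last carry a data value; the first four agree with the R-level and D-level values given for the first to fourth rows.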
In the above-described columnar encoding and storage process, whether or not a target leaf node corresponding to the target data column is included in a target message, a target definition depth and a target repetition depth need to be acquired, and a large number of target definition depths and target repetition depths need to be stored in the target data column. As shown in fig. 5a, although the target leaf node is defined in only two messages, 7 triples are still stored in the target data column. The second, third, fourth and fifth pairs of target definition depth and target repetition depth are only used for recovering the empty parent node, and in many practical application scenarios, the empty parent node recovered in the decoding process has no practical significance. Thus, this coding scheme actually reduces the coding efficiency while also increasing the size of the final output data.
By adopting the message processing method in the embodiment of the invention, columnar storage encoding is performed on the target leaf nodes in the first message and the fifth message to obtain encoding results, and each encoding result, namely a triplet consisting of the target definition depth, the target repetition depth and the data value, is stored in the corresponding storage position of the target data column; for the second to fourth messages, an invalid flag is added at the corresponding target storage location.
Specifically, fig. 5b is a schematic diagram of the column storage encoding of fig. 5a as optimized according to the embodiment of the present invention. In fig. 5b, when it is determined that the message tree corresponding to a message does not include the target leaf node, no encoding resources are spent on that node; instead, an invalid data flag, such as an invalid triple, is stored directly in the corresponding storage location of the target data column in place of an encoding result. Optionally, in fig. 5b, the invalid data flag may be a triplet of invalid definition depth, invalid repetition depth, and invalid data value. The invalid repetition depth may be set to the maximum repetition depth of the target data column plus 1; for example, in fig. 5a, the maximum repetition depth of the target data column corresponding to the target leaf node click_id is 3, so the invalid repetition depth may be set to 3+1=4. The invalid definition depth may be a preset value, such as 0, and the invalid data value is indicated by null. Based on this, the invalid data flag may be: R-level=4, D-level=0, value=null. In the column-type storage encoding process according to the embodiment of the present invention, the messages corresponding to the first row and the seventh row include the target leaf node, so the encoding process for the data information in the target leaf node is the same as that in fig. 5a. For the messages corresponding to the second to sixth rows, the corresponding message tree does not include the target leaf node, so no column storage encoding is performed for the target leaf node, and the invalid data flag R-level=4, D-level=0, value=null is directly added at the storage position corresponding to each message, so that the data needing to be encoded is reduced and the encoding efficiency is improved.
In the embodiment of the invention, a message processing device acquires a target message to be stored and a message tree corresponding to the target message, wherein the target message comprises a plurality of data information; further, if it is detected that the message tree corresponding to the target message includes a target leaf node corresponding to the target data column, columnar storage encoding is performed on the data information in the target leaf node to obtain an encoding result, and the encoding result is stored in a target storage position of the target data column. If the message tree does not include the target leaf node corresponding to the target data column, an invalid flag position is determined in the target data column according to the target storage position, and an invalid data flag is set at the invalid flag position. It should be understood that when the message tree corresponding to the target message does not include the target leaf node corresponding to the target data column, no encoding is performed for the target leaf node and an invalid data flag is added directly, so that a part of the encoding resources can be saved and the encoding efficiency is improved.
Based on the above-mentioned message processing method, the embodiment of the present invention further provides another message processing method, and referring to fig. 6, a flow chart of another message processing method provided by the embodiment of the present invention is shown. The message processing method shown in fig. 6 may be performed by a message processing apparatus, and in particular may be performed by a processor in the message processing apparatus. The message processing method shown in fig. 6 may include the steps of:
Step S601, obtaining a target message to be stored and a message tree corresponding to the target message.
Step S602, if the message tree includes a target leaf node corresponding to the target data column, performing columnar storage encoding on the data information in the target leaf node to obtain an encoding result, and storing the encoding result in a target storage position in the target data column.
In an embodiment, some possible implementations included in step S601 to step S602 may refer to descriptions of related steps in the message processing method shown in fig. 3, which are not described herein.
Step S603, if the message tree does not include the target leaf node corresponding to the target data column, searching forward for the first reference storage location in the target data column based on the target storage location, and searching backward for the second reference storage location.
In one embodiment, if the message tree does not include the target leaf node corresponding to the target data column, then in order to reduce the encoded data and improve the encoding efficiency, the data information in the leaf node may be left unencoded; instead, an invalid flag position is found directly in the target data column, and an invalid data flag is added at that position to indicate that the target message has no data information in the target data column. The target data column may include a plurality of consecutive storage positions, and the data information of the plurality of messages to be stored is stored in the target data column in sequence. To further reduce the encoded data and improve the encoding efficiency, the embodiment of the invention may also generate a single invalid data flag jointly for several consecutively acquired messages to be stored that do not include the target leaf node corresponding to the target data column, and store that invalid data flag in the target data column at the invalid flag position that is found. This approach may be called run-length encoding. In the subsequent decoding process, the invalid data flag indicates exactly which messages have empty data information for the target data column, so that this data information need not be decoded, which saves decoding resources to a certain extent. Specifically, this can be achieved through steps S603 to S605.
In step S603, the first step is to find, among the several consecutively acquired target messages to be stored, the first reference storage location, that is, the location in the target data column at which a target message that does not include the target leaf node first appears. The first reference storage location satisfies the following conditions: an encoding result is stored in the storage position immediately preceding the first reference storage location, and no encoding result is stored in any storage position between the first reference storage location and the target storage location. Then, the second reference storage location is found, that is, the location in the target data column at which a target message that does not include the target leaf node last appears among the several consecutively acquired target messages to be stored. The second reference storage location satisfies the following conditions: an encoding result is stored in the storage position immediately following the second reference storage location, and no encoding result is stored in any storage position between the second reference storage location and the target storage location.
Assume that the 5 consecutive target messages to be stored that are acquired by the message processing device are the 5 messages corresponding to fig. 5b. In fig. 5b, no target leaf node is included in the message trees corresponding to the second to fourth messages, and the second to sixth rows show the column encoding processes for the second to fourth messages. Assuming that 500 represents the target storage location, the first reference storage location found according to the search method in step S603 is the storage location corresponding to the data information in the target leaf node of the second row, that is, 501 in fig. 5b, and the second reference storage location is the storage location corresponding to the data information in the target leaf node in the fifth row, that is, 502 in fig. 5b.
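The search for the two reference storage locations amounts to expanding outward from the target storage location until a slot holding an encoding result is reached on each side. The Python sketch below illustrates this on a generic column rather than on fig. 5b; the function name `find_reference_locations`, the use of `None` for a slot without an encoding result, and the behaviour of stopping at the column boundaries are assumptions of the sketch.

```python
from typing import List, Optional, Tuple


def find_reference_locations(column: List[Optional[str]],
                             target: int) -> Tuple[int, int]:
    """Return (first, second) indices bounding the run of empty slots around target.

    A slot is None when no encoding result is stored there. The search toward
    earlier slots stops when the preceding slot holds an encoding result; the
    search toward later slots stops when the following slot holds one.
    """
    if column[target] is not None:
        raise ValueError("target slot already holds an encoding result")
    first = target
    while first - 1 >= 0 and column[first - 1] is None:
        first -= 1
    second = target
    while second + 1 < len(column) and column[second + 1] is None:
        second += 1
    return first, second


# Generic illustration: slots 0 and 4 hold encoding results, slots 1-3 do not.
column = ["enc", None, None, None, "enc"]
print(find_reference_locations(column, 2))
```

For this column and target index 2 the sketch returns indices 1 and 3, the first and last slots of the run that holds no encoding result.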
Step S604, acquiring an invalid definition depth based on the number of messages to be stored corresponding to the storage positions between the first reference storage position and the second reference storage position, and acquiring an invalid repetition depth based on the repetition depth range corresponding to the target data column.
Further, in the embodiment of the present invention, the invalid definition depth may be obtained based on the number of messages to be stored that correspond to the storage locations between the first reference storage location and the second reference storage location, where these messages include the message to be stored corresponding to the first reference storage location and the message to be stored corresponding to the second reference storage location. As shown in fig. 5b, the number of storage locations between the first reference storage location and the second reference storage location is 5: the storage location of the second row corresponds to the second message to be stored, the storage locations of the third row and the fourth row both correspond to the third message to be stored, and the storage locations of the fifth row and the sixth row both correspond to the fourth message to be stored. The number of messages to be stored corresponding to the storage locations between the first reference storage location and the second reference storage location is therefore 3, namely the second, third and fourth messages to be stored.
Optionally, acquiring the invalid definition depth based on the number of messages to be stored corresponding to the storage locations between the first reference storage location and the second reference storage location includes: setting that number as the invalid definition depth. In other embodiments, the number may instead be squared and the result used as the invalid definition depth, and so on. The examples of the present invention merely illustrate several possible embodiments, and the specific embodiments are not limited thereto.
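Counting messages rather than storage locations can be illustrated with a short Python sketch. It assumes a hypothetical list `slot_owner` that records, for each storage location of the column, the number of the message it belongs to, following the row-to-message mapping described for fig. 5b; the function name `invalid_definition_depth` is likewise an assumption.

```python
from typing import List


def invalid_definition_depth(slot_owner: List[int], first: int, second: int) -> int:
    """Count the distinct messages owning the slots from first to second, inclusive."""
    return len(set(slot_owner[first:second + 1]))


# Rows 1-7 belong to messages 1, 2, 3, 3, 4, 4, 5 respectively
# (0-based indices 1..5 cover the second to sixth rows).
slot_owner = [1, 2, 3, 3, 4, 4, 5]
print(invalid_definition_depth(slot_owner, 1, 5))
```

The five storage locations in the run belong to three distinct messages, so the sketch yields an invalid definition depth of 3.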
In an embodiment, the implementation manner of acquiring the invalid repetition depth based on the repetition depth range corresponding to the target data column may refer to the description of the related steps in the embodiment of fig. 3, which is not repeated herein.
Step S605, taking the first reference storage location as an invalid flag location, generating an invalid data flag based on the invalid definition depth and the invalid repetition depth, and adding the invalid data flag to the first reference storage location.
After determining the invalid definition depth and the invalid repetition depth, an invalid data flag may be generated based on the invalid definition depth, the invalid repetition depth, and the invalid data value, and in particular, the invalid data flag may be expressed as [ the invalid definition depth, the invalid repetition depth, and the invalid data value ].
For example, refer to fig. 5c, which is a schematic diagram of still another column-type storage encoding according to an embodiment of the present invention. The schematic diagram shown in fig. 5c is based on fig. 5b, with column-type storage encoding performed on the messages that do not include the target leaf node by using the methods shown in steps S603-S605 above. In fig. 5b, no target leaf node is included in the messages corresponding to the second to sixth rows; 500 represents the target storage location, the first reference storage location found by searching forward from the target storage location is 501, and the second reference storage location found by searching backward from the target storage location is 502. The number of messages to be stored corresponding to the storage locations between 501 and 502 is 3, so an invalid definition depth of 3 may be obtained. Assuming that the invalid repetition depth acquired based on the repetition depth range corresponding to the target data column is equal to the maximum repetition depth plus 1, and that the maximum repetition depth corresponding to the target data column is 3, the invalid repetition depth is 4. Adding the invalid definition depth, the invalid repetition depth and the invalid data value as an invalid data flag at the first reference storage location yields fig. 5c. Comparing fig. 5b and fig. 5c, it can be seen that the invalid data flags of the third to sixth rows are eliminated in fig. 5c, and that the invalid definition depth 3 in the second row (i.e. the invalid flag location) indicates that there are 3 consecutive messages whose data information in the target data column is empty.
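The merging of consecutive empty messages into one invalid data flag can be sketched as a single pass over the messages. The Python below is illustrative only: the names `encode_column`, `has_click_id` and `encode_single_click_id` are hypothetical, triples are written as (R-level, D-level, value), and the leaf encoder is a stand-in that handles only a message with one position and one impression.

```python
from typing import Callable, Dict, List, Optional, Tuple

Triple = Tuple[int, int, Optional[int]]  # (R-level, D-level, value)


def encode_column(messages: List[Dict],
                  has_leaf: Callable[[Dict], bool],
                  encode_leaf: Callable[[Dict], List[Triple]],
                  invalid_r: int) -> List[Triple]:
    """Encode messages that contain the leaf; merge runs of the others into one flag."""
    column: List[Triple] = []
    pending = 0                                  # length of the current empty run
    for msg in messages:
        if has_leaf(msg):
            if pending:                          # close the run with a single flag
                column.append((invalid_r, pending, None))
                pending = 0
            column.extend(encode_leaf(msg))
        else:
            pending += 1                         # no encoding work for this message
    if pending:
        column.append((invalid_r, pending, None))
    return column


def has_click_id(msg: Dict) -> bool:
    return any("click_id" in imp
               for pos in msg.get("positions", [])
               for imp in pos.get("impressions", []))


def encode_single_click_id(msg: Dict) -> List[Triple]:
    # Stand-in encoder: assumes exactly one position holding one impression.
    return [(0, 3, msg["positions"][0]["impressions"][0]["click_id"])]


messages = [
    {"positions": [{"impressions": [{"click_id": 100}]}]},
    {"positions": [{}]},
    {"positions": [{}, {"impressions": [{}]}]},
    {"positions": [{"impressions": [{}, {}]}]},
    {"positions": [{"impressions": [{"click_id": 200}]}]},
]

merged = encode_column(messages, has_click_id, encode_single_click_id, invalid_r=4)
print(merged)
```

For the 5 example messages the column shrinks to three entries: the triple for click_id 100, one flag with R-level 4 and D-level 3 covering the three empty messages, and the triple for click_id 200.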
When decoding, the optimized message to be stored is recovered to the corresponding root node, and other nodes of the corresponding message tree are discarded, and the discarded nodes do not contain the defined target leaf nodes on the target data column, so that the nodes are discarded in most application scenes and the validity of the data is not affected.
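A decoder for such a column can expand each merged flag back into the corresponding number of root-only messages. The Python sketch below is one possible reading of this step, not the described implementation: the name `decode_column` is hypothetical, triples are (R-level, D-level, value), a flag is recognized by a repetition depth above the maximum, and a decoded message is reduced to the list of click_id values it contains.

```python
from typing import Dict, List, Optional, Tuple

Triple = Tuple[int, int, Optional[int]]  # (R-level, D-level, value)


def decode_column(column: List[Triple], max_repetition_depth: int) -> List[Dict]:
    """Rebuild one dict per message; merged flags become empty (root-only) messages."""
    messages: List[Dict] = []
    for r_level, d_level, value in column:
        if r_level > max_repetition_depth:
            # Invalid flag: d_level holds the number of consecutive empty messages.
            messages.extend({} for _ in range(d_level))
        elif r_level == 0:
            # R-level 0 starts a new message.
            messages.append({"click_id": [] if value is None else [value]})
        elif value is not None:
            # A repeated value belongs to the message currently being rebuilt.
            messages[-1].setdefault("click_id", []).append(value)
    return messages


column = [(0, 3, 100), (4, 3, None), (0, 3, 200)]
decoded = decode_column(column, max_repetition_depth=3)
print(decoded)
```

The three empty dicts stand for the three messages recovered only up to their root node; their discarded intermediate nodes carried no click_id values.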
Practice shows that for 13G of advertisement data, if the column-type storage encoding of the prior art is adopted, the memory occupied in the encoding process is 13G, and the storage space required for storing the position information of the leaf nodes of the advertisement data is 760 megabits; if the column-type storage encoding method of this embodiment is adopted, the memory occupied in the encoding process is reduced from 13G to 5G, the storage space required for storing the position information of the leaf nodes of the advertisement data is reduced from 760 megabits by 384 megabits, and at the same time the encoding speed is improved by more than 20%.
In the embodiment of the invention, a message processing device acquires a target message to be stored and a message tree corresponding to the target message; further, if it is detected that the message tree corresponding to the target message includes a target leaf node corresponding to the target data column, columnar storage encoding is performed on the data information in the target leaf node to obtain an encoding result, and the encoding result is stored in a target storage position of the target data column. If it is detected that the message tree corresponding to the target message does not include the target leaf node corresponding to the target data column, that is, the data value of the corresponding data information is null, a first reference storage position is obtained by traversing forward from the target storage position in the target data column, and a second reference storage position is obtained by traversing backward; further, an invalid definition depth is acquired based on the number of messages to be stored corresponding to the storage positions between the first reference storage position and the second reference storage position, and an invalid repetition depth is acquired based on the repetition depth range corresponding to the target data column; finally, the first reference storage position is taken as the invalid flag position, an invalid data flag is generated according to the invalid definition depth and the invalid repetition depth, and the invalid data flag is added at the invalid flag position. In this way, when the message tree corresponding to the target message does not include the target leaf node corresponding to the target data column, no encoding is performed for the target leaf node and an invalid data flag is added directly, so that a part of the encoding resources can be saved and the encoding efficiency is improved.
And for a plurality of continuous messages to be stored, the data information of which is empty in the target data column, an invalid data mark can be added at a storage position corresponding to one message to be stored in a combined way, so that the storage space can be saved.
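The merging of invalid data marks described above can be illustrated with a minimal sketch. The names `Encoded`, `InvalidMark` and `encode_column` are hypothetical and do not come from the embodiment; the sketch assumes that the invalid definition depth simply records how many consecutive empty messages the mark covers, and that the invalid repetition depth is the maximum valid repetition depth plus one.

```python
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass
class Encoded:
    # Stand-in for a real columnar encoding result (assumption).
    value: str


@dataclass
class InvalidMark:
    # Assumption: number of consecutive empty messages covered by this mark.
    definition_depth: int
    # Assumption: a value outside the column's valid repetition depth range.
    repetition_depth: int


Slot = Optional[Union[Encoded, InvalidMark]]


def encode_column(values: List[Optional[str]],
                  max_repetition_depth: int = 1) -> List[Slot]:
    """Fill one data column, one storage position per message.

    A run of consecutive empty values gets a single InvalidMark at the
    first position of the run; the other positions of the run stay None.
    """
    slots: List[Slot] = [None] * len(values)
    run_start: Optional[int] = None
    for i, value in enumerate(values):
        if value is not None:
            slots[i] = Encoded(value)
            run_start = None
        elif run_start is None:
            # First empty message of a run: place the merged mark here.
            run_start = i
            slots[i] = InvalidMark(1, max_repetition_depth + 1)
        else:
            # Later empty message of the same run: extend the existing mark.
            slots[run_start].definition_depth += 1
    return slots
```

In this sketch, five messages of which the middle three are empty produce one mark covering three messages rather than three separate marks.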
Based on the embodiments of the message processing methods shown in fig. 3 and fig. 6, the present invention provides an embodiment of a further message processing method, and referring to fig. 7, a flow chart of the further message processing method provided by the embodiment of the present invention is shown. The message processing method shown in fig. 7 may be executed by a message processing apparatus, and in particular, may be executed by a processor of the message processing apparatus, and the message processing method shown in fig. 7 may include the steps of:
Step S701, receiving a decoding instruction for decoding a target message, and acquiring a plurality of data columns related to the target message.
In one embodiment, the decoding instruction for decoding the target message may be generated after receiving a query operation on the target message; or may be generated upon receipt of a decoding operation on the target message.
As is apparent from the description of the foregoing embodiments, the target message includes a plurality of data information belonging to at least one data type, and the data information of one data type corresponds to one data column, so the plurality of data information included in the target message corresponds to a plurality of data columns in the columnar storage database. In addition, any one data information included in the target message is either stored in the corresponding data column as an encoding result obtained by columnar storage encoding, or marked in the corresponding data column by an invalid data mark. Therefore, in order to decode the plurality of data information included in the target message, each data column corresponding to the target message is acquired first.
Step S702, traversing related storage locations related to the target message in the plurality of data columns sequentially.
In one embodiment, as can be seen from the foregoing embodiment, the target message corresponds to a target storage location in the target data column, and if the target message includes a target leaf node corresponding to the target data column, the data information in the target leaf node is subjected to column coding to obtain a coding result and then stored in the target storage location; if the target leaf node is not included in the target message, an invalid tag position is found in the corresponding data column according to the target storage position, an invalid data tag is generated, and the invalid data tag is added to the invalid tag position.
Optionally, the invalid mark position may be the target storage location itself, or may be a storage location located before the target storage location. Accordingly, a relevant storage location related to the target message in the plurality of data columns may be a storage location holding an encoding result or a storage location holding an invalid data mark.
Step S703, if there is a coding result in the relevant storage location of the current traversal, performing decoding processing on the coding result to obtain data information corresponding to the relevant storage location of the current traversal in the target message.
Step S704, if invalid data marks exist in the relevant storage position of the current traversal, skipping the relevant storage position of the current traversal, and continuing to traverse the next relevant storage position.
In the process of traversing each data column, if a relevant storage position holding an encoding result is reached, the encoding result is decoded to obtain the data information in the target message corresponding to the currently traversed relevant storage position; if a relevant storage position holding an invalid data mark is reached, that position is skipped without any processing of the invalid data mark, and traversal continues with the next relevant storage position. Traversal stops once all storage positions related to the target message in each data column have been traversed.
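Steps S702 to S704 can be sketched as follows. This is only an illustration under stated assumptions: each relevant storage position is modeled as a hypothetical `(kind, payload)` tuple, where `kind` is `"enc"` for an encoding result or `"invalid"` for an invalid data mark, and "decoding" is reduced to a UTF-8 byte decode in place of the real columnar decoding.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical slot layout: ("enc", payload_bytes) or ("invalid", None).
Slot = Tuple[str, Optional[bytes]]


def decode_message(columns: Dict[str, List[Slot]]) -> Dict[str, List[str]]:
    """Traverse the relevant storage positions of each data column in turn.

    Encoding results are decoded; invalid data marks are skipped without
    any further processing.
    """
    decoded: Dict[str, List[str]] = {}
    for column_name, slots in columns.items():
        for kind, payload in slots:
            if kind == "invalid":
                # Step S704: skip this position and continue traversal.
                continue
            # Step S703: decode the encoding result (placeholder decoding).
            decoded.setdefault(column_name, []).append(payload.decode("utf-8"))
    return decoded
```

A column that holds only an invalid data mark for the target message contributes nothing to the decoded output, which is where the saving in decoding work comes from.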
In the embodiment of the invention, the message processing equipment corresponds each data information included in the target message to each data column, and each data column stores a coding result or an invalid data mark related to the data information; when a decoding instruction for decoding the target message is received, traversing storage positions related to the target message in a plurality of data columns in sequence; if the relevant storage position of the current traversal stores the coding result, decoding the coding result to obtain data information corresponding to the relevant storage position of the current traversal in the target message; if invalid data marks exist in the relevant storage position of the current traversal, skipping the relevant storage position of the current traversal, and continuing to traverse the next relevant storage position. Therefore, the data to be decoded in the decoding process can be reduced, a part of decoding power consumption is saved, and the decoding efficiency can be improved.
Based on the description of the embodiments of the message processing method, the message processing method according to the embodiments of the present invention may be applied to various application scenarios of data storage and query, such as storage and query of advertisement data. The following description will be given by taking advertisement data as an example:
Referring to fig. 8, an application scenario diagram of a message processing method according to an embodiment of the present invention is provided, where 801 represents an advertisement delivery terminal, 802 represents a message processing device, and 803 represents a columnar storage database. Each advertisement delivery terminal 801 transmits an exposure message of advertisement data to be delivered to the message processing device 802, wherein the data information included in the exposure message is one or more of the following: exposure platform information, age information of an exposure object, and sex information of the exposure object. If the data information is exposure platform information, the data value included in the data information may include an exposure platform type, such as a social platform, a browser, or another platform; if the data information is age information of the exposure object, the data value included in the data information is the age of the exposure object; if the data information is sex information of the exposure object, the data value included in the data information is the sex of the exposure object. The message processing device 802 processes each exposure message by using the message processing methods shown in fig. 3 and fig. 6 described above, and stores the data information included in each exposure message into the corresponding data columns of the columnar storage database.
Assume that the exposure message includes an exposure platform type, and the exposure platform type corresponds to data column A in the columnar storage database; the exposure message includes sex information of an exposure object, and the sex information of the exposure object corresponds to data column B in the columnar storage database; the exposure message includes age information of the exposure object, and the age information of the exposure object corresponds to data column C in the columnar storage database. As can be seen from fig. 8, each data column includes a plurality of storage locations, and the storage locations sequentially store the encoding result or invalid data mark of the corresponding data information in the respective exposure messages.
In one embodiment, any advertisement delivery party can determine, according to the exposure platform information stored in the columnar storage database, which type of exposure platform currently has the most exposed advertisement data, and can then deliver advertisements on that type of exposure platform. Specifically, the advertisement delivery party can send a query instruction to the message processing device to query the exposure platform information stored in the columnar storage database; the message processing device, in response to the query instruction for the exposure platform information, counts the number of each type of exposure platform in the target data column, and outputs the number of each type of exposure platform, so that advertisement data is subsequently delivered based on the number of each type of exposure platform.
As can be seen from the foregoing, each storage location in each data column in the columnar storage database is associated with a target message. For example, in the columnar storage database shown in fig. 8, the storage locations in data column A from top to bottom correspond to the exposure message 1, the exposure message 2, the exposure message 3, and the exposure message 1 in sequence; the storage locations in data column B from top to bottom correspond to the exposure message 1, the exposure message 2, the exposure message 3, and the exposure message 1 in sequence. On this basis, each advertisement delivery party can count the number of exposure messages in the columnar storage database whose exposure platform is a browser and whose exposure object is male. Specifically, in response to the statistical request, the message processing device may first search data column A, which corresponds to the exposure platform type, for exposure messages whose exposure platform type is browser; assume that the exposure platform type in the found exposure message 1 and exposure message 3 is browser. The device then checks, in data column B corresponding to sex, whether the sex in the sex information of the exposure objects included in the exposure message 1 and the exposure message 3 is male; if so, it outputs that the number of exposure messages whose exposure platform is a browser and whose exposure object is male is 2.
It should be understood that, in practical applications, the advertisement delivery terminal 801 and the message processing device 802 may be the same device, that is, the advertisement delivery terminal 801 may perform the message processing methods shown in fig. 3 and fig. 6 to store the respective advertisement data in the columnar storage database.
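The two statistics described above (the number of each type of exposure platform, and the number of exposure messages whose platform is a browser and whose exposure object is male) can be sketched as follows. The column contents are hypothetical sample data chosen to match the example in which two messages satisfy both conditions; `None` stands in for an invalid data mark, and the columns are assumed to be already decoded.

```python
from collections import Counter
from typing import List, Optional

# Hypothetical decoded columns; index i in each column belongs to the
# same exposure message. None stands in for an invalid data mark.
column_a: List[Optional[str]] = ["browser", "social", "browser", None]
column_b: List[Optional[str]] = ["male", "female", "male", "male"]


def count_platform_types(column: List[Optional[str]]) -> Counter:
    """Count each exposure platform type, ignoring invalid data marks."""
    return Counter(value for value in column if value is not None)


def count_matching(col_a: List[Optional[str]], col_b: List[Optional[str]],
                   platform: str, sex: str) -> int:
    """Count messages matching a platform type in column A and a sex in column B."""
    return sum(1 for a, b in zip(col_a, col_b) if a == platform and b == sex)
```

Because only the two queried columns are read, data column C (age) is never touched, which is the usual benefit of a columnar layout for this kind of query.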
Based on the above embodiment of the message processing method, the embodiment of the invention also provides a message processing device. Referring to fig. 9, a schematic structural diagram of a message processing apparatus according to an embodiment of the present invention is provided. In the message processing apparatus shown in fig. 9, the following units can be operated:
An obtaining unit 901, configured to obtain a target message to be stored, where the target message includes a plurality of data information;
the obtaining unit 901 is further configured to obtain a message tree corresponding to the target message, where the message tree includes a plurality of leaf nodes, and each leaf node is configured to store one data information;
A processing unit 902, configured to, if the message tree includes a target leaf node corresponding to a target data column, perform columnar storage encoding on data information in the target leaf node to obtain an encoding result, and store the encoding result in a target storage location in the target data column, where the target storage location corresponds to the target message;
The processing unit 902 is further configured to determine an invalid flag location in the target data column according to the target storage location if the message tree does not include a target leaf node corresponding to the target data column, and set an invalid data flag at the invalid flag location.
In one embodiment, the plurality of data information included in the target message belongs to at least one data type, each data type corresponds to one data column, and the plurality of data information belonging to the same data type is stored in the same data column; the message tree comprises a plurality of leaf nodes, each leaf node is used for storing one piece of data information of the target message; each leaf node corresponds to a node type, and the node type comprises a necessary type, an optional type and a repeatable type; the data value of the data information in the essential type leaf node is a non-null value, the data value of the data information in the optional type leaf node is a null value or a non-null value, and the data value of the data information in the repeatable type leaf node is a null value or a non-null value.
In one embodiment, the message tree further comprises a root node; the processing unit 902 performs the following operations when performing columnar storage encoding on the data information in the target leaf node to obtain an encoding result: acquiring position information of the target message relative to the target data column, the position information comprising a target definition depth and a target repetition depth; and performing columnar storage encoding on the data information in the target leaf node and the position information of the target message to obtain the encoding result. The target repetition depth is the number of repeatable nodes in a second path in the message tree, the second path being the path between a second repeatable node in a first path in the message tree associated with the target leaf node and the root node; if the message tree includes a target leaf node, the target definition depth is the number of available empty nodes in a third path in the message tree, the third path being a path from the root node to the target leaf node in the message tree; if the message tree does not include a target leaf node, the target definition depth is the number of available nodes in a first path in the message tree associated with the target leaf node.
In one embodiment, the processing unit 902, when acquiring an invalid flag position from the target data column according to the target storage position and setting an invalid data flag at the invalid flag position, performs the following operations: acquiring a repeated depth range corresponding to the target data column; acquiring an invalid definition depth for setting an invalid mark, and acquiring an invalid repetition depth for setting an invalid mark based on the repetition depth range, the invalid repetition depth exceeding the repetition depth range; generating an invalid data tag based on the invalid defined depth and the invalid repeat depth; the target storage location is taken as an invalid tag location, and the invalid data tag is added at the target storage location.
In one embodiment, the processing unit 902, when acquiring an invalid repetition depth for setting an invalid flag based on the repetition depth range, performs the following operations: obtaining the maximum repetition depth or the minimum repetition depth in the repetition depth range; and acquiring a preset value, and processing the maximum repetition depth or the minimum repetition depth based on the preset value to obtain an invalid repetition depth.
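One way to obtain an invalid repetition depth that exceeds the repetition depth range is sketched below. The embodiment only states that the maximum or minimum repetition depth is processed based on a preset value; adding the preset value to the maximum, or subtracting it from the minimum, is an assumption made here for illustration, and the function name and parameters are hypothetical.

```python
def invalid_repetition_depth(min_depth: int, max_depth: int,
                             preset: int = 1, use_max: bool = True) -> int:
    """Return a repetition depth guaranteed to lie outside [min_depth, max_depth].

    Assumption: the preset value is added to the maximum repetition depth,
    or subtracted from the minimum repetition depth.
    """
    if min_depth > max_depth:
        raise ValueError("min_depth must not exceed max_depth")
    if preset <= 0:
        raise ValueError("preset must be positive so the result leaves the range")
    return max_depth + preset if use_max else min_depth - preset
```

Since no valid entry can carry a repetition depth outside the range, a reader of the column can recognize the invalid data mark from this value alone.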
In one embodiment, the target data column includes a plurality of consecutive storage locations, the data information of the plurality of messages to be stored is sequentially stored to the target data column, and the processing unit 902 performs the following operations when determining an invalid flag location in the target data column according to the target storage location and setting an invalid data flag at the invalid flag location: looking up a first reference storage location forward in the target data column based on the target storage location, the first reference storage location satisfying the following condition: the encoding result is stored in a previous storage position located in the first reference storage position, and the encoding result is not stored in all storage positions between the first reference storage position and the target storage position; looking back a second reference storage location that satisfies the following condition: the encoding result is stored in a later storage position of the second reference storage position, and the encoding result is not stored in all storage positions between the second reference storage position and the target storage position; acquiring an invalid definition depth based on the number of messages to be stored corresponding to a storage position between the first reference storage position and the second reference storage position, and acquiring an invalid repetition depth based on a repetition depth range corresponding to the target data column; and generating an invalid data mark based on the invalid definition depth and the invalid repetition depth and adding the invalid data mark to the first reference storage position.
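The forward and backward lookups for the two reference storage locations can be sketched as follows. The column is modeled as a hypothetical list of booleans, `True` where an encoding result is stored; treating the start or end of the column as a boundary, and using the run length directly as the basis for the invalid definition depth, are assumptions of this sketch.

```python
from typing import List, Tuple


def find_reference_positions(has_result: List[bool], target: int) -> Tuple[int, int]:
    """Return (first, second) reference positions around an empty target position.

    first:  earliest position such that no position from first to target
            holds an encoding result.
    second: latest position such that no position from target to second
            holds an encoding result.
    """
    if has_result[target]:
        raise ValueError("target position already holds an encoding result")
    first = target
    while first > 0 and not has_result[first - 1]:
        first -= 1  # look forward (toward earlier positions)
    second = target
    while second < len(has_result) - 1 and not has_result[second + 1]:
        second += 1  # look backward (toward later positions)
    return first, second


def covered_message_count(first: int, second: int) -> int:
    """Number of messages between the two reference positions, inclusive."""
    return second - first + 1
```

The invalid data mark would then be written once at position `first`, with `covered_message_count(first, second)` feeding the invalid definition depth.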
According to one embodiment of the invention, the steps involved in the message processing methods shown in fig. 3 and 6 may be performed by the units in the message processing apparatus shown in fig. 9. For example, step S301 shown in fig. 3 may be performed by the obtaining unit 901 in the message processing apparatus shown in fig. 9, and steps S302 and S303 may be performed by the processing unit 902 in the message processing apparatus shown in fig. 9; as another example, step S601 shown in fig. 6 may be performed by the obtaining unit 901 in the message processing apparatus shown in fig. 9, and steps S603 to S605 may be performed by the processing unit 902 in the message processing apparatus shown in fig. 9.
According to another embodiment of the present invention, the units in the message processing apparatus shown in fig. 9 may be separately or entirely combined into one or several other units, or one or more of the units may be further split into a plurality of functionally smaller units, which can achieve the same operation without affecting the technical effects of the embodiments of the present invention. The above units are divided based on logical functions; in practical applications, the function of one unit may be implemented by a plurality of units, or the functions of a plurality of units may be implemented by one unit. In other embodiments of the invention, the message processing apparatus may also include other units; in practical applications, these functions may also be implemented with the assistance of other units, and may be implemented by the cooperation of multiple units.
According to another embodiment of the present invention, a message processing apparatus as shown in fig. 9 may be constructed by running a computer program (including program code) capable of executing the steps involved in the respective methods as shown in fig. 3 or 6 on a general-purpose computing device such as a computer including a processing element such as a Central Processing Unit (CPU), a random access storage medium (RAM), a read only storage medium (ROM), and the like, and a storage element, and implementing the message processing method of the embodiment of the present invention. The computer program may be recorded on, for example, a computer readable storage medium, and loaded into and executed by the computing device described above.
In the embodiment of the invention, a message processing device acquires a target message to be stored and a message tree corresponding to the target message, wherein the target message comprises a plurality of data information. If it is detected that the message tree corresponding to the target message includes a target leaf node corresponding to the target data column, columnar storage encoding is performed on the data information in the target leaf node to obtain an encoding result, and the encoding result is stored in the target storage position of the target data column. If the message tree does not include the target leaf node corresponding to the target data column, an invalid mark position is determined in the target data column according to the target storage position, and an invalid data mark is set at the invalid mark position. It should be understood that, when the message tree corresponding to the target message does not include the target leaf node corresponding to the target data column, no encoded storage is performed for the target leaf node and an invalid data mark is added directly, which saves part of the encoding resources and improves message encoding efficiency.
Based on the above embodiments of the message processing method and the message processing apparatus, the embodiment of the present invention further provides another message processing apparatus, and referring to fig. 10, a schematic structural diagram of another message processing apparatus provided in the embodiment of the present invention is provided, where the message processing apparatus shown in fig. 10 may operate the following units:
A receiving unit 1001, configured to receive a decoding instruction for decoding a target message, and obtain a plurality of data columns related to the target message, where the target message includes a plurality of data information, and each data column stores one or more of the following: a coding result related to the data information and an invalid data mark corresponding to the data information, wherein the coding result and the invalid data mark are stored in the corresponding data columns by adopting the message processing methods shown in fig. 3 and 6;
A processing unit 1002, configured to sequentially traverse related storage locations related to the target message in a plurality of data columns;
The processing unit 1002 is further configured to decode the encoding result if the encoding result exists in the relevant storage location currently traversed, to obtain data information corresponding to the relevant storage location currently traversed in the target message;
The processing unit 1002 is further configured to skip the currently traversed relevant storage location and continue to traverse the next relevant storage location if there is an invalid data tag in the currently traversed relevant storage location.
In one embodiment, the target message includes an exposure message of advertisement data, the exposure message includes exposure platform information for delivering the advertisement data, the exposure platform information includes a type of an exposure platform, the exposure platform information corresponds to a target data column in the column storage database, and the message processing apparatus may further include a statistics unit 1003 and an output unit 1004:
the statistics unit 1003 is configured to, in response to a query instruction for exposure platform information, count the number of each type of exposure platform in the target data column;
The output unit 1004 is configured to output the number of each type of exposure platform, so that advertisement data is subsequently delivered based on the number of each type of exposure platform.
According to one embodiment of the invention, the steps involved in the message processing method shown in fig. 7 may be performed by the units in the message processing apparatus shown in fig. 10. For example, step S701 shown in fig. 7 may be performed by the receiving unit 1001 in the message processing apparatus shown in fig. 10, and steps S702 to S704 may be performed by the processing unit 1002 in the message processing apparatus shown in fig. 10.
According to another embodiment of the present invention, the units in the message processing apparatus shown in fig. 10 may be separately or entirely combined into one or several other units, or one or more of the units may be further split into a plurality of functionally smaller units, which can achieve the same operation without affecting the technical effects of the embodiments of the present invention. The above units are divided based on logical functions; in practical applications, the function of one unit may be implemented by a plurality of units, or the functions of a plurality of units may be implemented by one unit. In other embodiments of the invention, the message processing apparatus may also include other units; in practical applications, these functions may also be implemented with the assistance of other units, and may be implemented by the cooperation of multiple units.
According to another embodiment of the present invention, a message processing apparatus as shown in fig. 10 may be constructed by running a computer program (including program code) capable of executing the steps involved in the respective methods as shown in fig. 7 on a general-purpose computing device such as a computer including a processing element such as a Central Processing Unit (CPU), a random access storage medium (RAM), a read only storage medium (ROM), and the like, and a storage element, and implementing the message processing method of the embodiment of the present invention. The computer program may be recorded on, for example, a computer readable storage medium, and loaded into and executed by the computing device described above.
In the embodiment of the invention, the message processing equipment corresponds each data information included in the target message to each data column, and each data column stores a coding result or an invalid data mark related to the data information; when a decoding instruction for decoding the target message is received, traversing storage positions related to the target message in a plurality of data columns in sequence; if the relevant storage position of the current traversal stores the coding result, decoding the coding result to obtain data information corresponding to the relevant storage position of the current traversal in the target message; if invalid data marks exist in the relevant storage position of the current traversal, skipping the relevant storage position of the current traversal, and continuing to traverse the next relevant storage position. Therefore, the data to be decoded in the decoding process can be reduced, a part of decoding power consumption is saved, and the decoding efficiency can be improved.
Based on the above-mentioned method embodiment and the device embodiment, the embodiment of the present invention further provides a terminal, and referring to fig. 11, a schematic structural diagram of a terminal provided in the embodiment of the present invention is provided, where the terminal shown in fig. 11 may at least include a processor 1101, an input interface 1102, an output interface 1103, and a computer storage medium 1104. Wherein the processor 1101, the input interface 1102, the output interface 1103, and the computer storage medium 1104 may be connected by a bus or other means.
A computer storage medium 1104 may be stored in a memory of a node device, the computer storage medium 1104 being used for storing a computer program comprising program instructions, and the processor 1101 being used for executing the program instructions stored by the computer storage medium 1104. The processor 1101 (or CPU, Central Processing Unit) is the computing core and control core of the message processing apparatus, and is adapted to implement one or more instructions, in particular to load and execute one or more instructions to implement a corresponding method flow or a corresponding function. In one embodiment, the processor 1101 of an embodiment of the present invention may be configured to perform: acquiring a target message to be stored and a message tree corresponding to the target message, wherein the target message comprises a plurality of data information, the message tree comprises a plurality of leaf nodes, and each leaf node is used for storing one data information; if the message tree comprises a target leaf node corresponding to a target data column, performing columnar storage encoding on data information in the target leaf node to obtain an encoding result, and storing the encoding result in a target storage position in the target data column, wherein the target storage position corresponds to the target message; and if the message tree does not comprise the target leaf node corresponding to the target data column, determining an invalid mark position in the target data column according to the target storage position, and setting an invalid data mark at the invalid mark position.
In other embodiments, the processor 1101 of an embodiment of the present invention may be further configured to perform: receiving a decoding instruction for decoding a target message, and acquiring a plurality of data columns related to the target message, wherein the target message comprises a plurality of data information, and each data column stores one or more of the following: an encoding result related to the data information and an invalid data mark corresponding to the data information, the encoding result and the invalid data mark being stored in the corresponding data column by adopting the message processing method described above; traversing related storage locations in the plurality of data columns related to the target message in sequence; if an encoding result exists in the currently traversed relevant storage position, decoding the encoding result to obtain data information corresponding to the currently traversed relevant storage position in the target message; and if an invalid data mark exists in the currently traversed relevant storage position, skipping the currently traversed relevant storage position and continuing to traverse the next relevant storage position.
The embodiment of the invention also provides a computer storage medium (Memory), which is a Memory device in the node device and is used for storing programs and data. It will be appreciated that the computer storage media herein may include both built-in storage media in the message processing device and extended storage media supported by the message processing device. The computer storage media provides storage space that stores an operating system of the message processing device. Also stored in this memory space are one or more instructions, which may be one or more computer programs (including program code), adapted to be loaded and executed by the processor 1101. The computer storage medium herein may be a high-speed RAM memory or a non-volatile memory (non-volatile memory), such as at least one magnetic disk memory; optionally, at least one computer storage medium remote from the processor may be present.
In one embodiment, one or more instructions stored in a computer storage medium may be loaded and executed by the processor 1101 to implement the corresponding steps of the message processing method embodiments described above with respect to fig. 3 and 6. In a specific implementation, the one or more instructions in the computer storage medium are loaded by the processor 1101 to perform the following steps: acquiring a target message to be stored and a message tree corresponding to the target message, wherein the target message comprises a plurality of data information, the message tree comprises a plurality of leaf nodes, and each leaf node is used for storing one data information; if the message tree comprises a target leaf node corresponding to a target data column, performing columnar storage encoding on data information in the target leaf node to obtain an encoding result, and storing the encoding result in a target storage position in the target data column, wherein the target storage position corresponds to the target message; and if the message tree does not comprise the target leaf node corresponding to the target data column, determining an invalid mark position in the target data column according to the target storage position, and setting an invalid data mark at the invalid mark position.
In one embodiment, the plurality of pieces of data information included in the target message belong to at least one data type, each data type corresponds to one data column, and pieces of data information belonging to the same data type are stored in the same data column; the message tree comprises a plurality of leaf nodes, and each leaf node is used for storing one piece of data information of the target message; each leaf node corresponds to a node type, and the node types comprise a required type, an optional type and a repeatable type; the data value of the data information in a required-type leaf node is a non-null value, the data value of the data information in an optional-type leaf node is a null value or a non-null value, and the data value of the data information in a repeatable-type leaf node is a null value or a non-null value.
In one embodiment, the message tree further comprises a root node; the processor 1101 performs the following operations when performing columnar storage encoding on the data information in the target leaf node to obtain an encoding result: acquiring position information of the target message relative to the target data column, the position information comprising a target definition depth and a target repetition depth; and performing columnar storage encoding on the data information in the target leaf node and the position information of the target message to obtain the encoding result. The target repetition depth is the number of repeatable nodes in a second path in the message tree, the second path being the path between a second repeatable node in a first path in the message tree associated with the target leaf node and the root node. If the message tree includes the target leaf node, the target definition depth is the number of available empty nodes in a third path in the message tree, the third path being the path from the root node to the target leaf node in the message tree; if the message tree does not include the target leaf node, the target definition depth is the number of available nodes in the first path in the message tree associated with the target leaf node.
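As a rough illustration of how such depths can be counted along a path, the following Python sketch tallies node types. It rests on an interpretation that the text does not confirm: the nodes counted toward the definition depth are assumed to be those that may be empty (optional or repeatable). `NodeType`, `definition_depth` and `repetition_depth` are hypothetical names, and the caller is expected to pass whichever path the embodiment designates.

```python
from enum import Enum
from typing import Sequence


class NodeType(Enum):
    REQUIRED = "required"
    OPTIONAL = "optional"
    REPEATABLE = "repeatable"


def definition_depth(path: Sequence[NodeType]) -> int:
    """Count the nodes on the path that may be empty (assumed reading)."""
    return sum(1 for t in path if t in (NodeType.OPTIONAL, NodeType.REPEATABLE))


def repetition_depth(path: Sequence[NodeType]) -> int:
    """Count the repeatable nodes on the path."""
    return sum(1 for t in path if t is NodeType.REPEATABLE)


# Hypothetical path below the root, ending at a leaf node.
path = [NodeType.REQUIRED, NodeType.REPEATABLE, NodeType.OPTIONAL, NodeType.REPEATABLE]
```

For this example path, three nodes may be empty and two are repeatable, so the two counts differ even though they are taken over the same nodes.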
In one embodiment, the processor 1101 performs the following operations when determining an invalid mark position in the target data column according to the target storage position and setting an invalid data mark at the invalid mark position: acquiring a repetition depth range corresponding to the target data column; acquiring an invalid definition depth for setting the invalid data mark, and acquiring, based on the repetition depth range, an invalid repetition depth for setting the invalid data mark, the invalid repetition depth lying outside the repetition depth range; generating the invalid data mark based on the invalid definition depth and the invalid repetition depth; and taking the target storage position as the invalid mark position and adding the invalid data mark at the target storage position.
In one embodiment, the processor 1101 performs the following operations when acquiring, based on the repetition depth range, an invalid repetition depth for setting the invalid data mark: obtaining the maximum repetition depth or the minimum repetition depth in the repetition depth range; and acquiring a preset value, and processing the maximum repetition depth or the minimum repetition depth based on the preset value to obtain the invalid repetition depth.
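One way to obtain a repetition depth that cannot be mistaken for a valid one is sketched below in Python. The text only says that the maximum or minimum is "processed" based on a preset value; adding the preset to the maximum or subtracting it from the minimum is an assumption made here, as are the names `InvalidDataMark`, `invalid_repetition_depth` and `make_invalid_mark`.

```python
from typing import NamedTuple


class InvalidDataMark(NamedTuple):
    """Hypothetical representation of an invalid data mark as a pair of depths."""
    definition_depth: int
    repetition_depth: int


def invalid_repetition_depth(min_depth: int, max_depth: int,
                             preset: int = 1, from_max: bool = True) -> int:
    """Return a repetition depth outside the range [min_depth, max_depth]."""
    if min_depth > max_depth:
        raise ValueError("min_depth must not exceed max_depth")
    if preset <= 0:
        raise ValueError("preset must be positive so the result leaves the range")
    # Assumed processing: shift the chosen bound outward by the preset value.
    return max_depth + preset if from_max else min_depth - preset


def make_invalid_mark(invalid_definition_depth: int,
                      min_depth: int, max_depth: int) -> InvalidDataMark:
    """Combine an invalid definition depth with an out-of-range repetition depth."""
    return InvalidDataMark(invalid_definition_depth,
                           invalid_repetition_depth(min_depth, max_depth))
```

Because the resulting repetition depth never occurs in a real encoding result for that column, a reader can tell a mark from data by inspecting the depth alone.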
In one embodiment, the target data column includes a plurality of consecutive storage positions, and the data information of a plurality of messages to be stored is stored in the target data column in sequence. The processor 1101 performs the following operations when determining an invalid mark position in the target data column according to the target storage position and setting an invalid data mark at the invalid mark position: searching from the target storage position toward the front of the target data column for a first reference storage position that satisfies the following conditions: an encoding result is stored at the storage position immediately preceding the first reference storage position, and no encoding result is stored at any storage position between the first reference storage position and the target storage position; searching from the target storage position toward the back of the target data column for a second reference storage position that satisfies the following conditions: an encoding result is stored at the storage position immediately following the second reference storage position, and no encoding result is stored at any storage position between the second reference storage position and the target storage position; acquiring an invalid definition depth based on the number of messages to be stored that correspond to the storage positions between the first reference storage position and the second reference storage position, and acquiring an invalid repetition depth based on the repetition depth range corresponding to the target data column; and generating an invalid data mark based on the invalid definition depth and the invalid repetition depth, and adding the invalid data mark at the first reference storage position.
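The search for the two reference storage positions amounts to finding the run of consecutive positions without an encoding result that surrounds the target position. The Python sketch below makes several assumptions not stated in the text: a column is modeled as a list in which `None` means "no encoding result", the run is allowed to end at the column boundary, the invalid definition depth is taken to be the run length itself, and the invalid repetition depth is the maximum valid depth plus one. `find_missing_run` and `run_mark` are hypothetical names.

```python
from typing import List, Optional, Tuple


def find_missing_run(column: List[Optional[str]], target: int) -> Tuple[int, int]:
    """Return (first, second) reference positions around `target`.

    `first` is the earliest position of the run with no encoding result,
    `second` is the latest; both bounds are inclusive.
    """
    if column[target] is not None:
        raise ValueError("target position already holds an encoding result")
    first = target
    while first > 0 and column[first - 1] is None:
        first -= 1  # keep moving toward the front while cells are empty
    second = target
    while second + 1 < len(column) and column[second + 1] is None:
        second += 1  # keep moving toward the back while cells are empty
    return first, second


def run_mark(column: List[Optional[str]], target: int,
             max_repetition_depth: int) -> Tuple[int, Tuple[int, int]]:
    """Return (mark position, (invalid definition depth, invalid repetition depth))."""
    first, second = find_missing_run(column, target)
    missing_messages = second - first + 1
    return first, (missing_messages, max_repetition_depth + 1)


column: List[Optional[str]] = ["a", None, None, None, "b"]
```

With this layout a single mark at the first reference position can stand for the whole run of messages that have no value in the column, instead of one mark per message.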
In the embodiment of the invention, the message processing device acquires a target message to be stored and a message tree corresponding to the target message, wherein the target message comprises a plurality of pieces of data information. If it is detected that the message tree corresponding to the target message includes a target leaf node corresponding to the target data column, columnar storage encoding is performed on the data information in the target leaf node to obtain an encoding result, and the encoding result is stored at the target storage position of the target data column. If the message tree does not include a target leaf node corresponding to the target data column, an invalid mark position is determined in the target data column according to the target storage position, and an invalid data mark is set at the invalid mark position. It should be understood that when the message tree corresponding to the target message does not include the target leaf node corresponding to the target data column, no encoding is performed for that leaf node and an invalid data mark is added directly, which saves part of the encoding resources and improves message encoding efficiency.
In one embodiment, the one or more instructions stored in the computer storage medium provided in the embodiments of the present invention may be loaded and executed by the processor 1101 to implement the corresponding steps of the message processing method embodiment described above with respect to fig. 7. Specifically, the one or more instructions in the computer storage medium are loaded and executed by the processor 1101 to perform the following steps: receiving a decoding instruction for decoding a target message, and acquiring a plurality of data columns related to the target message, wherein the target message comprises a plurality of pieces of data information, and each data column stores one or more of the following: an encoding result related to the data information and an invalid data mark corresponding to the data information, the encoding result and the invalid data mark having been stored in the corresponding data column by the message processing method described above; traversing, in sequence, the related storage positions in the plurality of data columns related to the target message; if an encoding result is stored at the currently traversed related storage position, decoding the encoding result to obtain the data information in the target message corresponding to the currently traversed related storage position; and if an invalid data mark is stored at the currently traversed related storage position, skipping the currently traversed related storage position and continuing to traverse the next related storage position.
In the embodiment of the invention, the message processing device maps each piece of data information included in the target message to a data column, and each data column stores an encoding result or an invalid data mark related to the data information. When a decoding instruction for decoding the target message is received, the storage positions related to the target message in the plurality of data columns are traversed in sequence; if an encoding result is stored at the currently traversed related storage position, the encoding result is decoded to obtain the data information in the target message corresponding to that storage position; if an invalid data mark is stored at the currently traversed related storage position, that storage position is skipped and traversal continues with the next related storage position. In this way, the amount of data to be decoded is reduced, part of the decoding power consumption is saved, and decoding efficiency is improved.
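The decoding traversal can be sketched in Python as follows. The sketch is illustrative only: `INVALID` is a hypothetical sentinel standing in for the invalid data mark, encoding results are assumed to be UTF-8 bytes, and the storage position related to a message is assumed to be the same index in every column.

```python
from typing import Dict, List

# Hypothetical sentinel standing in for the invalid data mark.
INVALID = object()


def decode_message(columns: Dict[str, List[object]], position: int) -> Dict[str, str]:
    """Rebuild one message from the cell at `position` in each data column."""
    decoded: Dict[str, str] = {}
    for name, column in columns.items():
        cell = column[position]
        if cell is INVALID:
            # Invalid data mark: skip this storage position, nothing to decode.
            continue
        assert isinstance(cell, bytes)
        # Encoding result: decode it (UTF-8 is an assumption of this sketch).
        decoded[name] = cell.decode("utf-8")
    return decoded


columns: Dict[str, List[object]] = {
    "platform": [b"app", b"web"],
    "region": [b"cn", INVALID],
}
```

Decoding position 1 touches the `region` column only long enough to recognize the mark, so no decoding work is spent on data the message never contained.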
The above disclosure is illustrative only of some embodiments of the invention and is not intended to limit the scope of the invention, which is defined by the claims and their equivalents.

Claims (12)

1. A method of message processing comprising:
acquiring a target message to be stored and a message tree corresponding to the target message, wherein the target message comprises a plurality of pieces of data information, the message tree comprises a plurality of leaf nodes, and each leaf node is used for storing one piece of data information;
if the message tree comprises a target leaf node corresponding to a target data column, performing columnar storage encoding on the data information in the target leaf node to obtain an encoding result, and storing the encoding result at a target storage position in the target data column, wherein the target storage position corresponds to the target message;
and if the message tree does not comprise a target leaf node corresponding to the target data column, determining an invalid mark position in the target data column according to the target storage position, and setting an invalid data mark at the invalid mark position.
2. The method of claim 1, wherein the plurality of data information included in the target message belongs to at least one data type, each data type corresponds to a data column, and the plurality of data information belonging to the same data type is stored to the same data column;
the message tree comprises a plurality of leaf nodes, and each leaf node is used for storing one piece of data information of the target message; each leaf node corresponds to a node type, and the node types comprise a required type, an optional type and a repeatable type; the data value of the data information in a required-type leaf node is a non-null value, the data value of the data information in an optional-type leaf node is a null value or a non-null value, and the data value of the data information in a repeatable-type leaf node is a null value or a non-null value.
3. The method of claim 2, wherein the message tree further comprises a root node; the performing columnar storage encoding on the data information in the target leaf node to obtain an encoding result comprises:
acquiring position information of the target message relative to the target data column, the position information comprising a target definition depth and a target repetition depth;
performing columnar storage encoding on the data information in the target leaf node and the position information of the target message to obtain the encoding result;
the target repetition depth is the number of repeatable nodes in a second path in the message tree, the second path being the path between a second repeatable node in a first path in the message tree associated with the target leaf node and the root node;
if the message tree includes the target leaf node, the target definition depth is the number of available empty nodes in a third path in the message tree, the third path being the path from the root node to the target leaf node in the message tree; if the message tree does not include the target leaf node, the target definition depth is the number of available nodes in the first path in the message tree associated with the target leaf node.
4. The method of claim 1, wherein the determining an invalid mark position in the target data column according to the target storage position, and setting an invalid data mark at the invalid mark position, comprises:
acquiring a repetition depth range corresponding to the target data column;
acquiring an invalid definition depth for setting the invalid data mark, and acquiring, based on the repetition depth range, an invalid repetition depth for setting the invalid data mark, the invalid repetition depth lying outside the repetition depth range;
generating the invalid data mark based on the invalid definition depth and the invalid repetition depth;
taking the target storage position as the invalid mark position, and adding the invalid data mark at the target storage position.
5. The method of claim 4, wherein the acquiring, based on the repetition depth range, an invalid repetition depth for setting the invalid data mark comprises:
obtaining the maximum repetition depth or the minimum repetition depth in the repetition depth range;
and acquiring a preset value, and processing the maximum repetition depth or the minimum repetition depth based on the preset value to obtain the invalid repetition depth.
6. The method of claim 1, wherein the target data column includes a plurality of consecutive storage positions, the data information of a plurality of messages to be stored is stored in the target data column in sequence, and the determining an invalid mark position in the target data column according to the target storage position, and setting an invalid data mark at the invalid mark position, comprises:
searching from the target storage position toward the front of the target data column for a first reference storage position that satisfies the following conditions: an encoding result is stored at the storage position immediately preceding the first reference storage position, and no encoding result is stored at any storage position between the first reference storage position and the target storage position;
searching from the target storage position toward the back of the target data column for a second reference storage position that satisfies the following conditions: an encoding result is stored at the storage position immediately following the second reference storage position, and no encoding result is stored at any storage position between the second reference storage position and the target storage position;
acquiring an invalid definition depth based on the number of messages to be stored that correspond to the storage positions between the first reference storage position and the second reference storage position, and acquiring an invalid repetition depth based on the repetition depth range corresponding to the target data column;
and generating an invalid data mark based on the invalid definition depth and the invalid repetition depth, and adding the invalid data mark at the first reference storage position.
7. A method of message processing comprising:
receiving a decoding instruction for decoding a target message, and acquiring a plurality of data columns related to the target message, wherein the target message comprises a plurality of pieces of data information, and each data column stores one or more of the following: an encoding result related to the data information and an invalid data mark corresponding to the data information, wherein the encoding result and the invalid data mark are stored in the corresponding data column by the message processing method according to any one of claims 1-6;
traversing, in sequence, the related storage positions in the plurality of data columns related to the target message;
if an encoding result is stored at the currently traversed related storage position, decoding the encoding result to obtain the data information in the target message corresponding to the currently traversed related storage position;
and if an invalid data mark is stored at the currently traversed related storage position, skipping the currently traversed related storage position, and continuing to traverse the next related storage position.
8. The method of claim 7, wherein the target message comprises an exposure message of advertisement data, the exposure message comprising exposure platform information for posting the advertisement data, the exposure platform information comprising a type of exposure platform, the exposure platform information corresponding to a target data column in the columnar storage database, the method further comprising:
responding to a query instruction of exposure platform information, and counting the number of each type of exposure platform of the target data column;
outputting the number of each type of exposure platform so that advertisement data is subsequently put on the basis of the number of each type of exposure platform.
9. A message processing apparatus, comprising:
an acquisition unit, configured to acquire a target message to be stored, wherein the target message comprises a plurality of pieces of data information;
the acquisition unit is further configured to acquire a message tree corresponding to the target message, wherein the message tree comprises a plurality of leaf nodes, and each leaf node is used for storing one piece of data information;
a processing unit, configured to, if the message tree comprises a target leaf node corresponding to a target data column, perform columnar storage encoding on the data information in the target leaf node to obtain an encoding result, and store the encoding result at a target storage position in the target data column, wherein the target storage position corresponds to the target message;
the processing unit is further configured to, if the message tree does not comprise a target leaf node corresponding to the target data column, determine an invalid mark position in the target data column according to the target storage position, and set an invalid data mark at the invalid mark position.
10. A message processing apparatus, comprising:
a receiving unit, configured to receive a decoding instruction for decoding a target message, and acquire a plurality of data columns related to the target message, wherein the target message comprises a plurality of pieces of data information, and each data column stores one or more of the following: an encoding result related to the data information and an invalid data mark corresponding to the data information, wherein the encoding result and the invalid data mark are stored in the corresponding data column by the message processing method according to any one of claims 1-6;
a processing unit, configured to traverse, in sequence, the related storage positions in the plurality of data columns related to the target message;
the processing unit is further configured to, if an encoding result is stored at the currently traversed related storage position, decode the encoding result to obtain the data information in the target message corresponding to the currently traversed related storage position;
and the processing unit is further configured to, if an invalid data mark is stored at the currently traversed related storage position, skip the currently traversed related storage position and continue to traverse the next related storage position.
11. A message processing apparatus, comprising:
a processor adapted to implement one or more instructions; and
a computer storage medium storing one or more instructions adapted to be loaded by the processor and to perform the message processing method according to any of claims 1-6 or the message processing method according to any of claims 8-9.
12. A computer storage medium having stored therein computer program instructions for performing the message processing method of any of claims 1-6 or the message processing method of any of claims 8-9 when executed by a processor.
CN202010106534.8A 2020-02-20 2020-02-20 Message processing method, device, message processing equipment and storage medium Active CN113282578B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010106534.8A CN113282578B (en) 2020-02-20 2020-02-20 Message processing method, device, message processing equipment and storage medium

Publications (2)

Publication Number Publication Date
CN113282578A CN113282578A (en) 2021-08-20
CN113282578B true CN113282578B (en) 2024-07-09

Family

ID=77275321

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010106534.8A Active CN113282578B (en) 2020-02-20 2020-02-20 Message processing method, device, message processing equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113282578B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102396222A (en) * 2009-06-09 2012-03-28 索尼公司 Adaptive entropy coding for images and videos using set partitioning in generalized hierarchical trees
CN103678556A (en) * 2013-12-06 2014-03-26 华为技术有限公司 Method for processing column-oriented database and processing equipment

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8341134B2 (en) * 2010-12-10 2012-12-25 International Business Machines Corporation Asynchronous deletion of a range of messages processed by a parallel database replication apply process
US10324961B2 (en) * 2017-01-17 2019-06-18 International Business Machines Corporation Automatic feature extraction from a relational database
CN107016071B (en) * 2017-03-23 2019-06-18 中国科学院计算技术研究所 A kind of method and system using simple path characteristic optimization tree data
CN110362572B (en) * 2019-06-25 2022-07-01 浙江邦盛科技股份有限公司 Sequential database system based on column type storage

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102396222A (en) * 2009-06-09 2012-03-28 索尼公司 Adaptive entropy coding for images and videos using set partitioning in generalized hierarchical trees
CN103678556A (en) * 2013-12-06 2014-03-26 华为技术有限公司 Method for processing column-oriented database and processing equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant