CN110688387A - Data processing method and device - Google Patents


Info

Publication number
CN110688387A
Authority
CN
China
Prior art keywords
attribute
user behavior
read
memory
behavior
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910958650.XA
Other languages
Chinese (zh)
Inventor
李金凤
吴丁
蔡炳炎
陈光尧
谢睿
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Quwan Network Technology Co Ltd
Original Assignee
Guangzhou Quwan Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Quwan Network Technology Co Ltd filed Critical Guangzhou Quwan Network Technology Co Ltd
Priority to CN201910958650.XA priority Critical patent/CN110688387A/en
Publication of CN110688387A publication Critical patent/CN110688387A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22Indexing; Data structures therefor; Storage structures
    • G06F16/2282Tablespace storage structures; Management thereof
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21Design, administration or maintenance of databases
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23Updating
    • G06F16/2308Concurrency control
    • G06F16/2336Pessimistic concurrency control approaches, e.g. locking or multiple versions without time stamps
    • G06F16/2343Locking methods, e.g. distributed locking or locking implementation details

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a data processing method and a data processing device. Service data to be processed is acquired and parsed in real time to obtain its user behavior attribute. If the user behavior attribute does not exist in a memory, a distributed lock is started, the user behavior attribute and the data type of its attribute value are added to the memory, and a behavior read-write table partition field is expanded, the behavior read-write table partition being used to store the attribute value corresponding to the user behavior attribute. The user behavior attribute and/or its attribute value is stored in the memory, the user behavior attribute and its corresponding attribute value are spliced into a character string, and the character string is written into the behavior read-write table partition to generate a wide table. The method thus expands the behavior read-write table partition fields automatically and in real time, generates a new wide table, and reduces the cost of manually maintaining the database.

Description

Data processing method and device
Technical Field
The invention belongs to the technical field of computers, and particularly relates to a data processing method and device.
Background
With the progress of science and technology, databases can be constructed in a variety of ways, one of which is to build the database from wide tables. When service data to be processed needs to be stored in a wide table in the database, a new wide table is generated manually and the service data is stored in it.
However, with manually generated wide tables, the fields must be expanded by hand each time, and because fields are added frequently, this manual expansion increases the maintenance cost of the database.
Disclosure of Invention
In view of the above, an object of the present invention is to provide a data processing method and apparatus that solve the problem of increased database maintenance cost caused by manually generating wide tables. The technical scheme is as follows:
the invention provides a data processing method, which comprises the following steps:
acquiring and analyzing service data to be processed in real time to obtain user behavior attributes of the service data to be processed;
if the user behavior attribute does not exist in the memory, starting a distributed lock, and adding the user behavior attribute and the data type of the attribute value of the user behavior attribute into the memory;
expanding a behavior read-write table partition field, wherein the behavior read-write table partition is used for storing an attribute value corresponding to the user behavior attribute;
storing the user behavior attribute and/or the attribute value of the user behavior attribute into the memory;
and splicing the user behavior attributes and attribute values corresponding to the user behavior attributes into character strings, and writing the character strings into the behavior read-write table partitions to generate the wide table.
Preferably, the method further comprises:
if the user behavior attribute exists in the memory, judging whether the data type of the attribute value of the user behavior attribute is consistent with the data type in the memory;
if so, storing the user behavior attribute and/or the attribute value of the user behavior attribute in the memory;
if not, defining the attribute value of the user behavior attribute as abnormal data, and writing the attribute value into an abnormal log file.
Preferably, after expanding the behavior read-write table partition field, the method further includes:
and releasing the distributed lock.
Preferably, the splicing the user behavior attribute and the attribute value corresponding to the user behavior attribute into a character string, and writing the character string into the behavior read-write table partition includes:
splicing the user behavior attributes and attribute values corresponding to the user behavior attributes into character strings according to the sequence of adding the user behavior attributes to the memory;
and writing the character string into the behavior read-write table partition.
Preferably, after the splicing the user behavior attribute and the attribute value corresponding to the user behavior attribute into a character string, writing the character string into the behavior read-write table partition and generating the wide table, the method further includes:
and storing the character strings in the behavior read-write table partition into the behavior read-write table partition in a fast read file format.
Preferably, the storing the character strings in the behavior read-write table partition into the behavior read-write table partition in a fast read file format includes:
starting a data rotation device based on a preset time period;
obtaining an offset value of the behavior read-write table partition;
and reading the character strings in the behavior read-write table partition based on the offset value, and storing the read character strings in the behavior read-write table partition in a fast read file format.
The present invention also provides a data processing apparatus, the apparatus comprising:
the analysis module is used for acquiring and analyzing the service data to be processed in real time to obtain the user behavior attribute of the service data to be processed;
the adding module is used for starting a distributed lock and adding the user behavior attribute and the data type of the attribute value of the user behavior attribute into the memory if the user behavior attribute does not exist in the memory;
the extension module is used for extending the behavior read-write table partition field, and the behavior read-write table partition is used for storing the attribute value corresponding to the user behavior attribute;
the first storage module is used for storing the user behavior attribute and/or the attribute value of the user behavior attribute into the memory;
and the generating module is used for splicing the user behavior attributes and the attribute values corresponding to the user behavior attributes into character strings, and writing the character strings into the behavior read-write table partition to generate the wide table.
Preferably, the apparatus further comprises:
the judging module is used for judging whether the data type of the attribute value of the user behavior attribute is consistent with the data type in the memory or not if the user behavior attribute exists in the memory;
the second storage module is used for storing the user behavior attribute and/or the attribute value of the user behavior attribute in the memory if the data type of the attribute value of the user behavior attribute is consistent with the data type in the memory;
and the writing module is used for defining the attribute value of the user behavior attribute as abnormal data and writing the attribute value into an abnormal log file if the data type of the attribute value of the user behavior attribute is inconsistent with the data type in the memory.
Preferably, the apparatus further comprises:
a release module to release the distributed lock.
Preferably, the generating module includes:
the splicing unit is used for splicing the user behavior attribute and the attribute value corresponding to the user behavior attribute into a character string according to the sequence of adding the user behavior attribute to the memory;
and the writing unit is used for writing the character string into the behavior read-write table partition.
Compared with the prior art, the technical scheme provided by the invention has the following advantages:
Service data to be processed is acquired and parsed in real time to obtain its user behavior attribute. If the user behavior attribute does not exist in the memory, a distributed lock is started, the user behavior attribute and the data type of its attribute value are added to the memory, and the behavior read-write table partition field is expanded, the behavior read-write table partition being used to store the attribute value corresponding to the user behavior attribute. The user behavior attribute and/or its attribute value is stored in the memory, the user behavior attribute and its corresponding attribute value are spliced into a character string, and the character string is written into the behavior read-write table partition to generate a wide table. The method thus expands the behavior read-write table partition fields automatically and in real time, generates a new wide table, and reduces the cost of manually maintaining the database.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
Fig. 1 is a flowchart of a data processing method according to an embodiment of the present invention;
Fig. 2 is a flowchart for splicing a user behavior attribute and attribute values corresponding to the user behavior attribute into a character string and writing the character string into a behavior read-write table partition according to the embodiment of the present invention;
Fig. 3 is a flowchart illustrating storing a character string in a behavior read-write table partition into the behavior read-write table partition in a fast read file format according to an embodiment of the present invention;
Fig. 4 is a flowchart of another data processing method provided by the embodiments of the present invention;
Fig. 5 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present invention.
Detailed Description
The invention provides a data processing method and device, which are used for solving the problem that the maintenance cost of a database is increased due to manual generation of a wide table.
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Fig. 1 shows a data processing method provided by an embodiment of the present invention. The method includes the following steps:
S101, acquiring and analyzing the service data to be processed in real time to obtain the user behavior attribute of the service data to be processed.
In a specific implementation, a client or an ETL (Extract-Transform-Load) log collection device reports the service data to be processed in real time, and S101 is executed to obtain the reported data so that it can be processed. Processing the received service data mainly consists of parsing it to obtain its user behavior attributes. It should be noted that the sources of the service data reported in real time include, but are not limited to, the client and the ETL log collection device.
The obtained user behavior attribute of the service data to be processed refers to: a user event behavior attribute.
S102, judging whether the user behavior attribute exists in the memory; if not, executing S103; if so, executing S104.
In the process of executing S102, after the user behavior attribute corresponding to the service data to be processed is obtained, it is determined whether that attribute exists in the memory. If it does not exist in the memory, S103 is executed; if it does, S104 is executed.
For example, in a login event, the user behavior attribute corresponding to the login event includes a login client, a version of the login client, and a login time. And judging whether user behavior attributes such as a login client, a version of the login client, login time and the like exist in the memory.
It should be noted that, whether the user behavior attribute exists in the memory may be determined by a comparison method.
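The comparison in S102 can be sketched as a simple membership test. This is a minimal illustration only: the dictionary `schema_in_memory`, the function `missing_attributes`, and the attribute names are hypothetical and are not taken from the patent.

```python
# Hypothetical in-memory record of known attributes: attribute name -> data type name.
schema_in_memory = {"login_client": "str", "login_time": "int"}


def missing_attributes(event: dict, schema: dict) -> list:
    """Return the attribute names in `event` that are not yet present in memory."""
    return [name for name in event if name not in schema]


# A parsed login event (illustrative values).
event = {
    "login_client": "android",
    "login_client_version": "2.3.1",
    "login_time": 1570665600,
}

# Attributes listed here would take the S103 branch; the others go to S104.
print(missing_attributes(event, schema_in_memory))  # ['login_client_version']
```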
S103, starting the distributed lock, and adding the user behavior attribute and the data type of the attribute value of the user behavior attribute into the memory.
In S103, distributed locking is one way to control synchronous access to shared resources between distributed systems. In distributed systems, it is often necessary to coordinate their actions. If one or a group of resources are shared among different distributed systems or among different hosts of the same distributed system, then access to these resources often requires mutual exclusion to prevent interference with each other to ensure consistency, in which case a distributed lock is used.
In the process of executing S103, if the user behavior attribute corresponding to the service data to be processed does not exist in the memory, the distributed lock is started, and the current process adds the user behavior attribute and the data type of its attribute value to the memory.
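One way to picture S103 is the sketch below. It assumes a single process and uses `threading.Lock` as a local stand-in for the distributed lock; a real deployment would need a lock service shared across hosts, which the patent does not name. The class `SchemaRegistry` and its method are hypothetical.

```python
import threading


class SchemaRegistry:
    """Hypothetical in-memory registry: attribute name -> data type name."""

    def __init__(self, lock=None):
        # Stand-in for the distributed lock; any object usable in `with` works here.
        self._lock = threading.Lock() if lock is None else lock
        self.types = {}  # dicts keep insertion order, i.e. the order attributes were added

    def register_if_absent(self, name, value):
        """Add the attribute and the data type of its value; return True if it was new."""
        if name in self.types:
            return False
        with self._lock:
            # Re-check under the lock: another worker may have registered it meanwhile.
            if name in self.types:
                return False
            self.types[name] = type(value).__name__
            return True


registry = SchemaRegistry()
print(registry.register_if_absent("login_client", "android"))  # True
print(registry.register_if_absent("login_client", "ios"))      # False
print(registry.types)                                           # {'login_client': 'str'}
```

Re-checking inside the lock is a design choice of this sketch, so that two workers seeing the same new attribute do not both register it.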
And S104, judging whether the data type of the attribute value of the user behavior attribute is consistent with the data type in the memory, if not, executing S105, and if so, executing S107.
In the process of executing S104, the user behavior attribute corresponding to the service data to be processed has a corresponding attribute value, and different attribute values correspond to different data types. Therefore, after it is determined that the user behavior attribute corresponding to the service data to be processed exists in the memory, it is further required to determine whether the data type of the attribute value of the user behavior attribute is consistent with the data type in the memory.
And S105, defining the attribute value of the user behavior attribute as abnormal data, and writing the attribute value into an abnormal log file.
In the process of executing S105, if the data type in the memory is inconsistent with the data type of the attribute value of the user behavior attribute, the attribute value of the user behavior attribute is defined as abnormal data and written into an abnormal log file.
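The type check of S104 and the abnormal-data branch of S105 might look like the following sketch. The function name `is_consistent`, the JSON record layout, and the use of a file-like object for the abnormal log are assumptions made for illustration; the patent does not specify the log format.

```python
import io
import json


def is_consistent(name, value, schema, abnormal_log):
    """Compare the value's data type with the one recorded in memory (S104).

    On a mismatch, write a record to the abnormal log and return False (S105).
    """
    expected = schema[name]
    actual = type(value).__name__
    if actual == expected:
        return True
    record = {"attribute": name, "value": repr(value), "expected": expected, "actual": actual}
    abnormal_log.write(json.dumps(record) + "\n")
    return False


schema = {"login_time": "int"}
log = io.StringIO()  # stands in for the abnormal log file
print(is_consistent("login_time", 1570665600, schema, log))    # True
print(is_consistent("login_time", "2019-10-10", schema, log))  # False
print(log.getvalue().strip())
```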
And S106, expanding the behavior read-write table partition field.
In S106, the behavior read-write table partition is used to store the attribute value corresponding to the user behavior attribute, and in the process of executing S106, the storage space of the behavior read-write table partition is expanded by expanding the field of the behavior read-write table partition to store the new attribute value corresponding to the user behavior attribute.
For example, when an attribute value corresponding to a new user behavior attribute needs to be stored in the behavior read-write table partition, a new field must be defined so that the attribute value can be stored under it. For instance, a login client field is defined for storing login client data.
It should be noted that what kind of field is expanded may be defined according to the user behavior attribute, and is not described herein again.
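Expanding a field as in S106 can be illustrated with an in-memory SQLite table. This is only an analogy under stated assumptions: the patent does not say which storage engine holds the behavior read-write table partition, and the table name `behavior_rw`, the function `expand_field`, and the type mapping are hypothetical.

```python
import sqlite3

# Hypothetical mapping from in-memory data type names to column types.
TYPE_MAP = {"str": "TEXT", "int": "INTEGER", "float": "REAL"}


def expand_field(conn, table, column, type_name):
    """Add `column` to `table` if it is not there yet; return True if it was added.

    Identifiers are interpolated directly, so they must come from a trusted source.
    """
    existing = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    if column in existing:
        return False
    conn.execute(f"ALTER TABLE {table} ADD COLUMN {column} {TYPE_MAP[type_name]}")
    return True


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE behavior_rw (login_time INTEGER)")
print(expand_field(conn, "behavior_rw", "login_client", "str"))  # True
print(expand_field(conn, "behavior_rw", "login_client", "str"))  # False
print([row[1] for row in conn.execute("PRAGMA table_info(behavior_rw)")])
```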
And S107, storing the user behavior attribute into a memory.
In the process of executing S107, the user behavior attribute obtained by parsing is compared with the user behavior attributes already loaded in the memory; if they are the same, it is determined that the parsed user behavior attribute already exists in the memory. It is then determined whether the data type of the attribute value of the parsed user behavior attribute matches the data type recorded in the memory, and if so, the parsed user behavior attribute is stored in the memory.
It should be noted that the user behavior attribute can be stored in the memory only if both the parsed user behavior attribute and the data type of its attribute value exist in the memory.
In addition to storing the user behavior attribute in the memory, the attribute value of the user behavior attribute may also be stored in the memory.
In the embodiment of the present invention, both may be stored together: when S107 is executed, after the user behavior attribute is stored in the memory, the attribute value of the user behavior attribute is also stored in the memory.
That is, in S107, the user behavior attribute, its attribute value, or both may be stored in the memory.
S108, splicing the user behavior attributes and the attribute values corresponding to the user behavior attributes into character strings, and writing the character strings into the behavior read-write table partitions to generate the wide table.
In the process of executing S108, the attribute values corresponding to the user behavior attributes are first spliced into a character string, the spliced character string is then written into the behavior read-write table partition, and finally a new wide table is generated.
For example: the values of the 3 user behavior attributes are A, B and C respectively, and the character string ABC is obtained after splicing.
It should be noted that, a specific implementation process of splicing the user behavior attribute and the attribute value corresponding to the user behavior attribute into a character string and writing the character string into the behavior read-write table partition, as shown in fig. 2, mainly includes:
s201, splicing the user behavior attribute and the attribute value corresponding to the user behavior attribute into a character string according to the sequence of adding the user behavior attribute to the memory.
In the process of executing S201, attribute values corresponding to the user behavior attributes are spliced into a character string according to the sequence in which the user behavior attributes obtained through analysis are stored in the memory.
For example: in a login event, the user behavior attributes obtained through parsing are the login time, the login client, and the login client version, with attribute values 1, 2, and 3 respectively. When the character string is spliced, the attribute value of the login time is placed first, that of the login client second, and that of the login client version third, so the resulting character string is "123".
It should be noted that, besides splicing the attribute values in that order, the attribute values may also be spliced into a character string in a random order and then written into the behavior read-write table partition.
And S202, writing the character string into the behavior read-write table partition.
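Steps S201 and S202 can be sketched as follows, reproducing the "123" example above. The function `splice`, the optional separator, and the use of a Python list as the behavior read-write table partition are assumptions for illustration.

```python
def splice(event, attribute_order, sep=""):
    """Join attribute values in the order the attributes were added to memory (S201).

    Attributes missing from the event contribute an empty string.
    """
    return sep.join(
        "" if event.get(name) is None else str(event[name])
        for name in attribute_order
    )


# Order in which the attributes were added to memory (illustrative).
attribute_order = ["login_time", "login_client", "login_client_version"]

# The event's own key order does not matter; the memory order decides.
event = {"login_client_version": 3, "login_time": 1, "login_client": 2}

partition = []                       # stands in for the behavior read-write table partition
row = splice(event, attribute_order)
partition.append(row)                # S202: write the character string
print(row)                           # 123
```

In practice a separator such as a tab (`sep="\t"`) would probably be needed so that the values can be split back into fields; the patent's example shows none.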
Optionally, after S108 is executed, the character strings in the behavior read-write table partition may also be stored in the behavior read-write table partition in a fast read file format. The specific implementation process is shown in fig. 3, and mainly includes:
S301, starting the data rotation device based on a preset time period.
In the process of executing S301, a working time period is set for the data rotation device so that the device operates within that period.
It should be noted that the working time period of the data rotation device is set according to actual conditions, and is not described herein again.
S302, obtaining the offset value of the behavior read-write table partition.
S303, reading the character strings in the behavior read-write table partition based on the offset value, and storing the read character strings in the behavior read-write table partition in a fast read file format.
In the process of executing S303, all data in the behavior read-write table partition from the offset value up to the current data is read according to the offset value, and the read data is stored in the behavior read-write table partition in the fast read file format.
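The offset-based read of S302 and S303 can be sketched as below. The patent does not name the fast read file format, so the column-wise layout produced by `to_columnar` is only a stand-in, and the tab separator, function names, and sample rows are hypothetical. The periodic trigger of S301 is omitted.

```python
def read_from_offset(rows, offset):
    """Return every row from `offset` up to the current end, plus the new offset."""
    batch = rows[offset:]
    return batch, offset + len(batch)


def to_columnar(batch, columns, sep="\t"):
    """Stand-in for a fast read file format: pivot delimited rows into per-column lists."""
    table = {name: [] for name in columns}
    for row in batch:
        for name, value in zip(columns, row.split(sep)):
            table[name].append(value)
    return table


# Rows already written to the partition as spliced character strings (illustrative).
rows = ["1\tandroid", "2\tios", "3\tweb"]

batch, new_offset = read_from_offset(rows, offset=1)
print(batch)       # ['2\tios', '3\tweb']
print(new_offset)  # 3
print(to_columnar(batch, ["login_time", "login_client"]))
```

Keeping the returned offset lets the next rotation run read only rows written since the previous run.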
Based on the data processing method disclosed in this embodiment of the present invention, service data to be processed is acquired and parsed in real time to obtain its user behavior attribute. If the user behavior attribute does not exist in the memory, a distributed lock is started, the user behavior attribute and the data type of its attribute value are added to the memory, and the behavior read-write table partition field is expanded, the behavior read-write table partition being used to store the attribute value corresponding to the user behavior attribute. The user behavior attribute and/or its attribute value is stored in the memory, the user behavior attribute and its corresponding attribute value are spliced into a character string, and the character string is written into the behavior read-write table partition to generate a wide table. The method thus expands the behavior read-write table partition fields automatically and in real time, generates a new wide table, and reduces the cost of manually maintaining the database.
As shown in fig. 4, a flowchart of another data processing method provided in the embodiment of the present invention mainly includes:
s401, acquiring and analyzing the service data to be processed in real time to obtain the user behavior attribute of the service data to be processed.
S402, judging whether the user behavior attribute exists in the memory, if not, executing S403, and if so, executing S404.
And S403, starting the distributed lock, and adding the user behavior attribute and the data type of the attribute value of the user behavior attribute into the memory.
And S404, judging whether the data type of the attribute value of the user behavior attribute is consistent with the data type in the memory, if not, executing S405, and if so, executing S408.
S405, defining the attribute value of the user behavior attribute as abnormal data, and writing the attribute value into an abnormal log file.
S406, expanding the behavior read-write table partition field.
The execution principle of S401 to S406 is the same as that of S101 to S106, and is not described herein again.
S407, the distributed lock is released.
And S408, storing the user behavior attribute into a memory.
And S409, splicing the user behavior attributes and the attribute values corresponding to the user behavior attributes into character strings, writing the character strings into the behavior read-write table partitions, and generating the wide table.
The execution principle of S408 and S409 is the same as that of S107 and S108 described above, and will not be described herein again.
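The flow of Fig. 4, including the lock release of S407, can be sketched end to end as follows. This is a single-process illustration under several assumptions: `threading.Lock` stands in for the distributed lock, Python containers stand in for the memory and the behavior read-write table partition, and all names are hypothetical.

```python
import threading


def process(event, schema, columns, partition, abnormal, lock):
    """Process one parsed event; return True if a row was written."""
    for name, value in event.items():
        actual = type(value).__name__
        if name not in schema:                 # S402: attribute not in memory
            with lock:                         # S403: start the lock
                if name not in schema:         # re-check while holding the lock
                    schema[name] = actual      # add attribute and data type
                    columns.append(name)       # S406: expand the partition field
            # leaving the `with` block releases the lock (S407)
        if schema[name] != actual:             # S404: data type inconsistent
            abnormal.append((name, value))     # S405: record abnormal data
            return False
    # S408 and S409: the attributes are in memory; splice the values and write the row.
    partition.append("\t".join(str(event.get(c, "")) for c in columns))
    return True


schema, columns, partition, abnormal = {}, [], [], []
lock = threading.Lock()

print(process({"login_time": 1, "login_client": "ios"}, schema, columns, partition, abnormal, lock))
print(process({"login_time": "oops"}, schema, columns, partition, abnormal, lock))
print(columns)    # ['login_time', 'login_client']
print(partition)  # ['1\tios']
print(abnormal)   # [('login_time', 'oops')]
```

Using `with lock:` guarantees the release even if registration raises an exception, which is one reasonable way to realise the explicit release step.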
Based on the data processing method disclosed in this embodiment of the present invention, service data to be processed is acquired and parsed in real time to obtain its user behavior attribute. If the user behavior attribute does not exist in the memory, a distributed lock is started, the user behavior attribute and the data type of its attribute value are added to the memory, and the behavior read-write table partition field is expanded, the behavior read-write table partition being used to store the attribute value corresponding to the user behavior attribute. The user behavior attribute and/or its attribute value is stored in the memory, the user behavior attribute and its corresponding attribute value are spliced into a character string, and the character string is written into the behavior read-write table partition to generate a wide table. The method thus expands the behavior read-write table partition fields automatically and in real time, generates a new wide table, and reduces the cost of manually maintaining the database.
Based on the data processing method disclosed in the foregoing embodiment of the present invention, an embodiment of the present invention further discloses a data processing apparatus correspondingly, as shown in fig. 5, which is a schematic structural diagram of a data processing apparatus provided in the embodiment of the present invention, and the data processing apparatus mainly includes: parsing module 50, adding module 51, expanding module 52, first storing module 53 and generating module 54.
And the analysis module 50 is configured to obtain and analyze the service data to be processed in real time to obtain a user behavior attribute of the service data to be processed.
And an adding module 51, configured to start the distributed lock if the user behavior attribute does not exist in the memory, and add the user behavior attribute and the data type of the attribute value of the user behavior attribute to the memory.
And the extension module 52 is configured to extend the behavior read-write table partition field, where the behavior read-write table partition is used to store an attribute value corresponding to the user behavior attribute.
The first storage module 53 is configured to store the user behavior attribute and/or the attribute value of the user behavior attribute in the memory.
And the generating module 54 is configured to splice the user behavior attribute and the attribute value corresponding to the user behavior attribute into a character string, write the character string into the behavior read-write table partition, and generate the wide table.
An optional structure of the generating module 54 in the apparatus embodiment of the present invention is as follows: the generating module 54 includes a splicing unit and a writing unit.
And the splicing unit is used for splicing the user behavior attribute and the attribute value corresponding to the user behavior attribute into a character string according to the sequence of adding the user behavior attribute to the memory.
And the writing unit is used for writing the character string into the behavior read-write table partition.
Based on the data processing device disclosed in this embodiment of the present invention, service data to be processed is acquired and parsed in real time to obtain its user behavior attribute. If the user behavior attribute does not exist in the memory, a distributed lock is started, the user behavior attribute and the data type of its attribute value are added to the memory, and the behavior read-write table partition field is expanded, the behavior read-write table partition being used to store the attribute value corresponding to the user behavior attribute. The user behavior attribute and/or its attribute value is stored in the memory, the user behavior attribute and its corresponding attribute value are spliced into a character string, and the character string is written into the behavior read-write table partition to generate a wide table. The device thus expands the behavior read-write table partition fields automatically and in real time, generates a new wide table, and reduces the cost of manually maintaining the database.
On the basis of the data processing device disclosed in the above embodiment of the present invention, the data processing device further includes a judging module, a second storage module, and a writing module.
The judging module is configured to judge, if the user behavior attribute exists in the memory, whether the data type of the attribute value of the user behavior attribute is consistent with the data type recorded in the memory; if the data types are consistent, the second storage module is executed, and if they are not consistent, the writing module is executed.
The second storage module is configured to store the user behavior attribute and/or the attribute value of the user behavior attribute in the memory.
The writing module is configured to define the attribute value of the user behavior attribute as abnormal data and write the attribute value into an abnormal log file.
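The interaction between the judging module, the second storage module, and the writing module can be sketched roughly as follows. This is only an illustration under stated assumptions: the registry `known_types`, the use of Python type names as "data types", the JSON layout of a log entry, and the list standing in for the abnormal log file are all hypothetical and not taken from the disclosure.

```python
import json

# Hypothetical in-memory registry: attribute name -> type name recorded at first sight.
known_types = {"user_id": "int", "event": "str"}
stored = {}          # stands in for the attribute values kept in memory
abnormal_log = []    # stands in for the abnormal log file


def handle_known_attribute(name, value):
    """Store the value if its type matches the registry, else log it as abnormal.

    Returns True when the value was stored, False when it was logged.
    """
    actual = type(value).__name__
    expected = known_types[name]
    if actual == expected:
        stored[name] = value          # role of the second storage module
        return True
    abnormal_log.append(json.dumps(   # role of the writing module
        {"attribute": name, "value": repr(value), "expected": expected, "actual": actual}
    ))
    return False


ok = handle_known_attribute("user_id", 42)       # type matches: stored
bad = handle_known_attribute("user_id", "42")    # str where int is expected: logged
```

Routing mismatched values to a log instead of the table keeps each field's type stable without stopping ingestion.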
On the basis of the data processing device disclosed in the above embodiment of the present invention, the data processing device further includes a release module.
The release module is configured to release the distributed lock.
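A minimal sketch of how the lock might be started before a new attribute is registered and released once the partition field has been expanded is given below. A process-local `threading.Lock` is used purely as a stand-in for the distributed lock, and the second membership check inside the lock is an added assumption (a common double-checked pattern), not something the disclosure describes. All names are hypothetical.

```python
import threading

# Hypothetical stand-ins: a process-local lock plays the role of the distributed
# lock, and plain containers play the roles of memory and the partition fields.
schema_lock = threading.Lock()
known_types = {}        # attribute name -> data type name, kept in memory
partition_fields = []   # fields of the behavior read-write table partition


def register_attribute(name, value):
    """Add an unseen attribute under the lock, then release the lock.

    Returns True if the field was added by this call, False otherwise.
    """
    if name in known_types:            # fast path: nothing to expand
        return False
    schema_lock.acquire()              # start the lock
    try:
        if name in known_types:        # another worker may have added it already
            return False
        known_types[name] = type(value).__name__
        partition_fields.append(name)  # expand the partition field list
        return True
    finally:
        schema_lock.release()          # released after the field has been expanded


first = register_attribute("event", "login")
second = register_attribute("event", "logout")
```

Releasing in a `finally` clause ensures the lock is freed even if the field expansion raises an error, so other workers are not blocked indefinitely.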
On the basis of the data processing device disclosed in the above embodiment of the present invention, the data processing device further includes a third storage module.
The third storage module is configured to store the character strings in the behavior read-write table partition into the behavior read table partition in a fast-read file format.
An optional structure of the third storage module in this apparatus embodiment of the present invention is as follows: the third storage module includes a starting unit, an acquisition unit, and a storage unit.
The starting unit is configured to start the data rotation device based on a preset time period.
The acquisition unit is configured to acquire the offset value of the behavior read-write table partition.
The storage unit is configured to read the character strings in the behavior read-write table partition based on the offset value and store the read character strings into the behavior read table partition in a fast-read file format.
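The offset-based rotation performed by the acquisition unit and the storage unit can be illustrated with the sketch below. It is a simplification under stated assumptions: Python lists stand in for the source and destination partitions, the destination is modeled as a separate read partition following the wording of claim 6, the offset is a simple index kept in a dictionary, and no actual fast-read file format is produced. The periodic trigger of the starting unit is omitted; `rotate_once` represents what one scheduled run might do.

```python
# Hypothetical sketch: lists stand in for the partitions and for the stored offset.
read_write_partition = ["user_id=42\tevent=login", "user_id=7\tevent=play"]
read_partition = []     # destination; a real system would write a fast-read format here
state = {"offset": 0}   # index of the first string that has not been rotated yet


def rotate_once():
    """Copy the strings written since the last offset and advance the offset."""
    start = state["offset"]
    pending = read_write_partition[start:]
    read_partition.extend(pending)
    state["offset"] = start + len(pending)
    return len(pending)


moved_first = rotate_once()                         # moves the two existing strings
read_write_partition.append("user_id=9\tevent=quit")
moved_second = rotate_once()                        # moves only the new string
moved_third = rotate_once()                         # nothing new to move
```

Tracking the offset lets each run pick up where the previous one stopped, so strings are neither skipped nor copied twice.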
It should be noted that the embodiments in this specification are described in a progressive manner: each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may be understood by reference to one another. Because the apparatus embodiment is substantially similar to the method embodiment, it is described briefly; for the relevant details, refer to the corresponding description of the method embodiment.
Finally, it should also be noted that, herein, relational terms such as "first" and "second" are used solely to distinguish one entity or action from another entity or action, without necessarily requiring or implying any actual such relationship or order between those entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
The foregoing is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make various improvements and refinements without departing from the principle of the present invention, and these improvements and refinements should also be regarded as falling within the protection scope of the present invention.

Claims (10)

1. A method of data processing, the method comprising:
acquiring and analyzing service data to be processed in real time to obtain user behavior attributes of the service data to be processed;
if the user behavior attribute does not exist in the memory, starting a distributed lock, and adding the user behavior attribute and the data type of the attribute value of the user behavior attribute into the memory;
expanding a behavior read-write table partition field, wherein the behavior read-write table partition is used for storing an attribute value corresponding to the user behavior attribute;
storing the user behavior attribute and/or the attribute value of the user behavior attribute into the memory;
and splicing the user behavior attributes and attribute values corresponding to the user behavior attributes into character strings, and writing the character strings into the behavior read-write table partitions to generate the wide table.
2. The method of claim 1, further comprising:
if the user behavior attribute exists in the memory, judging whether the data type of the attribute value of the user behavior attribute is consistent with the data type in the memory;
if so, storing the user behavior attribute and/or the attribute value of the user behavior attribute in the memory;
if not, defining the attribute value of the user behavior attribute as abnormal data, and writing the attribute value into an abnormal log file.
3. The method of claim 1, wherein, after expanding the behavior read-write table partition field, the method further comprises:
and releasing the distributed lock.
4. The method according to claim 1, wherein splicing the user behavior attribute and the attribute value corresponding to the user behavior attribute into a character string, and writing the character string into the behavior read-write table partition comprises:
splicing the user behavior attributes and attribute values corresponding to the user behavior attributes into character strings according to the sequence of adding the user behavior attributes to the memory;
and writing the character string into the behavior read-write table partition.
5. The method according to any one of claims 1 to 4, wherein, after splicing the user behavior attribute and the attribute value corresponding to the user behavior attribute into a character string, writing the character string into the behavior read-write table partition, and generating the wide table, the method further comprises:
storing the character strings in the behavior read-write table partition into the behavior read table partition in a fast-read file format.
6. The method of claim 5, wherein storing the character strings in the behavior read-write table partition into the behavior read table partition in a fast-read file format comprises:
starting a data rotation device based on a preset time period;
obtaining an offset value of the behavior read-write table partition;
and reading the character strings in the behavior read-write table partition based on the offset value, and storing the read character strings into the behavior read table partition in a fast-read file format.
7. A data processing apparatus, characterized in that the apparatus comprises:
the analysis module is used for acquiring and analyzing the service data to be processed in real time to obtain the user behavior attribute of the service data to be processed;
the adding module is used for starting a distributed lock and adding the user behavior attribute and the data type of the attribute value of the user behavior attribute into the memory if the user behavior attribute does not exist in the memory;
the extension module is used for extending the behavior read-write table partition field, and the behavior read-write table partition is used for storing the attribute value corresponding to the user behavior attribute;
the first storage module is used for storing the user behavior attribute and/or the attribute value of the user behavior attribute into the memory;
and the generating module is used for splicing the user behavior attributes and the attribute values corresponding to the user behavior attributes into character strings, and writing the character strings into the behavior read-write table partition to generate the wide table.
8. The apparatus of claim 7, further comprising:
the judging module is used for judging whether the data type of the attribute value of the user behavior attribute is consistent with the data type in the memory or not if the user behavior attribute exists in the memory;
the second storage module is used for storing the user behavior attribute and/or the attribute value of the user behavior attribute in the memory if the data type of the attribute value of the user behavior attribute is consistent with the data type in the memory;
and the writing module is used for defining the attribute value of the user behavior attribute as abnormal data and writing the attribute value into an abnormal log file if the data type of the attribute value of the user behavior attribute is inconsistent with the data type in the memory.
9. The apparatus of claim 7, further comprising:
a release module to release the distributed lock.
10. The apparatus of claim 7, wherein the generating module comprises:
the splicing unit is used for splicing the user behavior attribute and the attribute value corresponding to the user behavior attribute into a character string according to the sequence of adding the user behavior attribute to the memory;
and the writing unit is used for writing the character string into the behavior read-write table partition.
CN201910958650.XA 2019-10-10 2019-10-10 Data processing method and device Pending CN110688387A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910958650.XA CN110688387A (en) 2019-10-10 2019-10-10 Data processing method and device

Publications (1)

Publication Number Publication Date
CN110688387A (en) 2020-01-14

Family

ID=69112015

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910958650.XA Pending CN110688387A (en) 2019-10-10 2019-10-10 Data processing method and device

Country Status (1)

Country Link
CN (1) CN110688387A (en)



Legal Events

Code Title
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20200114)