WO2021093472A1 - Data processing method, electronic device, and readable storage medium - Google Patents

Data processing method, electronic device, and readable storage medium

Info

Publication number
WO2021093472A1
Authority
WO
WIPO (PCT)
Prior art keywords
user
bit vector
target
bit
vector
Prior art date
Application number
PCT/CN2020/117623
Other languages
English (en)
Chinese (zh)
Inventor
李发明
李海翔
邹兆年
潘安群
Original Assignee
腾讯科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 腾讯科技(深圳)有限公司
Publication of WO2021093472A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9532 Query formulation
    • G06F16/9537 Spatial or temporal dependent retrieval, e.g. spatiotemporal queries

Definitions

  • This application relates to the field of computer technology, and specifically to a data processing method, electronic equipment, and readable storage medium.
  • the embodiments of the present application provide a data processing method, a data processing device, a computer-readable storage medium, and an electronic device, which can improve data processing efficiency at least to a certain extent.
  • a data processing method which is executed by a computer device, and the method includes:
  • in response to a query request of a first user, obtaining a bit vector table related to the operation data of a target user, wherein the query request includes identification information and time information, the bit vector table includes a user identification and the bit vector corresponding to the user identification, and the identification information corresponds to the target user; and obtaining a target bit vector from the bit vector table according to the identification information and the time information, and performing logical processing on the target bit vector to obtain target information.
  • a data processing device including: an obtaining module, configured to obtain a bit vector table related to operation data of a target user in response to a query request, wherein the query request includes identification information and time information, the bit vector table includes a user identification and a bit vector corresponding to the user identification, and the identification information corresponds to the target user; a target bit vector is obtained from the bit vector table, and logical processing is performed on the target bit vector to obtain target information.
  • a computer-readable storage medium on which a computer program is stored, and when the program is executed by a processor, the data processing method as described in the above-mentioned embodiment is implemented.
  • an electronic device including one or more processors and a storage device for storing one or more programs; when the one or more programs are executed by the one or more processors, the one or more processors are caused to execute the data processing method described in the foregoing embodiment.
  • in response to the query request of the first user, a bit vector table related to the operation data of the target user is obtained; the target bit vector is then obtained from the bit vector table according to the identification information and time information in the query request, and the target information is obtained by performing logical processing on the target bit vector.
  • the technical solution of the present application can improve data processing efficiency and reduce resource waste by converting user operation data into bit vectors.
  • FIG. 1 shows a schematic diagram of an exemplary system architecture to which the technical solutions of the embodiments of the present application can be applied;
  • FIG. 2 schematically shows a flowchart of a data processing method according to an embodiment of the present application;
  • FIG. 3 schematically shows a flowchart of updating a bit vector table according to an embodiment of the present application;
  • FIG. 4 schematically shows a flowchart of counting the number of user operations based on a bit vector according to an embodiment of the present application;
  • FIG. 5 schematically shows a flowchart of judging user operation similarity based on a bit vector according to an embodiment of the present application;
  • FIG. 6 schematically shows a flowchart of determining the influence relationship between user operations based on a bit vector according to an embodiment of the present application;
  • FIG. 7 schematically shows a flowchart of judging the periodicity of user behavior based on a bit vector according to an embodiment of the present application;
  • FIG. 8 schematically shows a flowchart of abnormal operation judgment based on a bit vector according to an embodiment of the present application;
  • FIG. 9 schematically shows a flowchart of decompressing a compressed bit vector according to an embodiment of the present application;
  • FIG. 10 schematically shows a block diagram of a data processing device according to an embodiment of the present application;
  • FIG. 11 shows a schematic structural diagram of a computer system suitable for implementing an electronic device according to an embodiment of the present application.
  • the embodiments of the present application provide a data processing method, a data processing device, a computer storage medium, and an electronic device that improve the current billing system.
  • Fig. 1 shows a schematic diagram of an exemplary system architecture to which the technical solutions of the embodiments of the present application can be applied.
  • the system architecture 100 may include a terminal device 101, a network 102, and a server 103.
  • the network 102 is used to provide a medium of a communication link between the terminal device 101 and the server 103.
  • the network 102 may include various connection types, such as wired communication links, wireless communication links, and so on.
  • the numbers of terminal devices, networks, and servers in FIG. 1 are merely illustrative. According to actual needs, there can be any number of terminal devices, networks and servers.
  • the server 103 may be a server cluster composed of multiple servers.
  • the terminal device 101 may be a terminal device with a display screen, such as a notebook, a desktop computer, and a smart phone.
  • the user performs various operations on the display screen of the terminal device 101, and the terminal device 101 can send instructions corresponding to the user operations to the server 103 via the network 102.
  • the server 103 can respond to the instruction after receiving the instruction, and can analyze the user operation, construct a user operation data table according to the user operation data, and generate a bit vector table associated with the user operation data table.
  • the server 103 monitors the user operation data table. When the user operation data in the user operation data table changes, the changed user operation data can be mapped to a bit vector to update the bit vector in the bit vector table.
  • with the identification information and time information as the query and mining conditions, the target bit vector corresponding to the identification information can be obtained from the bit vector table, and the target bit vector is logically processed to obtain the target information.
  • the identification information is identification information corresponding to the target user.
  • querying and mining user operation behaviors includes, for example: querying whether a given target user has operated on their account during the query time interval; querying whether other users have operation behaviors similar to those of the given target user during the query time interval; mining whether the operation behavior of a given target user is periodic in the query time interval; determining whether the user has abnormal operations in the query time interval based on the periodic operation behavior of the given target user; and so on.
  • the target information includes the number of times the target user performs a certain operation or certain operations in the query time interval, other users whose operation behaviors are similar to the target user in the query time interval, and the user operations of the target user in the query time interval.
  • the subject of execution of each step can be a computer device, which can be any electronic device with processing and storage capabilities, such as a mobile phone, tablet computer, game device, multimedia playback device, electronic photo frame, wearable device, PC (personal computer), or other electronic device; it can also be a server.
  • the data processing method provided in the embodiments of the present application is generally executed by a server, and correspondingly, the data processing device is generally set in the server. However, in other embodiments of the present application, the data processing method provided in the embodiments of the present application may also be executed by a terminal device.
  • when the operation information of a certain user in a certain period of time is queried from the billing database, identification information such as the user's ID is usually used to retrieve records from the database by time, so as to obtain the user's operation information during that period.
  • when the operation information of one or more users in different time intervals is retrieved from the billing database and aggregated, the operation data of the same user at different times is likely to be distributed across different storage nodes, so retrieval still consumes a lot of time even if an index has been built on the user data.
  • the embodiment of the present application first proposes a data processing method.
  • the implementation details of the technical solution of the embodiment of the present application are described in detail below:
  • FIG. 2 schematically shows a flowchart of a data processing method according to an embodiment of the present application.
  • the data processing method may be executed by a server, and the server may be the server 103 shown in FIG. 1.
  • the data processing method at least includes steps S210 to S220, which are described in detail as follows:
  • in step S210, in response to the query request, a bit vector table related to the operation data of the target user is obtained, wherein the query request includes identification information and time information, the bit vector table includes a user ID and the bit vector corresponding to the user ID, and the identification information corresponds to the target user.
  • the query request may be a request initiated by a query user for querying the operation data of the target user.
  • the target user is a user who performs a specific operation and generates operation data.
  • the target user may be one user or multiple users.
  • the server can obtain the operation data of the target user corresponding to the identification information according to the identification information and time information in the query request, and process it.
  • processing the raw operation data directly in this way is inefficient, and the accuracy is also poor.
  • a user operation data table can be generated based on the target user's operation data, a bit vector table associated with the user operation data table can be generated according to the user operation data table, and then the bit vector in the bit vector table can be processed to obtain target information.
  • the user operation data table is generated according to the operation data of the target user, and the bit vector table associated with the user operation data table is generated according to the user operation data table, which can be specifically performed in the following manner.
  • the user performs operations through the terminal device 101, for example, by clicking the corresponding controls on an online shopping platform to perform operations such as product browsing, ordering, payment, and sharing, or by clicking the corresponding controls on a chat interface to perform operations such as message sending, sharing, editing, and deletion, and so on.
  • the terminal device 101 sends an instruction corresponding to the operation to the server 103.
  • the server 103 provides corresponding feedback after receiving the instruction, analyzes the user's operation behavior, and constructs a user operation data table according to the user's operation behavior.
  • the user operation data table can be a KV (key-value) data table, in which the key can be the user's identifier, such as the user ID generated when the user registers, the user's ID number, or other information that is uniquely associated with or uniquely identifies the user; the value can be data generated after the user performs operations, such as the amount of expenditure, the number of purchases, the amount of recharge, the number of recharges, and so on.
  • a bit vector table can also be created.
  • the bit vector table is associated with the user operation data table; the data recorded in it is mapped from the user operation data, and the bit vector table records the user identification and the bit vector corresponding to the user identification.
  • the bit vector is a binary sequence composed of consecutive 0s and 1s.
  • the length of the bit vector is the number of 0s and 1s in the bit vector. Each 0 or 1 in the bit vector is called a "bit". For example, 01011 is a bit vector of length 5.
  • Table 1 shows the structure of an example bit vector table:

      User ID | B [s, e)
      a       | 110011011100
      c       | 100101100011
      d       | 101110110101
      e       | 010001000111

  • B [s, e) represents the bit vector mapped according to the user's operation data in the time interval [s, e), where s represents the start time and e represents the end time.
  • a, c, d, and e are user IDs; the corresponding 110011011100, 100101100011, 101110110101, and 010001000111 are bit vectors obtained by mapping the operation data of user a, user c, user d, and user e; the length of each bit vector is 12.
  • the bit vector table can be created through SQL statements.
  • the name of the bit vector table can be determined according to the user operation data table, and is specified by bitsVector_table_name. For example, when the name of the user operation data table is buy_record_tab, the name of the bit vector table associated with it may be bitsVector_buy_record_tab. If not specified, the default name of the bit vector table associated with the user operation data table is "user operation data table name_bvt", for example, buy_record_tab_bvt.
  • the time granularity is the time attribute of each bit in the bit vector, which represents a period of time.
  • each bit in the bit vector represents the user's operation behavior in that period of time.
  • the time granularity can also be understood as the period of the bit vector.
  • the time granularity can be set to one day (d), one hour (h), one minute (m), and so on.
  • the time granularity in Table 1 is one hour.
  • the time granularity can also be set to other values, which are not specifically limited in the embodiment of the present application.
  • the purpose of setting the time granularity is to reasonably control the relationship between the storage space usage of the bit vector, the query accuracy and the query requirements according to the requirements.
  • when fine-grained query results are required, the time granularity can be set to a small value, such as one minute; when only coarse query results are required, the time granularity can be set to a larger value.
  • the user operation data can be mapped to a bit vector according to the user's operation time. For example, if the time granularity is set to one hour, the bit vector length of one day is 24. When a user performs an operation in the time interval of 2 o'clock-3 o'clock, then the bit corresponding to 2 o'clock-3 o'clock in the bit vector can be set to 1, indicating that the user has performed an operation within the time granularity of this hour.
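The mapping from an operation time to a bit position can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names `bit_index` and `map_operation`, the string representation of the bit vector, and the example dates are all assumptions made here.

```python
from datetime import datetime, timedelta

def bit_index(start: datetime, granularity: timedelta, op_time: datetime) -> int:
    """Return the 0-based position of the bit whose time slot contains op_time."""
    if op_time < start:
        raise ValueError("operation time precedes the start of the bit vector")
    # timedelta // timedelta yields an integer number of whole slots.
    return (op_time - start) // granularity

def map_operation(start: datetime, granularity: timedelta,
                  length: int, op_time: datetime) -> str:
    """Map a single operation to a bit vector with exactly one bit set."""
    idx = bit_index(start, granularity, op_time)
    if idx >= length:
        raise ValueError("operation time is beyond the range of the bit vector")
    bits = ["0"] * length
    bits[idx] = "1"
    return "".join(bits)

# One day at one-hour granularity gives a bit vector of length 24.
day_start = datetime(2020, 1, 1)            # hypothetical start of counting
vector = map_operation(day_start, timedelta(hours=1), 24,
                       datetime(2020, 1, 1, 2, 30))
print(vector)  # the bit for the 2 o'clock-3 o'clock slot is set
```

With a one-minute granularity the same day would need 1440 bits, which is the storage versus accuracy trade-off that the time granularity controls.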
  • the bit vector table can be modified through the ALTER TABLE statement.
  • the ALTER TABLE statement uses the same predicate to modify the bit vector table. If you only modify the period value, the previous data will be invalid; if you modify the name of the bit vector table, you can keep the old data and store the new data in the new bit vector table.
  • the length of the bit vector in the bit vector table can be set as required. When the time corresponding to the user operation exceeds the time supported by the bit vector in the bit vector table, a new bit vector table can be created to deal with it.
  • the bit vector table can also be used to record other operation information of the user by adding columns to the bit vector table, such as the type of operation, the transaction amount corresponding to the operation, the payment method corresponding to the operation, etc.
  • a bit vector can be added to record the user's operation type, where 1 means consumption and 0 means recharge; a bit vector can be added to record the user's transaction amount, where 1 means consumption exceeds 100 yuan and 0 means consumption does not exceed 100 yuan; a bit vector can be added to record the user's payment method, where 1 represents non-cash payment and 0 represents cash payment.
  • the user operation data table can be monitored; when the user operation data in it changes, an update of the bit vector in the bit vector table is triggered.
  • a trigger can be set in the user operation data table.
  • when a user performs an operation, the user operation data in the user operation data table changes; the trigger then fires, the changed user operation data is mapped to a bit vector, and the bit vector table is updated according to that bit vector.
  • Fig. 3 shows a schematic diagram of the process of updating the bit vector table.
  • the process of updating the bit vector table at least includes step S301-step S304, specifically:
  • step S301 the target user identifier corresponding to the changed user operation data is determined from the user operation data table.
  • the target user ID corresponding to the changed user operation data can be determined, and then the bit vector corresponding to the target user ID can be obtained from the bit vector table and updated.
  • step S302 the first bit vector corresponding to the target user ID is obtained from the bit vector table according to the target user ID, and the changed user operation data is mapped to obtain the second bit vector.
  • after the target user ID is obtained, it can be matched against the user IDs in the bit vector table, and the bit vector corresponding to the matching user ID is extracted; this bit vector is the first bit vector. In addition, the changed user operation data can be mapped to obtain the second bit vector.
  • for example, the target user ID is 12345 and the corresponding first bit vector is 010000000000; the length of the first bit vector is 12 and the time granularity is 1h, which means that within the 12 hours the target user performed an operation between 1h and 2h. If the target user performs an operation again during the hour from 4h to 5h, the second bit vector obtained by mapping is 000010000000.
  • step S303 the first bit vector and the second bit vector are ORed to obtain the third bit vector.
  • the first bit vector and the second bit vector can be integrated to get the third bit vector.
  • the integration operation may specifically be an OR (|) operation on the first bit vector and the second bit vector. Taking the first bit vector and the second bit vector in step S302 as an example, the third bit vector is (010000000000) | (000010000000) = 010010000000, which indicates that the user corresponding to the target user ID performed operations during 1h-2h and 4h-5h.
  • step S304 the first bit vector is replaced with the third bit vector to update the bit vector in the bit vector table.
  • the first bit vector can be replaced with the third bit vector to implement the update of the bit vector table.
  • the NewBit function may be used to generate a new bit vector value.
  • the NewBit function can receive four parameters, namely: the time to start counting user operations (that is, the time when the bit vector table is generated), the time granularity specified by the user in advance, the time of the user update operation, and the user ID of the operation that occurred.
  • the NewBit function first reads the user's bit vector value from the bit vector table, and then performs an OR operation with the bit vector mapped by the user update operation to obtain a new bit vector, and finally writes the new bit vector back to the bit vector table .
  • the NewBit function can be a trigger function, a user-defined function, or a system function of the database engine, which is not specifically limited in the embodiment of the application, and a suitable function can be selected to update the bit vector table according to actual needs.
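Steps S301 to S304 and the read, OR, write-back behavior described for NewBit can be sketched as below. This is only an in-memory illustration under stated assumptions: the bit vector table is modeled as a Python dict, times are whole hours, and the name `new_bit` and its signature are hypothetical; the source does not specify how NewBit is actually implemented in the database.

```python
def or_vectors(first: str, second: str) -> str:
    """Bitwise OR of two equal-length bit vectors given as '0'/'1' strings."""
    if len(first) != len(second):
        raise ValueError("bit vectors must have the same length")
    return "".join("1" if a == "1" or b == "1" else "0"
                   for a, b in zip(first, second))

def new_bit(table: dict, start_hour: int, granularity_hours: int,
            op_hour: int, user_id: str, length: int = 12) -> str:
    """Read the user's bit vector, OR in the new operation, and write it back."""
    # S302: first bit vector (all zeros if the user has no entry yet).
    first = table.get(user_id, "0" * length)
    idx = (op_hour - start_hour) // granularity_hours
    if not 0 <= idx < len(first):
        raise ValueError("operation time is outside the bit vector range")
    # S302: second bit vector, mapped from the changed operation data.
    second = "0" * idx + "1" + "0" * (len(first) - idx - 1)
    # S303: OR the two vectors to obtain the third bit vector.
    third = or_vectors(first, second)
    # S304: replace the first bit vector with the third one.
    table[user_id] = third
    return third

bit_vector_table = {"12345": "010000000000"}
updated = new_bit(bit_vector_table, start_hour=0, granularity_hours=1,
                  op_hour=4, user_id="12345")
print(updated)  # 010010000000, matching the example in step S303
```

Because OR is idempotent, a second operation in an hour that is already marked leaves the vector unchanged, which is why each bit records whether at least one operation occurred rather than how many.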
  • step S220 a target bit vector is obtained from the bit vector table according to the identification information and the time information, and logical processing is performed on the target bit vector to obtain target information.
  • the server may obtain the target bit vector from the bit vector table according to the identification information and the time information, and then obtain the target information according to the target bit vector.
  • the target information includes the number of times the user performs a certain operation or certain operations in the query time interval, whether the operations of other users in the query time interval are similar to those of the target user, the mutual influence relationship between user operation behaviors in the query time interval, the periodicity of user operation behaviors in the query time interval, and whether users with periodic operation behaviors have abnormal operations in the query time interval.
  • the specific method for obtaining the target information according to the target bit vector is to perform logical processing on the target bit vector, and the logical processing includes the basic operations and shift operations of the bit vector.
  • the basic operations of bit vectors include AND, OR, NOT, and XOR; AND is represented by & and OR by |.
  • different operations may be used to process the bit vector to obtain different target information.
  • the AND, OR, and XOR operations are binary operations.
  • the AND, OR, and XOR operations return the result 1 in the following cases, respectively: both bits are 1; at least one bit is 1; exactly one bit is 1. In all other cases they return the result 0. The NOT operation is a unary operation: if the operand bit is 0, the result 1 is returned; otherwise the result 0 is returned.
  • another basic operation of a bit vector is shifting.
  • >> and << are used to indicate shift operations; for example, 110011011100 >> 2 means to move the bit vector two bits to the right and fill the leftmost two bits with 0 after shifting, and the resulting bit vector is 001100110111. What needs to be emphasized here is that the basic operations and shift operations of bit vectors are well supported at the hardware level and can be completed very quickly by the computer.
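The basic and shift operations can be tried out on the Table 1 vectors with a short sketch. It assumes fixed-length bit vectors stored as strings and converted to Python integers for the operations; the helper names are illustrative and are not taken from the source.

```python
def to_int(bits: str) -> int:
    """Parse a '0'/'1' string as an unsigned integer."""
    return int(bits, 2)

def to_bits(value: int, length: int) -> str:
    """Format the low `length` bits of value as a zero-padded bit string."""
    return format(value & ((1 << length) - 1), f"0{length}b")

def bv_and(x: str, y: str) -> str:
    return to_bits(to_int(x) & to_int(y), len(x))

def bv_or(x: str, y: str) -> str:
    return to_bits(to_int(x) | to_int(y), len(x))

def bv_xor(x: str, y: str) -> str:
    return to_bits(to_int(x) ^ to_int(y), len(x))

def bv_not(x: str) -> str:
    # Python's ~ yields a negative number; masking in to_bits keeps len(x) bits.
    return to_bits(~to_int(x), len(x))

def bv_shift_right(x: str, n: int) -> str:
    # Zeros are shifted in on the left.
    return to_bits(to_int(x) >> n, len(x))

def bv_shift_left(x: str, n: int) -> str:
    # Bits shifted past the left end are discarded by the mask.
    return to_bits(to_int(x) << n, len(x))

a = "110011011100"  # user a in Table 1
c = "100101100011"  # user c in Table 1
print(bv_shift_right(a, 2))  # 001100110111, as in the text
print(bv_and(a, c), bv_or(a, c), bv_xor(a, c), bv_not(a))
```

Each helper maps to a single integer instruction on the underlying value, which is the reason these operations are cheap compared with scanning the raw operation records.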
  • the user operation data is mapped into a bit vector, and processing is performed on the bit vector rather than on the raw data, which avoids directly exposing user operation data to the outside world, thereby improving the security of user data and avoiding the leakage of user privacy.
  • a key operation for a bit vector is counting, which is represented by Count; for example, applying Count to the bit vector B [s, e) of user a returns the number of 1s in the bit vector corresponding to user a in the time interval [s, e).
  • the Count function can be quickly completed by a shift operation.
  • the specific algorithm steps are: take the bit vector B [s, e) as the input bit vector; first obtain the length of the bit vector B [s, e) and its first bit, AND the first bit with 1, and use the result of the AND operation as the initial statistical value; then shift the bit vector B [s, e) to the right to get its second bit, AND the second bit with 1, and update the statistical value with the result of the AND operation. Repeat the above steps until the last bit of the bit vector B [s, e) has been ANDed with 1 and the statistical value has been updated; the final result is the return value of the Count() function.
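The shift-and-AND counting loop described above can be sketched as follows. The function name `count` mirrors the Count() function in the text, but the string input and the integer-based loop are assumptions made for illustration.

```python
def count(bits: str) -> int:
    """Count the 1s in a bit vector by repeatedly ANDing with 1 and shifting right."""
    value = int(bits, 2) if bits else 0
    total = 0
    for _ in range(len(bits)):   # one iteration per bit in the vector
        total += value & 1       # AND the current lowest bit with 1
        value >>= 1              # shift right to expose the next bit
    return total

print(count("11001"))         # 3
print(count("110011011100"))  # 7
```

In practice a single population-count instruction or a library call would usually replace the loop; the loop is kept here only because it follows the steps listed in the text.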
  • FIG. 4 shows a schematic diagram of the flow of counting the number of user operations based on a bit vector. As shown in FIG. 4, the flow at least includes steps S401-S402, specifically:
  • step S401 a first user identification and a first time interval are obtained, and a first target bit vector corresponding to the first user identification is obtained from a bit vector table according to the first user identification and the first time interval.
  • multiple user IDs and bit vectors corresponding to each user ID can be recorded in the bit vector table.
  • the first user ID can be matched against the user IDs in the bit vector table.
  • the bit vector corresponding to the first user ID in the first time interval is determined according to the first time interval and the time granularity of the bit vector table, and it is marked as the first target bit vector.
  • the step S401 is executed in response to a query request of the query user, and the query request includes identification information and time information.
  • the first user identification is obtained according to the identification information, for example, is included in the identification information, or may be obtained from a storage device by a server through the identification information.
  • the time information includes a first time interval.
  • the first user identifier corresponds to the target user.
  • step S402 the first target bit vector is counted to obtain the number of operations performed by the user corresponding to the first user identifier in the first time interval.
  • the Count() function can be used to count the first target bit vector, that is, the number of 1s in the first target bit vector is counted to obtain the number of operations performed in the first time interval by the user corresponding to the first user ID. For example, taking user a in Table 1, to obtain the operation status of user a in the time interval [0, 5), first obtain the first target bit vector 11001 of user a in the time interval [0, 5), and then use the Count() function to count the first target bit vector 11001, which gives 3. This shows that user a performed at least 3 operations in the time interval [0, 5); finally, the statistical result 3 is returned.
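Steps S401 and S402 can be sketched end to end on the Table 1 data. The dict-based table, the function name `count_operations`, and the assumption that the table starts at hour 0 with a one-hour granularity are illustrative choices, not details given by the source.

```python
TABLE_1 = {
    "a": "110011011100",
    "c": "100101100011",
    "d": "101110110101",
    "e": "010001000111",
}

def count_operations(table: dict, user_id: str, s: int, e: int,
                     table_start: int = 0, granularity: int = 1) -> int:
    """Count the set bits of user_id's bit vector in the time interval [s, e)."""
    vector = table[user_id]                    # S401: match the user ID
    lo = (s - table_start) // granularity      # first bit of the interval
    hi = (e - table_start) // granularity      # one past the last bit
    if not 0 <= lo <= hi <= len(vector):
        raise ValueError("time interval is outside the bit vector range")
    target = vector[lo:hi]                     # S401: first target bit vector
    return target.count("1")                   # S402: Count()

print(count_operations(TABLE_1, "a", 0, 5))  # 3
```

The result is a lower bound on the number of operations, since a bit set to 1 only records that one or more operations happened in that hour.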
  • Fig. 5 shows a schematic diagram of a process for judging user operation similarity based on a bit vector. As shown in Fig. 5, the process at least includes steps S501-S504, specifically:
  • step S501 a second user identification, a user identification to be compared, and a second time interval are acquired.
  • the server can first obtain the user identification of the target user, the user identification of other users, and the time interval to be queried, and then determine, according to the operation bit vector of the target user and the operation bit vectors of the other users in the time interval, whether the operation of the target user is similar to the operation of the other users.
  • the user identification of the target user is marked as the second user identification
  • the user identifications of other users are marked as the user identification to be compared
  • the time interval to be queried is marked as the second time interval.
  • the user ID to be compared can be one user ID or multiple user IDs.
  • the step S501 is executed in response to a query request of the querying user, for example, and the query request includes identification information and time information.
  • the second user identification and the user identification to be compared are obtained according to the identification information, for example, included in the identification information, or may be obtained from a storage device by a server through the identification information.
  • the time information includes a second time interval.
  • the second user ID and the user ID to be compared correspond to the target user.
  • in step S502, according to the second user ID, the user ID to be compared, and the second time interval, a second target bit vector corresponding to the second user ID and a target bit vector to be compared corresponding to the user ID to be compared are obtained from the bit vector table.
  • the second user ID and the user ID to be compared may each be matched against the user IDs in the bit vector table, to obtain the second target bit vector corresponding to the second user identifier and the target bit vector to be compared corresponding to the user identifier to be compared in the second time interval.
  • in step S503, an XOR operation is performed on the second target bit vector and the target bit vector to be compared, and a NOT operation is performed on the result of the XOR operation to obtain the comparison target bit vector.
  • the second target bit vector and each target bit vector to be compared may be operated on to obtain the similarity between the two.
  • the second target bit vector can be XORed with the target bit vector to be compared, and the result of the XOR operation can be negated to obtain the comparison target bit vector; then the comparison target bit vector can be counted to obtain the similarity between the two.
  • the server first obtains the bit vectors of users c, a, and d in the time interval [0, 12), which are respectively Then the bit vector of user c is XORed with the bit vector of users a and d respectively, namely Then perform the NOT operation on the result of the exclusive OR operation, that is That is, the comparison target bit vector is with
  • step S504 the target bit vector is compared and counted to obtain the operating similarity of the user corresponding to the second user identification and the user corresponding to the user identification to be compared in the second time interval.
  • the target bit vector after obtaining the comparison target bit vector, can be compared according to the Count() function to obtain the user corresponding to the second user ID and the user corresponding to the user ID to be compared Operational similarity in the second time interval. After statistics, From this, it can be determined that in the time interval [0, 12), the similarity of the operations of the user d and the user c is greater than the similarity of the operations of the user a and the user c.
  • the similarity measure can also be refined by calculating the proportion of similar operations in all operations in a given time interval. For example, in the second time interval, the ratio of similar operations between user c and user a is 3/12, and the ratio of similar operations between user c and user d is 6/12. Obviously, user c and user d have more operating behaviors. similar.
  • The operation similarity can also be used to cluster users: similar users are placed in the same category, and dissimilar users are placed in different categories.
  • The clustering result can serve as a data preprocessing step that speeds up other analyses; it can also be supplied to a user portrait as a behavioral feature of the user, to help better understand the user.
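  • The similarity computation of steps S503 and S504 (XOR, then NOT, then Count) can be sketched as follows. This is a minimal illustration, not the patented implementation: bit vectors are held as Python strings, the function names are hypothetical, and `other_1` and `other_2` are made-up vectors rather than the values of the example above. Only `target` is taken from the document, where it appears as the bit vector of user a in [0, 12).

```python
def similarity(bits_x: str, bits_y: str) -> int:
    """Count the positions at which two equal-length bit strings agree."""
    if len(bits_x) != len(bits_y):
        raise ValueError("bit vectors must cover the same time interval")
    n = len(bits_x)
    mask = (1 << n) - 1  # confines the NOT to the n bits of the interval
    xnor = ~(int(bits_x, 2) ^ int(bits_y, 2)) & mask  # XOR, then NOT
    return bin(xnor).count("1")  # the Count() step


def similarity_ratio(bits_x: str, bits_y: str) -> float:
    """Proportion of time slots in which the two users behaved alike."""
    return similarity(bits_x, bits_y) / len(bits_x)


target = "110011011100"   # user a in [0, 12), as given in the document
other_1 = "110011010000"  # hypothetical user, differs in two slots
other_2 = "001100100011"  # hypothetical user, bitwise complement of target

print(similarity(target, other_1), similarity_ratio(target, other_1))
print(similarity(target, other_2), similarity_ratio(target, other_2))
```

  • Masking after the NOT matters in Python because integers are unbounded; without the mask, the negation would produce a negative number instead of an n-bit result.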
  • Fig. 6 shows a schematic diagram of the process of judging the influence relationship between user operations based on the bit vector. As shown in Fig. 6, the process at least includes steps S601-S604, specifically:
  • step S601 the third user identification, the fourth user identification, the similarity threshold, and the third time interval are acquired.
  • the step S601 is executed in response to a query request of the query user, and the query request includes identification information and time information.
  • the third user identification and the fourth user identification are obtained according to the identification information, for example, are included in the identification information, or may be obtained from a storage device by a server through the identification information.
  • the time information includes a third time interval.
  • the third user identification and the fourth user identification correspond to the target user.
  • In step S602, according to the third user ID, the fourth user ID, and the third time interval, a third target bit vector corresponding to the third user ID and a fourth target bit vector corresponding to the fourth user ID are obtained from the bit vector table.
  • The third user ID and the fourth user ID are each matched against the user IDs in the bit vector table to obtain, for the third time interval, the third target bit vector corresponding to the third user identification and the fourth target bit vector corresponding to the fourth user identification.
  • In step S603, a shift operation is performed on the fourth target bit vector to obtain the shift target bit vector, and similarity judgment is performed on the shift target bit vector and the third target bit vector to obtain the similarity.
  • The influence of one user's operation on another user's operation may be synchronous or delayed. Therefore, when determining the influence relationship between user operations, a shift operation can be performed on the fourth target bit vector, and the similarity between the third target bit vector and the shifted fourth target bit vector can then be determined. Specifically, the fourth target bit vector is first shifted to the left by the shift unit to obtain the shift target bit vector; the third target bit vector and the shift target bit vector are then XORed, the XOR result is negated to obtain a similarity target bit vector, and the similarity target bit vector is counted to obtain the similarity.
  • the shift unit is the number of bits that change each time the shift operation is performed, for example, it can be 1, 2, etc., as long as it is any integer less than the length of the bit vector.
  • a shift threshold can be set. When the shift operation reaches the shift threshold, the shift operation is stopped, and it is determined that the operation of the user corresponding to the third user identifier has no effect on the operation of the user corresponding to the fourth user identifier.
  • step S604 the similarity is compared with the similarity threshold, and according to the comparison result, it is determined whether the operation of the user corresponding to the third user identifier affects the operation of the user corresponding to the fourth user identifier in the third time interval.
  • The similarity can be compared with the similarity threshold, and whether the operation of the user corresponding to the third user identifier affects the operation of the user corresponding to the fourth user identifier can be determined according to the comparison result.
  • When the similarity is greater than or equal to the similarity threshold, it is determined that the operation of the user corresponding to the third user identifier in the third time interval affects the operation of the user corresponding to the fourth user identifier. When the similarity is less than the similarity threshold, the fourth target bit vector is shifted again, the similarity between the shifted bit vector and the third target bit vector is calculated, and the similarity is again compared with the similarity threshold. If the similarity is still less than the similarity threshold, these steps are repeated until the number of bits by which the fourth target bit vector has been shifted to the left reaches the shift threshold.
  • When an influence is found, the number of shift operations is returned, that is, the time delay with which one user's operation affects another user; when there is no mutual influence between the users' operations, the shift threshold is returned.
  • steps S603 and S604 are implemented by shifting the fourth target bit vector, and may also be implemented by shifting the third target bit vector instead of the fourth target bit vector.
  • For example, the third target bit vector corresponding to the operations of user a in the time interval [3, 8) is 001101, and the fourth target bit vector corresponding to the operations of user d in the time interval [4, 9) is 110110.
  • Performing a shift operation on the fourth target bit vector and calculating the similarity between the shift target bit vector and the third target bit vector shows that the operation behavior of user a may affect user d, with a delay of about 1 h. As with the similarity judgment, the influence relationship is not conclusively established.
  • The influence relationship may be caused by some external factor, such as the launch of a new product; it may also be purely coincidental, that is, the two users show similar consumption behavior without any external factor. If such coincidental influence between two users occurs frequently in the billing data, that is, after one user purchases certain goods the other user often buys the same goods although there is no connection between the two users, the influence relationship can still be exploited: when one user is found to be operating on the account, it can be predicted that the other user is also likely to operate on the account, which improves the understanding of the corresponding users.
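  • The shift-and-compare loop of steps S603 and S604 can be sketched as below, using the vectors 001101 and 110110 from the example. Several details are assumptions made for illustration: the function names, the similarity threshold of 4, the zero fill of bits shifted in on the right, the initial comparison at shift 0 (synchronous influence), and returning `None` rather than the shift threshold when no influence is found.

```python
from typing import Optional


def similarity(x: int, y: int, n: int) -> int:
    """Count agreeing positions of two n-bit integers (NOT of XOR, then Count)."""
    mask = (1 << n) - 1
    return bin(~(x ^ y) & mask).count("1")


def influence_delay(third: str, fourth: str, threshold: int,
                    shift_unit: int = 1, shift_limit: int = 3) -> Optional[int]:
    """Smallest left shift of `fourth` whose similarity to `third` reaches
    `threshold`; None if the shift limit is exceeded first."""
    if shift_unit < 1:
        raise ValueError("shift_unit must be a positive integer")
    n = len(third)
    mask = (1 << n) - 1
    a, d = int(third, 2), int(fourth, 2)
    shift = 0
    while shift <= shift_limit:
        shifted = (d << shift) & mask  # left shift, zero-filled, kept to n bits
        if similarity(a, shifted, n) >= threshold:
            return shift  # delay measured in time-granularity units
        shift += shift_unit
    return None


# Vectors from the document's example; the threshold 4 is an assumed value.
print(influence_delay("001101", "110110", threshold=4))
```

  • With these inputs the unshifted vectors agree in only one position, while a shift of one bit raises the agreement to four positions, which matches the one-hour delay described above.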
  • Fig. 7 shows a schematic flow chart of the periodic judgment of user behavior based on a bit vector. As shown in Fig. 7, the flow at least includes steps S701-S705, specifically:
  • step S701 the fifth user identifier, the first operation mode bit vector, the first operation mode period, and the fourth time interval are acquired.
  • the step S701 is executed in response to a query request of the query user, for example, the query request includes identification information and time information.
  • the fifth user identification is obtained according to the identification information, for example, is included in the identification information, or may be obtained from a storage device by a server through the identification information.
  • the time information includes a fourth time interval.
  • The fifth user ID corresponds to the target user.
  • The first operation mode bit vector is used to determine whether the user's operations in the fourth time interval repeat the operation mode that it describes.
  • The first operation mode period is obtained in order to determine whether the periodicity of the user's operations matches the preset operation mode period.
  • step S702 a fifth target bit vector corresponding to the fifth user ID is obtained from the bit vector table according to the fifth user ID and the fourth time interval.
  • The fifth user ID can be matched against the user IDs in the bit vector table to obtain the fifth target bit vector of the user corresponding to the fifth user ID in the fourth time interval, and this fifth target bit vector is used as the basis for the periodicity analysis.
  • In step S703, according to the number of bits of the first operation mode bit vector, the fifth target bit vector is converted into a plurality of first sub-bit vectors arranged in sequence, and similarity judgment is performed between the first operation mode bit vector and each first sub-bit vector to obtain sub-similarities.
  • To judge periodicity, the bit vector obtained by processing the fifth target bit vector must be able to contain multiple occurrences of the first operation mode bit vector; therefore, the length of the fifth target bit vector must be greater than the length of the first operation mode bit vector.
  • The fifth target bit vector can be converted into a plurality of first sub-bit vectors arranged in sequence according to the length of the first operation mode bit vector, and similarity judgment is then performed between the first operation mode bit vector and each first sub-bit vector to obtain the sub-similarity corresponding to each first sub-bit vector.
  • For example, with a first operation mode bit vector of 110, the bit vector 110011011100 corresponding to the operations of user a in the time interval [0, 12) is converted into the first sub-bit vectors 110, 100, 001, 011, 110, 101, 011, 111, 110, and 100, one for each consecutive window of three bits.
  • Similarity judgment between the first operation mode bit vector and each first sub-bit vector then gives the sub-similarities, in order: 3, 2, 0, 1, 3, 1, 1, 2, 3, 2.
  • step S704 a sequence bit vector is determined according to the ordering and sub-similarity of each first sub-bit vector, and the repetition period of the sequence bit vector is obtained.
  • A sequence bit vector can be determined according to the sub-similarity corresponding to each first sub-bit vector.
  • When the sub-similarity is 3, the first sub-bit vector is identical to the first operation mode bit vector, and the corresponding position in the sequence bit vector is 1.
  • When the sub-similarity is 0, 1, or 2, the first sub-bit vector differs from the first operation mode bit vector, and the corresponding position in the sequence bit vector is 0.
  • The sequence bit vector composed from the sub-similarities is therefore 1000100010.
  • The first eight bits of the sequence bit vector are 10001000, which is a repetition of 1000, indicating that the first operation mode repeats with a period of 4 hours.
  • step S705 when the repetition period is the same as the period of the first operation mode, it is determined that the operation behavior of the user corresponding to the fifth user identifier is periodic in the fourth time interval.
  • For example, the operations of user a in the time interval [0, 12) repeat with a period of 4 hours, and the given first operation mode period is also 4, indicating that the operations of user a are periodic in the time interval [0, 12). Further, it can be determined that the periodic behavior starts at the 0th hour and ends at the 10th hour.
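  • The periodicity check of steps S703 to S705 can be sketched as follows for user a's bit vector 110011011100. The operation mode bit vector 110 is inferred from the listed sub-similarities rather than stated explicitly in this text, and the way the repetition period is read off the sequence bit vector (a constant gap between consecutive 1s) is an assumption; the document does not specify that step. All function names are hypothetical.

```python
from typing import List, Optional


def sub_similarities(bits: str, pattern: str) -> List[int]:
    """Similarity of `pattern` to every sliding window of the same length."""
    k = len(pattern)
    return [sum(b == p for b, p in zip(bits[i:i + k], pattern))
            for i in range(len(bits) - k + 1)]


def sequence_bit_vector(bits: str, pattern: str) -> str:
    """1 where a window equals the pattern exactly, 0 elsewhere."""
    k = len(pattern)
    return "".join("1" if s == k else "0" for s in sub_similarities(bits, pattern))


def repetition_period(sequence: str) -> Optional[int]:
    """Gap between consecutive 1s if all gaps are equal, otherwise None."""
    ones = [i for i, b in enumerate(sequence) if b == "1"]
    gaps = {j - i for i, j in zip(ones, ones[1:])}
    return gaps.pop() if len(gaps) == 1 else None


user_a = "110011011100"  # user a in [0, 12), from the document
mode = "110"             # inferred operation mode bit vector
seq = sequence_bit_vector(user_a, mode)
is_periodic = repetition_period(seq) == 4  # 4 is the given operation mode period
print(sub_similarities(user_a, mode), seq, is_periodic)
```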
  • An abnormal operation is a user operation that differs from the user's usual operations. Detecting abnormalities quickly helps the system find the abnormal operation and determine whether it was performed by the user. If it was not, timely measures can be taken to reduce the user's losses.
  • the abnormality judgment in the embodiment of the present application is based on periodic judgment, that is, the user's previous operations are periodic, and when an operation that does not meet the periodic characteristics occurs, it is defined as an abnormal operation.
  • Fig. 8 shows a schematic diagram of the process of judging abnormal operations based on bit vectors. As shown in Fig. 8, the process at least includes steps S801-S805, specifically:
  • step S801 the sixth user identifier, the second operation mode bit vector, the abnormality threshold, and the fifth time interval are acquired.
  • the step S801 is executed in response to a query request of the querying user, for example, the query request includes identification information and time information.
  • the sixth user identification is obtained according to the identification information, for example, is included in the identification information, or may be obtained from a storage device by a server through the identification information.
  • the time information includes a fifth time interval, and the sixth user identifier corresponds to the target user.
  • the second operation mode bit vector and abnormal threshold may be preset.
  • The user identification obtained in this step is recorded as the sixth user identification, the operation mode bit vector is recorded as the second operation mode bit vector, and the time interval is recorded as the fifth time interval; the abnormal threshold for judging abnormal operations is also obtained.
  • step S802 a sixth target bit vector corresponding to the sixth user ID is obtained from the bit vector table according to the sixth user ID and the fifth time interval, wherein the operation of the user corresponding to the sixth user ID is periodic.
  • the sixth user ID is matched with the user ID in the bit vector table to obtain the sixth target bit vector in the fifth time interval corresponding to the sixth user ID.
  • The sixth target bit vector can be processed according to steps S703-S704 shown in FIG. 7, and whether the operations of the user corresponding to the sixth user identifier are periodic in the fifth time interval is determined from the processing result.
  • Only when the user's operations are periodic can it be judged whether they contain an abnormal operation; for non-periodic user operations, it is difficult to determine whether an abnormal operation exists.
  • step S803 the sixth target bit vector is divided into a plurality of second sub-bit vectors according to the number of bits of the second operation mode bit vector.
  • In order to determine which user operations do not conform to the periodicity, and hence that the user's operations are abnormal, the sixth target bit vector needs to be divided into multiple second sub-bit vectors according to the number of bits in the second operation mode bit vector.
  • Take the bit vector 110011011100 corresponding to the operations of user a in the time interval [0, 12) as an example. Given that the second operation mode bit vector is 1100, the bit vector corresponding to user a can be divided into the second sub-bit vectors 1100, 1101, and 1100.
  • step S804 the data of each bit in the second operation mode bit vector is compared with the data of the corresponding bit of each second sub-bit vector to obtain an abnormal count.
  • Take a bit vector corresponding to the operations of user e in the time interval [0, 12) as an example, with the abnormal threshold set to 2.
  • The data of each bit in the second operation mode bit vector is compared with the data of the corresponding bit of the second sub-bit vector.
  • The first bit of the second sub-bit vector is 1 and the first bit of the second operation mode bit vector is 0; the two are different, so the abnormal count is set to 1.
  • The second bit of the second sub-bit vector is 1 and the second bit of the second operation mode bit vector is 1; the two are the same, so the abnormal count remains 1.
  • The third bit of the second sub-bit vector is 1 and the third bit of the second operation mode bit vector is 0; the two are different, so the abnormal count is set to 2.
  • The fourth bit of the second sub-bit vector is 1 and the fourth bit of the second operation mode bit vector is 0; the two are different, so the abnormal count is set to 3.
  • step S805 when the abnormality count is greater than or equal to the abnormality threshold, it is determined that the operation behavior of the user corresponding to the sixth user identifier is abnormal in the fifth time interval.
  • the abnormality count is 3, and the abnormality threshold is 2.
  • the abnormality count is greater than the abnormality threshold, indicating that the operation behavior of the user corresponding to the sixth user identifier is abnormal in the fifth time interval.
  • a warning can be issued to the system to make the system take corresponding measures, such as freezing the user's account, etc., to avoid causing damage to the user's property.
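  • The bitwise comparison of steps S803 to S805 can be sketched as follows. The sub-vector 1111 and the operation mode 0100 are read off the bit-by-bit walk-through above (user e's full bit vector is not reproduced in this text), and the function names are hypothetical. A final group shorter than the mode is compared only over the bits it has, which is an assumption of this sketch.

```python
from typing import List, Tuple


def split_into_groups(bits: str, size: int) -> List[str]:
    """Divide a bit vector into consecutive sub-bit vectors of `size` bits."""
    return [bits[i:i + size] for i in range(0, len(bits), size)]


def abnormal_count(sub_vector: str, mode: str) -> int:
    """Number of positions at which the sub-vector deviates from the mode."""
    return sum(b != m for b, m in zip(sub_vector, mode))


def abnormal_groups(bits: str, mode: str, threshold: int) -> List[Tuple[int, str, int]]:
    """(start index, sub-vector, count) for each group whose count >= threshold."""
    size = len(mode)
    flagged = []
    for index, group in enumerate(split_into_groups(bits, size)):
        count = abnormal_count(group, mode)
        if count >= threshold:
            flagged.append((index * size, group, count))
    return flagged


# Walk-through values: sub-vector 1111 against mode 0100, threshold 2.
print(abnormal_count("1111", "0100"))
# User a against mode 1100: every group deviates in at most one bit.
print(abnormal_groups("110011011100", "1100", threshold=2))
```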
  • the data processing method disclosed in the embodiment of the present application can be used in multiple fields, such as the medical field, the financial field, the service field, and so on.
  • Take the use of electronic wallets as an example.
  • A user pays with the electronic wallet when shopping online and recharges the wallet when its balance is used up. Each recharge or payment is a user operation.
  • The system stores the user operation data in the user operation data table. For example, if user A made a transaction at 17:00 on October 1, 2019 and purchased a set of skin care products worth 800 yuan, the system records user A's consumption behavior, consumption amount, consumption time, and other information in the user operation data table.
  • A trigger then maps the new data and updates the bit vector table associated with the user operation data table.
  • the bit vector table records the user identification and the bit vector corresponding to the user identification, and each bit in the bit vector records whether the user has performed an operation in the corresponding time interval. According to the bit vector table, the user analysis department can obtain the bit vector of the target user from it.
  • By analyzing the bit vector of the target user, the number of operations of the target user in a certain time interval can be obtained. The bit vectors can also be analyzed to determine whether there are users whose operating behavior is similar to that of the target user, so as to cluster the users and further study the operating behavior of each type of user, for example whether the operations of similar users influence one another. Data mining can also be performed on the bit vector table, for example by analyzing the bit vector of the target user to determine whether the target user's operating behavior in a certain time interval is periodic.
  • It can also be judged whether the target user's operation behavior is abnormal; when an abnormality is determined, a warning can be issued in time, and the use of the target user's electronic wallet can be controlled by the system to avoid unnecessary losses.
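  • The mapping from rows of the user operation data table to a bit vector, as performed when new data arrives, can be sketched as below. This is an illustration under assumptions: a one-hour time granularity, one bit vector per day, a hypothetical function name, and made-up operation times apart from the 17:00 purchase on October 1, 2019 mentioned above. Note that the bit vector stores only whether an operation occurred in each slot, not the amount or the goods.

```python
from datetime import datetime, timedelta
from typing import Iterable


def to_bit_vector(operation_times: Iterable[datetime], origin: datetime,
                  n_bits: int, granularity: timedelta = timedelta(hours=1)) -> str:
    """Set bit i when at least one operation falls in the i-th time slot
    counted from `origin`; operations outside the covered range are ignored."""
    bits = ["0"] * n_bits
    for moment in operation_times:
        index = (moment - origin) // granularity  # slot number, floor division
        if 0 <= index < n_bits:
            bits[index] = "1"
    return "".join(bits)


origin = datetime(2019, 10, 1)
operations = [
    datetime(2019, 10, 1, 17, 0),   # the purchase described in the document
    datetime(2019, 10, 1, 17, 30),  # hypothetical second operation, same slot
    datetime(2019, 10, 1, 20, 15),  # hypothetical later operation
]
day_vector = to_bit_vector(operations, origin, n_bits=24)
print(day_vector)
```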
  • The bit vector is a binary sequence composed of 0s and 1s. When the proportion of 0s is relatively large, the bit vector can be compressed to improve the utilization of storage space.
  • The compressed vector is organized into groups of p bits.
  • For example, a group of the compressed vector contains 8 bits, of which the first bit is a flag bit indicating the meaning of the 7 bits that follow it in the group.
  • If the first bit is 1, the following 7 bits are an uncompressed segment of the bit vector; if the first bit is 0, the following 7 bits are a counter recording the number of consecutive 0s that have been compressed.
  • For example, in the group 10000100, the first bit is 1, indicating that the following 7 bits 0000100 are uncompressed bits; in the group 00000100, the first bit is 0, indicating that the following 7 bits 0000100 record the number of compressed consecutive 0s, a total of 4 0s.
  • B [0, 51) 0010010000000000000000000000000000001100011.
  • For B[0, 7), the first bit of the first group of the compressed vector is 1, indicating no compression, and the remaining 7 bits are the same as B[0, 7); that is, the first group is 10010010. The bits in B[7, 44) are all 0s, so the first bit of the second group is 0 and the remaining 7 bits record the number of compressed 0s.
  • The number of 0s in B[7, 44) is 37, which is 100101 in binary, or 0100101 as a 7-bit binary number, so the second group is 00100101.
  • The bits in B[44, 51) contain 1s, so the first bit of the third group is 1 and the remaining 7 bits are the same as B[44, 51).
  • the compression ratio is related to two aspects, the number of consecutive 0s in the bit vector and the group size set in the compressed bit vector.
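  • The grouping scheme can be sketched as a small encoder. The sketch makes several assumptions that the text does not settle: a run of 0s is stored as a count only when it is at least as long as the 7-bit payload, a final literal group is padded with 0s (so the original length must be stored separately), and the example vector is rebuilt from the description as 7 literal bits, 37 zeros, and 7 literal bits. The function name is hypothetical.

```python
from typing import List


def compress(bits: str, group_size: int = 8) -> List[str]:
    """Encode a bit string into flag-prefixed groups of `group_size` bits."""
    payload = group_size - 1
    max_run = (1 << payload) - 1  # largest count a payload can hold (127 for 7 bits)
    groups = []
    i = 0
    while i < len(bits):
        run = 0
        while i + run < len(bits) and bits[i + run] == "0" and run < max_run:
            run += 1
        if run >= payload:
            # Flag 0: the payload is the number of consecutive 0s.
            groups.append("0" + format(run, f"0{payload}b"))
            i += run
        else:
            # Flag 1: the payload is the next `payload` bits, copied verbatim.
            groups.append("1" + bits[i:i + payload].ljust(payload, "0"))
            i += payload
    return groups


bit_vector = "0010010" + "0" * 37 + "1100011"  # 51 bits, per the description
compressed = compress(bit_vector)
print(compressed, len(compressed) * 8, "bits instead of", len(bit_vector))
```

  • For this vector the encoder produces three groups, 24 bits in total, which illustrates how a long run of 0s and a small group size together determine the compression ratio.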
  • When operations need to be performed on a compressed bit vector, the compressed bit vector can first be read from the database and decompressed to obtain the corresponding bit vector, and data processing is then performed on that bit vector.
  • FIG. 9 shows a schematic diagram of the process of decompressing the compressed bit vector. As shown in FIG. 9, in step S901, the compressed vector and the query interval corresponding to the bit vector to be processed are obtained, where the query interval includes a start bit and a stop bit.
  • In step S902, the compressed vector is divided into multiple compressed bit vectors according to the number of bits of a compressed bit vector, and the compressed bit vectors are decompressed in sequence until the decompressed bit vector has more bits than the start bit. In step S903, the vector values in the decompressed bit vector whose positions are at or beyond the start bit are used as vector values of the bit vector to be processed. In step S904, if the number of these vector values is less than the difference between the stop bit and the start bit, the compressed bit vectors adjacent to the decompressed one are decompressed to obtain the vector values of the remaining bits of the bit vector to be processed.
  • For example, the query interval of the given bit vector to be processed is from the 40th hour to the 50th hour; that is, with a time granularity of 1 hour, the start bit is 40 and the stop bit is 50.
  • The compressed vector corresponding to B[0, 51) is obtained at the same time. The compressed vector is then scanned according to the preset group size and divided into multiple compressed bit vectors; with a preset group size of 8 bits, that is, a compressed bit vector length of 8, the compressed vector is divided into compressed bit vectors of length 8. Each compressed bit vector is then decompressed in turn. The first compressed bit vector is 10010010; its first bit is 1, indicating that the following seven bits are not compressed, so the first group stores the first 7 bits of the bit vector.
  • Since 7 is less than the start bit 40, the first compressed bit vector does not contain any bits of B[40, 51). The second compressed bit vector is 00100101; its first bit is 0, indicating that the following seven bits record the number of compressed consecutive 0s, 37 in total. The seven bits of the first group and the 37 0s of the second group make 44 bits in total, which is greater than the start bit 40, so the second compressed bit vector contains the first four bits of B[40, 51), namely 0000. Because B[40, 51) contains eleven bits, the third compressed bit vector must also be decompressed to obtain the remaining bits.
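  • Steps S901 to S904 can be sketched as a decoder that walks the groups in order and keeps only the bits that fall inside the query interval. The function name and the half-open interval convention [start, stop) are assumptions, and the third group 11100011 is derived from the description of B[44, 51) rather than quoted from the text.

```python
from typing import List


def decompress_interval(groups: List[str], start: int, stop: int) -> str:
    """Return bits [start, stop) of the original vector, decoding groups in
    order and stopping as soon as the interval is covered."""
    out = []
    pos = 0  # position in the original bit vector of the next decoded bit
    for group in groups:
        if pos >= stop:
            break
        flag, payload = group[0], group[1:]
        length = len(payload) if flag == "1" else int(payload, 2)
        end = pos + length
        if end > start:  # this group overlaps the query interval
            lo = max(start, pos) - pos
            hi = min(stop, end) - pos
            out.append(payload[lo:hi] if flag == "1" else "0" * (hi - lo))
        pos = end
    return "".join(out)


compressed = ["10010010", "00100101", "11100011"]
queried = decompress_interval(compressed, 40, 51)
print(queried)
```

  • For the interval [40, 51) the first group is skipped, the second contributes four 0s, and the third contributes its seven literal bits, eleven bits in total.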
  • The embodiments of this application map user operation data to bit vectors indexed by time and can answer queries related to user operations, such as the tasks and target results mentioned in the above embodiments. Because the basic operations on bit vectors are well supported at the lowest level of the computer, data processing efficiency is improved and results can be returned quickly.
  • the data processing method in the embodiment of the present application can improve data processing efficiency and accuracy, provide better data support for user analysis departments, and can also be used as a preprocessing process for other data query or data mining work to improve processing efficiency.
  • the data processing method in the embodiment of the present application can well protect the user's privacy and avoid the leakage of the user's privacy. Furthermore, the bit vector can be compressed and stored during storage, which can save a lot of storage space and avoid the problems of insufficient storage space and reduced data processing efficiency caused by a large bit vector.
  • Fig. 10 schematically shows a block diagram of a data processing device according to an embodiment of the present application.
  • a data processing device 1000 includes: an acquisition module 1001 and an operation module 1002.
  • The obtaining module 1001 is configured to obtain, in response to a query request, a bit vector table related to the operation data of the target user, wherein the query request includes identification information and time information, the bit vector table includes a user identification and a bit vector corresponding to the user identification, and the identification information corresponds to the user identification. The arithmetic module 1002 is configured to obtain the target bit vector from the bit vector table according to the identification information and the time information, and to perform logical processing on the target bit vector to obtain target information.
  • the bit vector includes user operation information in each time granularity.
  • The arithmetic module 1002 is configured to: obtain a first user identification and a first time interval; obtain, from the bit vector table according to the first user identification and the first time interval, a first target bit vector corresponding to the first user identification; and count the first target bit vector to obtain the number of operations performed by the user corresponding to the first user identification in the first time interval.
  • the first user identification is obtained according to the identification information, for example, is included in the identification information, or may be obtained from a storage device by a data processing apparatus through the identification information.
  • the time information includes a first time interval.
  • the first user identifier corresponds to the target user.
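  • The counting performed by the arithmetic module can be sketched as a table lookup followed by a count of 1 bits. The table layout (a dictionary from user ID to a bit string with one bit per hour) and the function names are assumptions for illustration; the vector for user a is the one used elsewhere in this document. Strictly, the count is the number of time slots in which the user operated.

```python
from typing import Dict

# Hypothetical in-memory bit vector table: user ID -> bit string.
BIT_VECTOR_TABLE: Dict[str, str] = {"a": "110011011100"}


def target_bit_vector(table: Dict[str, str], user_id: str,
                      start: int, stop: int) -> str:
    """Slice the stored bit vector of `user_id` to the interval [start, stop)."""
    vector = table[user_id]
    if not 0 <= start <= stop <= len(vector):
        raise ValueError("time interval outside the stored bit vector")
    return vector[start:stop]


def count_operations(table: Dict[str, str], user_id: str,
                     start: int, stop: int) -> int:
    """Count the 1 bits of the target bit vector."""
    return target_bit_vector(table, user_id, start, stop).count("1")


print(count_operations(BIT_VECTOR_TABLE, "a", 0, 12))
```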
  • The arithmetic module 1002 is configured to: obtain a second user ID, a user ID to be compared, and a second time interval; obtain, from the bit vector table according to the second user ID, the user ID to be compared, and the second time interval, a second target bit vector corresponding to the second user identification and a target bit vector to be compared corresponding to the user identification to be compared; perform an exclusive OR operation on the second target bit vector and the target bit vector to be compared, and perform a negation operation on the result of the exclusive OR operation to obtain a comparison target bit vector; and count the comparison target bit vector to obtain the operation similarity, in the second time interval, between the user corresponding to the second user identifier and the user corresponding to the user identifier to be compared.
  • the second user identification and the user identification to be compared are obtained according to the identification information, for example, included in the identification information, or may be obtained from a storage device by the data processing apparatus through the identification information.
  • the time information includes a second time interval.
  • the second user ID and the user ID to be compared correspond to the target user.
  • The arithmetic module 1002 includes: an information acquisition unit, configured to acquire a third user identification, a fourth user identification, a similarity threshold, and a third time interval; a bit vector acquiring unit, configured to obtain, from the bit vector table according to the third user ID, the fourth user ID, and the third time interval, a third target bit vector corresponding to the third user ID and a fourth target bit vector corresponding to the fourth user ID; a similarity obtaining unit, configured to perform a shift operation on the fourth target bit vector to obtain a shift target bit vector, and to perform similarity judgment on the shift target bit vector and the third target bit vector to obtain a similarity; and a comparison unit, configured to compare the similarity with the similarity threshold and to determine, according to the comparison result, whether the operation of the user corresponding to the third user identifier affects the operation of the user corresponding to the fourth user identifier in the third time interval.
  • the third user identification and the fourth user identification are obtained according to the identification information, for example, are included in the identification information, or may be obtained from a storage device by the data processing apparatus through the identification information.
  • the time information includes a third time interval.
  • the third user identification and the fourth user identification correspond to the target user.
  • The similarity obtaining unit is configured to: shift the fourth target bit vector to the left according to the shift unit to obtain the shift target bit vector; perform an XOR operation on the third target bit vector and the shift target bit vector, and negate the result of the XOR operation to obtain a similarity target bit vector; and count the similarity target bit vector to obtain the similarity.
  • The comparing unit is configured to: when the similarity is greater than or equal to the similarity threshold, determine that, within the third time interval, the operation of the user corresponding to the third user identifier has an impact on the operation of the user corresponding to the fourth user identification; and when the similarity is less than the similarity threshold, repeat the method described in the foregoing embodiment until the number of bits by which the fourth target bit vector has been shifted to the left reaches the shift threshold.
  • The arithmetic module 1002 is configured to: obtain a fifth user identifier, a first operation mode bit vector, a first operation mode period, and a fourth time interval; obtain, according to the fifth user identifier and the fourth time interval, a fifth target bit vector corresponding to the fifth user identifier from the bit vector table; divide the fifth target bit vector into a plurality of first sub-bit vectors according to the number of bits of the first operation mode bit vector, and obtain a sub-similarity for each first sub-bit vector; determine a sequence bit vector according to the ordering of the first sub-bit vectors and the sub-similarities, and obtain the repetition period of the sequence bit vector; and when the repetition period is the same as the first operation mode period, determine that the operation behavior of the user corresponding to the fifth user identifier is periodic within the fourth time interval.
  • The fifth user identifier is obtained according to the identification information; for example, it may be included in the identification information, or the data processing apparatus may use the identification information to obtain it from a storage device.
  • The time information includes the fourth time interval.
  • The fifth user identifier corresponds to the target user.
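The periodicity check described above can be illustrated with a short Python sketch. The text does not say how a sub-similarity becomes a bit of the sequence bit vector, nor how the repetition period is measured, so the sketch makes several assumptions: bit vectors are strings of `'0'` and `'1'`, a sub-similarity is the number of matching positions, a sequence bit is 1 when the sub-similarity reaches a hypothetical `min_matches` parameter, and the repetition period is the smallest shift (at most half the sequence length) under which the sequence repeats. All function names are hypothetical.

```python
from typing import List, Optional


def split_chunks(bits: str, size: int) -> List[str]:
    """Split into consecutive sub-bit vectors of `size` bits, dropping a short tail."""
    return [bits[i:i + size] for i in range(0, len(bits) - size + 1, size)]


def sequence_bits(target: str, pattern: str, min_matches: int) -> str:
    """Build the sequence bit vector: one bit per sub-bit vector, in order."""
    seq = []
    for chunk in split_chunks(target, len(pattern)):
        matches = sum(1 for a, b in zip(chunk, pattern) if a == b)  # sub-similarity
        seq.append("1" if matches >= min_matches else "0")
    return "".join(seq)


def repetition_period(seq: str) -> Optional[int]:
    """Smallest p (at most half the length) with seq[i] == seq[i + p] for all i."""
    n = len(seq)
    for p in range(1, n // 2 + 1):
        if all(seq[i] == seq[i + p] for i in range(n - p)):
            return p
    return None


def is_periodic(target: str, pattern: str, min_matches: int,
                expected_period: int) -> bool:
    """Compare the measured repetition period with the operation mode period."""
    return repetition_period(sequence_bits(target, pattern, min_matches)) == expected_period
```

With the pattern `"1100"` and the target `"1100000011000000"`, every second sub-bit vector matches the pattern exactly, so the sequence bit vector is `"1010"` and its repetition period is 2.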
  • The arithmetic module 1002 is configured to: obtain a sixth user identifier, a second operation mode bit vector, an abnormality threshold, and a fifth time interval; obtain, according to the sixth user identifier and the fifth time interval, a sixth target bit vector corresponding to the sixth user identifier from the bit vector table, where the operation of the user corresponding to the sixth user identifier is periodic; divide the sixth target bit vector into a plurality of second sub-bit vectors according to the number of bits of the second operation mode bit vector; compare the data of each bit of the second operation mode bit vector with the data of the corresponding bit of each second sub-bit vector to obtain an abnormality count; and when the abnormality count is greater than or equal to the abnormality threshold, determine that the operation behavior of the user corresponding to the sixth user identifier is abnormal within the fifth time interval.
  • The sixth user identifier is obtained according to the identification information; for example, it may be included in the identification information, or the data processing apparatus may use the identification information to obtain it from a storage device.
  • The time information includes the fifth time interval.
  • The sixth user identifier corresponds to the target user.
  • The second operation mode bit vector and the abnormality threshold may be preset.
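As a rough illustration of the abnormality check, the following Python sketch divides a target bit vector into sub-bit vectors the size of the operation mode bit vector and counts disagreements. It assumes bit vectors are strings of `'0'` and `'1'` and that the abnormality count is the total number of mismatching bit positions across all sub-bit vectors; the text would equally allow counting mismatching sub-bit vectors instead. The function names are hypothetical.

```python
def abnormal_count(target: str, pattern: str) -> int:
    """Count bit positions where a sub-bit vector differs from the pattern."""
    size = len(pattern)
    count = 0
    # Walk over consecutive sub-bit vectors; a short tail is ignored.
    for start in range(0, len(target) - size + 1, size):
        chunk = target[start:start + size]
        count += sum(1 for a, b in zip(chunk, pattern) if a != b)
    return count


def is_abnormal(target: str, pattern: str, threshold: int) -> bool:
    """Abnormal when the count reaches or exceeds the abnormality threshold."""
    return abnormal_count(target, pattern) >= threshold
```

For the pattern `"1100"` and the target `"110011011000"`, the sub-bit vectors are `"1100"`, `"1101"`, and `"1000"`, which differ from the pattern in 0, 1, and 1 positions, giving a count of 2.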
  • The data processing device 1000 further includes: a bit vector table generating module, configured to generate a user operation data table according to user operation data, and to generate, according to the user operation data table, the bit vector table associated with the user operation data table, where the users include the target user; and a bit vector table update module, configured to monitor the user operation data in the user operation data table and, when the user operation data changes, map the changed user operation data to update the bit vector in the bit vector table.
  • The user operation data table is provided with a trigger; the bit vector table update module is configured to: monitor the user operation data table; and when data in the user operation data table changes, have the trigger initiate the mapping of the changed user operation data to update the bit vector in the bit vector table.
  • The bit vector table update module is configured to: determine, from the user operation data table, the target user identifier corresponding to the changed user operation data; obtain the first bit vector corresponding to the target user identifier from the bit vector table, and map the changed user operation data to obtain the second bit vector; perform an OR operation on the first bit vector and the second bit vector to obtain a third bit vector; and replace the first bit vector with the third bit vector to update the bit vector in the bit vector table.
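The trigger-driven OR update can be sketched with SQLite through Python's standard `sqlite3` module. This is an illustration only: the choice of SQLite, the table and column names (`user_ops`, `bit_vectors`, `slot`), and the mapping of each operation to a single bit indexed by a time slot are all assumptions rather than details taken from the application. The sketch also assumes `slot` stays in the range 0 to 62 so that the vector fits in a signed 64-bit integer.

```python
import sqlite3

SCHEMA = """
CREATE TABLE user_ops (user_id TEXT NOT NULL, slot INTEGER NOT NULL);
CREATE TABLE bit_vectors (user_id TEXT PRIMARY KEY, bits INTEGER NOT NULL);

-- Fires on every new operation row and ORs the mapped bit into the
-- stored vector (first vector | second vector -> third vector).
CREATE TRIGGER user_ops_after_insert AFTER INSERT ON user_ops
BEGIN
    INSERT OR IGNORE INTO bit_vectors (user_id, bits) VALUES (NEW.user_id, 0);
    UPDATE bit_vectors
       SET bits = bits | (1 << NEW.slot)
     WHERE user_id = NEW.user_id;
END;
"""


def open_db() -> sqlite3.Connection:
    """Create an in-memory database holding both tables and the trigger."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def record_operation(conn: sqlite3.Connection, user_id: str, slot: int) -> None:
    """Insert one operation; the trigger updates the bit vector table."""
    conn.execute("INSERT INTO user_ops (user_id, slot) VALUES (?, ?)", (user_id, slot))


def bits_for(conn: sqlite3.Connection, user_id: str) -> int:
    """Read the current bit vector of a user (0 if the user has no row)."""
    row = conn.execute(
        "SELECT bits FROM bit_vectors WHERE user_id = ?", (user_id,)
    ).fetchone()
    return row[0] if row else 0
```

Because OR is idempotent, recording the same slot twice leaves the stored vector unchanged, which is one reason an OR merge suits this kind of incremental update.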
  • The bit vector is a compressed bit vector, and the first bit of the compressed bit vector is a flag bit.
  • When the flag bit is 1, the remaining bits after the first bit are an uncompressed bit vector; when the flag bit is 0, the remaining bits after the first bit give the number of consecutive 0s that have been compressed.
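The flag-bit layout can be made concrete with a small Python decoder. The sketch assumes that compressed bit vectors are fixed-size words written as strings of `'0'` and `'1'`, and that the zero count is stored as an unsigned binary number; the word size of 8 bits used in the example and the function names are hypothetical.

```python
def decode_word(word: str) -> str:
    """Decode one compressed bit vector into the bits it stands for."""
    flag, payload = word[0], word[1:]
    if flag == "1":
        return payload                 # flag 1: payload is stored uncompressed
    return "0" * int(payload, 2)       # flag 0: payload counts consecutive 0s


def decompress(compressed: str, word_bits: int) -> str:
    """Split a compressed vector into words and decode them in order."""
    words = [compressed[i:i + word_bits]
             for i in range(0, len(compressed), word_bits)]
    return "".join(decode_word(w) for w in words)
```

With 8-bit words, `"11010011"` decodes to the 7 literal bits `"1010011"`, while `"00001010"` decodes to a run of ten 0s, so a long idle period costs one word regardless of its length (up to the largest count the payload can hold).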
  • The data processing device 1000 further includes: an acquiring module, configured to acquire a compressed vector and a query interval corresponding to the bit vector to be processed, the query interval including a start bit number and a stop bit number; a decompression module, configured to divide the compressed vector into a plurality of compressed bit vectors according to the number of bits of a compressed bit vector, and to decompress the compressed bit vectors in sequence until a decompressed bit vector whose number of bits is greater than the start bit number is obtained, the vector values of the decompressed bit vector at bit positions greater than the start bit number being taken as vector values of the bit vector to be processed; and a bit-compensation module, configured to, when the number of vector values is less than the difference between the stop bit number and the start bit number, decompress the compressed bit vector adjacent to the decompressed bit vector to obtain the vector values of the remaining bits of the bit vector to be processed.
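The interval query over a compressed vector can be sketched as a loop that decodes words only until the requested bits are covered. The sketch reuses the flag-bit layout described earlier (flag 1 for literal bits, flag 0 for a count of consecutive 0s) and assumes string-encoded words, 0-based bit positions, and a half-open interval `[start, stop)`; the text does not state whether the stop bit is inclusive. The function name `query_interval` is hypothetical.

```python
def query_interval(compressed: str, word_bits: int, start: int, stop: int) -> str:
    """Return decompressed bits [start, stop) without decoding past `stop`."""
    out = []
    pos = 0                                    # bits decompressed so far
    for i in range(0, len(compressed), word_bits):
        if pos >= stop:                        # interval fully covered
            break
        word = compressed[i:i + word_bits]
        payload = word[1:]
        # Flag 1: literal payload; flag 0: run of consecutive 0s.
        chunk = payload if word[0] == "1" else "0" * int(payload, 2)
        end = pos + len(chunk)
        if end > start:
            # Keep only the part of this chunk that lies inside the interval.
            # If the interval is still incomplete, the next (adjacent) word
            # supplies the remaining bits on the following iteration.
            out.append(chunk[max(start - pos, 0):stop - pos])
        pos = end
    return "".join(out)
```

For the compressed vector `"11010011" + "00001010" + "11111111"` with 8-bit words, the query `[5, 19)` takes the last two bits of the first word, the whole run of ten 0s, and the first two bits of the third word.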
  • Fig. 11 shows a schematic structural diagram of a computer system suitable for implementing an electronic device according to an embodiment of the present application.
  • The computer system 1100 includes a central processing unit (CPU) 1101, which can perform various appropriate actions and processing according to a program stored in a read-only memory (ROM) 1102 or a program loaded from a storage part 1108 into a random access memory (RAM) 1103, so as to implement the data processing method described in the foregoing embodiment. The RAM 1103 also stores various programs and data required for system operation.
  • the CPU 1101, the ROM 1102, and the RAM 1103 are connected to each other through a bus 1104.
  • An input/output (Input/Output, I/O) interface 1105 is also connected to the bus 1104.
  • The following components are connected to the I/O interface 1105: an input part 1106 including a keyboard, a mouse, and the like; an output part 1107 including a cathode ray tube (CRT) display, a liquid crystal display (LCD), a speaker, and the like; a storage part 1108 including a hard disk and the like; and a communication part 1109 including a network interface card such as a local area network (LAN) card or a modem.
  • The communication part 1109 performs communication processing via a network such as the Internet.
  • A drive 1110 is also connected to the I/O interface 1105 as needed.
  • A removable medium 1111, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 1110 as needed, so that a computer program read from it can be installed into the storage part 1108 as needed.
  • the process described below with reference to the flowchart can be implemented as a computer software program.
  • the embodiments of the present application include a computer program product, which includes a computer program carried on a computer-readable medium, and the computer program contains program code for executing the method shown in the flowchart.
  • the computer program may be downloaded and installed from the network through the communication part 1109, and/or installed from the removable medium 1111.
  • CPU central processing unit
  • the computer-readable medium shown in the embodiment of the present application may be a computer-readable signal medium or a computer-readable storage medium, or any combination of the two.
  • The computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above.
  • Computer-readable storage media may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM), flash memory, an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
  • a computer-readable storage medium may be any tangible medium that contains or stores a program, and the program may be used by or in combination with an instruction execution system, apparatus, or device.
  • a computer-readable signal medium may include a data signal propagated in a baseband or as a part of a carrier wave, and a computer-readable program code is carried therein.
  • This propagated data signal can take many forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • the computer-readable signal medium may also be any computer-readable medium other than the computer-readable storage medium.
  • The computer-readable medium may send, propagate, or transmit the program for use by or in combination with the instruction execution system, apparatus, or device.
  • the program code contained on the computer-readable medium can be transmitted by any suitable medium, including but not limited to: wireless, wired, etc., or any suitable combination of the above.
  • Each block in the flowchart or block diagram may represent a module, a program segment, or a part of code, and the module, program segment, or part of code contains one or more executable instructions for realizing the specified logical function.
  • The functions noted in the blocks may also occur in a different order from that marked in the drawings. For example, two blocks shown in succession can actually be executed substantially in parallel, and they can sometimes be executed in the reverse order, depending on the functions involved.
  • Each block in the block diagram or flowchart, and each combination of blocks in the block diagram or flowchart, can be implemented by a dedicated hardware-based system that performs the specified function or operation, or by a combination of dedicated hardware and computer instructions.
  • The units involved in the embodiments described in the present application may be implemented in software or hardware, and the described units may also be provided in a processor. Under certain circumstances, the names of these units do not constitute a limitation on the units themselves.
  • this application also provides a computer-readable medium.
  • The computer-readable medium may be included in the data processing device described in the above-mentioned embodiment, or it may exist alone without being assembled into the electronic device.
  • the foregoing computer-readable medium carries one or more programs, and when the foregoing one or more programs are executed by an electronic device, the electronic device realizes the method described in the foregoing embodiment.
  • Although several modules or units of the device for performing actions are mentioned in the detailed description above, this division is not mandatory.
  • the features and functions of two or more modules or units described above may be embodied in one module or unit.
  • the features and functions of a module or unit described above can be further divided into multiple modules or units to be embodied.
  • The example embodiments described here can be implemented by software, or by software combined with the necessary hardware. Therefore, the technical solution according to the embodiments of the present application can be embodied in the form of a software product, which can be stored in a non-volatile storage medium (such as a CD-ROM, a USB flash drive, or a removable hard disk) or on a network, and which includes several instructions that cause a computing device (such as a personal computer, a server, a touch terminal, or a network device) to execute the method according to the embodiments of the present application.

Abstract

The present application relates to the field of computers, and concerns a data processing method, an electronic device, and a readable storage medium. The method comprises: in response to a query request, obtaining a bit vector table relating to operation data of a target user, the query request comprising identification information and time information, the bit vector table comprising a user identifier and a bit vector corresponding to the user identifier, and the identification information corresponding to the target user; and obtaining a target bit vector from the bit vector table according to the identification information and the time information, and performing logical processing on the target bit vector to obtain target information.
PCT/CN2020/117623 2019-11-15 2020-09-25 Data processing method, electronic device, and readable storage medium WO2021093472A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201911122281.7 2019-11-15
CN201911122281.7A CN111159515B (zh) 2019-11-15 2019-11-15 Data processing method, apparatus, and electronic device

Publications (1)

Publication Number Publication Date
WO2021093472A1 (fr) 2021-05-20

Family

ID=70555961

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/117623 WO2021093472A1 (fr) 2019-11-15 2020-09-25 Data processing method, electronic device, and readable storage medium

Country Status (2)

Country Link
CN (1) CN111159515B (fr)
WO (1) WO2021093472A1 (fr)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
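CN111159515B (zh) * 2019-11-15 2024-05-28 腾讯科技(深圳)有限公司 Data processing method, apparatus, and electronic device
CN111724148B (zh) * 2020-06-22 2024-03-22 深圳前海微众银行股份有限公司 Transaction broadcasting method and node based on a blockchain system
CN113946617A (zh) * 2021-10-29 2022-01-18 北京锐安科技有限公司 Data processing method and apparatus, electronic device, and storage medium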

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105373614A (zh) * 2015-11-24 2016-03-02 中国科学院深圳先进技术研究院 Sub-user identification method and system based on user account
CN108989383A (zh) * 2018-05-31 2018-12-11 阿里巴巴集团控股有限公司 Data processing method and client
US10425353B1 (en) * 2017-01-27 2019-09-24 Triangle Ip, Inc. Machine learning temporal allocator
CN111159515A (zh) * 2019-11-15 2020-05-15 腾讯科技(深圳)有限公司 Data processing method, apparatus, and electronic device

Family Cites Families (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2798480B1 (fr) * 2011-12-30 2018-09-26 Intel Corporation Vector frequency compress instruction
US9002903B2 (en) * 2013-03-15 2015-04-07 Wisconsin Alumni Research Foundation Database system with data organization providing improved bit parallel processing
CN103559274B (zh) * 2013-11-05 2016-08-31 中国联合网络通信集团有限公司 Vehicle condition information query method and apparatus
CN104765790B (zh) * 2015-03-24 2019-09-20 北京大学 Data query method and apparatus
US10467215B2 (en) * 2015-06-23 2019-11-05 Microsoft Technology Licensing, Llc Matching documents using a bit vector search index
CN107545021B (zh) * 2017-05-10 2020-12-11 新华三信息安全技术有限公司 Data storage method and apparatus
CN110019331A (zh) * 2017-09-08 2019-07-16 北京京东尚科信息技术有限公司 Method and apparatus for querying a database based on structured query language
CN110111167A (zh) * 2018-02-01 2019-08-09 北京京东尚科信息技术有限公司 Method and apparatus for determining a recommended object
CN110223093B (zh) * 2018-03-02 2024-04-16 北京京东尚科信息技术有限公司 Method and apparatus for commodity recommendation
CN108829572A (zh) * 2018-05-30 2018-11-16 北京奇虎科技有限公司 Method and apparatus for analyzing user login behavior
CN109687991B (zh) * 2018-09-07 2023-04-18 平安科技(深圳)有限公司 User behavior recognition method, apparatus, device, and storage medium
CN109657890B (zh) * 2018-09-14 2023-04-25 蚂蚁金服(杭州)网络技术有限公司 Method and apparatus for determining transfer fraud risk
CN110362700B (zh) * 2019-06-17 2023-09-22 中国平安财产保险股份有限公司 Data processing method and apparatus, computer device, and storage medium
CN110365748B (zh) * 2019-06-24 2022-11-08 深圳市腾讯计算机系统有限公司 Service data processing method and apparatus, storage medium, and electronic apparatus

Also Published As

Publication number Publication date
CN111159515A (zh) 2020-05-15
CN111159515B (zh) 2024-05-28

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20886262

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20886262

Country of ref document: EP

Kind code of ref document: A1