CN112579608B - Case data query method, system, equipment and computer readable storage medium - Google Patents

Case data query method, system, equipment and computer readable storage medium

Info

Publication number
CN112579608B
Authority
CN
China
Prior art keywords
data
query
target data
target
sequence number
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011558721.6A
Other languages
Chinese (zh)
Other versions
CN112579608A (en)
Inventor
周维健
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Bank Co Ltd
Original Assignee
Ping An Bank Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Bank Co Ltd
Priority to CN202011558721.6A
Publication of CN112579608A
Application granted
Publication of CN112579608B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/221 Column-oriented storage; Management thereof
    • G06F16/2228 Indexing structures
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2474 Sequence data queries, e.g. querying versioned data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Computing Systems (AREA)
  • Fuzzy Systems (AREA)
  • Mathematical Physics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

An embodiment of the invention provides a case data query method, comprising: obtaining a query account number and query conditions from a data query instruction; converting the query account number and the query conditions into index data, wherein the index data comprises a first row key and a column cluster field, the first row key comprises query account reverse order data, a time range and a first serial number, and the first serial number represents a data acquisition position; determining a target partition according to the query account reverse order data; determining a data query range in the target partition according to the time range; extracting one target data block in the data query range according to the column cluster field and the first sequence number, or extracting a plurality of target data blocks in the data query range according to the column cluster field, the first sequence number and the sequence numbers subsequent to the first sequence number; and splitting the one target data block or the plurality of target data blocks to obtain a target data set. The embodiment improves the query efficiency for judicial case data and meets the timeliness requirement of querying massive judicial case data.

Description

Case data query method, system, equipment and computer readable storage medium
Technical Field
The embodiment of the invention relates to the technical field of big data, in particular to a case data query method, a case data query system, computer equipment and a computer readable storage medium.
Background
Banks are obliged to extract data in cooperation with judicial investigations of cases. For data query requirements in judicial scenarios, particularly transaction flow queries, the usual approach is to extract the data manually according to the retrieval request.
Because the underlying data store is a Hive data warehouse, which is a data warehouse infrastructure built on a distributed system infrastructure, queries over large volumes of data are extremely inefficient and consume substantial resources. Alternatively, a search engine can be used for paged queries, with case data retrieved by turning through multiple pages; however, such multi-page queries cannot meet the timeliness required in judicial queries.
The inventors therefore found that both manual data queries and search-engine-based queries of judicial case data share the same drawbacks: low query efficiency and poor timeliness.
Disclosure of Invention
In view of the above, embodiments of the present invention provide a case data query method, system, computer device, and computer readable storage medium, which solve the problems of low query efficiency and poor timeliness that arise when judicial case data is queried manually or by means of a search engine.
The embodiment of the invention solves the technical problems through the following technical scheme:
a case data query method, comprising:
Receiving a data query instruction, and acquiring a query account number and query conditions from the data query instruction;
Converting the query account number and the query condition into index data, wherein the index data comprises a first row key and a column cluster field, the first row key comprises query account number reverse order data, a time range and a first serial number, and the first serial number represents a data acquisition position;
Determining a target partition according to the query account reverse order data;
determining a data query range in the target partition according to the time range;
Extracting a target data block in the data query range according to the column cluster field and the first sequence number, or extracting a plurality of target data blocks in the data query range according to the column cluster field and a plurality of continuous sequence numbers, wherein the plurality of sequence numbers comprise the first sequence number and at least one subsequent sequence number continuous with the first sequence number; and
dividing the one target data block or the plurality of target data blocks to obtain a target data set.
Optionally, the method comprises:
Creating a plurality of partitions in an initial database;
defining a start key and a stop key for each partition of the plurality of partitions, respectively, to generate a preset database.
Optionally, the method comprises:
acquiring a plurality of sample data of a plurality of sample accounts;
generating a plurality of row keys corresponding to a plurality of sample data of the plurality of sample accounts based on a preset row key rule;
writing the plurality of sample data into the plurality of partitions of the preset database when the plurality of row keys meet preset conditions.
Optionally, the step of writing the plurality of sample data into a plurality of partitions of the database when the plurality of row keys satisfy a preset condition includes:
determining a key range based on a start key and a stop key corresponding to the plurality of partitions;
writing the sample data of the plurality of sample accounts into the preset database when the plurality of row keys are located within the key range.
Optionally, the subsequent sequence number in the sequence numbers is an ith sequence number, i is a positive integer, and the initial value of i is 2;
The step of extracting a plurality of target data blocks in the data query range according to the column cluster field, the first sequence number and the sequence number subsequent to the first sequence number includes:
Extracting an ith target data block from the data query range according to the column cluster field and an ith sequence number in the plurality of sequence numbers;
judging whether the data quantity in the ith target data block meets a data quantity threshold value or not:
If the data quantity in the ith target data block meets the data quantity threshold, generating an (i+1) th sequence number in the plurality of sequence numbers according to the ith sequence number, and acquiring an (i+1) th target data block in the data query range according to the column cluster field and the (i+1) th sequence number;
if the data quantity in the ith target data block does not meet the data quantity threshold, determining the ith target data block as the last target data block extracted in the data query range.
Optionally, the step of splitting the target data block or the target data blocks to obtain a target data set includes:
Acquiring one character string set or a plurality of character string sets corresponding to the target data block or the target data blocks;
Dividing the character string set or the character string sets based on a preset column dividing rule and a preset row dividing rule to obtain a plurality of target data;
and assembling the plurality of target data into a target data set according to a time sequence.
Optionally, the method comprises:
storing the acquired target data set in a blockchain.
In order to achieve the above object, an embodiment of the present invention further provides a case data query system, including:
the acquisition module is used for receiving a data query instruction and acquiring a query account number and query conditions from the data query instruction;
The conversion module is used for converting the query account number and the query condition into index data, wherein the index data comprises a first row key and a column cluster field, the first row key comprises query account number reverse order data, a time range and a first serial number, and the first serial number represents a data acquisition position;
the first determining module is used for determining a target partition according to the query account reverse order data;
the second determining module is used for determining a data query range in the target partition according to the time range;
The extraction module is used for extracting one target data block in the data query range according to the column cluster field and the first sequence number, or extracting a plurality of target data blocks in the data query range according to the column cluster field and the first sequence number and the subsequent sequence numbers of the first sequence number; and
the segmentation module is used for segmenting the one target data block or the plurality of target data blocks to obtain a target data set.
To achieve the above object, an embodiment of the present invention further provides a computer device, including a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor implements the steps of the case data query method described above when executing the computer program.
To achieve the above object, embodiments of the present invention also provide a computer-readable storage medium having stored therein a computer program executable by at least one processor to cause the at least one processor to perform the steps of the case data query method as described above.
According to the case data query method, system, computer device and computer readable storage medium provided by the embodiments of the invention, index data is obtained through a data query instruction, one or more target data blocks are extracted from the database according to the query account reverse order data, the time range and the first sequence number in the first row key of the index data, and a target data set is obtained as required. The position of the query account's data in the database can be located directly through the query account reverse order data in the first row key, the data query range can be determined quickly through the time range in the first row key, and the query account's data can be acquired quickly and in order according to the first sequence number, or according to the first sequence number and its subsequent sequence numbers. This improves the query efficiency for judicial case data and meets the timeliness requirement of querying massive judicial case data.
The invention will now be described in more detail with reference to the drawings and specific examples, which are not intended to limit the invention thereto.
Drawings
FIG. 1 is a flowchart illustrating a case data query method according to an embodiment of the present invention;
FIG. 2 is a flowchart illustrating steps for pre-partitioning a database in a case data query method according to an embodiment of the present invention;
FIG. 3 is a flowchart illustrating steps for writing sample data into the database in advance in a case data query method according to an embodiment of the present invention;
FIG. 4 is a flowchart illustrating steps of writing a plurality of sample data into a plurality of partitions in advance in a case data query method according to an embodiment of the present invention;
FIG. 5 is a flowchart illustrating steps for obtaining a plurality of target data blocks in a case data query method according to an embodiment of the present invention;
FIG. 6 is a flowchart illustrating steps for assembling a target data set in a case data query method according to an embodiment of the present invention;
FIG. 7 is a schematic diagram illustrating a program module of a case data query system according to a second embodiment of the present invention;
FIG. 8 is a schematic hardware structure of a computer device according to a third embodiment of the invention.
Detailed Description
The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The technical solutions of the embodiments may be combined with each other, provided that the combination can be implemented by those skilled in the art; when a combination of technical solutions is contradictory or cannot be implemented, the combination should be considered not to exist and is not within the scope of protection claimed by the present invention.
Example 1
Referring to fig. 1, a flowchart of the steps of a case data query method according to an embodiment of the present invention is shown. It will be appreciated that the flowcharts in the method embodiments are not intended to limit the order in which the steps are performed. The following description takes a computer device as the execution subject by way of example, specifically as follows:
As shown in fig. 1, the case data query method may include steps S100 to S600, in which:
Step S100, a data query instruction is received, and a query account number and query conditions are obtained from the data query instruction.
In an exemplary embodiment, the data query instruction is received via an Hbase (distributed storage database) database. The data query instruction requests a query of the case data related to one or more query accounts; after screening and checking, that case data is used in the corresponding judicial procedures. The data of a query account may be transaction flow data, financial data, transfer data and the like, provided in cooperation with an inquiry by a judicial authority.
In an exemplary embodiment, as shown in fig. 2, the method includes pre-partitioning the database, specifically steps S101 to S102, wherein: step S101, creating a plurality of partitions in an initial database; step S102, defining a start key and a stop key for each partition of the plurality of partitions, respectively, to generate a preset database.
When the database is first built, Hbase has a single region by default, and all written data goes into that default region. As the data volume grows, that one region cannot bear the load. Pre-partitioning the database therefore helps store the data in a scattered manner and prevents the node hot-spot problem that arises on writes when all data is stored in the initial region.
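The pre-partitioning of steps S101 and S102 can be sketched as follows. This is a minimal illustration rather than the patented implementation: it assumes row keys begin with a decimal digit (the first character of a reversed account number) and that ten partitions are wanted. `Partition` and `make_partitions` are hypothetical names and are not part of the Hbase API.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Partition:
    name: str
    start_key: str  # inclusive; "" means unbounded below
    stop_key: str   # exclusive; "" means unbounded above


def make_partitions(split_points: List[str]) -> List[Partition]:
    """Turn sorted split points into partitions with start and stop keys."""
    bounds = [""] + sorted(split_points) + [""]
    return [
        Partition(name=f"region-{i}", start_key=lo, stop_key=hi)
        for i, (lo, hi) in enumerate(zip(bounds, bounds[1:]))
    ]


# Assumption: split on the leading digit of the reversed account number.
partitions = make_partitions([str(d) for d in range(1, 10)])
for p in partitions:
    print(p)
```

Nine split points yield ten partitions; the first has no lower bound and the last has no upper bound, so every possible row key falls into exactly one of them.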
In an exemplary embodiment, as shown in fig. 3, the method further comprises: storing the sample data into the corresponding partition; specifically, the method comprises the following steps S103 to S105, wherein: step S103, obtaining a plurality of sample data of a plurality of sample accounts; step S104, generating a plurality of row keys corresponding to a plurality of sample data of the plurality of sample accounts based on a preset row key rule; step S105, writing the plurality of sample data into the plurality of partitions of the preset database when the plurality of row keys satisfy a preset condition.
Illustratively, the row key rule is a rule for combining the parts of a row key. The combination is set to reversed account number + date + sequence number, where the sequence number represents the position of the target data block.
Writing the plurality of row keys into the plurality of partitions effectively disperses the data of the different accounts across different partitions, so that the plurality of sample data written into the database is distributed more evenly.
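The row key rule (reversed account number + date + sequence number) can be sketched as below. The concrete layout is an assumption made for illustration: dates are written as YYYYMMDD, the sequence number is zero-padded to six digits (matching the "000001" form used later in the text), and the parts are concatenated without separators. The account number and the function names are hypothetical.

```python
def reverse_account(account: str) -> str:
    """Reverse the characters of an account number."""
    return account[::-1]


def build_row_key(account: str, date: str, seq: int) -> str:
    """Row key = reversed account + date (YYYYMMDD) + six-digit sequence."""
    return f"{reverse_account(account)}{date}{seq:06d}"


# Hypothetical account number used only for illustration.
print(build_row_key("13500000093", "20201101", 1))
```

Fixed-width date and sequence fields keep string order aligned with chronological and numeric order, which is what allows a sorted key space to serve range queries.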
In an exemplary embodiment, as shown in fig. 4, step S105 may further include steps S1051 to S1052, wherein: step S1051, determining a key range based on a start key and a stop key corresponding to the plurality of partitions; step S1052, writing the sample data of the plurality of sample accounts into the preset database when the plurality of row keys are located in the key range.
Illustratively, partition A is obtained, and partition A has key range A. When a row key falls within key range A, the sample data of the sample account corresponding to that row key is written into the preset database; when the row key lies outside key range A, the key range of the next partition is obtained and compared with the row key, and so on until the row key falls within the key range of some partition.
Comparing row keys against key ranges gathers the data of each individual, otherwise scattered, account together. Here, the primary index physical address of an account can be understood to mean that the data of that account all lies in a data block of one partition, or in particular data blocks of a plurality of partitions. When the data of that account is queried, it is relatively concentrated, unrelated partitions need not be accessed, and efficient data query is ensured.
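The partition-by-partition comparison described above can be sketched as a simple scan over key ranges. The three partitions and their boundaries are assumed values chosen for illustration, and `find_partition` is a hypothetical helper, not an Hbase call.

```python
from typing import List, Optional, Tuple

# (name, start key inclusive, stop key exclusive); "" means unbounded.
Partition = Tuple[str, str, str]


def in_key_range(row_key: str, start: str, stop: str) -> bool:
    """True when start <= row_key < stop in dictionary order."""
    return row_key >= start and (stop == "" or row_key < stop)


def find_partition(row_key: str, partitions: List[Partition]) -> Optional[str]:
    """Check each partition in turn until one key range contains the row key."""
    for name, start, stop in partitions:
        if in_key_range(row_key, start, stop):
            return name
    return None


# Assumed layout with three partitions.
PARTITIONS = [("A", "", "3"), ("B", "3", "6"), ("C", "6", "")]

print(find_partition("3900000053120201101000001", PARTITIONS))
print(find_partition("0800000085120201101000001", PARTITIONS))
```

Because the partitions are sorted and non-overlapping, a binary search over the start keys would give the same answer; the linear scan simply mirrors the wording of the text.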
Step S200, converting the query account number and the query condition into index data, where the index data includes a first row key and a column cluster field, the first row key includes query account number reverse order data, a time range, and a first serial number, and the first serial number represents a data acquisition position.
Illustratively, the query conditions include a query date and query content.
The query account number is reversed to obtain a first account number, and the first account number, together with the query date and the query content in the query condition, is converted into the first row key and the column cluster field.
In an exemplary embodiment, the combined data of the first account number, the query date in the query condition, and the preset first serial number is converted into a first row key, and the query content in the query condition is converted into a column cluster field.
In an exemplary embodiment, the first row key (rowkey) is the field in the Hbase database that uniquely identifies a row of data; it is string-type data. Because rowkeys are sorted in dictionary order in the Hbase database, the data as a whole is kept ordered. In a judicial query, the rowkeys for all dates under the queried account can be enumerated from the account number, and the region (address) holding the data can be located precisely from the rowkey and the column cluster field, so that query results are returned quickly.
The query account reverse order data is obtained by reversing the positions of the characters of the query account number, so that the characters corresponding to the first several digits and those corresponding to the last several digits change places and form a new character combination. Because the leading digits of account numbers are often the same while the trailing digits are generally random, reversing the account number helps distribute data evenly across the servers of the distributed cluster.
Illustratively, account 1 is "135xxxxx93", account 2 is "188xxxx98", and account 3 is "158xxxx80"; after reversal, account 1 becomes "39xxxx531", account 2 becomes "89xxxx881", and account 3 becomes "08xxxx851".
In addition, the reversed account number is placed at the start of the rowkey. Using the properties of dictionary order, this ensures that the primary index physical addresses of the same account are contiguous and that the data is distributed roughly evenly across all regions. That the primary index physical addresses of the same account are contiguous can be understood to mean that the data of one account lies in a data block of one partition, or in the same data block of a plurality of partitions; when the data of that account is queried, it is relatively concentrated, unrelated partitions need not be accessed, and efficient data query is ensured.
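The effect of placing the reversed account number first can be checked with a plain string sort, which orders ASCII digits the same way a dictionary-order key space does. The accounts, dates, and key layout below are assumptions made for illustration.

```python
from itertools import groupby


def row_key(account: str, date: str, seq: int) -> str:
    """Reversed account + date (YYYYMMDD) + six-digit sequence number."""
    return f"{account[::-1]}{date}{seq:06d}"


# Hypothetical accounts; real account numbers are masked in the text.
keys = [
    row_key("13500000093", "20201102", 1),
    row_key("18800000098", "20201101", 1),
    row_key("13500000093", "20201101", 2),
    row_key("13500000093", "20201101", 1),
]

ordered = sorted(keys)  # plain string sort, i.e. dictionary order

# Each reversed-account prefix (11 characters here) forms a single run.
runs = [prefix for prefix, _ in groupby(k[:11] for k in ordered)]
print(ordered)
print(runs)
```

Every account prefix appears in exactly one run, so one account's rows sit next to each other, and within a run the rows are ordered by date and then by sequence number.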
Step S300, determining a target partition according to the query account reverse order data.
In an exemplary embodiment, when data is stored in advance, it is distributed across different partitions according to the account data and the stored data content. At query time, at least one target partition can therefore be determined from the query account reverse order data. Partitions unrelated to the query account need not be traversed, which improves the efficiency of the data query.
Step S400, determining a data query range in the target partition according to the time range.
In an exemplary embodiment, the time range refers to the range of dates on which the data corresponding to the query account actually occurred. If the query account is account A and the time range is a date range within November 2020, beginning on November 1, 2020, the data query range is determined to be the data of account A within that date range.
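Narrowing the data query range by time can be sketched as a range scan over sorted row keys. The key layout (reversed account + YYYYMMDD + six-digit sequence), the dates, and the use of "~" as an upper sentinel (it sorts after every digit) are all assumptions made for illustration; `scan_bounds` and `scan` are hypothetical helpers.

```python
from bisect import bisect_left
from typing import List, Tuple


def scan_bounds(account: str, start_date: str, end_date: str) -> Tuple[str, str]:
    """Start row (inclusive) and stop row (exclusive) for an account and date range."""
    prefix = account[::-1]
    return f"{prefix}{start_date}", f"{prefix}{end_date}~"


def scan(sorted_keys: List[str], start_row: str, stop_row: str) -> List[str]:
    """Return the keys with start_row <= key < stop_row from a sorted list."""
    lo = bisect_left(sorted_keys, start_row)
    hi = bisect_left(sorted_keys, stop_row)
    return sorted_keys[lo:hi]


# Hypothetical stored row keys, already in dictionary order.
stored = [
    "3900000053120201031000001",
    "3900000053120201101000001",
    "3900000053120201115000001",
    "3900000053120201201000001",
    "8900000088120201105000001",
]

start_row, stop_row = scan_bounds("13500000093", "20201101", "20201130")
hits = scan(stored, start_row, stop_row)
print(hits)
```

Only the two rows of the queried account dated inside the range are returned; rows of other accounts and rows outside the dates are never visited.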
Step S500, extracting a target data block in the data query range according to the column cluster field and the first sequence number, or extracting a plurality of target data blocks in the data query range according to the column cluster field and a plurality of continuous sequence numbers, wherein the plurality of sequence numbers comprise the first sequence number and at least one subsequent sequence number continuous with the first sequence number.
In an exemplary embodiment, the data storage structure of the Hbase database includes a rowkey (reversed account number + date + first serial number), a column cluster (that is, an Hbase column family), and column names. The column names sit below the column cluster, that is, a column cluster contains a plurality of column names, and the column cluster field corresponds to a specific column name. Through the rowkey + the column cluster + the specific column name, a particular piece of data content of the query account can be located.
The first sequence number indicates the initial position of data acquisition in the data query range and can be represented as 000001.
In an exemplary embodiment, the initial position from which target data blocks are acquired is determined by the first sequence number, and it is judged whether the data amount of the data block at the initial position meets a data amount threshold. If it does not, that is, if the data amount of the data block at the initial position is smaller than the data amount threshold, the data block at the initial position is determined to be the only target data block in the data query range, and that target data block is extracted.
In other exemplary embodiments, the initial position from which target data blocks are acquired is determined by the first sequence number, and it is judged whether the data amount of the data block at the initial position meets the data amount threshold. If it does, that is, if the data amount of the data block at the initial position is equal to the data amount threshold, a sequence number subsequent to the first sequence number is generated, and the subsequent target data block is acquired according to that subsequent sequence number. The acquired plurality of target data blocks then includes the target data block corresponding to the initial position and the subsequent target data block corresponding to the subsequent sequence number of the first sequence number.
In an exemplary embodiment, a subsequent sequence number of the plurality of sequence numbers is an ith sequence number, i is a positive integer, and an initial value of i is 2. Referring to fig. 5, the step S500 may further include steps S501 to S502, wherein: step S501, extracting an ith target data block in the data query range according to the column cluster field and an ith sequence number in the plurality of sequence numbers; step S502, determining whether the data amount in the ith target data block meets a data amount threshold: if the data quantity in the ith target data block meets the data quantity threshold, generating an (i+1) th sequence number in the plurality of sequence numbers according to the ith sequence number, and acquiring an (i+1) th target data block in the data query range according to the column cluster field and the (i+1) th sequence number; and if the data quantity in the ith target data block does not meet the data quantity threshold value, determining the ith target data block as the last target data block extracted in the data range.
In an exemplary embodiment, when the data amount in the data block at the initial position meets the preset data amount condition, that is, when the data amount in the first data block is equal to a preset access threshold, a second sequence number is obtained by incrementing the first sequence number, a new row key is generated from the second sequence number and the row key, and the next target data block is queried and extracted from the database using the new row key. This continues until the data amount in the most recently extracted target data block is smaller than the preset access threshold, at which point access stops; all of the extracted data blocks are target data blocks.
Specifically, the second sequence number is obtained by incrementing the first sequence number, that is, first sequence number + 1; the next data block is then acquired using the second sequence number.
In an exemplary embodiment, when the data amount of the first data block is equal to a preset access threshold, a second sequence number is obtained by incrementing the first sequence number and the next data block is requested using the second sequence number; if the next data block is empty, access stops and the first data block is determined to be the target data block.
In an exemplary embodiment, if the data amount in the data block at the initial position meets the data amount threshold, the first sequence number is incremented to obtain a second sequence number, for example "000001" is incremented to "000002". A new row key is generated from the query account reverse order data, the second sequence number and the time corresponding to the position following the initial position. A second data block is acquired from the preset database according to the new row key, and it is judged whether the data amount in the second data block meets the data amount threshold. If it does, the second sequence number is incremented in turn and the judgment is repeated for each next data block acquired, until the data amount of the current data block does not meet the data amount threshold. The current data block is then determined to be the last data block, and a plurality of target data blocks is output, wherein the plurality of target data blocks comprises the data block corresponding to the initial position, the last data block, and at least one data block between the data block corresponding to the initial position and the last data block.
Through this judging operation, each request retrieves up to the maximum data amount allowed by the server, which improves the accuracy of acquisition and prevents data from being missed.
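The block-by-block extraction described above can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names, the threshold value of 3, and the in-memory dictionary standing in for the database are all assumptions made for illustration, and the lookup is keyed by the sequence number alone rather than by a full row key.

```python
from typing import Callable, List, Optional

ACCESS_THRESHOLD = 3  # hypothetical maximum number of records per request


def next_sequence(seq: str) -> str:
    """Increment a zero-padded sequence number, e.g. "000001" -> "000002"."""
    return str(int(seq) + 1).zfill(len(seq))


def fetch_target_blocks(
    fetch_block: Callable[[str], Optional[List[str]]],
    first_seq: str = "000001",
    threshold: int = ACCESS_THRESHOLD,
) -> List[List[str]]:
    """Collect blocks until one is null/empty or holds fewer records than the threshold."""
    blocks: List[List[str]] = []
    seq = first_seq
    while True:
        block = fetch_block(seq)
        if not block:               # null or empty block: nothing more to read
            break
        blocks.append(block)
        if len(block) < threshold:  # a short block is the last target data block
            break
        seq = next_sequence(seq)    # full block: continue with the next sequence number
    return blocks


# Toy in-memory stand-in for the database (assumption for illustration only).
store = {"000001": ["a", "b", "c"], "000002": ["d", "e", "f"], "000003": ["g"]}
blocks = fetch_target_blocks(store.get)
```

In this toy run the first two blocks are full, so the loop continues, and the third block is short, so it is treated as the last target data block.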
Step S600, segmenting the target data block or the plurality of target data blocks to obtain a target data set.
In an exemplary embodiment, referring to fig. 6, the step S600 may further include steps S601 to S603, wherein: step S601, acquiring one or more character string sets corresponding to the one or more target data blocks; step S602, segmenting the character string set or the character string sets based on a preset column segmentation rule and a preset row segmentation rule to obtain a plurality of target data; step S603, assembling the plurality of target data into a target data set according to a time sequence.
The character string set is segmented according to the preset column segmentation rule and the preset row segmentation rule, and the target data set is generated as required, which facilitates subsequent processing and analysis of the plurality of target data in the target data set.
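Steps S601 to S603 can be sketched as follows. The patent does not disclose the actual segmentation rules, so the row delimiter, the column delimiter, and the column layout below are hypothetical; the sketch only shows the general shape of splitting string sets into records and assembling them in time order.

```python
from typing import Dict, List

ROW_SEP = ";"                            # hypothetical row segmentation rule
COL_SEP = "|"                            # hypothetical column segmentation rule
COLUMNS = ["time", "account", "amount"]  # hypothetical column layout


def split_blocks(string_sets: List[str]) -> List[Dict[str, str]]:
    """Split each string set into rows, each row into columns, then sort by time."""
    records: List[Dict[str, str]] = []
    for raw in string_sets:
        for row in raw.split(ROW_SEP):
            if not row:                  # skip empty fragments, e.g. after a trailing separator
                continue
            fields = row.split(COL_SEP)
            records.append(dict(zip(COLUMNS, fields)))
    # Assemble the target data set in time order (ISO-style dates sort lexicographically).
    records.sort(key=lambda r: r["time"])
    return records


target_data_set = split_blocks(
    ["2020-11-02|A|5;2020-11-01|A|3", "2020-11-03|A|7;"]
)
```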
The method further comprises managing data in the database based on version numbers. Each piece of data is assigned a version number when it is imported, and the same data imported at different times is matched with a corresponding version number for each import. For example, when data A is imported at 9 am on 11/1/2020, the corresponding version number is #0010. If, at 4 pm on 11/2/2020, the data A imported at 9 am on 11/1/2020 is found to be incorrect and correct data A needs to be re-imported, the version number of the correct data A imported at 4 pm on 11/2/2020 is #0011. When data A has only these two version numbers, #0011 is the latest version number corresponding to data A. The latest version number of a given piece of data can be regarded as the version number with the most recent import time.
Therefore, correct data can be imported directly into the database without deleting the earlier data. HBase can retain both versions of the data at the same time, and the latest version is returned by default at query time, so that the data is managed seamlessly. This makes data management convenient, allows management to target specific data, and effectively prevents data from being deleted by mistake.
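The version behavior described above can be modeled with a small in-memory sketch. This is a toy model and not the HBase API: the class name, the method names, and the use of integer version numbers are assumptions chosen only to show that a re-import adds a version and that a default read returns the latest one.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple


class VersionedStore:
    """Toy store that keeps every imported version of a value (not the HBase API)."""

    def __init__(self) -> None:
        self._cells: Dict[str, List[Tuple[int, str]]] = defaultdict(list)

    def put(self, key: str, version: int, value: str) -> None:
        # Importing again never deletes earlier versions; it only adds a new one.
        self._cells[key].append((version, value))

    def get(self, key: str, version: Optional[int] = None) -> Optional[str]:
        versions = self._cells.get(key)
        if not versions:
            return None
        if version is None:
            # Default read: return the value with the highest version number.
            return max(versions, key=lambda item: item[0])[1]
        for ver, value in versions:
            if ver == version:
                return value
        return None


db = VersionedStore()
db.put("A", 10, "incorrect data")  # first import, version #0010
db.put("A", 11, "correct data")    # re-import, version #0011
```

Here a plain `db.get("A")` returns the re-imported value, while `db.get("A", 10)` still reaches the earlier version.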
In an exemplary embodiment, the method further includes storing the acquired target data set in a blockchain.
The blockchain (Blockchain) is essentially a decentralised database, and is a series of data blocks which are generated by association using a cryptography method, and each data block contains information of a batch of network transactions and is used for verifying the validity (anti-counterfeiting) of the information and generating a next block. The blockchain may include a blockchain underlying platform, a platform product services layer, an application services layer, and the like. The target data set is stored in the blockchain, so that the tracing and further verification of the subsequent data are facilitated.
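The idea of data blocks linked by a cryptographic method can be illustrated with a minimal hash chain. This is only a sketch under simplifying assumptions: it has no network, consensus, or platform layers, and the block layout and function names are hypothetical. It shows why a stored target data set can later be verified.

```python
import hashlib
import json
from typing import Any, Dict, List

GENESIS_HASH = "0" * 64  # hypothetical placeholder for the first block's predecessor


def block_hash(prev_hash: str, payload: Any) -> str:
    """Hash the previous block's hash together with this block's payload."""
    body = json.dumps({"prev": prev_hash, "payload": payload}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_block(chain: List[Dict[str, Any]], payload: Any) -> None:
    """Append a block whose hash depends on the block before it."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    chain.append(
        {"prev": prev_hash, "payload": payload, "hash": block_hash(prev_hash, payload)}
    )


def verify_chain(chain: List[Dict[str, Any]]) -> bool:
    """Recompute every hash; any modified payload breaks the chain."""
    prev_hash = GENESIS_HASH
    for block in chain:
        if block["prev"] != prev_hash:
            return False
        if block["hash"] != block_hash(block["prev"], block["payload"]):
            return False
        prev_hash = block["hash"]
    return True


chain: List[Dict[str, Any]] = []
append_block(chain, {"target_data_set": ["record 1", "record 2"]})
append_block(chain, {"target_data_set": ["record 3"]})
chain_is_valid = verify_chain(chain)
```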
According to the embodiment of the invention, index data is obtained from a database through a data query instruction, a plurality of target data blocks are extracted according to query account reverse order data in a first row key in the index data, a time range and a first sequence number, and a target data set is obtained as required; the query efficiency of judicial case data is improved, and the timeliness requirement of massive judicial case data query is met.
With this scheme, most judicial queries return results within a few seconds, improving processing time from the original several hours to the order of seconds. This solves the problem of judicial queries being inefficient or, in some cases, impossible to run directly. The improved query efficiency gives judicial data a unified and efficient query outlet, and resolves the problems caused by manual data collection, such as uncontrollable data, excessive labor input, and long processing times.
Example II
With continued reference to fig. 7, a schematic diagram of the program modules of the case data query system of the present invention is shown. In this embodiment, the case data query system 20 may include or be divided into one or more program modules, which are stored in a storage medium and executed by one or more processors to accomplish the present invention and to implement the case data query method described above. A program module in the embodiments of the present invention refers to a series of computer program instruction segments capable of performing a specified function, and is more suitable than the program itself for describing the execution of the case data query system 20 in a storage medium. The functions of each program module of this embodiment are described below:
The acquisition module 700 is configured to receive a data query instruction, and acquire a query account number and a query condition from the data query instruction;
the conversion module 710 is configured to convert the query account number and the query condition into index data, where the index data includes a first row key and a column cluster field, the first row key includes query account number reverse order data, a time range, and a first serial number, and the first serial number represents a data acquisition position;
a first determining module 720, configured to determine a target partition according to the query account reverse order data;
a second determining module 730, configured to determine a data query range in the target partition according to the time range;
The extracting module 740 is configured to extract a target data block in the data query range according to the column cluster field and the first sequence number, or extract a plurality of target data blocks in the data query range according to the column cluster field and the first sequence number and the subsequent sequence numbers of the first sequence number; and
The segmentation module 750 is configured to segment the target data block or the target data blocks to obtain a target data set.
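How the conversion module might assemble a first row key from its three parts can be sketched as follows. The patent states only that the row key includes the query account reverse order data, a time range, and a first sequence number; the underscore separator, the date format, the six-digit padding, and the function name are assumptions made for illustration.

```python
def build_row_key(account: str, time_str: str, seq: int, seq_width: int = 6) -> str:
    """Assemble a row key from reversed account, time, and zero-padded sequence number.

    The "_" separator and the field order are hypothetical, not taken from the patent.
    """
    reversed_account = account[::-1]  # query account reverse order data
    return f"{reversed_account}_{time_str}_{seq:0{seq_width}d}"


# Hypothetical account number and date, used only to show the resulting shape.
first_row_key = build_row_key("6222001234", "20201101", 1)
```

Reversing the account number is a common way to spread keys that share a leading prefix across partitions, which is consistent with the first determining module selecting the target partition from the reversed account data.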
In an exemplary embodiment, the method includes:
acquiring a plurality of sample data of a plurality of sample accounts;
generating a plurality of row keys corresponding to a plurality of sample data of the plurality of sample accounts based on a preset row key rule;
And writing the plurality of sample data into the plurality of partitions of the preset database when the plurality of row keys meet preset conditions.
In an exemplary embodiment, the step of writing the plurality of sample data into a plurality of partitions of the database when the plurality of row keys satisfy a preset condition includes:
determining a key range based on a start key and a stop key corresponding to the plurality of partitions;
And when the plurality of row keys are positioned in the key range, writing sample data of the plurality of sample accounts into the preset database.
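The key-range check before writing can be sketched as below. The partition boundaries, the half-open interval convention (start key inclusive, stop key exclusive), and the handling of out-of-range keys are assumptions; the patent only states that sample data is written when the row keys fall within the key range.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical partitions as (start_key, stop_key); start inclusive, stop exclusive.
# ":" is the ASCII character immediately after "9", so it bounds digit-led keys.
PARTITIONS: List[Tuple[str, str]] = [("0", "3"), ("3", "6"), ("6", ":")]


def find_partition(row_key: str, partitions: List[Tuple[str, str]]) -> Optional[int]:
    """Return the index of the partition whose key range contains row_key, if any."""
    for index, (start_key, stop_key) in enumerate(partitions):
        if start_key <= row_key < stop_key:
            return index
    return None


def write_samples(
    rows: Dict[str, str], partitions: List[Tuple[str, str]]
) -> Tuple[List[Dict[str, str]], List[str]]:
    """Write each row into its partition; collect keys outside every key range."""
    written: List[Dict[str, str]] = [dict() for _ in partitions]
    rejected: List[str] = []
    for row_key, value in rows.items():
        index = find_partition(row_key, partitions)
        if index is None:
            rejected.append(row_key)  # key is outside the key range, so it is not written
        else:
            written[index][row_key] = value
    return written, rejected


written, rejected = write_samples(
    {"4321002226_20201101_000001": "sample 1", "zzz": "sample 2"}, PARTITIONS
)
```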
In an exemplary embodiment, the step of extracting a plurality of target data blocks in the data query range according to the column cluster field and the first sequence number and the sequence number subsequent to the first sequence number includes:
extracting a first target data block from the data query range according to the column cluster field and the first sequence number;
extracting an ith target data block from the data query range according to the column cluster field and the sequence number subsequent to the first sequence number;
judging whether the data volume in the ith target data block meets a data volume threshold, wherein i is a positive integer, and the initial value of i is 1:
if the data quantity in the ith target data block meets the data quantity threshold, acquiring an (i+1) th target data block;
And if the data quantity in the ith target data block does not meet the data quantity threshold value, determining the ith target data block as the last target data block.
In an exemplary embodiment, the step of splitting the target data block or target data blocks to obtain a target data set includes
Acquiring one character string set or a plurality of character string sets corresponding to the target data block or the target data blocks;
Dividing the character string set or the character string sets based on a preset column dividing rule and a preset row dividing rule to obtain a plurality of target data;
and assembling the plurality of target data into a target data set according to a time sequence.
In an exemplary embodiment, the method includes:
And storing the acquired target data set in a blockchain.
Example III
Referring to fig. 8, a hardware architecture diagram of a computer device according to a third embodiment of the present invention is shown. In this embodiment, the computer device 2 is a device capable of automatically performing numerical calculation and/or information processing according to preset or stored instructions. The computer device 2 may be a rack server, a blade server, a tower server, or a cabinet server (including a stand-alone server, or a server cluster made up of multiple servers), or the like. As shown in fig. 8, the computer device 2 includes, but is not limited to, at least a memory 21, a processor 22, a network interface 23, and a case data query system 20, which are communicatively connected to each other via a system bus. Wherein:
In this embodiment, the memory 21 includes at least one type of computer-readable storage medium including flash memory, a hard disk, a multimedia card, a card memory (e.g., SD or DX memory, etc.), a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a read-only memory (ROM), an electrically erasable programmable read-only memory (EEPROM), a programmable read-only memory (PROM), a magnetic memory, a magnetic disk, an optical disk, and the like. In some embodiments, the memory 21 may be an internal storage unit of the computer device 2, such as a hard disk or a memory of the computer device 2. In other embodiments, the memory 21 may also be an external storage device of the computer device 2, such as a plug-in hard disk provided on the computer device 2, a smart memory card (SMART MEDIA CARD, SMC), a Secure Digital (SD) card, a flash memory card (FLASH CARD), or the like. Of course, the memory 21 may also include both internal storage units of the computer device 2 and external storage devices. In this embodiment, the memory 21 is typically used to store an operating system and various types of application software installed on the computer device 2, such as program codes of the case data query system 20 of the above embodiment. Further, the memory 21 may be used to temporarily store various types of data that have been output or are to be output.
Processor 22 may be a central processing unit (Central Processing Unit, CPU), controller, microcontroller, microprocessor, or other data processing chip in some embodiments. The processor 22 is typically used to control the overall operation of the computer device 2. In this embodiment, the processor 22 is configured to execute the program code stored in the memory 21 or process data, for example, execute the case data query system 20, so as to implement the case data query method of the foregoing embodiment.
The network interface 23 may comprise a wireless network interface or a wired network interface, and is typically used for establishing a communication connection between the computer device 2 and other electronic devices. For example, the network interface 23 is used to connect the computer device 2 to an external terminal through a network, and to establish a data transmission channel and a communication connection between the computer device 2 and the external terminal. The network may be an intranet, the Internet, a Global System for Mobile Communications (GSM) network, a Wideband Code Division Multiple Access (WCDMA) network, a 4G network, a 5G network, Bluetooth, Wi-Fi, or another wireless or wired network.
It is noted that fig. 8 only shows a computer device 2 having components 20-23, but it is understood that not all of the illustrated components are required to be implemented, and that more or fewer components may alternatively be implemented.
In this embodiment, the case data query system 20 stored in the memory 21 may be further divided into one or more program modules, which are stored in the memory 21 and executed by one or more processors (the processor 22 in this embodiment) to complete the present invention.
For example, fig. 7 shows a schematic diagram of the program modules implementing the case data query system 20 according to the second embodiment, where the case data query system 20 may be divided into an acquisition module 700, a conversion module 710, a first determining module 720, a second determining module 730, an extracting module 740, and a segmentation module 750. A program module in the present invention refers to a series of computer program instruction segments capable of performing a specific function, and is more suitable than a program for describing the execution of the case data query system 20 in the computer device 2. The specific functions of the program modules 700-750 are described in detail in the second embodiment and are not repeated here.
Example IV
The present embodiment also provides a computer-readable storage medium such as a flash memory, a hard disk, a multimedia card, a card-type memory (e.g., SD or DX memory, etc.), a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a read-only memory (ROM), an electrically erasable programmable read-only memory (EEPROM), a programmable read-only memory (PROM), a magnetic memory, a magnetic disk, an optical disk, a server, an App application store, etc., on which a computer program is stored, which when executed by a processor, performs the corresponding functions. The computer readable storage medium of the present embodiment is used to store the case data query system 20, and when executed by a processor, implements the case data query method of the above embodiment.
The foregoing embodiment numbers of the present invention are merely for the purpose of description, and do not represent the advantages or disadvantages of the embodiments.
From the above description of the embodiments, it will be clear to those skilled in the art that the above-described embodiment method may be implemented by means of software plus a necessary general hardware platform, but of course may also be implemented by means of hardware, but in many cases the former is a preferred embodiment.
The foregoing description is only of the preferred embodiments of the present invention, and is not intended to limit the scope of the invention, but rather is intended to cover any equivalents of the structures or equivalent processes disclosed herein or in the alternative, which may be employed directly or indirectly in other related arts.

Claims (8)

1. A case data query method, comprising:
Receiving a data query instruction, and acquiring a query account number and query conditions from the data query instruction;
Converting the query account number and the query condition into index data, wherein the index data comprises a first row key and a column cluster field, the first row key comprises query account number reverse order data, a time range and a first serial number, and the first serial number represents a data acquisition position;
Determining a target partition according to the query account reverse order data;
determining a data query range in the target partition according to the time range;
Extracting a target data block in the data query range according to the column cluster field and the first sequence number, or extracting a plurality of target data blocks in the data query range according to the column cluster field and a plurality of continuous sequence numbers, wherein the plurality of sequence numbers comprise the first sequence number and at least one subsequent sequence number continuous with the first sequence number; and
Dividing the target data block or the target data blocks to obtain a target data set;
The method further comprises the steps of:
Creating a plurality of partitions in an initial database;
defining a generation start key and a termination key for each partition in the plurality of partitions respectively to generate a preset database;
The method further comprises the steps of:
acquiring a plurality of sample data of a plurality of sample accounts;
generating a plurality of row keys corresponding to a plurality of sample data of the plurality of sample accounts based on a preset row key rule;
And writing the plurality of sample data into the plurality of partitions of the preset database when the plurality of row keys meet preset conditions.
2. The case data query method of claim 1, wherein the step of writing the plurality of sample data into a plurality of partitions of the database when the plurality of row keys satisfy a preset condition comprises:
determining a key range based on a start key and a stop key corresponding to the plurality of partitions;
And when the plurality of row keys are positioned in the key range, writing sample data of the plurality of sample accounts into the preset database.
3. The case data query method of claim 2, wherein a subsequent sequence number of the plurality of sequence numbers is an ith sequence number, i is a positive integer, and an initial value of i is 2;
The step of extracting a plurality of target data blocks in the data query range according to the column cluster field, the first sequence number and the sequence number subsequent to the first sequence number includes:
Extracting an ith target data block from the data query range according to the column cluster field and an ith sequence number in the plurality of sequence numbers;
judging whether the data quantity in the ith target data block meets a data quantity threshold value or not:
If the data quantity in the ith target data block meets the data quantity threshold, generating an (i+1) th sequence number in the plurality of sequence numbers according to the ith sequence number, and acquiring an (i+1) th target data block in the data query range according to the column cluster field and the (i+1) th sequence number;
And if the data quantity in the ith target data block does not meet the data quantity threshold value, determining the ith target data block as the last target data block extracted in the data range.
4. The case data query method of claim 1, wherein the step of splitting the one target data block or the plurality of target data blocks to obtain a target data set comprises
Acquiring one character string set or a plurality of character string sets corresponding to the target data block or the target data blocks;
Dividing the character string set or the character string sets based on a preset column dividing rule and a preset row dividing rule to obtain a plurality of target data;
and assembling the plurality of target data into a target data set according to a time sequence.
5. The case data query method of claim 1, wherein the method comprises:
And storing the acquired target data set in a blockchain.
6. A case data query system, comprising:
the acquisition module is used for receiving a data query instruction and acquiring a query account number and query conditions from the data query instruction;
The conversion module is used for converting the query account number and the query condition into index data, wherein the index data comprises a first row key and a column cluster field, the first row key comprises query account number reverse order data, a time range and a first serial number, and the first serial number represents a data acquisition position;
the first determining module is used for determining a target partition according to the query account reverse order data;
the second determining module is used for determining a data query range in the target partition according to the time range;
The extraction module is used for extracting a target data block in the data query range according to the column cluster field and the first sequence number, or extracting a plurality of target data blocks in the data query range according to the column cluster field and a plurality of continuous sequence numbers, wherein the plurality of sequence numbers comprise the first sequence number and a plurality of subsequent sequence numbers continuous with the first sequence number; and
The segmentation module is used for segmenting the target data block or the target data blocks to obtain a target data set;
The case data query system is further configured to:
Creating a plurality of partitions in an initial database;
defining a generation start key and a termination key for each partition in the plurality of partitions respectively to generate a preset database;
acquiring a plurality of sample data of a plurality of sample accounts;
generating a plurality of row keys corresponding to a plurality of sample data of the plurality of sample accounts based on a preset row key rule;
And writing the plurality of sample data into the plurality of partitions of the preset database when the plurality of row keys meet preset conditions.
7. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the steps of the case data query method according to any of claims 1 to 5 when the computer program is executed by the processor.
8. A computer-readable storage medium, wherein a computer program is stored in the computer-readable storage medium, and the computer program is executable by at least one processor, so that the at least one processor performs the steps of the case data query method according to any one of claims 1 to 5.
CN202011558721.6A 2020-12-25 2020-12-25 Case data query method, system, equipment and computer readable storage medium Active CN112579608B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011558721.6A CN112579608B (en) 2020-12-25 2020-12-25 Case data query method, system, equipment and computer readable storage medium

Publications (2)

Publication Number Publication Date
CN112579608A CN112579608A (en) 2021-03-30
CN112579608B true CN112579608B (en) 2024-06-21

Family

ID=75140242

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011558721.6A Active CN112579608B (en) 2020-12-25 2020-12-25 Case data query method, system, equipment and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN112579608B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113204567B (en) * 2021-05-31 2022-12-23 山东政法学院司法鉴定中心 Big data judicial case analysis processing system
CN115617878B (en) * 2022-11-17 2023-03-10 浪潮电子信息产业股份有限公司 Data query method, system, device, equipment and computer storage medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104112011A (en) * 2014-07-16 2014-10-22 深圳市国泰安信息技术有限公司 Method and device for extracting mass data
CN106682077A (en) * 2016-11-18 2017-05-17 山东鲁能软件技术有限公司 Method for storing massive time series data on basis of Hadoop technologies

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2960809B1 (en) * 2014-06-27 2017-08-09 Sap Se Transparent access to multi-temperature data
CN107783980B (en) * 2016-08-24 2021-10-19 阿里巴巴集团控股有限公司 Index data generation and data query method and device, and storage and query system
CN108256088A (en) * 2018-01-23 2018-07-06 清华大学 A kind of storage method and system of the time series data based on key value database

Also Published As

Publication number Publication date
CN112579608A (en) 2021-03-30

Similar Documents

Publication Publication Date Title
CN110309125B (en) Data verification method, electronic device and storage medium
WO2020186786A1 (en) File processing method and apparatus, computer device and storage medium
CN112579608B (en) Case data query method, system, equipment and computer readable storage medium
CN111881158B (en) Processing method, device, computer system and readable storage medium for managing report data
CN111737577A (en) Data query method, device, equipment and medium based on service platform
CN112598289A (en) Index configuration method, system, computer device and computer readable storage medium
CN110162540B (en) Block chain account book data query method, electronic device and storage medium
CN110134839B (en) Time sequence data characteristic processing method and device and computer readable storage medium
CN114138877A (en) Method, device and equipment for realizing theme data service based on micro-service architecture
CN110866007B (en) Information management method, system and computer equipment for big data application and table
CN112035551A (en) Time series data conversion method, system, computer device and storage medium
CN112163948A (en) Method, system, equipment and storage medium for separately-moistening calculation
CN108519984B (en) Weather data processing method, server and computer readable storage medium
CN114511314A (en) Payment account management method and device, computer equipment and storage medium
CN115081228A (en) BIM-based rebar data statistical method, device, equipment and readable storage medium
CN112597162A (en) Data set acquisition method, system, device and storage medium
CN111651466B (en) Data sampling method and device
CN113936130A (en) Document information intelligent acquisition and error correction method, system and equipment based on OCR technology
CN109885710B (en) User image depicting method based on differential evolution algorithm and server
CN113535206A (en) Multi-version code upgrading method and system
CN113360505B (en) Time sequence data-based data processing method and device, electronic equipment and readable storage medium
CN115756968B (en) Data backup method and system based on network and cloud platform
CN113190381B (en) Data backup method, system, equipment and storage medium
CN114385267B (en) Data pushing method for cash transaction service
CN113421118B (en) Data pushing method, system, computer equipment and computer readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant