CN117421481A - Crowd searching method, system, electronic device and computer readable storage medium - Google Patents

Crowd searching method, system, electronic device and computer readable storage medium

Info

Publication number
CN117421481A
Authority
CN
China
Prior art keywords
crowd
block
target
bitmap
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311378341.8A
Other languages
Chinese (zh)
Inventor
陈灏
姜皓然
邵加佳
温嘉鸣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Shouqianba Internet Technology Co ltd
Original Assignee
Shanghai Shouqianba Internet Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Shouqianba Internet Technology Co ltd
Priority to CN202311378341.8A
Publication of CN117421481A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application provides a crowd searching method, a crowd searching system, an electronic device and a computer readable storage medium, and relates to the field of big data. The crowd searching method comprises the following steps: generating a plurality of initial bitmap blocks based on raw crowd data; performing intra-block compression on the initial bitmap blocks to obtain final bitmap blocks, and recording bitmap block information corresponding to the final bitmap blocks; and searching for a target crowd in the final bitmap blocks based on the bitmap block information. When crowd locating or crowd estimation is carried out with this method, the bitmap serves as the basic unit for data storage and processing, and operations such as bitmap compression, query and computation are performed on it, so that large-scale crowd data is processed efficiently.

Description

Crowd searching method, system, electronic device and computer readable storage medium
Technical Field
The present application relates to the field of big data, and in particular, to a crowd searching method, system, electronic device, and computer readable storage medium.
Background
Target crowd identification or estimation is required in target user identification scenarios such as advertisement delivery. At present, crowd estimation algorithms often rely on a large amount of raw detail data for processing and analysis.
During processing, a large amount of raw user behavior attribute detail data needs to be stored in a relational database, and the volume of this data is very large, generally at the TB level. As a result, traditional crowd estimation methods involve cumbersome data acquisition and high cost, and their real-time performance still needs improvement.
Disclosure of Invention
An object of the embodiments of the present application is to provide a crowd searching method, system, electronic device, and computer readable storage medium that use the bitmap as the basic unit for data storage and processing and perform bitmap compression, query, and computation on it, thereby processing large-scale crowd data efficiently.
In a first aspect, an embodiment of the present application provides a crowd searching method, including: generating a plurality of initial bitmap blocks based on raw crowd data; performing intra-block compression on the initial bitmap blocks to obtain final bitmap blocks, and recording bitmap block information corresponding to the final bitmap blocks; and searching for the target crowd in the final bitmap blocks based on the bitmap block information.
In the implementation process, the crowd searching method provided by the application can effectively reduce the storage requirement by processing the raw crowd data into final bitmap blocks. Because the bitmap blocks are in compressed form, the required storage space is significantly reduced compared with conventionally storing large amounts of raw user behavior attribute detail data. Since the data has been compressed into bitmap blocks, the subsequent crowd searching and locating process also becomes more efficient: the corresponding information is queried directly in the bitmap blocks without complex relational database queries, which improves computing efficiency.
Optionally, in an embodiment of the present application, generating a plurality of initial bitmap blocks based on the raw crowd data includes: storing the raw crowd data as key-value pairs; wherein each key-value pair includes a key name and a key value; the key name represents a crowd feature of the raw crowd, and the key value represents the data corresponding to that crowd feature.
In the implementation process, the crowd searching method provided by the embodiment of the application may use the Roaring bitmap algorithm to generate a plurality of initial bitmap blocks from the raw data. The raw crowd data is divided into key names and key values, which are stored using the Roaring bitmap algorithm, so that the storage requirement can be significantly reduced; the method can efficiently store large-scale integer data sets, is suitable for processing TB-level data, and reduces storage cost and hardware requirements.
Optionally, in the embodiment of the present application, recording bitmap block information corresponding to the final bitmap block includes: recording positioning information corresponding to the final bitmap block in an index mode; wherein the positioning information includes a block ID and an offset.
In the implementation process, when the bitmap block information corresponding to the final bitmap block is recorded, the positioning information of the bitmap block is generally recorded in an index manner; the block ID is used to uniquely identify the final bitmap block, and the offset indicates the specific location of the desired data in the bitmap block. Therefore, the target bitmap block can be quickly positioned, and the required data can be extracted or processed from the bitmap block, so that the efficiency of query and retrieval operations is improved.
Optionally, in an embodiment of the present application, searching for the target crowd in the final bitmap blocks based on the bitmap block information includes: performing a hierarchical search on the final bitmap blocks according to the index and the crowd features of the target crowd, so as to locate a target bitmap block among the final bitmap blocks; and searching for the target crowd within the target bitmap block.
In the implementation process, the embodiment of the application provides a method for searching the target crowd in the final bitmap block based on the bitmap block information, and the target bitmap block is positioned according to the index and the characteristics of the target crowd. Because hierarchical searching is used, bitmap blocks which do not meet the conditions can be skipped, so that the computing resources are saved, and the computing cost is reduced. Once a target bitmap block is located, a lookup can be performed within the bitmap block, which is beneficial to improving efficiency for advertising, personalized recommendations, and other applications involving the target user.
Optionally, in the embodiment of the present application, performing the hierarchical search on the final bitmap blocks according to the index and the crowd features of the target crowd includes: locating the target bitmap block among the final bitmap blocks according to the crowd features and the index; and looking up the key name and the key value of the key-value pair in the target bitmap block; wherein the key name is stored in the high-order bits of the data and the key value is stored in the low-order bits of the data.
In the implementation process, the bitmap blocks containing the target crowd data can be more accurately positioned by searching through the crowd features and the indexes instead of searching in a full range, so that the searching accuracy is improved, and the error is reduced; on the other hand, the hierarchical search method reduces unnecessary traversal and query, thereby improving search efficiency.
Optionally, in an embodiment of the present application, searching for the target crowd in the target bitmap block includes: dividing the target data corresponding to the key value into a first sub-table and a second sub-table according to the crowd features, taking the middle position of the target data as the demarcation point; searching for the crowd features in the first sub-table and the second sub-table respectively; and, if the crowd features are not found, repeating the halving within the first sub-table and the second sub-table until the crowd features are found or the table length is 0.
In the implementation process, in order to find the target crowd, a demarcation point is first selected in the target data according to the crowd features, and the target data is divided into two sub-tables, namely a first sub-table and a second sub-table; the demarcation point is usually located in the middle of the target data. The crowd features are then searched for in the first sub-table and the second sub-table respectively. If they are not found, a demarcation point is selected again and the sub-table is divided into smaller sub-tables; this process is repeated until the target crowd feature is found or it is determined that the feature is not present in the target data. The crowd searching method provided by the embodiment of the application can therefore locate the target crowd efficiently, and the gain in search efficiency is most pronounced when the volume of target data is large.
Optionally, in an embodiment of the present application, performing intra-block compression on the initial bitmap blocks to obtain the final bitmap blocks includes: compressing the zero elements and the repeated elements within each initial bitmap block, and encoding the information in the initial bitmap block, to obtain the final bitmap block.
In the implementation process, in the embodiment of the application, where most elements in an initial bitmap block are zero, a sparse matrix compression method can effectively reduce storage space; elements that repeat consecutively can be encoded with the RLE method as a combination of the element and its repetition count, further reducing storage space. The crowd searching method provided by the embodiment of the application can therefore significantly reduce storage usage, especially when processing large bitmap blocks or blocks containing many zero elements and consecutive repeated elements.
In a second aspect, an embodiment of the present application provides a crowd searching system, including: the bitmap block generation module and the searching module; the bitmap block generation module is used for generating a plurality of initial bitmap blocks based on the original crowd data; the bitmap block generation module is also used for carrying out intra-block compression on the initial bitmap block to obtain a final bitmap block, and recording bitmap block information corresponding to the final bitmap block; the searching module is used for searching the target crowd in the final bitmap block based on the bitmap block information.
In a third aspect, an embodiment of the present application provides an electronic device, where the electronic device includes a memory and a processor, where the memory stores program instructions, and when the processor reads and executes the program instructions, the processor performs the steps in any of the foregoing implementation manners.
In a fourth aspect, embodiments of the present application further provide a computer readable storage medium having stored therein computer program instructions that, when read and executed by a processor, perform the steps of any of the above implementations.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the embodiments of the present application will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present application and should not be considered as limiting the scope, and other related drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of crowd searching provided in an embodiment of the present application;
FIG. 2 is a flowchart for searching for a target crowd according to an embodiment of the present application;
FIG. 3 is a schematic block diagram of a crowd searching system according to an embodiment of the present application;
FIG. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described below with reference to the drawings in the embodiments of the present application. For example, the flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions. In addition, functional modules in the embodiments of the present invention may be integrated together to form a single part, or each module may exist alone, or two or more modules may be integrated to form a single part.
When market research is carried out, the needs and preferences of target groups for specific products or services need to be estimated so as to better meet the needs of the target groups; before developing a new product or service, the needs and feedback of potential users are estimated to ensure that the product can meet market needs; when a strategy is established on social media, characteristics of interests, ages, sexes and the like of audiences are estimated to determine what content to publish and when to publish and the like. It can be seen that the search or estimation of a target population is widely used in a variety of scenarios.
In the course of research, the applicant found that traditional crowd locating methods need to store a large amount of raw user behavior attribute detail data in a relational database, and that the volume of this data is very large, generally at the TB level. Storing and processing large amounts of redundant detail data results in high storage overhead and low computing efficiency, and storing TB-level data also incurs high hardware costs.
Based on the above, the application discloses a crowd searching method, a system, electronic equipment and a computer readable storage medium, wherein the crowd searching method generates a bitmap block based on original crowd data, and can search in the bitmap block according to corresponding information of the bitmap block in the subsequent searching process so as to locate a target crowd; by using the crowd searching method provided by the application, the storage occupation can be effectively reduced, the calculation speed is accelerated, and the real-time prediction is realized.
Before describing the present application in detail, the bitmap algorithm is briefly introduced. The core idea of the bitmap (BitMap) algorithm is to use a bit array to record the two states 0 and 1 and to map each data item to a specific position in the bit array; in general, a bit set to 0 indicates that the data item is absent and a bit set to 1 indicates that it is present. Because data is stored in units of bits, storage space can be greatly reduced.
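The bitmap idea described above can be illustrated with a minimal Python sketch. The class name `BitMap`, its methods, and the sample user IDs are hypothetical and are not taken from the application; the sketch only shows how one bit per value records presence or absence.

```python
class BitMap:
    """Fixed-size bit array: bit i is 1 if value i is present, else 0."""

    def __init__(self, size):
        self.size = size
        # One byte holds the state of eight values.
        self.bits = bytearray((size + 7) // 8)

    def add(self, value):
        # value >> 3 selects the byte, value & 7 selects the bit inside it.
        self.bits[value >> 3] |= 1 << (value & 7)

    def contains(self, value):
        return bool(self.bits[value >> 3] & (1 << (value & 7)))


# Hypothetical user IDs, used only for illustration.
bm = BitMap(1_000_000)
for user_id in (3, 42, 999_999):
    bm.add(user_id)
```

In this sketch, the presence of one million possible IDs is tracked in 125,000 bytes, regardless of how many IDs are actually set.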
Referring to FIG. 1, FIG. 1 is a flowchart of the crowd searching method provided in an embodiment of the present application. The first aspect of the present application provides a crowd searching method, including:
Step S100: generating a plurality of initial bitmap blocks based on the raw crowd data.
In step S100, a plurality of initial bitmap blocks are generated based on the raw crowd data; for example, bitmap encoding is applied to the raw crowd data to generate the plurality of initial bitmap blocks.
In this embodiment of the present application, the raw crowd data includes fields such as user ID, gender, occupation, region, population migration, consumption characteristics, consumption frequency, consumption content, consumption POI, mobile phone model, mobile phone price, and the like.
Step S200: performing intra-block compression on the initial bitmap blocks to obtain final bitmap blocks, and recording bitmap block information corresponding to the final bitmap blocks.
In step S200, intra-block compression is performed on the generated initial bitmap blocks to obtain the final bitmap blocks, which are the bitmap blocks that actually store the data. Once the raw crowd data has been processed into final bitmap blocks, the bitmap block information corresponding to each final bitmap block can be recorded.
Step S300: searching for the target crowd in the final bitmap blocks based on the bitmap block information.
As can be seen from FIG. 1, the crowd searching method provided in the present application can effectively reduce the storage requirement by processing the raw crowd data into final bitmap blocks. Because the bitmap blocks are in compressed form, the required storage space is significantly reduced compared with conventionally storing large amounts of raw user behavior attribute detail data. Since the data has been compressed into bitmap blocks, the subsequent crowd searching and locating process also becomes more efficient: the corresponding information is queried directly in the bitmap blocks without complex relational database queries, which improves computing efficiency.
In an alternative embodiment, generating a plurality of initial bitmap blocks based on the raw crowd data in step S100 may be accomplished by storing the raw crowd data as key-value pairs.
Note that each key-value pair includes a key name and a key value; the key name represents a crowd feature of the raw crowd, and the key value represents the data corresponding to that crowd feature.
For example, crowd features may include user ID, gender, occupation, population migration, consumption characteristics, consumption frequency, consumption content, consumption POI, mobile phone model, mobile phone price, etc.; the data corresponding to a crowd feature is the specific value of that feature, for example male or female for the gender feature.
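One possible way to picture the key-value storage is sketched below in Python. It is an assumption, not the layout disclosed in the application: each (feature, value) pair is used as a key, and a Python integer is used as a bitset whose bit positions are user IDs. The sample records, field names, and the helper names `build_bitmaps` and `members` are hypothetical.

```python
from collections import defaultdict

# Hypothetical raw crowd records; field names are illustrative only.
records = [
    {"user_id": 1, "gender": "male", "region": "Shanghai"},
    {"user_id": 2, "gender": "female", "region": "Shanghai"},
    {"user_id": 5, "gender": "male", "region": "Beijing"},
]


def build_bitmaps(rows):
    """Map each (feature, value) key to an int used as a bitset of user IDs."""
    bitmaps = defaultdict(int)
    for row in rows:
        for feature, value in row.items():
            if feature == "user_id":
                continue
            # Set the bit whose position equals the user ID.
            bitmaps[(feature, value)] |= 1 << row["user_id"]
    return bitmaps


def members(bitset):
    """List the user IDs whose bit is set."""
    return [i for i in range(bitset.bit_length()) if (bitset >> i) & 1]


bitmaps = build_bitmaps(records)
# A target crowd such as "male users in Shanghai" becomes a bitwise AND.
target = bitmaps[("gender", "male")] & bitmaps[("region", "Shanghai")]
```

With this layout, combining crowd features reduces to bitwise operations on the bitmaps rather than joins over the detail records.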
When the raw crowd data is stored as key-value pairs, the Roaring bitmap algorithm may be used. It splits each integer into two parts: the high-order part (e.g., the upper 16 bits) serves as the index of a container, and the low-order part (e.g., the lower 16 bits) is stored inside that container, so that each distinct high-order value has one container. There are three types of container structure: ordered array, uncompressed bitmap, and run-length encoding. Ordered array: when the number of lower 16-bit elements in a container is less than 4096, they are stored in an ordered array. Uncompressed bitmap: this is the bitmap storage structure described above, implemented as a fixed contiguous block of memory. Run-length encoding: a lossless data compression technique that stores a run of consecutive values as two parts, a start value and a count.
That is, the BitMap uses one bit to mark the Value corresponding to an element, and the Key is the element itself. The Key is stored in the upper 16 bits and the Value in the lower 16 bits; the Value found in the BitMap represents the state of the data.
It can be seen that the crowd searching method provided by the embodiment of the application can use the Roaring bitmap algorithm to generate a plurality of initial bitmap blocks from the raw data. The raw crowd data is divided into key names and key values, which are stored using the Roaring bitmap algorithm, so that the storage requirement can be significantly reduced; the method can efficiently store large-scale integer data sets, is suitable for processing TB-level data, and reduces storage cost and hardware requirements.
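The high/low 16-bit split described above can be sketched as follows. This is a simplified illustration and not a real Roaring bitmap library: the function names are hypothetical, the 4096 threshold is taken from the text above, and run-length containers are ignored.

```python
ARRAY_LIMIT = 4096  # threshold named in the text above


def split(value):
    """Split a 32-bit integer into (high 16 bits, low 16 bits)."""
    return value >> 16, value & 0xFFFF


def build_containers(values):
    """Group the low 16-bit parts by their high 16-bit key, kept sorted."""
    containers = {}
    for v in sorted(set(values)):
        high, low = split(v)
        containers.setdefault(high, []).append(low)
    return containers


def container_kind(lows):
    """Pick a storage form by element count (simplified; no run containers)."""
    return "sorted_array" if len(lows) < ARRAY_LIMIT else "bitmap"


containers = build_containers([1, 2, 65536 + 7, 65536 + 8, 3 * 65536])
```

Here the five sample integers fall into three containers keyed by 0, 1 and 3, each small enough to stay a sorted array under the stated threshold.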
In an optional embodiment, the recording bitmap block information corresponding to the final bitmap block in step S200 includes: recording positioning information corresponding to the final bitmap block in an index mode; wherein the positioning information includes a block ID and an offset.
The offset refers to the offset of a position in the final bitmap block relative to the start position of that bitmap block, and is typically measured in bytes or bits. Its purpose is to identify the location of particular data or information in the final bitmap block, so that the desired data can be located and accessed quickly in subsequent query or retrieval operations.
Therefore, in the embodiment of the present application, when the bitmap block information corresponding to the final bitmap block is recorded, the positioning information of the bitmap block is generally recorded in an indexed manner; the block ID is used to uniquely identify the final bitmap block, and the offset indicates the specific location of the desired data in the bitmap block. Therefore, the target bitmap block can be quickly positioned, and the required data can be extracted or processed from the bitmap block, so that the efficiency of query and retrieval operations is improved.
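A minimal sketch of such an index is given below, assuming that each final bitmap block is a byte string and that the index maps a crowd feature to a block ID and a byte offset. The stored length, the feature strings, the payload bytes, and the function names `store` and `load` are assumptions added for illustration; the application itself names only the block ID and the offset.

```python
# Hypothetical storage: each block ID maps to a bytes payload.
blocks = {}
# feature key -> (block_id, offset, length); length is an added assumption.
index = {}


def store(block_id, feature, payload):
    """Append payload to a block and record where it starts."""
    block = blocks.get(block_id, b"")
    index[feature] = (block_id, len(block), len(payload))
    blocks[block_id] = block + payload


def load(feature):
    """Jump straight to the recorded position instead of scanning all blocks."""
    block_id, offset, length = index[feature]
    return blocks[block_id][offset:offset + length]


store(0, "gender=male", b"\x05\x00")
store(0, "gender=female", b"\x02")
store(1, "region=Shanghai", b"\x03\x01\x00")
```

A lookup such as `load("gender=female")` then touches only block 0 at byte offset 2, without reading the other block.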
Referring to FIG. 2, FIG. 2 is a flowchart of searching for a target crowd according to an embodiment of the present application. In an optional implementation of the embodiment of the present application, searching for the target crowd in the final bitmap blocks based on the bitmap block information in step S300 may be implemented by the following steps:
Step S310: performing a hierarchical search on the final bitmap blocks according to the index and the crowd features of the target crowd, so as to locate a target bitmap block among the final bitmap blocks.
In step S310, after the target crowd data has been stored, the target crowd can be found by querying the stored data for the target data.
The bitmap blocks that may contain the target crowd are first determined quickly among the final bitmap blocks according to the index information; a hierarchical search is then performed according to the specific attributes of the crowd features, which finally locates the target bitmap block containing the target crowd data.
Step S320: searching for the target crowd within the target bitmap block.
In step S320, after the target bitmap block containing the target crowd data has been located, the target crowd is searched for within that bitmap block: a query is performed using the crowd features of the target crowd to screen out the users that meet the conditions.
As can be seen from FIG. 2, the embodiment of the present application provides a method for searching for a target crowd in the final bitmap blocks based on the bitmap block information, in which the target bitmap block is located according to the index and the features of the target crowd. Because a hierarchical search is used, bitmap blocks that do not meet the conditions can be skipped, which saves computing resources and reduces computing cost. Once the target bitmap block has been located, the lookup is performed within that bitmap block, which helps improve efficiency for advertising, personalized recommendation, and other applications involving target users.
In an alternative embodiment, performing the hierarchical search on the final bitmap blocks according to the index and the crowd features of the target crowd in step S310 may be implemented as follows: locating the target bitmap block among the final bitmap blocks according to the crowd features and the index; and looking up the key name and the key value of the key-value pair in the target bitmap block; wherein the key name is stored in the high-order bits of the data and the key value is stored in the low-order bits of the data.
That is, the target bitmap block is determined first, and then the key name and key value of the key-value pair in the target bitmap block are looked up.
Searching by crowd features and index, rather than over the full range, locates the bitmap blocks containing the target crowd data more accurately, which improves search accuracy and reduces errors; in addition, the hierarchical search reduces unnecessary traversal and queries, which improves search efficiency.
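The two-level lookup can be sketched as follows, assuming the high 16 bits select a block and the low 16 bits are kept in a sorted list inside it. The `containers` layout and the function name `contains` are hypothetical; `bisect_left` from the Python standard library performs the search inside the block.

```python
from bisect import bisect_left


def contains(containers, value):
    """Two-level lookup: pick the block by high bits, then search the low bits."""
    high, low = value >> 16, value & 0xFFFF
    lows = containers.get(high)
    if lows is None:
        # No block for this high-order key, so the whole block is skipped.
        return False
    i = bisect_left(lows, low)
    return i < len(lows) and lows[i] == low


# Hypothetical blocks: high-order key -> sorted low-order values.
containers = {0: [1, 2], 1: [7, 8]}
```

A value whose high-order key has no block is rejected after a single dictionary lookup, which is the skipping behaviour the paragraph above describes.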
In an alternative embodiment, searching for the target crowd within the target bitmap block may be accomplished by:
dividing the target data corresponding to the key value into a first sub-table and a second sub-table according to the crowd features, taking the middle position of the target data as the demarcation point; searching for the crowd features in the first sub-table and the second sub-table respectively; and, if the crowd features are not found, repeating the halving within the first sub-table and the second sub-table until the crowd features are found or the table length is 0.
This method requires the records to be arranged in order (increasing or decreasing) and searches them by jumping: the middle element of the ordered sequence is taken as the comparison object; for an increasing sequence, if the value being searched for is smaller than the middle element, the search range is reduced to the left half, otherwise to the right half. Each comparison halves the search interval.
Therefore, in order to find the target crowd, a demarcation point is first selected in the target data according to the crowd features, and the target data is divided into two sub-tables, namely a first sub-table and a second sub-table; the demarcation point is usually located in the middle of the target data. The crowd features are then searched for in the first sub-table and the second sub-table respectively. If they are not found, a demarcation point is selected again and the sub-table is divided into smaller sub-tables; this process is repeated until the target crowd feature is found or it is determined that the feature is not present in the target data. The crowd searching method provided by the embodiment of the application can therefore locate the target crowd efficiently, and the gain in search efficiency is most pronounced when the volume of target data is large.
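The halving procedure described above corresponds to a standard binary search. The sketch below assumes the target data is a list sorted in increasing order; the function name and the sample values are illustrative only.

```python
def binary_search(sorted_values, target):
    """Return the index of target in an increasing list, or -1 if absent."""
    lo, hi = 0, len(sorted_values) - 1
    while lo <= hi:  # stops when the remaining table length is 0
        mid = (lo + hi) // 2  # demarcation point in the middle
        if sorted_values[mid] == target:
            return mid
        if target < sorted_values[mid]:
            hi = mid - 1  # continue in the left half
        else:
            lo = mid + 1  # continue in the right half
    return -1


values = [2, 5, 9, 14, 23]
```

Because each comparison discards half of the remaining range, the number of comparisons grows with the logarithm of the table length.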
In an alternative embodiment, performing intra-block compression on the initial bitmap blocks in step S200 to obtain the final bitmap blocks may be implemented by:
compressing the zero elements and the repeated elements within each initial bitmap block, and encoding the information in the initial bitmap block, to obtain the final bitmap block.
In the above process, the zero elements in the initial bitmap block are compressed using a sparse matrix compression method. For bitmap blocks in which most elements are zero, sparse matrix compression reduces storage space and improves computing efficiency.
The repeated elements in the initial bitmap block may be compressed using run-length encoding (RLE), a simple and effective data compression method used mainly for continuously repeated data, and particularly suitable for sparse matrices, images, text, and similar data. The basic idea of run-length encoding is to represent a run of consecutive identical data elements (typically 0 or 1) as a combination of the element and its repetition count, which significantly reduces storage space when the data contains many consecutive repetitions. For example, a group of data is scanned element by element from the beginning, recording the current element and its repetition count; when consecutive repeated elements are found, the current element and the repetition count are encoded as a tuple or a special tag, typically (element, count) or (0, count); if non-repeating data is encountered, it is recorded directly, usually as (element) or (1); scanning then continues, repeating these steps until the end of the data is reached.
Therefore, according to the embodiment of the application, where most elements in an initial bitmap block are zero, a sparse matrix compression method can effectively reduce storage space; elements that repeat consecutively can be encoded with the RLE method as a combination of the element and its repetition count, further reducing storage space. The crowd searching method provided by the embodiment of the application can therefore significantly reduce storage usage, especially when processing large bitmap blocks or blocks containing many zero elements and consecutive repeated elements.
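Both compression ideas can be sketched in a few lines of Python. The sketch uses a uniform (element, count) tuple for every run, which is a simplification of the tagging scheme described above, and a list of non-zero positions as one simple sparse representation; the function names and the sample bit sequence are hypothetical.

```python
def rle_encode(bits):
    """Encode a sequence as a list of (element, run_length) tuples."""
    runs = []
    for b in bits:
        if runs and runs[-1][0] == b:
            # Same element as the current run: extend the count.
            runs[-1] = (b, runs[-1][1] + 1)
        else:
            runs.append((b, 1))
    return runs


def rle_decode(runs):
    """Expand (element, run_length) tuples back into the original sequence."""
    out = []
    for element, count in runs:
        out.extend([element] * count)
    return out


def sparse_positions(bits):
    """Sparse form: keep only the positions of the non-zero elements."""
    return [i for i, b in enumerate(bits) if b]


bits = [0, 0, 0, 0, 1, 1, 1, 0, 0, 1]
```

Which form is smaller depends on the data: long runs favour run-length encoding, while a block with few scattered ones favours the position list.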
Referring to fig. 3, fig. 3 is a schematic block diagram of a crowd searching system according to an embodiment of the disclosure; there is also provided in a second aspect of the present application a crowd lookup system 100 including a bitmap block generation module 110 and a lookup module 120.
The bitmap block generating module 110 is configured to generate a plurality of initial bitmap blocks based on the original crowd data; the bitmap block generating module 110 is further configured to perform intra-block compression on the initial bitmap block to obtain a final bitmap block, and record bitmap block information corresponding to the final bitmap block.
The searching module 120 is configured to search the target crowd in the final bitmap block based on the bitmap block information.
In an alternative embodiment, the bitmap block generating module 110 is configured to store the original crowd data in a key-value pair manner; wherein the key-value pair includes a key name and a key value; the key names represent crowd characteristics of the original crowd, and the key values represent data corresponding to the crowd characteristics.
In an alternative embodiment, the bitmap block generating module 110 is configured to record, in an indexed manner, positioning information corresponding to the final bitmap block; wherein the positioning information includes a block ID and an offset.
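One way such an index could look is sketched below: the final bitmap blocks are concatenated into one buffer, and the index maps each block ID to the offset at which that block starts. The stored length, the function names, and the byte-buffer layout are assumptions added for this example; the application only names the block ID and the offset.

```python
from typing import Dict, List, Tuple


def build_index(blocks: List[bytes]) -> Tuple[bytes, Dict[int, Tuple[int, int]]]:
    """Concatenate blocks and record (offset, length) for each block ID.

    The length field is an assumption of this sketch so that a block can be
    sliced out without consulting its neighbour.
    """
    buffer = bytearray()
    index: Dict[int, Tuple[int, int]] = {}
    for block_id, payload in enumerate(blocks):
        index[block_id] = (len(buffer), len(payload))
        buffer.extend(payload)
    return bytes(buffer), index


def read_block(buffer: bytes, index: Dict[int, Tuple[int, int]], block_id: int) -> bytes:
    """Jump straight to one block using its recorded offset."""
    offset, length = index[block_id]
    return buffer[offset:offset + length]


blocks = [b"\x01\x02", b"\x03", b"\x04\x05\x06"]
buffer, index = build_index(blocks)
```

With this layout a lookup touches only the index entry and the bytes of one block, rather than scanning every block in turn.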
In an alternative embodiment, the searching module 120 is configured to perform hierarchical searching on the final bitmap block according to the index and the crowd characteristics of the target crowd, so as to locate the target bitmap block in the final bitmap block, and to search for the target crowd within the target bitmap block.
In an alternative embodiment, the searching module 120 is configured to locate the target bitmap block in the final bitmap block according to the crowd characteristics and the index, and to search the key name and the key value of the key-value pair of the target bitmap block; wherein the key name is stored in the high-order bits of the data and the key value is stored in the low-order bits of the data.
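The high-order and low-order layout can be illustrated with a packed integer. The sketch below assumes a 64-bit word with 32 bits for a numeric key-name identifier and 32 bits for the key value; these widths, the numeric identifier, and the function names are assumptions made for this example and are not given in the application.

```python
KEY_BITS = 32    # assumed width of the key-name field (high-order bits)
VALUE_BITS = 32  # assumed width of the key-value field (low-order bits)
VALUE_MASK = (1 << VALUE_BITS) - 1


def pack(key_name_id: int, key_value: int) -> int:
    """Place the key name in the high-order bits and the key value in the low-order bits."""
    if not 0 <= key_name_id < (1 << KEY_BITS):
        raise ValueError("key_name_id out of range")
    if not 0 <= key_value <= VALUE_MASK:
        raise ValueError("key_value out of range")
    return (key_name_id << VALUE_BITS) | key_value


def unpack(word: int) -> tuple:
    """Split a packed word back into (key_name_id, key_value)."""
    return word >> VALUE_BITS, word & VALUE_MASK


word = pack(7, 123456)
```

A consequence of this layout is that packed words sort first by key name and then by key value, so entries for the same crowd characteristic sit next to each other in sorted order.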
In an optional embodiment, the searching module 120 is configured to: divide the target data corresponding to the key value into a first sub-table and a second sub-table according to the crowd characteristics, with the middle position of the target data as the demarcation point; search for the crowd characteristics in the first sub-table and the second sub-table respectively; and, in the case that the crowd characteristics are not found, repeat the bisection in the first sub-table and the second sub-table until the crowd characteristics are found or the table length is 0.
In an alternative embodiment, the bitmap block generating module 110 is further configured to compress the elements equal to 0 and the repeated elements in the initial bitmap block, and encode the information in the initial bitmap block to obtain the final bitmap block.
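The bisection described above resembles a standard binary search, sketched here under the assumption that the target data is held as a sorted list of integers. The function name and the choice to continue only in the half that can contain the target are assumptions of this example.

```python
from typing import List, Optional


def bisect_search(table: List[int], target: int) -> Optional[int]:
    """Return the position of target in a sorted table, or None if absent."""
    lo, hi = 0, len(table)
    while lo < hi:  # the loop ends when the remaining table length is 0
        mid = (lo + hi) // 2  # middle position used as the demarcation point
        if table[mid] == target:
            return mid
        if table[mid] < target:
            lo = mid + 1  # continue in the second sub-table
        else:
            hi = mid      # continue in the first sub-table
    return None


table = [3, 8, 15, 21, 42, 57]
found = bisect_search(table, 21)
missing = bisect_search(table, 4)
```

Because each step halves the remaining table, the number of comparisons grows logarithmically with the table length.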
Referring to fig. 4, fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present application. An electronic device 300 provided in an embodiment of the present application includes: a processor 301 and a memory 302, the memory 302 storing machine-readable instructions executable by the processor 301, which when executed by the processor 301 perform the method as described above.
Based on the same inventive concept, embodiments of the present application also provide a computer readable storage medium, where a computer program instruction is stored, and when the computer program instruction is read and executed by a processor, the steps in any of the above implementations are performed.
The computer readable storage medium may be any of various media capable of storing program code, such as random access memory (Random Access Memory, RAM), read-only memory (Read-Only Memory, ROM), programmable read-only memory (Programmable Read-Only Memory, PROM), erasable programmable read-only memory (Erasable Programmable Read-Only Memory, EPROM), electrically erasable programmable read-only memory (Electrically Erasable Programmable Read-Only Memory, EEPROM), and the like. The storage medium is used for storing a program, and the processor executes the program after receiving an execution instruction; the method executed by the electronic terminal, as defined by the process disclosed in any embodiment of the present invention, may be applied to the processor or implemented by the processor.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other manners. The above-described apparatus embodiments are merely illustrative. For example, the division of the units is merely a division by logical function, and other divisions are possible in an actual implementation; multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some communication interface, device, or unit, and may be in electrical, mechanical, or other form.
Further, the units described as separate units may or may not be physically separate, and units displayed as units may or may not be physical units, may be located in one place, or may be distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
Furthermore, functional modules in various embodiments of the present application may be integrated together to form a single portion, or each module may exist alone, or two or more modules may be integrated to form a single portion.
Alternatively, the embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the flows or functions according to the embodiments of the present invention are produced in whole or in part.
The computer may be a general purpose computer, a special purpose computer, a computer network, or other programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another, for example, by wired means (e.g., coaxial cable, optical fiber, Digital Subscriber Line (DSL)) or wireless means (e.g., infrared, radio, microwave, etc.).
In this document, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising … …" does not exclude the presence of other like elements in a process, method, article or apparatus that comprises the element.
The foregoing is merely exemplary embodiments of the present application and is not intended to limit the scope of the present application, and various modifications and variations may be suggested to one skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principles of the present application should be included in the protection scope of the present application.

Claims (10)

1. A crowd searching method, the method comprising:
generating a plurality of initial bitmap blocks based on original crowd data;
performing intra-block compression on the initial bitmap block to obtain a final bitmap block, and recording bitmap block information corresponding to the final bitmap block;
and searching a target crowd in the final bitmap block based on the bitmap block information.
2. The method of claim 1, wherein the generating a plurality of initial bitmap blocks based on the original crowd data comprises:
storing the original crowd data in the form of key-value pairs;
wherein the key-value pair includes a key name and a key value; the key names represent crowd characteristics of the original crowd, and the key values represent data corresponding to the crowd characteristics.
3. The method according to claim 2, wherein the recording bitmap block information corresponding to the final bitmap block includes:
recording positioning information corresponding to the final bitmap block in the form of an index; wherein the positioning information includes a block ID and an offset.
4. The method of claim 3, wherein the searching for the target crowd in the final bitmap block based on the bitmap block information comprises:
performing hierarchical searching on the final bitmap block according to the index and the crowd characteristics of the target crowd so as to locate the target bitmap block in the final bitmap block;
and searching for the target crowd in the target bitmap block.
5. The method of claim 4, wherein the hierarchically searching the final bitmap block according to the index and the crowd characteristics of the target crowd comprises:
locating a target bitmap block in the final bitmap block based on the crowd characteristics and the index;
searching the key name and the key value of the key value pair of the target bitmap block; wherein the key name is stored in the high order of the data, and the key value is stored in the low order of the data.
6. The method of claim 5, wherein the searching for the target crowd in the target bitmap block comprises:
dividing the target data corresponding to the key value into a first sub-table and a second sub-table according to the crowd characteristics, with the middle position of the target data as a demarcation point;
searching for the crowd characteristics in the first sub-table and the second sub-table respectively;
and, in the case that the crowd characteristics are not found, repeating the bisection in the first sub-table and the second sub-table until the crowd characteristics are found or the table length is 0.
7. The method of claim 1, wherein the performing intra-block compression on the initial bitmap block to obtain a final bitmap block comprises:
compressing the elements equal to 0 and the repeated elements in the initial bitmap block, and encoding the information in the initial bitmap block to obtain the final bitmap block.
8. A crowd searching system, the crowd searching system comprising: a bitmap block generation module and a searching module;
the bitmap block generation module is used for generating a plurality of initial bitmap blocks based on the original crowd data;
the bitmap block generating module is further used for carrying out intra-block compression on the initial bitmap block to obtain a final bitmap block, and recording bitmap block information corresponding to the final bitmap block;
the searching module is used for searching the target crowd in the final bitmap block based on the bitmap block information.
9. An electronic device comprising a memory and a processor, the memory having stored therein program instructions which, when executed by the processor, perform the steps of the method of any of claims 1-7.
10. A computer readable storage medium, characterized in that the computer readable storage medium has stored therein computer program instructions which, when executed by a processor, perform the steps of the method of any of claims 1-7.
CN202311378341.8A 2023-10-23 2023-10-23 Crowd searching method, system, electronic device and computer readable storage medium Pending CN117421481A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311378341.8A CN117421481A (en) 2023-10-23 2023-10-23 Crowd searching method, system, electronic device and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311378341.8A CN117421481A (en) 2023-10-23 2023-10-23 Crowd searching method, system, electronic device and computer readable storage medium

Publications (1)

Publication Number Publication Date
CN117421481A true CN117421481A (en) 2024-01-19

Family

ID=89532023

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311378341.8A Pending CN117421481A (en) 2023-10-23 2023-10-23 Crowd searching method, system, electronic device and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN117421481A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination