CN109376292B - Method for constructing bit sequence table, data searching method, public opinion monitoring system and information transmission system - Google Patents

Info

Publication number
CN109376292B
Authority
CN
China
Prior art keywords
bit sequence
bit
variable
sequence table
time
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811112028.9A
Other languages
Chinese (zh)
Other versions
CN109376292A (en)
Inventor
张志宏
郭磊
刘佳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Changsha University
Original Assignee
Changsha University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Changsha University filed Critical Changsha University
Priority to CN201811112028.9A priority Critical patent/CN109376292B/en
Publication of CN109376292A publication Critical patent/CN109376292A/en
Application granted granted Critical
Publication of CN109376292B publication Critical patent/CN109376292B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Abstract

A method for constructing a bit sequence table, in which every bit sequence unit contained in the table has the following characteristic: the number of bits used to store votes on data string matching is less than or equal to the number of bits that the computer in use processes at one time. A data searching method, a public opinion monitoring system and an information transmission system are also provided, characterized in that the bit sequence table is constructed with the above construction method and then used to carry out the search. The invention can improve time efficiency, space efficiency and the utilization of computer resources, and provides a new approach.

Description

Method for constructing bit sequence table, data searching method, public opinion monitoring system and information transmission system
Technical Field
The invention relates to the field of computer data processing, in particular to a method for constructing a bit sequence table, a data searching method, a public opinion monitoring system and an information transmission system.
Background Art
Data searching, also called data matching, is an important application of computer technology and is widely used in real-time text processing, file search systems, real-time data analysis, big data analysis, automatic public opinion survey systems and similar technologies. The prior art leaves room for improvement in its time efficiency on new data, in its space efficiency, and in how fully it uses computer resources.
Disclosure of Invention
The invention provides a method for constructing a bit sequence table, a data searching method, a public opinion monitoring system and an information transmission system, which can improve the time efficiency, the space efficiency and the utilization rate of computer resources.
The technical content of the invention is as follows:
1. A bit sequence, characterized in that: it stores votes on how a data string matches a substring to be searched, and the number of bits of the bit sequence used for voting is less than or equal to the number of bits that the computer in use processes at one time; the bit sequence stores the support or veto of one element of the data string regarding the match between the substring to be searched and each position of the whole data string; the bit sequence as a whole corresponds to one data string, and each bit in the bit sequence corresponds individually to one element of the data string.
2. A bit sequence, characterized in that: the number of bits of the bit sequence used for voting is less than the total number of bits of the bit sequence.
3. The bit sequence according to technical content 1, characterized in that: the bits of the bit sequence correspond to the elements of the data string in sequence.
4. The bit sequence according to technical content 2, characterized in that: the bits of the bit sequence correspond to the elements of the data string in the same order, that is, the first bit corresponds to the first element of the data string, the second bit to the second element, the third bit to the third element, and so on.
5. The bit sequence according to technical content 2, characterized in that: the bits of the bit sequence correspond to the elements of the data string in reverse order, that is, the first bit corresponds to the last element of the data string, the second bit to the second-to-last element, the third bit to the third-to-last element, and so on.
6. The bit sequence according to any one of technical contents 1 to 4, characterized in that: a bit value of 1 represents support and a bit value of 0 represents veto.
7. The bit sequence according to any one of technical contents 1 to 4, characterized in that: a bit value of 1 represents veto and a bit value of 0 represents support.
8. A bit sequence table, characterized in that: it stores bit sequences according to any one of technical contents 1 to 6;
a plurality of bit sequences are combined into one bit sequence table, and all the bit sequences have the same number of bits;
the bit sequence table has two query variables: the first query variable corresponds to the position of an element in the data string, and the second query variable corresponds to the value of the data string element whose sequence number equals the first query variable.
9. A method for constructing a bit sequence table, characterized in that: it is a method for constructing the bit sequence table of technical content 7;
all bit sequences correspond to the same substring to be searched;
for convenience of expression, let the first query variable of an element of the bit sequence table be j and the second query variable be ca, so that the element is written biao[j][ca], and let the sequence number of the first element of the data string be w;
the bit sequence stored in biao[j][ca] has no veto over matches at data string positions whose sequence number is greater than j (for example, if j is 4 and the substring length is 3, the bit sequence stored in biao[j][ca] has no veto over matches at data string positions whose index is greater than 4);
the bit sequence stored in biao[j][ca] has no veto over matches at data string positions whose sequence number is smaller than j and whose distance to j is greater than the length of the substring (for example, if j is 4 and the substring length is 3, the bit sequence stored in biao[j][ca] has no veto over matches at data string positions whose index is less than 1);
if ca equals the value of the substring element with sequence number k, the bit sequence stored in biao[j][ca] casts a support vote for the match at the data string position with sequence number j-k+w;
if ca does not equal the value of the substring element with sequence number k, the bit sequence stored in biao[j][ca] casts a veto vote for the match at the data string position with sequence number j-k+w;
the above rules need not be applied in any particular order.
10. A method for constructing a bit sequence table, characterized in that: it is a method for constructing the bit sequence table of technical content 7;
all bit sequences correspond to the same substring to be searched;
for convenience of expression, let the first query variable of an element of the bit sequence table be j and the second query variable be ca, so that the element is written biao[ca][j], and let the sequence number of the first element of the data string be w;
the bit sequence stored in biao[ca][j] has no veto over matches at data string positions whose sequence number is greater than j;
the bit sequence stored in biao[ca][j] has no veto over matches at data string positions whose sequence number is smaller than j and whose distance to j is greater than the length of the substring;
if ca equals the value of the substring element with sequence number k, the bit sequence stored in biao[ca][j] casts a support vote for the match at the data string position with sequence number j-k+w;
if ca does not equal the value of the substring element with sequence number k, the bit sequence stored in biao[ca][j] casts a veto vote for the match at the data string position with sequence number j-k+w;
the above rules need not be applied in any particular order.
11. The method according to technical content 9 or 10, characterized in that: sequence numbers are counted from 0, i.e. the value of w is 0.
12. The method according to technical content 9 or 10, characterized in that: sequence numbers are counted from 1, i.e. the value of w is 1 (array indices start at 1 in some programming languages, such as Easy Language).
13. A method for obtaining data matches, characterized in that: it is used to determine how a data string matches a substring to be searched;
the length of the data string is less than or equal to the number of bits of the bit sequences in the bit sequence table;
a. constructing a bit sequence table biao[][] from the data of the substring to be searched, using the bit sequence table construction method of technical content 8 or 9;
b. taking the position sequence number of each element of the data string as the first query variable and the value of each element as the second query variable (Bz1.1), looking up in the bit sequence table the bit sequence (Sj1.3) corresponding to each element of the data string, and tallying the voting information to obtain a voting result sequence that gives, for each position of the data string, the result of matching against the substring;
flow a precedes flow b in time.
Flows a and b may follow each other directly, or other flows may be inserted between them.
14. The data searching method according to technical content 10, characterized in that: the voting result sequence is obtained by tallying with a one-vote veto (Bz2.3).
15. The data searching method according to technical content 10, characterized in that: the tally is made by accumulating support votes, and if the accumulated value for an element of the data string equals the length of the substring to be searched, that element is a fully matched position.
16. A data searching method, characterized in that:
the matching positions of a substring to be searched (xu->str) are obtained from a searched mother string (bei->str), the length of the mother string (bei->size) being greater than or equal to the length of the substring (xu->size);
the value range of the mother string elements is fixed and known before the mother string is loaded, and the value range of the substring elements is fixed and known before the substring is loaded;
the method comprises the following flows:
a. constructing a bit sequence table biao[][] from the data of the substring to be searched, using the bit sequence table construction method of technical content 8 or 9;
b. matching the mother string using the bit sequence table biao[][] constructed in flow a, according to the following rules:
step 1, fetching a data string from the mother string in order, the length of the fetched data string being equal to the number of bits used for voting in the bit sequences stored in the elements of the bit sequence table biao[][] (see youmo in the drawings);
step 2, looking up the bit sequences for the elements of the data string that have not yet been queried, skipping the elements that were already queried the last time this step was executed: the position sequence number of each element of the data string is used as the first query variable and its value as the second query variable (Bz1.1, Bz1.2 and Bz1.3) to look up the corresponding bit sequences (Sj1.1, Sj1.2 and Sj1.3) in the bit sequence table; the voting information of all elements of the data string, including any voting information retained from the last execution of this step, is tallied with a one-vote veto (Bz2.1, Bz2.2 and Bz2.3) to obtain the voting result sequence (Sj2.1, Sj2.2 and Sj2.3);
let k be a position in the data string; if the distance from k to the end of the data string is greater than or equal to the length of the substring (xu->size) minus 1, then k lies in the matching area (pipei);
if the distance from k to the end of the data string is smaller than the length of the substring (xu->size) minus 1, then k lies in the sliding area (huadong);
if the sequence number of k is greater than the maximum sequence number of the sliding area, k is meaningless (ny in Sj2.3);
positions that lie in the matching area and whose voting result is support are matching positions, and they are recorded (BZ3.1, BZ3.2 and BZ3.3);
among the positions that lie in the sliding area and whose voting result is support, the one with the smallest sequence number is the sliding position;
step 3, if the total number of matching positions obtained so far is less than the number of matching positions to be found and the data string has not reached the end of the mother string, go to step 4; otherwise go to flow c;
step 4, if the voting result sequence (Sj2.1) obtained in step 2 contains a sliding position, retain the voting information at and after the sliding position in the voting result sequence (Sj2.1), erase the other information, and return to step 1 with the sliding position as the starting position of the data string (see the lower left corner of FIG. 2);
if the voting result sequence (Sj2.1) obtained in step 2 contains no sliding position, take the position following the mother string element that corresponds to the end of the current data string as the starting position of the data string and return to step 1 without retaining any information from the voting result sequence (Sj2.1).
c. The process is complete.
17. The data searching method according to technical content 13, characterized in that: the voting information at and after the sliding position in the voting result sequence (Sj2.1) is retained, and the other information erased, by a shift operation (BZ6 in FIG. 2) performed with a bit operation instruction of the computer.
18. The data searching method according to technical content 14, characterized in that: the bit operation instruction is a shift instruction.
19. A file search system, characterized in that: it applies the technical solution of any one of technical contents 1 to 15.
20. A file management system, characterized in that: it applies the technical solution of any one of technical contents 1 to 15.
21. A data search system, characterized in that: it applies the technical solution of any one of technical contents 1 to 15.
22. A data management system, characterized in that: it applies the technical solution of any one of technical contents 1 to 15.
23. A text search system, characterized in that: it applies the technical solution of any one of technical contents 1 to 15.
24. A text management system, characterized in that: it applies the technical solution of any one of technical contents 1 to 15.
25. A search engine, characterized in that: it applies the technical solution of any one of technical contents 1 to 15.
26. A public opinion monitoring system, characterized in that: it applies the technical solution of any one of technical contents 1 to 15.
27. A big data analysis system, characterized in that: it applies the technical solution of any one of technical contents 1 to 15.
28. An artificial intelligence system, characterized in that: it applies the technical solution of any one of technical contents 1 to 15.
29. A social network analysis system, characterized in that: it applies the technical solution of any one of technical contents 1 to 15.
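The construction rules and the one-vote-veto tally described in the technical contents above can be illustrated with a minimal C sketch. It is only an illustration under stated assumptions, not the patented implementation: it assumes an 8-bit voting width, sequence numbers counted from 0 (w = 0), 1 for support and 0 for veto, bit p standing for data string position p, and the table indexed as table[j][ca]. The names vote_table, build_table and tally_votes are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define WINDOW 8 /* assumed number of voting bits; fits one machine word */

/* table[j][ca]: votes cast by a data string element with value ca at
   position j. Bit p set means "support, or no veto" for a match that
   starts at data string position p. */
typedef uint8_t vote_table[WINDOW][256];

/* Fill the table for substring sub of length m (1 <= m <= WINDOW). */
static void build_table(vote_table t, const unsigned char *sub, size_t m)
{
    assert(m >= 1 && m <= WINDOW);
    for (size_t j = 0; j < WINDOW; j++) {
        for (int ca = 0; ca < 256; ca++) {
            uint8_t bits = 0xFF; /* start with no veto anywhere */
            /* An element at position j can only judge start positions
               j-k for substring indices k = 0 .. min(j, m-1). */
            for (size_t k = 0; k < m && k <= j; k++) {
                if (sub[k] != (unsigned char)ca)
                    bits &= (uint8_t)~(1u << (j - k)); /* veto start j-k */
            }
            t[j][ca] = bits;
        }
    }
}

/* One-vote veto: AND together the votes of every element of data. */
static uint8_t tally_votes(vote_table t, const unsigned char *data, size_t len)
{
    assert(len <= WINDOW);
    uint8_t result = 0xFF;
    for (size_t j = 0; j < len; j++)
        result &= t[j][data[j]];
    return result;
}
```

For the substring "abc" and the data string "abcabcab", the tally leaves bits 0, 3 and 6 set: positions 0 and 3 are complete matches, while position 6 holds only the prefix "ab" and so falls in what the document calls the sliding area.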
Description of technical effects:
The invention can improve time efficiency, space efficiency and the utilization of computer resources, and provides a new approach. It makes full use of the resources of the computer processor, which speeds up data matching.
Drawings
FIG. 1 is a schematic diagram of a test example of Embodiment 1 of the present invention. To help the reader, FIG. 1 draws a logical structure, youmo, that represents the data string being fetched. youmo does not physically exist; it has 8 notional logical units, the first unit y0 corresponding to the element of the mother string bei->str with subscript i, the second unit y1 to the element with subscript i+1, and so on in sequence. Bz1.1 in FIG. 1 is the lookup, in the bit sequence table, of the bit sequences (Sj1.1) corresponding to the elements of the data string. Bz2.1 in FIG. 1 is the one-vote-veto tally of the voting information of the bit sequences in Sj1.1, and Sj2.1 in FIG. 1 is the voting result sequence obtained by that tally: a position whose bit is 1 in the pipei segment is a matching position, and a position whose bit is 1 in the huadong segment is a sliding position. The numbers listed below Sj1.1 and Sj2.1 in FIG. 1 are the sequence numbers of the bits in the bit sequence. The numbers listed below bei->str in FIG. 1 are the sequence numbers of the elements of the mother string.
FIG. 2 is a schematic diagram of a test example of Embodiment 1 of the present invention, continuing in time from FIG. 1. Because FIG. 1 yielded a sliding position 6, the first unit y0 of youmo in FIG. 2 corresponds to bei->str[6]. Sj2.1 in FIG. 2 is the Sj2.1 of FIG. 1. Because a sliding position was obtained in FIG. 1, FIG. 2 retains the contents of the huadong segment of FIG. 1 by a shift operation (Bz6); after the shift, the 7th bit (subscript 6) and the 8th bit (subscript 7) of Sj2.1 are held in the 1st bit (subscript 0) and the 2nd bit (subscript 1) of Sj3. So that subsequent voting is not affected, a bitwise OR (BZ4) is performed on Sj3 and Sj4 to obtain the bit sequence Sj5, which serves as the voting information. Since the voting information of bei->str[6] and bei->str[7] was already read in FIG. 1 and is stored in Sj5, it does not need to be read again, so the table lookup starts at y2, which corresponds to bei->str[8].
FIG. 3 is a schematic diagram of a test example of Embodiment 1 of the present invention, continuing in time from FIG. 2. Because there is no sliding position in FIG. 2, youmo in FIG. 3 starts at 14 (6 + 8 = 14); because youmo has reached the end of bei->str, the search ends. Because the data string in FIG. 3 has a length of only 7 while the bit sequence has 8 bits, the voting result sequence Sj2.3 contains a meaningless region ny.
FIG. 4 is a screenshot of the code execution result of Embodiment 29 of the present invention.
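The window movement described for FIGS. 1 to 3 (tally, record matches in the pipei segment, shift the huadong segment down, OR in ones for the not-yet-queried units, then look up only the new units) can be sketched in C as follows. This is a hedged illustration rather than the code of the embodiment: the 8-bit width, the table layout tab[j][ca], and the names build and search are assumptions, and the sample strings used afterwards are invented, not the test data of the figures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define W 8 /* assumed window width = number of voting bits */

static uint8_t tab[W][256]; /* tab[j][ca], bit p = support for start position p */

static void build(const unsigned char *sub, size_t m)
{
    for (size_t j = 0; j < W; j++)
        for (int ca = 0; ca < 256; ca++) {
            uint8_t bits = 0xFF;
            for (size_t k = 0; k < m && k <= j; k++)
                if (sub[k] != (unsigned char)ca)
                    bits &= (uint8_t)~(1u << (j - k));
            tab[j][ca] = bits;
        }
}

/* Stores up to cap match offsets in out; returns the total number of matches. */
static size_t search(const unsigned char *text, size_t n,
                     const unsigned char *sub, size_t m,
                     size_t *out, size_t cap)
{
    assert(m >= 1 && m <= W);
    if (n < m) return 0;
    build(sub, m);
    size_t found = 0;
    size_t start = 0;     /* mother string index of window unit y0 */
    size_t kept = 0;      /* leading units whose votes are already tallied */
    uint8_t votes = 0xFF; /* voting result sequence for the current window */
    while (start < n) {
        size_t len = (n - start < W) ? n - start : W;
        for (size_t j = kept; j < len; j++) /* look up only new units */
            votes &= tab[j][text[start + j]];
        for (size_t p = 0; p + m <= len; p++) /* matching area */
            if ((votes >> p) & 1u) {
                if (found < cap) out[found] = start + p;
                found++;
            }
        if (len < W) break; /* the window has reached the end of the text */
        size_t slide = W;   /* W means "no sliding position" */
        for (size_t p = W - m + 1; p < W; p++) /* sliding area */
            if ((votes >> p) & 1u) { slide = p; break; }
        if (slide < W) {
            /* Keep the votes from the sliding position onward by shifting
               them to bit 0, and set the remaining bits to 1. */
            kept = W - slide;
            votes = (uint8_t)((votes >> slide) | (0xFFu << kept));
        } else {
            kept = 0;
            votes = 0xFF;
        }
        start += slide;
    }
    return found;
}
```

Searching for "abc" in "abcabcababcxabc" with this sketch reports offsets 0, 3, 8 and 12. The first window ends with the partial match "ab" at position 6, so the next window starts at 6 and looks up only the units from y2 onward, as in FIG. 2.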
DETAILED DESCRIPTION OF EMBODIMENT (S) OF INVENTION
Embodiment 1: a data searching method, characterized in that:
it is a search process for computer data, implemented by means of a digital computer;
the matching positions of a substring to be searched (xu->str) are obtained from a searched mother string (bei->str); the length of the mother string (bei->size) is greater than or equal to the length of the substring (xu->size); the value range of the mother string elements is fixed and can be known before the mother string is loaded, and the value range of the substring elements is fixed and can be known before the substring is loaded;
a bit sequence table biao[][] is allocated, in which the maximum sequence number of the first dimension is 256 and the maximum sequence number of the second dimension is the number of bits that the bit sequence uses for voting; every bit sequence unit in the table has the following characteristics: the number of bits used for voting is less than or equal to the number of bits that the computer in use processes at one time; the bit sequence stores the support or veto of one element of the data string regarding the match between the substring to be searched and each position of the whole data string; the bit sequence as a whole corresponds to one data string, and each bit corresponds individually to one element of the data string; a bit value of 1 represents support and a bit value of 0 represents veto; the bits correspond to the elements of the data string in the same order, that is, the first bit corresponds to the first element of the data string, the second bit to the second element, the third bit to the third element, and so on;
bit sequence variables n1 and n2 are allocated;
Preprocessing:
Step D1, constructing the bit sequence table biao[][] with the bit sequence table construction method, specifically as follows:
step D1.1, constructing a bit sequence table template; the C language pseudo code is expressed as: construct bit sequence table template();
the bit sequence table template is constructed as follows:
step D1.1.1, allocating memory for the bit sequence table template (wei_zhuanan jm[8]); the template consists of bit sequence units, the number of which equals the number of bits that the bit sequence uses for voting; the sequence numbers of the bit sequence units in the template are counted from zero;
step D1.1.2, assigning values to the bit sequence units in the template (wei_zhuanan jm[8]), as follows:
step D1.1.2.1, setting all voting bits of each bit sequence unit in the template (wei_zhuanan jm[8]) to 1; the C language pseudo code is expressed as: jm[j].cha = 0xFF;
step D1.1.2.2, shifting all voting bits of each bit sequence unit, the shift count being the sequence number of that unit in the template (wei_zhuanan jm[8]) plus 1; the C language pseudo code is expressed as: jm[j].cha <<= (j + 1);
step D1.2, loading the bit sequence table template (jm) into the first dimension (biao[0][]) of the bit sequence table, as follows:
step D1.2.1, copying the data of the template (jm) into the bit sequence table in order; the C language pseudo code is expressed as: memcpy(&biao[0][0], jm, sizeof(wei_zhua) * 8);
step D1.3, setting to 1 the bits over which each bit sequence unit in the first dimension (biao[0][]) of the bit sequence table has no veto, specifically as follows:
step D1.3.1, allocating a variable, called the ninth variable (n9), whose number of bits equals the number of bits of the bit sequence;
step D1.3.2, setting every bit of the ninth variable (n9) to 1; the C language pseudo code is expressed as: n9.cha = 0xFF;
step D1.3.3, shifting the ninth variable to the right, the shift count being the maximum sequence number of the bit sequence units minus the sequence number of the bit sequence unit currently being processed, plus the maximum sequence number (size) of the characters in the string to be searched, where the first character of the string has sequence number zero, i.e. character subscripts start from zero; the C language pseudo code is expressed as: n9.cha >>= (7 - j + size);
step D1.3.4, performing a bitwise OR of the current bit sequence unit with the ninth variable (n9) and assigning the result to the current bit sequence unit; the C language pseudo code is expressed as: biao[0][j].cha |= n9.cha;
step D1.4, copying each bit sequence unit of the first dimension (biao[0][]) of the bit sequence table, in the same order, to the other dimensions of the table; the C language pseudo code is expressed as: for (size_t i = 1; i < 256; i++) memcpy(&biao[i][0], &biao[0][0], sizeof(wei_zhua) * 8);
step D1.5, carrying out voting information assignment on each bit sequence unit in the bit sequence table, wherein the specific method comprises the following steps:
step D1.5.1, constructing a first counting loop with a first counting variable (j); the initial value of the first counting variable is the number of bits that the bit sequence uses for voting, the first counting variable (j) is decremented by 1 on each iteration of the first counting loop, and the loop stops when the first counting variable (j) is less than zero; the C language pseudo code is expressed as: for (int j = 8; j >= 0; j--) { }; each iteration of the first counting loop comprises the following operations:
step D1.5.1.1, allocating a variable kr;
step D1.5.1.2, assigning the value of the first counting variable (j) to kr;
step D1.5.1.3, if the first counting variable (j) is greater than the maximum sequence number (size) of the characters in the string to be searched, assigning that maximum sequence number (size) to kr;
step D1.5.1.4, constructing a second counting loop with a second counting variable (k); the initial value of the second counting variable (k) is the value of kr, the second counting variable (k) is decremented by 1 on each iteration of the second counting loop, and the loop stops when the second counting variable (k) is less than zero; the C language pseudo code is expressed as: for (int k = kr; k >= 0; k--) { }; each iteration of the second counting loop comprises the following operations:
step D1.5.1.4.1, allocating a variable, called the tenth variable (n10), whose number of bits equals the number of bits of the bit sequence;
step D1.5.1.4.2, assigning 1 to the tenth variable (n10); the C language pseudo code is expressed as: n10.cha = 0x1;
step D1.5.1.4.3, shifting the tenth variable (n10) to the left, the shift count being the first counting variable (j) minus the second counting variable (k), and assigning the result to the tenth variable (n10); the C language pseudo code is expressed as: n10.cha <<= (j - k);
step D1.5.1.4.4, allocating a variable ca;
step D1.5.1.4.5, fetching the character (xu->str[k]) whose sequence number in the string to be searched (xu->str) is the second counting variable (k) and assigning it to ca; the C language pseudo code is expressed as: unsigned char ca = xu->str[k];
step D1.5.1.4.6, assigning a value to the unit (biao[ca][j]) whose first-dimension sequence number equals the value of ca and whose second-dimension sequence number equals the value of the first counting variable (j), as follows: a bitwise OR is performed on the value of the unit and the tenth variable (n10), and the result is assigned to the unit; the C language pseudo code is expressed as: biao[ca][j].cha |= n10.cha;
step D1.6, end.
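Steps D1.1 to D1.5 can be condensed into the following C sketch, offered as an illustration under assumptions rather than as the embodiment's actual code. It assumes an 8-bit unit type with a single member cha (the real layout of the unit type is not given in the text), a table declared as biao[256][8], a first counting loop that starts at 7 so that every index stays within that declaration, and a right-shift count in step D1.3 computed from the substring length len, so that the bit for the position len - 1 places before j keeps its veto. The names wei_zhuan, build_biao and tally are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BITS 8 /* assumed number of voting bits */

typedef struct { uint8_t cha; } wei_zhuan; /* hypothetical unit layout */

static wei_zhuan biao[256][BITS]; /* biao[ca][j] */

static void build_biao(const unsigned char *str, size_t len)
{
    assert(len >= 1 && len <= BITS);
    wei_zhuan jm[BITS]; /* template row (steps D1.1 to D1.3) */
    for (int j = 0; j < BITS; j++) {
        /* D1.1.2: bits above j are 1, i.e. no veto over later positions. */
        jm[j].cha = (uint8_t)(0xFFu << (j + 1));
        /* D1.3: bits for positions too far before j are 1 as well. */
        jm[j].cha |= (uint8_t)(0xFFu >> (BITS - 1 - j + (int)len));
    }
    /* D1.2 and D1.4: every character value starts from the template. */
    for (int ca = 0; ca < 256; ca++)
        memcpy(biao[ca], jm, sizeof jm);
    /* D1.5: set the support bit j-k for the character str[k] at position j. */
    for (int j = BITS - 1; j >= 0; j--) {
        int kr = (j < (int)len - 1) ? j : (int)len - 1;
        for (int k = kr; k >= 0; k--) {
            unsigned char ca = str[k];
            biao[ca][j].cha |= (uint8_t)(1u << (j - k));
        }
    }
}

/* One-vote-veto tally over at most BITS elements, used to check the table. */
static uint8_t tally(const unsigned char *data, size_t n)
{
    uint8_t v = 0xFF;
    for (size_t j = 0; j < n && j < BITS; j++)
        v &= biao[data[j]][j].cha;
    return v;
}
```

With the substring "abc", a character that never occurs in it, such as 'x', keeps the bare template value at position 4, which is 0xE3: it vetoes start positions 2, 3 and 4 and has no say over the rest. Building from a template and then setting support bits touches each table cell far less often than testing every (j, ca, k) combination, which appears to be the motivation for the template step.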
The searching operation flow is as follows:
d2.1, distributing unsigned integer variables sp and hr; the c language pseudo code is expressed as: signaled int sp ═ 0;
d2.2, allocating a character pointer sty; the c language pseudo code is expressed as: unidimensional int hr;
d2.3, allocating a bit sequence pointer kp; the c language pseudo code is expressed as: wei _ zhuan kp;
Step D2.4, assign zero to the effective length value (sn.shu) of the intuitive result instance (jieguo) constructed in step D1. The C pseudocode is: jieguo.shu = 0;
Step D2.5, allocate an unsigned integer variable b_size and assign it the length of the parent string (bei->size). The C pseudocode is: unsigned int b_size = bei->size;
Step D2.6, judge whether the length (xu->size) of the substring (xu->str) to be searched is greater than the number of voting bits of the bit sequence. If it is greater, exit the flow and return empty information to the caller; otherwise go to the next step. The C pseudocode is: if (xu->size > 8) return (NULL);
Step D2.7, compare the length (bei->size) of the parent string (bei->str) with the length (xu->size) of the substring (xu->str). If the parent string is shorter than the substring, exit the flow and return empty information to the caller; otherwise go to the next step. The C pseudocode is: if (bei->size < xu->size) return (NULL);
Step D2.8, allocate an unsigned integer variable ks and assign it the value of b_size minus the number of voting bits of the bit sequence. The C pseudocode is: unsigned int ks = b_size - 8;
Step D2.9, allocate an unsigned integer variable bc and assign it zero. The C pseudocode is: unsigned char bc = 0;
Step D2.10, allocate a signed integer variable i and assign it zero. The C pseudocode is: int i = 0;
Step D2.11, allocate a signed integer variable key and assign it the number of voting bits of the bit sequence plus one, minus the length (xu->size) of the substring (xu->str). The C pseudocode is: int key = 8 + 1 - xu->size;
Step D2.12, set all voting bits of the bit sequence variable n1 to 1. The C pseudocode is: n1.cha = 0xFF;
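As a minimal sketch of the preparation in steps D2.5 to D2.12, the guards and initial values could be gathered as shown here. The names init_search, struct search_state, and VOTE_BITS are hypothetical and do not appear in the original. The sketch also assumes a signed ks, which departs from the unsigned declaration of D2.8, so that a parent string shorter than 8 characters gives a negative ks instead of wrapping around.

```c
enum { VOTE_BITS = 8 };  /* voting bits per bit sequence unit, as in the text */

/* Hypothetical holder for the values prepared in D2.8 to D2.12. */
struct search_state {
    int ks;             /* last start index of a full 8-character window */
    int key;            /* VOTE_BITS + 1 - substring length */
    unsigned char bc;   /* positions carried over from the previous window */
    unsigned char n1;   /* running vote mask; 0xFF = every candidate alive */
};

/* Returns 0 when the search may proceed, -1 when D2.6 or D2.7 rejects it. */
int init_search(int parent_len, int sub_len, struct search_state *st)
{
    if (sub_len > VOTE_BITS) return -1;   /* D2.6: substring too long  */
    if (parent_len < sub_len) return -1;  /* D2.7: parent too short    */
    st->ks  = parent_len - VOTE_BITS;     /* D2.8 (signed: assumption) */
    st->bc  = 0;                          /* D2.9  */
    st->key = VOTE_BITS + 1 - sub_len;    /* D2.11 */
    st->n1  = 0xFF;                       /* D2.12 */
    return 0;
}
```

With a 20-character parent string and a 3-character substring, this gives ks = 12 and key = 6.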
Step D2.13, construct a third counting loop whose condition variable is i; i starts at 0, and the loop is no longer executed once i is greater than ks. The C code is: for (i = 0; i <= ks;) { };
The body of the third counting loop has the following flow:
Step D2.13.0, judge whether bc is greater than zero; if not, go to step D2.13.6; if so, go to D2.13.1. The C pseudocode is: if (bc > 0) { };
Step D2.13.1, shift the variable n1 right by bc bits and assign the result to n1. The C pseudocode is: n1.cha >>= bc;
Step D2.13.2, set every bit of the bit sequence variable n2 to 1. The C pseudocode is: n2.cha = 0xFF;
Step D2.13.3, assign to bc the number of voting bits of the bit sequence minus bc. The C pseudocode is: bc = 8 - bc;
Step D2.13.4, shift n2 left by bc bits and assign the result to n2. The C pseudocode is: n2.cha <<= bc;
Step D2.13.5, bitwise-OR n1 and n2 and assign the result to n1. The C pseudocode is: n1.cha = n1.cha | n2.cha;
Step D2.13.6, point sty at the character of the parent string (bei->str) whose index equals the value of the variable i. The C pseudocode is: sty = (bei->str) + i;
Step D2.13.7, construct a fourth counting loop whose condition variable is j; j starts at bc and increases by 1 on each iteration, and the loop is no longer executed once j is greater than or equal to the number of voting bits of the bit sequence. The C code is: for (int j = bc; j < 8; j++) { };
The body of the fourth counting loop has the following flow:
Step D2.13.7.1, allocate a variable ca and assign it the element at index j of the data string that starts at sty within the parent string. The C code is: unsigned char ca = sty[j];
Step D2.13.7.2, obtain the value of the bit sequence unit whose first-dimension index is ca and whose second-dimension index is j in the bit sequence table, and assign it to n2. The C code is: n2.cha = biao[ca][j].cha;
Step D2.13.7.3, bitwise-AND n1 and n2 and assign the result to n1. The C code is: n1.cha &= n2.cha;
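The carry-over of steps D2.13.0 to D2.13.5 and the voting loop of D2.13.7 might be sketched as follows. The sketch assumes that bit sequence units are plain unsigned char values (the .cha member of the original is dropped) and that the table is indexed as table[character][window position]; carry_over, vote_window, and VOTE_BITS are illustrative names only.

```c
enum { VOTE_BITS = 8 };

/* D2.13.0 to D2.13.5: re-base the vote mask after the window slid by *bc.
   Returns the new mask; *bc becomes the first window position not yet read. */
unsigned char carry_over(unsigned char n1, unsigned char *bc)
{
    if (*bc == 0) return n1;                     /* D2.13.0: nothing carried */
    n1 = (unsigned char)(n1 >> *bc);             /* D2.13.1 */
    *bc = (unsigned char)(VOTE_BITS - *bc);      /* D2.13.3 */
    /* D2.13.2, D2.13.4, D2.13.5: the vacated high bits become 1 again */
    return (unsigned char)(n1 | (0xFF << *bc));
}

/* D2.13.7: AND in the votes of window positions bc .. VOTE_BITS - 1. */
unsigned char vote_window(unsigned char table[256][VOTE_BITS],
                          const unsigned char *sty,
                          unsigned char n1, unsigned char bc)
{
    for (int j = bc; j < VOTE_BITS; j++) {
        unsigned char ca = sty[j];               /* D2.13.7.1 */
        n1 &= table[ca][j];                      /* D2.13.7.2, D2.13.7.3 */
    }
    return n1;
}
```

Because only positions bc and above are read again, characters that already voted in the previous window are not looked up a second time.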
Step D2.13.8, judge whether the value of n1 is nonzero; if it is nonzero, go to step D2.13.9; otherwise go to step D2.13.10;
Step D2.13.9, this step comprises the following sub-steps:
Step D2.13.9.1, set every bit of n2 to 1. The C code is: n2.cha = 0xFF;
Step D2.13.9.2, shift n2 right, the shift count being the length (xu->size) of the substring (xu->str) minus 1, and assign the result to n2. The C code is: n2.cha >>= xu->size - 1;
Step D2.13.9.3, bitwise-AND n2 and n1 and assign the result to n2. The C code is: n2.cha &= n1.cha;
Step D2.13.9.4, judge whether n2 is nonzero; if n2 is nonzero, go to D2.13.9.5; if n2 is zero, go to D2.13.9.6;
Step D2.13.9.5, a bit equal to 1 in the variable n2 marks a matching position; analyze n2 by the following rule: if the nth bit of n2 is 1, then position i + n of the parent string (bei->str) is a matching position of the substring;
Step D2.13.9.6, set every bit of n2 to 1. The C code is: n2.cha = 0xFF;
Step D2.13.9.7, shift n2 left, the shift count being the value of key, and assign the result to n2. The C code is: n2.cha <<= key;
Step D2.13.9.8, bitwise-AND n2 and n1 and assign the result to n2. The C code is: n2.cha &= n1.cha;
Step D2.13.9.9, judge whether n2 is nonzero; if n2 is nonzero, go to D2.13.9.10; if n2 is zero, go to D2.13.9.11;
Step D2.13.9.10, a bit equal to 1 in the variable n2 indicates a sliding step; analyze n2 by the following rule: examine the bits in order, and the first bit found to be 1 gives the minimum sliding step. If the index of that first 1 is nx, increase i by nx, assign nx to bc, and jump to the end of the third counting loop. The C pseudocode is: continue;
Step D2.13.9.11, increase i by the number of voting bits of the bit sequence. The C pseudocode is: i += 8;
Step D2.13.9.12, assign zero to bc. The C pseudocode is: bc = 0;
Step D2.13.9.13, set every bit of n1 to 1. The C code is: n1.cha = 0xFF;
Step D2.13.9.14, jump to the end of the third counting loop. The C pseudocode is: continue;
Step D2.13.10, this step comprises the following sub-steps:
Step D2.13.10.1, increase i by the number of voting bits of the bit sequence. The C pseudocode is: i += 8;
Step D2.13.10.2, assign zero to bc. The C pseudocode is: bc = 0;
Step D2.13.10.3, set every bit of n1 to 1. The C code is: n1.cha = 0xFF;
Step D2.13.10.4, jump to the end of the third counting loop. The C pseudocode is: continue;
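The two analyses of n1 in step D2.13.9 can be illustrated with a short sketch. It assumes that bits are numbered from zero at the least significant end, so that bit n stands for a candidate starting at window offset n; this reading is inferred from the shift counts in the steps and is not stated outright in the text. The function names are hypothetical, and the substring length is assumed to lie between 1 and 8.

```c
enum { VOTE_BITS = 8 };

/* D2.13.9.1 to D2.13.9.3: keep the candidates whose whole substring fits
   inside the current window. Bit n set means a match at position i + n. */
unsigned char full_match_mask(unsigned char n1, int sub_len)
{
    return (unsigned char)(n1 & (0xFF >> (sub_len - 1)));
}

/* D2.13.9.6 to D2.13.9.11: the lowest surviving candidate that runs past
   the window end gives the minimum sliding step; when none survives the
   window advances by VOTE_BITS. */
int slide_step(unsigned char n1, int sub_len)
{
    int key = VOTE_BITS + 1 - sub_len;                    /* D2.11 */
    unsigned char n2 = (unsigned char)(n1 & (0xFF << key));
    for (int nx = key; nx < VOTE_BITS; nx++)
        if (n2 & (1u << nx))
            return nx;                                    /* D2.13.9.10 */
    return VOTE_BITS;                                     /* D2.13.9.11 */
}
```

For a 3-character substring and n1 = 0x45 (bits 0, 2, and 6 set), the first function reports matches at offsets 0 and 2, and the second yields a sliding step of 6.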
Step D2.14, judge whether i lies in the interval bounded by the parent string length (bei->size) minus the number of voting bits of the bit sequence and the parent string length (bei->size) minus the substring length (xu->size); if i lies in the interval, go to D2.15; otherwise go to D2.16. The C pseudocode is: if (i <= bei->size - xu->size && i >= bei->size - 8) { };
Step D2.15, this step comprises the following sub-steps:
Step D2.15.1, allocate an unsigned integer variable hr and assign it the parent string length minus i. The C code is: unsigned int hr = bei->size - i;
Step D2.15.2, judge whether bc equals zero; if so, go to D2.15.3; otherwise go to D2.15.4;
Step D2.15.3, set all bits of n1 to 1. The C pseudocode is: n1.cha = 0xFF;
Step D2.15.4, judge whether bc is greater than zero; if bc is greater than zero, go to D2.15.5; otherwise go to D2.15.6;
Step D2.15.5, this step comprises the following sub-steps:
Step D2.15.5.0, judge whether bc is greater than zero; if not, go to step D2.15.6; if so, go to D2.15.5.1. The C pseudocode is: if (bc > 0) { };
Step D2.15.5.1, shift the variable n1 right by bc bits and assign the result to n1. The C pseudocode is: n1.cha >>= bc;
Step D2.15.5.2, set every bit of the bit sequence variable n2 to 1. The C pseudocode is: n2.cha = 0xFF;
Step D2.15.5.3, assign to bc the number of voting bits of the bit sequence minus bc. The C pseudocode is: bc = 8 - bc;
Step D2.15.5.4, shift n2 left by bc bits and assign the result to n2. The C pseudocode is: n2.cha <<= bc;
Step D2.15.5.5, bitwise-OR n1 and n2 and assign the result to n1. The C pseudocode is: n1.cha = n1.cha | n2.cha;
Step D2.15.6, point sty at the character of the parent string (bei->str) whose index equals the value of the variable i. The C pseudocode is: sty = (bei->str) + i;
Step D2.15.7, construct a fifth counting loop whose condition variable is jr; jr starts at bc and increases by 1 on each iteration, and the loop is no longer executed once jr is greater than or equal to hr. The C code is: for (int jr = bc; jr < hr; jr++) { };
The body of the fifth counting loop has the following flow:
Step D2.15.7.1, allocate a variable ca and assign it the element at index jr of the data string that starts at sty within the parent string. The C code is: unsigned char ca = sty[jr];
Step D2.15.7.2, obtain the value of the bit sequence unit whose first-dimension index is ca and whose second-dimension index is jr in the bit sequence table, and assign it to n2. The C code is: n2.cha = biao[ca][jr].cha;
Step D2.15.7.3, bitwise-AND n1 and n2 and assign the result to n1. The C code is: n1.cha &= n2.cha;
Step D2.15.8, judge whether the value of n1 is nonzero; if it is nonzero, go to step D2.15.9; otherwise go to step D2.15.10;
Step D2.15.9, this step comprises the following sub-steps:
Step D2.15.9.1, set every bit of n2 to 1. The C code is: n2.cha = 0xFF;
Step D2.15.9.2, shift n2 right, the shift count being the number of voting bits of the bit sequence minus 1, minus the value of hr, plus the length (xu->size) of the substring, and assign the result to n2. The C code is: n2.cha >>= (8 - 1 - hr + xu->size);
Step D2.15.9.3, bitwise-AND n2 and n1 and assign the result to n2. The C code is: n2.cha &= n1.cha;
Step D2.15.9.4, judge whether n2 is nonzero; if n2 is nonzero, go to D2.15.9.5; if n2 is zero, go to D2.15.10;
Step D2.15.9.5, a bit equal to 1 in the variable n2 marks a matching position; analyze n2 by the following rule: if the nth bit of n2 is 1, then position i + n of the parent string (bei->str) is a matching position of the substring;
Step D2.15.10, go to D2.16;
Step D2.16, end.
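The tail handling of steps D2.14 and D2.15.9 could be sketched as below, under the assumption that bit n of a mask stands for a candidate starting at offset i + n (least significant bit first). The helper names has_tail and tail_match_mask are hypothetical, and the .cha member of the original is replaced by a plain unsigned char.

```c
enum { VOTE_BITS = 8 };

/* D2.14: after the main loop, is there a shorter final window that can
   still hold the whole substring? */
int has_tail(int i, int parent_len, int sub_len)
{
    return i <= parent_len - sub_len && i >= parent_len - VOTE_BITS;
}

/* D2.15.9.1 to D2.15.9.3: keep only the candidates that fit inside the
   hr characters that remain (hr = parent_len - i, with sub_len <= hr <= 8). */
unsigned char tail_match_mask(unsigned char n1, int hr, int sub_len)
{
    return (unsigned char)(n1 & (0xFF >> (VOTE_BITS - 1 - hr + sub_len)));
}
```

For example, with hr = 5 remaining characters and a 3-character substring, only offsets 0, 1, and 2 can hold a complete match, so the mask keeps the low three bits.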
The code of this embodiment is given below. It expresses the data searching method of the invention clearly to those skilled in the art and is annotated in detail, so that even a novice with limited technical skill can copy and paste it to implement the embodiment; the disclosure of the invention is therefore clear.
[Code listing, reproduced as images in the original publication: Figures GDA0001852863210000111 to GDA0001852863210000191]
The above example shows one way of implementing the invention, and the content of the code above does not limit the scope of protection of the patent right. The code has been compiled and runs normally, but the applicant cannot guarantee that it will not be altered during storage and transmission. If the code fails to compile, the reader should repair it using common knowledge in the art; if it still cannot run after repair, the applicant may be contacted to obtain the code.

Claims (4)

1. A method for constructing a bit sequence table, characterized in that:
every bit sequence unit contained in the bit sequence table has the following characteristics: the bit sequence is used for storing votes on the matching of a data string, and the number of bits of the bit sequence used for voting is less than or equal to the number of bits of data that the computer to which the method is applied processes at one time; the bit sequence stores, for the element of the data string at each position, its right of support or right of veto over whether the data string as a whole matches the substring to be searched; the bit sequence as a whole corresponds to one data string, and each bit of the bit sequence individually corresponds to one element of the data string; a bit of 1 denotes support and a bit of 0 denotes veto; the bits of the bit sequence correspond to the elements of the data string in the same order, that is, the first bit of the bit sequence corresponds to the first element of the data string, the second bit to the second element, the third bit to the third element, and so on;
the method is used for obtaining, from a searched parent string (bei->str), the matching positions of a substring (xu->str) to be searched, wherein the length (bei->size) of the parent string (bei->str) is greater than or equal to the length (xu->size) of the substring (xu->str), the value range of the parent string elements is fixed and can be predicted before the parent string is loaded, and the value range of the substring elements is fixed and can be predicted before the substring is loaded;
the construction method comprises the following steps:
step C1, constructing a bit sequence table template: the C language pseudo code is expressed as: constructing a bit sequence table template ();
the method for constructing the bit sequence table template comprises the following steps:
step C1.1, allocating memory for a bit sequence table template (wei_zhuan jm[8]), wherein the bit sequence table template contains bit sequence units, the number of bit sequence units in the template equals the number of bits that the bit sequence uses for voting, and the indexes of the bit sequence units in the template are counted from zero;
step C1.2, assigning a value to each bit sequence unit in the bit sequence table template (wei_zhuan jm[8]); the assignment is performed as follows:
step C1.2.1, setting to 1 all voting bits of each bit sequence unit in the bit sequence table template (wei_zhuan jm[8]);
step C1.2.2, performing a shift operation on all voting bits of each bit sequence unit, the shift count being the index of that bit sequence unit in the bit sequence table template (wei_zhuan jm[8]) plus 1;
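By way of illustration only, steps C1.1 to C1.2.2 might look like the following sketch. The claim does not name the shift direction; a left shift is assumed here because it leaves set exactly the bits above position j, which fits the right-shift carry-over used later in the search flow. The unit type is simplified to unsigned char, and build_template and VOTE_BITS are hypothetical names.

```c
enum { VOTE_BITS = 8 };

/* C1.1 to C1.2.2: unit j starts as all ones (C1.2.1) and is then shifted
   by j + 1 (C1.2.2). The left shift direction is an assumption. */
void build_template(unsigned char jm[VOTE_BITS])
{
    for (int j = 0; j < VOTE_BITS; j++)
        jm[j] = (unsigned char)(0xFF << (j + 1));
}
```

Under this assumption jm[0] is 0xFE, jm[3] is 0xF0, and jm[7] is 0x00.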
step C2, loading the bit sequence table template (jm) into the first dimension (biao[0][]) of the bit sequence table, as follows:
step C2.1, copying the data of the bit sequence table template (jm) into the bit sequence table in order;
step C3, setting to 1 the bits without veto power in each bit sequence unit of the first dimension (biao[0][]) of the bit sequence table, specifically as follows:
step C3.1, allocating a variable, referred to as the ninth variable (n9), whose number of bits equals the number of bits of the bit sequence;
step C3.2, setting every bit of the ninth variable (n9) to 1;
step C3.3, shifting the ninth variable right, the shift count being the maximum index of the bit sequence units minus the index of the bit sequence unit currently processed, plus the maximum index (size) of the characters in the character string to be searched, wherein the first index of the characters in the character string is zero, that is, the character subscripts start from zero;
step C3.4, performing a bitwise OR of the value of the current bit sequence unit and the ninth variable (n9), and assigning the result to the current bit sequence unit;
step C4, copying each bit sequence unit of the first dimension (biao[0][]) of the bit sequence table, in the same order, to the other dimensions of the bit sequence table;
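A hedged sketch of steps C2 to C4 follows. It folds the template of step C1 into the same loop, assumes a left shift there, and takes size in step C3.3 to be the substring length as xu->size is used in the search flow; with that reading the low bits set by n9 are exactly the candidates that end before position j. All names are illustrative and units are plain unsigned char values.

```c
enum { VOTE_BITS = 8 };

/* C2 to C4: give every character the same default row, in which a bit is 1
   wherever window position j has no veto over that candidate offset. */
void fill_defaults(unsigned char table[256][VOTE_BITS], int sub_len)
{
    for (int j = 0; j < VOTE_BITS; j++) {
        /* C1, C2: candidates that start after position j */
        unsigned char unit = (unsigned char)(0xFF << (j + 1));
        /* C3.1 to C3.4: candidates that end before position j */
        unsigned char n9 = (unsigned char)(0xFF >> (VOTE_BITS - 1 - j + sub_len));
        unit |= n9;
        for (int c = 0; c < 256; c++)     /* C4: same row for every character */
            table[c][j] = unit;
    }
}
```

For a 3-character substring the unit at position 5 becomes 0xC7: offsets 6 and 7 start too late, and offsets 0 to 2 end too early, for position 5 to veto them.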
step C5, assigning voting information to each bit sequence unit in the bit sequence table, specifically as follows:
step C5.1, constructing a first counting loop having a first counting variable (j), wherein the initial value of the first counting variable is the number of bits that the bit sequence uses for voting, the first counting variable (j) decreases by 1 on each iteration of the first counting loop, and the loop stops when the first counting variable (j) is less than zero; each iteration of the first counting loop comprises the following operations:
step C5.1.1, allocating a kr variable (kr);
step C5.1.2, assigning the value of the first counting variable (j) to the kr variable;
step C5.1.3, if the first counting variable (j) is greater than the maximum index (size) of the characters in the character string to be searched, assigning the maximum index (size) of the characters in the character string to be searched to the kr variable;
step C5.1.4, constructing a second counting loop having a second counting variable (k), wherein the initial value of the second counting variable (k) is the value of the kr variable, the second counting variable (k) decreases by 1 on each iteration of the second counting loop, and the loop stops when the second counting variable (k) is less than zero; each iteration of the second counting loop comprises the following operations:
step C5.1.4.1, allocating a variable, referred to as the tenth variable (n10), whose number of bits equals the number of bits of the bit sequence;
step C5.1.4.2, assigning the value 1 to the tenth variable (n10);
step C5.1.4.3, shifting the tenth variable (n10) left, the shift count being the first counting variable (j) minus the second counting variable (k), and assigning the result to the tenth variable (n10);
step C5.1.4.4, allocating a ca variable;
step C5.1.4.5, obtaining the character (xu->str[k]) whose index is the second counting variable (k) in the character string (xu->str) to be searched, and assigning the obtained character (xu->str[k]) to the ca variable;
step C5.1.4.6, performing an assignment on the unit (biao[ca][j]) of the bit sequence table whose first-dimension index equals the value of the ca variable and whose second-dimension index equals the value of the first counting variable (j), the assignment being performed by bitwise-ORing the value of the unit with the tenth variable (n10) and then assigning the result to the unit;
step C6, end.
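The voting assignment of step C5 can be sketched as follows. Two assumptions are made: j starts at the highest valid unit index (7) so that it stays inside the 8-unit row, and k is capped at the last character index of the substring. The function name cast_votes is hypothetical and units are plain unsigned char values.

```c
enum { VOTE_BITS = 8 };

/* C5: a character equal to sub[k] seen at window position j supports the
   candidate that starts at offset j - k, so bit (j - k) is set. */
void cast_votes(unsigned char table[256][VOTE_BITS],
                const unsigned char *sub, int sub_len)
{
    for (int j = VOTE_BITS - 1; j >= 0; j--) {           /* C5.1 */
        int kr = j;                                      /* C5.1.1, C5.1.2 */
        if (kr > sub_len - 1) kr = sub_len - 1;          /* C5.1.3 */
        for (int k = kr; k >= 0; k--) {                  /* C5.1.4 */
            unsigned char n10 = (unsigned char)(1u << (j - k)); /* C5.1.4.1-.3 */
            unsigned char ca = sub[k];                   /* C5.1.4.4, C5.1.4.5 */
            table[ca][j] |= n10;                         /* C5.1.4.6 */
        }
    }
}
```

Applied to a zeroed table with the substring "ab", the unit for 'a' at position 3 receives bit 3 and the unit for 'b' at position 3 receives bit 2, since a 'b' at position 3 supports a match that starts at offset 2.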
2. A data searching method, characterized in that: searching is implemented by constructing a bit sequence table using the method for constructing a bit sequence table according to claim 1.
3. A public opinion monitoring system, characterized in that: searching is implemented by constructing a bit sequence table using the method for constructing a bit sequence table according to claim 1.
4. An information transmission system, characterized in that: searching is implemented by constructing a bit sequence table using the method for constructing a bit sequence table according to claim 1.
CN201811112028.9A 2018-09-21 2018-09-21 Method for constructing bit sequence table, data searching method, public opinion monitoring system and information transmission system Active CN109376292B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811112028.9A CN109376292B (en) 2018-09-21 2018-09-21 Method for constructing bit sequence table, data searching method, public opinion monitoring system and information transmission system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811112028.9A CN109376292B (en) 2018-09-21 2018-09-21 Method for constructing bit sequence table, data searching method, public opinion monitoring system and information transmission system

Publications (2)

Publication Number Publication Date
CN109376292A (en) 2019-02-22
CN109376292B (en) 2021-11-02

Family

ID=65401612

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811112028.9A Active CN109376292B (en) 2018-09-21 2018-09-21 Method for constructing bit sequence table, data searching method, public opinion monitoring system and information transmission system

Country Status (1)

Country Link
CN (1) CN109376292B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109213809A (en) * 2018-09-26 2019-01-15 长沙学院 Data search method, information dissemination system, public sentiment management system, artificial intelligence system

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101377806A (en) * 2008-07-24 2009-03-04 江苏大学 Information flow analysis method based on system source code searching concealed channel
CN102163221A (en) * 2011-04-02 2011-08-24 华为技术有限公司 Pattern matching method and device thereof

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7974962B2 (en) * 2005-01-06 2011-07-05 Aptiv Digital, Inc. Search engine for a video recorder
US10430485B2 (en) * 2016-05-10 2019-10-01 Go Daddy Operating Company, LLC Verifying character sets in domain name requests

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
"基于有序二叉树的快速多模式字符串匹配算法";周燕等;《计算机工程》;20100905;全文 *

Also Published As

Publication number Publication date
CN109376292A (en) 2019-02-22

Similar Documents

Publication Publication Date Title
KR102376117B1 (en) Parallel decision tree processor architecture
CN109284424B (en) Method for constructing sliding condition table
AU2014315619B2 (en) Methods and systems of four-valued simulation
CN1950819B (en) A system and method for retrieving information and a system and method for storing information
CN109271507B (en) Substring information processing method, computer data management system, public opinion analysis system and social network analysis system
US20150262063A1 (en) Decision tree processors
CN110096264A (en) A kind of code operation method and device
CN109376292B (en) Method for constructing bit sequence table, data searching method, public opinion monitoring system and information transmission system
CN109376281B (en) Bit sequence, data searching method, searching system, social network analysis system and public opinion monitoring system
CN113536308B (en) Binary code tracing method for multi-granularity information fusion under software gene view angle
CN117493622B (en) Method and device for inquiring character strings based on field programmable array device
CN103270512A (en) Intelligent architecture creator
CN104885060B (en) Data leakage updates and checks that the leakage of device, data updates inspection method
CN115456421A (en) Work order dispatching method and device, processor and electronic equipment
CN109213808A (en) Searching method, internet information library, the analysis of public opinion system based on search
CN115378824B (en) Model similarity determination method, device, equipment and storage medium
Borsotti et al. A closer look at TDFA
CN116841622B (en) Address self-increasing memory instruction generation method, device, equipment and medium
CN109376279A (en) Construct method, data search system, computer information processing system, the artificial intelligence system of search result storage container
CN117473494A (en) Method and device for determining homologous binary files, electronic equipment and storage medium
CN111859860A (en) Data analysis method and device, storage medium, and electronic device
Anagnostopoulos Some Basic Computer-Science Concepts for Data Scientists
CN109241115A (en) Construct method, the searching method, computer public sentiment monitoring system, artificial intelligence system of match condition table
CN109344301A (en) Method, computer data processing system, the information management system of construction ballot mark table
CN113821211A (en) Command analysis method and device, storage medium and computer equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant