CN101159795A - Calling list rearrangement method and device - Google Patents

Publication number
CN101159795A
Authority
CN
China
Prior art keywords
file
feature string
feature
string
ticket writing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CNA200710176358XA
Other languages
Chinese (zh)
Other versions
CN100571317C (en)
Inventor
华国栋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
ZTE Corp
Original Assignee
ZTE Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ZTE Corp filed Critical ZTE Corp
Priority to CNB200710176358XA priority Critical patent/CN100571317C/en
Publication of CN101159795A publication Critical patent/CN101159795A/en
Application granted granted Critical
Publication of CN100571317C publication Critical patent/CN100571317C/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method and a device for deduplicating call detail records (CDRs). Key information is extracted from each CDR record to generate a feature string, and the feature strings are stored, according to a defined rule, in different files under different directories. When a CDR record is processed, finding an identical feature string in the corresponding feature string file means the record is a duplicate; otherwise the record is written to the CDR file and its feature string is added to the corresponding feature string file. Because deduplication is performed directly on the file system, neither a dedicated database system nor a large amount of memory is required; instead, the feature strings generated from the extracted key information are dispersed across many small files. This allows fast lookup with high reliability, low investment and cost, simple implementation, and easy porting, so that large volumes of CDR data spanning a long time range can be deduplicated efficiently.

Description

A call detail record (CDR) deduplication method and device
Technical field
The present invention relates to call detail record (CDR) processing technology, and in particular to a CDR deduplication method and device.
Background art
When providing data services to users, a telecommunication service operator must accurately record information about each user's service usage. This information is usually kept in the form of CDRs and serves, for example, as the basis for charging. The accuracy of the CDR file directly affects user satisfaction with the service and is therefore one of the operator's chief concerns.
A major cause of inaccurate CDR files is duplicated CDR records, which may originate in any processing stage, such as the service enabler, the management system, or the charging system. Before the CDR file used for charging is formally generated, the CDR records must be deduplicated so that two identical records never appear in the call details on which the user is charged.
To deduplicate, each new CDR record can be searched for in all CDR files that have already been processed to determine whether an identical record exists. If one exists, the current record is a duplicate and is discarded; if not, the current record is written to the processed CDR file. This approach is very inefficient and cannot deduplicate large numbers of CDR records, so two other approaches are mainly used at present to improve efficiency.
The first approach uses a database: a unique index is created on the key fields of the CDR records, so that inserting a duplicate record violates the database's uniqueness constraint and the insertion fails. This approach depends on a high-performance database system, which requires purchasing dedicated relational database software at very high cost.
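The unique-constraint mechanism described for the database approach can be illustrated with a minimal sketch. It uses Python's built-in `sqlite3` module purely for illustration; the source refers to a high-performance relational database, not SQLite, and the table name, column names, and sample values below are hypothetical.

```python
import sqlite3

# In-memory database, used only to demonstrate the unique-index behaviour.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cdr (session_id TEXT, src TEXT, dst TEXT, start_time TEXT)"
)
# Unique index over the key fields (hypothetical column names).
conn.execute(
    "CREATE UNIQUE INDEX cdr_key ON cdr (session_id, src, dst, start_time)"
)


def insert_cdr(record):
    """Insert one record; return False if the unique index rejects a duplicate."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO cdr VALUES (?, ?, ?, ?)", record)
        return True
    except sqlite3.IntegrityError:
        return False
```

Inserting the same tuple twice returns `True` the first time and `False` the second, which is the insertion failure the text relies on to detect duplicates.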
The second approach uses memory and relies on the chronological ordering of CDR files: the CDR files for nearby times are loaded into memory, and the current CDR record is compared with the loaded files. If the time is close and the information differs, the current record is not a duplicate and is written to the processed CDR file. If the time is close and the information matches, the current record is a duplicate and is discarded. If the time does not match, the CDR file for the time period of the current record is not in memory, so page swapping is used to load the CDR files for that period before deduplication proceeds. In-memory deduplication performs very well, but because memory capacity is limited, deduplicating CDRs that come from multiple systems and are scattered in time causes frequent swapping of file pages, which increases system load and lowers performance. In addition, in-memory deduplication easily loses information when the machine crashes, loses power, or the program fails.
Summary of the invention
In view of this, the main purpose of the present invention is to provide a CDR deduplication method and device that deduplicate CDR records quickly and efficiently.
To achieve the above purpose, the technical solution of the present invention is implemented as follows:
A CDR deduplication method comprising the following steps:
A. extracting the key information of the current CDR record and generating a feature string;
B. computing, from the feature string according to a rule, the location of a feature string file in the feature string directory;
C. determining whether the feature string of the current CDR record is found in that feature string file; if it is found, discarding the current CDR record; if it is not found, writing the current CDR record to the CDR file.
The key information is: the session number, the source MSISDN (Mobile Station International ISDN Number), the destination MSISDN, the time of occurrence, the service identifier, or any combination of these.
Step A comprises: concatenating the key information in order to form the feature string.
Step A further comprises: computing a hash value string of the key information with a hash algorithm that has a low collision rate, and using the resulting hash value string as the feature string.
Step B comprises: determining the hash value of the feature string of the current CDR record, and taking this hash value modulo a preset integer to obtain the feature string file name.
Step C comprises:
C1. determining whether a feature string file with that name exists in the feature string directory; if it exists, proceeding to step C2; if it does not exist, proceeding to step C3;
C2. determining whether the feature string of the current CDR record is found in the feature string file; if it is found, discarding the record and ending the current flow; if it is not found, writing the current CDR record to the CDR file and adding its feature string to the feature string file;
C3. writing the current CDR record to the CDR file, and creating in the feature string directory a feature string file with that name.
After the feature string of the current CDR record is added in step C2, the method further comprises: appending a newline after the feature string, so that the feature strings in the feature string file are stored one per line.
The method further comprises: selecting a field of the feature string that takes a limited set of values as an index field, using the index field as the name of a subdirectory of the feature string directory, and distributing the feature string files across the different subdirectories.
A CDR deduplication device comprising an extraction unit, a generation unit, a positioning unit, a search unit, and a processing unit, wherein: the extraction unit is used to extract the key information of the current CDR record; the generation unit is used to generate the feature string of the current CDR record from the key information; the positioning unit is used to compute, from the feature string according to a rule, the location of the feature string file in the feature string directory; the search unit is used to determine whether the feature string of the current CDR record is found in the feature string file, and to provide the search result to the processing unit; and the processing unit is used to write the current CDR record to the CDR file when the search unit does not find the feature string of the current CDR record in the feature string file.
The processing unit is further used to add the feature string of the current CDR record to the feature string file, or to create the feature string file in the feature string directory.
The present invention is a file-system-based CDR deduplication scheme: key information is extracted from each CDR record to generate a feature string, which is stored according to a defined rule in different files under different directories. When a CDR record is processed, finding the same feature string in the corresponding feature string file indicates a duplicate; otherwise the record is written to the CDR file and its feature string is added to the corresponding feature string file. Deduplicating directly on the file system requires neither a dedicated database system nor a large amount of memory. Instead, the feature strings generated from the key information of the CDR records are spread evenly across many small files, which allows fast lookup. The scheme is reliable, requires little investment, costs little, is easy to implement and port, and can efficiently deduplicate massive amounts of CDR data over a long time span.
Description of drawings
Fig. 1 is a flow chart of CDR deduplication in the present invention;
Fig. 2 is a schematic diagram of the basic feature string directory structure in the present invention;
Fig. 3 is a schematic diagram of the optimized feature string directory structure in the present invention;
Fig. 4A is a schematic diagram of the CDR deduplication system in the present invention;
Fig. 4B is a schematic structural diagram of the CDR deduplication device in the present invention.
Embodiment
In the present invention, the key information of the current CDR record is extracted to generate a feature string, and the location of a feature string file in the feature string directory is computed from this feature string according to a rule. It is then determined whether the feature string of the current CDR record is found in that feature string file. If it is found, the current CDR record is a duplicate and is discarded; if it is not found, the current CDR record is written to the CDR file.
Fig. 1 is a flow chart of CDR deduplication in the present invention. As shown in Fig. 1, the deduplication procedure comprises the following steps:
Steps 101 to 102: input a CDR record, extract the key information of the current CDR record, and generate the feature string.
Key information means the fields of a CDR record that can be used for deduplication, for example the session number, the source MSISDN, the destination MSISDN, the time of occurrence, the service identifier, or any combination of these fields. The key information can be concatenated in order to form a feature string that uniquely identifies a CDR record. The kind and number of key information fields may differ with the type of service. If the concatenated feature string is long, a hash algorithm with a low collision rate can be used to compute a hash value string of the key information in order to improve lookup efficiency, and the resulting hash value string is used as the feature string that is finally stored.
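Feature string generation as described in steps 101 to 102 might look like the following sketch. The field names, the `|` separator, and the choice of SHA-256 as the low-collision hash are all assumptions made for illustration; the source only says that the key fields are joined in order and may optionally be hashed.

```python
import hashlib

# Hypothetical field names; the source lists the kinds of key information
# but does not name the fields.
KEY_FIELDS = ("session_id", "src_number", "dst_number", "start_time", "service_id")


def make_feature_string(record, key_fields=KEY_FIELDS, use_hash=False):
    """Join the key fields in a fixed order; optionally replace the result
    with a fixed-length hash string when the joined string would be long."""
    joined = "|".join(str(record[name]) for name in key_fields)  # separator is assumed
    if use_hash:
        # SHA-256 stands in for "a hash algorithm with a low collision rate".
        return hashlib.sha256(joined.encode("utf-8")).hexdigest()
    return joined
```

With `use_hash=True` every feature string has the same length (64 hexadecimal characters here), which keeps the stored lines short regardless of how many key fields a service type uses.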
Steps 103 to 104: determine the hash value A of the feature string of the current CDR record, and take A modulo an integer N to obtain an integer M. M is the name of the corresponding feature string file in the feature string directory and is used to locate that file. N is a preset integer that represents the number of feature string files the feature string directory is expected to contain. The hash value A is obtained by taking a character string of arbitrary length as input and transforming it by computation into an integer output; this integer output is the integer hash value.
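The mapping from a feature string to a file name in steps 103 to 104 can be sketched as follows. CRC-32 from Python's `zlib` module is used only as one example of a string-to-integer hash, and the value of N is an arbitrary assumption; the source prescribes neither.

```python
import zlib

N = 1024  # assumed number of feature string files; the source leaves N to the implementer


def feature_file_name(feature_string, n=N):
    """Return M = A mod N as a string, where A is an integer hash of the feature string."""
    a = zlib.crc32(feature_string.encode("utf-8"))  # hash value A (unsigned 32-bit integer)
    m = a % n                                       # integer M in the range 0..n-1
    return str(m)
```

A stable hash is needed here: the same feature string must map to the same file in every run, which is why Python's built-in `hash()` (randomized per process for strings) would not be suitable.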
Step 105: determine whether feature string file M exists in the feature string directory. If it exists, proceed to step 106; if it does not exist, the current CDR record is not a duplicate, so proceed to step 107.
Step 106: determine whether the feature string generated in steps 101 to 102 is found in feature string file M. If it is found, the current CDR record is a duplicate; discard it and end the current flow. If it is not found, the current CDR record is not a duplicate; write it to the CDR file, add the feature string generated in steps 101 to 102 to feature string file M, and end the current flow. To improve lookup efficiency, the feature strings in a feature string file can be stored one per line with a newline appended after each; that is, a newline is appended after the feature string added to feature string file M so that the feature string occupies its own line.
Step 107: write the current CDR record to the CDR file, create feature string file M in the feature string directory, and end the current flow.
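Steps 103 to 107 can be combined into one minimal sketch. The function and parameter names are hypothetical, CRC-32 again stands in for the unspecified hash, and the sketch assumes that the feature string contains no newline. It also assumes that the file created in step 107 receives the current feature string, which the text implies but does not spell out.

```python
import os
import zlib


def process_record(record_line, feature_string, feature_dir, cdr_path, n=1024):
    """Return True if the record was written, False if it was a duplicate."""
    # Steps 103-104: hash the feature string and take it modulo N.
    m = zlib.crc32(feature_string.encode("utf-8")) % n
    feature_path = os.path.join(feature_dir, str(m))

    # Steps 105-106: if file M exists, scan it line by line for the feature string.
    if os.path.exists(feature_path):
        with open(feature_path, "r", encoding="utf-8") as f:
            if any(line.rstrip("\n") == feature_string for line in f):
                return False  # duplicate: discard the record

    # Steps 106-107: write the record, then append the feature string on its
    # own line; opening in append mode creates file M if it does not exist.
    os.makedirs(feature_dir, exist_ok=True)
    with open(cdr_path, "a", encoding="utf-8") as out:
        out.write(record_line + "\n")
    with open(feature_path, "a", encoding="utf-8") as f:
        f.write(feature_string + "\n")
    return True
```

The sketch ignores concurrency and crash consistency (for example, a failure between the two writes); a production implementation would need to address both.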
The procedure described in steps 103 to 107 includes the gradual creation of the feature string files in the feature string directory. Initially, the feature string generated from the key information of a CDR record cannot be found in any feature string file of the directory, so a new feature string file is created; later, the generated feature string is added to the corresponding existing file. As more and more CDR records are deduplicated, the directory contains more feature string files, and those files hold more feature strings.
In steps 103 to 106, computing the hash value A of the feature string of the current CDR record spreads the feature strings evenly across the N feature string files in the feature string directory; the index structure of the basic feature string directory is shown in Fig. 2. If Q CDR records have been deduplicated into the CDR file, each feature string file holds on average I = Q/N feature strings. Because I grows linearly with Q, a large Q means a large I, and the time needed to look up a feature string increases accordingly. To improve lookup efficiency, a field of the feature string that takes a limited set of values can be selected as an index field and used as the name of a subdirectory of the feature string directory, so that the feature string files are distributed across different subdirectories. This reduces the number of files in any single directory level, reduces file size, and improves feature string lookup efficiency; the index structure of the optimized feature string directory is shown in Fig. 3. If the index field has K possible values, the number of feature string files in the feature string directory increases to K*N, and each feature string file holds on average I = Q/(K*N) feature strings, which effectively improves lookup efficiency.
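The effect of adding an index-field subdirectory can be sketched as below. The example index value (a `YYYYMM` month), the helper names, and the use of CRC-32 are assumptions for illustration; the averages are simply Q/N without subdirectories and Q/(K*N) with K subdirectories.

```python
import os
import zlib


def feature_file_path(root, feature_string, index_value, n=1024):
    """Locate the feature string file under a subdirectory named after the index field."""
    m = zlib.crc32(feature_string.encode("utf-8")) % n
    return os.path.join(root, index_value, str(m))


def average_entries_per_file(q, n, k=1):
    """Average number of feature strings per file: Q/N when k == 1, Q/(K*N) otherwise."""
    return q / (k * n)
```

For instance, with Q = 100,000,000 records and N = 1,000 files, each file holds 100,000 feature strings on average; adding an index field with K = 100 values lowers that to 1,000 per file.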
Furthermore, more than one index field can be chosen for subdirectory names. Using multiple levels of subdirectories further reduces the number of feature strings in each feature string file and improves lookup efficiency, provided that each index field takes a limited set of values that are evenly distributed. However, too many subdirectory levels may produce so many feature string files that locating a file takes longer, which lowers file lookup efficiency instead. The number of subdirectory levels must therefore be weighed against actual conditions in an implementation, and usually does not exceed three.
When an index field is used as a subdirectory name, deduplication first determines the index field from the feature string of the current CDR record, then determines the hash value of the index field, and then determines the integer that locates the file within the subdirectory. Processing then proceeds according to whether the corresponding subdirectory exists in the feature string directory and whether the feature string exists in the file within that subdirectory.
In addition, if the chosen index field has a time attribute, such as a time or a month, the fact that CDRs are relatively concentrated in generation time can be exploited: feature string files older than a certain retention period are cleaned up periodically, which keeps feature string lookup efficiency stable.
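Periodic cleanup of time-indexed subdirectories could be sketched as follows, assuming the subdirectories are named by month in `YYYYMM` form. That naming scheme and the function name are hypothetical; the source only states that files older than a certain period are cleaned up.

```python
import os
import shutil


def purge_expired(root, oldest_kept):
    """Delete month subdirectories (named YYYYMM) that sort before oldest_kept.

    Returns the names of the removed subdirectories in ascending order.
    """
    removed = []
    for name in sorted(os.listdir(root)):
        path = os.path.join(root, name)
        is_month_dir = os.path.isdir(path) and len(name) == 6 and name.isdigit()
        # Zero-padded YYYYMM names compare correctly as plain strings.
        if is_month_dir and name < oldest_kept:
            shutil.rmtree(path)
            removed.append(name)
    return removed
```

Because a whole subdirectory is dropped at once, cleanup cost does not depend on how many feature strings the expired files contain.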
The typical case that Fig. 4 A has described a system provided by the invention forms, wherein ticket row refitting is put the ticket writing of a plurality of tickets source input is arranged heavily processing, obtain process at last and arrange the CDR file of heavily processing and the ticket writing of repetition, the ticket writing of repetition can directly abandon.
Fig. 4 B is ticket row refitting interposed structure schematic diagram among the present invention, shown in Fig. 4 B, ticket row refitting is put and is comprised extraction unit, generation unit, positioning unit, search unit and processing unit, wherein, extraction unit is used to extract the key message of current ticket writing, and provides this key message to generation unit; Generation unit is used for generating according to key message the feature string of current ticket writing, and provides this feature string to positioning unit; Positioning unit is used for the feature string file by the regular compute location feature tandem list file of this feature string, provides this feature string file to search unit; Search unit is used for determining whether to find at the feature string file feature string of current ticket writing, and provides lookup result to processing unit; Processing unit is used for search unit not when the feature string file finds the feature string of current ticket writing, and current ticket writing is write in the CDR file.
Processing unit is further used for adding the feature string of current ticket writing in the feature string file, or is creating the feature string file in feature tandem list file.
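One way to picture the five units of the device is as methods of a single class, each handing its result to the next. This is only an illustrative sketch: the class name, method names, `|` separator, and CRC-32 hash are assumptions, and the source does not say that the units are implemented in software of this shape.

```python
import os
import zlib


class DedupDevice:
    def __init__(self, feature_dir, cdr_path, key_fields, n=1024):
        self.feature_dir = feature_dir
        self.cdr_path = cdr_path
        self.key_fields = key_fields
        self.n = n

    def extract(self, record):        # extraction unit
        return [str(record[name]) for name in self.key_fields]

    def generate(self, key_info):     # generation unit
        return "|".join(key_info)

    def locate(self, feature):        # positioning unit
        m = zlib.crc32(feature.encode("utf-8")) % self.n
        return os.path.join(self.feature_dir, str(m))

    def search(self, path, feature):  # search unit
        if not os.path.exists(path):
            return False
        with open(path, "r", encoding="utf-8") as f:
            return any(line.rstrip("\n") == feature for line in f)

    def handle(self, record, record_line):  # processing unit
        feature = self.generate(self.extract(record))
        path = self.locate(feature)
        if self.search(path, feature):
            return False  # duplicate: discard
        os.makedirs(self.feature_dir, exist_ok=True)
        with open(self.cdr_path, "a", encoding="utf-8") as out:
            out.write(record_line + "\n")
        with open(path, "a", encoding="utf-8") as f:  # creates the file if missing
            f.write(feature + "\n")
        return True
```

Calling `handle` twice with the same record returns `True` and then `False`, mirroring the write-or-discard decision made by the processing unit.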
The above are only preferred embodiments of the present invention and are not intended to limit its scope of protection.

Claims (10)

1. A CDR deduplication method, characterized in that the method comprises the following steps:
A. extracting the key information of the current CDR record and generating a feature string;
B. computing, from the feature string according to a rule, the location of a feature string file in the feature string directory;
C. determining whether the feature string of the current CDR record is found in that feature string file; if it is found, discarding the current CDR record; if it is not found, writing the current CDR record to the CDR file.
2. The method according to claim 1, characterized in that the key information is: the session number, the source MSISDN, the destination MSISDN, the time of occurrence, the service identifier, or any combination of these.
3. The method according to claim 2, characterized in that step A comprises: concatenating the key information in order to form the feature string.
4. The method according to claim 3, characterized in that step A further comprises: computing a hash value string of the key information with a hash algorithm that has a low collision rate, and using the resulting hash value string as the feature string.
5. The method according to claim 1, characterized in that step B comprises: determining the hash value of the feature string of the current CDR record, and taking this hash value modulo a preset integer to obtain the feature string file name.
6. The method according to claim 5, characterized in that step C comprises:
C1. determining whether a feature string file with that name exists in the feature string directory; if it exists, proceeding to step C2; if it does not exist, proceeding to step C3;
C2. determining whether the feature string of the current CDR record is found in the feature string file; if it is found, discarding the record and ending the current flow; if it is not found, writing the current CDR record to the CDR file and adding its feature string to the feature string file;
C3. writing the current CDR record to the CDR file, and creating in the feature string directory a feature string file with that name.
7. The method according to claim 6, characterized in that, after the feature string of the current CDR record is added in step C2, the method further comprises: appending a newline after the feature string, so that the feature strings in the feature string file are stored one per line.
8. The method according to claim 1, characterized in that the method further comprises: selecting a field of the feature string that takes a limited set of values as an index field, using the index field as the name of a subdirectory of the feature string directory, and distributing the feature string files across the different subdirectories.
9. A CDR deduplication device, characterized in that the device comprises an extraction unit, a generation unit, a positioning unit, a search unit, and a processing unit, wherein:
the extraction unit is used to extract the key information of the current CDR record;
the generation unit is used to generate the feature string of the current CDR record from the key information;
the positioning unit is used to compute, from the feature string according to a rule, the location of the feature string file in the feature string directory;
the search unit is used to determine whether the feature string of the current CDR record is found in the feature string file, and to provide the search result to the processing unit;
the processing unit is used to write the current CDR record to the CDR file when the search unit does not find the feature string of the current CDR record in the feature string file.
10. The device according to claim 9, characterized in that the processing unit is further used to add the feature string of the current CDR record to the feature string file, or to create the feature string file in the feature string directory.
CNB200710176358XA 2007-10-25 2007-10-25 A kind of calling list rearrangement method and device Expired - Fee Related CN100571317C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CNB200710176358XA CN100571317C (en) 2007-10-25 2007-10-25 A kind of calling list rearrangement method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CNB200710176358XA CN100571317C (en) 2007-10-25 2007-10-25 A kind of calling list rearrangement method and device

Publications (2)

Publication Number Publication Date
CN101159795A true CN101159795A (en) 2008-04-09
CN100571317C CN100571317C (en) 2009-12-16

Family

ID=39307708

Family Applications (1)

Application Number Title Priority Date Filing Date
CNB200710176358XA Expired - Fee Related CN100571317C (en) 2007-10-25 2007-10-25 A kind of calling list rearrangement method and device

Country Status (1)

Country Link
CN (1) CN100571317C (en)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2010096981A1 (en) * 2009-02-25 2010-09-02 中兴通讯股份有限公司 Monthly rent bill repetition preventing method and apparatus
CN102541918A (en) * 2010-12-30 2012-07-04 阿里巴巴集团控股有限公司 Method and equipment for identifying repeated information
CN101609466B (en) * 2009-07-01 2012-11-28 中兴通讯股份有限公司 Method for duplicate checking of mass data and system thereof
CN103019696A (en) * 2012-11-22 2013-04-03 广东欧珀移动通信有限公司 System and method of quickly setting up memo by using mobile terminal desktop and mobile terminal
CN104793997A (en) * 2014-01-17 2015-07-22 华为技术有限公司 Data processing device and method
CN106488055A (en) * 2015-08-28 2017-03-08 华为软件技术有限公司 Calling list rearrangement method, back end equipment and routing node device
CN107203544A (en) * 2016-03-17 2017-09-26 阿里巴巴集团控股有限公司 A kind of method and device for business processing
CN112069510A (en) * 2020-07-24 2020-12-11 北京思特奇信息技术股份有限公司 Data encryption and de-duplication method
CN114915927A (en) * 2021-02-09 2022-08-16 中国联合网络通信集团有限公司 Data processing method, device and equipment

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107357862B (en) * 2017-06-30 2020-03-13 中国联合网络通信集团有限公司 Method and device for arranging repeated voice messages

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101494695B (en) * 2009-02-25 2013-02-27 中兴通讯股份有限公司 Method and apparatus for preventing month lease call ticket from repeating
WO2010096981A1 (en) * 2009-02-25 2010-09-02 中兴通讯股份有限公司 Monthly rent bill repetition preventing method and apparatus
CN101609466B (en) * 2009-07-01 2012-11-28 中兴通讯股份有限公司 Method for duplicate checking of mass data and system thereof
CN102541918A (en) * 2010-12-30 2012-07-04 阿里巴巴集团控股有限公司 Method and equipment for identifying repeated information
CN103019696A (en) * 2012-11-22 2013-04-03 广东欧珀移动通信有限公司 System and method of quickly setting up memo by using mobile terminal desktop and mobile terminal
CN104793997B (en) * 2014-01-17 2018-06-26 华为技术有限公司 A kind of data processing equipment and method
CN104793997A (en) * 2014-01-17 2015-07-22 华为技术有限公司 Data processing device and method
CN106488055A (en) * 2015-08-28 2017-03-08 华为软件技术有限公司 Calling list rearrangement method, back end equipment and routing node device
CN106488055B (en) * 2015-08-28 2019-10-22 华为软件技术有限公司 Calling list rearrangement method, back end equipment and routing node device
CN107203544A (en) * 2016-03-17 2017-09-26 阿里巴巴集团控股有限公司 A kind of method and device for business processing
CN112069510A (en) * 2020-07-24 2020-12-11 北京思特奇信息技术股份有限公司 Data encryption and de-duplication method
CN112069510B (en) * 2020-07-24 2024-01-30 北京思特奇信息技术股份有限公司 Data encryption and duplication elimination method
CN114915927A (en) * 2021-02-09 2022-08-16 中国联合网络通信集团有限公司 Data processing method, device and equipment
CN114915927B (en) * 2021-02-09 2023-10-31 中国联合网络通信集团有限公司 Data processing method, device and equipment

Also Published As

Publication number Publication date
CN100571317C (en) 2009-12-16

Similar Documents

Publication Publication Date Title
CN100571317C (en) A kind of calling list rearrangement method and device
CN101442731B (en) Method and apparatus for removing call ticket repeat
CN102906751B (en) A kind of method of data storage, data query and device
CN109471905B (en) Block chain indexing method supporting time range and attribute range compound query
CN1316397C (en) System and method of indexing unique electronic mail messages and uses for same
US7805439B2 (en) Method and apparatus for selecting data records from versioned data
EP2317785B1 (en) Address list system and implementation method thereof
CN102833298A (en) Distributed repeated data deleting system and processing method thereof
CN104217023B (en) It is a kind of to solve the method for map tile storage using packaging technique
CN103733195A (en) Managing storage of data for range-based searching
CN102375853A (en) Distributed database system, method for building index therein and query method
CN104462141A (en) Data storage and query method and system and storage engine device
CN106874348A (en) File is stored and the method for indexing means, device and reading file
CN103383690A (en) Distributed data storage method and system
CN105447166A (en) Keyword based information search method and system
CN104486777A (en) Method and device for processing data
CN103198150A (en) Big data indexing method and system
US11868328B2 (en) Multi-record index structure for key-value stores
CN104951464A (en) Data storage method and system
CN109597574A (en) Distributed data storage method, server and readable storage medium storing program for executing
CN106503054A (en) A kind of data query method and server
CN101980190A (en) Method and device for quickly putting service data into base
CN110532284B (en) Mass data storage and retrieval method and device, computer equipment and storage medium
CN103034649B (en) Method and system for realizing data storage and search
CN103955519A (en) Account inquiring and recording system and inquiring and recording method thereof

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20091216

Termination date: 20161025