WO2022105666A1 - Duplicate reservation identification method and apparatus - Google Patents

Duplicate reservation identification method and apparatus

Info

Publication number
WO2022105666A1
Authority
WO
WIPO (PCT)
Prior art keywords
passenger
pnr
repeated
flight
same
Prior art date
Application number
PCT/CN2021/130027
Other languages
French (fr)
Chinese (zh)
Inventor
曾进进
余真真
林彤
郜美华
高宁宁
王晓逸
王汉博
付英茂
韩楠
郭鑫
Original Assignee
中国民航信息网络股份有限公司
Application filed by 中国民航信息网络股份有限公司

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/24552 Database cache management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/02 Reservations, e.g. for tickets, services or events
    • G06Q10/025 Coordination of plural reservations, e.g. plural trip segments, transportation combined with accommodation

Definitions

  • the present invention relates to the technical field of airline ticket reservation, and more particularly, to a method and device for recognizing repeated seat reservations.
  • the present invention discloses a method and device for recognizing repeated reservations, so as to realize the recognition of repeated reservations.
  • a method for identifying duplicate reservations comprising:
  • the PNR data includes: PNR number;
  • the PNR number is placed in the cache database, and is stored in the form of a corresponding relationship with the passenger ID, and the PNR data corresponding to the PNR number is stored in the cache database simultaneously;
  • a device for identifying repeated reservations comprising:
  • an acquisition unit configured to acquire PNR data, the PNR data includes: a PNR number;
  • a parsing unit for parsing the PNR data, and extracting passenger identity information and flight information from the PNR data
  • a search unit configured to find the passenger ID corresponding to the passenger identity information from the cache database, where the passenger ID is generated in the cache database when the passenger makes a reservation for the first time;
  • the first storage unit is used to put the PNR number in the cache database, store it in correspondence with the passenger ID, and simultaneously store the PNR data corresponding to the PNR number in the cache database;
  • an extraction unit for extracting all valid target PNR data from the cache database based on the passenger ID and the associated PNR number
  • the duplicate order judgment unit is used to compare all the target PNR data for duplicate orders according to the preset duplicate reservation identification rules, and judge whether the passenger has duplicate orders;
  • the second storage unit is configured to save the duplicate orders in the main database when the duplicate order judgment unit determines that the passenger has duplicate orders.
  • the present invention discloses a method and a device for identifying duplicate reservations. The acquired PNR data is parsed, the passenger identity information and flight information are extracted from it, and the passenger ID corresponding to the passenger identity information is looked up in the cache database; the passenger ID is generated in the cache database when the passenger makes a reservation for the first time.
  • the PNR number is placed in the cache database and stored in correspondence with the passenger ID, and the PNR data corresponding to the PNR number is stored in the cache database at the same time.
  • based on the passenger ID and the associated PNR numbers, all valid target PNR data are extracted from the cache database and compared for duplicate orders according to the preset duplicate reservation identification rules, and the identified duplicate orders are saved in the main database.
  • the present invention introduces a cache database to hold the currently valid PNR data, and generates the passenger ID corresponding to the passenger identity information directly in the cache database when the passenger makes a reservation for the first time.
  • before duplicate reservation identification is performed, the PNR data is first placed in the cache database, so that one passenger ID corresponds to all of its associated PNR data. All valid PNR data associated with the same passenger ID can then be compared for duplicate orders, and duplicate reservations identified.
  • FIG. 1 is a flowchart of a method for identifying duplicate reservations disclosed in an embodiment of the present invention;
  • FIG. 2 shows the processing flow of a duplicate reservation identification system disclosed in an embodiment of the present invention;
  • FIG. 3 is a schematic structural diagram of a duplicate reservation identification device disclosed in an embodiment of the present invention.
  • in the prior art, duplicate reservation identification is processed as follows: the duplicate reservation identification system receives, in real time or at regular intervals, the PNRs (Passenger Name Record, i.e. passenger reservation record) booked through the various reservation channels, and performs duplicate reservation identification based on the passenger information, querying the database to find out whether the same passenger has duplicate or suspected duplicate reservations within a certain range.
  • in order to improve throughput, current duplicate reservation identification systems generally adopt a distributed parallel processing architecture.
  • the distributed architecture has many advantages such as reliability, scalability, resource sharing, flexibility, high speed, and high performance, but there are also certain problems.
  • the order data of the same passenger in the database needs to be queried and compared with the PNR pushed to the system in real time.
  • suppose the system receives two or more PNRs of the same passenger almost simultaneously, for example PNR1 and PNR2. Since the system adopts distributed parallel processing, these orders are allocated to different machines and processed at the same time.
  • a possible solution is to fall back on traditional serial processing: when two or more PNRs are received at the same time, PNR1 is processed first while PNR2 waits, and PNR2 is processed only after PNR1 is finished.
  • although this solution guarantees the integrity of duplicate order identification, it also greatly reduces the processing speed of the system and gives up the high throughput of the parallel processing architecture.
  • the embodiment of the present invention discloses a method and a device for identifying duplicate reservations. The acquired PNR data is parsed, the passenger identity information and flight information are extracted from it, and the passenger ID corresponding to the passenger identity information is looked up in the cache database; the passenger ID is generated in the cache database when the passenger makes a reservation for the first time.
  • the PNR number is placed in the cache database and stored in correspondence with the passenger ID, and the PNR data corresponding to the PNR number is stored in the cache database at the same time.
  • based on the passenger ID and the associated PNR numbers, all valid target PNR data are extracted from the cache database and compared for duplicate orders according to the preset duplicate reservation identification rules, and the identified duplicate orders are saved in the main database.
  • because the passenger ID corresponding to the passenger identity information is generated directly in the cache database, one passenger ID can correspond to all of its associated PNR data, which effectively avoids the problem in the traditional solution that multiple PNRs of the same passenger cannot be identified because of distributed parallel processing.
  • the cache database can greatly improve the read and write efficiency, and can shorten the read and write operations to the millisecond level.
  • the term “including” and variations thereof are open-ended inclusions, ie, "including but not limited to”.
  • the term “based on” is “based at least in part on.”
  • the term “one embodiment” means “at least one embodiment”; the term “another embodiment” means “at least one additional embodiment”; the term “some embodiments” means “at least some embodiments”. Relevant definitions of other terms will be given in the description below.
  • a flowchart of a method for identifying repeated reservations disclosed in an embodiment of the present invention includes:
  • Step S101: obtain PNR data;
  • the PNR data of the airline can be obtained from the China civil aviation information system, and the PNR data includes the PNR number.
  • PNR is the abbreviation of Passenger Name Record, i.e. the passenger reservation record. It reflects the passenger's itinerary, the number of seats occupied on the flight, and passenger information, and is used in civil aviation reservation systems.
  • Step S102: parse the PNR data, and extract the passenger identity information and flight information from the PNR data;
  • the passenger identity information may include: passenger name, ID number, passport number, frequent flyer card number, and the like.
  • the flight information may include: flight reservation information, flight origin, flight destination, flight number, departure date and arrival date, and the like.
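  • As an illustration of step S102, the following minimal Python sketch splits one PNR record into passenger identity information and flight information. The dictionary layout of the input record, the sample values, and all class, field and function names are assumptions made for this example; the source does not specify the PNR format.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class PassengerIdentity:
    name: str
    id_number: Optional[str] = None
    passport_number: Optional[str] = None
    frequent_flyer_number: Optional[str] = None


@dataclass
class FlightInfo:
    flight_number: str
    origin: str
    destination: str
    departure_date: str
    arrival_date: str


def parse_pnr(record: dict) -> Tuple[str, PassengerIdentity, List[FlightInfo]]:
    """Extract the PNR number, identity information and flight information.

    `record` is a hypothetical, already-decoded PNR; real PNR formats differ.
    """
    passenger = record["passenger"]
    identity = PassengerIdentity(
        name=passenger["name"],
        id_number=passenger.get("id_number"),
        passport_number=passenger.get("passport_number"),
        frequent_flyer_number=passenger.get("frequent_flyer_number"),
    )
    flights = [FlightInfo(**segment) for segment in record["segments"]]
    return record["pnr_number"], identity, flights


# Invented sample record, for illustration only.
sample = {
    "pnr_number": "ABC123",
    "passenger": {"name": "ZHANG/SAN MR", "passport_number": "123456789"},
    "segments": [{
        "flight_number": "XX0001", "origin": "PEK", "destination": "SHA",
        "departure_date": "2021-11-01", "arrival_date": "2021-11-01",
    }],
}
pnr_number, identity, flights = parse_pnr(sample)
```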
  • Step S103: find the passenger ID corresponding to the passenger identity information from the cache database;
  • the passenger ID is generated in the cache database when the passenger makes a reservation for the first time.
  • passengers may use any one or more of ID number, passport number and frequent flyer card number for each seat reservation, so it is necessary to identify the passenger identity.
  • in order to facilitate subsequent duplicate identification across multiple orders of the same passenger, the present invention generates a passenger ID for each passenger in the cache database to uniquely identify each passenger.
  • Example 1: merging the passenger IDs of different ID cards
  • Processing order 1 (English name + ID card 1): build the key "real English name + encrypted ID number 1" and query the cache database. No result is found, so the key is stored in the cache database with passenger ID1 as the value;
  • Processing order 2 (English name + ID card 2): build the key "real English name + encrypted ID number 2" and query the cache database. No result is found, so the key is stored in the cache database with passenger ID2 as the value;
  • Processing order 3 (English name + ID card 1 + ID card 2): build the keys "real English name + encrypted ID number 1" and "real English name + encrypted ID number 2", loop over the document types, and query the cache database. Both keys exist and map to two different passenger IDs. The passenger ID of order 1 or order 2 is chosen at random as the final passenger ID (for example ID1), the passenger IDs of order 1 and order 2 are updated to it, and the records are merged into the same person.
  • Example 2: merging the passenger IDs of an ID card and a passport
  • Processing order 1 (English name + ID card): build the key "real English name + encrypted ID number" and query the cache database. No result is found, so the key is stored in the cache database with passenger ID1 as the value;
  • Processing order 2 (English name + passport): build the key "real English name + encrypted passport number" and query the cache database. No result is found, so the key is stored in the cache database with passenger ID2 as the value;
  • Processing order 3 (English name + ID card + passport): build the keys "real English name + encrypted ID number" and "real English name + encrypted passport number", loop over the document types, and query the cache database. Both keys exist and map to two different passenger IDs. The passenger ID of order 1 or order 2 is chosen at random as the final passenger ID (for example ID1), the passenger IDs of order 1 and order 2 are updated (both become ID1), and the records are merged into the same person.
  • the present invention determines whether this is the passenger's first reservation by checking whether a passenger ID corresponding to the passenger identity information is stored in the cache database. If not, the passenger has not made a reservation before, and a new passenger ID is generated in the cache database. If so, this is not the passenger's first reservation: when the new booking carries multiple documents that map to different passenger IDs, those passenger IDs are merged; otherwise the previously generated passenger ID is used directly.
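  • The lookup-and-merge behaviour of the two examples above can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: a plain dictionary stands in for the cache database, and the class name, method name, key layout and sample document values are all assumptions.

```python
import itertools


class PassengerIndex:
    """In-memory stand-in for the cache database (hypothetical)."""

    def __init__(self):
        # (name, document type, encrypted document number) -> passenger ID
        self._key_to_pid = {}
        self._next_pid = itertools.count(1)

    def resolve(self, name, documents):
        """Return the single passenger ID for every document used in one order."""
        keys = [(name, doc_type, number) for doc_type, number in documents]
        found = sorted({self._key_to_pid[k] for k in keys if k in self._key_to_pid})
        if not found:
            pid = next(self._next_pid)   # first reservation: generate a new ID
        else:
            # The source picks one of the existing IDs at random; the lowest is
            # used here only to keep the example deterministic.
            pid = found[0]
            for key in list(self._key_to_pid):
                if self._key_to_pid[key] in found:
                    self._key_to_pid[key] = pid   # merge the other IDs into pid
        for key in keys:
            self._key_to_pid[key] = pid
        return pid


index = PassengerIndex()
id1 = index.resolve("ZHANGSAN", [("NI", "enc-ni-1")])                      # order 1
id2 = index.resolve("ZHANGSAN", [("PP", "enc-pp-1")])                      # order 2
id3 = index.resolve("ZHANGSAN", [("NI", "enc-ni-1"), ("PP", "enc-pp-1")])  # order 3
```

  • After order 3, both document keys map to the same passenger ID. A real system would presumably also need to re-point the PNR numbers stored under the discarded passenger ID and to make the lookup and merge atomic; neither is shown here.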
  • this step is the key to closing the loophole that a distributed parallel duplicate reservation identification system has when duplicate orders arrive with instantaneous high concurrency.
  • suppose the duplicate reservation identification system receives two PNRs of the same passenger, PNR1 and PNR2, within a very short interval (such as 1 ms). Since the system adopts distributed parallel processing, these orders are assigned to different machines and processed simultaneously. While machine 1 is processing PNR1, PNR2 cannot yet be queried in the database, so it is impossible to identify whether PNR2 duplicates PNR1. Similarly, while machine 2 is processing PNR2, PNR1 cannot be queried in the database, so it is impossible to identify whether PNR1 duplicates PNR2.
  • the present invention introduces a cache database in which the currently valid PNR data is stored. By generating a unique passenger ID corresponding to the passenger identity information directly in the cache database, a correspondence is established between the passenger ID and all of its associated PNR data.
  • a correspondence is likewise established between the PNR data and all of its passenger IDs, which effectively avoids the problem in the traditional scheme that multiple PNRs of the same passenger cannot be identified because of distributed parallel processing.
  • the cache database can greatly improve the read and write efficiency, and can shorten the read and write operations to the millisecond level.
  • Step S104: put the PNR number in the cache database, store it in correspondence with the passenger ID, and simultaneously store the PNR data corresponding to the PNR number in the cache database;
  • if no passenger ID corresponding to the passenger identity information is found in the cache database, a passenger ID is first generated in the cache database, and then the acquired PNR data and the PNR number are stored in the cache database.
  • if the passenger ID is found, the obtained PNR data together with the PNR number are stored directly in the cache database.
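  • The reason for writing to the cache before comparing can be illustrated with a small Python sketch. Two threads stand in for two machines and a lock-protected dictionary stands in for the cache database; these stand-ins and all names are assumptions for illustration only. Because each worker stores its PNR before reading, whichever worker stores last is guaranteed to see both PNRs.

```python
import threading
from collections import defaultdict


class PnrCache:
    """Thread-safe in-memory stand-in for the cache database (hypothetical)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._pnrs_by_passenger = defaultdict(set)   # passenger ID -> PNR numbers
        self._pnr_data = {}                          # PNR number -> PNR data

    def store(self, passenger_id, pnr_number, data):
        with self._lock:
            self._pnrs_by_passenger[passenger_id].add(pnr_number)
            self._pnr_data[pnr_number] = data

    def pnr_numbers(self, passenger_id):
        with self._lock:
            return set(self._pnrs_by_passenger[passenger_id])


def process(cache, passenger_id, pnr_number, data, seen):
    cache.store(passenger_id, pnr_number, data)          # step S104: write first
    seen[pnr_number] = cache.pnr_numbers(passenger_id)   # then read for comparison


cache, seen = PnrCache(), {}
workers = [
    threading.Thread(target=process, args=(cache, 1, pnr, {"pnr_number": pnr}, seen))
    for pnr in ("PNR1", "PNR2")
]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()
```

  • In this sketch at least one entry of `seen` always contains both PNR numbers, so the pair is compared by at least one worker regardless of scheduling.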
  • Step S105: based on the passenger ID and the associated PNR number, extract all valid target PNR data from the cache database;
  • one passenger ID corresponds to all associated PNR data.
  • the valid target PNR data in this embodiment refers to the order data that has not been cancelled.
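  • Step S105 can be sketched as a simple filter. In this hedged Python example, the two dictionaries stand in for the cache database, and the `cancelled` flag is an assumed way of marking a cancelled order; neither is specified by the source.

```python
def extract_valid_pnrs(passenger_id, pnrs_by_passenger, pnr_data):
    """Return the non-cancelled PNR data linked to one passenger ID."""
    numbers = sorted(pnrs_by_passenger.get(passenger_id, ()))
    return [
        pnr_data[number]
        for number in numbers
        if number in pnr_data and not pnr_data[number].get("cancelled", False)
    ]


# Invented sample data: PNR2 has been cancelled and must be skipped.
pnrs_by_passenger = {1: {"PNR1", "PNR2", "PNR3"}}
pnr_data = {
    "PNR1": {"pnr_number": "PNR1", "cancelled": False},
    "PNR2": {"pnr_number": "PNR2", "cancelled": True},
    "PNR3": {"pnr_number": "PNR3"},
}
valid = extract_valid_pnrs(1, pnrs_by_passenger, pnr_data)
```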
  • Step S106: compare all the target PNR data for duplicate orders according to the preset duplicate reservation identification rules, and determine whether the passenger has duplicate orders; if so, go to step S107;
  • under the preset duplicate reservation identification rules, PNR data are duplicates when they meet the following conditions: (1) the passenger name and identity ID are the same; (2) the flight origin or the flight destination (airport) is the same; (3) the departure times meet the requirements below. The class, flight number and booking responsibility group of the duplicate PNR data may differ.
  • if the flight segment is a domestic segment (the countries of both the departure airport and the arrival airport are CN), the departure or arrival airport is the same, and the departure times of the two segments fall within the first preset time range, the two segments are considered duplicate segments;
  • if the flight segment is an international segment (the country of at least one of the departure airport and the arrival airport is not CN), the departure or arrival airport is the same, and the departure times of the two segments fall within the second preset time range, the two segments are considered duplicate segments.
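  • Conditions (2) and (3) can be sketched in Python as below. The source does not give the two preset time ranges, so they are passed in as parameters and the values in the usage lines are purely illustrative. How to treat a pair in which one segment is domestic and the other international is also not stated; the sketch assumes the international range applies in that case.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Segment:
    origin: str
    destination: str
    origin_country: str
    destination_country: str
    departure: datetime


def is_domestic(segment):
    return segment.origin_country == "CN" and segment.destination_country == "CN"


def is_duplicate_segment(a, b, domestic_window, international_window):
    """Apply conditions (2) and (3); condition (1) is assumed already satisfied
    because both segments were fetched under the same passenger ID."""
    if a.origin != b.origin and a.destination != b.destination:
        return False
    # Assumption: the international window applies if either segment is international.
    both_domestic = is_domestic(a) and is_domestic(b)
    window = domestic_window if both_domestic else international_window
    return abs(a.departure - b.departure) <= window


# Illustrative segments and windows (not values from the source).
morning = Segment("PEK", "SHA", "CN", "CN", datetime(2021, 11, 1, 8, 0))
noon = Segment("PEK", "CAN", "CN", "CN", datetime(2021, 11, 1, 12, 0))
other = Segment("CTU", "XIY", "CN", "CN", datetime(2021, 11, 1, 9, 0))
windows = dict(domestic_window=timedelta(hours=6),
               international_window=timedelta(hours=12))
```

  • With these sample windows, `morning` and `noon` share a departure airport and depart four hours apart, so they count as duplicates, while `morning` and `other` share no airport and do not.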
  • the criteria for judging whether the passenger's name and the identity ID are the same are: the passenger's name is duplicated and the identity ID is duplicated.
  • the basis for judging the duplication of IDs is that the two passengers use the same ID card, passport or frequent flyer card.
  • the system will maintain the information of the passenger's English name, ID type and ID content.
  • the same passenger can have multiple identity documents. Records with the same English name and the same identity ID are assigned the same passenger ID. If the same passenger appears in one PNR with different identity documents, those documents are also assigned the same passenger ID. For example, if a passenger has entered both an ID number and a passport number in one PNR, a PNR later booked with that ID number or that passport number is also attributed to the same passenger.
  • the document type codes include NI (ID card) and PP (passport);
  • the gender and infant codes are: M for male, F for female, MI for male infant, FI for female infant.
  • Each passenger can only enter one passport information. That is, the PNR of multiple people must specify which passenger the information belongs to, and the PNR of a single person may not specify.
  • the passport number is 123456789, and this person is the first passenger in the PNR.
  • PAXLST: the United States required airlines to begin adopting PAXLST based on the UN/EDIFACT standard on October 4, 2005; Canada began using it on November 1, 2005.
  • the information added for the US PAXLST includes: country of residence, US address (except for US citizens or US resident card holders), and passport expiration date. The information added for the Canadian PAXLST includes: country of residence and passport expiration date.
  • PAXLST usually only requires one type of valid document information, at most no more than two, passport information is preferred, and each passenger including infants must hold at least one valid document.
  • the document code is the prefix of the airline, such as CA, LH, etc. (This requirement supports the non-Air China alliance card number entered by Air China.).
  • the frequent flyer number must be a real and valid number.
  • Example 1 (Table 2) - same ID number:
  • Example 2 (Table 3) - same passport number:
  • Example 4 (Table 5) - the ID number is the same as the frequent flyer card number:
  • the judgment is mainly based on the identity information of the passenger. Since passenger names in PNR data may carry a wide variety of name suffixes, such as MR, MS, VIP, etc., the name suffix must be stripped before judging whether a name is duplicated, so that the name comparison is accurate. When the passenger ID corresponding to the passenger identity information is looked up in the cache database, the lookup uses the passenger's name with the suffix stripped, together with the ID number, passport number or frequent flyer card number used for this booking, to check whether a passenger ID for that identity information exists.
  • the passenger identity information includes: any one of the ID number, passport number or frequent flyer card number used for this reservation, and the passenger's name with the name suffix stripped.
  • Example 1: zhang/san is stored as zhangsan.
  • Example 2: zhang san is stored as zhangsan.
  • Example 3: HAN/WAI LEEA is stored as HANWAILEEA.
  • Example 4: VANASS/LEONAD JOHA MR is stored as VANASSLEONADJOHA.
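  • The four name examples above can be reproduced with a short Python sketch. The suffix set contains only the three suffixes named in the text, and the function name is hypothetical; a production list of suffixes would presumably be longer.

```python
import re

# Only the suffixes mentioned in the text; the real list is assumed to be longer.
NAME_SUFFIXES = {"MR", "MS", "VIP"}


def normalize_name(raw):
    """Strip trailing name suffixes, then drop '/' and whitespace separators."""
    tokens = [token for token in re.split(r"[/\s]+", raw.strip()) if token]
    while len(tokens) > 1 and tokens[-1].upper() in NAME_SUFFIXES:
        tokens.pop()
    return "".join(tokens)


names = ["zhang/san", "zhang san", "HAN/WAI LEEA", "VANASS/LEONAD JOHA MR"]
stored = [normalize_name(name) for name in names]
```

  • One limitation of this sketch is that a genuine final name part that happens to equal a suffix (for example a given name written as "MS") would also be stripped.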
  • the flight segment is a valid flight segment
  • the present invention supports the scope control of whether to enable full detection within the alliance. For example, FM and MU are the same alliance, and the flight segments with duplicate reservations within the alliance can be identified.
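  • A possible shape for this scope control is sketched below in Python. Only the FM/MU pairing comes from the text; the group label, the table, the function name, and the assumption that detection is limited to the same carrier when alliance-wide detection is disabled are all hypothetical.

```python
# Hypothetical carrier-to-group table; only the FM/MU pairing comes from the text.
ALLIANCE_OF = {"FM": "MU_GROUP", "MU": "MU_GROUP"}


def in_detection_scope(carrier_a, carrier_b, alliance_wide):
    """Decide whether two segments should be compared for duplicates."""
    if carrier_a == carrier_b:
        return True
    if not alliance_wide:
        return False   # assumption: same-carrier only when the switch is off
    group_a = ALLIANCE_OF.get(carrier_a)
    return group_a is not None and group_a == ALLIANCE_OF.get(carrier_b)
```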
  • the document information (ID number, passport number and frequent flyer card number) in the PNR order is encrypted, and the encryption method can be SM4.
  • Step S107: save the duplicate order in the main database.
  • the automatic cleaning module reads duplicate orders from the main database, and after cleaning the duplicate orders, the cleaning results are kept in the main database.
  • in summary, the present invention discloses a method for identifying duplicate reservations. The acquired PNR data is parsed, the passenger identity information and flight information are extracted from it, and the passenger ID corresponding to the passenger identity information is looked up in the cache database; the passenger ID is generated in the cache database when the passenger makes a reservation for the first time.
  • the PNR number is placed in the cache database and stored in correspondence with the passenger ID, and the PNR data corresponding to the PNR number is stored in the cache database at the same time.
  • based on the passenger ID and the associated PNR numbers, all valid target PNR data are extracted from the cache database and compared for duplicate orders according to the preset duplicate reservation identification rules, and the identified duplicate orders are saved in the main database.
  • each block in the flowchart or block diagrams may represent a module, segment, or portion of code that contains one or more executable instructions for implementing the specified logical functions.
  • the functions noted in the blocks may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved.
  • each block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations can be implemented in dedicated hardware-based systems that perform the specified functions or operations , or can be implemented in a combination of dedicated hardware and computer instructions.
  • Computer program code for performing operations of the present disclosure may be written in one or more programming languages, including but not limited to object-oriented programming languages such as Java, Smalltalk and C++, and conventional procedural programming languages such as the "C" language or similar programming languages.
  • the program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server.
  • the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (e.g., through the Internet using an Internet service provider).
  • the present invention also discloses a repeated reservation identification device.
  • a schematic structural diagram of a device for identifying repeated reservations disclosed in an embodiment of the present invention includes:
  • an obtaining unit 201 configured to obtain PNR data, the PNR data includes: a PNR number;
  • a parsing unit 202 for parsing the PNR data, and extracting passenger identity information and flight information from the PNR data
  • the passenger identity information may include: passenger name, ID number, passport number, frequent flyer card number, and the like.
  • the flight information may include: flight reservation information, flight origin, flight destination, flight number, departure date and arrival date, and the like.
  • Searching unit 203 configured to find the passenger ID corresponding to the passenger identity information from the cache database, where the passenger ID is generated in the cache database when the passenger makes a reservation for the first time;
  • passengers may use any one or more of ID number, passport number and frequent flyer card number for each seat reservation, so it is necessary to identify the passenger's identity.
  • in order to facilitate subsequent duplicate identification across multiple orders of the same passenger, the present invention generates a passenger ID for each passenger in the cache database to uniquely identify each passenger.
  • the duplicate reservation identification device may further include a generating unit, configured to generate a passenger ID corresponding to the passenger identity information in the cache database before the first storage unit 204 puts the PNR number into the cache database and stores it in correspondence with the passenger ID.
  • the first storage unit 204 is used to put the PNR number in the cache database, store it in correspondence with the passenger ID, and simultaneously store the PNR data corresponding to the PNR number in the cache database;
  • if no passenger ID corresponding to the passenger identity information is found in the cache database, a passenger ID is first generated in the cache database, and then the acquired PNR data and the PNR number are stored in the cache database.
  • if the passenger ID is found, the obtained PNR data together with the PNR number are stored directly in the cache database.
  • Extraction unit 205 for extracting all valid target PNR data from the cache database based on the passenger ID and the associated PNR number;
  • one passenger ID corresponds to all associated PNR data.
  • the valid target PNR data in this embodiment refers to the order data that has not been cancelled.
  • the duplicate order judgment unit 206 is configured to compare all the target PNR data with duplicate orders according to the preset duplicate reservation identification rules, and judge whether the passenger has duplicate orders;
  • the duplicate orders determined according to the preset duplicate reservation identification rules are PNR data that meet the following conditions: (1) the passenger name and identity ID are the same; (2) the flight origin or the flight destination (airport) is the same; (3) the time meets the requirements below
  • (the cabin class, flight number and booking responsibility group in duplicate PNR data may differ), as follows:
  • when the flight segment is a domestic segment (the countries of both the departure airport and the arrival airport are CN), two segments are regarded as duplicate segments if the departure or arrival airport is the same and their departure times fall within the first preset time range;
  • when the flight segment is an international segment (at least one of the countries of the departure airport and the arrival airport is not CN), two segments are regarded as duplicate segments if the departure or arrival airport is the same and their departure times fall within the second preset time range.
  • the second storage unit 207 is configured to save the repeated order in the main database when the repeated order determination unit determines yes.
  • the automatic cleaning module reads duplicate orders from the main database, and after cleaning the duplicate orders, the cleaning results are kept in the main database.
  • the present invention discloses a duplicate reservation identification device, which parses the acquired PNR data, extracts the passenger identity information and flight information from the PNR data, and finds the passenger ID corresponding to the passenger identity information in the cache database.
  • the passenger ID is generated in the cache database when the passenger makes a reservation for the first time.
  • the PNR number is placed in the cache database and stored in correspondence with the passenger ID, and the PNR data corresponding to the PNR number is stored in the cache database at the same time.
  • based on the passenger ID and the associated PNR numbers, all valid target PNR data is extracted from the cache database and compared for duplicate orders according to the preset duplicate reservation identification rules, and the identified duplicate orders are saved in the main database.
  • the present invention introduces a cache database to hold the currently valid PNR data, and generates the passenger ID corresponding to the passenger identity information directly in the cache database when the passenger makes a reservation for the first time.
  • when real-time PNR data is received, it is placed in the cache database before duplicate reservation identification is performed, so that the passenger ID corresponds to all associated PNR data; duplicate reservations can then be identified by comparing all valid PNR data associated with the same passenger ID.
  • because the passenger ID is generated directly in the cache database and corresponds to all associated PNR data, the present invention avoids the problem in the traditional solution that, due to distributed parallel processing, duplicate reservations across multiple PNRs of the same passenger cannot be identified.
  • compared with the main database, the cache database greatly improves read and write efficiency and can shorten read and write operations to the millisecond level.
  • to judge whether names are duplicated, the duplicate reservation identification device may further include:
  • a name duplication judgment unit, configured to strip the suffix from the passenger's name and to judge whether the name with the suffix removed is duplicated.
  • the units involved in the embodiments of the present disclosure may be implemented in a software manner, and may also be implemented in a hardware manner.
  • the name of the unit does not constitute a limitation of the unit itself under certain circumstances, for example, the first acquisition unit can also be described as "a unit that acquires at least two Internet Protocol addresses".
  • exemplary types of hardware logic components include: Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems on Chips (SOCs), Complex Programmable Logical Devices (CPLDs) and more.
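The duplicate-order conditions listed above (same passenger, same departure or arrival airport, departure times within a preset window that depends on whether the segment is domestic or international) can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the `Segment` fields, the two window values, the use of a single `passenger_id` to stand for "same name and identity ID", and the choice to apply the international window when the two segments are of mixed type are all assumptions, since the source only speaks of a "first" and a "second" preset time range.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Assumed values; the source does not state the preset time ranges.
DOMESTIC_WINDOW = timedelta(hours=2)
INTERNATIONAL_WINDOW = timedelta(hours=24)


@dataclass(frozen=True)
class Segment:
    """Hypothetical flattened view of one flight segment of a PNR."""
    passenger_id: str
    origin: str
    origin_country: str
    destination: str
    destination_country: str
    departure: datetime


def is_domestic(seg: Segment) -> bool:
    # Domestic: both the departure and the arrival airport are in CN.
    return seg.origin_country == "CN" and seg.destination_country == "CN"


def is_duplicate(a: Segment, b: Segment) -> bool:
    # Condition 1: same passenger (assumed to be captured by the passenger ID).
    if a.passenger_id != b.passenger_id:
        return False
    # Condition 2: same departure airport or same arrival airport.
    if a.origin != b.origin and a.destination != b.destination:
        return False
    # Condition 3: departure times within the applicable preset window.
    # Assumption: a mixed domestic/international pair uses the international window.
    both_domestic = is_domestic(a) and is_domestic(b)
    window = DOMESTIC_WINDOW if both_domestic else INTERNATIONAL_WINDOW
    return abs(a.departure - b.departure) <= window


morning = Segment("ID1", "PEK", "CN", "SHA", "CN", datetime(2021, 1, 1, 8, 0))
later = Segment("ID1", "PEK", "CN", "PVG", "CN", datetime(2021, 1, 1, 9, 30))
noon = Segment("ID1", "PEK", "CN", "PVG", "CN", datetime(2021, 1, 1, 12, 0))
```

Cabin class, flight number and booking responsibility group are deliberately absent from the comparison, since the source states that they may differ between duplicate PNRs.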

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Tourism & Hospitality (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Development Economics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Economics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Strategic Management (AREA)
  • General Business, Economics & Management (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

A duplicate reservation identification method and apparatus. Acquired PNR data (S101) is parsed, and passenger identity information and flight information are extracted from the PNR data (S102); and a cache database is searched for a passenger ID corresponding to the passenger identity information (S103). A cache database is introduced to store the current valid PNR data, and a passenger ID corresponding to passenger identity information is directly generated in the cache database when a passenger makes a reservation for the first time. When real-time PNR data is received, the PNR data is preferentially placed in the cache database before duplicate reservation identification, such that the passenger ID can correspond to all associated PNR data, and thus, a duplicate reservation can be identified by means of performing duplicate order comparison on all valid PNR data associated with the same passenger ID.

Description

Duplicate reservation identification method and apparatus
This application claims priority to the Chinese patent application No. 202011302293.0, entitled "Duplicate reservation identification method and apparatus", filed with the China Patent Office on November 19, 2020, the entire contents of which are incorporated herein by reference.
Technical Field
The present invention relates to the technical field of civil aviation ticket reservation, and more particularly to a duplicate reservation identification method and apparatus.
Background
On the same flight or route, it often happens that a passenger books seats on the same flight or on adjacent flights in different PNRs (Passenger Name Records), or books several seats on the same or adjacent flights within a single PNR. Because the passenger intends to use only one of these seats, duplicate bookings leave the airline with seats that are held but will not be used. The airline cannot promptly sell those seats to passengers who actually need them, which lowers the load factor of the flight and costs the airline revenue.
In summary, how to provide a duplicate reservation identification method has become a technical problem that those skilled in the art urgently need to solve.
Summary of the Invention
In view of this, the present invention discloses a duplicate reservation identification method and apparatus, so as to identify duplicate reservations.
A duplicate reservation identification method, comprising:
acquiring PNR data, the PNR data including a PNR number;
parsing the PNR data, and extracting passenger identity information and flight information from the PNR data;
finding, in a cache database, the passenger ID corresponding to the passenger identity information, the passenger ID having been generated in the cache database when the passenger made a reservation for the first time;
placing the PNR number in the cache database and storing it in correspondence with the passenger ID, and at the same time storing the PNR data corresponding to the PNR number in the cache database;
extracting all valid target PNR data from the cache database based on the passenger ID and the associated PNR numbers;
comparing all the target PNR data for duplicate orders according to preset duplicate reservation identification rules, and determining whether the passenger has duplicate orders;
if so, saving the duplicate orders in a main database.
A duplicate reservation identification apparatus, comprising:
an acquisition unit, configured to acquire PNR data, the PNR data including a PNR number;
a parsing unit, configured to parse the PNR data and extract passenger identity information and flight information from the PNR data;
a search unit, configured to find, in a cache database, the passenger ID corresponding to the passenger identity information, the passenger ID having been generated in the cache database when the passenger made a reservation for the first time;
a first storage unit, configured to place the PNR number in the cache database and store it in correspondence with the passenger ID, and at the same time to store the PNR data corresponding to the PNR number in the cache database;
an extraction unit, configured to extract all valid target PNR data from the cache database based on the passenger ID and the associated PNR numbers;
a duplicate order judgment unit, configured to compare all the target PNR data for duplicate orders according to preset duplicate reservation identification rules, and to determine whether the passenger has duplicate orders;
a second storage unit, configured to save the duplicate orders in a main database when the duplicate order judgment unit determines that duplicate orders exist.
As can be seen from the above technical solution, the present invention discloses a duplicate reservation identification method and apparatus. The acquired PNR data is parsed, and passenger identity information and flight information are extracted from it. The passenger ID corresponding to the passenger identity information is looked up in the cache database; this passenger ID is generated in the cache database when the passenger makes a reservation for the first time. The PNR number is placed in the cache database and stored in correspondence with the passenger ID, and the PNR data corresponding to the PNR number is stored in the cache database at the same time. Based on the passenger ID and the associated PNR numbers, all valid target PNR data is extracted from the cache database and compared for duplicate orders according to the preset duplicate reservation identification rules, and the duplicate orders so determined are saved in the main database. The present invention introduces a cache database to hold the currently valid PNR data, and generates the passenger ID corresponding to the passenger identity information directly in the cache database when the passenger makes a reservation for the first time. When real-time PNR data is received, it is placed in the cache database before duplicate reservation identification is performed, so that the passenger ID corresponds to all associated PNR data; duplicate reservations can then be identified by comparing all valid PNR data associated with the same passenger ID.
Brief Description of the Drawings
The above and other features, advantages and aspects of the embodiments of the present disclosure will become more apparent with reference to the accompanying drawings and the following detailed description. Throughout the drawings, the same or similar reference numerals denote the same or similar elements. It should be understood that the drawings are schematic and that components and elements are not necessarily drawn to scale.
FIG. 1 is a flowchart of a duplicate reservation identification method disclosed in an embodiment of the present invention;
FIG. 2 shows the processing flow of a duplicate reservation identification system disclosed in an embodiment of the present invention;
FIG. 3 is a schematic structural diagram of a duplicate reservation identification apparatus disclosed in an embodiment of the present invention.
Detailed Description
In the prior art, duplicate reservations are identified as follows: the duplicate reservation identification system receives, in real time or on a schedule, the PNRs (Passenger Name Records) booked through the various reservation channels, and uses the passenger information to query a database for duplicate bookings, or suspected duplicate bookings within a certain range, made by the same passenger.
From the perspective of system architecture, duplicate reservation identification systems currently tend to adopt a distributed parallel processing architecture in order to increase throughput. A distributed architecture offers reliability, scalability, resource sharing, flexibility, speed and high performance, but it also has problems. Identifying duplicate orders requires querying the database for the order data of the same passenger and comparing it with the PNR pushed to the system in real time. In some cases the system receives two or more PNRs of the same passenger almost simultaneously, for example PNR1 and PNR2. Because the system uses distributed parallel processing, these orders are assigned to different machines and processed at the same time. While machine 1 is processing PNR1, PNR2 cannot yet be found in the database, so it is impossible to tell whether PNR2 duplicates PNR1. Likewise, while machine 2 is processing PNR2, PNR1 cannot be found in the database, so it is impossible to tell whether PNR1 duplicates PNR2. In this scenario, if PNR1 and PNR2 are duplicate orders, the system cannot produce the correct identification result. It follows that a distributed parallel duplicate reservation identification system has a gap in identifying duplicate orders that arrive in momentary bursts of high concurrency, and may fail to identify all duplicate order data.
One possible solution to this problem is traditional serial processing: when two or more PNRs are received at the same time, PNR1 is processed first while PNR2 waits, and PNR2 is processed only after PNR1 is finished. Although this guarantees the completeness of duplicate order identification, it greatly reduces the processing speed of the system and gives up the high throughput of the parallel processing architecture.
Based on this, an embodiment of the present invention discloses a duplicate reservation identification method and apparatus. The acquired PNR data is parsed, and passenger identity information and flight information are extracted from it. The passenger ID corresponding to the passenger identity information is looked up in the cache database; this passenger ID is generated in the cache database when the passenger makes a reservation for the first time. The PNR number is placed in the cache database and stored in correspondence with the passenger ID, and the PNR data corresponding to the PNR number is stored in the cache database at the same time. Based on the passenger ID and the associated PNR numbers, all valid target PNR data is extracted from the cache database and compared for duplicate orders according to the preset duplicate reservation identification rules, and the duplicate orders so determined are saved in the main database. The present invention introduces a cache database to hold the currently valid PNR data, and generates the passenger ID corresponding to the passenger identity information directly in the cache database when the passenger makes a reservation for the first time. When real-time PNR data is received, it is placed in the cache database before duplicate reservation identification is performed, so that the passenger ID corresponds to all associated PNR data; duplicate reservations can then be identified by comparing all valid PNR data associated with the same passenger ID.
In addition, because the passenger ID corresponding to the passenger identity information is generated directly in the cache database and corresponds to all associated PNR data, the present invention avoids the problem in the traditional solution that, due to distributed parallel processing, duplicate reservations across multiple PNRs of the same passenger cannot be identified. Moreover, compared with the main database, the cache database greatly improves read and write efficiency and can shorten read and write operations to the millisecond level.
Embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although certain embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be implemented in various forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided for a more thorough and complete understanding of the present disclosure. It should be understood that the drawings and embodiments of the present disclosure are for illustration only and are not intended to limit the scope of protection of the present disclosure.
As used herein, the term "including" and its variants are open-ended, that is, "including but not limited to". The term "based on" means "based at least in part on". The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one additional embodiment"; the term "some embodiments" means "at least some embodiments". Definitions of other terms are given in the description below.
It should be noted that concepts such as "first" and "second" mentioned in the present disclosure are used only to distinguish different devices, modules or units, and are not intended to limit the order of, or the interdependence between, the functions performed by these devices, modules or units.
Referring to FIG. 1, a flowchart of a duplicate reservation identification method disclosed in an embodiment of the present invention, the method includes:
Step S101: acquire PNR data;
Specifically, in practical applications, the airline's PNR data can be obtained from the China civil aviation information system. The PNR data includes a PNR number.
PNR is the abbreviation of Passenger Name Record, the passenger's reservation record. It reflects the passenger's itinerary, the number of flight seats occupied and the passenger information, and is used in civil aviation reservation systems.
Step S102: parse the PNR data, and extract passenger identity information and flight information from the PNR data;
The passenger identity information may include the passenger name, ID card number, passport number, frequent flyer card number, and so on.
The flight information may include the flight reservation information, flight origin, flight destination, flight number, departure date, arrival date, and so on.
Step S103: find, in the cache database, the passenger ID corresponding to the passenger identity information;
The passenger ID is generated in the cache database when the passenger makes a reservation for the first time.
In practical applications, a passenger may use any one or more of an ID card number, a passport number and a frequent flyer card number for each reservation, so the passenger's identity needs to be resolved.
To make it easier to detect duplicates among multiple orders of the same passenger later, the present invention generates, for each passenger, one passenger ID in the cache database that uniquely identifies that passenger.
It should be noted that:
A) When no passenger ID corresponding to the passenger identity information is found in the cache database, a passenger ID corresponding to the passenger identity information is generated in the cache database.
B) When multiple passenger IDs corresponding to the passenger identity information are found in the cache database, the passenger IDs need to be merged, so that a passenger with multiple documents has exactly one passenger ID.
Example 1: merging the passenger IDs of different ID cards
Processing order 1 (English name + ID card 1): build the key "real English name + encrypted ID number 1" and query the cache database; no result is found, so an entry is stored in the cache database with "real English name + encrypted ID number 1" as the key and passenger ID1 as the value;
Processing order 2 (English name + ID card 2): build the key "real English name + encrypted ID number 2" and query the cache database; no result is found, so an entry is stored in the cache database with "real English name + encrypted ID number 2" as the key and passenger ID2 as the value;
Processing order 3 (English name + ID number 1 + ID number 2): build both the "real English name + encrypted ID number 1" key and the "real English name + encrypted ID number 2" key, loop over the document types and query the cache database; both keys exist and hold two different passenger ID values, so the passenger ID of order 1 or order 2 is chosen arbitrarily as the final passenger ID (for example ID1), the passenger ID values of order 1 and order 2 are updated, and the records are aggregated into the same person.
Example 2: merging the passenger IDs of an ID card and a passport
Processing order 1 (English name + ID card): build the key "real English name + encrypted ID number" and query the cache database; no result is found, so an entry is stored in the cache database with "real English name + encrypted ID number" as the key and passenger ID1 as the value;
Processing order 2 (English name + passport): build the key "real English name + encrypted passport number" and query the cache database; no result is found, so an entry is stored in the cache database with "real English name + encrypted passport number" as the key and passenger ID2 as the value;
Processing order 3 (English name + ID card + passport): build both the "real English name + encrypted ID number" key and the "real English name + encrypted passport number" key, loop over the document types and query the cache database; both keys exist and hold two different passenger ID values, so the passenger ID of order 1 or order 2 is chosen arbitrarily as the final passenger ID (for example ID1), the passenger ID values of order 1 and order 2 are updated (both become ID1), and the records are aggregated into the same person.
In practical applications, the present invention can determine whether this is the passenger's first reservation by checking whether a passenger ID corresponding to the passenger identity information is stored in the cache database. If not, the passenger has not made a reservation before, and a new passenger ID is generated in the cache database. If so, this is not the passenger's first reservation: when the passenger books again with multiple documents, the passenger IDs are merged; otherwise the previously generated passenger ID is used directly.
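The lookup, creation and merge logic of the two examples above can be sketched as follows. This is only an illustration under stated assumptions: a plain Python dict stands in for the cache database, the function name `resolve_passenger_id`, the key format and the sample name and document strings are hypothetical, and where the source picks the surviving passenger ID arbitrarily, the sketch takes the smallest one so that the result is reproducible.

```python
import itertools


def resolve_passenger_id(cache, name, encrypted_docs, new_id):
    """Return the single passenger ID for every document key on an order.

    cache          -- dict standing in for the cache database (key -> passenger ID)
    name           -- the passenger's real English name
    encrypted_docs -- already-encrypted document numbers present on the order
    new_id         -- callable that produces a fresh passenger ID
    """
    keys = [f"{name}+{doc}" for doc in encrypted_docs]
    found = sorted({cache[key] for key in keys if key in cache})
    if not found:
        # First reservation: no key is known yet, so create a new ID.
        passenger_id = new_id()
    else:
        # The source chooses arbitrarily; the smallest ID is used here.
        passenger_id = found[0]
        if len(found) > 1:
            # Several IDs for one person: repoint every key to the survivor.
            for key in list(cache):
                if cache[key] in found:
                    cache[key] = passenger_id
    for key in keys:
        cache[key] = passenger_id
    return passenger_id


counter = itertools.count(1)
cache = {}


def make_id():
    return f"ID{next(counter)}"


order1 = resolve_passenger_id(cache, "ZHANG/SAN", ["enc-idcard"], make_id)
order2 = resolve_passenger_id(cache, "ZHANG/SAN", ["enc-passport"], make_id)
order3 = resolve_passenger_id(
    cache, "ZHANG/SAN", ["enc-idcard", "enc-passport"], make_id
)
```

After the third call both document keys map to the same passenger ID, which mirrors Example 2. A real system would also have to repoint the PNR numbers stored under the discarded ID, which the sketch leaves out.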
It should be noted that this step is the key to closing the gap that a distributed parallel duplicate reservation identification system has in identifying duplicate orders arriving in momentary bursts of high concurrency. Suppose no cache database is introduced. If the system receives two PNRs of the same passenger, PNR1 and PNR2, within a very short interval (for example 1 ms), distributed parallel processing assigns them to different machines that process them at the same time. While machine 1 is processing PNR1, PNR2 cannot yet be found in the database, so it is impossible to tell whether PNR2 duplicates PNR1. Likewise, while machine 2 is processing PNR2, PNR1 cannot be found in the database, so it is impossible to tell whether PNR1 duplicates PNR2.
The present invention therefore introduces a cache database that holds the currently valid PNR data. By generating a unique passenger ID corresponding to the passenger identity information directly in the cache database, a correspondence is established between the passenger ID and all associated PNR data, and between the PNR data and all of its passenger IDs. This avoids the problem in the traditional solution that, due to distributed parallel processing, duplicate reservations across multiple PNRs of the same passenger cannot be identified. Moreover, compared with the main database, the cache database greatly improves read and write efficiency and can shorten read and write operations to the millisecond level.
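Why registering a PNR in the cache before comparing helps can be illustrated with a deliberately simplified model. The sketch below is an assumption-laden illustration, not the patented system: a Python set stands in for the shared store, each worker performs exactly one write and one read, every individual operation is treated as atomic, and the helper names `run` and `interleavings` are hypothetical. It enumerates every possible interleaving of two workers and records which other PNRs each worker can see when it reads.

```python
from itertools import permutations

PNRS = ("PNR1", "PNR2")


def run(schedule):
    """Replay a schedule of (pnr, op) steps against a shared store.

    Returns, for each PNR, the set of other PNRs visible at read time.
    """
    store = set()
    seen = {}
    for pnr, op in schedule:
        if op == "write":
            store.add(pnr)
        else:  # "read"
            seen[pnr] = store - {pnr}
    return seen


def interleavings(first_op, second_op):
    """Yield every schedule in which each worker does first_op, then second_op."""
    steps = [(pnr, op) for pnr in PNRS for op in (first_op, second_op)]
    for order in permutations(steps):
        if all(
            order.index((pnr, first_op)) < order.index((pnr, second_op))
            for pnr in PNRS
        ):
            yield order


# Query-then-write: in some interleavings neither worker sees the other PNR.
missed = [s for s in interleavings("read", "write") if not any(run(s).values())]

# Write-then-compare: in every interleaving at least one worker sees both PNRs.
always_detected = all(any(run(s).values()) for s in interleavings("write", "read"))
```

Under this model, `missed` is non-empty while `always_detected` is true: whichever worker reads last has necessarily been preceded by both writes. The source does not describe the atomicity guarantees of its cache database, so this argument holds only under the assumptions stated here.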
Step S104: put the PNR number into the cache database and store it in correspondence with the passenger ID, and at the same time store the PNR data corresponding to the PNR number in the cache database;
Specifically, when a passenger makes a reservation for the first time, a passenger ID corresponding to the passenger identity information is first generated in the cache database, and the acquired PNR data is then stored in the cache database together with the PNR number.
When it is not the passenger's first reservation, the acquired PNR data is stored in the cache database together with the PNR number directly.
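The storage step can be sketched as follows, using in-memory dictionaries as a stand-in for the cache database. The class name `PnrCache`, its field names, the tuple used as the identity key, and the lock that makes lookup-or-create atomic are all illustrative assumptions; the source does not specify how the cache database is implemented.

```python
import itertools
import threading


class PnrCache:
    """Hypothetical in-memory stand-in for the cache database."""

    def __init__(self):
        self._lock = threading.Lock()
        self._ids = itertools.count(1)
        self.passenger_id_by_identity = {}  # identity key -> passenger ID
        self.pnr_numbers_by_passenger = {}  # passenger ID -> set of PNR numbers
        self.pnr_data_by_number = {}        # PNR number -> PNR data

    def store(self, identity_key, pnr_number, pnr_data):
        """Store one PNR under its passenger ID, creating the ID on first booking."""
        with self._lock:  # assumed: lookup-or-create must be atomic across workers
            passenger_id = self.passenger_id_by_identity.get(identity_key)
            if passenger_id is None:
                passenger_id = next(self._ids)
                self.passenger_id_by_identity[identity_key] = passenger_id
            self.pnr_numbers_by_passenger.setdefault(passenger_id, set()).add(pnr_number)
            self.pnr_data_by_number[pnr_number] = pnr_data
            return passenger_id


cache = PnrCache()
identity = ("ZHANGSAN", "NI", "encrypted-id-1")  # made-up identity key
first = cache.store(identity, "PNR1", {"cancelled": False})
second = cache.store(identity, "PNR2", {"cancelled": False})
```

Because both calls resolve to the same passenger ID, PNR1 and PNR2 end up in one set and can be compared with each other even if they arrived almost simultaneously.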
Step S105: based on the passenger ID and the associated PNR numbers, extract all valid target PNR data from the cache database;
It should be noted that, in the cache database, one passenger ID corresponds to all of its associated PNR data.
In this embodiment, valid target PNR data refers to order data that has not been cancelled.
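A minimal sketch of this extraction, assuming the cache keeps a map from passenger ID to PNR numbers and a map from PNR number to PNR data, and that each PNR record carries a hypothetical boolean `cancelled` field (none of these names come from the source):

```python
def extract_valid_pnr_data(passenger_id, pnr_numbers_by_passenger, pnr_data_by_number):
    """Return the non-cancelled PNR records linked to one passenger ID."""
    numbers = pnr_numbers_by_passenger.get(passenger_id, set())
    return [
        pnr_data_by_number[number]
        for number in sorted(numbers)
        if not pnr_data_by_number[number]["cancelled"]
    ]


pnr_numbers_by_passenger = {1: {"PNR1", "PNR2", "PNR3"}}
pnr_data_by_number = {
    "PNR1": {"pnr": "PNR1", "cancelled": False},
    "PNR2": {"pnr": "PNR2", "cancelled": True},
    "PNR3": {"pnr": "PNR3", "cancelled": False},
}
valid = extract_valid_pnr_data(1, pnr_numbers_by_passenger, pnr_data_by_number)
```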
Step S106: according to the preset duplicate reservation identification rules, compare all of the target PNR data for duplicate orders and determine whether the passenger has duplicate orders; if so, perform step S107;
Duplicate orders determined according to the preset duplicate reservation identification rules satisfy the following conditions: (1) the passenger name and identity ID are the same; (2) the flight origin or the flight destination (airport) is the same; (3) the times meet the requirements below (the cabin class, flight number and booking responsibility group in duplicate PNR data may differ):
A. When the segment is domestic (the countries of both the departure airport and the arrival airport are CN), the departure or arrival airport is the same, and the departure times of the two segments fall within a first preset time range, the segments are considered duplicate segments;
B. When the segment is international (the country of at least one of the departure airport and the arrival airport is not CN), the departure or arrival airport is the same, and the departure times of the two segments fall within a second preset time range, the segments are considered duplicate segments.
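Rules A and B can be sketched as below. The `Segment` type, the window values, the inclusive comparison, and the choice to apply the domestic window only when both segments are domestic are assumptions made for illustration; the source leaves the two preset time ranges unspecified.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Segment:
    origin: str               # departure airport code
    origin_country: str
    destination: str          # arrival airport code
    destination_country: str
    departure: datetime


def is_domestic(segment):
    return segment.origin_country == "CN" and segment.destination_country == "CN"


def is_duplicate_segment(a, b, domestic_window, international_window):
    """Apply rule A or B: a shared airport plus close departure times."""
    if a.origin != b.origin and a.destination != b.destination:
        return False
    # Assumption: rule A applies only when both segments are domestic.
    both_domestic = is_domestic(a) and is_domestic(b)
    window = domestic_window if both_domestic else international_window
    return abs(a.departure - b.departure) <= window


# Placeholder windows; the source does not state the preset ranges.
DOMESTIC = timedelta(hours=4)
INTERNATIONAL = timedelta(hours=24)

pek_sha = Segment("PEK", "CN", "SHA", "CN", datetime(2021, 11, 1, 8, 0))
pek_can = Segment("PEK", "CN", "CAN", "CN", datetime(2021, 11, 1, 10, 0))
pek_sha_next_day = Segment("PEK", "CN", "SHA", "CN", datetime(2021, 11, 2, 8, 0))
```

With these placeholder values, `pek_sha` and `pek_can` share a departure airport and leave two hours apart, so they count as duplicate segments, while `pek_sha_next_day` falls outside the domestic window.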
Specifically, the criterion for (1), whether the passenger name and identity ID are the same, is that the passenger name is duplicated and the identity ID is duplicated.
1) Passenger names are considered duplicated when the English names are the same.
For an example, see Table 1, which assumes that the duplicate reservation identification system receives two PNRs, PNR1 and PNR2.
Table 1
Figure PCTCN2021130027-appb-000001
Figure PCTCN2021130027-appb-000002
2) Identity IDs are considered duplicated when the two passengers used the same identity card, passport or frequent flyer card.
"Same passenger" is defined as follows: the system maintains each passenger's English name, identity ID type and identity ID content. One passenger may have several kinds of document information. Passengers with the same English name and the same identity ID are assigned the same passenger ID. If the same passenger appears in a single PNR with different kinds of identity ID, those identity IDs are also assigned the same passenger ID. For example, if a passenger has entered both an identity card number and a passport number in one PNR, and that passenger later books one PNR with the identity card number and another with the passport number, the bookings are still recognized as belonging to the same passenger.
The identity ID types that are parsed are as follows:
**********<FOID>**********
Use SSR FOID to enter the passenger's identity information
Command format:
SSR FOID AIRLINE-CODE HK/document code and number/Pn
Notes: 1) The document codes are NI (identity card) and PP (passport);
2) Only one FOID identity entry can be entered per passenger; if it is entered incorrectly, it must be deleted and re-entered. A multi-passenger PNR must specify which passenger the information belongs to; a single-passenger PNR may omit this.
Examples:
SSR FOID CA HK/NI110108200306016012/P1
SSR FOID CA HK/PP112233/P2
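A parser for lines of this shape might look like the sketch below. The regular expression is inferred only from the command format and the two examples above; the two-character airline code and the alphanumeric document number are assumptions, and `parse_foid` is a hypothetical helper name.

```python
import re

FOID_PATTERN = re.compile(
    r"^SSR FOID (?P<airline>[A-Z0-9]{2}) HK/"
    r"(?P<doc_type>NI|PP)(?P<number>[A-Z0-9]+)"
    r"(?:/P(?P<passenger>\d+))?$"
)


def parse_foid(line):
    """Parse one SSR FOID line into a dict, or return None if it does not match."""
    match = FOID_PATTERN.match(line.strip())
    if match is None:
        return None
    passenger = match.group("passenger")
    return {
        "airline": match.group("airline"),
        "doc_type": match.group("doc_type"),
        "number": match.group("number"),
        # The /Pn reference may be omitted in a single-passenger PNR.
        "passenger_index": int(passenger) if passenger else None,
    }
```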
**********<PSPT>**********
Use SSR PSPT to enter the passenger's passport information
Command format:
SSR PSPT AIRLINE-CODE HK1/passport number/nationality/passenger date of birth/passenger surname/passenger given name/gender and infant indicator/holder indicator/Pn
Notes:
1) Gender and infant indicators: M means MALE and F means FEMALE; MI means male infant and FI means female infant.
2) Only one passport entry can be entered per passenger; if it is entered incorrectly, it must be deleted and re-entered. A multi-passenger PNR must specify which passenger the information belongs to; a single-passenger PNR may omit this.
3) The passenger surname and given name need to be entered only in two cases: A. the passenger name in the PNR differs from the passenger name in the passport; B. passport information is being entered for an infant.
Examples:
Enter passport information for TEST/NAME, a Chinese male infant born on 20 April 2002; the passport number is 1234567890123456, and the passport holder is the first passenger in the PNR.
SSR PSPT CA HK1/1234567890123456/CN/20APR02/TEST/NAME/MI/H/P1
Enter passport information for a Chinese male born on 20 April 1970; the passport number is 123456789, and this person is the first passenger in the PNR.
SSR PSPT CA HK1/123456789/CN/20APR70///M/P1
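The slash-delimited layout above can be split into named fields as in the following sketch. `parse_pspt` and its dictionary keys are hypothetical; treating the holder indicator as optional is an inference from the second example, which omits it.

```python
import re


def parse_pspt(line):
    """Split one SSR PSPT line into named fields, or return None if malformed."""
    head, _, body = line.strip().partition("/")
    tokens = head.split()
    if len(tokens) != 4 or tokens[:2] != ["SSR", "PSPT"]:
        return None
    fields = body.split("/")
    if len(fields) < 6:
        return None
    passport, nationality, birth_date, surname, given_name, gender = fields[:6]
    trailing = fields[6:]  # optional holder indicator and passenger reference
    refs = [item for item in trailing if re.fullmatch(r"P\d+", item)]
    return {
        "airline": tokens[2],
        "passport_number": passport,
        "nationality": nationality,
        "birth_date": birth_date,          # kept as the raw DDMMMYY string
        "surname": surname or None,        # empty unless required by note 3
        "given_name": given_name or None,
        "gender": gender,
        "is_holder": "H" in trailing,
        "passenger_index": int(refs[0][1:]) if refs else None,
    }
```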
********<API usage format>********
Background:
The United States required airlines to begin using PAXLST based on the UN/EDIFACT standard from 4 October 2005; Canada began using it on 1 November 2005. Compared with the existing API information, the US PAXLST adds the country of residence, the US address (not required for US citizens or holders of a US resident card) and the passport expiry date. The Canadian PAXLST adds the country of residence and the passport expiry date. PAXLST normally requires only one kind of valid document information and no more than two; passport information is preferred. Every passenger, including infants, must hold at least one valid document.
*DOCS
Command format:
SSR:DOCS airline code Action-Code1 document type/issuing country/document number/nationality/date of birth/gender/document expiry date
/SURNAME (surname)/FIRST-NAME (given name)/MID-NAME (middle name)/holder indicator H/P1
Example:
SSR:DOCS CA HK1P/CHN/143810297/CHN/24APR67/M/23APR02
ZHANG/DALONG/P1
Explanation:
Document type: P               Passport issuing country: CHN
Passport number: 143810297     Nationality: CHN
Date of birth: 24APR67         Gender: M
Document expiry date: 23APR02
Surname: ZHANG                 Given name: DALONG
Note: only type P of the DOCS document types is considered.
**********<FQTV>**********
Use SSR FQTV to enter frequent flyer information
Command format:
SSR FQTV AIRLINE-CODE HK/document code and number/Pn
Notes: 1) The document code is the airline prefix, such as CA or LH (this requirement supports alliance card numbers other than Air China's that are entered by Air China).
2) The frequent flyer number must be a real, valid number.
Example:
SSR FQTV CA HK/CA101599260/P1
Examples of suspected duplicate passengers (the passenger names are the same and the identity information falls into one of the following cases):
Example 1 (Table 2): same identity card number:
Table 2
Passenger in PNR1    NI: 11012340098900233
Passenger in PNR2    NI: 11012340098900233
Example 2 (Table 3): same passport number:
Table 3
Passenger in PNR1    PP: 25782769
Passenger in PNR2    PP: 25782769
Example 3 (Table 4): same frequent flyer card number:
Table 4
Passenger in PNR1    CA: 101599260
Passenger in PNR2    CA: 101599260
Example 4 (Table 5): matching identity card number and frequent flyer card number:
Table 5
Passenger in PNR1    NI: 11012340098900233, CA: 101599260
Passenger in PNR2    NI: 11012340098900233
Passenger in PNR3    CA: 101599260
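The four cases reduce to one pairwise check: the names are equal and at least one (document type, number) pair is shared. The sketch below uses the values from Table 5; the record layout and the name ZHANGSAN are illustrative assumptions.

```python
def is_same_passenger(a, b):
    """Same stripped name and at least one shared (document type, number) pair."""
    return a["name"] == b["name"] and bool(a["ids"] & b["ids"])


pnr1 = {"name": "ZHANGSAN", "ids": {("NI", "11012340098900233"), ("CA", "101599260")}}
pnr2 = {"name": "ZHANGSAN", "ids": {("NI", "11012340098900233")}}
pnr3 = {"name": "ZHANGSAN", "ids": {("CA", "101599260")}}
```

PNR2 and PNR3 share no document directly; they are linked only through PNR1, which is why all identity IDs seen together in one PNR need to map to a single passenger ID.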
It should be noted that whether a passenger is making a first reservation is determined mainly from the passenger identity information. Passenger names in PNR data may carry a wide variety of name suffixes, such as MR, MS and VIP, so before names are checked for duplication the suffix must be stripped to obtain the accurate name, and the stripped names are then compared. When looking up the passenger ID corresponding to the passenger identity information in the cache database, the suffix-stripped passenger name and the identity card number, passport number or frequent flyer card number used for this reservation are used to check whether a passenger ID with the same passenger identity information already exists in the cache database.
The passenger identity information includes any one of the identity card number, passport number or frequent flyer card number used for this reservation, together with the suffix-stripped passenger name.
3) Name suffix stripping
Suffixes of the forms below are stripped, and the result is stored in two fields of a database table: name (after stripping) and suffix (after stripping). In the duplicate order result data, duplicates can be identified between two suffixed names, between two unsuffixed names, and between a suffixed name and an unsuffixed name.
The recognizable name suffixes are mainly of the following types, and the list is configurable:
MR/MS/CHD/MRS/MISS/MSTR/SD/STU/DL/DR/MDM/INF/SC/V/LBR/VVIP/IN/DE/INS/DIPL/CBBG/EXST/MASTER/SEA/EM/MIS/GM/EMI/
STCR/JC/WCHR/WCHS/WCHC/VIP/MAAS/INAD/DEPA/DEPU/DEAF/CHILD/CIP/BLND/MAS/YP/MADAM/AS/LEGL/PETC/SP/VF
Slashes and spaces in names are also stripped.
Example 1: zhang/san is stored as zhangsan.
Example 2: zhang san is stored as zhangsan.
Example 3: HAN/WAI LEEA is stored as HANWAILEEA.
Example 4: VANASS/LEONAD JOHA MR is stored as VANASSLEONADJOHA.
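The stripping rules and the four examples can be reproduced with the sketch below. It assumes that the suffix appears as the last token, separated from the name by a space or slash, and that at most one suffix is removed; the source does not say how suffixes attached directly to a name are handled. `strip_name` is a hypothetical helper.

```python
import re

# Suffix list as given in the text; the source says it is configurable.
NAME_SUFFIXES = frozenset(
    "MR MS CHD MRS MISS MSTR SD STU DL DR MDM INF SC V LBR VVIP IN DE INS DIPL "
    "CBBG EXST MASTER SEA EM MIS GM EMI STCR JC WCHR WCHS WCHC VIP MAAS INAD "
    "DEPA DEPU DEAF CHILD CIP BLND MAS YP MADAM AS LEGL PETC SP VF".split()
)


def strip_name(raw_name):
    """Return (name, suffix) with slashes, spaces and one trailing suffix removed."""
    tokens = [token for token in re.split(r"[/\s]+", raw_name.strip()) if token]
    suffix = ""
    # Assumption: the suffix is the last token and never the only token.
    if len(tokens) > 1 and tokens[-1].upper() in NAME_SUFFIXES:
        suffix = tokens.pop().upper()
    return "".join(tokens), suffix
```

Returning the suffix alongside the name mirrors the two stored fields, so a suffixed and an unsuffixed booking still compare equal on the name field.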
(2) The criteria for judging that the flight origin or flight destination (airport) is the same (that is, that the segments are duplicates) are:
1) The segment is a valid segment;
2) Segments are similar when the departure city or the arrival city is the same and, where the departure city is the same, the departure times are close, or, where the arrival city is the same, the arrival times are close.
It should be noted that suspected duplicate matching is described in Table 6.
Table 6
Figure PCTCN2021130027-appb-000003
Figure PCTCN2021130027-appb-000004
It should also be noted that, when checking for other duplicate segments, the present invention supports controlling whether detection is extended across the whole alliance. For example, FM and MU belong to the same alliance, so segments with duplicate reservations within the alliance can be identified.
It should be particularly noted that, to protect passengers' private data, the document information in PNR orders (identity card number, passport number and frequent flyer card number) is encrypted; the encryption method may be SM4. For the specific encryption process, reference may be made to existing mature solutions, which are not described here.
Step S107: save the duplicate orders in the main database.
Once the duplicate orders are saved in the main database, they are available to subsequent query or cleanup functions. Specifically, an automatic cleanup module may read the duplicate orders from the main database, clean them up, and save the cleanup results to the main database.
It should be noted that the processing flow of the duplicate reservation identification system disclosed in the present invention is shown in FIG. 2, in which PNR1 and PNR2 are only an example; in practice, the number of PNRs obtained by the duplicate reservation identification system depends on the actual application.
In summary, the present invention discloses a duplicate reservation identification method that parses the acquired PNR data; extracts passenger identity information and flight information from the PNR data; looks up, in the cache database, the passenger ID corresponding to the passenger identity information, the passenger ID having been generated in the cache database when the passenger first made a reservation; puts the PNR number into the cache database and stores it in correspondence with the passenger ID, while storing the PNR data corresponding to the PNR number in the cache database; extracts all valid target PNR data from the cache database based on the passenger ID and the associated PNR numbers; compares all of the target PNR data for duplicate orders according to the preset duplicate reservation identification rules; and saves the duplicate orders so determined in the main database. The present invention introduces a cache database to hold the currently valid PNR data and generates the passenger ID corresponding to the passenger identity information directly in the cache database when the passenger first makes a reservation. When real-time PNR data is received, it is placed in the cache database first, before duplicate reservation identification, so that the passenger ID corresponds to all of its associated PNR data. Duplicate reservations can then be identified by comparing all valid PNR data associated with the same passenger ID for duplicate orders.
In addition, by generating the passenger ID corresponding to the passenger identity information directly in the cache database, the present invention allows a passenger ID to correspond to all of its associated PNR data, which effectively avoids the problem in traditional schemes where distributed parallel processing makes it impossible to perform duplicate reservation identification across multiple PNRs of the same passenger. Moreover, compared with the main database, the cache database greatly improves read/write efficiency and can shorten read and write operations to the millisecond level.
It should be noted that the flowcharts and block diagrams in the drawings illustrate the architecture, functionality and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in a flowchart or block diagram may represent a module, program segment, or portion of code that contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functionality involved. It should further be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
Computer program code for carrying out the operations of the present disclosure may be written in one or more programming languages or a combination thereof, including but not limited to object-oriented programming languages such as Java, Smalltalk and C++, and conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. Where a remote computer is involved, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
Corresponding to the above method embodiment, the present invention also discloses a duplicate reservation identification apparatus.
Referring to FIG. 3, a schematic structural diagram of a duplicate reservation identification apparatus disclosed in an embodiment of the present invention, the apparatus includes:
an acquisition unit 201, configured to acquire PNR data, the PNR data including a PNR number;
a parsing unit 202, configured to parse the PNR data and extract passenger identity information and flight information from the PNR data;
The passenger identity information may include the passenger name, identity card number, passport number, frequent flyer card number, and so on.
The flight information may include flight booking information, flight origin, flight destination, flight number, departure date, arrival date, and so on.
a lookup unit 203, configured to look up, in the cache database, the passenger ID corresponding to the passenger identity information, the passenger ID being generated in the cache database when the passenger first makes a reservation;
In practice, a passenger may use any one or more of an identity card number, a passport number and a frequent flyer card number for each reservation, so the passenger's identity needs to be identified.
To facilitate subsequent duplicate identification across multiple orders of the same passenger, the present invention generates, for each passenger, a passenger ID in the cache database that uniquely identifies that passenger.
It should be noted that:
A) When no passenger ID corresponding to the passenger identity information is found in the cache database, a passenger ID corresponding to the passenger identity information is generated in the cache database.
Accordingly, the duplicate reservation identification apparatus may further include a generating unit, configured to generate, in the cache database, a passenger ID corresponding to the passenger identity information when no such passenger ID is found in the cache database, before the first storage unit 204 puts the PNR number into the cache database, stores it in correspondence with the passenger ID, and stores the PNR data corresponding to the PNR number in the cache database.
B) If multiple passenger IDs corresponding to the passenger identity information are found in the cache database, the multiple passenger IDs need to be merged, so that a passenger with multiple documents has only one unique passenger ID.
Example 1: merging passenger IDs for different identity cards
Processing order 1 (English name + identity card 1): build the key "real English name + encrypted identity card number 1" and query the cache database; there is no result, so an entry is stored in the cache database with "real English name + encrypted identity card number 1" as the key and passenger ID1 as the value;
Processing order 2 (English name + identity card 2): build the key "real English name + encrypted identity card number 2" and query the cache database; there is no result, so an entry is stored in the cache database with "real English name + encrypted identity card number 2" as the key and passenger ID2 as the value;
Processing order 3 (English name + identity card number 1 + identity card number 2): build the keys "real English name + encrypted identity card number 1" and "real English name + encrypted identity card number 2" by iterating over the document types, and query the cache database. Both keys are found, with two different passenger ID values. The passenger ID of order 1 or order 2 is chosen at random as the final passenger ID (for example ID1), and the passenger ID values of order 1 and order 2 are updated so that the orders are aggregated into the same person.
Example 2: merging passenger IDs for an identity card and a passport
Processing order 1 (English name + identity card): build the key "real English name + encrypted identity card number" and query the cache database; there is no result, so an entry is stored in the cache database with "real English name + encrypted identity card number" as the key and passenger ID1 as the value;
Processing order 2 (English name + passport): build the key "real English name + encrypted passport number" and query the cache database; there is no result, so an entry is stored in the cache database with "real English name + encrypted passport number" as the key and passenger ID2 as the value;
Processing order 3 (English name + identity card + passport): build the keys "real English name + encrypted identity card number" and "real English name + encrypted passport number" by iterating over the document types, and query the cache database. Both keys are found, with two different passenger ID values. The passenger ID of order 1 or order 2 is chosen at random as the final passenger ID (for example ID1), and the passenger ID values of order 1 and order 2 are updated (both to ID1) so that the orders are aggregated into the same person.
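The two merge examples follow the same lookup-and-merge pattern, sketched below with a plain dictionary standing in for the cache database. The function name, the key tuples and the ID generator are hypothetical, and the sketch always keeps the smallest existing ID so that it is deterministic, whereas the text chooses one at random.

```python
import itertools


def resolve_passenger_id(id_by_key, keys, new_id):
    """Look up every identity key of one order and merge differing passenger IDs."""
    found = sorted({id_by_key[key] for key in keys if key in id_by_key})
    if found:
        # The text picks one of the existing IDs at random; the smallest is
        # used here only to keep the sketch deterministic.
        final_id = found[0]
        for key, value in list(id_by_key.items()):
            if value in found:
                id_by_key[key] = final_id
    else:
        final_id = new_id()  # first reservation: generate a new passenger ID
    for key in keys:
        id_by_key[key] = final_id
    return final_id


counter = itertools.count(1)
id_by_key = {}
id_card = ("ZHANGSAN", "NI", "enc-id-card")    # made-up encrypted values
passport = ("ZHANGSAN", "PP", "enc-passport")
order1 = resolve_passenger_id(id_by_key, [id_card], lambda: next(counter))
order2 = resolve_passenger_id(id_by_key, [passport], lambda: next(counter))
order3 = resolve_passenger_id(id_by_key, [id_card, passport], lambda: next(counter))
```

A fuller implementation would also need to move the PNR numbers stored under the discarded passenger ID to the surviving one; that bookkeeping is omitted here.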
In practice, the present invention can determine whether a passenger is making a first reservation by checking whether a passenger ID corresponding to the passenger identity information is stored in the cache database. If not, the passenger has not made a reservation before, and a new passenger ID is generated in the cache database. If so, it is not the passenger's first reservation: when the passenger books again with multiple documents, the passenger IDs are merged where necessary; otherwise the previously generated passenger ID is used directly.
a first storage unit 204, configured to put the PNR number into the cache database and store it in correspondence with the passenger ID, and at the same time store the PNR data corresponding to the PNR number in the cache database;
Specifically, when a passenger makes a reservation for the first time, a passenger ID corresponding to the passenger identity information is first generated in the cache database, and the acquired PNR data is then stored in the cache database together with the PNR number.
When it is not the passenger's first reservation, the acquired PNR data is stored in the cache database together with the PNR number directly.
an extraction unit 205, configured to extract all valid target PNR data from the cache database based on the passenger ID and the associated PNR numbers;
It should be noted that, in the cache database, one passenger ID corresponds to all of its associated PNR data.
In this embodiment, valid target PNR data refers to order data that has not been cancelled.
重复订单判断单元206,用于按照预设重复订座识别规则,将所有的所述目标PNR数据进行重复订单比对,判断旅客是否存在重复订单;The duplicate order judgment unit 206 is configured to compare all the target PNR data with duplicate orders according to the preset duplicate reservation identification rules, and judge whether the passenger has duplicate orders;
其中,按照预设重复订座识别规则确定的重复订单满足条件:(一)旅客姓名和身份ID相同;(二)航班始发地或航班达到地(机场)相同;(三)时间符合如下要求的重复PNR数据(重复PNR数据中的舱位、航班号和订座责任组可不同),如下:Among them, the duplicate orders determined according to the preset duplicate reservation identification rules meet the conditions: (1) the passenger name and ID are the same; (2) the flight origin or the flight arrival place (airport) is the same; (3) the time meets the following requirements The duplicate PNR data of (the class, flight number and booking responsibility group in the duplicate PNR data can be different), as follows:
A、当航段为国内航段时(起飞机场到达机场对应的国家均为CN),起飞或到达机场相同,两个航段的起飞时间在第一预设时间范围内则认为是重复航段;A. When the flight segment is a domestic flight segment (the country corresponding to the departure airport and the arrival airport is CN), the departure or arrival airport is the same, and the departure time of the two flight segments is within the first preset time range. part;
B、当航段为国际航段时(起飞机场到达机场对应的国家至少有一个不为CN),起飞或到达机场相同,两个航段的起飞时间在第二预设时间范围内则认为是重复航段。B. When the flight segment is an international flight segment (at least one of the countries corresponding to the departure airport and the arrival airport is not CN), the departure or arrival airports are the same, and the departure time of the two flight segments is considered to be within the second preset time range. is a repeating segment.
The second storage unit 207 is configured to save the duplicate order in the main database when the duplicate order judgment unit determines that a duplicate order exists.
Once the duplicate orders are saved in the main database, they are available to subsequent query or cleaning functions. Specifically, an automatic cleaning module may read the duplicate orders from the main database, clean them, and save the cleaning results to the main database.
It should be noted that the processing performed by the duplicate reservation identification system disclosed in the present invention is shown in FIG. 2, where PNR1 and PNR2 are only an example. In practice, the number of PNRs obtained by the duplicate reservation identification system depends on the actual application.
In summary, the present invention discloses a duplicate reservation identification apparatus. The apparatus parses the acquired PNR data and extracts the passenger identity information and flight information from it. It looks up, in the cache database, the passenger ID corresponding to the passenger identity information; this passenger ID is generated in the cache database when the passenger makes a reservation for the first time. It places the PNR number in the cache database, stored in correspondence with the passenger ID, and at the same time stores the PNR data corresponding to the PNR number in the cache database. Based on the passenger ID and the associated PNR numbers, it extracts all valid target PNR data from the cache database, performs duplicate-order comparison across all the target PNR data according to the preset duplicate reservation identification rules, and saves the duplicate orders so determined in the main database. The present invention introduces a cache database to hold the currently valid PNR data, and generates the passenger ID corresponding to the passenger identity information directly in the cache database when the passenger makes a reservation for the first time. When real-time PNR data is received, it is first placed in the cache database, before duplicate reservation identification, so that the passenger ID corresponds to all of its associated PNR data. Duplicate reservations can then be identified by performing duplicate-order comparison across all valid PNR data associated with the same passenger ID.
In addition, because the passenger ID corresponding to the passenger identity information is generated directly in the cache database, the passenger ID corresponds to all of its associated PNR data. This effectively avoids the problem in traditional solutions, where distributed parallel processing makes it impossible to perform duplicate reservation identification across multiple PNRs of the same passenger. Moreover, compared with the main database, the cache database greatly improves read and write efficiency and can shorten read and write operations to the millisecond level.
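The cache-first flow summarized above can be sketched roughly as follows. The source does not name a cache product or a data layout, so this sketch uses plain in-memory dictionaries as a stand-in for the cache database; the class `ReservationCache`, its method names, the identity key of (name, document number) and the `valid` flag on the PNR data are all hypothetical.

```python
import itertools


class ReservationCache:
    """In-memory stand-in for the cache database (illustrative only)."""

    def __init__(self):
        self._next_id = itertools.count(1)
        self._passenger_ids = {}  # identity key -> passenger ID
        self._pnr_numbers = {}    # passenger ID -> set of PNR numbers
        self._pnr_data = {}       # PNR number -> PNR data (dict)

    def passenger_id_for(self, name, doc_number):
        # Look up the passenger ID; generate one on the first reservation.
        key = (name, doc_number)
        if key not in self._passenger_ids:
            self._passenger_ids[key] = next(self._next_id)
        return self._passenger_ids[key]

    def store_pnr(self, passenger_id, pnr_number, pnr_data):
        # Store the PNR number against the passenger ID, and the PNR data
        # against the PNR number, before any duplicate check is run.
        self._pnr_numbers.setdefault(passenger_id, set()).add(pnr_number)
        self._pnr_data[pnr_number] = pnr_data

    def valid_pnrs(self, passenger_id):
        # Return every still-valid PNR associated with the passenger ID.
        numbers = sorted(self._pnr_numbers.get(passenger_id, ()))
        return [self._pnr_data[n] for n in numbers
                if self._pnr_data[n].get("valid", True)]
```

A caller would obtain the passenger ID, store the incoming PNR, and then pass the result of `valid_pnrs` to the duplicate-order comparison, so that every PNR of the same passenger is visible in one place regardless of which worker received it.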
In the above embodiment, for determining whether names are duplicated, the duplicate reservation identification apparatus may further include:
a name duplication judgment unit, configured to strip the suffix from the passenger name and to determine whether the passenger names are duplicated after the suffix has been stripped.
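A minimal sketch of suffix stripping followed by name comparison is given below. The source does not list which suffixes are stripped, so the `NAME_SUFFIXES` tuple (common title and passenger-type markers) is an assumption, as are the function names. The sketch only removes a suffix that appears as a separate trailing token; a suffix fused onto the given name without a space is left untouched because it cannot be separated unambiguously.

```python
# Hypothetical suffix list: the source does not enumerate the suffixes.
NAME_SUFFIXES = ("MSTR", "MISS", "MRS", "MR", "MS", "CHD", "INF")


def strip_name_suffix(name: str) -> str:
    # Normalize case and whitespace, then drop one trailing suffix token.
    parts = name.upper().split()
    if len(parts) > 1 and parts[-1] in NAME_SUFFIXES:
        parts = parts[:-1]
    return " ".join(parts)


def names_match(a: str, b: str) -> bool:
    # Names are treated as duplicated when equal after suffix stripping.
    return strip_name_suffix(a) == strip_name_suffix(b)
```

For example, under these assumptions `names_match("LI/SI MRS", "LI/SI")` evaluates to true, while `names_match("LI/SI", "LI/WU")` does not.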
The units described in the embodiments of the present disclosure may be implemented in software or in hardware. In some cases the name of a unit does not limit the unit itself; for example, the first acquisition unit may also be described as "a unit that acquires at least two Internet Protocol addresses".
The functions described above may be performed, at least in part, by one or more hardware logic components. For example, and without limitation, exemplary types of hardware logic components that may be used include field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), application specific standard products (ASSPs), systems on chip (SOCs), complex programmable logic devices (CPLDs), and so on.
It should be noted in particular that, for the specific working principle of each component in the apparatus embodiment, reference is made to the corresponding part of the method embodiment; it is not repeated here.
Finally, it should also be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between those entities or operations. Moreover, the terms "include", "comprise" and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article or device that includes a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to that process, method, article or device. Without further limitation, an element introduced by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article or device that includes the element.
The embodiments in this specification are described progressively. Each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may be understood by reference to one another.
The above description of the disclosed embodiments enables a person skilled in the art to implement or use the present invention. Various modifications to these embodiments will be apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the invention. Accordingly, the present invention is not limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (14)

  1. A duplicate reservation identification method, characterized by comprising:
    acquiring PNR data, the PNR data including a PNR number;
    parsing the PNR data, and extracting passenger identity information and flight information from the PNR data;
    looking up, in a cache database, the passenger ID corresponding to the passenger identity information, the passenger ID being generated in the cache database when the passenger makes a reservation for the first time;
    placing the PNR number in the cache database and storing it in correspondence with the passenger ID, and at the same time storing the PNR data corresponding to the PNR number in the cache database;
    extracting all valid target PNR data from the cache database based on the passenger ID and the associated PNR numbers;
    performing duplicate-order comparison across all the target PNR data according to preset duplicate reservation identification rules, and determining whether the passenger has duplicate orders;
    if so, saving the duplicate orders in a main database.
  2. The duplicate reservation identification method according to claim 1, characterized in that, before placing the PNR number in the cache database and storing it in correspondence with the passenger ID, and at the same time storing the PNR data corresponding to the PNR number in the cache database, the method further comprises:
    when no passenger ID corresponding to the passenger identity information is found in the cache database, generating in the cache database a passenger ID corresponding to the passenger identity information.
  3. The duplicate reservation identification method according to claim 1, characterized in that a duplicate order determined according to the preset duplicate reservation identification rules satisfies the following conditions:
    the passenger name and identity ID are the same;
    the flight origin or flight destination is the same;
    the PNR data are duplicate PNR data whose times meet the following requirements:
    A. when the segment is a domestic segment, the departure or arrival airport is the same, and the departure times of the two segments fall within a first preset time range, the two segments are considered duplicate segments;
    B. when the segment is an international segment, the departure or arrival airport is the same, and the departure times of the two segments fall within a second preset time range, the two segments are considered duplicate segments.
  4. The duplicate reservation identification method according to claim 3, characterized in that the criterion for determining whether the passenger name and the identity ID are the same is: the passenger name is duplicated and the identity ID is duplicated;
    the passenger name is determined to be duplicated when the English names are the same;
    the identity ID is determined to be duplicated when the two passengers use the same identity card, passport or frequent flyer card.
  5. The duplicate reservation identification method according to claim 4, characterized in that the process of determining that the passenger name is duplicated comprises:
    stripping the suffix from the passenger name, and determining whether the passenger names are duplicated after the suffix has been stripped.
  6. The duplicate reservation identification method according to claim 5, characterized in that the passenger identity information includes: any one of the identity card number, passport number or frequent flyer card number used for the current reservation, and the passenger name with the name suffix stripped.
  7. The duplicate reservation identification method according to claim 3, characterized in that the criterion for determining that the flight origin or flight destination is the same is: the segment is a valid segment; segments are similar when the departure city or the arrival city is the same and, where the departure city is the same, the departure times are close to each other, and where the arrival city is the same, the arrival times are close to each other.
  8. A duplicate reservation identification apparatus, characterized by comprising:
    an acquisition unit, configured to acquire PNR data, the PNR data including a PNR number;
    a parsing unit, configured to parse the PNR data and extract passenger identity information and flight information from the PNR data;
    a lookup unit, configured to look up, in a cache database, the passenger ID corresponding to the passenger identity information, the passenger ID being generated in the cache database when the passenger makes a reservation for the first time;
    a first storage unit, configured to place the PNR number in the cache database and store it in correspondence with the passenger ID, and at the same time store the PNR data corresponding to the PNR number in the cache database;
    an extraction unit, configured to extract all valid target PNR data from the cache database based on the passenger ID and the associated PNR numbers;
    a duplicate order judgment unit, configured to perform duplicate-order comparison across all the target PNR data according to preset duplicate reservation identification rules, and to determine whether the passenger has duplicate orders;
    a second storage unit, configured to save the duplicate orders in a main database when the duplicate order judgment unit determines that a duplicate order exists.
  9. The duplicate reservation identification apparatus according to claim 8, characterized by further comprising:
    a generation unit, configured to, before the first storage unit places the PNR number in the cache database and stores it in correspondence with the passenger ID, and at the same time stores the PNR data corresponding to the PNR number in the cache database, generate in the cache database a passenger ID corresponding to the passenger identity information when no passenger ID corresponding to the passenger identity information is found in the cache database.
  10. The duplicate reservation identification apparatus according to claim 8, characterized in that a duplicate order determined according to the preset duplicate reservation identification rules satisfies the following conditions:
    the passenger name and identity ID are the same;
    the flight origin or flight destination is the same;
    the PNR data are duplicate PNR data whose times meet the following requirements:
    A. when the segment is a domestic segment, the departure or arrival airport is the same, and the departure times of the two segments fall within a first preset time range, the two segments are considered duplicate segments;
    B. when the segment is an international segment, the departure or arrival airport is the same, and the departure times of the two segments fall within a second preset time range, the two segments are considered duplicate segments.
  11. The duplicate reservation identification apparatus according to claim 10, characterized in that the criterion for determining whether the passenger name and the identity ID are the same is: the passenger name is duplicated and the identity ID is duplicated;
    the passenger name is determined to be duplicated when the English names are the same;
    the identity ID is determined to be duplicated when the two passengers use the same identity card, passport or frequent flyer card.
  12. The duplicate reservation identification apparatus according to claim 11, characterized by further comprising:
    a name duplication judgment unit, configured to strip the suffix from the passenger name and to determine whether the passenger names are duplicated after the suffix has been stripped.
  13. The duplicate reservation identification apparatus according to claim 12, characterized in that the passenger identity information includes: any one of the identity card number, passport number or frequent flyer card number used for the current reservation, and the passenger name with the name suffix stripped.
  14. The duplicate reservation identification apparatus according to claim 10, characterized in that the criterion for determining that the flight origin or flight destination is the same is: the segment is a valid segment; segments are similar when the departure city or the arrival city is the same and, where the departure city is the same, the departure times are close to each other, and where the arrival city is the same, the arrival times are close to each other.
PCT/CN2021/130027 2020-11-19 2021-11-11 Duplicate reservation identification method and apparatus WO2022105666A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011302293.0 2020-11-19
CN202011302293.0A CN112214520A (en) 2020-11-19 2020-11-19 Repeated seat reservation identification method and device

Publications (1)

Publication Number Publication Date
WO2022105666A1 true WO2022105666A1 (en) 2022-05-27

Family

ID=74067895

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/130027 WO2022105666A1 (en) 2020-11-19 2021-11-11 Duplicate reservation identification method and apparatus

Country Status (2)

Country Link
CN (1) CN112214520A (en)
WO (1) WO2022105666A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115168456A (en) * 2022-09-07 2022-10-11 中国民航信息网络股份有限公司 Flight sales process feature acquisition method and device, storage medium and electronic equipment

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112214520A (en) * 2020-11-19 2021-01-12 中国民航信息网络股份有限公司 Repeated seat reservation identification method and device
CN113313277A (en) * 2021-06-10 2021-08-27 中国民航信息网络股份有限公司 Information processing method and device
CN115345335B (en) * 2022-08-23 2024-03-19 中国民航信息网络股份有限公司 Processing method and device for passenger name in civil aviation open passenger booking system
CN116483868A (en) * 2023-04-14 2023-07-25 首约科技(北京)有限公司 Method, device, equipment, medium and program for improving capacity response efficiency

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2515262A1 (en) * 2011-04-18 2012-10-24 Amadeus S.A.S. De-synchronization monitoring system and method
CN106776811A (en) * 2016-11-23 2017-05-31 李天� data index method and device
CN107392682A (en) * 2017-09-13 2017-11-24 沈阳东知科技有限公司 A kind of customer information processing system and processing method by all kinds of means
CN107862396A (en) * 2017-10-27 2018-03-30 携程旅游网络技术(上海)有限公司 Stroke order repeats predetermined process method, system, storage medium and electronic equipment
CN110750217A (en) * 2019-10-18 2020-02-04 北京浪潮数据技术有限公司 Information management method and related device
CN112214520A (en) * 2020-11-19 2021-01-12 中国民航信息网络股份有限公司 Repeated seat reservation identification method and device

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115168456A (en) * 2022-09-07 2022-10-11 中国民航信息网络股份有限公司 Flight sales process feature acquisition method and device, storage medium and electronic equipment
CN115168456B (en) * 2022-09-07 2022-11-25 中国民航信息网络股份有限公司 Flight sales process feature acquisition method and device, storage medium and electronic equipment

Also Published As

Publication number Publication date
CN112214520A (en) 2021-01-12

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21893813

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21893813

Country of ref document: EP

Kind code of ref document: A1