How To: Parse the cancellation policies in the remarks
The following regex identifies the policies in the remarks section (use case-insensitive matching):
In C#.NET (note that backslashes are escaped and duplicate group names are allowed) -
STARTING (?<startdate>\\d{1,2}/\\d{1,2}/\\d{4}) ((?:OR|AND) NO SHOW[ ]+)?CXL[ ]*-[ ]*PENALTY FEE[ ]+IS (?:(?:PRICE OF )?(?<value>\\d+(\\.\\d{1,2})?)(?:(?<pct>%)[ ]+OF (?<nights>\\d)? ?(?<based>(?:FIRST NIGHT|BOOKING) PRICE|TOTAL|NIGHTS)| (?<apply>NIGHTS|\\w{3}\\b))|(?:CNX FEE - (?<nights>\\d+) (?<based>NT) X (?<value>\\d{1,3}(?:\\.\\d{1,2})?)(?<pct>%)))
In other flavors duplicate group names are not allowed, so in the second alternative the groups "nights", "based", "value" and "pct" are suffixed with 1. These flavors also use a different syntax for named groups and escaped characters.
In Python -
STARTING (?P<startdate>\d{1,2}/\d{1,2}/\d{4}) ((?:OR|AND) NO SHOW[ ]+)?CXL[ ]*-[ ]*PENALTY FEE[ ]+IS (?:(?:PRICE OF )?(?P<value>\d+(\.\d{1,2})?)(?:(?P<pct>%)[ ]+OF (?P<nights>\d)? ?(?P<based>(?:FIRST NIGHT|BOOKING) PRICE|TOTAL|NIGHTS)| (?P<apply>NIGHTS|\w{3}\b))|(?:CNX FEE - (?P<nights1>\d+) (?P<based1>NT) X (?P<value1>\d{1,3}(?:\.\d{1,2})?)(?P<pct1>%)))
In JavaScript -
STARTING (?<startdate>\d{1,2}\/\d{1,2}\/\d{4}) ((?:OR|AND) NO SHOW[ ]+)?CXL[ ]*-[ ]*PENALTY FEE[ ]+IS (?:(?:PRICE OF )?(?<value>\d+(\.\d{1,2})?)(?:(?<pct>%)[ ]+OF (?<nights>\d)? ?(?<based>(?:FIRST NIGHT|BOOKING) PRICE|TOTAL|NIGHTS)| (?<apply>NIGHTS|\w{3}\b))|(?:CNX FEE - (?<nights1>\d+) (?<based1>NT) X (?<value1>\d{1,3}(?:\.\d{1,2})?)(?<pct1>%)))
Each pattern must be written on a single line, without any line breaks.
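As a minimal sketch of how the Python variant might be applied: the name POLICY_PATTERN is illustrative, and the pattern is split across adjacent raw-string literals only for readability (Python joins them back into the single-line pattern shown above). The regex is compiled with re.IGNORECASE and run against one of the sample remarks:

```python
import re

# Python variant of the pattern; adjacent raw strings are concatenated
# by the compiler, so this is still one pattern without line breaks.
POLICY_PATTERN = re.compile(
    r"STARTING (?P<startdate>\d{1,2}/\d{1,2}/\d{4}) "
    r"((?:OR|AND) NO SHOW[ ]+)?CXL[ ]*-[ ]*PENALTY FEE[ ]+IS "
    r"(?:(?:PRICE OF )?(?P<value>\d+(\.\d{1,2})?)"
    r"(?:(?P<pct>%)[ ]+OF (?P<nights>\d)? ?"
    r"(?P<based>(?:FIRST NIGHT|BOOKING) PRICE|TOTAL|NIGHTS)"
    r"| (?P<apply>NIGHTS|\w{3}\b))"
    r"|(?:CNX FEE - (?P<nights1>\d+) (?P<based1>NT) X "
    r"(?P<value1>\d{1,3}(?:\.\d{1,2})?)(?P<pct1>%)))",
    re.IGNORECASE,  # the remarks should be matched case-insensitively
)

remark = "STARTING 20/12/2021 CXL - PENALTY FEE IS 2 NIGHTS"
match = POLICY_PATTERN.search(remark)

# Keep only the named groups that actually took part in the match.
found = (
    {name: text for name, text in match.groupdict().items() if text is not None}
    if match
    else {}
)
print(found)
```

For this remark, only "startdate", "value" and "apply" are populated; the remaining named groups are None.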
Using the named groups, you can get the following:
1. "startdate" holds the date on which the policy starts, in the format dd/MM/yyyy (the pattern also accepts single-digit day and month values, just in case).
2. "apply" - if present, holds either "NIGHTS", denoting a penalty by # of nights, or an ISO currency code for a flat amount.
3. "pct"/"pct1" simply holds "%". If present, the penalty is a percentage.
4. "based"/"based1", if present, holds "FIRST NIGHT PRICE", denoting % of the first night's price, or "NIGHTS" ("NT" in "based1"), denoting % of the nights indicated; any other value ("BOOKING PRICE" or "TOTAL") means the penalty is based on the booking/total price.
5. "nights"/"nights1", if present, indicates the # of nights for a % penalty - empty = total nights.
6. "value"/"value1" holds an integer or a decimal number with up to two decimal places: the % amount, the number of nights, or the flat monetary amount.
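The group semantics listed above could be folded into one structured record. The following is only a sketch under stated assumptions: the function name interpret, the dictionary keys and the "kind" labels ("percent", "nights", "flat") are all hypothetical and not part of the original pattern. It merges each suffixed group with its unsuffixed twin and parses the start date as day-first.

```python
import re
from datetime import date, datetime
from decimal import Decimal

POLICY_PATTERN = re.compile(
    r"STARTING (?P<startdate>\d{1,2}/\d{1,2}/\d{4}) "
    r"((?:OR|AND) NO SHOW[ ]+)?CXL[ ]*-[ ]*PENALTY FEE[ ]+IS "
    r"(?:(?:PRICE OF )?(?P<value>\d+(\.\d{1,2})?)"
    r"(?:(?P<pct>%)[ ]+OF (?P<nights>\d)? ?"
    r"(?P<based>(?:FIRST NIGHT|BOOKING) PRICE|TOTAL|NIGHTS)"
    r"| (?P<apply>NIGHTS|\w{3}\b))"
    r"|(?:CNX FEE - (?P<nights1>\d+) (?P<based1>NT) X "
    r"(?P<value1>\d{1,3}(?:\.\d{1,2})?)(?P<pct1>%)))",
    re.IGNORECASE,
)


def interpret(match):
    """Turn one regex match into a plain dict (hypothetical layout)."""
    g = match.groupdict()
    # The CNX FEE alternative fills the "1"-suffixed groups instead.
    value = g["value"] or g["value1"]
    nights = g["nights"] or g["nights1"]
    based = g["based"] or g["based1"]
    apply = g["apply"]

    if g["pct"] or g["pct1"]:
        kind = "percent"   # value is a percentage
    elif apply and apply.upper() == "NIGHTS":
        kind = "nights"    # value is a number of nights
    else:
        kind = "flat"      # value is a monetary amount, apply is the currency

    return {
        # Dates are day-first (dd/MM/yyyy); single digits are accepted too.
        "start": datetime.strptime(g["startdate"], "%d/%m/%Y").date(),
        "kind": kind,
        "value": Decimal(value),
        "currency": apply.upper() if kind == "flat" and apply else None,
        "based": based.upper() if based else None,
        "nights": int(nights) if nights else None,
    }


sample = "STARTING 02/02/2020 CXL - PENALTY FEE IS CNX FEE - 4 NT X 100.00%"
print(interpret(POLICY_PATTERN.search(sample)))
```

Decimal is used here rather than float so that amounts such as 8.47 keep their exact written precision; that is a design choice of this sketch, not something the pattern requires.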
As per this pattern, all policy-related info starts with the word "STARTING" and includes "CXL - PENALTY FEE" (the spaces around the hyphen are optional).
So a penalty can be a % of the total/nights, a number of nights, or a flat monetary amount.
These are a few samples:
STARTING 02/02/2020 CXL - PENALTY FEE IS PRICE OF 1 USD
STARTING 20/12/2021 CXL - PENALTY FEE IS 2 NIGHTS
STARTING 30/04/2021 CXL - PENALTY FEE IS 20% OF NIGHTS
STARTING 20/12/2021 CXL - PENALTY FEE IS 20% OF 3 NIGHTS
STARTING 30/04/2020 AND NO SHOW CXL - PENALTY FEE IS 8.47 USD
STARTING 20/12/2021 CXL - PENALTY FEE IS 2.6 EUR
STARTING 02/02/2020 OR NO SHOW CXL - PENALTY FEE IS PRICE OF 47.5 USD
STARTING 30/04/2020 AND NO SHOW CXL - PENALTY FEE IS PRICE OF 8% OF TOTAL
STARTING 02/02/2020 CXL - PENALTY FEE IS PRICE OF 8.31 GBP
STARTING 20/12/2021 CXL - PENALTY FEE IS 78 GBP
STARTING 30/04/2020 AND NO SHOW CXL - PENALTY FEE IS PRICE OF 75% OF FIRST NIGHT PRICE
STARTING 02/02/2020 CXL - PENALTY FEE IS CNX FEE - 4 NT X 100.00%
STARTING 30/04/2020 CXL - PENALTY FEE IS CNX FEE - 2 NT X 85.00%
Multiple policies:
STARTING 20/05 /2020 CXL - PENALTY FEE IS 13% STARTING 28/05/2020 - PENALTY FEE IS 100 %
STARTING 18/04/2020 CXL-PENALTY FEE IS 100.0000% OF BOOKING PRICE STARTING 10/05/2022 CXL-PENALTY FEE IS 100.0000% OF BOOKING PRICE
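When one remark carries several policies, one possible approach is to iterate over all matches instead of taking only the first. The sketch below assumes a hypothetical helper find_policies, and its example remark is assembled from two of the single-policy samples above purely for illustration. Note that the pattern as written is strict: text with a space inside the date, without "CXL", or with more than two decimal places before "%" is not returned by it.

```python
import re

POLICY_PATTERN = re.compile(
    r"STARTING (?P<startdate>\d{1,2}/\d{1,2}/\d{4}) "
    r"((?:OR|AND) NO SHOW[ ]+)?CXL[ ]*-[ ]*PENALTY FEE[ ]+IS "
    r"(?:(?:PRICE OF )?(?P<value>\d+(\.\d{1,2})?)"
    r"(?:(?P<pct>%)[ ]+OF (?P<nights>\d)? ?"
    r"(?P<based>(?:FIRST NIGHT|BOOKING) PRICE|TOTAL|NIGHTS)"
    r"| (?P<apply>NIGHTS|\w{3}\b))"
    r"|(?:CNX FEE - (?P<nights1>\d+) (?P<based1>NT) X "
    r"(?P<value1>\d{1,3}(?:\.\d{1,2})?)(?P<pct1>%)))",
    re.IGNORECASE,
)


def find_policies(remark):
    """Return the named groups of every policy found in the remark, in order."""
    return [m.groupdict() for m in POLICY_PATTERN.finditer(remark)]


# Two single-policy samples joined into one remark (illustrative input).
remark = (
    "STARTING 30/04/2021 CXL - PENALTY FEE IS 20% OF NIGHTS "
    "STARTING 20/12/2021 CXL - PENALTY FEE IS 2.6 EUR"
)

for policy in find_policies(remark):
    # Show the start date, the value, and either "%" or the apply token.
    print(policy["startdate"], policy["value"], policy["pct"] or policy["apply"])
```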