To: hotelgroup Subject: updated framework document for table-driven design Dear Colleagues, In the course of implementing a reference implementation, some enhancements have come up. My best belief is that this setup will allow us to capture virtually every scenario from Jag's documents. I hope to have some decent performance numbers in a few days unless Tim shows me the folly of my ways. Please ask questions if anything is unclear. Also, I can be available by phone at 11 AM your time any day. Best, Dennis Suggested Schema and Processing for Pricing Tim Buckland Dennis Shasha 7 May 2004 updated: 10 May 2004 The basic idea of this design is to make the description of pricing policies be based on three tables: base price, add modifier, and multiply modifier. A query specifies some conditions that choose a base price (based on "most specific wins") and then it chooses perhaps based on dates or special promotions additive and multiplicative modifiers. The design allows to check for validity of a query (e.g. to forbid a cot in a single room) should be pretty high performance (still to be tested) and appears to handle obscure cases (all the ones I could think of). Tim will be throwing more queries at me. I will be implementing more stuff. SCHEMAS base(hotel, basecondition, attribute, value) -- key is hotel, basecondition, attribute baseprice(hotel, basecondition, price) -- key is hotel, basecondition addmodifier(hotel, addcondition, attribute, value) -- key is hotel, addcondition, attribute addincrement(hotel, addcondition, increment) -- key is hotel, addcondition baseadd(hotel, basecondition, addcondition) -- key is hotel, basecondition, addcondition -- This level of indirection allows us to add the price of a cot -- say to any kind of room. multmodifier(hotel, multcondition, attribute, value) -- key is hotel, multcondition, attribute multfactor(hotel, multcondition, factor) -- key is hotel, multcondition basemult(hotel, basecondition, multcondition) -- key is hotel, basecondition, multcondition -- This level of indirection allows us to add the price of a cot -- say to any kind of room. NB: A variant of these schemas is proposed in the subsection entitled Variant Schema: full inversion in which hotel would no longer be an explicit attribute. Doing that is just an implementation variant, so is not very important for the main thrust of this discussion. I request consideration of this question from Tim and Mark however should we adopt this basic strategy. I think it's better. EXAMPLES Here are some example prices: doubleroom, sea view : 100 general doubleroom: 80 singleroom: 70 cot costs 15 but applies only to doublerooms champagne costs 25 if you stay over a mon, tues or wed, you get a 20% discount Hotel is called fawlty base(hotel, basecondition, attribute, value) fawlty, 1, roomtype, double fawlty, 1, view, sea fawlty, 2, roomtype, double fawlty, 3, roomtype, single baseprice(hotel, basecondition, price) fawlty, 1, 100 fawlty, 2, 80 fawlty, 3, 70 addmodifier(hotel, addcondition, attribute, value) fawlty, 1, cot, yes fawlty, 2, champagne, yes addincrement(hotel, addcondition, increment) fawlty, 1, 15 fawlty, 2, 25 -- champagne costs 25, cot 15 baseadd(hotel, basecondition, addcondition) fawlty, 1, 1 fawlty, 1, 2 fawlty, 2, 1 fawlty, 2, 2 fawlty, 3, 2 -- note we don't allow cots in single rooms multmodifier(hotel, multcondition, attribute, value) fawlty, 1, daysany, 0111000 -- boolean vector representing monday, tues, wed multfactor(hotel, multcondition, factor) fawlty, 1, 0.8 basemult(hotel, basecondition, multcondition) fawlty, 1, 1 ====== Example 1: A query comes in asking for regular doubleroom, a cot and champagne and over a Friday. Based on most specific wins, we get a basecondition of 2 because we want the most specific spec that matches the query. Both addconditions 1 and 2 apply, but no multcondition applies. So, get the price as 80 + 15 + 25 End of example. ====== PROCESSING Thus, the data flow is I. "most specific wins" for base table (for our example, basecondition = 2); find all matches in add table (for our example, addcondition = 1, 2); find all matches in mult table (for our example, there are none). II. check for validity: select baseadd.addcondition from baseadd where baseadd.hotel = "fawlty" and baseadd.basecondition = 2 and check that this result is a superset of the add conditions we have found. Otherwise, the query is unsatisfiable. Similarly for multiplication. III. Then to compute the price, do the following: 1. take the base price 2. add in all the increments to be added from addincrement 3. multiply by all the factors in multfactor. Definition of Most Specific Wins The most specific wins strategy applies to the base conditions and works as follows: 1. For each attribute-value pair in the query, a condition is acceptable if every attribute-value pair in the condition satisfies the query (for most conditions that means that the attribute-value pairs of the condition is a subset of the one in the query). 2. There is no other acceptable condition that is a superset. For example, if the query says (roomtype: doubleroom; view: sea) and one condition says (roomtype: doubleroom) and another says (roomtype: doubleroom; view: sea) then the second one wins because it is more specific. If the query is changed to (roomtype: doubleroom; agency: royal cruises) then (roomtype: doubleroom) would win. So the way matching on the base table works is that we perform the most-specific-wins strategy. Maximal Specific Wins We should handle (I think) the addmodifier/multmodifier using a maximal specific wins. That is, among those acceptable conditions C find that subset S such that for each s in S, there is no c in C that is a proper superset of S. Validation If the query specifies a single hotel, then clearly only that hotel's conditions will be called forth. If the query doesn't, then conditions several hotels will be called out. In either case, we have to determine whether a hotel is valid. Validity means the following: find the conditions from each table (base, addmodifier, multmodifier) that win based on the above criteria and see which of them link together according to baseadd an basemult. This will give us a set of conditions and attribute-value pairs. We compare that set to the set of the query and see whether any essential element of the query is missing. If so, then the hotel cannot satisfy the query. For example, if the query specifies (roomtype: single; cot: yes) but the hotel has only doublerooms with cots, then the query cannot be satisfied. If cots are necessary, then this doesn't work. On the other hand, if the query specifies (roomtype: single; view: sea) and there is no single room with a view of the sea, this may not be considered essential. So queries should have essential and inessential attributes. For now, we consider nothing to be essential. Finding Minimum Price Assume that a customer may use only one discount at a time. (If many can apply then the problem enters an algorithmic stage of difficulty known as NP-complete*.) For each base condition, find the addmodifier or multmodifier condition that is compatible and gives the lowest price. That's the minimum you announce. * (NP-completeness if many can apply: for each base condition join to find all addmodifier and multmodifier conditions. Find the compatible set of such conditions that gives the smallest price. One might get incompatibilities if for example a 0.8 multiplier applies for June and a 0.6 for July; both can't apply. Finding the best compatible set is harder than the NP-complete problem known as bin packing. An NP-complete problem is one whose only known solution is to try all possibilities.) MORE EXAMPLES ====== Example 2: A query comes in asking for singleroom, a cot over a Wednesday. Based on most specific wins, we get a basecondition of 3 because we want the most specific spec that matches the query. Addcondition 1 applies, and multcondition 1 applies. Problem is that the baseadd table doesn't have a row for 3, 1 so that combination is impossible. This is how it should be. There is no cot for single rooms. End of example. ====== ====== Example 3: We rent an apartment that can handle four people. If the apartment has one price regardless of the number of people, consider the base price to be the price of the apartment and then if the occupancy is 3, we divide that price by 3. Of course if the apartment has no fixed price, then we just have separate base prices for each occupancy setup. ======= ======= Example 4: We charge 200 every day except Saturday when we charge 150. Represent this as an add modifier whose condition is muststaydays 0000001 If both Friday and Saturday gave us cheap rates, then we'd have muststaydaysall 0000011 and the add modifier would be -100 ======= ======= Example 5: Impossible conditions, e.g. 5 adults in a single, are accounted for by having nothing that matches. So the end-customer can't select such a room. Thus we don't need an explicit min-occcupancy/max-occupancy. ======= ====== Example 6: Supplier reduces cost price, we reduce it less. This is no problem for us because we will represent the cost price tables separately from the sale price to customers when necessary. If a hotel ties the two together, we can do that by having a multfactor that gives us the price. ====== ====== Example 7: We handle occupancy as just an attribute. So (roomtype: doubleroom, occ:2 adults) may be a different condition from (roomtype: doubleroom, occ:3 adults). ====== ======= Example 8: Different prices depending on partner. This would come in as an add modifier or a mult modifier. Similarly for locales I think. ======= OTHER ISSUES Date Processing Whether a date range gives an additive increment or a multiplicative factor, we need to encode various conditions like "stay over a saturday", "stay within this date range", "stay at least this many days". The attributes we need follow. They require special treatment, e.g. if I want to stay for a date range, then I must check to see which promotions it overlaps with. attribute value availablestart date -- promotion or high season start availableend date muststaydaysall bit vector of length 7 -- must stay all of those days. -- week starts on Sunday muststaydayssome bit vector of length 7 -- must stay at least one of those days nostaydays bit vector of length 7 -- may not stay any of those days. numdaysstay number -- can handle supplement for short stay, stay x nights and get y free -- stay x nights and get z% off y nights) firstday dayofweek -- this plus numdaysstay can hande stay sat and sun get mon free lastday -- this plus numdaysstay can hande stay sat and sun get mon free Booking Start daysbeforebookingstart num daysendbookingstart num Funny Money Some of the requirements call for credit for other things, e.g. stay x nights and get some percentage off of another product. This requires a notion of a customer account. Use Based Pricing There is a modifier of the form: "If you've sold over x rooms for a night raise the price." This goes into addmodifier. Global discounts Some discounts apply to all base conditions of a hotel. If that is the case, we should have the notion of a blank condition field in multmodifier and addmodifier which means that everything is matched. User Interface This design could drive an interface. I think the hotel managers could have an interface in which they specify conditions and then indicate the cross-links using a graphical interface. Or there can be a bunch of pre-canned possibilities that are implemented in this form. Cancellation This is separate from pricing but doesn't seem hard. Escape A similar setup to addmodifier/multmodifier could be used for complex functions. But we may get away without needing it. That is we might have a table setup of the form. complexmodifier(hotel, complexcondition, attribute, value) -- key is hotel, complexcondition, attribute complexfunction(hotel, complexcondition, complexfunction) -- key is hotel, complexcondition basecomplex(hotel, basecondition, complexcondition) -- key is hotel, basecondition, complexcondition complexfunction applies to the end after all other modifiers have been applied. This could be in some interpreted language such as python or perl. IMPLEMENTATION NOTES Variant Schema: full inversion As of now, hotel is an attribute, but it need not be. We could be fully inverted if hotel were eliminated as an attribute but became a value of the column "attribute". Pros of keeping hotel as an attribute: easier to conceptualize queries that go through known hotels. Pros of making hotel a value of the column attribute: more symmetric. Example: Instead of: base(hotel, basecondition, attribute, value) fawlty, 1, roomtype, double fawlty, 1, view, sea fawlty, 2, roomtype, double fawlty, 3, roomtype, single we would have base(basecondition, attribute, value) 1, hotel, fawlty 1, roomtype, double 1, view, sea 2, hotel, fawlty 2, roomtype, double 3, hotel, fawlty 3, roomtype, single This setup is more economical in space but requires handling one kind of query as a special case: if the query doesn't specify the hotel, a condition can still be acceptable. In that case, we just ignore the hotel field. Implementation of Lookup: Most of the lookups other than date-based lookups entail taking a query's attribute-value pair and seeing which conditions have them. We then keep a count of the matches per condition and declare those conditions to be winners if they satisfy most specific wins or maximal specific wins. To do this, would want a hash table based on attribute-value and referring to condition. Then keep an ongoing list of conditions with their associated matching attribute-value pairs from the query. Notes: If I know just how many people are coming, I have to map that to room types for each hotel. For that, I need the following extra schema elements. This precedes the pricing schema above. hotel, roomtype, numadults, numchildren, numinfant hotel, childage hotel, infantage