The other week I came across an interesting challenge, creating a look-up that is based on 3 separate criteria, which also forced me to think about the process used to create more complicated formulas when the circumstances warrant it. The situation was as follows.
In a retail setting management is considering analyzing products by segment, similar to what is demonstrated in the table below. For reasons beyond the scope of this blog, Excel is where this needs to be done, and they wanted the lookup values to be in an easy-to-maintain table such as is illustrated. For each row of product data, they would like to be able to show which segment (Economy, Mainstream, Premium, or Super-Premium) that product would fall into based on category, size, and price. Adding to the challenge is that category and size would be exact look-ups, but price would be approximate based on ranges.
How would you approach this problem? What follows is how I went about it, and some observations on what this process reminded of about what makes Excel so powerful.
Here is the lookup table (named PriceBands). Note, for this to work properly this table should be sorted in ascending order first by Category, then by Size. The price in each segment column should be the starting point for that segment (for example, Category A, size 375 ml, the Mainstream segment goes from $10.00 to $12.99).
Here is table containing the end result, with the following intimidating looking formula in the Segment column to retrieve the appropriate segment label from the lookup table.
=IFERROR((INDEX(PriceBands[#Headers],MATCH([@[Retail Price]], OFFSET(Grid!$F$1, (INDEX(PriceBands[Offset], MATCH(1,INDEX(([@Category]=PriceBands[Category]) * ([@[Vol in ml]]=PriceBands[Size]),0,1),0))) ,0,1,4),1)+5)), “Not Applicable”)
While I knew that this should be possible (after all, there are usually at least 4 ways to do anything in Excel, as I often tell people when teaching about Excel), the question was how to build a formula to do what I wanted it to. The answer was to build it in pieces, testing each part, then finally combining them together for the end result.
For this purpose, I started with the smaller test table (named Segment) below, with sample lookup criteria in row 9. (The column Offset is a helper column to simplify building the formula.)
First, I knew that this would have to be a variation on an INDEX(MATCH) formula to have the flexibility needed. (See blog VLOOKUP vs INDEX MATCH.) And since two of the three lookup criteria are exact matches, I suspected that these could be combined into one formula. Some productive time spent with Google allowed me to create the following formula to pick the Offset number for the row that exactly matches the first two lookup criteria, Category and Size (note that I am using the table name and column header (e.g. Segment[Offset]) instead of an address range in the formula for maximum flexibility in maintaining the information as new entries are added – see the post A Key Advantage of Tables):
To analyze this formula and understand (more or less) what it does, remember that nested functions get evaluated from the inside out. The innermost function is the second INDEX function, which creates an array (or series) of 0s for FALSE and 1s for TRUE, based on testing each line of the lookup table to see whether the Category column contains the lookup value in cell B9 and the Size column contains the lookup value in cell C9. Multiplying the two results returns a single value for each row tested, and since 0 x 1 = 0, there will only be one value of 1 corresponding to the row that exactly matches both criteria.
Next, the MATCH function reads the position of the only TRUE (or 1) in the resulting array, corresponding with the row where both criteria are met.
This, in turn, is fed to the first INDEX function to read the value in the Offset column (to be used later).
Having accomplished the two exact matches, we can now turn attention to the approximate match to price, using the following formula:
This matches the price given in cell D9 to the values in cells D4:H4 and returned a value of 2. This only looks at one predetermined row, as at this point I was trying to ensure that this portion of the formula would correctly pick up the number of the first column in the supplied range where the input value (D9) is equal to or less than the number in that column (since the values are going up from left to right in the table rows).
Next, we build on this formula by using the result of it to feed and INDEX function as follows:
The MATCH formula created in the previous step is here telling the INDEX function to return the value in the Headers row of the table from the column that matches the lookup value for price (+ 2 to account for the Category and Size columns in the table which come before the start of the segment labels I am interested in).
At this point, please stop and notice how the various pieces needed are being assembled separately and proven to work as expected.
As I was working through this process, I hit a snag when I tried to use the first formula to create a range of one row to be tested by the last piece of formula we just looked at. A pesky #NA error kept showing up. Then I remembered something I had read along the way about ensuring that the output of a function is of the correct data type to feed another function and wondered if that was the source of my problem. Some more time spent with Google led me to the OFFSET function to create a dynamic range that could change as the result of a formula and be used to feed another function. But this was something new and took some experimentation. So, I started playing with it by creating:
This selected and summed the values in range D2:G2 (1 = same row, 4 = 4 columns wide), for a value of 105.
This changed the starting point to D1 and used the value in H2 (which contains the number 1) as the amount to offset that starting point, giving the same range of D2:G2 and the same final value of 105.
Next, I tried using this OFFSET statement in a MATCH function to see if it would create a dynamic range that it could properly read, using this formula:
This gave the expected value of 1, as it was matching 15 (the value in D9) to the same range of D2:G2 I had been working with, and the first column of that range with a value less than or equal to 15 is column D (since column E is 25, which is greater than 15).
So far, so good. Now to try to use this in a formula the way I hoped it would work by combining # 3 and # 6 above to produce:
This produced the correct result, reading the Header label Economy, matching the value in D9 (15) to the range D2:G2.
Now I could go back and add in the piece of formula created in #1 to #7 to provide the row number for the OFFSET statement:
- INDEX(Segment[#Headers],MATCH($D$9,OFFSET($D$1,(INDEX(Segment[Offset], MATCH(1,INDEX(($B$9=Segment[Category])*($C$9=Segment[Size]),0,1),0))),0,1,4),1)+2)
This is basically the formula I was wanting to create, using an exact match to the values in B9 and C9 to create a dynamic range that the INDEX(MATCH) formula we looked at in #3 could use for an approximate match to the value in D9 and return the matching header label.
One more step was desirable, namely to wrap #8 in an IFERROR statement to catch any problems that might be encountered, as follows:
- =IFERROR((INDEX(Segment[#Headers],MATCH($D$9,OFFSET($D$1,(INDEX(Segment[Offset], MATCH(1,INDEX(($B$9=Segment[Category])*($C$9=Segment[Size]),0,1),0))),0,1,4),1)+2)),”Not Applicable”)
If you look closely at this, you can pick out the various pieces created along the way. This formula could now be adapted to the actual tables I wanted to apply it to, giving the formula shown at the beginning.
What was I reminded of by going through this process?
That most big challenges can be broken down and addressed as a series of smaller challenges, even an Excel formula that you need to keep together to work properly in the final solution.
That when you face something that seems hard to get your head wrapped around, stop and see what pieces of it you think will work and start from there.
That one of the real strengths of Excel is the ability to solve one segment of a problem as a discreet section of cells, and then use that result as an input for another segment of the problem.
And finally, if you need to do a lookup with multiple criteria, I hope this will give you something that you can adapt to suit your need.