• Hi All

    Please note that at the Chandoo.org Forums there is Zero Tolerance to Spam

    Post Spam and you Will Be Deleted as a User

    Hui...

  • When starting a new post, to receive a quicker and more targeted answer, Please include a sample file in the initial post.

VLOOKUP to turn one array into another?

duglessxott

New Member
Suppose I have this range of string values in cells A1:A10

[pre]
Code:
A
B
C
A
B
C
D
A
B
C

and another range for lookups in B1:C4 that maps those values to their numeric equivalents:

[pre][code]A  5
B  10
C  15
D  20[/pre]

Ideally I would like to be able to build an array that maps each entry in A1:A10 into B1:C4, via VLOOKUP, in some array-formula-like manner such as:


[code]{=VLOOKUP(A1:A10,B1:C4,2,FALSE)}


(Notice the lookup value argument is a range rather than a single cell reference - hence the inclination to try it as an array formula). Expectation would be the synthetic array:


{5, 10, 15, 5, 10, 15, 20, 5, 10, 15}[/code]


The goal is to use that resultant array inside of SUMPRODUCT calls that will then selectively add up the elements based on additional filter criteria - the usual stuff for SUMPRODUCT. For example:


{=SUMPRODUCT(--(D1=E1:E10),VLOOKUP(A1:A10,B1:C4,2,FALSE))}
{=SUMPRODUCT(--(D2=E1:E10),VLOOKUP(A1:A10,B1:C4,2,FALSE))}
...
{=SUMPRODUCT(--(D10=E1:E10),VLOOKUP(A1:A10,B1:C4,2,FALSE))}[/code][/pre]
In this case, the first array is the typical filter that (would) inform which of the synthesized array values from this ideal use of VLOOKUP would be added for the sum I want.


My research tends to discover that VLOOKUP is designed to accept arrays for the column index parameter, allowing one to return entire rows out of the lookup table as arrays. My goal is to return an array based on one-to-one mapping of each elements of an array passed as the lookup value parameter.


My expectation is that if it's possible, it will have to be via something more elaborate.
 
Hi ,


If you define two ranges , A1:A10 as Data_Range , and B1:C4 as Lookup_Table , then the following construct :


=VLOOKUP(Data_Range,Lookup_Table,2,FALSE)


returns the array that you expect.


Can you give more details about how you intend to use this construct , with some data ?


Narayan
 
Hey Narayan,


Thanks for the post. I just tried your suggestion and I get the same behaviour. The only difference that I can see is that in my example I referenced the data and lookup with literal cell ranges, and you're suggesting to dereference those from named ranges. Correct? The result I get during evaluation is that for either the named range or the literal range, both evaluate to the value of the first cell in that range rather than to a range of values that then (by some magic) each are replaced with their mapped counterparts from the lookup table.


Happy to expand the example:

[pre]
Code:
_|  A        B           C          D
1  Project  Consultant  Fee Table
2  Fido     Alan        Name       Fee
3  Fido     Betty       Alan       10
4  Fido     Carter      Betty      15
5  Fluffy   Betty       Carter     20
6  Fluffy   Carter      David      25
7  Fluffy   David
8  Lassie   Alan
9  Lassie   David
10  Patches  Betty
11  Patches  Carter
12
13  Summaries
14  Project  Total Spend
15  Fido     {=SUMPRODUCT(--($A15=$A$2:$A$11), VLOOKUP($B$2:$B$11,$C$3:$D$6,2,FALSE))}
16  Fluffy   {=SUMPRODUCT(--($A16=$A$2:$A$11), VLOOKUP($B$2:$B$11,$C$3:$D$6,2,FALSE))}
17  Lassie   {=SUMPRODUCT(--($A17=$A$2:$A$11), VLOOKUP($B$2:$B$11,$C$3:$D$6,2,FALSE))}
18  Patches  {=SUMPRODUCT(--($A18=$A$2:$A$11), VLOOKUP($B$2:$B$11,$C$3:$D$6,2,FALSE))}
[/pre]

In this example I'm trying to summarize the total consultant fees against the projects in Column A. The project column specifies the consultants that worked on each project. Thus the SUMPRODUCT function is to isolate fees by each project called out to its left (A15-A18).


The conjectured VLOOKUP is to map the consultants in column B against their nominal consulting fees and then produce an equivalantly sized array of fee charges by each consultant in order of their charge against each project.


Then, assuming say for Fido, I have in the SUMPRODUCT call a resultant {1,1,1,0,0,0,0,0,0,0} (which works obviously) multiplied by {10,15,20,15,20,25,10,25,15,20} (which we're trying to get to work) to produce a product of {10,15,20,0,0,0,0,0,0,0} and then sum to a final answer of: 45.


Does that make sense? I hope so, because I'm typing this right out of my head :)


Thanks again.


Doug
 
Here is a solution i created for you. I created a helper column which makes it simple!

If you want to upload your own file, I can apply the same method to it. If not, adapt what i did from the file below to your own file. Enjoy. :)


http://speedy.sh/f7jgt/Fido-solution.xls
 
Appreciated Montrey - Speedy.sh is asking me to join, so I'm guessing that you have a column in betweeners Consultants and Fee Table that produces the fee mappings, and I get that that's one route.


However consider my example is fairly simplistic for example's sake. Imagine Column B was Consultants/January, and that Project was actually Project Summary/January, and that after I get some solution figured out, I want to expand it horizontally month-by-month. Now imagine that those month-by-month columns already exist and over a good stretch of time. Adding help columns seems a little tedious compared to taking care of it in code.


I guess I *can* live with the helper column, but ideally I'm looking for a solution (*and maybe it doesn't exist*) that can let me after-the-fact get the equivalent behavior via the return from VLOOKUP. If I have to restructure the table and absorb everything that implies for clutter, I can, and I can hide columns too, but it seems like a lot of maintenance to embark upon before clarifying if there is a less structural way to accomplish the same goal.
 
the download link is at the top of the speedyshare page in big bold letters!! :D

That website is crap. lol.


I understand what your saying, let me work on it some more to find a solution without a helper column.
 
Dugless check this out. Let me know what u think.


http://www.2shared.com/document/gapu0g5S/Fido_solution.html
 
Hey Montrey,


Unfortunately 2shared.com is a Windows-only service (I'm on a Mac), and even if I tried this at work, our security policies would block them because they require a downloaded component to be installed to use their service.


Any chance you can either paste a textified version to here, or, realizing that could be a big ask, instead put it on a sharing service like Dropbox that allows for public web-based access?


Sorry for the hassle.


Doug
 
Sure - hope you're OK that I don't publish my email in a public forum. How about send it to dugless.montrey at good ol' gmail? You can attach the .xls to that, and I'll respond with a real email for future use?


thanks!
 
Hi Montrey,


I received your suggestion, much appreciated. For discussion, SirJB7 and others, it's copied here in a textified version:

[pre]
Code:
_| A         B            C        D
1 Project   Consultant   Fee Table
2 fido      alan         alan     10
3 fido      betty        betty    15
4 fido      carter       carter   20
5 tree      james        james    35
6
7 Project   Summary
8 fido      =SUM(B9:B12)
9   alan    =SUMPRODUCT(($A$2:$A$5=$A$8)*($B$2:$B$5=A9)*($D$2:$D$5))
10   betty   =SUMPRODUCT(($A$2:$A$5=$A$8)*($B$2:$B$5=A10)*($D$2:$D$5))
11   carter  =SUMPRODUCT(($A$2:$A$5=$A$8)*($B$2:$B$5=A11)*($D$2:$D$5))
12   james   =SUMPRODUCT(($A$2:$A$5=$A$8)*($B$2:$B$5=A12)*($D$2:$D$5))
13 Tree      =SUM(B14:B17)
14   james   =SUMPRODUCT(($A$2:$A$5=$A$13)*($B$2:$B$5=A14)*($D$2:$D$5))
15   alan    =SUMPRODUCT(($A$2:$A$5=$A$13)*($B$2:$B$5=A15)*($D$2:$D$5))
16   betty   =SUMPRODUCT(($A$2:$A$5=$A$13)*($B$2:$B$5=A16)*($D$2:$D$5))
17   carter  =SUMPRODUCT(($A$2:$A$5=$A$13)*($B$2:$B$5=A16)*($D$2:$D$5))

Which produces the following result:


_| A         B            C        D
1 Project   Consultant   Fee Table
2 fido      alan         alan     10
3 fido      betty        betty    15
4 fido      carter       carter   20
5 tree      james        james    35
6
7 Project   Summary
8 fido      45
9   alan    10
10   betty   15
11   carter  20
12   james   0
13 Tree      35
14   james   35
15   alan    0
16   betty   0
17   carter  0
[/pre]
It definitely works, that is, it gets to the answer. And the multiple multiplications have given me something to think about.


What I don't like about it is the structure, and what that implies about maintenance: it requires you to maintain the consulting contributors to the project in two places - once starting under A2, once starting under the Project Summary under A7. Every time I notate a new consultant to a project I have to update both lists. And that's not even considering there may be cases where I'd have to update ALL project summaries in case the consultant contributed to *multiple projects*. I believe my original, now-proving-a-pipedream, example wouldn't have suffered from this. It was self-maintaining.


(NB the spreadsheet actually driving my simpler example isn't really about consultants, and it will actually have this last quality described, i.e. newly introduced "consultants" could hypothetically contribute to any number of "projects" even if the project predates the arrival of the consultant.)


In this example it probably doesn't seem like a big deal, but imagine that this thing grows a lot. Bugs creep in when humans drop the ball, so my goal is always something where a maintainer (think of the next guy) only has to update a single line: adding a project-consultant mapping.


At the same time I realize that what I'm asking for tends to obfuscate the purpose of the code, which is a coding smell, and maybe in the Excel domain I should strive to make things more transparent like this.


Hm. Something to mull over.


Thanks again.


Doug
 
Hi, duglessxott!

You should build two tables: one for projects, another for consultants. Both in separated sheets or in a separate sheet in different columns, so as to define simple name ranges like "WorksheetXXX!A:D" if in worksheet "WorksheetXXX" and from columns A:D.

Then in the project assignment sheet, you can use VLOOKUP/INDEX functions or a dropdown list box to retrieve project and consultant name. Avoided data duplication.

For totalizing values per project, you can handle each project in a different sheet with totals at top (then summing from row 3:1000, for example), or you can use the Subtotal features included in Excel.

Further comments without the workbook and the full specs for requirements would be like shooting to the sky.

Regards!
 
Hey Doug!,


You could make the columns into named ranges. Then this would make it so you dont have to update the formulas.
 
Hey guys,


@SirJB7 - I definitely agree with you, that for data clarity and management I should normalize the data sources onto separate sheets. That said, this example was for simplicity, and I don't believe that putting the samples into a single sheet for illustration is actually the problem. The full sheet that I'm trying to build won't reveal anything additional about this particular problem that the boiled down example doesn't already, i.e. I believe that it will exist regardless of whether the data is separated into separate sheets or not. Do you?


@Montrey - thanks a lot for continuing to think about this, I appreciate your feedback. I will ultimately snap some named ranges into place. But again, I think that isn't the issue here. In the last example, named ranges for the consultants and for the fee table still don't solve the problem that if I add a new consultant, I would have to add him to the fee table (expected), the Project/Consultant table (expected) and to the Summary table in the Project column (undesirable). It's this last that I want to be automatic, that I think my original hypothetical example conceptually takes care of.


Basically the more I hash this out, the more I think I'm asking for a functional programming "map" construct that takes one list, the consultants-over-projects history, and maps it to their fees for each project. Using my original example, I'm looking for something like this that *in one call* maps all of the input elements to their fees:

[pre]
Code:
Input array:  ->  Map Operation:      ->    Output array:
[                                           [
Alan        ->  vlookup(fee table)  ->     10
Betty       ->  vlookup(fee table)  ->     15
Carter      ->  vlookup(fee table)  ->     20
Betty       ->  vlookup(fee table)  ->     15
Carter      ->  vlookup(fee table)  ->     20
David       ->  vlookup(fee table)  ->     25
Alan        ->  vlookup(fee table)  ->     10
David       ->  vlookup(fee table)  ->     25
Betty       ->  vlookup(fee table)  ->     15
Carter      ->  vlookup(fee table)  ->     20
]                                           ]
[/pre]

The key think that I think all answers thus far haven't taken into consideration is that I want the input array to be specified without regard to its constituents to the operating function (in this case a lookup), and the output has to be an array of equal size containing the results (lookups against each input array member).


The solutions offered so far force the spreadsheet author/maintainer to always specify that input array's constituents in separate cells. It doesn't matter, so far as I can see, whether a named range has been used or the data has been separated to separate sheets. Those solutions still require, again so far as I can see, the input array to be specified in a 2nd place, under the Project summary, constituent-by-constituent. This is believe is an opening for a maintenance bug as the spreadsheet gains additional data. Maybe I could just wrap all data-input with a VBA script or something, but that's getting less-elegant and more heavyweight.


Thanks guys for a very interesting discussion. I think it's helping me make a case for a array-as-input vlookup function if I can discover an Excel PM who'd consider such a thing :). Since MSFT has built at least one functional language to date (F#) maybe they'd see the utility.


Doug
 
PS @SirJB7 - I will look into subtotaling. If that gives me something more automatic for the summaries, I'll repost my experience.
 
Hi, duglessxott!

You can build a table for getting the summaries, too. It'll all depends on your workbook/worksheets' structure. Surely Montrey might be able to help you as the main solution provider for this topic. Without the workbook there's nothing much more that I could do.

Regards!
 
Hey guys,


To be clear, I don' t expect you to go building me spreadsheets! Although I much appreciate the yeoman's effort, what this is is a discussion about whether Excel can do something or not.


If I took the example I gave and said we'd put Project Summaries on Sheet 1, Project Work by Consultants on Sheet 2 and Consultant Fee Table on Sheet 3, then I still don't think the problem is materially changed. Do you? The nut of the problem is defined: how to create a line-by-line summary of Projects that adds up all of the costs per project by an unpredictable list of consultants.


If you wish to extend the problem to say that the sheets are broken out as I just conjectured, well certainly be my guest, but I still maintain it's the same problem. The data is just dereferenced from different locations. And as for the necessity of an original workbook, well let's put it this way: I'm asking for opinions on whether the problem *as defined* can be solved in the manner I am curious. If you look at it that way, then the originating workbook is really a superfluous detail.


Therefore, I've personally come to the conclusion that the original question is answered (VLOOKUP is not functionally aware enough to accept input arrays) and as yet there isn't a solution that doesn't add additional tables to maintain. I can live with that until happily proven otherwise, but I think the question is now answered. And I don't want to ask folks to continue to build sheets in ways that essentially solve the problem by ways we've explored here. If this is it, it is it and I'm satisfied that I at least got some great feedback.


Make sense? You guys have been great, but I think we've scraped all of the meat from this melon.


Doug
 
I know what exactly what needs to be done Doug. Unfortunately getting that all into an array formula is proving difficult. I've tried so many different formula combos. I've came close! but without helper columns or rows it is very hard.


I'ma keep trying to figure a solution though.
 
OMG OMG OMG. I FIGURED IT OUT!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!


http://speedy.sh/yrmEx/Complete-Fido-solution.xls


FINALLY! Now that is genius right there! WOW! Here is the formula that took 3 days to figure out. Funny thing is, it isnt an array either! I have to find more uses for this now!


=SUMPRODUCT(($B$2:$B$5=B12)*(LOOKUP($C$2:$C$5,INDEX($J$2:$J$5,,1),INDEX($K$2:$K$5,,1))))
 
Back
Top