• Hi All

    Please note that at the Chandoo.org Forums there is Zero Tolerance to Spam

    Post Spam and you Will Be Deleted as a User

    Hui...

  • When starting a new post, to receive a quicker and more targeted answer, Please include a sample file in the initial post.

Question about how to explain concepts to non-technical people

MSC Bobs

Member
Hi everyone,

I was hoping to tap into the vast knowledge base here in preparing me to make a proposal.

I'm trying to get everyone where I work to start structuring their files database-style; flat-file tables.

Before I get to my question, I want to give a little background on how I do stuff (which I think is vastly superior).

Every new workbook I start has (at least) three tabs: Summary, Calcs, Data. The Summary tab is for my charts and dashboard. The Calcs tab is for intermediate calculations that pull from the data and flow to the Summary. The Data tab is just that. Mostly just one or two tables. I use tables religiously and I structure them so they're flat file; they grow downward and not out to the right. Anything I might need to sort or look up, I add a new field to the table. I've got no empty fields and no empty space in any record. Nobody taught me to do it this way. I just use enough spreadsheets that I figured it out on my own. It was actually learning SUMPRODUCT and SUM(IF( that helped me figure this out.

The problem is that if nobody is taught this stuff, they do what feels natural which usually leads to matrix-style and cross-tab file structures. We all know this severely limits the functionality, efficiency, and ultimately the effectiveness of the spreadsheet. The data winds up being a really bad manual pivot table that cannot grow or expand.

So, my question is how do I explain database-style structure to people in a way that non-experts can understand and also so that they see the value in it and don't resist changing? I'm trying to articulate that mixing data and intermediate calculations is a bad thing, but I'm really struggling to come up with an explanation that's easy to understand.
 
@MSC Bobs
If you want an ally, Craig Hatmaker
https://sites.google.com/site/beyondexcel/home
is a strong advocate of the methods you propose, even applying them to financial models as opposed to data analysis.

Some things are on your side in that it is becoming easier to cope with cross-tab structures, either by using PowerQuery and loading to the data model or, coming soon, using modern Dynamic Arrays. The trouble is that the flat files you describe are not a natural way of providing human interaction so your colleagues probably find it easier to interact with cross-tab structures within input forms.

I find alternating or irregular data structures to be even more of an issues than a matrix. That and wanting analysis to be based upon a structured paragraph of information held within a single cell :rolleyes:.
 
Sorry, I didn't really answer your question; more a case of setting out to convince you that workarounds are increasingly possible.

The starting point is probably to acknowledge the strengths of a cross-tab when it comes to creating a user-friendly presentation of any dataset that is characterised by two parameters (use example from your business such as time-period and product line). Where it starts getting messy is when a third parameter creeps in (such as retail outlet) and you start needing multiple tables strung together. Also it doesn't work for raw data where many entries may contribute to a single cell.

The better alternative is then to turn to standard database tables. Here the data appears in a single list rather than a 2-D array. To read the attributes associated with any given term, instead of reading the row heading and the column heading, one finds both pieces of information in columns of the table. There are many more standard tools such as filtering, sorting and aggregating, which are designed to help with the management of data once it is in tabular format and almost all data transfer between computer applications uses data structured this way.

If you wish to study the data in cross-tab formats, that is still possible by using standard pivot tables. Instead of the data being fixed in one format dictated by the input, it can be restructured and presented to answer different questions that may be asked of the data.

I am open to any improvements from those who know better; after all, my spreadsheets tend to be models that are light on data but contain a substantial number of process steps.
 
So, my question is how do I explain database-style structure to people in a way that non-experts can understand and also so that they see the value in it and don't resist changing? I'm trying to articulate that mixing data and intermediate calculations is a bad thing, but I'm really struggling to come up with an explanation that's easy to understand.

I am one of those who has learnt this not because I have been taught, but because I dabbled in relational databases some years ago. I am completely self-taught - never had an IT lesson in my life. Here's how I explain it to people.

Do NOT muddle form and function. There are two stages to your project, and the first is the functional stage (the bit that makes data analysis easy). This needs to be a flat-file layout, which means one row per record. The second stage is the aesthetic (form) stage, where you report on your data in a format that is easily readable for the end user. This might be in the form of a dashboard that brings together the results of your data analysis and sits on top of the underlying functional data.

In short: your desired end result is not where you should start.
 
Back
Top