• Hi All

    Please note that at the Chandoo.org Forums there is Zero Tolerance to Spam

    Post Spam and you Will Be Deleted as a User

    Hui...

  • When starting a new post, to receive a quicker and more targeted answer, Please include a sample file in the initial post.

Mastering the Exclusive OR of Multiple Occurrences

Richard Riffle

New Member
Hello everyone. I've been a user of Chandoo.org for a while now but this is my first post. I understand the great service offered by the members of this organization and I'm hoping I'll benefit by your collective knowledge. Here's my problem...

I have a spreadsheet that has 40,000+ rows and 167 columns containing 6,500,000+ data values. Each row of data is a record for a specific time interval with the first column of data containing a counter for each time interval.

Of the 166 other columns of data:
  • 81 columns contain Boolean data (represented by 0 = False, 1 = True)
  • 5 columns contain integer values representing enumerated data (represented by 0 = EnumeratedValue1, …, N = EnumeratedValueN and the maximum value of N is11)
  • 78 columns containing real numbers
  • 2 columns containing integer values)
I would like to reduce the amount of data contained in the spreadsheet for several reasons: to more easily identify changing values; to improve the responsiveness of Excel; and, ultimately, to make it easier to import into Visio as a series of timelines.

I used the approach outlined in Excel to the Next Level by Mastering Multiple Occurrences to create a column containing every counter value (the time interval from the first column) corresponding with each possible value of a Boolean or enumerated data column. Shown in the attached spreadsheet.

Rather than every counter value for each Boolean or enumerated data column – all 40,000+ of them per column – I’d like to have just the counter values for the changes in the data. Shown in the attached spreadsheet.

I'd like to do something similar with the real data using a change in value greater than a certain percentage of full range of the data. Shown in the attached spreadsheet.

Conceptually for the Boolean data, the counter information for a row should only be extracted if the exclusive OR of the data value in the same row and the data value of the row preceding it is TRUE. I suspect the Boolean case will be extensible to the enumerated and real data cases...

Ideally, there would be a macro that would process the data in the source spreadsheet file and place the data in a target spreadsheet file, either in columns in a single worksheet or in separate worksheets for each data column.

Any assistance would be greatly appreciated.
 

Attachments

  • Data reduction example.xlsx
    19.2 KB · Views: 6
Way past my bedtime, so despite incompleteness and in haste.
In the attached the important column is K; change the value in cell I1 and see changes in column K. The columns to the left are only my workings leading up to that formula in column K.

ps I tweaked your formulae further to the right in the sheet since they were taking so long to calculate because they were using entire columns.
 

Attachments

  • chandoo33288 Data reduction example.xlsx
    21.7 KB · Views: 6
p45cal, thank you for replying to my message. I have downloaded the example file and hope to use it soon. After a short review I think it will serve my purpose. Thanks again.
Regards, Rich
 
Back
Top