• Hi All

    Please note that at the Chandoo.org Forums there is Zero Tolerance to Spam

    Post Spam and you Will Be Deleted as a User

    Hui...

  • When starting a new post, to receive a quicker and more targeted answer, Please include a sample file in the initial post.

Extract digits before the decimal point.

persol

New Member
Hi Chandoo, hope my note finds you well.

My question today has to do with something I have tried but can not find a way to resolve. I want to extract digits from a set of numbers (both negative or positive). For instance, take as an example the below set of amounts:

A) $12,340.67

B) $98,700.21

C) $-7,892,000.98

D) $2,000.23

E) $-70,201.16


I want to be able to extract (separately in various columns) the 1st, 2nd, 3rd digits before the period: A) 0,


A) 0 A) 4 A) 3, etc

B) 0 B) 0 B) 7, etc

C) 0 C) 0 C) 0, etc

D) 0 D) 0 D) 0, etc

E) 1 E) 0 E) 2, etc


I have started a formula =(MOD(A2,10^1)-MOD(A2,10^0))/(10^0) that seemed to work but the negative numbers are giving me a bit crazy. Is there a chance I can rely on your expertise? Any help will be greatly appreciated.
 
Hi,


Try simply the mid and find string functions.


For instance place your numbers in Col A, and labes in Col B, Col C and Col D as below.


Col A Col B Col C Col D

Number Unit Tens Hundreds (Labels for your ref)

-70201.16 1 0 2


use formulaes

Col B = =MID(A3,FIND(".",A3,1)-1,1)

Col C = =MID(A3,FIND(".",A3,1)-2,1)

Col C = =MID(A3,FIND(".",A3,1)-3,1)


note: the find function gets your the decimal position and then - 1 and incrementing gets you the digit like units, tens and hundreds.


This is specific to your question, but I am curious to know what you are trying to achieve?


Regards,

Prasad
 
Prasad, I surely appreciate your help. I just tried it and the #value! message is the result, Am I missing a thing?
 
Sorry Prasad, I had missed the last part of your question. I have this schedule I am trying to finalize before I use it as a basis for a dash board. Basically, I want to be able to look at all amounts in a column (a ton of purchases made by procurement department) that would most likely have from one to six digits that would most likely be zeros (because purchases without an approved purchase order (or without one) usually are round numbers that end in zero, two zeros and so on). Hence my need to isolate these numbers and then filter them out. I can perhaps sort the amounts and visually scan through 600 thousand records and fall asleep at my desk. You see my dilemma? Let me cut and paste a piece on my list of amounts:

$212.00

$435.00

$2,500.00

$2,500.00

$5,000.00

$300.00

($300.00)

($349.00)

$115.00

$234.00

($349.00)

$115.00

$234.00

$15,000.00

($10,000.00)

($5,000.00)

($300.00)

($380.00)

($330.00)


I am looking for a way to count the number of zeros prior to the decimal point because it will help me isolate which amount needed an approval up to $25 (Rupee) or Bolivian Pesos in my case, an approval of up to $250.00, $500.00 $1,000.00, 10,000.00 or even $1,000,000.00. In my daily work I noted that some folks split their purchases to avoid approvals so they take $700 purchase and submit one for $500 and then another for the remainder, that way the employee does not have to explain to someone higher that he/she is spending more than allowed or did not have the guts to ask for the approval form the higher approving manager. I hope I did not put you to sleep with this note. But, basically that is the problem I have and want to present with dashboards (among other things).


Thanks,

Persol
 
Persol, sooner or later people at procurement would catch up and simply put in a purchase without PO by not submitting a zero next to the decimal point. Plus you still another process to add up all the split purchases going to the same vendor. How would tracking the number of zero before the decimal point assist you in identifying the "cheaters"?
 
Here's a single formula that should work, and returns a numerical value. Assuming first number is in A2:

=INT(MOD(INT($A2),10^COLUMN(A$1))/10^(COLUMN(A$1)-1))

Copy to the right as far as would be needed. First column will the the ones digit, then ten's, then hundred's, etc.
 
=SUMPRODUCT(IF((A1/(10^(ROW($1:$6))))=INT(A1/10^(ROW($1:$6))),1,0))


Use this as an array formula (Ctrl-Shift-Enter) and you can have it in a single cell. Doesn't work if you just use enter.


Note: Increase the "$6" to get numbers larger than 1X10^6.
 
Hi Luke,


using your formula above, the negative numbers would not apply properly. if you have a -5024, the digit showing up would be 6, 7 and 9.


However, I guess it can still be used as a flag-out purpose.
 
Hi Persol,


If you incorporate
Code:
ABS() around the reference to [code]$A2 the sign will be eliminated for formulas that don't already handle negatives.


For Mike's version you could handle any number of places by replacing [code]ROW($1:$6) in his formula with ROW(INDIRECT("1:"&A1/(FLOOR(LOG10(ABS(A1)),1)+1)))
. Of course, you still have to put the array in a large enough range of cells.


If all you want to know is the number of decimal places, you can use:

=FLOOR(LOG10(ABS(A1)),1)+1[/code]


You could also use

=LEN(TEXT(TRUNC(A1),"#;#"))[/code]

to get the number of places.


If it's being calculated for a lot of cells, the mathematical solution might be faster. Actually, as written, the mathematial solution will give an error if there are 0 significant digits before the decimal (i.e. $0.25) will give a #NUM error, but $1.00 will be 1.


Asa
 
Hi Persol,


Hope you were able to apply the solutions mentioned above. My initial formula works if you have any numbers after decimals, else the find() function did not get any "." position and resulted in #value.


I am not so good at CSE Formulaes, so i continued to work with basic formulaes, here is updated one. Assuming A2 has value, Place this in

Col B =(MID(A2,IF(ISERROR(FIND(".",A2,1)),LEN(A2)-0,FIND(".",A2,1)-1),1))

Col C =(MID(A2,IF(ISERROR(FIND(".",A2,1)),LEN(A2)-1,FIND(".",A2,1)-2),1))

Col D =(MID(A2,IF(ISERROR(FIND(".",A2,1)),LEN(A2)-2,FIND(".",A2,1)-3),1))


This works when there is whole number or/and neg numbers as well.

Note: how the number is decremented and incremented to find unit, tens and hundreds.


Finally, I was wonder if you try using pivot table, you might find easier methods of group by submitter, date or vendor. Just a thought.


Regards,

Prasad
 
I think, that text formulas can be useful in this case, for instance:

Code:
=TEXT(A3,"000000000.00;000000000.00-;000000000.00")
will turn numbers into text of constant lenght with leading zeros, the only exception will be negative numbers with "-" added at the end.
 
Thanks Fred for spotting that error (I keep forgetting that INT goes toward smaller number), and thanks to asa for pointing out a solution.
 
Folks, I have to thank you all very much, I am not just saying it lightly, my heartfelt thank you to all of your suggested solutions and the effort invested, thank you indeed. All for the various suggestions I received made me think hard (loved it) and noted that . I used some suggestions for other purposes, yeah!!! Gave other potential uses.


Prasaddn, hit it closer to what I was looking for. The 3 formulas do give me the expected digits; however, when the amounts are negative (-$1.50) and am looking for the 2nd or 3rd digits I either get a "_" or "#value!". I can easily ignore them and filter all my zeros by bucket and be done with my tables and prepare my charts. However, I was wondering if there is a way to get an empty or blank space instead of either the "-" or the #value! results. I tried inserting the "" to push a blank but it did not work.


To help you visualize, I just prepared the below table that outlines the various degrees of zeros. Where there are no zeros (currently I see "-" and "#value") I manually bypassed them. So, if you imagine my first 9 raws, i expect to see the actual count of each zero from each amount, if there are no zeros in the amount then leave it empty.


Number Levels of Roundness


4,230.00 1

1.000.00 3

2,000.00 3

975.00

123.81

9,000,000.00 6

300.00 2

100.01 2

10,000.23 4


At the end of the day, my litle chart will summarize all the 1s, 2s, 3s, etc as follows


Zero Levels Levels of Roundness Amounts


1s 1,025 (as an example $723,000 (an assumed totals)

2s 721 " $1,000,000 "

3s 7 " $10,000,000 "

etc


I used the =countif and the =sumif to summarize the above counts and the amounts.


Finally, Prasaddn, the first thing that came to my mind was the trusty pivot table but it did not give me what I explained above. But your formulas made me smile from ear to era, thank you so much.


Cheers,

Persol
 
Mike's should be more efficient than text processing, too.. It didn't work for me as-is though, I had to truncate the digits after the decimal with this modification for it to work:


Code:
=SUMPRODUCT(IF((TRUNC(A1)/(10^(ROW($1:$6))))=TRUNC(A1/10^(ROW($1:$6))),1,0))


Ctrl-Shift-Enter this array formula.


His formula required the cents to be removed, thus the TRUNC(A1) instead of just A1, and I changed the INT() later in the formula to match. TRUNC removes the digits after the decimal without rounding (or could be called rounding towards 0), INT rounds the number down -- the difference is in the negative numbers.


I suggested an alternative to increasing the 6 in ROW($1:$6) previously to have the formula handle larger numbers, but I wouldn't be surprised if the formula was just as fast or faster simply increasing the row #s for the largest amount you might ever handle. 6 handles numbers up to the millions (max 6 trailing zeros), 7 10 millions, 9 billions...
 
Mike and ASA, I revisited your suggested formula, and I am extremely pleased that it worked just beautifully!!!! I apologize that I did not copy it correctly, I missed a cell reference and later it changed due to a move. Now I decided to use cell names to avoid that problem.


Hey guys, thank you all so much!!!! Brilliant work.


Be well,

Persol
 
Mike and Asa,

I celebrated too soon, I seemed to have an error that reads: "Excel ran out of resources while attempting to calculate one or more formulas. As a result, these formulas cannot be evaluated."


All I tried to do was to create the column where the array would look for the amount and the array listed on next column. For illustrations, refer to the file I attached in Windows Live (Skydrive) https://skydrive.live.com/?cid=d8c3f9f208acdadb&sc=documents&event_source=CloseButton#!/?cid=d8c3f9f208acdadb&sc=documents&id=D8C3F9F208ACDADB%21107


Inside the file "Chandoo_Tests" you will see two columns, one for purchases and another for receivables. Beneath each there are two columns named amounts and digits. I would like the formula to reside on the columns titled "digits" on down to the last raw (in my original file I have up to a million raws.
 
You're very gracious, Persol.


I have another idea.. unfortunately -- it doesn't work!!! There's no reason for it not to, though, other than the fact that I've obviously madfe a basic error in my formula.


Someone handy with array formulas (Mike?) will hopefully be able to fix this:

Code:
=MAX(GCD(TRUNC(ABS(A1)),10^(ROW($1:$9)-1)))


This should return the greatest power of 10, or it's multiple, that the number (after decimal truncation) is divisible by.


The MAX() part is not working because for some reason the rest of the formula doesn't seem to be generating an array.. Maybe Mike knows why?


If working, this formula would give you.. (I'll put a $ amount followed by the return of the formula):

[pre]$      12,340.67   $          20
$      98,700.21   $         100
$  (7,892,000.98)  $       4,000
$       2,000.23   $       2,000
$     (70,201.16)  $           1
$    (999,999.96)  $           1
$   1,000,000.00   $   1,000,000
$  10,000,000.00   $  10,000,000
$ 100,000,000.00   $ 100,000,000
[/pre]
This is a similar but slightly higher resolution measure of roundness. The first $ amount is "twenties of dollars", the second "hundreds", the third "four-thousands", etc.


It could easily be modified (with the array portion working) to find roundness other than powers of ten, too. Maybe $5s, $25s,...


Peace

Asa
 
Asa, I appreciated the update, thank you for the kind words. However, all of you that volunteer help are the most courteous and most kind.


Cheers,

Persol
 
Hi Persol,

I just saw your prior update about trouble... I can't open the file on Sky Drive, it says "This item might have been deleted, expired, or you might not have permission to access it. Contact the owner of this item for more information."


Make sure it is publicly accessible.

Thanks

Asa
 
Hi Asa,


I just pasted the document in a folder called Chandoo (sorry real Chandoo, it is easy to remember and identify). The folder is in MediaFire.com, se the link below:


http://www.mediafire.com/?5by06dqmzffl8if the fiel is tittled "Chandoo_Test.xlsx"


Now, I made sure it was unprotected and is shareable with the public - I have no confidential data in there. I hope it helps you.


Cheers,

Persol
 
The first of the mediafire links worked fine, didn't try the second.


OK--

* in the first tab (Degrees of Roundness) the formjula in column B was fine except that the ROW portion was changed to
Code:
ROW($3:$8) -- it must be [code]ROW($1:$6) for up to 6 zeroes.. you can change the 6, but the first number must be a 1.

* in the "ASA's" tab, the formula is entered as a 29-cell array ranging from B1:B29.  It needs to be entered as a single cell array, then that array copied to the remaining cells.


One more update recommendation for the formula:

[code]=SUM(IF((TRUNC(A3)/(10^(ROW($1:$6))))=TRUNC(A3/10^(ROW($1:$6))),1,0))


The SUMPRODUCT
can be changed just to SUM[/code] - as there are no factors listed, just the one calculation.


I am not getting any out of memory errors.. Of course, I don't have a million rows.


It may make sense to replace the array formula with a short UDF function in VBA, which may require less memory. The array function can also be replaced with either a multi-column calculation or a really long formula. But I don't see why it would run out of memory. Make sure you enter the formula with just one cell selected when you hit ctrl-shift-enter. A large array could cause memory problems. Then copy that one cell down the column. If you already have a multi-cell array (as in the ASA's) tab, you have to highlight the entire array of cells at once to delete/clear them.


If you are interested in my last formula (with GCD[/code]), in absence of a working array-formula version of it, it would also be simple to calculate that with a few helper columns, which I can describe.
 
Alright Asa, I like your updated formula. It does all that I need. It is now easier to create a table that summarizes the digits and easier to pinpoint the potential problems. I already got dynamites for my meeting scheduled for next Monday.


One thing that has been bugging me, how did you figure that out? I was trying to put myself in your shoes thinking it logically and was not going anywhere … it must be that I was up all night trying to solve few other schedules. Coffee did not help, the clock was marching on none stop … you know where I am going. Why does the array (formula above as applied to the template you saw) needs to highlight the first 6 raws? I was expecting it to be the columns (say $A:$A)? Would you mind sharing that bit of knowledge?


Hey Asa, thank you so very much for all the support and patience.


Cheers,

Persol
 
Back
Top