Last week over at the Chandoo.org Forums, Birko asked a question about How to Import some Numbers as Times.
“I have imported some data that comes in as a number that I need to convert to h:mm. The data string will be either 1,3,4,5,6 integers long and looks like this…eg
Imported Number    Need to equal this (h:mm)
0                  0:00
100                0:01
1000               0:10
10900              1:09
235900             23:59
Can someone please provide a smart formula to convert this (assume data is in cell A1).”
Today in Formula Forensics we will look at how this problem was solved, with a solution that may surprise you.
Importing Numbers as Times.
When I first saw this data, I started by looking for patterns.
Working backwards through the list:
I can see that 235900 is 23 Hrs, 59 Min and 0 seconds
I can see that 10900 is 1 Hr, 9 Min and 0 seconds
I can see that 1000 is 0 Hrs, 10 Min and 0 seconds
I can see that 100 is 0 Hrs, 1 Min and 0 seconds
I can see that 0 is 0 Hrs, 0 Min and 0 seconds
I then started to think about how to extract the Hours, Minutes and Seconds independently from the text using a series of Left, Right and Mid functions, and quickly realised that, because the strings vary in length, the formulas would become complex, as I would need to allow for each string length.
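To see why the varying lengths are a problem, here is a minimal sketch in Python (used purely for illustration; it is not part of the Excel solution). The function name naive_hours is hypothetical. It takes the first two characters of the unpadded text, which only gives the hours when the number is six digits long.

```python
def naive_hours(n):
    """Take the first two characters of the unpadded text of n."""
    return str(n)[:2]

# Only the six-digit value yields the real hours; the shorter ones do not.
for n in (235900, 10900, 1000, 100, 0):
    print(f"{n:>6} -> {naive_hours(n)!r}")
```

For 10900 (which should be 1 hour) the first two characters are "10", so a fixed-position extract cannot work without first making every string the same length.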
What if I pad the strings with leading 0’s and then extract them?
That is possible, but as a single formula it will be long and cumbersome, as the padding has to be repeated for each of the Hour, Minute and Second arguments of the Time() function.
So padding may work but is cumbersome. Then came a light-bulb moment:
What if I use the Text function to do the padding?
And I quickly posted the following formula:
=(LEFT(TEXT(A1,"000000"),2)/24)+(MID(TEXT(A1,"000000"),3,2)/1440)+(RIGHT(TEXT(A1,"000000"),2)/(24*3600))
As a Time is just a number between 0 (midnight) and roughly 0.99999 (11:59:59 pm), I can extract the Hours, Minutes and Seconds separately and then simply add them together to get the actual time.
I can use the Text function to display the strings in a consistent format, which allows me to use the Left, Mid and Right functions to retrieve the Hours, Minutes and Seconds from the appropriate places.
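The idea that a time is a fraction of a day can be sketched outside Excel as well. The Python below is an illustration only, and the function name day_fraction is hypothetical. It adds the three parts using the same divisors as the formula (24, 1440 and 86400).

```python
def day_fraction(hours, minutes, seconds):
    """Return a time of day as a fraction of 24 hours, like an Excel time value."""
    return hours / 24 + minutes / 1440 + seconds / 86400

print(day_fraction(0, 0, 0))     # midnight
print(day_fraction(12, 0, 0))    # noon
print(day_fraction(23, 59, 59))  # one second before midnight
```

Midnight comes out as 0, noon as 0.5, and 23:59:59 as a value just under 1.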
Let’s work through this formula section by section and see what is going on.
Hours
The Hours component of the formula is
=(LEFT(TEXT(A1,"000000"),2)/24)+(MID(TEXT(A1,"000000"),3,2)/1440)+(RIGHT(TEXT(A1,"000000"),2)/(24*3600))
=(LEFT(TEXT(A1,"000000"),2)/24)
Working from the middle out, this formula takes the value in A1 and converts it to text using the format “000000”, which pads it with leading zeros to 6 digits
So using our data:
235900 will convert to 235900
10900 will convert to 010900
1000 will convert to 001000
We can now use a Left() function to extract the hours from the first 2 characters of the converted string
Using our examples:
Left("235900",2) = 23
Left("010900",2) = 01
Left("001000",2) = 00
To convert hours to a Time we simply divide by 24
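As a rough cross-check of the Hours step, the same pad-then-extract logic can be sketched in Python. This is an illustration under assumptions: the input is a non-negative whole number of at most six digits, and the name hours_fraction is hypothetical.

```python
def hours_fraction(n):
    """Mirror LEFT(TEXT(A1,"000000"),2)/24 for a non-negative integer n."""
    padded = format(n, "06d")  # like TEXT(A1,"000000"): zero-pad to six digits
    hours = int(padded[:2])    # like LEFT(...,2): the first two characters
    return hours / 24          # hours as a fraction of a day

for n in (235900, 10900, 1000):
    print(format(n, "06d"), hours_fraction(n))
```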
Minutes
The Minutes component of the formula is
=(LEFT(TEXT(A1,"000000"),2)/24)+(MID(TEXT(A1,"000000"),3,2)/1440)+(RIGHT(TEXT(A1,"000000"),2)/(24*3600))
=MID(TEXT(A1,"000000"),3,2)/1440
Once again, working from the middle out, this formula takes the value in A1 and converts it to text using the format “000000”, which pads it with leading zeros to 6 digits
So using our data:
235900 will convert to 235900
10900 will convert to 010900
1000 will convert to 001000
We can now use a Mid() function to extract the minutes from the middle 2 characters of the converted string, starting at character 3
Using our examples:
Mid("235900",3,2) = 59
Mid("010900",3,2) = 09
Mid("001000",3,2) = 10
To convert Minutes to a Time we simply divide by 1440 (1440 is how many minutes are in a day = 24 * 60)
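The Minutes step can be sketched the same way. Note that Mid() counts characters from 1, while Python slices count from 0, so Mid(text,3,2) corresponds to the slice [2:4]. This is an illustration only; minutes_fraction is a hypothetical name and the input is assumed to be a non-negative whole number of at most six digits.

```python
def minutes_fraction(n):
    """Mirror MID(TEXT(A1,"000000"),3,2)/1440 for a non-negative integer n."""
    padded = format(n, "06d")    # zero-pad to six digits, e.g. 10900 -> "010900"
    minutes = int(padded[2:4])   # characters 3 and 4 (1-based), as MID(...,3,2)
    return minutes / 1440        # 1440 minutes in a day

for n in (235900, 10900, 1000):
    print(format(n, "06d"), minutes_fraction(n))
```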
Seconds
The Seconds component of the formula is
=(LEFT(TEXT(A1,"000000"),2)/24)+(MID(TEXT(A1,"000000"),3,2)/1440)+(RIGHT(TEXT(A1,"000000"),2)/(24*3600))
=RIGHT(TEXT(A1,"000000"),2)/(24*3600)
Once again, working from the middle out, this formula takes the value in A1 and converts it to text using the format “000000”, which pads it with leading zeros to 6 digits
So using our data:
235900 will convert to 235900
10900 will convert to 010900
1000 will convert to 001000
We can now use a Right() function to extract the seconds from the last 2 characters of the converted string
Right("235900",2) = 00
Right("010900",2) = 00
Right("001000",2) = 00
To convert Seconds to a Time we simply divide by 86400 (86,400 is how many seconds are in a day = 24 * 60 * 60)
Total Time
To get the total Time we simply add the Hours, Minutes and Seconds components together. With the cell formatted as h:mm, the result displays as a time
=(LEFT(TEXT(A1,"000000"),2)/24)+(MID(TEXT(A1,"000000"),3,2)/1440)+(RIGHT(TEXT(A1,"000000"),2)/(24*3600))
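Putting the three parts together, the whole formula can be checked against Birko's sample data with a short Python sketch. This is an illustration, not the Excel solution itself. Both function names are hypothetical, and the input is assumed to be a non-negative whole number of at most six digits.

```python
def imported_number_to_time(n):
    """Convert an imported HHMMSS-style integer to a fraction of a day."""
    padded = format(n, "06d")    # like TEXT(A1,"000000")
    hours = int(padded[:2])      # like LEFT(...,2)
    minutes = int(padded[2:4])   # like MID(...,3,2)
    seconds = int(padded[4:])    # like RIGHT(...,2)
    return hours / 24 + minutes / 1440 + seconds / 86400

def as_h_mm(fraction):
    """Render a day fraction as h:mm, dropping any seconds."""
    total_minutes = round(fraction * 86400) // 60
    return f"{total_minutes // 60}:{total_minutes % 60:02d}"

for n in (0, 100, 1000, 10900, 235900):
    print(n, as_h_mm(imported_number_to_time(n)))
```

The printed h:mm values match the "Need to equal this" column in the original question.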
Download
You can download a copy of the above file and follow along, Download Here.
Formula Forensics “The Series”
You can learn more about how to pull Excel Formulas apart in the following posts
Formula Forensics Needs Your Help
I urgently need more ideas for future Formula Forensics posts and so I need your help.
If you have a neat formula that you would like to share and explain, try putting pen to paper and draft a post like the one above; or
If you have a formula that you would like explained but don’t want to write a post, send it to Chandoo or Hui.















