• Hi All

    Please note that at the Chandoo.org Forums there is Zero Tolerance to Spam

    Post Spam and you Will Be Deleted as a User

    Hui...

  • When starting a new post, to receive a quicker and more targeted answer, Please include a sample file in the initial post.

Temperature circle as seen on LinkedIn?

Status
Not open for further replies.

Diederik

New Member
today, when reading on my LinkIn feed I found an animation from a series of global temparature points per country since 1880 and I was stunned by the graphics. The animation was created by a Finnish meteo scientist, Annti Lipponen, and I felt immediately challenged to see if Ii could re-create it in Excel. (link to the linkedin feed: https://www.flickr.com/photos/150411108@N06/35471910724/in/photostream/)
At first I thought about a sunburst chart but that is definitely not the answer. More like a filled radar plot. The animation could be done with VBA.
But I got stuck! First in finding the used data (no luck with that) but most important with the chart itself. Even with a dummy set (all countries per continent, 1 year of fake temperature data) I'm not getting any closer. And I've done my share of custom graphs like vertical waterfalls with subtotals and the usual gannt charts with progress bars..... I even understand the Wall Mart animation from Jorge Camoes his book "Data at Work". So it's not like I don't know how to combine datasets to a chart.
Any thoughts? Or can I find the answer somewhere else on the forum....
 
Hi P45Cal, thanks for the reply. This is exactly how far I had come with my filled radar plot (identical :) ). So it proves I was on the right track and didn't miss any other Excel feature so far.
I like the idea of the "spoke chart" for the call centre report (hourly calls) and it looks promising to try and get all the features of the temperature circle into this type of chart.
Maybe somebody has options on how to color the spokes according to their value.
Ik know you cab change color when inverted/negative but this requires a more subtle approach. From blue towards red.
Kind regards, Diederik (Holland)
 
and didn't miss any other Excel feature so far.
…only if I haven't missed any; I've only installed 2016 recently and haven't explored new charts available.


Maybe somebody has options on how to color the spokes according to their value.
Ik know you cab change color when inverted/negative but this requires a more subtle approach. From blue towards red.
I don't know the base values, nor do I know the colours you want. In the attached I've added colour, but it is only a random colour. You'll need to change the line:
Code:
  xColours(i - 1) = RGB(Application.RandBetween(0, 255), Application.RandBetween(0, 255), Application.RandBetween(0, 255))
to a calculated colour based on the value of RotatedData(i, 1), perhaps in a separate function.
It's not very fast.
upload_2018-7-16_10-49-34.png
 

Attachments

  • Chandoo39161sample file.xlsm
    36.8 KB · Views: 3
p45cal:
thanks, this is a growing thing :)
the color will change according to the absolute delta to the average. So below average it wil be a blue-ish color (cold) and above average it will get red (warmth)
I will look into this tonight after office hours. My boss wants me to do his work during these hours :)
Thanks for this next step
 
Attached.
(I thought I'd posted this yesterday)
 

Attachments

  • Chandoo39161sample file.xlsm
    69.2 KB · Views: 8
Yes , I went there earlier but couldn't find ERSSTv4 version.
upload_2018-7-17_11-47-11.png

I downloaded the
upload_2018-7-17_11-51-25.png
instead but after unzipping it turns out to be a NetCDF file, a format I've never heard of before (it seems to be specific to climate people). If I get time (and inclination) I'll dig a bit deeper. (Why couldn't they supply this in a more friendly format?)
There's more information on the format here:
https://www.unidata.ucar.edu/publications/factsheets/current/factsheet_netcdf.pdf
and here:
https://www.unidata.ucar.edu/software/netcdf/docs_rc/
and here's an online converter (not tried):
https://mygeodata.cloud/converter/netcdf-to-xlsx
and perhaps more promising for Excel is:
https://code.google.com/archive/p/netcdf4excel/
 
I think that file referred to is an SBBX file not a .NC file type which the NetCDF4 programs use?

I also suspect that the data is in 2deg blocks around the globe
So it will probably need consolidation to countries first

Power Pivot could probably do that if we can get it to a CSV or XLSX file type
 
xlsx version of Land-Ocean Temperature Index ERSSTv5, 1200km smoothing:
https://www.dropbox.com/s/fggmypxirycon7a/Land Ocean Temperature Index ERSSTv5 1200km smoothing.xlsx?dl=0
Installed NetCDF4Excel on an old Office 2010 machine using the NetCDF4Excel_3_3_setup.exe file at https://github.com/NetCDF4Excel/project/releases and converted the .nc file.
Most of the sheets are mapping/lookup sheets. Most of the data is 32.767 (one thousandth of the Integer data type max) so I guess these may as well be blank. Not sure what the other values mean. Times/dates seem to tally more or less with Excel dates.
I'll try and lighten the file so it only contains meaningful data, then it'll be a case of mapping lat/lon to country names.
 
FYI - "ERSSTv5" is just the latest version of ERSST (Extended Reconstructed Sea Surface Temperature) data.

*.nc is netCDF format file.
https://www.unidata.ucar.edu/software/netcdf/docs_rc/

SBBX is binary file, that's used to produce the NetCDF file.

data set meta data contains following in 'gistemp1200_ERSSTv5.nc'
File read using netCDF4 module in Python3.6
Code:
root group (NETCDF3_CLASSIC data model, file format NETCDF3):
    title: GISTEMP Surface Temperature Analysis
    institution: NASA Goddard Institute for Space Studies
    source: http://data.giss.nasa.gov/gistemp/
    Conventions: CF-1.6
    history: Created 2018-07-12 10:36:22 by SBBX_to_nc 2.0 - ILAND=1200, IOCEAN=NCDC/ER5, Base: 1951-1980
    dimensions(sizes): lat(90), lon(180), time(1662), nv(2)
    variables(dimensions): float32 lat(lat), float32 lon(lon), int32 time(time), int32 time_bnds(time,nv), int16 tempanomaly(time,lat,lon)
    groups:

And these are meta data for variables.
Code:
<class 'netCDF4._netCDF4.Variable'>
float32 lat(lat)
    standard_name: latitude
    long_name: Latitude
    units: degrees_north
unlimited dimensions:
current shape = (90,)
filling off

<class 'netCDF4._netCDF4.Variable'>
float32 lon(lon)
    standard_name: longitude
    long_name: Longitude
    units: degrees_east
unlimited dimensions:
current shape = (180,)
filling off

<class 'netCDF4._netCDF4.Variable'>
int32 time(time)
    long_name: time
    units: days since 1800-01-01 00:00:00
    bounds: time_bnds
unlimited dimensions:
current shape = (1662,)
filling off

<class 'netCDF4._netCDF4.Variable'>
int32 time_bnds(time, nv)
unlimited dimensions:
current shape = (1662, 2)
filling off

<class 'netCDF4._netCDF4.Variable'>
int16 tempanomaly(time, lat, lon)
    long_name: Surface temperature anomaly
    units: K
    scale_factor: 0.01
    cell_methods: time: mean
    _FillValue: 32767
unlimited dimensions:
current shape = (1662, 90, 180)
filling off

Haven't got time today to study the data set more in detail. But will play with it when I find time.
 
Ok, I've finished studying the data set. One thing to note is that data starts from Jan, 1880 and is monthly.

This poses issue for Excel as Excel isn't well equipped to deal with dates prior to 1900.

As well, data is structured as matrix (lat, lon), so there's bit of transformation needed.

I've extracted out from year 2000 to current year, and flattened the data.

https://drive.google.com/file/d/1-KLIa3axR2ohxjGz7PWoZqp9KUk1ez_4/view?usp=sharing
 
Does anybody here know how to extract the country/year data from the file ?
I've been struggling with this. One way I came across that seems to look good is google maps API, the only problem being it's not so easily available since (just a few days ago) they changed its availability to those who have provided Google with a payment method (it's still free but they want this information). Given the recent mass hacking of personal data, even from reputable companies, that sort of information is not something I'm ready to give to Google just yet, especially as this is not even my project, nor do I want to buy anything else from Google.

So, I've taken various .nc files from that NASA web site, converted them to Excel, then extracted only those points on the Earth which have data associated with them and it comes to about 77k of them. Plotted on 3DMaps within Excel produces this sort of thing:
upload_2018-7-20_18-16-1.png

So, perhaps we can do some collaborative work here (I'll atttach that list of lat/longs here); anyone with a Google account that does have access to Google APIs could perhaps process those lat/longs and produce a lookup table so that we can assign a country and continent to many of those points?

So Hui was quite right with:
Hui said:
I also suspect that the data is in 2deg blocks around the globe
So it will probably need consolidation to countries first

Power Pivot could probably do that if we can get it to a CSV or XLSX file type
but I'm not familiar enough with Power Pivot yet to do this. See linked-to xlsx file in my message #14.

Haven't heard from @Diederik for a while…
 

Attachments

  • UniqueLatLongs.xlsx
    99.6 KB · Views: 2
Last edited:
I have some code to Reverse Geocode Data to get Countries

But was going to attempt it in Power Map also
Will post it later

After getting countries
Filter out the oceans
Then consolidate country data
 
Hi everybody, there has been done a lot of work by you. I've been sort of busy with my job lately so my apologies for being "silent". I was puzzled by the dataset myself with all the Nasa files and couldn't get it in the right format, like "Country";"continent";"year";"temp"
Since 1880 is just a number the Excel date issue is actually not an issue. But anyway I got nowhere near a dataset like I described. And seeing the struggle from others to derive data from the Nasa pages I'm not even a little embarrassed that I failed :)
The original animation was done in Phyton I think and I have no knowledge of that language. I also emailed the Finnish scientist for tips on his dataset, but no answer. The next weeks I'll be on a holiday trip to Italy so I'm afraid i will have to let is rest for I while. But as soon as I can get my hands on a workable dataset I'll try to pick it up again.
 
Excel isn't well suited for visualizing this data. Due to size and how the data is structured.

You can use Python (Anaconda is probably the easiest distribution to work with), as previously mentioned to read contents of NetCDF using netCDF4 module.

Played a little with MapBox API using various Python modules.
Currently I haven't adjusted color scale. So, I believe it's using plotly's default 0 to 1 (0 to 100%) scale.

You can download file from link below.
https://drive.google.com/file/d/1xxRt-4WTCNKD1CStIZG3XoAN4s84aGhB/view?usp=sharing

Code:
from netCDF4 import Dataset
import xarray as xr
import numpy as np
f = Dataset('C:/Test/nc/gistemp1200_ERSSTv5.nc', 'r')
ds = xr.open_dataset(xr.backends.NetCDF4DataStore(f))
df = ds.to_dataframe()

df_sub = df.iloc[(df.index.get_level_values('time') == '2017-12-15') & (df.index.get_level_values('nv') == 0)]

step = 1.0
to_bin = lambda x: np.floor(x / step) * step
df_sub["latbin"] = df_sub.index.get_level_values('lat').map(to_bin)
df_sub["lonbin"] = df_sub.index.get_level_values('lon').map(to_bin)
groups = df_sub.groupby(("latbin", "lonbin"))
df_flat = df_sub.drop_duplicates(subset=['latbin', 'lonbin'])
df_no_nan = df_flat[np.isfinite(df_flat['tempanomaly'])]
import plotly.offline as py_off
import plotly.graph_objs as go
mapbox_access_token = 'Use API key obtained from Mapbox site here'
data = []
data.append(
    go.Scattermapbox(
        lon=df_no_nan['lonbin'].values,
        lat=df_no_nan['latbin'].values,
        mode='markers',
        marker=go.Marker(
            color=df_no_nan['tempanomaly'].values,
            colorscale='Bluered'
        ),
    )
)
      
layout = go.Layout(
    margin=dict(t=0,b=0,r=0,l=0),
    autosize=True,
    hovermode='closest',
    showlegend=False,
    mapbox=dict(
        accesstoken=mapbox_access_token,
        bearing=0,
        center=dict(
            lat=38,
            lon=-94
        ),
        pitch=0,
        zoom=0,
        style='light'
    ),
)
fig = dict(data=data, layout=layout)
py_off.plot(fig, filename='Da_Weather_2.html')

upload_2018-7-23_10-7-42.png
 
To aggregate it to country level, you'll first calculate/find boundaries for each country of interest (E, W, N, S limit of lat & lon value). Then average out tempanomaly values. Or have Lookup table/dictionary listing each lat/lon coordinate (at 2 degree square) and corresponding country/geographic location that it falls under.

Edit: Here's link to csv file that contains min & max values for lat/lon for each country. I believe wrapped means it crosses +/- threshold.
https://data.humdata.org/dataset/bounding-boxes-for-countries
 
Last edited:
Did some more digging. Found Python code used by Antti Lipponen used to generate temperature circle via Python.
https://gist.github.com/anttilipp/6b572512ef53cfc6bf949afdc8eb6720
Note: He hard coded data values.

To get country via lat/lon in Python, you can use number of methods.
1. Import below to your python project/environment
https://github.com/che0/countries
Then download world border shape file from...
http://thematicmapping.org/downloads/world_borders.php
Or file from below.
http://gis-lab.info/qa/vmap0-eng.html

2. Use Google API to retrieve country name from lat/lon coordinate.
 
I messed up few times transforming Pandas dataframe. But I think I got the right data and mapping to country and continent now.

FYI - Added "Arctic" as continent/region.

Average yearly temperature anomaly from 1880-2017 (there is 2018 in there but it's partial), by country.
 

Attachments

  • avgTempAnom.xlsb
    295.5 KB · Views: 5
Status
Not open for further replies.
Back
Top