You don't need to be an 'investor' to invest in Singletrack: 6 days left: 95% of target - Find out more
Sorry - I thought I was OK with Excel but I need to join the legions of "Excel help!" threads... Bit more generic this one.
It's highway metrics - specifically vehicle counts. In this case cycling but there are other sheets covering cars etc.
I've inherited a vast spreadsheet due to a guy in my team retiring. He's done this for years and although he was very conscientious with it, there's almost no coding and what there is is very messy, stuff like:
D9+D10+D11.....+D23 rather than simply =SUM(D9:D23) so it's sometimes difficult to see exactly where some of the figures have been derived.
In the column is listed the sensor from which the count was taken. Down the left is month/year.
And then a number stating the monthly count of cyclists across that sensor. All well and good except it's incredibly hard to actually analyse, it is literally just vast columns of numbers and a few of % which is easy enough to work out.
He's also added the counts from each sensor to give an overall monthly count across all sensors but then manually typed that number into a field in a new sheet (simply called "monthly totals") so rather than each sheet talking to the other, it's all been manual input. 🤯
I've got a nasty feeling I'm going to need to re-do the whole thing cos this goes well over into column BD so it's a shedload of side to side scrolling (or hiding columns). Also new sensors have been added in more recently, some old sensors taken offline so direct comparisons going back 10+ years are near impossible - part of my role here is to re-baseline the info rather than be comparing to historic data.
How would you sort and display this data and start to derive trendlines, % change etc? I've got a feeling that pivot tables should be able to do a lot of this but there's just so much data, so many columns of numbers that I'm a bit overwhelmed with where to actually start!
Any guidance or suggestions much appreciated!
Trace precedents and trace dependents is a very handy tool when you are trying to unpick someone else’s logic, or lack of it.
If I understand it correctly, the raw data is pretty basic (counts by month, by transport type, for each sensor), there's just a lot of it? As someone who works with big spreadsheets that are just columns and columns of numbers, I'd say that this is what pivot tables are really good at, and they have the ability to be changed easily on the fly.
I'd be tempted to just do a pivot table with the whole of the raw data as the source data, and just have a bit of a play. You'll get a sense of what works and you can go from there
...oh, and also as someone who inherited massive spreadsheets from someone else, it's often easier to start again than try and unpick what they did.
of course, at some point some poor bugger is going to inherit mine 🙂
+1 to IHN
sounds like it should be in a database / access.
but in excel you could always use pivots, after all the front end of power BI / microstrategy / et al is just a fancy pivot table, holding all the data, to slice and dice as you see fit
personally i'd start it all again, a few days effort over working with sh1te is my preference always
I'd echo the start again comments. Preserve the raw data, work out what output you need then build it yourself in a way you can understand and maintain (and not necessarily in Excel). Yes there will be differences between the figures they got and the ones you get, but understand those and where appropriate justify why they've changed.
The question I'd be asking is "is Excel the best tool to process this data" and then going from there. It's sounds to me an awful lot like you're hammering in screws and looking for a better hammer.
Also worth looking at the end to end process - where does the data come from, how is it delivered to you, what's the best way to get that into a processing tool, where does your output go, how often is it updated, who needs to see it. Some of that stuff might point you away from Excel.
The question I’d be asking is “is Excel the best tool to process this data” and then going from there. It’s sounds to me an awful lot like you’re hammering in screws and looking for a better hammer.
True, but sometimes you only have a hammer available.
Also worth looking at the end to end process – where does the data come from, how is it delivered to you, what’s the best way to get that into a processing tool, where does your output go, how often is it updated, who needs to see it. Some of that stuff might point you away from Excel.
Old sensor data came off induction loops buried in the road or on cycle paths and that info was downloaded monthly by the County Council and sent over.
New sensors are Vivacity which are live, real time etc and have a nice dashboard where you can download reports (usually .csv although some you can get as pdf) so it's vastly superior data.
It’s sounds to me an awful lot like you’re hammering in screws and looking for a better hammer.
Currently I only have Excel unless I can persuade the council to cough up for a better analytics package. I've got a meeting with Vivacity in a couple of weeks to look at their analysis options and methods which should be interesting.
Just played around with swapping rows and columns and actually it's starting to take shape a bit more. It's useful discussing it with you all because you're all tech minded and it's a really helpful forum. 🙂
If you moved it to a Google Sheet (which is almost exactly the same as Excel anyway) you could use Looker Studio to graphically visualise all the data in realtime (well almost..... 15 minute delay)
Makes you look really clever 🙂
https://developers.google.com/looker-studio
True, but sometimes you only have a hammer available.
Other tools are always available - you just need to know what to look/ask for! Sometimes the learning curve with something else (e.g. python is free!) looks daunting, but can actually pay off. Sometimes you actually need to whack a screw in right now and you don't have time to go find the right tool to do it properly.
Looks like the forum ate my response. If I'm understanding the layout correctly, Power Query will let you transform it into something much easier to work with. Rather than a column for each sensor you end up with 3 - date, sensor, value. This will be much easier to use with pivot tables, charts and the various analysis tools in Excel.
In a new file, Data > Get Data > From File. Load in the source data table and go to Transform Data. Select all the sensor columns (not the date) and choose Unpivot Columns from the Transform tab.
I set up a few other sheets within the workbook and tried various options including Power Query but eventually ended up splitting the old (induction loop) data out and just pulling through the far more comprehensive Vivacity data and focussng on that which cut out a huge chunk of data. I've archived the old set.
Rearranging all the info has given me a much clearer sheet with a set of Pivot Tables to analyse it which has the benefit of removing all the % change columns from the sheet too.
Just shown it to a colleague who was absolutely blown away by it all.
Going to add in a dashboard and tidy it up then present it to the manager next week - I know he's not great with Excel so it'll look like magic to him. 🤣
Thanks for the help - in particular @IHN for confirming my suspicion that starting from scratch was the best way to go and @jam-bo for the Trace command which helped unpick the utter mess of "coding" that the previous incumbent had going on.