
The alpha, beta, delta of HESA Data Futures

It feels like we’ve been talking about this for years. Now it’s finally here: the long-awaited HESA Data Futures coding manual. Well, actually, it’s been out since mid-October, but at the time I was a little too preoccupied with other activities to take a proper look! The cynical part of me likes to wonder if this was a deliberate ploy to delay release until we were all completely absorbed in this year’s HESA Student return. But you wouldn’t do that, HESA, would you? (Would you?! And I thought we were friends!) Seriously, though, something this important shouldn’t be rushed, which is why I was pleased to see some flexibility in the timeline, meaning that we won’t actually finalise a “real” Data Futures return until September 2020. We still have a mandatory “trial” submission to contend with, probably between September 2019 (get real!) and March 2020, but at least autumn 2019 might not break us all now. If you’re planning on getting involved in the Beta pilot, you’re likely to be expected to make a couple of submissions during the 2018/19 academic year, but I’m sure we’ll be forgiven if we’re missing a few chunks of our data.

As for the Alpha pilot, the expectations have been fairly light touch thus far. Aside from attending a few workshops to weed out some of the dafter ideas before release to the general public, we won’t really be doing a great deal until the New Year. This is when we start to get our hands dirty with all of those awkward scenarios – you know, the flexible part-time options, mixed mobility experiences, and that one student who started before you were even born and is still somehow around after five course transfers interspersed with a few years of dormancy and several repeat years due to various mitigating circumstances. If we can make the new data model work for them, it’ll work for anyone! As well as proof of concept, it will be a valuable opportunity for us to feed back on prototype collection and submission solutions. In the meantime, look out for more information coming from HESA in January regarding “in-scope” data and a mapping between the existing returns and the final DF data dictionary.

One approach HESA are strongly encouraging us to consider is sending small, frequent updates or “deltas”. This means that data would only be collected where it is new, has changed, or needs to be corrected. I still haven’t fully worked out what the implications of this might be, but one thing that strikes me is that this isn’t necessarily about reporting updates that have occurred on our own systems. Rather, it’s about reporting wherever the data we hold now differs from the data HESA holds. This is a subtle difference, but it might have large implications for how we approach it. As far as we’re aware, the option will remain to use the “brute force method”, as HESA describe it, of submitting the whole file each time. If so, it will be interesting to see how many of us dance to HESA’s tune here.
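To make that subtle difference concrete, here’s a minimal sketch in Python of what delta computation might look like. This is purely illustrative: HESA hadn’t published a delta format at the time of writing, and the record structures and field names here are hypothetical, not the DF schema. The key point is that the comparison is against HESA’s copy of the data, not against our own system’s change history.

```python
# A minimal sketch of delta-style reporting: we diff against the data HESA
# currently holds, not against our own change log. Record structures and
# field names here are hypothetical, not the actual DF schema.

def compute_delta(local_records: dict, hesa_records: dict) -> dict:
    """Return only the records that are new, changed, or removed
    relative to what HESA holds, keyed by student identifier."""
    delta = {"new": [], "amended": [], "deleted": []}

    for sid, record in local_records.items():
        if sid not in hesa_records:
            delta["new"].append(record)       # HESA has never seen this student
        elif record != hesa_records[sid]:
            delta["amended"].append(record)   # we now hold different data to HESA
    for sid, record in hesa_records.items():
        if sid not in local_records:
            delta["deleted"].append(record)   # HESA holds data we no longer do

    return delta

# Example: a local correction that leaves us matching HESA produces no
# delta at all, because the baseline is HESA's copy, not our audit trail.
local = {"S1": {"mode": "FT"}, "S2": {"mode": "PT"}}
hesa  = {"S1": {"mode": "FT"}, "S3": {"mode": "FT"}}
print(compute_delta(local, hesa))
# {'new': [{'mode': 'PT'}], 'amended': [], 'deleted': [{'mode': 'FT'}]}
```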

Before writing this blog, I thought it might be a good idea to remind myself what the point of all of this is meant to be. As HESA describe it, “Data Futures is HESA’s transformation programme that will deliver the vision for a modernised and more efficient approach to collecting data, to deliver better output for a wider range of data users.” What this means in practice is signing off data for release into the big wide world three times a year (but, as they keep telling us, it’s not three HESA returns, right?) for HESA’s statutory customers to get their hands on it a little earlier. Now this latter part is kind of justified really – it does take an awfully long time for the sector to publish data on its new entrants every year, so the argument goes. And, after all, “timeliness is a dimension of data quality” (that one’s for you, Dan Cook).

But this is supposed to be about burden reduction for HEPs as well. As if the “three times a year” bit weren’t bad enough, some of the new collection requirements don’t fill me with any more hope. For instance, we will need to return detailed data on all of our student placement activity – that’s right, including everything in the UK. We have a huge number of students out on placement each year, so this is a big deal for us (is anyone else concerned about this? I’d be keen to hear). At least our Placements Governance Manager seems quite excited about the prospect of having a more consistent, managed dataset which can be benchmarked against external sources.

I’ve also heard whispers that HEFCE may be expecting qualifications-on-entry data to be ready for the December dissemination. Are they for real?! We can provide some qualifications data at that point, but the question is what will be done with it? If they think it’s going to be in a fit state for calculating average tariff values, for instance, they need to spend a cycle working in an admissions office to get a reality check. Don’t even get me started on the tight December sign-off window. On the other hand, MODE seems to be a lot more straightforward. Every cloud and all that.

Whilst I think we’re all struggling to see this fabled reduction in burden right now, there are some opportunities if we choose to see them. Amongst other things, let’s not forget that early dissemination should also mean we get to see that incredibly useful benchmarking data that bit earlier (I see my Planning colleagues rubbing their hands with glee!). Moreover, I do find the HESA data validation checks to be very useful in picking up issues we’d missed, so having these throughout the year will be a boon, particularly with the introduction of tolerances and overrides.
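To give a flavour of what a tolerance and override might look like, here’s a hypothetical sketch. The actual DF quality rules and tolerance mechanics weren’t published at the time of writing, so the check, threshold, and override mechanism below are all assumptions for illustration: a check that only fires when a year-on-year change falls outside an agreed band, with an override recording an accepted explanation.

```python
# Hypothetical sketch of a tolerance-based validation check. This is not
# HESA's actual rule engine; the idea is simply that a check fires only
# when a value moves outside a tolerance, and an accepted override passes
# it with the reason on record.

def check_population_shift(current: int, previous: int,
                           tolerance: float = 0.05,
                           override_reason: str = "") -> str:
    """Flag a year-on-year headcount change outside the tolerance band."""
    if previous == 0:
        return "FAIL: no baseline to compare against"
    shift = abs(current - previous) / previous
    if shift <= tolerance:
        return "PASS"
    if override_reason:
        return f"PASS (override accepted: {override_reason})"
    return f"FAIL: {shift:.1%} shift exceeds {tolerance:.0%} tolerance"

print(check_population_shift(1040, 1000))   # PASS: 4% shift, within tolerance
print(check_population_shift(1200, 1000))   # FAIL: 20% shift, no override
print(check_population_shift(1200, 1000,
      override_reason="new nursing intake approved"))  # override accepted
```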

More significantly, though, we have the opportunity here to bring data returns activity closer to the processes themselves (dare I say, even to integrate them). This is a real win as I see it. If we can get the processes right, the data we need should fall out naturally. This is one of the reasons we’re building a Data Futures stream into many of our process reviews and system developments, rather than simply having Data Futures as a separate project. Key to delivering this will be development of our data management and governance capabilities, which will also complement and support GDPR implementation (for more on data governance, take a look at Marian Hilditch’s blog from August).

The more I think about this side of the Data Futures coin, the more I feel like maybe I could throw out those comfortable ‘end-of-year’ slippers, which are beginning to have more holes than substance, and pop on a shiny new pair of ‘in-year’ ones. Sure, it will take a little while to break them in, but maybe, just maybe… they might actually be a better fit when I eventually do.

Chris Carpenter, SROC Committee, 7 December 2017
