Higher Education Institutions (HEIs) have always needed to ensure the student data they capture is accurate, complete, and consistent. The introduction of HESA Data Futures this academic year (22/23) means that HEIs are faced with new data quality challenges to navigate. The data required for HESA Data Futures is vast and varied, and includes, for example, the student’s highest qualification when they joined the institution, their parent’s occupation, and their study pattern/location.
In our first blog, “Why is everyone talking about HESA Data Futures?”, we talked about what HESA Data Futures is and how HEIs should prepare. In this blog, we examine the data quality challenges faced by HEIs and outline approaches to address them.
The size and complexity of the HESA Data Futures requirement means that before final submission to HESA in October 2023, providers must:
Many HEIs experience data quality issues caused by disparate system landscapes, challenging data collection processes or a lack of system functionality. These need to be rectified or remediated to adhere to Data Futures requirements. Many HEIs will need to invest additional time and resource to ensure their data is reported in line with HESA’s revised quality rules.
Data Futures has over 1,000 business quality rules that HEIs need to adhere to when building their return. These rules define how each data field should be formatted, which values are accepted, and how they should align with other data points in the record. The specificity and interactions of the data quality rules make it challenging for HEIs to build a compliant return, particularly for those students or courses which are non-standard.
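To illustrate the three kinds of rule described above (format, accepted values, and alignment with other data points), a minimal Python sketch might look like the following. The field names, the accepted codes, and the rules themselves are hypothetical, chosen only for illustration. They are not taken from HESA's published rule set.

```python
import re
from datetime import date

# Hypothetical accepted values for a study-mode field (not HESA's real coding frame).
ACCEPTED_STUDY_MODES = {"FT", "PT"}

# Hypothetical format: a student identifier of exactly 8 digits.
STUDENT_ID_PATTERN = re.compile(r"\d{8}")


def check_record(record):
    """Return a list of rule failures for one student record (a dict)."""
    failures = []

    # Format rule: the identifier must match the expected pattern.
    if not STUDENT_ID_PATTERN.fullmatch(str(record.get("student_id", ""))):
        failures.append("student_id: invalid format")

    # Accepted-values rule: the field must hold one of the permitted codes.
    if record.get("study_mode") not in ACCEPTED_STUDY_MODES:
        failures.append("study_mode: value not accepted")

    # Cross-field rule: an end date, when present, must not precede the start date.
    start, end = record.get("start_date"), record.get("end_date")
    if start is not None and end is not None and end < start:
        failures.append("end_date: earlier than start_date")

    return failures


good = {"student_id": "12345678", "study_mode": "FT",
        "start_date": date(2022, 9, 19), "end_date": date(2023, 6, 16)}
bad = {"student_id": "1234", "study_mode": "XX",
       "start_date": date(2022, 9, 19), "end_date": date(2022, 1, 1)}

print(check_record(good))
print(check_record(bad))
```

Returning a list of failures per record, rather than stopping at the first one, lets a single pass surface every issue on a non-standard student or course record.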
As the return is used for many different purposes, the quality of insight and reporting derived from it can be significantly impacted by poor data quality:
Monitoring and maintaining data quality in business-critical data fields, like those used to populate the HESA return, requires ongoing effort and activity. Remediation of poor data quality can be time-consuming and costly if not dealt with promptly.
In addressing data quality for HESA Data Futures, HEIs have an opportunity to build a rich data landscape that delivers wider business benefits, such as developing reliable insight into attrition to drive decision making and delivering an autonomous and personalised student experience. High quality student data can be realised across an institution through an effective data quality management programme and embedded data governance.
Below is a cyclical data quality improvement process that can help institutions deliver higher quality data for HESA and beyond:
High quality data requires input and buy-in from across the institution. Without dedicated owners across the business who are incentivised to improve data quality, there may be a lack of time or resource to complete remediation tasks, leaving the institution with a list of unresolved issues. Before going ahead with data quality remediation and cleansing activities, institutions should aim to have:
To support cross-business collaboration efforts, institutions should consider a communication strategy dedicated to ensuring that the business appreciates the significance of HESA Data Futures and can take ownership of data quality remediation activity.
Don’t have Data Owners? Take a look at our blog on Data Ownership: What’s in a Name?
Institutions should understand their business-critical and HESA-critical data points so that an informed, prioritised data quality plan can be developed. HEIs should:
The data quality lead should collaborate with business data owners to translate the business requirements developed in step 1 into data quality rules that account for HESA’s requirements. Here are some example requirements to consider capturing in the data quality rules:
Once the data quality rules are configured, institutions should profile their data. The aim of data profiling is to identify issues in the data that require remediation. The data profiling assessment should align to the business and data requirements defined in steps 1 and 2. As a minimum this should involve:
Once identified, data quality metrics should be developed and combined to create a data assessment report. The data assessment report will identify critical data elements that contain significant quality issues and require prioritised remediation action.
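As a rough sketch of how such metrics could be combined into an assessment report, the Python example below computes completeness and validity for each field and flags fields that fall below a threshold. The field names, the validity checks, and the 95% threshold are all assumptions made for illustration. An institution would substitute its own critical data elements and targets.

```python
def profile(records, fields, is_valid, threshold=0.95):
    """Compute completeness and validity per field and flag fields for remediation.

    records   -- list of dicts, one per student record
    fields    -- field names to assess
    is_valid  -- dict mapping each field name to a function returning True/False
    threshold -- hypothetical minimum acceptable score for both metrics
    """
    report = {}
    total = len(records)
    for field in fields:
        # Completeness: share of records where the field is populated.
        populated = [r[field] for r in records if r.get(field) not in (None, "")]
        completeness = len(populated) / total if total else 0.0

        # Validity: share of populated values that pass the field's check.
        valid = [v for v in populated if is_valid[field](v)]
        validity = len(valid) / len(populated) if populated else 0.0

        report[field] = {
            "completeness": completeness,
            "validity": validity,
            "needs_remediation": completeness < threshold or validity < threshold,
        }
    return report


# Hypothetical sample data and checks.
records = [
    {"study_mode": "FT", "postcode": "AB1 2CD"},
    {"study_mode": "PT", "postcode": ""},
    {"study_mode": "XX", "postcode": "EF3 4GH"},
    {"study_mode": "FT", "postcode": None},
]
is_valid = {
    "study_mode": lambda v: v in {"FT", "PT"},
    "postcode": lambda v: len(v.strip()) > 0,
}

report = profile(records, ["study_mode", "postcode"], is_valid)
for field, metrics in report.items():
    print(field, metrics)
```

Keeping completeness and validity as separate metrics shows whether a field's problem is missing data or incorrect data, which usually call for different remediation actions.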
To support data profiling, troubleshooting, and reporting activities, HEIs are likely to need dedicated data quality software. Although HEIs will have access to the HESA Data Platform to test their data against HESA’s quality rules and requirements, they should consider using additional data quality tools to perform ongoing data validation testing as a business-as-usual activity.
Once data quality issues have been identified, there should be a clear issue escalation process to follow. Organisations should:
Improving data quality is not a one-off activity. Data quality assessment and remediation should be embedded in the organisation as a business-as-usual process, as part of an institution-wide strategy to become data-driven. Continuous monitoring and maintenance of student data quality will soon become even more pressing for HEIs, as Data Futures introduces twice-annual collections in 24/25 and will require near-live reporting in later years.
Are you prepared for HESA Data Futures? To find out, or for more support, please get in touch for a Deloitte HESA Readiness Assessment or a discussion with any of the contacts listed below.
This is the second in a series of HESA Data Futures blogs from Deloitte.