Unleashing Advanced Analytics with Deloitte SAP Data & Analytics on Absenteeism Prediction

At the start of Q1, we took the opportunity to work with one of our clients on an advanced analytics proof of concept, set up as a co-development effort. Our primary objective in this collaboration was to evaluate the maturity of our client's business intelligence infrastructure for advanced analytics. This objective directly supports the overarching mission of SAP Data & Analytics: to unlock the full potential of advanced analytics within the client's SAP ecosystem.

Our initial research question (RQ1) targeted exactly this objective: What is the maturity level of our client's business intelligence infrastructure in facilitating advanced analytics? To tackle RQ1, we selected an advanced analytics use case through strategic dialogue with HR stakeholders. The outcome was a decision to investigate absenteeism within the company, which became the topic of the second research question (RQ2). Absenteeism analysis and prediction is a compelling use case that not only promises to shed light on workforce dynamics but also addresses our client's overarching maturity concerns.

Armed with SAP tools such as SAP Datasphere and SAP Analytics Cloud, alongside Python's machine learning capabilities, we carried out an in-depth analysis intended to change how our client understands and addresses absenteeism. The project was guided by two research questions:

  1. What is the maturity level of our client's business intelligence infrastructure in facilitating advanced analytics?
  2. Can we develop an end-to-end absenteeism solution with descriptive and predictive capabilities to address the absenteeism rate?

We adopted a structured approach to the project, combining Agile and CRISP-DM methodologies to ensure efficiency and effectiveness. This combination let us balance functional requirements against technical implementation and supported a comprehensive, iterative development process. The work proceeded in four steps:

  1. Data Assessment: We consolidated data from multiple source systems, including SAP ECC, SAP SuccessFactors, and SAP Datasphere, into a unified dataset. This consolidation, although time-consuming and dependent on collaboration from various stakeholders, was essential to our maturity assessment. The successful integration of these datasets underscores the importance of effective data storage practices within the organization.
  2. Data Preparation: To build the final dataset and carry out the necessary preprocessing and feature engineering, we set up a live connection between SAP Datasphere and Python and extracted the data into a Python environment.
  3. Model Development and Evaluation: Following the proposed methodology, we executed a series of steps to train our predictive model. We first explored the data to extract key insights for a descriptive SAP Analytics Cloud dashboard. After thorough data preprocessing, we began modeling in Python, where the main tasks were addressing the imbalance in the dataset and determining the best regression model. We adopted the REBAGG implementation to address the imbalance as a first step, followed by an XGBoost Regressor model for the predictions. This combination achieved a Mean Squared Error (MSE) of 0.01 and an R-squared score of 0.87 on the test set.
  4. Visualization: The live connection between SAP Datasphere and SAP Analytics Cloud serves as a pivotal component in our analytical framework, facilitating seamless visualization of both descriptive and predictive analyses. This integration empowers stakeholders with dynamic insights into absenteeism trends, offering a comprehensive understanding of historical patterns and future forecasts.
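
The modeling step above can be illustrated with a small sketch. It is not the project's REBAGG or XGBoost code: it only shows the general resampled-bagging idea of training several base models on bootstrap samples that over-represent rare target values, then averaging their predictions and scoring them with MSE and R-squared. The synthetic data, the fixed threshold that marks a target as "rare", the one-feature least-squares base learner, and all function names are assumptions made for illustration.

```python
import random
from statistics import mean


def fit_line(xs, ys):
    """Least-squares fit of y = a + b * x (toy stand-in for a real base learner)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:  # degenerate bag: fall back to a constant prediction
        return my, 0.0
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b


def balanced_bag(rows, threshold, size, rng):
    """Bootstrap sample with equal numbers of rare and common rows.

    Assumption: a row is 'rare' when its target is at or above a fixed threshold.
    """
    rare = [r for r in rows if r[1] >= threshold]
    common = [r for r in rows if r[1] < threshold]
    half = size // 2
    return rng.choices(rare, k=half) + rng.choices(common, k=size - half)


def fit_resampled_bagging(rows, threshold, n_models=25, seed=0):
    """Train one base model per balanced bag."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        xs, ys = zip(*balanced_bag(rows, threshold, len(rows), rng))
        models.append(fit_line(xs, ys))
    return models


def predict(models, x):
    """Average the predictions of all base models."""
    return mean(a + b * x for a, b in models)


def mse(y_true, y_pred):
    return mean((t - p) ** 2 for t, p in zip(y_true, y_pred))


def r_squared(y_true, y_pred):
    my = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - my) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Synthetic, skewed data: most targets are small, a few are large.
rows = [(i / 99, 0.02 + 0.3 * (i / 99) ** 4) for i in range(100)]
train_rows, test_rows = rows[0::2], rows[1::2]

models = fit_resampled_bagging(train_rows, threshold=0.15)
y_true = [y for _, y in test_rows]
y_pred = [predict(models, x) for x, _ in test_rows]
print("test MSE:", mse(y_true, y_pred))
print("test R-squared:", r_squared(y_true, y_pred))
```

In a real pipeline the base learner would be a proper regressor and the rarity criterion would come from the data, so the numbers printed here say nothing about the results reported above.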

Our research yielded several key outcomes:

  1. Data Landscape Evaluation: A comprehensive assessment of our client’s data infrastructure was conducted, scrutinizing factors such as data quality, accessibility, and integration capabilities. While the client's system demonstrates maturity, opportunities for enhancement were identified, particularly in the areas of storage techniques for the various source systems within the present architecture. Nonetheless, leveraging mature data sources laid the foundation for creating a dataset conducive to advanced analytics endeavors. Notably, 80 percent of the data was available in Datasphere and usable, underscoring the importance of centralizing data storage to streamline analytics processes.
  2. Absenteeism Analysis Results: We successfully developed a predictive model. Advanced preprocessing techniques were applied to address the imbalanced data, resulting in a robust regression model that forecast monthly absenteeism rates with high accuracy on the test set. With these predictions, decision-makers can interactively identify emerging trends and make well-informed decisions to manage absenteeism. For instance, reallocating work resources based on predicted absenteeism patterns can optimize workforce productivity and drive organizational success.
  3. SAP Benefits: Our initiative aimed to establish a stable solution for our client by starting with dependable data models from SAP Datasphere. By leveraging SAP Datasphere's capabilities, we ensured a smooth flow of data, laying the groundwork for robust analytics. This, combined with the transformative power of SAP Analytics Cloud, provided HR professionals with actionable insights for effectively addressing absenteeism and driving organizational success.
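
The outcomes above refer to a monthly absenteeism rate as the quantity being forecast. The article does not say how that rate was computed, so the sketch below assumes one common definition: absent days divided by scheduled working days, aggregated per month. The record layout and the function name are hypothetical.

```python
from collections import defaultdict


def monthly_absenteeism_rate(records):
    """Aggregate (month, scheduled_days, absent_days) records into a rate per month.

    Assumed definition: rate = total absent days / total scheduled days.
    Months with no scheduled days are skipped to avoid dividing by zero.
    """
    scheduled = defaultdict(float)
    absent = defaultdict(float)
    for month, scheduled_days, absent_days in records:
        scheduled[month] += scheduled_days
        absent[month] += absent_days
    return {
        month: absent[month] / scheduled[month]
        for month in sorted(scheduled)
        if scheduled[month] > 0
    }


# Hypothetical records: one row per employee per month.
records = [
    ("2024-01", 20, 1),
    ("2024-01", 20, 3),
    ("2024-02", 21, 0),
]
rates = monthly_absenteeism_rate(records)
```

A series like `rates` is the kind of target a regression model could be trained on and that a dashboard could plot next to the predictions.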

Building on our findings, we have identified several next steps to further unlock the potential of advanced analytics:  

  1. Establishing a Culture of Mature Data Governance: Implementing robust data governance practices company-wide to ensure data quality, consistency, and compliance.
  2. Enhancing Client’s Infrastructure: Improving the client’s infrastructure, including SAP systems and potential integrations with complementary tools such as Databricks, to streamline the advanced analytics pipeline.
  3. Expanding Absenteeism Analysis Impact: Enhancing our predictive model to achieve a lasting impact on absenteeism rates, thereby contributing to long-term absenteeism reduction efforts.

Join the conversation and stay tuned for more updates as we continue our journey in shaping the future of advanced analytics within SAP Data & Analytics!
