Originally posted on Forbes.com
To build an effective people analytics practice, it’s best to break the process into three steps. The first step is to obtain good historical data. The second is to visualize and trend that data into analytics so you can observe the patterns and build the right algorithms. The final step is to calibrate and refine the algorithm through testing. Let’s look at each step in more detail.
1. Historical Data To Build Metrics
Fortunately, most HR departments have a wealth of people data. Most of it is stored in a company’s human resource information system (HRIS). But it’s not uncommon for organizations to have several other systems that provide germane data, such as a talent management system, a recruiting system or even a separate payroll system. This data must be fed into a platform that can convert it into meaningful metrics using underlying formulas. For example, the HRIS would provide the termination dates of former employees, but the number of terminations in a given period would need to be divided by the entire employee population for that period to get a turnover metric. The same would be done for the average time to hire a new person or any other important metric.
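As a rough illustration of the turnover formula described above, here is a minimal Python sketch. The function name, the sample dates and the headcounts are all hypothetical, and using the average of starting and ending headcount as "the entire population" is an assumption; your platform may define the denominator differently.

```python
from datetime import date


def turnover_rate(termination_dates, start_headcount, end_headcount,
                  period_start, period_end):
    """Terminations inside the period divided by average headcount.

    Averaging the starting and ending headcount is an assumed convention,
    not something the HRIS dictates.
    """
    separations = sum(
        1 for d in termination_dates if period_start <= d <= period_end
    )
    average_headcount = (start_headcount + end_headcount) / 2
    if average_headcount == 0:
        raise ValueError("average headcount must be greater than zero")
    return separations / average_headcount


# Hypothetical termination dates exported from an HRIS.
terminations = [
    date(2018, 1, 15),
    date(2018, 3, 2),
    date(2018, 7, 19),
    date(2019, 2, 1),  # falls outside the 2018 period, so it is not counted
]

rate = turnover_rate(
    terminations,
    start_headcount=100,
    end_headcount=96,
    period_start=date(2018, 1, 1),
    period_end=date(2018, 12, 31),
)
print(f"2018 turnover: {rate:.1%}")
```

The same pattern, a count or a sum divided by the relevant population, applies to other metrics such as average time to hire.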
Just as important as having history, if not more so, is that the data be accurate. The best way to ensure that data is accurate is with a system that can visualize it in easy-to-understand graphics. It’s much easier to spot an anomaly on a trended chart than to find it among the rows of a spreadsheet. Once you have such a tool, it’s good to establish a monthly checklist to continually check the quality of your data. Typical items on such a list would be questions like: How up to date is the information? Are there wrong data types in critical fields? Are there numerical outliers that are way beyond typical ranges?
Once you are satisfied that your data is of good quality and you have built your key metrics, it’s time to move on to the next step.
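The three checklist questions above (freshness, data types and outliers) can be sketched as a simple script. This is only an illustration: the field names `last_updated` and `salary`, the 31-day freshness threshold and the salary range are all assumptions made for the example, not recommended values.

```python
from datetime import date


def check_records(records, today, max_age_days=31,
                  salary_range=(10_000, 1_000_000)):
    """Return a list of (record_index, issue) pairs for one monthly check."""
    issues = []
    low, high = salary_range
    for i, rec in enumerate(records):
        # How up to date is the information?
        updated = rec.get("last_updated")
        if not isinstance(updated, date):
            issues.append((i, "last_updated is not a date"))
        elif (today - updated).days > max_age_days:
            issues.append((i, "record is stale"))

        # Wrong data type in a critical field? Numerical outlier?
        salary = rec.get("salary")
        if isinstance(salary, bool) or not isinstance(salary, (int, float)):
            issues.append((i, "salary is not numeric"))
        elif not low <= salary <= high:
            issues.append((i, "salary outside expected range"))
    return issues


# Hypothetical records; the field names are illustrative only.
records = [
    {"last_updated": date(2019, 5, 20), "salary": 85_000},
    {"last_updated": date(2018, 11, 1), "salary": 92_000},     # stale
    {"last_updated": date(2019, 5, 28), "salary": "78,000"},   # text, not a number
    {"last_updated": date(2019, 5, 30), "salary": 8_500_000},  # likely a typo
]

issues = check_records(records, today=date(2019, 6, 1))
for index, problem in issues:
    print(f"record {index}: {problem}")
```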
2. Generating Analytics And Building Algorithms
Once you are certain that the data is accurate, it’s time to trend the key metrics into analytics and start generating algorithms. Two common methodologies in this process, which can be used individually or in combination, are regression analysis and scoring.
Regression analysis can be as simple as tracking trajectory. Once you have visualized your analytics, a regression line can be generated using statistical analysis to plot the most likely trajectory from the given data by formulating the relationship between the independent variable (the input, such as time) and the dependent variable (the metric you want to project). These are the Xs and Ys from your high school geometry class. There are several established techniques. One of the more common is the least squares technique, which can be used to fit either straight lines or curves.
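To make the least squares technique concrete, here is a minimal sketch that fits a straight line to a trended metric and extends it one period forward. The quarterly turnover figures are invented for illustration, and `fit_line` is a hypothetical helper rather than part of any particular analytics platform.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical quarterly turnover percentages (X = quarter, Y = turnover).
quarters = [1, 2, 3, 4, 5, 6]
turnover = [3.0, 3.4, 3.1, 3.9, 4.2, 4.4]

slope, intercept = fit_line(quarters, turnover)
projected_q7 = slope * 7 + intercept
print(f"trend: {slope:+.2f} points per quarter, "
      f"projected quarter 7: {projected_q7:.2f}%")
```

A straight line is the simplest case. Fitting a curve uses the same least squares idea with additional terms, which is usually better left to a statistics library.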
Scoring data is the other common methodology. This involves applying numerical values to various groups of data that would rate the likelihood of inflection due to other factors. For example, software developers in Silicon Valley might be given a higher score for their likelihood to leave a company than ones in Nebraska due to the number of employers looking for that skill set in northern California.
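A scoring scheme like the one described above could look something like the following sketch. The point values are made up purely for illustration; in practice they would come from your own history and from market data.

```python
# Hypothetical point values: higher means more likely to leave.
LOCATION_POINTS = {"Silicon Valley": 3, "Nebraska": 1}
ROLE_POINTS = {"software developer": 2, "accountant": 1}


def attrition_score(employee):
    """Add up the points for each factor; unknown values contribute zero."""
    return (
        LOCATION_POINTS.get(employee["location"], 0)
        + ROLE_POINTS.get(employee["role"], 0)
    )


valley_dev = {"role": "software developer", "location": "Silicon Valley"}
nebraska_dev = {"role": "software developer", "location": "Nebraska"}

print(attrition_score(valley_dev), attrition_score(nebraska_dev))
```

An additive score is only one possible design. Weighted or multiplicative schemes are equally plausible, and a score can also be fed into a regression as one more independent variable.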
In general, the more data points and history you use, the more accurate the algorithms will be.
3. Test Algorithm Versus Past Snapshots
The last step, after the algorithm is generated, is to use the historical data to validate its accuracy through multiple, ongoing test scenarios. The best way to do this is to take snapshots of data from earlier time periods. If, for example, you have data that goes back five years, truncate your data set at a point three years ago, which in this instance would be 2016. Use the algorithm to project what would have happened through the end of 2017, and then compare that with the actual results. You can run these tests multiple times by taking six-month or annual snapshots to see how close the predictions were to reality. This will provide a good look at the accuracy of your model.
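The snapshot test described above can be sketched in a few lines. Everything here is illustrative: the yearly turnover figures are invented, and `last_change_model` is a deliberately naive stand-in for whatever algorithm you built in step two.

```python
def backtest(history, cutoff_year, model):
    """Hide everything after cutoff_year, project one year ahead, compare."""
    training = {y: v for y, v in history.items() if y <= cutoff_year}
    target_year = cutoff_year + 1
    if target_year not in history:
        raise ValueError("no actual result available for the target year")
    predicted = model(training, target_year)
    actual = history[target_year]
    return predicted, actual, abs(predicted - actual)


def last_change_model(training, target_year):
    """Placeholder model: extend the most recent year-over-year change."""
    years = sorted(training)
    last, previous = years[-1], years[-2]
    step = training[last] - training[previous]
    return training[last] + step * (target_year - last)


# Hypothetical annual turnover percentages.
history = {2014: 10.0, 2015: 11.0, 2016: 12.5, 2017: 13.0, 2018: 14.5}

# Annual snapshots: truncate at each cutoff and test against the next year.
for cutoff in (2015, 2016, 2017):
    predicted, actual, error = backtest(history, cutoff, last_change_model)
    print(f"cutoff {cutoff}: predicted {predicted:.1f}, "
          f"actual {actual:.1f}, error {error:.1f}")
```

Consistently large errors across snapshots suggest the model needs recalibrating. A single large error may instead point to an unusual period.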
Also, remember that not all time periods are the same. The company may have gone through an acquisition during one time period, or there could have been a recession or other event that caused abnormal results and should be considered in your analysis. I also recommend that you continue to calibrate the model at similar intervals going forward, as social and economic climates will continue to evolve.
Finally, all these steps may seem intimidating when considered in a manual framework. However, artificial-intelligence-based platforms can now do much of this work for you. Because they automate processes, generate algorithms and identify anomalies in real time, predictive analytics is well within reach for any organization today. So, stop fretting about the future and embrace it.