I’ve been working in the predictive analytics (“data mining”, “big data”) field for several years and have noticed a few trends:
- Clients have better data (more of it, and of higher quality) and can organize and deliver it more quickly
- Access to “external” data (e.g. weather, economic indicators) has improved dramatically
- ETL tools have improved — and don’t have to cost as much as a new mini-van
- Off-the-shelf statistics packages have added non-parametric predictive analytics tools (see the sketch after this list)
- Purpose-specific modelling tools have improved in both function and form
- Credible open source tools are available — some have established themselves as “must haves” in the toolbox (R comes to mind)
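As one concrete illustration of the last two points: fitting a non-parametric model no longer requires specialist software. The sketch below is a minimal example in Python with scikit-learn; the data file and column names are hypothetical placeholders, not anything from a real project, and R users get the same capability from packages like randomForest.

```python
# A minimal sketch: fitting a non-parametric predictive model
# (a random forest) with open source tools. The CSV file and
# column names are hypothetical placeholders.
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# Load historical data; "sales.csv" stands in for your own extract
df = pd.read_csv("sales.csv")
X = df[["temperature", "unemployment_rate", "week_of_year"]]  # internal + external predictors
y = df["units_sold"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

# Random forests make no assumption about the functional form of the
# relationship between predictors and target -- the "non-parametric" part
model = RandomForestRegressor(n_estimators=200, random_state=42)
model.fit(X_train, y_train)

print("Holdout R^2:", round(model.score(X_test, y_test), 3))
```

A few years ago, getting this far meant licensing a statistics package and learning its scripting dialect; today it is a dozen lines of freely available code.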
Until recently, gaps in these areas were barriers to practical predictive analytics for businesses outside the Fortune 500:
- Data wasn’t available, or wasn’t usable within the company’s budget (manual ETL is expensive…).
- ETL no longer means vi (or emacs…) and a collection of Perl scripts, though those tools are still good for special situations (a sketch of a modern-style step follows this list)
- Tools help tame the volume of data — and do so quickly, making exploration possible and practical
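To make the ETL point concrete, here is a minimal sketch of an extract-transform-load step of the kind that once meant a pile of hand-rolled scripts. It uses Python with pandas; the file names and columns are hypothetical.

```python
# A minimal ETL sketch in Python/pandas: extract a raw export, clean and
# reshape it, and load the result. File and column names are hypothetical.
import pandas as pd

# Extract: read a raw export (messy text, missing values, real dates)
raw = pd.read_csv("raw_orders.csv", parse_dates=["order_date"])

# Transform: drop unusable rows, normalize text, derive a reporting period
clean = (
    raw.dropna(subset=["customer_id", "amount"])
       .assign(
           region=lambda d: d["region"].str.strip().str.upper(),
           order_month=lambda d: d["order_date"].dt.to_period("M"),
       )
)

# Load: aggregate and write to the analytics store (a CSV here,
# but just as easily a database table)
monthly = clean.groupby(["region", "order_month"])["amount"].sum().reset_index()
monthly.to_csv("monthly_sales_by_region.csv", index=False)
```

The point isn’t the specific library; it’s that a cleanup-and-aggregate job like this is now a few readable lines rather than a maintenance burden.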
All of this means that predictive analytics can be a practical tool for businesses of many shapes and sizes.
If you’re interested in reading more about this topic, have a look at: