Data Modeling vs. Data Mining
Data modeling refers to a group of processes in which multiple sets of data are combined and analyzed to uncover relationships or patterns. The goal of data modeling is to use past data to inform future efforts.
Data mining is a step in the data modeling process. In data mining, you search for valuable and relevant data to answer the marketing question at hand. You then use that data as the basis for a model that predicts future patterns.
One of the strengths of data modeling is that it can analyze data from multiple sources without the analyst deciding in advance which inputs are relevant and which are not required; that is for the model to decide. Common data sets include:
Data Modeling Process
While the specific approach to a data modeling project varies, most models are developed using the following steps:
Types of Data Modeling
Statistical Regression
Statistical regression offers a good “first pass” at a data set and allows the team to describe the relationship between a dependent variable (e.g., sales) and one or more independent variables. It facilitates the development of hypotheses that can then be tested more rigorously with other tools.
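As an illustration of a regression “first pass,” the sketch below fits a straight line by ordinary least squares using only the Python standard library. The data, the variable names (ad_spend, sales) and the helper fit_line are hypothetical and chosen for this example; a real project would normally involve several independent variables and a statistics package.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Spread of x around its mean, and co-movement of x and y.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Hypothetical data: ad spend (independent) and sales (dependent).
ad_spend = [10, 20, 30, 40, 50]
sales = [25, 45, 65, 85, 105]

slope, intercept = fit_line(ad_spend, sales)
print(f"sales ~ {intercept:.1f} + {slope:.1f} * ad_spend")
```

The slope describes how much the dependent variable changes per unit of the independent variable, which is the kind of relationship a team would carry forward as a hypothesis.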
Genetic Programming
Modeling is often limited by the imagination and time constraints of the modeler: which variables should be included, and which combinations should be created? Genetic programming reduces the potential for “modeler bias” by making every variable and every combination available to an automated search, which builds and evaluates tens of thousands of candidate models.
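The idea of letting a search build and score many candidate models can be sketched in a few dozen lines. The toy below is not TheGMAX or any particular product: it is a minimal, assumption-laden illustration in which candidate models are small arithmetic expression trees, only mutation is used (crossover is omitted for brevity), and all names (random_tree, mutate, evolve) and the data are hypothetical.

```python
import operator
import random

OPS = [(operator.add, "+"), (operator.sub, "-"), (operator.mul, "*")]


def random_tree(n_vars, depth, rng):
    """Build a random expression tree over variables and small constants."""
    if depth == 0 or rng.random() < 0.3:
        if rng.random() < 0.7:
            return ("var", rng.randrange(n_vars))
        return ("const", rng.randint(-2, 2))
    return ("op", rng.choice(OPS),
            random_tree(n_vars, depth - 1, rng),
            random_tree(n_vars, depth - 1, rng))


def evaluate(tree, row):
    """Compute the tree's prediction for one row of inputs."""
    if tree[0] == "var":
        return row[tree[1]]
    if tree[0] == "const":
        return tree[1]
    _, (fn, _symbol), left, right = tree
    return fn(evaluate(left, row), evaluate(right, row))


def error(tree, rows, targets):
    """Sum of squared errors; lower means a better candidate model."""
    return sum((evaluate(tree, r) - t) ** 2 for r, t in zip(rows, targets))


def mutate(tree, n_vars, rng):
    """Replace one randomly chosen subtree with a fresh random subtree."""
    if tree[0] != "op" or rng.random() < 0.3:
        return random_tree(n_vars, 2, rng)
    _, op, left, right = tree
    if rng.random() < 0.5:
        return ("op", op, mutate(left, n_vars, rng), right)
    return ("op", op, left, mutate(right, n_vars, rng))


def show(tree, names):
    """Render a tree as a readable formula."""
    if tree[0] == "var":
        return names[tree[1]]
    if tree[0] == "const":
        return str(tree[1])
    _, (_fn, symbol), left, right = tree
    return f"({show(left, names)} {symbol} {show(right, names)})"


def evolve(rows, targets, generations=30, pop_size=60, seed=0):
    """Keep the best quarter each generation and refill with mutated copies."""
    rng = random.Random(seed)
    n_vars = len(rows[0])
    population = [random_tree(n_vars, 3, rng) for _ in range(pop_size)]
    history = []  # best error seen at the start of each generation
    for _ in range(generations):
        population.sort(key=lambda t: error(t, rows, targets))
        history.append(error(population[0], rows, targets))
        survivors = population[: pop_size // 4]
        children = [mutate(rng.choice(survivors), n_vars, rng)
                    for _ in range(pop_size - len(survivors))]
        population = survivors + children
    population.sort(key=lambda t: error(t, rows, targets))
    return population[0], history


# Hypothetical data: two inputs and a target that depends on their interaction.
rows = [(x, y) for x in range(-3, 4) for y in range(-3, 4)]
targets = [x * y + x for x, y in rows]

best, history = evolve(rows, targets)
print(show(best, ["price", "promo"]), "error:", error(best, rows, targets))
```

Because the survivors are carried over unchanged, the best error never gets worse from one generation to the next. Note that this kind of search samples and refines candidates rather than enumerating every possible model.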
Structural Equation Modeling
Structural equation modeling is generally used to confirm models already hypothesized by other means (e.g., genetic programming). It allows for the study of complex relationships among variables, and it is distinctive in that it can include latent variables, which are not directly observable or measurable (e.g., intelligence or consumer confidence).
Data Modeling Tools
TheGMAX™
TheGMAX™ is based on the principles of genetic programming, in which computers evolve solutions to a problem rather than being explicitly programmed with them. It uncovers complex relationships among data sets and variables that traditional practitioners and methods may not find.
CART
CART® (Classification and Regression Trees) searches for important patterns and relationships, uncovering hidden structure in highly complex data. CART® analyses produce relatively easy-to-interpret decision trees, which are formed by a series of two-way splits in the data.
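To make the idea of a “two-way split” concrete, the sketch below finds the single best threshold on one numeric variable by minimizing Gini impurity, a criterion commonly used for classification trees. It illustrates the general technique only and is not the CART® product's implementation; the function names and the age/purchase data are hypothetical.

```python
def gini(labels):
    """Gini impurity of a group: 0.0 means every label is the same."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_split(values, labels):
    """Return (threshold, weighted_gini) for the best split 'value <= threshold'."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, float("inf"))
    for i in range(1, n):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # no threshold can separate identical values
        threshold = (lo + hi) / 2
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        # Weight each side's impurity by its share of the records.
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (threshold, score)
    return best


# Hypothetical data: customer age and whether the customer purchased.
ages = [22, 25, 31, 38, 45, 52, 60]
purchased = ["no", "no", "no", "yes", "yes", "yes", "yes"]

threshold, impurity = best_split(ages, purchased)
print(f"split at age <= {threshold}, weighted Gini = {impurity:.3f}")
```

A full tree repeats this search on each resulting group, which is why the final model reads as a sequence of simple yes/no questions.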
MARS
MARS® (Multivariate Adaptive Regression Splines) is ideal for users who prefer results in a form similar to traditional regression while still capturing essential nonlinearities and interactions.
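Multivariate adaptive regression splines keep a regression-like form by building the model from “hinge” terms of the form max(0, x - knot), each of which is zero on one side of a knot and linear on the other. The sketch below shows how such terms produce a bend in an otherwise linear relationship. The coefficients, the knot at 30, and the names hinge and predict_sales are hypothetical values picked for illustration, not output from the MARS® product.

```python
def hinge(x, knot):
    """Hinge basis function: zero up to the knot, then rises linearly."""
    return max(0.0, x - knot)


def predict_sales(ad_spend):
    """Hypothetical model: sales respond strongly to spend up to 30, then flatten."""
    # The model stays a weighted sum of terms, as in ordinary regression.
    return 10.0 + 2.0 * ad_spend - 1.5 * hinge(ad_spend, 30.0)


for spend in (10, 30, 50):
    print(spend, predict_sales(spend))
```

Below the knot the hinge term contributes nothing, so the slope is 2.0; above it, the effective slope drops to 0.5. Interactions can be represented in the same style by multiplying hinge terms from different variables.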
TreeNet
TreeNet® is a flexible and powerful data mining tool, capable of consistently generating extremely accurate models.