Data Miner Survey - results
Some time ago I suggested that readers might like to complete a survey for Karl Rexer of Rexer Analytics and a number of you did. The results of the survey are now available here (you need to email Karl and ask for a copy, instructions on the page), and there are some useful nuggets of information in the report. A couple of things struck me when I saw my copy:
- When asked to identify the fields in which they employ data mining the most frequently identified fields were CRM/marketing, financial, academic, telecommunications, and retail.
Interestingly all categories on the blog with the exception of academia.
- Data miners working with financial data strongly valued a tool's ability to automate repetitive tasks.
I wonder if this relates to the tendency of financial institutions to constantly update their models and develop multiple alternatives for use in adaptive control.
- There was nothing on deployment.
I found this curious as deployment of models seems really important to me. Not clear if Karl did not ask any questions or did not find the answers interesting (Karl?)
- Top 4 challenges found by respondents included dirty data, difficult access to data, explaining data mining to others, and finding qualified data miners.
No big surprises here, apart from the explaining one - while I always find that a problem I was surprised to see it come up so high.
- Other problems included that data mining results were not used by business decision makers and difficulties in deployment/scoring.
I think using business rules as a platform for deploying analytics can help with both of these - it makes actual deployment easier and allows you to engage business people in the automation of the decision (through editing rules) in a way that might make them feel more comfortable.
Karl references some polls on KDNuggets about which I blogged before - one on what people do with analytics (here, CRM and banking came top), one on data mining deployment (here) and this one about making better decisions with computers (here).

While we did not ask a specific set of questions around model deployment, we did include deployment in the list of challenges that data miners face. Twenty-two percent of respondents identified deployment as one of the challenges that they typically face. However, only three percent of respondents felt that it was the most difficult challenge.
Although we will retain several core elements of our survey in the 2008 version, we also will likely drop some questions and add new ones as people like yourself express interest in different areas of inquiry. Deployment sounds like a great area for further exploration, so we will definitely consider including it. In fact, as we revise the survey in early 2008, we'd love to get your input as we design these new questions, and hope you would provide us with your perspective prior to the next launch.
Posted by: Heather Allen | August 10, 2007 at 01:27 PM