KDnuggets Home » Polls » Largest dataset analyzed / data mined (May 2011)

Largest dataset analyzed / data mined


What was the largest database / dataset you analyzed? [148 votes]
Largest dataset analyzed in 2011 vs 2010

Comparing the results of the 2011 poll with a similar 2010 Poll: Largest Database Data Mined / Analyzed, we see that the median in 2011 is in the 10-20 GB range, while the median dataset in 2010 was in the 8-10 GB range.

We observe steady growth in the share of analysts with experience at the upper range of dataset sizes.
In 2011 about 35.4% reported analyzing databases over 100 GB (vs 32.2% in 2010), and 21.4% reported analyzing over 1 terabyte (vs 18.3% in 2010).

The regional breakdown shows that the US/Canada region leads in the percentage of data miners who worked with terabyte-range datasets (about 30%).
(Note: the Australia/NZ region is not included, since not enough responses were received.)

Region (voters)        | Largest Dataset Analyzed (median) | % analyzed TB+ data
US/Canada (53)         | 11-100 GB                         | 30.2%
Europe (49)            | 11-100 GB                         | 18.4%
Asia (20)              | 1-10 GB                           | 10%
Latin America (15)     | 1 GB                              | 6.7%
Africa/Middle East (7) | 1-10 GB                           | 28.6%
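
To put the percentages in the table above into perspective, the short Python sketch below converts them back into approximate voter counts. The counts are not reported in the poll; they are inferred here by rounding each percentage times the number of voters in the region, so treat them as estimates. The names `regions` and `implied_count` are made up for this illustration.

```python
# Region -> (number of voters, % who analyzed TB+ data), copied from the table.
regions = {
    "US/Canada": (53, 30.2),
    "Europe": (49, 18.4),
    "Asia": (20, 10.0),
    "Latin America": (15, 6.7),
    "Africa/Middle East": (7, 28.6),
}


def implied_count(voters, percent):
    """Estimate how many voters a rounded percentage corresponds to.

    Assumption: the published percentage is count / voters, rounded
    to one decimal place, so rounding the product recovers the count.
    """
    return round(voters * percent / 100)


# Approximate number of TB+ respondents per region (inferred, not reported).
tb_counts = {name: implied_count(v, p) for name, (v, p) in regions.items()}

total_voters = sum(v for v, _ in regions.values())
total_tb = sum(tb_counts.values())

for name, count in tb_counts.items():
    print(f"{name}: about {count} of {regions[name][0]} voters")

share = 100 * total_tb / total_voters
print(f"All listed regions: about {total_tb} of {total_voters} voters ({share:.1f}%)")
```

Under these assumptions, the 28.6% for Africa/Middle East corresponds to only about 2 of 7 voters, so the figures for the smaller regions should be read with caution.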

Here is another breakdown of Largest Dataset Analyzed by region.
[Figure: Largest dataset analyzed in 2011 by region]


Hélder Quintela, Normal
As expected (at least by me), it is almost a normal distribution. Very small databases are not very interesting for analysis and knowledge discovery aimed at improving business, and very large databases are not widely available.

Ajay Ohri, Poll on Database Size
It would be interesting to see interaction effects and the correlation between the size of the database used and the software name.


Copyright © 2013 KDnuggets.