

Surfing the Big Data Wave at H2O World

The recent H2O World event showcased H2O, an open-source, scalable machine learning platform that runs in the cloud and is intended for people who are familiar with R but limited by its scalability. H2O can run on Hadoop and also on Apache Spark.

Arno Candel gave a comprehensive tutorial on doing deep learning using H2O. Free booklets on using R and on running deep learning are available at

Arno Candel gave an interesting talk on using auto-encoders for anomaly detection. In this case, the auto-encoder is a deep learning model in which the number of input neurons equals the number of output neurons. The model learns to approximate the identity function through a hidden layer that has far fewer neurons than the input or output layer. He also touched briefly on using H2O for feature engineering.
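To make the auto-encoder idea concrete, here is a minimal sketch in plain Python. It is not H2O code and not the model from the talk. It assumes a toy setting: three inputs, one linear hidden neuron, three outputs, and tied encoder/decoder weights. The data, the function names, the learning rate and the threshold rule (largest error seen on the training records) are all illustrative assumptions. A record that the narrow hidden layer cannot reconstruct well gets a large reconstruction error and is flagged as an anomaly.

```python
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def reconstruct(w, x):
    """Encode x to one hidden value, then decode it back to 3 outputs."""
    h = dot(w, x)                 # hidden layer: a single linear neuron
    return [h * wi for wi in w]   # output layer reuses the same (tied) weights

def reconstruction_error(w, x):
    """Squared distance between x and its reconstruction."""
    return sum((xi - ri) ** 2 for xi, ri in zip(x, reconstruct(w, x)))

def train(data, epochs=500, lr=0.1):
    """Fit the tied weights by full-batch gradient descent."""
    w = [0.5, 0.5, 0.5]           # fixed starting point keeps the run repeatable
    n = len(data)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x in data:
            h = dot(w, x)
            r = [xi - h * wi for xi, wi in zip(x, w)]   # residual x - x_hat
            rw = dot(r, w)
            for j in range(len(w)):
                # derivative of ||x - (w.x) w||^2 with respect to w[j]
                grad[j] += -2.0 * (rw * x[j] + h * r[j]) / n
        w = [wi - lr * gi for wi, gi in zip(w, grad)]
    return w

# Hypothetical "normal" records: points close to a single line in 3-D.
random.seed(0)
direction = (1 / 3, 2 / 3, 2 / 3)
normal = []
for _ in range(200):
    t = random.uniform(-1, 1)
    normal.append([t * d + random.uniform(-0.05, 0.05) for d in direction])

w = train(normal)

# Assumed rule: anything reconstructed worse than every training record is anomalous.
threshold = max(reconstruction_error(w, x) for x in normal)
outlier = [1.0, -1.0, 0.0]        # lies far from the line the model learned
outlier_error = reconstruction_error(w, outlier)
is_anomaly = outlier_error > threshold
print(is_anomaly)
```

A real deep learning auto-encoder would use several non-linear hidden layers and a threshold chosen on held-out data, but the scoring step is the same: compare each record with its own reconstruction.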

H2O Conference Photo
Here are Sri Ambati and Arno Candel on stage.

Yan Zou and Vijay Iyengar talked about using H2O for marketing and CRM. They also announced a competition based on the KDD Cup 1998 data set, in which participants would use H2O and be ranked by how much they improved on the baseline. The contest details will be posted on

Bio: Arun Swami is a Bay Area entrepreneur and tech leader who has created innovative systems using text mining, ranking algorithms, heuristic approaches, data mining, personalization technology, database algorithms and optimization algorithms. Arun was a key member of the team that started IBM's research in data mining and has published seminal work in this area. His classic data mining paper with Rakesh Agrawal, "Mining Association Rules", is ranked among the most cited CS papers.

