Tag: Apache Spark (33)
- Top Big Data Processing Frameworks - Mar 4, 2016.
A discussion of 5 Big Data processing frameworks: Hadoop, Spark, Flink, Storm, and Samza. An overview of each is given and comparative insights are provided, along with links to external resources on particular related topics.
- Upcoming Webcasts on Analytics, Big Data, Data Science – Nov 25 and beyond - Nov 24, 2014.
Social Media to Actionable Insights, NoSQL and Big Data in the Government, NoSQL Database Architecture, All-vs-All: Correlation Using Spark/Hadoop, The 2015 Analytics Predictions Webinar, and more.
- STRATA + Hadoop World 2014 NYC Report - Nov 5, 2014.
Strata + Hadoop World this year included workshops on subjects like Spark, R, and Python, interesting keynotes, and impressive, detailed technical talks on Hadoop and new trends in big data.
- BigData TechCon San Francisco Report: Focus on Spark - Nov 1, 2014.
BigData TechCon SF 2014 covered a number of data technologies from the open source ecosystem through tutorials and classes. Spark and its libraries were a significant focus of the talks.
- Upcoming Webcasts on Analytics, Big Data, Data Science – Oct 21 and beyond - Oct 20, 2014.
Big Data Changes everything, Deep Learning + Apache Spark, Data Mining - Failure to Launch, Linear Regression in Python, Demystify your data flows, and more.
- Upcoming Webcasts on Analytics, Big Data, Data Science – Oct 14 and beyond - Oct 13, 2014.
Hadoop means Business, Which Half of Your Graphs are Lying, Deep Learning + Apache Spark, Data Mining - Failure to Launch, Linear Regression in Python, and many more.
- Upcoming Webcasts on Analytics, Big Data, Data Science – Oct 7 and beyond - Oct 6, 2014.
Evolution of Classification, Billion Dollar Fraud Detection, Big Data Visualization, Deep Learning on Apache Spark, and more.
- Learn how Sparkling Water brings H2O Deep Learning to Apache Spark, Oct 29 Webinar - Oct 6, 2014.
Sparkling Water is the latest innovation to combine two best-of-breed open source technologies, Apache Spark and H2O. Learn how to set up your own Sparkling Water environment at the Oct 29 webinar.
- Apache Spark: O’Reilly Certification, EU Training, University Program - Sep 26, 2014.
Recent news on Apache Spark includes developer certification from O'Reilly, upcoming training workshops in the EU by Databricks, and Spark tutorial events at major universities.
- Top Analytics and Big Data trends ahead of Strata Hadoop NYC Conference - Aug 14, 2014.
Top trends in Analytics and Big Data named by our readers ahead of Strata + Hadoop NYC conference include Deep Learning, Apache Spark, democratization of analytics, and real-time processing of Big Data.
- 18 essential Hadoop tools - Aug 1, 2014.
Hadoop tools develop at a rapid rate, and keeping up with the latest can be difficult. Here we detail 18 of the most essential tools that work well with Hadoop.
- Top stories for Jul 20-26 - Jul 27, 2014.
Baby steps in Learning Python; 7 Steps for Learning Data Mining; Spotting Bad Data Visualizations; MLlib: Apache Spark component for machine learning.
- MLlib: Apache Spark component for machine learning - Jul 24, 2014.
MLlib, the machine learning component of Apache Spark, has developed into a tool that supports many common machine learning algorithms and now comes with more mature documentation and a stable API.
- Upcoming Webcasts on Analytics, Big Data, Data Science – July 22 and beyond - Jul 21, 2014.
Data Visualization, Hadoop and Hadoop 2.0, R, Apache Spark, How Can Analytics Improve Business, The Grammar and Graphics of Data Science, Data Mining: Failure To Launch and more.
- Big Data TechCon, The How-To Conference, Oct 27-29, San Francisco - Jul 15, 2014.
Plan now to attend Big Data TechCon in San Francisco, to learn HOW-TO accommodate the terabytes and petabytes of data, learn the latest big data technologies, mingle and network, and be inspired by keynotes.
- Top KDnuggets tweets, Jun 30 – Jul 1: Is “Data Scientist” more than “Data Analyst”? Good list of 41 Big Data Influencers - Jul 2, 2014.
Is "Data Scientist" more than "Data Analyst"?; 41 Big Data Influencers - Journalists, Public Sector, Industry, Academia; Alteryx and Databricks to lead development of Apache SparkR; Top data mining researcher @Jure Leskovec lecture on Webgraph structure.
- YARN is All the Rage at Hadoop Summit 2014 - Jun 12, 2014.
Apache YARN, which enables much broader types of computations than MapReduce, is quickly becoming an integral part of Hadoop projects. We review best-practice considerations for a YARN cluster.
- Top KDnuggets tweets, Jun 4-5: “Practical Data Science with R” stands out; Top 5 cities for #BigData jobs - Jun 6, 2014.
How does the "Practical Data Science with R" book stand out?; Top 5 cities for #BigData jobs: San Francisco, McLean, Boston, St. Louis, and Toronto; Big jump in #BigData applications, code built with Apache Spark; 76 Startup Failure Post-Mortems.
- Big Data BootCamp: Highlights of talks on Day 3 - May 12, 2014.
Highlights from the presentations by big data technology practitioners from Hortonworks, Intel, Rackspace, SciSpike, and Yahoo at Big Data Bootcamp 2014 in Santa Clara.
- Top stories in April - May 2, 2014.
Apache Spark, the hot new trend in Big Data; Data Analytics Handbook - interviews with tech leaders, free download; Learning and Teaching Machine Learning; 9 Free Books for Learning Data Mining and Data Analysis.
- Top KDnuggets tweets, Apr 18-20 - Apr 22, 2014.
Cross-validation pitfalls for regression/classification and how to avoid them; Data Workflows for Machine Learning; Apache Spark, the hot new trend in Big Data; Visual Analysis Best Practices - download a free guidebook from Tableau.
- Top stories for Apr 13-19 - Apr 20, 2014.
Top LinkedIn Groups in 2014 for Analytics, Big Data, Data Science; Data Analytics Handbook, free download; Apache Spark, the hot new trend in Big Data; GoodData Open Analytics Platform.
- Top KDnuggets tweets, Apr 16-17 - Apr 19, 2014.
Scikit-Learn: a great Python library for machine learning; A map of where nobody lives in the US; Apache Spark, the hot new trend in Big Data; NYU @aghose on Est. Demand for Mobile Apps - Learn more: NYU Stern MS in Biz Analytics.
- Apache Spark, the hot new trend in Big Data - Apr 18, 2014.
Spark solves problems similar to those Hadoop MapReduce addresses, but with a fast in-memory approach and a clean functional-style API. Leveraging Hadoop YARN, Alpine has made it very simple to get started with Spark.
- Top KDnuggets tweets, Apr 9-10: MLlib: Scalable Machine Learning on Spark; Ensemble methods overview - Apr 11, 2014.
MLlib: Scalable Machine Learning on Spark (free ebook); Ensemble methods usually give best results in Machine Learning - an overview; Prediction.io open source machine learning server; Maslow Hierarchy of Analytical Needs - too clever?
- Top KDnuggets tweets, Apr 4-6: Apache Spark – a Fast #BigData Analytics Engine; Facebook #DataScience tools - Apr 7, 2014.
Apache Spark - a Fast #BigData Analytics Engine - very good, detailed overview! Facebook #DataScience team releases open-source tools; Top #BigData start-ups by employee satisfaction; My answer to "Which will be better for career prospects in Machine Learning".
- HP Perspective on Big Data and Analytics: Interview with Mazhar Hussain - Apr 3, 2014.
KDnuggets talks with Mazhar Hussain, HP Big Data & Analytics Services Leader, on key topics for the industry and 4 next big areas in Big Data.
- Alpine Data expects faster, easier Data Science with Spark - Mar 18, 2014.
Alpine Data Labs becomes one of the first companies to be certified on Apache Spark, which is reported to be up to 100x faster than Hadoop. Alpine answers 3 questions from KDnuggets.
- Top KDnuggets tweets, Mar 14-16: Is Apache Spark the Next Big Thing? R Meta-Book – best CRAN posts assembled - Mar 17, 2014.
Apache Spark promises to be the Next Big Thing in #Big Data - 100x faster than #Hadoop; An R Meta-Book - best CRAN posts assembled; The Beauty of pi - the fastest (and most incomprehensible) formula; Tips for Hiring Data Scientists: look for quants with business hustle.
- Top KDnuggets tweets, Mar 3-4: Accenture/MIT data science challenge; Spark graduates, 100x faster than Hadoop - Mar 5, 2014.
Accenture and MIT data science challenge - analyze City of Chicago; Spark graduates from Apache Incubator, 100x faster than Hadoop over in-memory data; Stanford Data Mining, Finance, Statistics Courses Online; Data Mining Cup 2014 - Student Competition starts.
- Top KDnuggets tweets, Feb 5-6: A Deep Learning expert wins Dogs vs Cats competition; An alternative to R and Python: Julia - Feb 7, 2014.
A Deep Learning expert wins Dogs vs Cats competition with an almost perfect result; An alternative to R and #Python: Julia; Spark is a hot trend in #BigData but what is it exactly? Here is an explanation; etcML - Free Text-Analysis Tool - Machine Learning as a Service.
- Top Trends in Analytics and Big Data ahead of Strata 2014 Santa Clara - Jan 28, 2014.
Our survey ahead of Strata 2014 Conference revealed the top 3 trends in Analytics and Big Data for 2014: Analytics for the masses, Apache Spark, and Real Time Analytics. Read the interesting comments and details and get a KDnuggets discount for Strata.
- Top KDnuggets tweets, Jan 8-9: Great list of NLP APIs; Python erodes R hegemony, but do not go all-in Python now - Jan 10, 2014.
Great list of 25+ NLP APIs for Sentiment Analysis, Text Processing, Topic Extraction; MLbase: Distributed Machine Learning using Apache Spark; "Sexy" Data Science should be a Team Sport, or it will fail; LinkedIn files lawsuit over data-mining bots which mine user profiles.