Large-Scale Machine Learning with Spark

Lessons from Large-Scale Machine Learning Deployments on Apache Spark.

  • Section 1: Performance Tuning and Practical Integration
    • Spark 1.1: MLlib Performance Improvements
    • Recent performance improvements in Apache Spark: SQL, Python, DataFrames, and More
    • Deep Learning with Spark and TensorFlow
    • Auto-Scaling scikit-learn with Spark
  • Section 2: Machine Learning Scenarios
    • Simplify Machine Learning on Spark with Databricks
    • Visualizing Machine Learning Models
    • On-Time Flight Performance with GraphFrames for Apache Spark
    • Mining Ecommerce Graph Data with Spark at Alibaba Taobao
    • Audience Modeling With Spark ML Pipelines
    • Interactive Audience Analytics With Spark and HyperLogLog
    • Approximate Algorithms in Apache Spark: HyperLogLog and Quantiles
    • Genome Sequencing in a Nutshell
    • Parallelizing Genome Variant Analysis
    • Predicting Geographic Population using Genome Variants and K-Means
    • Apache Spark 2.0 Preview: Machine Learning Model Persistence
  • Section 3: Select Case Studies
    • Inneractive Optimizes the Mobile Ad Buying Experience at Scale with Machine Learning on Databricks
    • Yesware Deploys Production Data Pipeline in Record Time with Databricks
    • Elsevier Labs Deploys Databricks for Unified Content Analysis
    • Findify’s Smart Search Gets Smarter with Spark MLlib and Databricks
    • How Sellpoints Launched a New Predictive Analytics Product with Databricks
    • LendUp Expands Access to Credit with Databricks
    • MyFitnessPal Delivers New Feature, Speeds up Pipeline, and Boosts Team Productivity with Databricks
    • How DNV GL Uses Databricks to Build Tomorrow’s Energy Grid
    • How Omega Point Delivers Portfolio Insights for Financial Services with Databricks
    • Sellpoints Develops Shopper Insights with Databricks
Published: December 2016
Author: Databricks