Big Data with Apache Spark and Python – Hands On!

Apache Spark tutorial with 20+ hands-on examples of analyzing large data sets on your desktop or on Hadoop with Python!

Description

The Author Recommends This Course:

Taming Big Data with Apache Spark and Python – Hands On!

Disclaimer: This course is a recommendation from the author and is not the work of bigdataprogrammers.com.

What you’ll learn

  • Use DataFrames and Structured Streaming in Spark 3
  • Frame big data analysis problems as Spark problems
  • Use Amazon’s Elastic MapReduce service to run your job on a cluster with Hadoop YARN
  • Install and run Apache Spark on a desktop computer or on a cluster
  • Use Spark’s Resilient Distributed Datasets to process and analyze large data sets across many CPUs
  • Implement iterative algorithms such as breadth-first-search using Spark
  • Use the MLlib machine learning library to answer common data mining questions
  • Understand how Spark SQL lets you work with structured data
  • Understand how Spark Streaming lets you process continuous streams of data in real time
  • Tune and troubleshoot large jobs running on a cluster
  • Share information between nodes on a Spark cluster using broadcast variables and accumulators
  • Understand how the GraphX library helps with network analysis problems
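
To give a flavor of what "framing a problem as a Spark problem" can look like for an iterative algorithm such as breadth-first search, here is a minimal sketch in plain Python. It is not taken from the course and does not use Spark itself: the function names (expand, merge, bfs_distances) and the three node states are assumptions chosen for illustration. The structure mirrors a map step followed by a per-key reduce step, which is roughly how such a pass might be expressed with an RDD's flatMap and reduceByKey.

```python
from collections import defaultdict
from functools import reduce

# Hypothetical node states for this sketch (not a Spark API).
UNVISITED, FRONTIER, DONE = 0, 1, 2
INF = float("inf")


def expand(record):
    """Map step: a frontier node emits each neighbor one hop further away."""
    node, (edges, dist, state) = record
    emitted = []
    if state == FRONTIER:
        for neighbor in edges:
            # The neighbor's own edge list is unknown here; merge() restores it.
            emitted.append((neighbor, ([], dist + 1, FRONTIER)))
        state = DONE
    emitted.append((node, (edges, dist, state)))
    return emitted


def merge(a, b):
    """Reduce step: keep the real edge list, the shortest distance,
    and the most advanced state seen for the same node."""
    edges = a[0] if len(a[0]) >= len(b[0]) else b[0]
    return (edges, min(a[1], b[1]), max(a[2], b[2]))


def bfs_distances(graph, start):
    """Repeat map + reduce passes until no frontier nodes remain."""
    records = [
        (node, (list(edges),
                0 if node == start else INF,
                FRONTIER if node == start else UNVISITED))
        for node, edges in graph.items()
    ]
    while any(state == FRONTIER for _, (_, _, state) in records):
        grouped = defaultdict(list)
        for record in records:
            for key, value in expand(record):
                grouped[key].append(value)
        records = [(key, reduce(merge, values)) for key, values in grouped.items()]
    return {node: dist for node, (_, dist, _) in records}
```

Calling bfs_distances({"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}, "a") assigns distance 0 to "a", 1 to "b" and "c", and 2 to "d". Nodes that cannot be reached from the start keep an infinite distance. On a real cluster each pass would run as a distributed job rather than a local loop.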