Library - PySpark

Lesson Description

Lesson - #1493 PySpark - SparkConf

To run a Spark application on a local machine or a cluster, we need to set a few configurations and parameters; this is what SparkConf provides. It holds the configuration for running a Spark application. The following code block shows the details of the SparkConf class for PySpark.

class pyspark.SparkConf (
    loadDefaults = True,
    _jvm = None,
    _jconf = None
)

Initially, we create a SparkConf object with SparkConf(), which also loads values from spark.* Java system properties. We can then set various parameters on the SparkConf object, and these parameters will take precedence over the corresponding system properties.

In the SparkConf class, the setter methods support chaining. For instance, you can write conf.setAppName("PySpark App").setMaster("local"). Once we pass a SparkConf object to Apache Spark, it cannot be modified by any user.

Following are some of the most commonly used attributes of SparkConf −

  • set(key, value) − To set a configuration property.
  • setMaster(value) − To set the master URL.
  • setAppName(value) − To set an application name.
  • get(key, defaultValue=None) − To get the configuration value of a key.
  • setSparkHome(value) − To set the Spark installation path on worker nodes.

Let us consider the following example of using SparkConf in a PySpark program. In this example, we set the Spark application name to PySpark App and the master URL for the Spark application to spark://master:7077.

The following code block contains the lines that, when added to a Python file, set the basic configuration for running a PySpark application.

    from pyspark import SparkConf, SparkContext

    conf = SparkConf().setAppName("PySpark App").setMaster("spark://master:7077")
    sc = SparkContext(conf=conf)