In Apache Spark, you can upload your files using sc.addFile (sc is your default SparkContext)
and get the path on a worker using SparkFiles.get. Thus, SparkFiles resolves the paths to files added through SparkContext.addFile().

SparkFiles contains the following class methods:

  • get(filename)
  • getRootDirectory()


get(filename) returns the absolute path of a file that was added through SparkContext.addFile().


getRootDirectory() returns the path to the root directory that contains the files added through SparkContext.addFile().

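    from pyspark import SparkContext
    from pyspark import SparkFiles

    # Local path of the file to distribute, and the name used to look it up
    finddistance = "/home/hadoop/examples_pyspark/finddistance.R"
    finddistancename = "finddistance.R"

    sc = SparkContext("local", "SparkFile App")
    # Register the file so that SparkFiles.get can resolve it
    sc.addFile(finddistance)
    print("Absolute Path -> %s" % SparkFiles.get(finddistancename))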

Command: run the script with the following command.

    $SPARK_HOME/bin/spark-submit sparkfiles.py

Output: the above command prints the following.

    Absolute Path ->