BoltMaud/Pyspark_pytest
SPARK TESTS WITH PYTEST AND PYSPARK

Configure your machine

Download Spark and Hadoop

  • Download Spark, the pre-built version for Hadoop.
  • Download Hadoop.

I used Spark 2.2 and Hadoop 2.7.

Folders

  • Extract Spark and move it to C:\ (or anywhere else on your computer)
  • Move winutils.exe to C:\winutils\bin\ (so that HADOOP_HOME, set below, points to C:\winutils)

Environment variables

  • Set SPARK_HOME and HADOOP_HOME:
SPARK_HOME = "C:\spark-YOUR_VERSION"
HADOOP_HOME = "C:\winutils"

Because this did not work on my Windows machine, I added the following lines to my conftest.py instead:

import os

os.environ["SPARK_HOME"] = r"C:\spark-2.2.0-bin-hadoop2.7"
os.environ["HADOOP_HOME"] = r"C:\winutils"
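For reference, a complete conftest.py along these lines might look like the sketch below. The local[2] master, the appName, and the session-scoped fixture name are my own choices, not something taken from this repo:

```python
# conftest.py -- minimal sketch of a session-scoped Spark fixture.
# The Windows paths are examples for my machine; adjust them to yours.
import os

import pytest

os.environ["SPARK_HOME"] = r"C:\spark-2.2.0-bin-hadoop2.7"
os.environ["HADOOP_HOME"] = r"C:\winutils"


@pytest.fixture(scope="session")
def spark():
    # Imported here so the module can still be collected if pyspark is absent.
    from pyspark.sql import SparkSession

    session = (
        SparkSession.builder
        .master("local[2]")          # run Spark locally with 2 threads
        .appName("pytest-pyspark")
        .getOrCreate()
    )
    yield session
    session.stop()                   # tear down after the whole test session
```

A session-scoped fixture means the Spark context is created once and shared by all tests, which keeps the suite fast.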

Install the Python libraries

pip install pytest 
pip install pyspark

Download the conftest.py

In the links below, you can find different pytest configurations for launching the Spark context.

You can also use mine, in the src folder, which already has the environment variables set.

Run the tests

The conftest.py should be in the folder of your test files.

To run a test in a console :

pytest main_test.py

To run a test from IntelliJ: Settings -> Tools -> Python Integrated Tools -> set the default test runner to py.test.
Then you can run your test file like a normal Python file.

Todo

Find a way to hide the warnings Spark prints to the test output

About

Examples and tutorials of pytest with pyspark
