20 changes: 10 additions & 10 deletions docs/setup/glue.md
@@ -22,23 +22,23 @@ This tutorial will cover how to configure both a glue notebook and a glue ETL jo
have a working knowledge of AWS Glue jobs.

In the tutorial, we use
Sedona {{ sedona.current_version }} and [Glue 4.0](https://docs.aws.amazon.com/glue/latest/dg/release-notes.html) which runs on Spark 3.3.0, Java 8, Scala 2.12,
and Python 3.10. We recommend Sedona-1.3.1-incubating and above for Glue.
Sedona {{ sedona.current_version }} and [Glue 5.0](https://docs.aws.amazon.com/glue/latest/dg/release-notes.html) which runs on Spark 3.5.4, Java 17, Scala 2.12,
and Python 3.11. We recommend Sedona 1.8.0 and above for Glue 5.0.

!!!warning
**Important:** Since Sedona 1.8.0, Java 8 support is dropped and Spark 3.3 support is dropped. For Sedona 1.8.0+, you need to use Glue 5.0+ which supports Java 11 and Spark 3.4+.
**Important:** Since Sedona 1.8.0, Java 8 support is dropped and Spark 3.3 support is dropped. Sedona 1.8.0+ requires Glue 5.0+ which supports Java 17 and Spark 3.5+. If you must stay on Glue 4.0 (Spark 3.3, Java 8), use Sedona 1.7.1 or lower along with the `sedona-spark-shaded-3.3_2.12` artifact instead of the `sedona-spark-shaded-3.5_2.12` artifact shown below.

## Gather Maven Links

You will need to point your glue job to the Sedona and Geotools jars. We recommend using the jars available from maven. The links below are those intended for Glue 4.0
You will need to point your glue job to the Sedona and Geotools jars. We recommend using the jars available from maven. The links below are those intended for Glue 5.0.

Sedona Jar: [Maven Central](https://repo1.maven.org/maven2/org/apache/sedona/sedona-spark-shaded-3.3_2.12/{{ sedona.current_version }}/sedona-spark-shaded-3.3_2.12-{{ sedona.current_version }}.jar)
Sedona Jar: [Maven Central](https://repo1.maven.org/maven2/org/apache/sedona/sedona-spark-shaded-3.5_2.12/{{ sedona.current_version }}/sedona-spark-shaded-3.5_2.12-{{ sedona.current_version }}.jar)

Geotools Jar: [Maven Central](https://repo1.maven.org/maven2/org/datasyslab/geotools-wrapper/{{ sedona.current_geotools }}/geotools-wrapper-{{ sedona.current_geotools }}.jar)

!!!note
Ensure you pick a version for Scala 2.12 and Spark 3.3. The Spark 3.4 and Scala
2.13 jars are not compatible with Glue 4.0.
Ensure you pick a version for Scala 2.12 and Spark 3.5. The Scala
2.13 jars are not compatible with Glue 5.0.
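
The link pattern above can be sketched as a small Python helper that assembles both Maven Central URLs from the versions in play. This is only an illustration: the function name `sedona_jar_urls`, the `GLUE_TO_SPARK` mapping, and the GeoTools wrapper version used in the usage lines are assumptions for the example and not part of the Sedona docs, so verify the versions on Maven Central before relying on them.

```python
MAVEN_BASE = "https://repo1.maven.org/maven2"

# Glue version -> Spark line of the shaded Sedona artifact, as described in
# the text above (Glue 4.0 runs Spark 3.3, Glue 5.0 runs Spark 3.5).
GLUE_TO_SPARK = {"4.0": "3.3", "5.0": "3.5"}


def sedona_jar_urls(glue_version, sedona_version, geotools_version,
                    scala_version="2.12"):
    """Return (sedona_url, geotools_url) for the given versions.

    Hypothetical helper: it only formats strings and does not check that
    the artifacts actually exist on Maven Central.
    """
    spark = GLUE_TO_SPARK[glue_version]
    artifact = f"sedona-spark-shaded-{spark}_{scala_version}"
    sedona_url = (
        f"{MAVEN_BASE}/org/apache/sedona/{artifact}/"
        f"{sedona_version}/{artifact}-{sedona_version}.jar"
    )
    geotools_url = (
        f"{MAVEN_BASE}/org/datasyslab/geotools-wrapper/"
        f"{geotools_version}/geotools-wrapper-{geotools_version}.jar"
    )
    return sedona_url, geotools_url


# Usage: "1.8.0-33.1" is an assumed example value for the GeoTools wrapper.
sedona_url, geotools_url = sedona_jar_urls("5.0", "1.8.0", "1.8.0-33.1")
print(sedona_url)
print(geotools_url)
```

Keeping the Glue-to-Spark mapping in one place makes it harder to pair a Glue 4.0 job with the `3.5_2.12` artifact by accident.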

## Configure Glue Job

@@ -55,20 +55,20 @@ and the second installs the Sedona Python package directly from pip.

```text
# Sedona Config
%extra_jars https://repo1.maven.org/maven2/org/apache/sedona/sedona-spark-shaded-3.3_2.12/{{ sedona.current_version }}/sedona-spark-shaded-3.3_2.12-{{ sedona.current_version }}.jar, https://repo1.maven.org/maven2/org/datasyslab/geotools-wrapper/{{ sedona.current_geotools }}/geotools-wrapper-{{ sedona.current_geotools }}.jar
%extra_jars https://repo1.maven.org/maven2/org/apache/sedona/sedona-spark-shaded-3.5_2.12/{{ sedona.current_version }}/sedona-spark-shaded-3.5_2.12-{{ sedona.current_version }}.jar, https://repo1.maven.org/maven2/org/datasyslab/geotools-wrapper/{{ sedona.current_geotools }}/geotools-wrapper-{{ sedona.current_geotools }}.jar
%additional_python_modules apache-sedona=={{ sedona.current_version }}
```

If you are using the example notebook from glue, the first cell should now look like this:

```text
%idle_timeout 2880
%glue_version 4.0
%glue_version 5.0
%worker_type G.1X
%number_of_workers 5

# Sedona Config
%extra_jars https://repo1.maven.org/maven2/org/apache/sedona/sedona-spark-shaded-3.3_2.12/{{ sedona.current_version }}/sedona-spark-shaded-3.3_2.12-{{ sedona.current_version }}.jar, https://repo1.maven.org/maven2/org/datasyslab/geotools-wrapper/{{ sedona.current_geotools }}/geotools-wrapper-{{ sedona.current_geotools }}.jar
%extra_jars https://repo1.maven.org/maven2/org/apache/sedona/sedona-spark-shaded-3.5_2.12/{{ sedona.current_version }}/sedona-spark-shaded-3.5_2.12-{{ sedona.current_version }}.jar, https://repo1.maven.org/maven2/org/datasyslab/geotools-wrapper/{{ sedona.current_geotools }}/geotools-wrapper-{{ sedona.current_geotools }}.jar
%additional_python_modules apache-sedona=={{ sedona.current_version }}

