This repository was archived by the owner on Sep 10, 2018. It is now read-only.

Reviewing Data Submission Pull Requests for the OpenHuntingData Project

Kate Dougherty edited this page May 9, 2017 · 2 revisions

Overview

Most pull requests add or change a single JSON file that defines the URL of a geospatial data source and specifies how to transform that source data into GeoJSON format. Reviewing a data pull request for this project involves three stages: (1) reviewing the request manually to identify the JSON file to process, (2) running the Python script that builds the GeoJSON file from the original source data noted in the JSON file, and (3) reviewing the output.

This repository contains both code and data. Because the code in the pull request’s branch may be out of date, you must use the Python script from the master branch when building the GeoJSON file.

Prerequisites

  • Clone the OpenHuntingData repository and ensure that its submodules are up to date (git pull && git submodule init && git submodule update). These are three separate commands that run in sequence; they should not be joined with a pipe. The submodules contain the copy of the Python code that you will run.

Choose a Data Pull Request to Review

Note: Automated tests run on each pull request. If a request hasn’t passed these preliminary tests (as indicated by a red “x” in the list of requests), do not proceed further. Choose a different request that has passed the tests (as indicated by a green checkmark).

Clone the Associated Branch

  • Clone or check out the branch the pull request is coming from. You’ll find this information at the end of the line that appears directly below the pull request’s title. For example, the requesting branch in the following line would be “backpkr1:add-US-TN-all”:

     ```
     backpkr1  wants to merge 1 commit into OpenBounds:master from backpkr1:add-US-TN-all
     ```
    

    The Git command to clone the branch for this pull request (over SSH) would be:

     ```
     git clone git@github.com:backpkr1/OpenHuntingData.git --branch add-US-TN-all
     ```
    

    If you are cloning via HTTPS, use the URL associated with the requesting user’s forked repository. For example:

     ```
     git clone https://github.com/backpkr1/OpenHuntingData.git --branch add-US-TN-all
     ```
    
  • Most requests will originate from one of these two forks:
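
Whichever fork a request comes from, the clone commands shown above follow one pattern: the text before the colon in the requesting branch is the GitHub user, and the text after it is the branch name. The following is a minimal sketch of that pattern. The function name `clone_commands` is hypothetical, and the sketch assumes the fork kept the repository name OpenHuntingData, as in the example above.

```python
def clone_commands(head_ref, repo="OpenHuntingData"):
    """Build clone commands from a head reference such as 'backpkr1:add-US-TN-all'.

    Assumption: the fork is named like the upstream repository (`repo`).
    """
    user, sep, branch = head_ref.partition(":")
    if not sep or not user or not branch:
        raise ValueError(f"expected 'user:branch', got {head_ref!r}")
    return {
        "ssh": f"git clone git@github.com:{user}/{repo}.git --branch {branch}",
        "https": f"git clone https://github.com/{user}/{repo}.git --branch {branch}",
    }


# Print both variants for the example request from this page.
for command in clone_commands("backpkr1:add-US-TN-all").values():
    print(command)
```

If a contributor renamed their fork, pass the fork's actual name as `repo`.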

Stage 1: Retrieve the Name of the JSON File to Process

  • Identify the JSON file the pull request is submitting. You can find this information at the end of the pull request's name. For example, the request named “add US/TN/all.json” submits the file “all.json”.
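
As a rough illustration, the file's location can be derived from the request name. This sketch assumes the name ends with a path such as “US/TN/all.json” and that source files live under a “sources” directory, as in the example command in Stage 2. The function name `source_path_from_title` is hypothetical, and not every request name will necessarily follow this pattern.

```python
import posixpath


def source_path_from_title(title, sources_dir="sources"):
    """Return (path under sources_dir, bare file name) for a request name.

    Assumption: the last word ending in '.json' is the submitted file's
    path relative to the sources directory.
    """
    candidates = [word for word in title.split() if word.lower().endswith(".json")]
    if not candidates:
        raise ValueError(f"no JSON file name found in {title!r}")
    relative = candidates[-1]
    return posixpath.join(sources_dir, relative), posixpath.basename(relative)


print(source_path_from_title("add US/TN/all.json"))
```

Always confirm the result against the "Files Changed" tab of the pull request, since the request name is free text.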

Stage 2: Execute the Python Script that Builds the GeoJSON File

  • Navigate to the directory containing your working code from the master branch (cd OpenHuntingData).

  • Run the Python process script on the pull request’s JSON file. The first argument to the script is the path to the JSON file it will process. The second argument is the directory to which the output should be written (for this project, the output directory is “generated”).

    python ./scripts/process.py /path/to/pull/request/sources/US/TN/all.json generated
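
When reviewing several requests, the same invocation can be scripted. The sketch below is one possible wrapper, not part of the project. The helper names are hypothetical, and two things are assumptions: that the output path mirrors the source path under the output directory (inferred from the sample path generated/US/MT/antelope.geojson on this page), and that process.py reports failure through its exit status, which this page does not confirm.

```python
import subprocess
import sys
from pathlib import PurePosixPath


def expected_output(source_json, out_dir="generated", sources_dir="sources"):
    """Guess where the GeoJSON lands for a given source file.

    Assumption: .../sources/US/TN/all.json -> generated/US/TN/all.geojson.
    """
    parts = PurePosixPath(source_json).parts
    # Position of the last 'sources' component in the path.
    idx = len(parts) - 1 - parts[::-1].index(sources_dir)
    relative = PurePosixPath(*parts[idx + 1:])
    return str(PurePosixPath(out_dir) / relative.with_suffix(".geojson"))


def run_process(source_json, out_dir="generated", script="./scripts/process.py"):
    """Run the process script from the master checkout; return its exit status."""
    result = subprocess.run([sys.executable, script, source_json, out_dir])
    return result.returncode


print(expected_output("/path/to/pull/request/sources/US/TN/all.json"))
```

Call `run_process` from the master-branch checkout so that `./scripts/process.py` resolves to the up-to-date script, then check that the file named by `expected_output` exists.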
    

You should see output like the following when a file processes correctly (this sample comes from processing a Montana source file):

```
Downloading http://fwp.mt.gov/gisData/shapefiles/huntDistrictsAntelope2014.zip
INFO:requests.packages.urllib3.connectionpool:Starting new HTTP connection (1): fwp.mt.gov
Reading huntDistrictsAntelope2014.zip
Done. Processed to generated/US/MT/antelope.geojson
```

# Potential Processing Errors

There are several classes of potential errors:
* The URL for the source data is invalid because it was entered incorrectly.
* The ID or name field was not specified in the pull request.    
* File not found error. This error can have one of three possible causes:
 1. The URL was a temporary address generated for the original user, and has since expired.
 2. The URL was entered incorrectly.
 3. The file has moved, or is defunct.
* The file was downloaded, but could not be opened. This error can have several possible causes:
 1. The original data source's file type is not supported.
 2. The file is not georeferenced correctly.
 3. There are multiple files in the download, and the file to open was not specified.
     * To correct this error, add the name of the appropriate shapefile to the JSON file:
         * Review the JSON code on the pull request's GitHub page, under the "Files Changed" tab.
         * Copy the URL listed there.
         * Visit the URL in your browser and download the data manually. 
         * Unzip the data and review the shapefiles in your GIS program to determine which one should be used.
         * Add the following as the second line of the JSON file, replacing FilenameOfShapefileToUse with the name of the shapefile you chose:
                 
             ```
             "filenameInZip": "FilenameOfShapefileToUse",
             ```
         * Attempt to run the Python script again.

* If you are unable to correct a processing error, document the error by adding a comment to the pull request's GitHub page.
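
For the multiple-files case described above, the manual steps can be partly scripted. The following is a minimal sketch using only the Python standard library. The helper names are hypothetical, and it assumes the source file is a single JSON object. It appends the key rather than placing it on the second line, which makes no difference to a JSON parser. Whether the value should include the ".shp" extension is not stated on this page, so compare against existing source files in the repository before committing.

```python
import json
import zipfile


def shapefiles_in_zip(zip_path):
    """List the .shp members of a manually downloaded archive."""
    with zipfile.ZipFile(zip_path) as archive:
        return [name for name in archive.namelist() if name.lower().endswith(".shp")]


def set_filename_in_zip(json_path, shapefile_name):
    """Add or update the "filenameInZip" key in a source JSON file."""
    with open(json_path, encoding="utf-8") as handle:
        source = json.load(handle)
    source["filenameInZip"] = shapefile_name
    with open(json_path, "w", encoding="utf-8") as handle:
        json.dump(source, handle, indent=4)
        handle.write("\n")
    return source
```

Listing the archive only narrows the choice; you still need to open the candidates in a GIS program to decide which shapefile is the right one.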

# Stage 3: Review the Output

* Open the output file in the GIS program of your choice and check the data for errors.
 * Ensure that features have the correct attributes (i.e., ID and name fields for hunt districts or game management units).
 * Ensure that features are in their correct locations by superimposing them on a base map. If the georeferencing in the source file is incorrect, they may appear in the wrong locations.
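
Before opening the file in a GIS program, a quick scripted pass can catch obvious problems. This is only a sketch and does not replace the visual check against a base map. The property names "id" and "name" are assumptions about how the output labels its attributes, and the coordinate test relies on GeoJSON positions being longitude and latitude in degrees, so values outside those ranges suggest the source was not reprojected correctly.

```python
import json


def missing_attributes(collection, required=("id", "name")):
    """Return (feature index, missing keys) for features lacking attributes."""
    problems = []
    for index, feature in enumerate(collection.get("features", [])):
        properties = feature.get("properties") or {}
        absent = [key for key in required if properties.get(key) in (None, "")]
        if absent:
            problems.append((index, absent))
    return problems


def _positions(coords):
    """Yield every [lon, lat, ...] position from nested coordinate arrays."""
    if coords and isinstance(coords[0], (int, float)):
        yield coords
    else:
        for item in coords:
            yield from _positions(item)


def out_of_range_features(collection):
    """Return indexes of features with positions outside lon/lat bounds."""
    bad = []
    for index, feature in enumerate(collection.get("features", [])):
        geometry = feature.get("geometry") or {}
        for lon, lat, *_ in _positions(geometry.get("coordinates", [])):
            if not (-180 <= lon <= 180 and -90 <= lat <= 90):
                bad.append(index)
                break
    return bad
```

To use it, load the generated file with `json.load` and pass the resulting dictionary to both functions; empty lists mean neither check found a problem.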

# Merge the Pull Request

* After completing all of the above steps successfully, merge the pull request.