Converting Geospatial Files Using ogr2ogr in Scala

Converting geospatial files from one format to another is one of the most common tasks when starting a new project. GDAL’s ogr2ogr, a command line tool built for exactly that, makes the process significantly easier. Calling it from within your Scala code isn’t hard, but the setup has a few non-obvious steps. I’ll walk through them below.

Note
If you'd like to see the full working example ahead of time, see: https://github.com/Brideau/scalaogr2ogr

GDAL Setup

First, since ogr2ogr depends on GDAL, you’ll need the GDAL Java Bindings set up before you get started. See my previous post, Find a Geospatial File’s SRID Using Scala and GDAL, for some guidance on that.

Project Setup

Add the Boundless Geo resolver and the GDAL library dependency to your project’s build.sbt file (for the record, I’m running Scala 2.12.4 and SBT 1.0.3).
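A build.sbt along these lines should cover both pieces. Treat it as a sketch: the resolver URL and the GDAL version number are assumptions on my part (the repository may have moved since this was written), so check them against your own setup before relying on them.

```scala
// build.sbt (sketch): the resolver URL and GDAL version are assumptions to verify
scalaVersion := "2.12.4"

// Boundless Geo repository; confirm this URL is still being served
resolvers += "Boundless Geo" at "https://repo.boundlessgeo.com/main/"

// Placeholder version: set it to match the native GDAL installed on your machine
val gdalVersion = "2.2.0"

// GDAL Java bindings (the native library must still be installed separately)
libraryDependencies += "org.gdal" % "gdal" % gdalVersion
```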

Download this sample file and save it in your project under the /src/main/resources folder, creating it if it doesn’t already exist.

Merging the ogr2ogr Java Port with Scala

Next, you have to track down the version of ogr2ogr that has been ported to Java. This is buried inside OSGeo’s GitHub repo under the SWIG bindings folder, which you can find here. It doesn’t have the prettiest API, as you’ll see (even the author admits as much in the source code comments), but it does the trick.

Create a new folder in your project, src/main/java, to store this file, and put it in the package org.gdal.apps to keep things organized. Finally, rename the class’s main method to execute (this should be on line 99 or so of the file), so that public static void main becomes public static void execute. This keeps it from being picked up as a second program entry point later.

Calling ogr2ogr

We’ll be using Futures to keep things nice and asynchronous, so you’ll need to import them, plus a few other things, at the top of your class.

Next, add a function that creates a folder to hold your output and calls ogr2ogr to perform the conversion.
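Here is a minimal sketch of such a function, imports included. It is not the exact code from the repo linked above. So that the block compiles without GDAL on the classpath, the conversion step is passed in as a hypothetical run parameter; in the real project you would pass the renamed org.gdal.apps.ogr2ogr.execute there. The names Converter, convert and ConverterDemo are illustrative only.

```scala
import java.io.File
import java.nio.file.Files
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object Converter {
  // Creates outputDir if needed, then runs the conversion on another thread.
  // `run` is a hypothetical hook so this sketch compiles without GDAL; in the
  // project you would pass org.gdal.apps.ogr2ogr.execute(_) instead.
  def convert(outputDir: File, args: Array[String])(run: Array[String] => Unit)(
      implicit ec: ExecutionContext): Future[File] =
    Future {
      Files.createDirectories(outputDir.toPath) // no-op if it already exists
      run(args)                                 // blocks until the conversion returns
      outputDir
    }
}

object ConverterDemo {
  def main(cliArgs: Array[String]): Unit = {
    import ExecutionContext.Implicits.global
    // Stub runner that only prints the command it was given.
    val done = Converter.convert(new File("temp"), Array("-f", "ESRI Shapefile"))(
      a => println(a.mkString("ogr2ogr ", " ", "")))
    Await.result(done, 1.minute) // wait so the JVM does not exit early
  }
}
```

The global execution context runs its work on daemon threads, so a short-lived program has to wait on the returned Future (here with Await.result) before exiting. Otherwise the JVM may shut down before the conversion finishes.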

Finally, build up your ogr2ogr command using the ogr2ogr documentation and the output driver’s specific documentation, just as you would if running it from the command line. Store it as an array of strings. For example, to output as a Shapefile, search for "ogr2ogr driver shapefile" to find the driver’s page.
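As a rough illustration, a Shapefile conversion could be written as the array below. The input and output paths are hypothetical (I'm assuming a GeoJSON sample file named sample.geojson), so substitute your own. Each command line token becomes its own array element, and because no shell is involved, the driver name needs no extra quoting.

```scala
object ShapefileCommand {
  // Hypothetical paths: substitute your own input file and output location.
  val input  = "src/main/resources/sample.geojson"
  val output = "temp/output.shp"

  // Equivalent to the command line:
  //   ogr2ogr -f "ESRI Shapefile" temp/output.shp src/main/resources/sample.geojson
  val args: Array[String] = Array(
    "-f", "ESRI Shapefile", // output driver name as a single element, no shell quotes
    output,                 // destination datasource comes first
    input                   // source datasource comes second
  )
}
```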

Now you can build and run it, and it should create and store a shapefile in a new temp directory within the folder you ran it from. You may see a large number of warnings about the fields being too wide using the example file, but the conversion should still complete successfully.

Passing Additional Parameters

Some drivers, such as the CSV driver, take a number of additional parameters to configure the output format. These go into the same array of arguments as any other ogr2ogr option.
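One way to do this, sketched here with a small hypothetical helper (layerOptions is my own name, not part of GDAL), is to expand each layer creation option into a "-lco" flag followed by NAME=VALUE. GEOMETRY=AS_WKT and SEPARATOR=SEMICOLON are options from the CSV driver documentation. The file paths are again placeholders, and I'm assuming the Java port accepts -lco the same way the command line tool does.

```scala
object CsvCommand {
  // Expands each option into the two tokens ogr2ogr expects: "-lco", "NAME=VALUE".
  def layerOptions(opts: Seq[(String, String)]): Array[String] =
    opts.flatMap { case (name, value) => Seq("-lco", s"$name=$value") }.toArray

  // Equivalent to the command line:
  //   ogr2ogr -f CSV -lco GEOMETRY=AS_WKT -lco SEPARATOR=SEMICOLON <dst> <src>
  val args: Array[String] =
    Array("-f", "CSV") ++
      layerOptions(Seq("GEOMETRY" -> "AS_WKT", "SEPARATOR" -> "SEMICOLON")) ++
      Array("temp/output.csv", "src/main/resources/sample.geojson") // hypothetical paths
}
```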


Hey, I'm Ryan Brideau. I work as a Senior Data Scientist at Wealthsimple. Previously, I was at Shopify. You can follow me on Twitter here: @Brideau