Converting Geospatial Files Using ogr2ogr in Scala
Converting geospatial files from one format to another is one of the most common tasks when starting a new project. GDAL’s ogr2ogr, a command line tool built for exactly that, makes the process significantly easier. Calling it from within your Scala code isn’t hard, but a few of the steps are easy to get wrong. I’ll show you how below.
GDAL Setup
First, since ogr2ogr depends on GDAL, you’ll need the GDAL Java Bindings set up before you get started. See my previous post, Find a Geospatial File’s SRID Using Scala and GDAL, for some guidance on that.
Project Setup
Add the following to your project’s build.sbt file to get the Boundless Geo resolver and to load the GDAL library (for the record, I’m running Scala 2.12.4 and SBT 1.0.3):
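A minimal sketch of what those build.sbt lines could look like. The resolver URL and the GDAL version number here are assumptions on my part, so check both against the repository and against the native GDAL version you have installed before relying on them.

```scala
// build.sbt (sketch; the resolver URL and GDAL version are assumptions)
scalaVersion := "2.12.4"

// Assumed URL for the Boundless Geo repository; confirm it still resolves.
resolvers += "Boundless" at "https://repo.boundlessgeo.com/main"

// GDAL Java bindings; the version should match your native GDAL install.
libraryDependencies += "org.gdal" % "gdal" % "2.2.0"
```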
Download this sample file and save it in your project under the /src/main/resources folder, creating it if it doesn’t already exist.
Merging the ogr2ogr Java Port with Scala
Next, you have to track down the version of ogr2ogr that has been ported to Java. This is buried inside OSGeo’s GitHub repo under the SWIG bindings folder, which you can find here. It doesn’t have the prettiest API, as you’ll see (even the author admits as much in the source code comments), but it does the trick.
Create a new folder in your project, src/main/java, to store this file, and put it in a package named org.gdal.apps to keep things organized. Finally, rename the main method to execute (its declaration should be on line 99 or so of the file) to keep it from confusing the JVM later.
Calling ogr2ogr
We’ll be using Futures to keep things nice and asynchronous, so you’ll need to import them and a few other things at the top of your class:
Next, add this function that will be used to create a folder to hold your output and to call ogr2ogr to perform the conversion:
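Here is a rough sketch of how the imports and that function might fit together. The names ConversionSketch and convertAsync are hypothetical, and the converter is passed in as a parameter so the example runs without GDAL installed. In the real project you would pass something like args => org.gdal.apps.ogr2ogr.execute(args), assuming the rename described earlier.

```scala
import java.nio.file.{Files, Path, Paths}
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object ConversionSketch {
  // Hypothetical helper: makes sure the output folder exists, then runs the
  // supplied converter inside a Future and returns the folder it wrote to.
  def convertAsync(outputDir: Path, args: Array[String])(
      convert: Array[String] => Unit
  )(implicit ec: ExecutionContext): Future[Path] =
    Future {
      Files.createDirectories(outputDir) // does nothing if the folder already exists
      convert(args)
      outputDir
    }

  def main(cliArgs: Array[String]): Unit = {
    import ExecutionContext.Implicits.global

    // Stand-in converter that only prints its arguments. With GDAL set up,
    // this would be: args => org.gdal.apps.ogr2ogr.execute(args)
    val stub: Array[String] => Unit =
      args => println(args.mkString("ogr2ogr ", " ", ""))

    // Placeholder arguments and paths, for illustration only.
    val args = Array("-f", "ESRI Shapefile", "temp/out.shp", "in.geojson")
    val result = convertAsync(Paths.get("temp"), args)(stub)

    // Blocking is only for the demo; a real application would compose the Future.
    println(s"Output folder: ${Await.result(result, 30.seconds)}")
  }
}
```

Taking the converter as a function also makes the helper easy to test, since a stub can stand in for the native GDAL call.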
Finally, build up your ogr2ogr command using the ogr2ogr documentation and the output driver’s own documentation, just as you would if you were running it from the command line, and store it as an array of strings. For example, to output a Shapefile, just Google ogr2ogr driver shapefile to find this page.
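As an illustration, a Shapefile command could be assembled like this. The object name, helper name, and file paths are placeholders I made up for the sketch. The -f flag and the "ESRI Shapefile" driver name are the same ones you would use on the command line, and ogr2ogr expects the destination before the source.

```scala
object ShapefileArgs {
  // Hypothetical helper: builds the argument array for a Shapefile conversion.
  def build(input: String, output: String): Array[String] =
    Array(
      "-f", "ESRI Shapefile", // output driver; no shell quoting needed inside an array
      output,                 // ogr2ogr takes the destination first
      input                   // then the source
    )

  def main(args: Array[String]): Unit = {
    // Placeholder paths; substitute the sample file you downloaded.
    val cmd = build("src/main/resources/sample.geojson", "temp/sample.shp")
    println(cmd.mkString(" "))
  }
}
```

Each array element is handed to ogr2ogr as one argument, which is why "ESRI Shapefile" stays a single string even though it contains a space.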
Now you can build and run it, and it should create and store a shapefile in a new temp directory within the folder you ran it from. You may see a large number of warnings about the fields being too wide using the example file, but the conversion should still complete successfully.
Passing Additional Parameters
Some drivers, such as the CSV Driver, take a number of additional parameters to configure the output format. These can be added as shown here:
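A small sketch of how such options could be added to the array. GEOMETRY and SEPARATOR are layer creation options listed in the CSV driver documentation, and each one is passed with its own -lco flag. The object and helper names, and the particular option values chosen, are only for illustration.

```scala
object CsvArgs {
  // Hypothetical helper: turns key/value pairs into repeated "-lco KEY=VALUE" arguments.
  def layerOptions(options: Seq[(String, String)]): Array[String] =
    options.flatMap { case (key, value) => Seq("-lco", s"$key=$value") }.toArray

  // Builds a CSV conversion command: driver, layer options, destination, source.
  def build(input: String, output: String): Array[String] =
    Array("-f", "CSV") ++
      layerOptions(Seq("GEOMETRY" -> "AS_WKT", "SEPARATOR" -> "SEMICOLON")) ++
      Array(output, input)

  def main(args: Array[String]): Unit = {
    // Placeholder paths; substitute your own input and output files.
    println(build("src/main/resources/sample.geojson", "temp/sample.csv").mkString(" "))
  }
}
```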
Hey, I'm Ryan Brideau. I work as a Senior Data Scientist at Wealthsimple. Previously, I was at Shopify. You can follow me on Twitter here: @Brideau