Batch setting the projection definition of a folder full of Shapefiles_Python Geospatial Analysis Cookbook-QQ阅读男生轻小说网

上QQ阅读APP看书，第一时间看更新

Batch setting the projection definition of a folder full of Shapefiles

Working with one Shapefile is fine but working with tens or hundreds of files is something else. In such a scenario, we'll need automation to get a job done fast.

We have a folder that contains several Shapefiles that are all in the same coordinate system but do not have a .prj file. We want to create a .prj file for each Shapefile in the current directory.

This script is a modified version of the previous code example that could write a .prj file for a single Shapefile into a batch process that can run over several Shapefiles.

How to do it...

We have a folder with many Shapefiles and we would like to create a new .prj file for each Shapefile in this folder, so let's get started:

Create a new Python file named ch02_05_batch_shp_prj.py in your /ch02/code/working/ directory and add the following code:

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import urllib
import os
from osgeo import osr


def create_epsg_wkt_esri(epsg):
    """
    Get the ESRI formatted .prj definition
    usage create_epsg_wkt(4326)

    We use the http://spatialreference.org/ref/epsg/4326/esriwkt/

    """
    spatial_ref = osr.SpatialReference()
    spatial_ref.ImportFromEPSG(epsg)

    # transform projection format to ESRI .prj style
    spatial_ref.MorphToESRI()

    # export to WKT
    wkt_epsg = spatial_ref.ExportToWkt()

    return wkt_epsg


# Optional method to get EPGS as wkt from a web service
def get_epsg_code(epsg):
    """
    Get the ESRI formatted .prj definition
    usage get_epsg_code(4326)

    We use the http://spatialreference.org/ref/epsg/4326/esriwkt/

    """
    web_url = "http://spatialreference.org/ref/epsg/{0}/esriwkt/".format(epsg)
    f = urllib.urlopen(web_url)
    return f.read()


# Here we write out a new .prj file with the same name
# as our Shapefile named "schools" in this example
def write_prj_file(folder_name, shp_filename, epsg):
    """
    input the name of a Shapefile without the .shp
    input the EPSG code number as an integer

    usage  write_prj_file(<ShapefileName>,<EPSG CODE>)

    """

    in_shp_name = "/{0}.prj".format(shp_filename)
    full_path_name = folder_name + in_shp_name

    with open(full_path_name, "w") as prj:
        epsg_code = create_epsg_wkt_esri(epsg)
        prj.write(epsg_code)
        print ("done writing projection definition : " + epsg_code)


def run_batch_define_prj(folder_location, epsg):
    """
    input path to the folder location containing
    all of your Shapefiles

    usage  run_batch_define_prj("../geodata/no_prj")

    """

    # variable to hold our list of shapefiles
    shapefile_list = []

    # loop through the directory and find shapefiles
    # for each found shapefile write it to a list
    # remove the .shp ending so we do not end up with 
    # file names such as .shp.prj
    for shp_file in os.listdir(folder_location):
        if shp_file.endswith('.shp'):
            filename_no_ext = os.path.splitext(shp_file)[0]
            shapefile_list.append(filename_no_ext)

    # loop through the list of shapefiles and write
    # the new .prj for each shapefile
    for shp in shapefile_list:
        write_prj_file(folder_location, shp, epsg)


# Windows users please use the full path
# Linux users can also use full path        
run_batch_define_prj("c:/02_DEV/01_projects/04_packt/ch02/geodata/no_prj/", 4326)

How it works...

Using the standard urllib Python module, we can access an EPSG code via the Web and write this definition to a .prj file. We need to create a list of Shapefiles that we want to define .prj for and then create a .prj file for each Shapefile in this list.

The get_epsg_code(epsg) function returns the ESPG code text definition that we need. The write_prj_file(shp_filename, epsg) function takes two parameters, the Shapefile name and the EPSG code, writing out the .prj file to disk.

Next, we'll create an empty list to store the list of Shapefiles, switch to the directory where the Shapefiles are stored, and then list all the Shapefiles that currently in this directory.

Our for loop then populates the Shapefile list with the filenames without the .shp extension. Finally, the last for loop takes us through each Shapefile and calls our function to write each .prj file for each Shapefile in the list.