Lesson_01_read_write.png

In this exercise, you will learn how to read and write following types of spatial data in Python.

We will use following data set and could be found here.

  • New York State county shape file (Polygon: NY_County_GCS)
  • Soil Carbon Data (SOC) from Colorado, Kansas, New Mexico, and Wyoming (Point: GP_SOC_GCS.shp)
  • Road network maps of Onondaga county, New York State (polyline: "Onondaga_Street_GCS.shp)
  • 90 m SRTM DEM of Onondaga county, New York State (raster: Onondaga_DEM.tif)
  • Landsat8 multibands Images

Set working directory

In [55]:
import os
path= "C:/Users/zahmed2/Documents/spatial_data_processing_and_analysis_python/Lesson_02_read_and_write_spatila_data_in_python/"
os.chdir(path)
print("Current Working Directory " , os.getcwd())
Current Working Directory  C:\Users\zahmed2\Documents\spatial_data_processing_and_analysis_python\Lesson_02_read_and_write_spatila_data_in_python

Vector Data

Vector data consists of "geometry" or "shape" of the locations which describe information of spatial objects on earth. There are three types vector data such as polygons, points and line or polylines which structures of the geometry consists of sets of coordinate pairs (x, y).

vector.png

The shapefile, a interchangeably data format which are regulated by ESRI. It is one of the most common form of geospatial vector data used in GIS software and analyses. Three unique files are required for a shapefile, including: .shx (shape index format; this tags the shapefile with a position, so users can move it forward and backward among layers, a .shp (shape format, which stores geometric data, and a .dbf (attribute format file; which holds attributes (information) for the shapes in the file).

Polygon data

Polygons are two-dimensional geographical features covering a portion of the earth's surface, for example, forests, lakes, administrative boundaries, farmers' fields, or any other organizational unit the user defines. Polygons are important because their area and perimeter can be measured. Following example will how read ESRI polygon shape with four python libraries. We will use New York Sate County shape file (NY_County_GCS.shp).

Using Geopandas

In [56]:
import geopandas as gpd
In [57]:
# Set filepath
fp_poly_gcs="Data_02/NY_County_GCS.shp"
In [58]:
# read polygoan files with Geopandas 
poly_gcs= gpd.GeoDataFrame.from_file(fp_poly_gcs)
In [59]:
# Check file type
type(poly_gcs)
Out[59]:
geopandas.geodataframe.GeoDataFrame
In [60]:
# See data frame
poly_gcs.head()
Out[60]:
OBJECTID ID AREA PERIMETER COUNTYP020 STATE COUNTY FIPS STATE_FIPS SQUARE_MIL Shape_Leng Shape_Area geometry
0 1 4770.0 0.000 0.140 1876.0 NY Saint Lawrence County 36089 36 2756.532 11684.971307 3.916405e+06 POLYGON ((-74.82161 45.00044, -74.82099 44.998...
1 2 4771.0 0.330 2.472 1877.0 NY Clinton County 36019 36 1117.664 229364.545100 2.901000e+09 POLYGON ((-73.34179 44.54804, -73.36170 44.546...
2 3 4774.0 0.808 3.890 1880.0 NY Saint Lawrence County 36089 36 2756.532 357122.438069 7.141807e+09 POLYGON ((-74.72688 44.99080, -74.71956 44.954...
3 4 4778.0 0.496 3.252 1884.0 NY Franklin County 36033 36 1686.726 313582.649176 4.377902e+09 POLYGON ((-73.91242 44.43063, -74.06508 44.412...
4 5 4783.0 0.000 0.114 1889.0 NY Saint Lawrence County 36089 36 2756.532 9927.913795 3.088921e+06 POLYGON ((-74.93221 44.96763, -74.93738 44.966...

Using PyShp

In [61]:
import shapefile as shp
shape = shp.Reader(fp_poly_gcs)

Using Fiona

In [62]:
import fiona
shape = fiona.open("Data_02/NY_County_GCS.shp")
print (shape.schema)
{'properties': OrderedDict([('OBJECTID', 'str:80'), ('ID', 'float:24.15'), ('AREA', 'float:24.15'), ('PERIMETER', 'float:24.15'), ('COUNTYP020', 'float:24.15'), ('STATE', 'str:80'), ('COUNTY', 'str:80'), ('FIPS', 'str:80'), ('STATE_FIPS', 'str:80'), ('SQUARE_MIL', 'float:24.15'), ('Shape_Leng', 'float:24.15'), ('Shape_Area', 'float:24.15')]), 'geometry': 'Polygon'}
In [63]:
#first feature of the shapefile
first = shape.next()
print (first) 
{'type': 'Feature', 'id': '0', 'properties': OrderedDict([('OBJECTID', '1'), ('ID', 4770.0), ('AREA', 0.0), ('PERIMETER', 0.14), ('COUNTYP020', 1876.0), ('STATE', 'NY'), ('COUNTY', 'Saint Lawrence County'), ('FIPS', '36089'), ('STATE_FIPS', '36'), ('SQUARE_MIL', 2756.532), ('Shape_Leng', 11684.9713067), ('Shape_Area', 3916405.07346)]), 'geometry': {'type': 'Polygon', 'coordinates': [[(-74.82161151067312, 45.00044007941391), (-74.82098589942154, 44.998155074755665), (-74.82487689179268, 44.99634690862989), (-74.8274479992576, 44.99727387945654), (-74.83063708691292, 45.0004858550029), (-74.83772479714933, 45.001435715255646), (-74.8519536222948, 44.99876542598173), (-74.86488544973763, 44.99562592831123), (-74.8655034317208, 44.998368696749864), (-74.8545094698073, 45.00105424411329), (-74.85190784569917, 45.00332398965943), (-74.84346973336922, 45.008305985566864), (-74.83440601008554, 45.011002977276995), (-74.82731067062086, 45.01096864557287), (-74.82279406810939, 45.01094575700259), (-74.80797014892681, 45.00904603871575), (-74.80476580246477, 45.00719972428631), (-74.80220232517875, 45.005361040033975), (-74.80673418601913, 45.00356050228269), (-74.81836138707044, 45.00270601033941), (-74.82161151067312, 45.00044007941391)]]}}
E:\PyGeo\lib\site-packages\ipykernel_launcher.py:2: FionaDeprecationWarning: Collection.__next__() is buggy and will be removed in Fiona 2.0. Switch to `next(iter(collection))`.
  

Osgeo- GDAL

In [64]:
from osgeo import ogr
file = ogr.Open(fp_poly_gcs)
shape = file.GetLayer(0)
print(shape)
<osgeo.ogr.Layer; proxy of <Swig Object of type 'OGRLayerShadow *' at 0x0000025B2592D030> >

Visiualization of Vector Map

In [65]:
#%matplotlib inline
import os
import numpy as np
import geopandas as gpd
import matplotlib.pyplot as plt
from shapely.geometry import box
from mpl_toolkits.axes_grid1 import make_axes_locatable
In [66]:
# Import NY Ctate County Shape file 
poly_gcs.plot()
Out[66]:
<matplotlib.axes._subplots.AxesSubplot at 0x25b2594ca58>
In [67]:
### Change color of map
from matplotlib.colors import ListedColormap
cmap = ListedColormap(['grey'])
In [68]:
poly_gcs.plot(cmap=cmap)
Out[68]:
<matplotlib.axes._subplots.AxesSubplot at 0x25b34e4d278>
In [69]:
## Plot catgorical data
fig, ax = plt.subplots(figsize=(8, 8))
poly_gcs.plot(column='COUNTY',
                categorical=True,
                legend=True,
                ax=ax)

# adjust legend location
leg = ax.get_legend()
leg.set_bbox_to_anchor((1.25,1.25))
ax.set_axis_off()
plt.show()
In [70]:
## Plot Continues
# set a variable that will call whatever column we want to visualise on the map
variable = 'AREA'
# create figure and axes for Matplotlib
fig, ax = plt.subplots(1, figsize=(10, 6))
# create map
poly_gcs.plot(column=variable, 
              cmap='Blues', 
              linewidth=0.8, 
              ax=ax, 
              edgecolor='0.8')
# remove the axis
ax.axis('off')
# add a title
ax.set_title('NYS County Area')
# create an annotation for the data source
ax.annotate('Source: Cornell GIS, 2014',
            xy=(0.1, .08),  
            xycoords='figure fraction', 
            horizontalalignment='left', 
            verticalalignment='top', 
            fontsize=12, 
            color='#555555')
# Create colorbar as a legend
sm = plt.cm.ScalarMappable(cmap='Blues')
# empty array for the data range
sm._A = []
# add the colorbar to the figure
cbar = fig.colorbar(sm)

Lines or Polylines

One-dimensional lines, also called polylines, are used to represent geographical features like rivers, roads, railroads, trails, and topographic lines. Note that these features are linear in nature and do not have area like polygons. Hence, they can measure distance. In this exercise we use road network maps of Onondaga county, New York State.

In [71]:
# Load road shape file
fp_road_gcs="Data_02/Ononda_Street_GCS.shp"
road_gcs= gpd.read_file(fp_road_gcs)
road_gcs.plot()
Out[71]:
<matplotlib.axes._subplots.AxesSubplot at 0x25b27acb668>
In [72]:
# See data frame
road_gcs.head()
Out[72]:
ID StreetName HighwayNum Types Label TypeI geometry
0 1 Highbridge 153 Road Road 2 LINESTRING (-75.90097 42.78962, -75.90055 42.7...
1 2 Highland Park None Road Road 2 LINESTRING (-75.92192 42.79006, -75.92168 42.7...
2 3 Highland Park None Road Road 2 LINESTRING (-75.92168 42.79012, -75.92069 42.7...
3 4 Highland Park None Road Road 2 LINESTRING (-75.91374 42.79439, -75.91382 42.7...
4 5 Highland Park None Road Road 2 LINESTRING (-75.91370 42.79410, -75.91374 42.7...
In [73]:
## Plot road
fig, ax = plt.subplots(figsize=(14, 6))
road_gcs.plot(column='Types',
                categorical=True,
                legend=True,
                ax=ax)

# adjust legend location
leg = ax.get_legend()
leg.set_bbox_to_anchor((1.5,1.05))
ax.set_axis_off()

plt.show()