Multispectral Image Anaylsis with a Docker Container: Examples

Installing Docker and Starting the Container

Assuming that you are on Ubuntu Linux, you can install the latest stable version of Docker by running

sudo apt-get install docker.io

Then pull and run the image analysis container with

sudo docker run -d -p 433:8888 -v yourImageFolder:/crc/imagery --name=crc mort/crcdocker

This will run the conainer as a daemon serving the iPython Notebook kernel and map yourImageFolder on your host machine to /crc/imagery/ in the container. You can stop it at any time with

sudo docker stop crc

and re-start it with

sudo docker start crc

Now point your favorite browser to

http://localhost:433

to see the iPython Notebook home page.

Example 1: Automatic Radiometric Normalization of Landsat 7 TM Images

In this example we have three \(1000\times 1000\) Landsat 7 TM images in ENVI standard format acquired over the town of Juelich, Germany in May, June and August, 2001. We can see them in the container folder /crc/imagery:

In [1]:
ls -l imagery
total 17582
-rwxrwxrwx 1 root root 6000000 Jul 28  2006 20010525*
-rwxrwxrwx 1 root root     870 Jul 28  2006 20010525.hdr*
-rwxrwxrwx 1 root root 6000000 Jul 28  2006 20010626*
-rwxrwxrwx 1 root root     856 Jul 28  2006 20010626.hdr*
-rwxrwxrwx 1 root root 6000000 Jul 28  2006 20010829*
-rwxrwxrwx 1 root root     867 Jul 28  2006 20010829.hdr*

With the script dispms.py we can display the June and May images side-by-side. The -f option defines the left-hand image, and the -g option the right-hand image. We choose spectral bands 4,5 and 6 in a linear 2 percent stretch (the default):

In [2]:
run dispms -f imagery/20010626 -g imagery/20010525 -p [4,5,6]

Alternatively, with a 0-255 byte stretch (option -e 1) we can compare the image intensities in digital numbers. They appear to be equally bright:

In [3]:
run dispms -f imagery/20010626 -g imagery/20010525 -p [4,5,6] -e 1

Now let's run the iterative MAD (iMAD or IR-MAD) transformation on this image pair:

In [4]:
run iMad imagery/20010626 imagery/20010525
------------IRMAD -------------
Sat Dec 13 12:20:23 2014
time1: imagery/20010626
time2: imagery/20010525
rho: [ 0.66348064  0.72875464  0.86325872  0.96825218  0.99254405  0.99534643]
result written to: imagery/MAD(20010626-20010525)
elapsed time: 70.6844520569

The canonical correlations have converged nicely after 20 iterations, so we will display the last three MAD variates RGB =(6,5,4) in a linear 2 percent stretch:

In [6]:
run dispms -f imagery/MAD(20010626-20010525) -p [6,5,4]

The changes (bright and dark pixels) are clearly associated primarily with agricultural fields and to a lesser extent mining activities in the open cast mines. The featureless (middle gray) no-change regions correspond to forest canopy and built-up areas. The script radcal.py will extract no-change (or invariant) pixels from the MAD image and use them to radiometrically normalize the two contributing images. More specifically, the May image 20010525 (target) will be normalized to the june image 20010626 (reference).

In [7]:
run radcal imagery/MAD(20010626-20010525)
Sat Dec 13 12:23:26 2014
reference: imagery/20010626
target   : imagery/20010525
no-change probability threshold: 0.95
no-change pixels: 1816
band: 1  slope: 0.977922  intercept: 0.094079  correlation: 0.985445
band: 2  slope: 1.039509  intercept: -4.652407  correlation: 0.966174
band: 3  slope: 1.032707  intercept: -3.215248  correlation: 0.995546
band: 4  slope: 0.901767  intercept: 6.539983  correlation: 0.991301
band: 5  slope: 0.953878  intercept: 1.767085  correlation: 0.997023
band: 6  slope: 0.953479  intercept: -0.457727  correlation: 0.996491

result written to: imagery/20010525_norm
elapsed time: 1.82197403908

Note that the slopes are close to unity and the intercepts are small. Solar illumination and atmospheric conditions were similar for both acquisitions.

Next we look at the June and August images:

In [8]:
run dispms -f imagery/20010626 -g imagery/20010829 -p [4,5,6]

The amount of vegetation (red color) has decreased drastically between June and August due to grain harvesting. Also, as can be seen below in the 0-255 linear stretch, the lower solar elevation in August leads to a decrease in brightness and contrast.

In [9]:
run dispms -f imagery/20010626 -g imagery/20010829 -p [4,5,6] -e 1

Running iMad.py on the two scenes again gives a good result:

In [11]:
run iMad imagery/20010626 imagery/20010829
------------IRMAD -------------
Sat Dec 13 12:25:46 2014
time1: imagery/20010626
time2: imagery/20010829
rho: [ 0.66817403  0.74675512  0.85250628  0.93893862  0.97844374  0.98936266]
result written to: imagery/MAD(20010626-20010829)
elapsed time: 92.5766220093

And, normalizing the August image to the June image, we get:

In [53]:
run radcal imagery/MAD(20010626-20010829) 
Sat Dec 13 14:56:20 2014
reference: imagery/20010626
target   : imagery/20010829
no-change probability threshold: 0.95
no-change pixels: 1768
band: 1  slope: 1.141594  intercept: -0.497890  correlation: 0.970123
band: 2  slope: 1.179262  intercept: -1.879516  correlation: 0.980589
band: 3  slope: 1.256239  intercept: -7.962851  correlation: 0.986799
band: 4  slope: 1.422099  intercept: -5.731570  correlation: 0.973013
band: 5  slope: 1.253927  intercept: -3.739825  correlation: 0.988897
band: 6  slope: 1.323260  intercept: -8.025598  correlation: 0.994375

result written to: imagery/20010829_norm
elapsed time: 2.02722001076

We notice that the slopes are larger than one, since the brighter June image is being regressed onto the darker August image. Comparing before and after normalization, we can see the increase in brightness and contrast:

In [5]:
run dispms -f imagery/20010829 -g imagery/20010829_norm -p [4,5,6] -e 1

But notice what happens when we try to normalize the August image to the May image:

In [45]:
run iMad imagery/20010525 imagery/20010829
------------IRMAD -------------
Sat Dec 13 14:34:33 2014
time1: imagery/20010525
time2: imagery/20010829
rho: [ 0.39617747  0.72204781  0.74070019  0.91132772  0.93716526  0.97762114]
result written to: imagery/MAD(20010525-20010829)
elapsed time: 75.5868890285

In [47]:
run radcal imagery/MAD(20010525-20010829)
Sat Dec 13 14:37:07 2014
reference: imagery/20010525
target   : imagery/20010829
no-change probability threshold: 0.95
no-change pixels: 2948
band: 1  slope: 2.292634  intercept: -78.186526  correlation: 0.440673
band: 2  slope: 2.484006  intercept: -70.771054  correlation: 0.385093
band: 3  slope: 6.235646  intercept: -259.514316  correlation: 0.149743
band: 4  slope: 1.166662  intercept: 9.102162  correlation: 0.525678
band: 5  slope: 3.844833  intercept: -171.569574  correlation: 0.263732
band: 6  slope: 19.521318  intercept: -768.064371  correlation: 0.065242

result written to: imagery/20010829_norm
elapsed time: 2.08631086349

There is evidently simply too much real reflectance change over this time period, so that IR-MAD does not converge reliably to a background of invariant pixels.

Example 2: Unsupervised clustering

In this example we will perform unsupervised clustering on the June image. First of all, we perform a principal components transformation.

In [6]:
run pca -h
Usage: python pca.py  [-d dims] fileName

            spatial dimension is a list, e.g., -d [0,0,400,400] 


In [8]:
run pca imagery/20010626

Whoops - pca.py doesn't say what the output file is called. I'll fix that on next commit. In fact it just appends the string "_pca" to the input filename:

In [9]:
run dispms -f imagery/20010626_pca -p [1,2,3]

Now apply Gaussian mixture clustering on the first three principal components. First get help:

In [15]:
run em -h

Usage: 
---------------------------------------------------------
python em.py  [-p "bandPositions"] [-d "spatialDimensions"] 
[-K number of clusters] [-M max scale][-m min scale] 
[-t initial annealing temperature] [-s spatial mixing factor] 
[-P generate class probabilities image] filename
--------------------------------------------------------
bandPositions and spatialDimensions are lists, 
e.g., -p [1,2,4] -d [0,0,400,400]  
--------------------------------------------------------
If the input file is named 

         path/filenbasename.ext then

The output classification file is named 

         path/filebasename_em.ext

and the class probabilities output file is named

         path/filebasename_emprobs.ext
--------------------------------------------------------

and run the algorithm with its default parameter values:

In [10]:
run em -p [1,2,3] imagery/20010626_pca
=========================
     EM clustering
=========================
infile:   imagery/20010626_pca
clusters: 6
T0:       0.500000
beta:     0.500000
running EM on 62500 pixel vectors
em iteration 0: dU: 0.678484 loglike: -123817.781390
em iteration 10: dU: 0.900134 loglike: -84603.057360
em iteration 20: dU: 0.779732 loglike: -46374.621167
em iteration 30: dU: 0.624334 loglike: -37868.104042
em iteration 40: dU: 0.230582 loglike: -34688.064385
em iteration 50: dU: 0.160172 loglike: -34148.756581
em iteration 60: dU: 0.043406 loglike: -32926.433218
em iteration 70: dU: 0.203682 loglike: -27751.538246
em iteration 80: dU: 0.107137 loglike: -25059.115975
em iteration 90: dU: 0.066479 loglike: -24785.479872
em iteration 100: dU: 0.079360 loglike: -24759.645571
em iteration 110: dU: 0.018453 loglike: -24784.227966
em iteration 120: dU: 0.003625 loglike: -24823.875117
em iteration 130: dU: 0.003082 loglike: -24857.222634
em iteration 140: dU: 0.002642 loglike: -24885.134227
em iteration 150: dU: 0.002266 loglike: -24908.272379
em iteration 160: dU: 0.001950 loglike: -24927.477785
em iteration 170: dU: 0.001675 loglike: -24943.485094
em iteration 180: dU: 0.001438 loglike: -24956.882784
em iteration 190: dU: 0.001243 loglike: -24968.136396
em iteration 200: dU: 0.001073 loglike: -24977.616411
running EM on 250000 pixel vectors
em iteration 0: dU: 0.999981 loglike: -236337.235720
em iteration 10: dU: 0.010273 loglike: -89570.999754
running EM on 1000000 pixel vectors
em iteration 0: dU: 1.000000 loglike: -735483.299162
em iteration 10: dU: 0.002320 loglike: -353339.294121
Cluster mean vectors
[[  43.56737061   -3.64784781    5.18364918]
 [  51.91894642    9.20866575   -7.93955998]
 [   0.84970713  -20.30496965    4.56218903]
 [   4.93108071    7.07576732   -4.221782  ]
 [ -36.61906215   -4.83245407    4.26954474]
 [-135.75986906   15.9166997    -1.76554799]]
Cluster covariance matrices
cluster: 0
[[ 254.77445749   13.10935485    8.39337521]
 [  13.10935485   60.41291526   -2.56431271]
 [   8.39337521   -2.56431271   11.87166616]]
cluster: 1
[[ 210.73715752  275.15380245   78.060003  ]
 [ 275.15380245  593.7344348   156.06384018]
 [  78.060003    156.06384018   56.06400239]]
cluster: 2
[[ 608.40183728  -77.50067294   14.49534933]
 [ -77.50067294  120.96241483   -8.75936146]
 [  14.49534933   -8.75936146   17.24725868]]
cluster: 3
[[ 487.67706593  -75.96503328  -14.53382818]
 [ -75.96503328  571.38062385  112.79953555]
 [ -14.53382818  112.79953555   73.30487392]]
cluster: 4
[[ 1607.51857689    16.36972684   123.61821528]
 [   16.36972684   159.98997297   -14.63311478]
 [  123.61821528   -14.63311478   250.56137196]]
cluster: 5
[[ 4698.57112745  1328.31800544  -168.34690269]
 [ 1328.31800544   471.19305556  -165.36042719]
 [ -168.34690269  -165.36042719   645.05731714]]
classified image written to: imagery/20010626_pca_em
elapsed time: 146.477921009
--done------------------------

The result is a single-band ENVI classification type image, here is the header:

In [18]:
cat imagery/20010626_pca_em.hdr
ENVI
description = {
imagery/20010626_pca_em}
samples = 1000
lines = 1000
bands = 1
header offset = 0
file type = ENVI Classification
data type = 1
interleave = bsq
band names = {
Band 1}
byte order = 0
class lookup = {0, 0, 0, 0, 0, 255, 255, 255, 0, 0, 255, 255, 255, 0, 255, 176, 48, 96, 46, 139, 87}
class names = ['class 0', 'class 1', 'class 2', 'class 3', 'class 4', 'class 5', 'class 6']
classes = 7
coordinate system string = PROJCS["WGS_1984_UTM_Zone_31N",GEOGCS["GCS_WGS_1984",DATUM["D_WGS_1984",SPHEROID["WGS_1984",6378137,298.257223563]],PRIMEM["Greenwich",0],UNIT["Degree",0.017453292519943295]],PROJECTION["Transverse_Mercator"],PARAMETER["latitude_of_origin",0],PARAMETER["central_meridian",3],PARAMETER["scale_factor",0.9996],PARAMETER["false_easting",500000],PARAMETER["false_northing",0],UNIT["Meter",1]]
map info = {
UTM, 1, 1, 723116.25, 5657264.25, 28.5, 28.5, 31, North,WGS-84}

We can display it in BW here as follows:

In [17]:
run dispms -f imagery/20010626_pca_em -p [1,1,1] -e 2