# murature.py

## basic usage

You need a recent version of Python. These scripts have been tested with Python 2.4 and 2.5.

You will also need the Python OGR bindings and the RPy module.

On Debian-like systems (including Ubuntu), this is just a matter of installing
the following packages: `python-gdal`, `python-rpy`.

To run the script, open a terminal in the directory that contains the files, and type:

    python example.py 3.1

`3.1` is the *magic number* that we pass to the script. If you don't specify a
magic number, the default value is `3.0`.
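The fallback to the default can be sketched as follows. This is only an illustration of the behaviour described above, not the code in `example.py`; the function name `parse_magic_number` and the constant `DEFAULT_MAGIC_NUMBER` are hypothetical.

```python
DEFAULT_MAGIC_NUMBER = 3.0  # default value stated in this README


def parse_magic_number(argv):
    """Return the magic number from a sys.argv-style list of arguments.

    argv[0] is the script name; argv[1], if present, is the magic number.
    """
    if len(argv) > 1:
        return float(argv[1])
    return DEFAULT_MAGIC_NUMBER
```

Calling it with `["example.py", "3.1"]` gives `3.1`, while `["example.py"]` falls back to `3.0`.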

The usual output is something like this:

    by hand
       Min. 1st Qu. 3rd Qu.  Median    Max.    Mean
       9.90   12.00   16.50   13.50   30.20   15.01
    Simple Height
       Min. 1st Qu. 3rd Qu.  Median    Max.    Mean
      9.093  12.780  20.230  15.170  48.190  16.860
    Smart Height
       Min. 1st Qu. 3rd Qu.  Median    Max.    Mean
      7.768  11.540  18.410  13.640  34.670  15.150

The image `plotR.png` is created in the current directory and shows a
graphical comparison between the various recording methods.

## library description

This Python library was written as an aid for the study of stone walls, mainly through the quantitative analysis of the spatial dimensions of stones.

The code is still in the early stages of development and lacks a clear
structure, but the functions are documented quite well with docstrings. At the
moment there is an `example.py` that shows how to use the library routines,
an example dataset (made up of several files), and 3 Python scripts:

- `geometrywall.py` has all the geometry (OGR) related functions
- `rplots.py` contains the `RPlots` class that can be used to output summaries and graphs
- `breaks.py` has the code for automatically classifying stones into rows

The numerical analysis is done with R using the `rpy` Python module. We are
trying to use just methods and functions from the standard R library.

- the `ogr` module is used to import geometry data and get all the needed parameters like centroid and boundary coordinates
- height is calculated with two different methods:
  - a simple method `max(y) - min(y)`
  - the smart method
- we also store a central measure (like median) to be used in the next steps as a parameter to find stone rows
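As a minimal sketch of the simple method and of the central measure, assuming each stone is available as a plain list of `(x, y)` boundary vertices (the library itself reads these from OGR geometries; the two stones below are made up for illustration):

```python
from statistics import median


def simple_height(boundary_points):
    """Return max(y) - min(y) for a list of (x, y) boundary points."""
    ys = [y for _, y in boundary_points]
    return max(ys) - min(ys)


# Two hypothetical stones, each given as a list of (x, y) vertices.
stones = [
    [(0.0, 0.0), (4.0, 0.5), (4.2, 3.0), (0.1, 2.5)],
    [(5.0, 0.2), (9.0, 0.0), (9.1, 2.0), (5.2, 2.2)],
]
heights = [simple_height(stone) for stone in stones]

# Central measure kept for the later row-finding step.
central_height = median(heights)
```

The `statistics` module requires a much newer Python than the versions this library was tested with; it is used here only to keep the sketch short.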

Each stone is assigned a rank number using a kernel density function
(`density` in the R `stats` standard library) with a narrow bandwidth. This
works because the vertical coordinates of the centroids aren't distributed
uniformly, but are concentrated around the mean value of each row. Thus, stones
that are on the same course will get the same rank number. This allows other
kinds of analysis based on the variance of single courses, and other methods
still to be explored.
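The idea can be illustrated in pure Python. The sketch below is not the code in `breaks.py`, which relies on R's `density`; it swaps in a hand-written Gaussian kernel estimate and assumes, as one plausible reading of the paragraph above, that rows are separated at the local minima (valleys) of the density. The function names, the grid resolution and the sample coordinates are all hypothetical.

```python
import math


def gaussian_kde(values, bandwidth, grid):
    """Evaluate a Gaussian kernel density estimate of `values` on `grid`."""
    norm = len(values) * bandwidth * math.sqrt(2.0 * math.pi)
    return [
        sum(math.exp(-0.5 * ((g - v) / bandwidth) ** 2) for v in values) / norm
        for g in grid
    ]


def rank_by_density(centroid_ys, bandwidth, steps=200):
    """Assign a row rank to each Y value, splitting rows at density valleys."""
    lo, hi = min(centroid_ys), max(centroid_ys)
    grid = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    dens = gaussian_kde(centroid_ys, bandwidth, grid)
    # A valley is a grid point where the density stops decreasing.
    valleys = [
        grid[i]
        for i in range(1, steps)
        if dens[i] < dens[i - 1] and dens[i] <= dens[i + 1]
    ]
    # The rank of a stone is the number of valleys below its centroid.
    return [sum(1 for v in valleys if v < y) for y in centroid_ys]


# Hypothetical centroid Y coordinates for three courses of stones.
ys = [10.1, 10.3, 9.9, 20.2, 19.8, 20.0, 30.1, 29.9]
ranks = rank_by_density(ys, bandwidth=1.0)
```

With these sample values the three courses are far apart compared to the bandwidth, so each course ends up with its own rank. A bandwidth that is too wide would merge neighbouring courses into one.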

To compare the hand-taken measures with the automatic ones properly, we need a complete and detailed case study: a well-drawn wall with an attribute table containing the hand-taken measures.

## the *smart* method

The *basic* algorithm works by finding the highest and the lowest Y coordinate
of the stone polygon, then a simple `max(y) - min(y)` subtraction gives us a
*rough* estimate of the true height of the stone. Tests comparing
hand-recorded measures with this method show that this estimate is too rough
for our needs. The hand-taken measures are our reference because an expert
human operator is able to record a *significant* value for the stone height,
and thus behaves quite differently from this simple algorithm.

--insert images and graphs here--

We can try to get a *smart* height by averaging the `n` highest and `n`
lowest Y coordinates. This way our `max(y) - min(y)` becomes something
slightly different: a difference between the *average upper limit* and the
*average lower limit*. We use a **magic number** to express the ratio of total
points to be used in this calculation.

If `magicNumber = 3.0` we are going to use `totalPoints / magicNumber`
points for the *upper limit* and `totalPoints / magicNumber` points for the
*lower limit*.

    magicNumber = 3.0
    self.stonePointsCount = stoneBoundary.GetPointCount()
    pointsRatio = int(( self.stonePointsCount / magicNumber )) + 1

The `stoneBoundary` object is the OGR boundary (of type `LINESTRING`) of
the stone. The OGR `GetPointCount()` method simply returns the number of
points in a `LINESTRING`.

The `pointsRatio` variable is used to calculate *for each stone* how many
points are needed to compute the *smart* height. This way we make sure that the
algo is consistent across all the stones. Using a fixed value doesn't make
sense here, because there will be stones with a few points and others with more
than 20 points. The numerical value `self.stonePointsCount / magicNumber` is
a floating point number, that we must convert to an integer in order to use it.

We should think about the different results of adding or not 1 to this variable.
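A small sketch helps to compare the two options. The helper `points_ratio` and its `add_one` flag are hypothetical names used only for this comparison; the arithmetic is the same as in the snippet above.

```python
def points_ratio(points_count, magic_number=3.0, add_one=True):
    """Number of points averaged at each end of the sorted Y list."""
    ratio = int(points_count / magic_number)
    return ratio + 1 if add_one else ratio


# Print the ratio with and without the extra point for a few stone sizes.
for count in (2, 3, 6, 20):
    print(count, points_ratio(count), points_ratio(count, add_one=False))
```

Without the `+ 1`, a stone with fewer points than the magic number gets a ratio of 0, and averaging over zero points would fail. With the `+ 1`, very small stones use overlapping groups: a stone with 3 points averages its 2 lowest and its 2 highest points, so the middle point is counted on both sides.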

    def smartAlgo(listey,ratio):
        '''`smart' algo with average coordinates.'''
        listey.sort()
        asd = 0
        for i in listey[0:ratio]:
            asd = asd + i
        yMin = asd/ratio
        asd = 0
        for i in listey[-ratio:]:
            asd = asd + i
        yMax = asd/ratio
        yAvg = yMax - yMin
        return yAvg
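To see how the result differs from the simple method, both can be run on a stone with a single protruding vertex. The function `smart_height` below is a non-mutating restatement of the same averaging, written for this example only, and the Y values are made up.

```python
def smart_height(ys, ratio):
    """Average of the `ratio` highest Y values minus the `ratio` lowest."""
    ordered = sorted(ys)  # sorted() leaves the caller's list untouched
    y_min = sum(ordered[:ratio]) / float(ratio)
    y_max = sum(ordered[-ratio:]) / float(ratio)
    return y_max - y_min


# Hypothetical stone: mostly about 3 units tall, with one spike at y = 5.0.
ys = [0.0, 0.2, 0.1, 3.0, 2.9, 5.0]
ratio = int(len(ys) / 3.0) + 1  # magic number 3.0, as in the text

simple = max(ys) - min(ys)
smart = smart_height(ys, ratio)
print(simple, smart)
```

The simple height is driven entirely by the spike, while the smart height is pulled back towards the bulk of the upper edge.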

## the *smart 2* method

This second smart algorithm works in a slightly different way from the first one.

Instead of using a predetermined number of points for averaging the upper and lower limits, we use a range of Y coordinates based on the extreme values.

So, if `p_max` is the point with the highest Y coordinate, and `p_min` the
one with the lowest, we first obtain the *simple* height with:

    simple_height = y(p_max) - y(p_min)

Then, the points that will be used for the average *upper* limit are those
whose Y coordinate is such that:

    (y(p_max) - ( simple_height / n )) < y(p) <= y(p_max)

where `n` expresses the range that should be used, proportional to the stone
height. Note that `y(p)` is *less than or equal to* `y(p_max)`, otherwise
`p_max` itself would be excluded from the procedure, exposing the code to
`ZeroDivisionError`s and other bugs. The same applies to the lower limit.
Once the points to use have been selected, the two averages are calculated and
their difference is the resulting `smart_2_height` of the stone.

An example should make this clearer. If `y(p_max)` for our stone is 410.34
and `y(p_min) = 395.16`, it's easy to obtain:

    simple_height = y(p_max) - y(p_min) = 410.34 - 395.16 = 15.18

Then, to find the average upper limit, we iterate through all the points in the current stone, and select only those such that:

    (y(p_max) - ( simple_height / n )) < y(p) <= y(p_max)
    (410.34 - (15.18 / n)) < y(p) <= 410.34

Which value should we give to `n`? Some experiments showed that values around
7 give good results, so let's use this value for now:

    (410.34 - (15.18 / 7)) < y(p) <= 410.34
    408.17 < y(p) <= 410.34

So, only those points that are in that range will be included in the average upper limit.
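The whole procedure, including the lower limit, can be sketched as follows. This is an illustration under stated assumptions rather than the library's own code: the function name `smart_2_height` is hypothetical, the stone is given as a plain list of Y coordinates, and only the two extreme values (410.34 and 395.16) come from the example above, while the other four are made up.

```python
def smart_2_height(ys, n=7.0):
    """Difference between the averaged upper and lower bands of Y values."""
    y_max, y_min = max(ys), min(ys)
    if y_max == y_min:
        return 0.0  # degenerate stone: both bands would be empty
    band = (y_max - y_min) / n  # simple_height / n
    # <= and >= keep the extreme points, so neither band can be empty.
    upper = [y for y in ys if y_max - band < y <= y_max]
    lower = [y for y in ys if y_min <= y < y_min + band]
    return sum(upper) / len(upper) - sum(lower) / len(lower)


# Extremes taken from the worked example; the inner values are hypothetical.
ys = [410.34, 409.50, 408.00, 402.00, 396.00, 395.16]
height = smart_2_height(ys, n=7.0)
print(round(height, 2))
```

Here the point at 408.00 falls just below the 408.17 threshold and is left out of the upper average, which therefore uses only 410.34 and 409.50.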

### Performance

So far, both algorithms work quite well when compared to hand-made measurements, while the differences between the two are barely significant.

Given this correspondence, we should choose the faster and simpler one. Another important issue is the choice of parameters. At the moment parameters must be specified manually by the user (or fixed in the source code), and there are no plans to change this.

## buffer analysis

This analysis is optionally based on the height values calculated with one of the methods above.

### introduction

In 1993, Parenti and Doglioni suggested the use, among other qualitative parameters, of a quantitative parameter that would be useful to describe a stone wall and possibly to compare two walls.

This quantitative parameter is calculated on a random area of the wall, as the ratio between the area occupied by stones and the "empty" areas around the stones. This value, if compared to other numeric parameters (most notably the number of stones that fall into the same area), can be useful when creating a typology.

Our analysis is based on this method, pushing the same concept beyond the barrier of the wall taken as a whole.

### the analysis

For each stone polygon, we start by creating a *buffer* area. This is the area
that contains all the points within a certain distance from the polygon
boundary. How the distance is chosen can slightly change the results, depending
on whether a fixed value is used or a value proportional to some other property
of each polygon (area, perimeter, height, width, etc.).

After the buffer area has been created, the procedure is as follows:

- subtract the stone area from the buffer area (which includes it) to get only the actual buffer.
- find the intersection of the buffer area with the other surrounding stones (this is obtained by creating a multipolygon that includes all stones) and retrieve the total area of this intersection
- the `intersection_area / buffer_area` ratio is the value we will use as an indicator of (possibly hidden) groups of stones.
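The steps above can be sketched with a deliberately simplified geometry. The library works on real polygons through OGR; the sketch below instead assumes that every stone is an axis-aligned rectangle `(xmin, ymin, xmax, ymax)`, that stones never overlap each other, and that the buffer has square corners instead of rounded ones. All function names and the sample wall are hypothetical.

```python
def rect_area(rect):
    """Area of an axis-aligned rectangle (xmin, ymin, xmax, ymax)."""
    xmin, ymin, xmax, ymax = rect
    return max(0.0, xmax - xmin) * max(0.0, ymax - ymin)


def rect_intersection_area(a, b):
    """Area shared by two axis-aligned rectangles (0.0 if they are disjoint)."""
    return rect_area((max(a[0], b[0]), max(a[1], b[1]),
                      min(a[2], b[2]), min(a[3], b[3])))


def buffer_ratio(index, stones, distance):
    """Return intersection_area / buffer_area for stones[index]."""
    if distance <= 0:
        raise ValueError("distance must be positive")
    xmin, ymin, xmax, ymax = stones[index]
    grown = (xmin - distance, ymin - distance, xmax + distance, ymax + distance)
    # The actual buffer is the grown rectangle minus the stone itself.
    buffer_area = rect_area(grown) - rect_area(stones[index])
    # Stones are assumed not to overlap, so summing pairwise intersections
    # equals intersecting with one multipolygon of all the other stones.
    intersection_area = sum(
        rect_intersection_area(grown, other)
        for i, other in enumerate(stones) if i != index
    )
    return intersection_area / buffer_area


# Hypothetical course of three 2 x 1 stones separated by 0.2 wide joints.
stones = [(0.0, 0.0, 2.0, 1.0), (2.2, 0.0, 4.2, 1.0), (4.4, 0.0, 6.4, 1.0)]
middle = buffer_ratio(1, stones, distance=0.5)
edge = buffer_ratio(0, stones, distance=0.5)
print(middle, edge)
```

In this toy wall the stone at the end of the course gets a lower ratio than the one in the middle, because one side of its buffer has no neighbour.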

### results

So far, this method has failed to give the expected results. Lower values are obtained for stones that are near the wall limits, or have otherwise no stones on one or more sides. For normal stones, there are no significant variations in the obtained value.

Though not related to the original idea, this method could be used to find which stones are suitable for further analyses that are based on the wall fabric.