VECTOR OR RASTER?

Henri J.G.L. Aalders
Faculty Civil Engineering and Geosciences
Subfaculty of Geodetic Engineering
Delft University of Technology
Faculty of Applied Sciences
Katholieke Universiteit Leuven
E-mail: h.aalders@geo.tudelft.nl

SUMMARY

When creating databases for GIS applications, existing maps are often scanned and vectorised for future use. However, vectorisation becomes obsolete when GIS objects can be referred to in both theme and geometry in a raster environment.

This article shows how raster-based objects can be used for GIS applications in both the graphical and the image structure, and describes the structural background for GIS raster database development.

Introduction

In many applications, such as road management, urban management, and utility, cadastral and municipal applications, maps are widely used. With the introduction of information technology, Geographic/Land Information Systems (GIS/LIS) are being developed to store and use the data in digital form. In these systems a well-known cartographic technique is used: strings of lines are defined as a concatenation of straight connections between points, usually called vectors, to represent line features as well as the circumference of areas. The human mind is easily capable of interpreting these chains (strings of vectors) as a polygon or line type feature.

Alternative methods of representing features using gridcell techniques have been widely used in GIS applications but rarely in LIS. The gridcell approach divides an image of the area of concern into small picture elements (pixels). Pixels tend to be equal in size, regular in shape and adjacent to each other, leaving no gaps in between. The pixel size determines the resolution of the image geometry. The required processing is done on so-called image processing systems.

GIS/LIS are special cases of general information systems because they contain spatial attributes for each entity, described by the position of points in line strings or by the spatial position of pixels. By its nature, the organisation of data in a GIS/LIS with a vector structure differs significantly from the gridcell data organisation (fig.1). A raster structure can be seen as a special case of the gridcell organisation [2].

DEVELOPMENTS

The application of information technology to GIS/LIS has been influenced by several developments that enable the application of:

 

Fig.1.

  1. vector data organisation: each line string is represented by the connection of its characteristic points carrying co-ordinates;
  2. gridcell data organisation: each pixel has a row and a column number for positional definition as well as a non-metric value, e.g.:
  3. raster organisation: adjacent pixels in the same row having equal values are stored as one object with the length (being the number of equal-valued pixels in the row) as a metric definition.

Fig.2. Raster/vector and vector/raster conversions in an integrated hybrid GIS/LIS system.

  2. Fundamental to the development of GIS/LIS, and recognised long before the existence of object-orientation, are the real world features. These are defined for the conceptual and logical model of the terrain through abstraction and description by characteristics, resulting in model objects and attributes. Most contemporary GIS/LIS use this object-based modelling. This approach also appears to be the basis for object-oriented data modelling.

In conclusion, a GIS/LIS should at least be based on the real world features in order to extract and analyse processes and phenomena of the real world, no matter if vector- or raster technology is used. The application of a raster or a vector technology should enable the object based approach.

OBJECT-BASED GRID TECHNOLOGY

Originally the basic object in grid structured data bases is the pixel with spatial (row- and column number) and non-spatial attributes stored for each pixel separately. A grid tessellation bears no relationship with the geographical unit boundaries in the terrain. So it is necessary to label each pixel with all the (non-spatial) attributes of the geographical unit. There are two methods to determine to which geographical unit a pixel belongs:

  1. $V = \{\, p_{i,j} \mid M_p \in V \,\}$
  2. $V = \{\, p_{i,j} \mid O(p_{i,j}) \cap V > \tfrac{1}{2}\, O(p_{i,j}) \,\}$

in which: $p_{i,j}$ = the pixel of concern;

$M_p$ = a predefined point of pixel $p_{i,j}$;

$O(p_{i,j})$ = the area of pixel $p_{i,j}$;

$V$ = the least common geographical unit under consideration;

in other words:

a. the pixel $p_{i,j}$ is part of the geographical unit $V$ if a predefined point $M_p$ of the pixel is positioned inside the geographical unit;

b. the pixel $p_{i,j}$ is part of the geographical unit $V$ if more than half of the area of the pixel also falls inside the geographical unit.
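The two assignment rules above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in the prototype: it assumes square pixels of unit size, takes the pixel centre as the predefined point, and estimates the covered area by counting sub-sample points. The function names and the `samples` parameter are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def point_in_polygon(pt: Point, ring: List[Point]) -> bool:
    """Even-odd ray casting test; ring lists the vertices of the unit V."""
    x, y = pt
    inside = False
    n = len(ring)
    for k in range(n):
        x1, y1 = ring[k]
        x2, y2 = ring[(k + 1) % n]
        # Does this edge straddle the horizontal line through pt?
        if (y1 > y) != (y2 > y):
            # x-coordinate at which the edge crosses that line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def assign_by_point(i: int, j: int, ring: List[Point], size: float = 1.0) -> bool:
    """Rule a: the pixel belongs to V if its predefined point lies in V.
    Assumption: the predefined point is the pixel centre."""
    centre = ((j + 0.5) * size, (i + 0.5) * size)
    return point_in_polygon(centre, ring)


def assign_by_area(i: int, j: int, ring: List[Point],
                   size: float = 1.0, samples: int = 10) -> bool:
    """Rule b: the pixel belongs to V if more than half of its area lies in V.
    Assumption: the area is approximated by samples x samples sub-points."""
    hits = 0
    for a in range(samples):
        for b in range(samples):
            p = ((j + (b + 0.5) / samples) * size,
                 (i + (a + 0.5) / samples) * size)
            hits += point_in_polygon(p, ring)
    return hits > (samples * samples) / 2
```

The two rules can disagree: a small unit lying around the centre of a pixel satisfies rule a but not rule b.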

Fig. 3 Logical and mathematical representation of an object and its attributes in a vector- and grid structured model.

A two-dimensional representation of the terrain contains only area objects, since all terrain elements occupy physical space. In cartography these elements are generalised, resulting in point, line or area objects. In a vector method, area objects are represented by their bounding ring, i.e. by the co-ordinates of the successive points of the line (ring) around the area.

In fact a pixel is also a two-dimensional object, so it is possible to describe each occurrence of a geographical object as a set of pixels. In this way the basic object in a grid structured model is no longer the individual pixel, each carrying all attributes; instead, a set of pixels represents a geographical object. This results in a considerable reduction of the storage space of a grid structured model (fig.3).

The behaviour of topology in vector and grid structured models requires special attention.

In a vector model there exists a (one-dimensional) topology between points in a chain indicating the sequence of points defining the chain and also a (two-dimensional) topology between areas, linked to the chain and referencing the area objects on either side of the chain. In the vector model both topological properties are physically stored in the database.

 

Fig. 4 Implicit (e.g. in Remote Sensing images) and explicit boundaries (e.g. in a scanned map).

Because of the regular tessellation in a gridcell structured model, requiring adjacency of pixels with no overlaps or gaps, an implicit topological relation exists between adjacent pixels. Therefore a grid structured model does not explicitly contain any boundaries: a boundary exists between two adjacent pixels if they have different non-spatial attributes (fig.4).
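As an illustration of how boundaries can be derived rather than stored, the sketch below scans a small grid and reports every pair of adjacent pixels whose values differ. It assumes that adjacency means sharing an edge (4-neighbourhood) and that each pixel holds a single attribute value; the function name is hypothetical.

```python
def implicit_boundaries(grid):
    """Return pairs of edge-adjacent pixels (row, col) with different values."""
    edges = []
    rows, cols = len(grid), len(grid[0])
    for i in range(rows):
        for j in range(cols):
            # Compare with the right-hand neighbour
            if j + 1 < cols and grid[i][j] != grid[i][j + 1]:
                edges.append(((i, j), (i, j + 1)))
            # Compare with the neighbour below
            if i + 1 < rows and grid[i][j] != grid[i + 1][j]:
                edges.append(((i, j), (i + 1, j)))
    return edges
```

No boundary is ever written to the grid itself; the list is recomputed from the pixel values whenever it is needed.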

 

Model type    Grid                       Vector
Dimension     2-D                        1-D
Explicit      polygon (plus interior)    boundary topology
Implicit      boundary topology          ring (no interior)

Fig.5. Theoretical differences between grid and vector structured models.

The theoretical differences between grid and vector structured models are based upon the dimension of their basic objects, and on the implicit or explicit registration of objects (fig.5).

When a map is scanned, a bitmap is created in which each pixel representing a line is black (value 1) and each pixel representing an area is white (value 0).

Because of the implicit topological relation between adjacent pixels in a bitmap, all pixels belonging to the same area object can be found as a set. This is done by searching all neighbouring pixels with the same value (0) as the indicated seed pixel, then their neighbouring pixels, and so on (the thickening technique). Similarly, line objects can be represented as a set of contiguous pixels, because line objects are physically present in a bitmap and can be detected as a set of pixels having the value 1 (see fig.8).

In photographs objects are represented by sets of pixels having multiple grey tone values. Representing such an object in a database therefore requires a variable-length list of pixel numbers linking the object to the set of pixels belonging to it, which is not allowed in relational databases. A solution is to use linked lists (see fig.6). Each object is linked through a pointer to one of its pixels, which is linked to another pixel and so on, until the last pixel, which refers back to the object. In this way raster elements are chained to their object definition by means of pointers, with references from the object to its spatial definition and the reverse (notice that in this way the relational database solution can be applied, and also that no adjacency is required between pixels!). The same techniques can also be applied to run-length encoding or to other structures such as a quadtree hierarchy.
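The search from a seed pixel described above can be sketched as a breadth-first region growing procedure. This is a minimal sketch under stated assumptions: neighbours are the four edge-adjacent pixels, the bitmap is a list of rows, and the function name is hypothetical.

```python
from collections import deque


def grow_from_seed(bitmap, seed):
    """Collect all pixels connected to the seed that share the seed's value."""
    rows, cols = len(bitmap), len(bitmap[0])
    target = bitmap[seed[0]][seed[1]]
    found = {seed}
    queue = deque([seed])
    while queue:
        i, j = queue.popleft()
        # Visit the four edge-adjacent neighbours
        for ni, nj in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)):
            if (0 <= ni < rows and 0 <= nj < cols
                    and (ni, nj) not in found
                    and bitmap[ni][nj] == target):
                found.add((ni, nj))
                queue.append((ni, nj))
    return found
```

Starting from a white seed (value 0) returns the pixels of one area; starting from a black pixel (value 1) returns the connected line pixels. Edge adjacency is chosen here so that an area does not leak diagonally through a one-pixel-wide line.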
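The pointer chain can be illustrated with fixed-width records, each holding exactly one reference. This is a sketch only: the record layout, the tagged references and the function names are assumptions made for illustration, and the pixel list is assumed to be non-empty.

```python
def build_chain(object_id, pixels):
    """Build one object record and one record per pixel.
    Each pixel record points to the next pixel; the last points back to the object."""
    pixel_records = {}
    for k, pixel in enumerate(pixels):
        if k == len(pixels) - 1:
            pixel_records[pixel] = ("object", object_id)
        else:
            pixel_records[pixel] = ("pixel", pixels[k + 1])
    object_record = {"id": object_id, "first": ("pixel", pixels[0])}
    return object_record, pixel_records


def pixels_of(object_record, pixel_records):
    """Follow the chain from the object to list its pixels."""
    result = []
    kind, ref = object_record["first"]
    while kind == "pixel":
        result.append(ref)
        kind, ref = pixel_records[ref]
    return result


def object_of(pixel, pixel_records):
    """Follow the chain from any pixel until it refers back to the object."""
    kind, ref = pixel_records[pixel]
    while kind == "pixel":
        kind, ref = pixel_records[ref]
    return ref
```

Because every record carries a single reference, the structure fits fixed-width tables, and the pixels in a chain need not be adjacent.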

 

 

 

Fig. 7. Structure for linkage by labelled pixels or seed pixels.

 

PROTOTYPE EXPERIENCES

An experimental system has been set up to test the feasibility of this theoretical model [2]. In order to reduce data quantities, run-length encoding is applied to the bitmap data: adjacent pixels with equal values in a row are combined into one object. From the theoretical point of view the approach is the same, but in practice the size of the database is reduced to about 10% of the original bitmap storage space.
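Run-length encoding of a single bitmap row can be sketched as follows. The tuple layout (start column, length, value) and the function names are assumptions for illustration; the prototype's actual record format is not described here.

```python
def run_length_encode(row):
    """Encode a row of pixel values as (start_col, length, value) runs."""
    runs = []
    start = 0
    for j in range(1, len(row) + 1):
        # Close the current run at the end of the row or when the value changes
        if j == len(row) or row[j] != row[start]:
            runs.append((start, j - start, row[start]))
            start = j
    return runs


def run_length_decode(runs):
    """Rebuild the original row from its runs."""
    row = []
    for _start, length, value in runs:
        row.extend([value] * length)
    return row
```

A row of ten pixels such as `[0, 0, 0, 1, 1, 0, 0, 0, 0, 0]` is stored as three runs, and decoding the runs restores the row unchanged.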

Models

In order to define the geometric attributes of LIS objects as they appear on a map two logical models have been developed in which spatial and non-spatial attributes of an object are related by (see fig 7):

  1. seed-pixel-link;
  2. run-length-link.

In the seed pixel linkage model the relationship between spatial and non-spatial attributes is realised by a link from the set of non-spatial attributes to a seed pixel and vice versa. (Instead of a seed pixel a seed run-length can also be used to reduce the database size.) The set of pixels describing the spatial attributes of an occurrence of an object can be found from the implicit topological relationships of the set, using a growing algorithm (see fig 8). For all objects (area, line and point objects alike) a seed pixel has to be determined, either automatically (by thinning techniques that find the area's centre) or by manual digitising. For objects representing a line the node pixels should also be found, since the thickening technique for finding chains should stop at a node. The exact position of the node is not of utmost importance, but it should not be too far off. This means that automatic node searching techniques, as used in raster/vector conversion processes, can be applied.
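For line objects, the growth from a seed pixel has to stop at node pixels. The sketch below shows one way this could work, assuming the pixel values listed in the legend of fig.8 (1 for a line pixel, 4 for a node pixel) and 8-neighbour connectivity; both the connectivity and the function name are assumptions, not details taken from the prototype.

```python
from collections import deque


def grow_chain(grid, seed, line_value=1, node_value=4):
    """Grow a chain from a line seed pixel, stopping at node pixels."""
    rows, cols = len(grid), len(grid[0])
    chain = {seed}
    queue = deque([seed])
    while queue:
        i, j = queue.popleft()
        for di in (-1, 0, 1):
            for dj in (-1, 0, 1):
                ni, nj = i + di, j + dj
                if (di, dj) == (0, 0) or not (0 <= ni < rows and 0 <= nj < cols):
                    continue
                if (ni, nj) in chain:
                    continue
                value = grid[ni][nj]
                if value == node_value:
                    # The node closes the chain: include it, do not grow past it
                    chain.add((ni, nj))
                elif value == line_value:
                    chain.add((ni, nj))
                    queue.append((ni, nj))
    return chain
```

Line pixels beyond a node belong to a neighbouring chain and are not collected.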

In the run-length linkage model for every run-length a direct link is explicitly present in the database between the run-length and the set of non-spatial attributes.
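In contrast to the seed pixel model, the run-length linkage needs no growing step, because each run carries its own reference. A minimal sketch, assuming a run table with the hypothetical columns (row, start column, length, object identifier):

```python
def geometry_of(object_id, runs):
    """Return the runs (row, start_col, length) linked to one object."""
    return [(row, start, length)
            for row, start, length, oid in runs
            if oid == object_id]


def object_at(row, col, runs):
    """Return the object identifier linked to the run that contains a pixel."""
    for r, start, length, oid in runs:
        if r == row and start <= col < start + length:
            return oid
    return None
```

The design trade-off is storage against search time: every run stores a link, but the geometry of an object is obtained by a simple selection on the object identifier.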

Hardware

Although it is known that the system also runs on other platforms, the necessary programmes were developed on an Intergraph Microstation system running on a standard Interpro 3070 with 16 Mb memory running at 16 MHz. The raster display is driven by the IRAS-B software and the database management system is Informix, using SQL commands.

Results

The scanning of the Dutch cadastral map (Lunteren A1, size 1 m by 0.7 m) took about 5 minutes on an Optronics scanner (resolution 200 pixels per cm, resulting in 5 cm real-world pixels). The process is very sensitive to gaps in lines created by the cartographer's use of different line styles. About four hours were needed to close the line gaps, although automated techniques (line thickening and subsequent line thinning) can also be applied [2]. This editing time also includes the manual indication of the seed and node pixels.

In this prototype, searching a set of pixels for the geometric definition of an occurrence of an object took on average 3 seconds using the seed pixel linkage, with a maximum of 15 seconds (for long lines running perpendicular to the scanline direction).

 

Fig. 8. A bitmap as a result from a scanned map; applying the growth algorithm; the use of seed pixels. Pixel values: 25 = area seed pixel, 37-39 = line seed pixel, 4 = node pixel, 1 = line pixel.

 

Speed was not critical in testing the prototype, but it turned out that an operational system based on these ideas is feasible. The prototype shows how a hybrid LIS with data structures in both vector and raster form can be updated from scanned images converted into either form as necessary. It should be realised that the raster based structure does not use a CAD system but an image processing system. The practical implementation of this system requires extra dedicated functions for line operations such as MOVE, MODIFY, DELETE, INTERSECT, PARALLEL and PERPENDICULAR.

The great advantage of this system is the bulk capture of data by fully automatic means in a very short period using existing maps. Even when at a later stage the user decides to change to a vector based data structure, the non-spatial part of the database can be used as it is stored in the raster based structure, since both systems use the same information and structure if both the raster and the vector database are object-based (fig.3).

REFERENCES

  1. Ingen Housz, F.C., Het gebruik van rastergegevens bij een Vastgoed Informatie Systeem. Afstudeerscriptie Faculteit der Geodesie, TU Delft, January 1990.
  2. Peuquet, D.J., Raster Processing: An Alternative Approach to Automated Cartographic Data Handling. American Cartographer, Vol. 6, no. 2, 1979, p. 129-139.
  3. Maguire, David J., Michael F. Worboys and Hilary M. Hearnshaw, An introduction to Object-Oriented GIS. Mapping Awareness, Vol. 4, no. 2, March 1990.