This page last changed on Mar 25, 2008 by jdeolive.

Exploring Possibilities of GeoServer with Amazon Web Services.

Overview

Amazon Web Services (AWS) is a set of web services that help applications scale to large numbers of users and large volumes of data. Because large volumes of data are common in the geospatial world, using these web services could lead to some very interesting possibilities.

This project involves exploring some of these ideas and experimenting with ways of hooking GeoServer up to AWS. It is experimental in nature: students will not be required to implement a fully working system, but rather to perform research and present valuable findings.

Technical Details

The following are some ideas for integrating GeoServer and AWS:

GeoServer AMI and Clustering

EC2 is a service that allows one to fire up new machine instances on demand. As a website becomes more popular and the demand on it increases, this web service makes it easy to scale by adding new machine instances to a cluster.

This idea involves creating an Amazon Machine Image (AMI) consisting of GeoServer, OpenLayers, and PostGIS. The student will research ways to cluster instances of the GeoServer AMI so that they all share the same configuration. One potential approach is to use another AWS service, S3, to hold the shared configuration.

SimpleDB GeoTools DataStore

Clustering database instances is hard, and the systems that support clustering are complex and hard to maintain. SimpleDB is a web service which provides database functionality while hiding all the details of clustering and scalability.

This idea involves creating a GeoTools datastore backed directly by SimpleDB. This would provide an extremely scalable back end for GeoServer users without a high operational cost. The challenge will be to figure out a way to make SimpleDB "spatially aware": how to store spatial information, and how to facilitate efficient spatial queries via indexing.

Students will utilize the Spatial Database in a Box project, which provides guidance on how to spatially enable a non-spatial database.
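One way to picture the "spatially aware" challenge is a geohash-style key: longitude and latitude bits are interleaved into a short string, so that points close together usually share a prefix. Since SimpleDB stores attribute values as strings, a bounding-area lookup could then be approximated by a prefix comparison on that attribute. The sketch below is only an illustration under that assumption. It is not the approach taken by Spatial Database in a Box, and the `items` dictionary and `find_by_prefix` helper are hypothetical stand-ins for a SimpleDB domain and a store-side query.

```python
# Sketch only: geohash-style keys for a string-only key/value store.
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash(lon, lat, precision=8):
    """Interleave longitude/latitude bits into a base-32 string."""
    lon_range = [-180.0, 180.0]
    lat_range = [-90.0, 90.0]
    chars = []
    bits, bit_count = 0, 0
    use_lon = True  # bits alternate, starting with longitude
    while len(chars) < precision:
        rng, value = (lon_range, lon) if use_lon else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if value >= mid:
            bits = (bits << 1) | 1
            rng[0] = mid  # keep the upper half of the interval
        else:
            bits = bits << 1
            rng[1] = mid  # keep the lower half of the interval
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # five bits make one base-32 character
            chars.append(BASE32[bits])
            bits, bit_count = 0, 0
    return "".join(chars)


def find_by_prefix(items, prefix):
    """Hypothetical stand-in for a store-side prefix query on the hash attribute."""
    return sorted(name for name, key in items.items() if key.startswith(prefix))


# Hypothetical in-memory "domain": item name -> geohash attribute value.
items = {
    "point_a": geohash(10.40744, 57.64911),
    "point_b": geohash(10.53, 57.44),
    "point_c": geohash(151.21, -33.87),
}
nearby = find_by_prefix(items, "u4p")  # items whose key falls in cell "u4p"
```

A prefix match only finds points inside one cell, so a real query would also need to check neighbouring cells and then filter the candidates against the exact geometry. Lines and polygons would need a different treatment, for example indexing the cells that their bounding box covers.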

Tile Processing Service

A relatively recent innovation in the web mapping world is the idea of "tiling": splitting web maps up into small chunks (called "tiles"), each of which is easier to manage and, more importantly, can be cached.
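To make the tiling idea concrete, the sketch below maps a longitude/latitude to a tile index at a given zoom level and builds a cache key for that tile. It assumes the widely used Web Mercator "XYZ" scheme, in which zoom level z has 2^z by 2^z tiles. This is one common convention and not necessarily the grid a given GeoServer or GeoWebCache setup would use. The `cache_key` layout and the layer name in the usage lines are hypothetical.

```python
import math


def tile_for(lon, lat, zoom):
    """Return the (x, y) tile containing a point, assuming the Web Mercator XYZ scheme."""
    n = 2 ** zoom  # tiles per axis at this zoom level
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so points on the far edge, or beyond the Mercator
    # latitude limit, still map to a valid tile.
    return (min(max(x, 0), n - 1), min(max(y, 0), n - 1))


def cache_key(layer, zoom, x, y):
    """Hypothetical key layout for storing one rendered tile."""
    return f"{layer}/{zoom}/{x}/{y}.png"


x, y = tile_for(10.40744, 57.64911, 6)
key = cache_key("roads", 6, x, y)
```

Because every request for the same area and zoom level resolves to the same key, a rendered tile can be stored once and served many times, which is what makes tiles cacheable.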

An important part of building a tile cache is "seeding" it, i.e. pre-rendering the images and building the tile structure beforehand. A seeded cache is important for achieving acceptable performance. One of the problems with seeding is that it is expensive in terms of both time and CPU.

This project involves exploring ways to utilize web services such as EC2 and S3 for building and storing tiles for large amounts of data. This idea builds on the others, as it depends on setting up a cluster of GeoServer / GeoWebCache instances and figuring out a way to easily start the seeding job. Ideally, a user would supply some data, specify the number of instances they want to run, and hit go. The result would be stored in S3, or perhaps made available for download.
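One possible way to spread a seeding job over several instances is to enumerate every tile in the pyramid and hand each worker every Nth tile. The sketch below assumes a simple quad-tree pyramid where zoom level z has 2^z by 2^z tiles, so each extra level quadruples the tile count. The function names are hypothetical, and the actual rendering and upload steps are left out.

```python
from itertools import islice


def pyramid_tiles(min_zoom, max_zoom):
    """Yield every (zoom, x, y) tile in a full quad-tree pyramid."""
    for zoom in range(min_zoom, max_zoom + 1):
        n = 2 ** zoom  # tiles per axis at this zoom level
        for x in range(n):
            for y in range(n):
                yield (zoom, x, y)


def tiles_for_worker(worker_index, worker_count, min_zoom, max_zoom):
    """Round-robin share of the pyramid for one worker (0-based index)."""
    return islice(pyramid_tiles(min_zoom, max_zoom), worker_index, None, worker_count)


def total_tiles(min_zoom, max_zoom):
    """Number of tiles in the pyramid: 4**z tiles at each zoom level z."""
    return sum(4 ** zoom for zoom in range(min_zoom, max_zoom + 1))


# Example: split zoom levels 0..2 across 4 hypothetical instances.
shares = [list(tiles_for_worker(i, 4, 0, 2)) for i in range(4)]
```

Every worker can compute its own share from just its index and the worker count, so no central queue is required for this scheme. Round-robin assignment ignores the fact that some tiles are more expensive to render than others, so a shared work queue may balance the load better in practice.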

This idea is somewhat less defined and more challenging than the others.

Document generated by Confluence on May 14, 2014 23:00