This page last changed on Feb 11, 2006 by cholmes.

Structure of RelationshipDataStore

The main goal of this document is to present the structure and design of the RelationshipDataStore. It shows how this kind of data store is built up and explains the main ideas behind reading and writing complex features.
The fundamental idea is to create a data store that implements the org.geotools.data.DataStore interface and wraps exactly two data stores, one of which may explicitly be set to null. This data store is called a RelationshipDataStore. It knows nothing at all about the data stores it contains, except that they implement the org.geotools.data.DataStore interface. Because a RelationshipDataStore needs no detailed knowledge of the data stores it wraps, it can also wrap another RelationshipDataStore, so you can build a binary tree out of them. On the left side of the figure below you can see a RelationshipDataStore containing two BasisDataStores (data stores that provide flat structured features are called BasisDataStores). On the right side you can see a configuration of a RelationshipDataStore covering four data stores.

The second figure shows the static structure. As mentioned above, the RelationshipDataStore encapsulates two data stores, knowing only that they are org.geotools.data.DataStore implementations. For a RelationshipDataStore to be able to contain other RelationshipDataStores, it has to implement the org.geotools.data.DataStore interface, too. Like every data store that follows the GeoTools conventions, it is created by its factory, the RelationshipDataStoreFactory.
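
The structure described above is essentially a composite. The following is a minimal sketch of it in Java, not the actual implementation: Store is a deliberately tiny, hypothetical stand-in for org.geotools.data.DataStore (the real interface has many more methods), and BasisStore, RelationshipStore and RelationshipStoreSketch are names assumed for this illustration only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Sketch of the composite structure; all names here are hypothetical stand-ins. */
final class RelationshipStoreSketch {

    /** Simplified stand-in for org.geotools.data.DataStore. */
    interface Store {
        List<String> getTypeNames();
    }

    /** A "BasisDataStore": a leaf that provides flat feature types. */
    static final class BasisStore implements Store {
        private final List<String> typeNames;

        BasisStore(String... typeNames) {
            this.typeNames = Arrays.asList(typeNames);
        }

        @Override
        public List<String> getTypeNames() {
            return typeNames;
        }
    }

    /** Wraps exactly two stores, one of which may be null. */
    static final class RelationshipStore implements Store {
        private final Store left;
        private final Store right;

        RelationshipStore(Store left, Store right) {
            this.left = left;
            this.right = right;
        }

        /** Exposes every feature type found in any child store. */
        @Override
        public List<String> getTypeNames() {
            List<String> names = new ArrayList<String>();
            if (left != null) {
                names.addAll(left.getTypeNames());
            }
            if (right != null) {
                names.addAll(right.getTypeNames());
            }
            return names;
        }
    }

    public static void main(String[] args) {
        // Because RelationshipStore is itself a Store, it can be nested into a binary tree.
        Store lowerLeft = new RelationshipStore(new BasisStore("person"), new BasisStore("car"));
        Store lowerRight = new RelationshipStore(new BasisStore("habitation"), new BasisStore("address"));
        Store root = new RelationshipStore(lowerLeft, lowerRight);
        System.out.println(root.getTypeNames());
    }
}
```

The point of the sketch is that RelationshipStore only ever talks to the Store interface, so it cannot tell whether a child is a leaf or another RelationshipStore.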

Now it is time to go on with relationships and how they are treated by RelationshipDataStores.
A relation, from the point of view of a RelationshipDataStore, is a link between two attribute types that belong to two feature types or, sometimes, to one and the same feature type. Every RelationshipDataStore keeps the relationships of the feature types it contains. Each relationship is stored only once in the binary tree, as near to the bottom of the tree as possible.
Below is an example configuration of a RelationshipDataStore that may help in understanding the design. As you can see, there are four data stores, each of which can be any data store, e.g. MySQLDataStore, ArcSDEDataStore, PropertyDataStore or any other implementation of the org.geotools.data.DataStore interface you can imagine. Every data store contains its feature types, and a RelationshipDataStore contains all the feature types that are in any of its child data stores. In the figure there is one data store each for "person", "car", "habitation" and "address". Please note that a data store is not obliged to contain only one feature type, as is the case with a ShapefileDataStore. Furthermore, the illustration below shows some relationships managed in relation tables. In the case of an RDBMS, the source column of the relation table refers to the foreign key attribute and the target column refers to the primary key attribute, so the direction of a relation is very important.

As you will see in the next paper, "Reading Complex Features", source attributes are replaced by a reference to the related feature, because keeping the key value as well would store the same data twice, while target attributes must not be replaced because they are needed for identification. As mentioned above, a relationship is defined in the RelationshipDataStore that contains the feature types of the relationship and is nearest to the bottom of the binary tree.
In the example, the relation between "person" and "car" has to be stored in the RelationshipDataStore as shown in the figure. The RelationshipDataStore one level above is not responsible for this relation, because it is not the one nearest to the bottom with respect to the feature types "person" and "car". The relation between "person" and "habitation", however, has to be managed in the data store one level above, because none of the data stores below knows about both feature types "person" and "habitation". So in this example the top level data store is the one nearest to the bottom that contains both "person" and "habitation". Finally, the relationship between "habitation" and "address" is handled like the relationship between "person" and "car".
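
The placement rule can be written down as a small tree search. This is only a sketch of the rule as I read it from the paragraphs above: Node, owner() and the labels "lower-left", "lower-right" and "root" are hypothetical names, and the tree mirrors the four-store example.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Sketch only: Node and owner() are hypothetical names, not GeoTools API. */
final class RelationPlacementSketch {

    /** A tree node: a leaf holds feature type names, an inner node wraps up to two children. */
    static final class Node {
        final String label;
        final Node left;
        final Node right;
        final List<String> ownTypes;

        private Node(String label, Node left, Node right, List<String> ownTypes) {
            this.label = label;
            this.left = left;
            this.right = right;
            this.ownTypes = ownTypes;
        }

        /** A basic data store with its feature types. */
        static Node leaf(String label, String... types) {
            return new Node(label, null, null, Arrays.asList(types));
        }

        /** A relationship data store; one child may be null. */
        static Node inner(String label, Node left, Node right) {
            return new Node(label, left, right, Collections.<String>emptyList());
        }

        boolean isLeaf() {
            return left == null && right == null;
        }

        boolean contains(String type) {
            return ownTypes.contains(type)
                    || (left != null && left.contains(type))
                    || (right != null && right.contains(type));
        }
    }

    /** Returns the lowest inner node whose subtree contains both types, or null if none does. */
    static Node owner(Node node, String sourceType, String targetType) {
        if (node == null || node.isLeaf()
                || !node.contains(sourceType) || !node.contains(targetType)) {
            return null;
        }
        // Prefer a node further down; fall back to this node if no child qualifies.
        Node lower = owner(node.left, sourceType, targetType);
        if (lower == null) {
            lower = owner(node.right, sourceType, targetType);
        }
        return lower != null ? lower : node;
    }

    /** The example tree from the text; the labels are made up for this sketch. */
    static Node example() {
        Node lowerLeft = Node.inner("lower-left", Node.leaf("A", "person"), Node.leaf("B", "car"));
        Node lowerRight = Node.inner("lower-right", Node.leaf("C", "habitation"), Node.leaf("D", "address"));
        return Node.inner("root", lowerLeft, lowerRight);
    }

    public static void main(String[] args) {
        Node root = example();
        System.out.println(owner(root, "person", "car").label);
        System.out.println(owner(root, "person", "habitation").label);
    }
}
```

Only inner nodes can own a relation in this sketch, since a basic data store has no relation table.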
Let us have a quick look at the functional aspects. The following lines show how reading features and their types should work. For example, if you read the feature type of "person", you will not expect a feature type described like this:

  • ID in the type of Integer
  • FirstName in the type of String
  • LastName in the type of String

Right? What you will expect when reading this complex feature type is:

  • ID in the type of Integer
  • FirstName in the type of String
  • LastName in the type of String
  • Habitation in the type of Feature
  • Car in the type of Feature

Note that a FeatureCollection defined by GeoTools is also a Feature!
In another case you want to read the feature type of "habitation", and you probably want to get this:

  • ID in the type of Integer
  • Type in the type of String
  • Area in the type of Integer
  • Person in the type of Feature
  • Address in the type of Feature

But how does it work? Let us start with a situation where a user asks the top level RelationshipDataStore for the feature type of "habitation". The first thing the RelationshipDataStore does is get the feature type from the child data store that contains the requested feature type. After that, if it has a relationship entry in its relation table, it knows that there is at least one attribute type to be replaced (if it is a source attribute) or attached (if it is a target attribute).
This is what happens in the example: The root data store asks its child data store for the feature type "habitation". That child data store in turn asks its own child data store for the feature type. As this is a basic data store, it returns the usual simple feature type. The child RelationshipDataStore knows about a relation involving the feature type "habitation", so it replaces an attribute type or attaches a new one. The new attribute type is named after the related feature type, and its type is the GeoTools Feature interface. After modifying all related attribute types, it returns the semi-complex feature type to the root RelationshipDataStore. The top level RelationshipDataStore also knows a relation involving the feature type "habitation", so it modifies the feature type in the same way. Finally it returns the complex feature type.
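
The walk through the tree can be sketched as follows. This is an illustration under assumptions, not the real code: schemas are plain ordered maps from attribute name to type name, Store, BasisStore, RelationshipStore and Relation are hypothetical names, and the foreign key names OwnerID and HabitationID are assumed because the relation tables are only shown in the figure.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Sketch only: all names here are hypothetical, not GeoTools API. */
final class SchemaResolutionSketch {

    /** One relation table row: source (foreign key) refers to target (primary key). */
    static final class Relation {
        final String sourceType, sourceAttribute, targetType, targetAttribute;

        Relation(String sourceType, String sourceAttribute, String targetType, String targetAttribute) {
            this.sourceType = sourceType;
            this.sourceAttribute = sourceAttribute;
            this.targetType = targetType;
            this.targetAttribute = targetAttribute;
        }
    }

    interface Store {
        /** Returns attribute name -> type name in order, or null if the type is unknown. */
        Map<String, String> getSchema(String typeName);
    }

    /** Leaf store with one flat feature type, given as name/type pairs. */
    static final class BasisStore implements Store {
        private final String typeName;
        private final Map<String, String> schema = new LinkedHashMap<String, String>();

        BasisStore(String typeName, String... nameTypePairs) {
            this.typeName = typeName;
            for (int i = 0; i < nameTypePairs.length; i += 2) {
                schema.put(nameTypePairs[i], nameTypePairs[i + 1]);
            }
        }

        @Override
        public Map<String, String> getSchema(String requested) {
            return typeName.equals(requested) ? new LinkedHashMap<String, String>(schema) : null;
        }
    }

    static final class RelationshipStore implements Store {
        private final Store left, right;
        private final List<Relation> relations;

        RelationshipStore(Store left, Store right, Relation... relations) {
            this.left = left;
            this.right = right;
            this.relations = Arrays.asList(relations);
        }

        @Override
        public Map<String, String> getSchema(String typeName) {
            // Step 1: fetch the (possibly already semi-complex) type from a child.
            Map<String, String> schema = left == null ? null : left.getSchema(typeName);
            if (schema == null && right != null) {
                schema = right.getSchema(typeName);
            }
            if (schema == null) {
                return null;
            }
            // Step 2: apply this store's own relation table.
            for (Relation r : relations) {
                if (r.sourceType.equals(typeName)) {
                    schema = replace(schema, r.sourceAttribute, r.targetType); // replace source attribute
                }
                if (r.targetType.equals(typeName)) {
                    schema.put(r.sourceType, "Feature"); // attach attribute for the referencing type
                }
            }
            return schema;
        }

        /** Rebuilds the map so the new attribute keeps the position of the replaced one. */
        private static Map<String, String> replace(Map<String, String> schema, String oldName, String newName) {
            Map<String, String> result = new LinkedHashMap<String, String>();
            for (Map.Entry<String, String> e : schema.entrySet()) {
                boolean hit = e.getKey().equals(oldName);
                result.put(hit ? newName : e.getKey(), hit ? "Feature" : e.getValue());
            }
            return result;
        }
    }

    /** The example tree; OwnerID and HabitationID are assumed foreign key names. */
    static Store example() {
        Store lowerLeft = new RelationshipStore(
                new BasisStore("person", "ID", "Integer", "FirstName", "String", "LastName", "String"),
                new BasisStore("car", "ID", "Integer", "OwnerID", "Integer"),
                new Relation("car", "OwnerID", "person", "ID"));
        Store lowerRight = new RelationshipStore(
                new BasisStore("habitation", "ID", "Integer", "Type", "String", "Area", "Integer",
                        "OwnerID", "Integer"),
                new BasisStore("address", "ID", "Integer", "HabitationID", "Integer"),
                new Relation("address", "HabitationID", "habitation", "ID"));
        return new RelationshipStore(lowerLeft, lowerRight, new Relation("habitation", "OwnerID", "person", "ID"));
    }
}
```

Calling example().getSchema("habitation") first lets the lower store attach the address attribute and then lets the root replace the assumed OwnerID attribute by a person attribute. The new attributes are named after the related feature types exactly as those are spelled in the stores.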
The second and last part of this outlook describes how a complex feature is read. Suppose a user wants to read a feature of the feature type "person", and suppose this person owns three cars, one house and one flat in different places. A RelationshipDataStore works like this: First it checks whether there is an entry in its relation table. If there is no entry, it asks its child data store for the requested feature and passes it back as it is.
And this is how it works in the example: A user wants to read a feature of the type "person". The root RelationshipDataStore gets the request. It looks into its relation table to see whether there is a relationship entry for the type of the requested feature. There is one entry, relating the feature types "person" and "habitation". Now there are three things to be done: read the root feature, read the child features and combine them. So the root RelationshipDataStore asks its child RelationshipDataStore for the root feature. The child RelationshipDataStore gets a request for a specific person. It also knows about its own relationship entry between "person" and "car". First it gets the specific person. After that it gets all the cars belonging to this person by comparing the key attributes. The child features are wrapped into a FeatureCollection, which is linked to the root feature. Then the child RelationshipDataStore returns the semi-complex feature to the root RelationshipDataStore. At this point the root RelationshipDataStore has read only the root feature. So it asks its other child RelationshipDataStore for all the habitations owned by the person. It gets all the habitations of the person together with their related addresses, which were assembled in the same way as the person and its cars. The root RelationshipDataStore wraps all habitations into an additional FeatureCollection, links the root feature to the collection and, the other way round, links the collection's members to the root feature. As a result the root RelationshipDataStore returns the complex feature.
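
To make the reading steps concrete, here is a rough sketch of them. It is not the RelationshipFeatureReader: Feature, Store, BasisStore, RelationshipStore and Relation are hypothetical names, a plain List stands in for a FeatureCollection, the key names OwnerID and HabitationID and all sample values are made up, and only the case walked through above (the requested type is the target of a relation) is covered.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Sketch only: all names here are hypothetical, not GeoTools API. */
final class ComplexReadSketch {
    /** Minimal feature: a type name plus ordered attributes. */
    static final class Feature {
        final String type;
        final Map<String, Object> attributes = new LinkedHashMap<String, Object>();
        Feature(String type, Object... nameValuePairs) {
            this.type = type;
            for (int i = 0; i < nameValuePairs.length; i += 2) {
                attributes.put((String) nameValuePairs[i], nameValuePairs[i + 1]);
            }
        }
    }

    /** One relation table row: source (foreign key) refers to target (primary key). */
    static final class Relation {
        final String sourceType, sourceAttribute, targetType, targetAttribute;
        Relation(String sourceType, String sourceAttribute, String targetType, String targetAttribute) {
            this.sourceType = sourceType;
            this.sourceAttribute = sourceAttribute;
            this.targetType = targetType;
            this.targetAttribute = targetAttribute;
        }
    }

    interface Store {
        List<Feature> read(String type, String attribute, Object value);
    }

    static final class BasisStore implements Store {
        private final List<Feature> features;
        BasisStore(Feature... features) { this.features = Arrays.asList(features); }
        @Override
        public List<Feature> read(String type, String attribute, Object value) {
            List<Feature> hits = new ArrayList<Feature>();
            for (Feature f : features) {
                if (f.type.equals(type) && value != null && value.equals(f.attributes.get(attribute))) {
                    Feature copy = new Feature(f.type); // copy, so linking never alters stored data
                    copy.attributes.putAll(f.attributes);
                    hits.add(copy);
                }
            }
            return hits;
        }
    }

    static final class RelationshipStore implements Store {
        private final Store left, right;
        private final List<Relation> relations;
        RelationshipStore(Store left, Store right, Relation... relations) {
            this.left = left;
            this.right = right;
            this.relations = Arrays.asList(relations);
        }
        private List<Feature> readFromChildren(String type, String attribute, Object value) {
            List<Feature> hits = new ArrayList<Feature>();
            if (left != null) hits.addAll(left.read(type, attribute, value));
            if (right != null) hits.addAll(right.read(type, attribute, value));
            return hits;
        }
        @Override
        public List<Feature> read(String type, String attribute, Object value) {
            List<Feature> roots = readFromChildren(type, attribute, value);
            for (Relation r : relations) {
                if (!r.targetType.equals(type)) continue; // only the "target" case is sketched
                for (Feature root : roots) {
                    Object key = root.attributes.get(r.targetAttribute);
                    List<Feature> members = readFromChildren(r.sourceType, r.sourceAttribute, key);
                    for (Feature m : members) m.attributes.put(r.sourceAttribute, root); // back link
                    root.attributes.put(r.sourceType, members); // stands in for a FeatureCollection
                }
            }
            return roots;
        }
    }

    /** Made-up sample data; OwnerID and HabitationID are assumed key names. */
    static Store example() {
        Store people = new BasisStore(new Feature("person", "ID", 1, "LastName", "Doe"));
        Store cars = new BasisStore(new Feature("car", "ID", 10, "OwnerID", 1),
                new Feature("car", "ID", 11, "OwnerID", 1), new Feature("car", "ID", 12, "OwnerID", 1));
        Store homes = new BasisStore(new Feature("habitation", "ID", 20, "Type", "house", "OwnerID", 1),
                new Feature("habitation", "ID", 21, "Type", "flat", "OwnerID", 1));
        Store addresses = new BasisStore(new Feature("address", "ID", 30, "HabitationID", 20),
                new Feature("address", "ID", 31, "HabitationID", 21));
        Store lowerLeft = new RelationshipStore(people, cars, new Relation("car", "OwnerID", "person", "ID"));
        Store lowerRight = new RelationshipStore(homes, addresses,
                new Relation("address", "HabitationID", "habitation", "ID"));
        return new RelationshipStore(lowerLeft, lowerRight, new Relation("habitation", "OwnerID", "person", "ID"));
    }
}
```

Calling example().read("person", "ID", 1) yields one person whose "car" attribute holds three cars and whose "habitation" attribute holds two habitations, each with its own "address" list. Note that the back links make the object graph cyclic, which is why Feature deliberately does not override toString or equals in this sketch.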
I hope you enjoyed reading my first article about the fundamental structure of this special data store. I am always open to your suggestions, criticism and questions, and to praise as well.
See you next time, when I will report on the RelationshipFeatureReader.


Document generated by Confluence on May 14, 2014 23:00