GeoServer : selectDerivedFeatureType
This page last changed on Mar 07, 2005 by dblasby.
Please see DerivedFeatureType for an overview.
This page describes an implementation of a datastore that wraps another datastore and produces a new FeatureType computed from the wrapped datastore's FeatureType.
This "virtual" datastore needs to be able to transform an input FeatureType into a new FeatureType, and to handle queries against the result.
The basic idea is to set up a set of associations: (a) new attribute name, (b) new attribute definition (based on the OGC Filter Function clause), and (c) new attribute type.
For example, consider the query:
With the following input and output:
This defines a FeatureType with four attributes, each defined in terms of the FeatureType it is derived from:
"geom" (of type Geometry) with the definition (see "Functions" below):
"nStudies" (of type integer) with the definition:
"river_len" (of type double) with the definition:
"width" (of type double) with the definition:
NOTE: I haven't given any FID information - this should be derived from the wrapped datastore's FID. For example, converting "myRiverTable:3587" to "computedRiverTable:myRiverTable:3587" using namespace information.
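The FID mapping just described is a pure string transformation. A minimal sketch (the class and method names here are mine, not GeoTools API):

```java
public class DerivedFid {
    // Prefix the wrapped datastore's FID with the derived type's name,
    // e.g. "myRiverTable:3587" -> "computedRiverTable:myRiverTable:3587".
    public static String deriveFid(String derivedTypeName, String wrappedFid) {
        return derivedTypeName + ":" + wrappedFid;
    }

    // Strip the prefix again when an FID filter against the derived type
    // must be forwarded to the wrapped datastore.
    public static String wrappedFid(String derivedTypeName, String derivedFid) {
        String prefix = derivedTypeName + ":";
        if (!derivedFid.startsWith(prefix)) {
            throw new IllegalArgumentException(
                "not a " + derivedTypeName + " FID: " + derivedFid);
        }
        return derivedFid.substring(prefix.length());
    }
}
```

The reverse mapping is what an FID query against the derived type would use before being handed to the wrapped datastore.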
The attribute type should not need to be specified, because it can be determined by looking at the definition; see the discussion of functions below.
See the section on Functions, below, but reading from the "virtual" datastore should be as simple as reading a single feature from the wrapped datastore and creating a new feature by executing the above function calls on the wrapped feature. The abstract "filter evaluator" (for non-SQL datastores) should be able to handle this type of evaluation.
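The read path can be sketched as follows. This is an illustration only: the wrapped feature is modelled as a plain map and each attribute definition as a Java function, standing in for a real Filter expression. The nStudies definition (sum of num_fed_studies and num_state_studies) is taken from the read-only example later on this page.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

public class DerivedReader {
    // Each derived attribute is a name plus a definition that is
    // evaluated against the wrapped feature (the Function stands in
    // for an OGC Filter expression).
    public static Map<String, Object> deriveFeature(
            Map<String, Object> wrapped,
            Map<String, Function<Map<String, Object>, Object>> definitions) {
        Map<String, Object> derived = new LinkedHashMap<>();
        for (Map.Entry<String, Function<Map<String, Object>, Object>> e
                : definitions.entrySet()) {
            derived.put(e.getKey(), e.getValue().apply(wrapped));
        }
        return derived;
    }
}
```

A definition such as nStudies would be registered once, then applied to every feature streamed out of the wrapped datastore.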
Handling queries is actually quite simple. We just re-write the input query so it applies to the wrapped datastore. This should be as simple as replacing any "<PropertyName>...</PropertyName>" in the input query with the function-based definition given above.
For example, consider the following query on the "virtual" datastore defined above:
by replacing the "<PropertyName>nStudies</PropertyName>" with its definition we get:
This is then executed on the wrapped dataset.
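A naive version of this rewrite can be done as plain string substitution (a real implementation would operate on the parsed Filter tree instead). The nStudies definition below - an OGC <Add> of num_fed_studies and num_state_studies - is my assumption, taken from the read-only example later on this page:

```java
import java.util.Map;

public class QueryRewriter {
    // Replace every <PropertyName>x</PropertyName> whose name has a
    // definition with that definition's XML, so the rewritten filter
    // refers only to attributes of the wrapped datastore.
    public static String rewrite(String filterXml, Map<String, String> definitions) {
        for (Map.Entry<String, String> e : definitions.entrySet()) {
            filterXml = filterXml.replace(
                "<PropertyName>" + e.getKey() + "</PropertyName>", e.getValue());
        }
        return filterXml;
    }
}
```

For example, a filter comparing nStudies against a literal comes out with the <Add> expression substituted in place of the property name.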
Unfortunately, the re-written queries can be inefficient. This inefficiency can be reduced by taking advantage of the wrapped datastore's advanced indexing abilities.
Here's an example of the problem (in the spatial context):
In this example, we see that the derived (buffered) dataset intersects the query bounding box, but the original (wrapped) dataset does not. If you were to execute a query like this on a PostGIS datastore, you would see a query along the lines of this:
This is clearly too inefficient. I am, therefore, recommending that, when the datastore is first configured, the user be able to specify special behavior for a bounding box search.
For example, if the user knows the widest river is 100m then they can "grow" the input bounding box by 100m and send that bbox to the wrapped datastore. NOTE: the 100m is a constant amount that is added to every query bbox.
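Growing the bounding box is a constant adjustment on each edge. A sketch, with the box held as {minX, minY, maxX, maxY} rather than a JTS Envelope:

```java
public class BBoxGrower {
    // Expand a query bbox by a fixed margin (e.g. 100m, the widest
    // river) on every side before forwarding it to the wrapped
    // datastore. The margin is the same constant for every query.
    public static double[] grow(double minX, double minY,
                                double maxX, double maxY, double margin) {
        return new double[] {minX - margin, minY - margin,
                             maxX + margin, maxY + margin};
    }
}
```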
This solution is not ideal because it could return too few rows (if the user's bbox expansion behavior is incorrect). It is up to the user to ensure that their specified behavior will produce correct results. This solution is, however, very easy to implement and should work well for the majority of cases.
NOTE: if the user does not actually modify the wrapped geometries (ie. the new geometry is defined to be "<PropertyName>...</PropertyName>") in the "virtual" datastore, then there will be no problems. The bounding box will "pass through", unmodified, to the wrapped datastore and be indexable by whatever spatial indexing that datastore provides.
Most advanced databases (like PostgreSQL and Oracle) allow an index to be built not only on actual columns, but on expressions involving data in the columns. The PostgreSQL manual has more details.
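For the buffered-rivers example, such an expression index might look like the following PostgreSQL/PostGIS statement (the table name, column name, and buffer distance are illustrative):

```sql
-- Index the buffered geometry itself, so that bbox searches against
-- the derived geometry can use the spatial index directly.
CREATE INDEX rivers_buffer_idx ON rivers USING GIST (buffer(geom, 100.0));
```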
Once this index has been created, then the queries (given above) will actually use the index. Building indexes on expressions works to solve problems of this sort in general, but:
I expect that most actual WFS/WMS queries will involve a bounding box - the simple bounding box solution given above is extremely simple and effective, and will capture most of the use-cases.
Obviously, the derived datastore is read-only. For example, if we update the derived attribute "nStudies" to 10, we have insufficient information to update the underlying datastore (with num_fed_studies and num_state_studies).
In general, one would make modifications through the wrapped datastore directly - the derived datastore will automatically "pick up" those modifications.
The OGC specification allows for arbitrary functions to be called. It appears that there are only a few actually implemented in the Geotools Expression package.
Simple math functions ("+", "-", "/", and "*") and logic functions ("and", "or") are handled more directly in the Filter Expression.
It appears the only "extension" functions that Geotools supports are "Max(number,number)" and "Min(number,number)". I think this can easily be extended in the same manner as the Hypersonic SQL DB allows custom (static) functions to be called.
Spatial DB in a Box already sets up a StaticGeometry class that converts almost all the JTS functionality into "static" methods. For example:
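The pattern is to wrap each instance method in a static method, so an expression evaluator can invoke everything uniformly. A sketch of the idea (the JTS form is shown as a comment; a String-based stand-in keeps the block runnable without JTS on the classpath):

```java
public class StaticFunctions {
    // With JTS, the StaticGeometry wrappers would look roughly like:
    //   public static Geometry bufferGeometry(Geometry g, double d) {
    //       return g.buffer(d);
    //   }
    // The same pattern, shown with a type available everywhere:
    public static int strLength(String s) {
        return s.length();
    }

    public static String strConcat(String a, String b) {
        return a.concat(b);
    }
}
```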
The basic idea is to:
NOTE: polymorphism (ie. functions with the same name, but different argument types or argument counts) should not be allowed, because it would make things much more difficult to code. Simply give all functions unique names. For example, JTS supports a geometry.buffer(double) and a geometry.buffer(double,int) - these could be called "buffer" and "buffer_with_segmentCount".
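With unique names, a lookup table keyed by the bare function name is enough, and registering two functions with the same name can fail fast. A sketch (the class names are mine, not GeoTools API):

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.HashMap;
import java.util.Map;

public class FunctionRegistry {
    // Because overloading is disallowed, the bare function name is a
    // sufficient lookup key; "buffer" and "buffer_with_segmentCount"
    // are simply two separate entries.
    private final Map<String, Method> byName = new HashMap<>();

    public void register(Class<?> holder) {
        for (Method m : holder.getDeclaredMethods()) {
            if (Modifier.isStatic(m.getModifiers())
                    && Modifier.isPublic(m.getModifiers())) {
                if (byName.put(m.getName(), m) != null) {
                    throw new IllegalStateException(
                        "duplicate function name: " + m.getName());
                }
            }
        }
    }

    public Object call(String name, Object... args) {
        Method m = byName.get(name);
        if (m == null) {
            throw new IllegalArgumentException("unknown function: " + name);
        }
        try {
            return m.invoke(null, args); // static, so no receiver
        } catch (ReflectiveOperationException e) {
            throw new RuntimeException(e);
        }
    }

    // Example holder of "static function" definitions (hypothetical).
    public static class RiverFunctions {
        public static double doubleIt(double x) {
            return x * 2;
        }
    }
}
```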
For example, consider this inside a Filter:
Pseudo-code for evaluating this function (see MaxFunction.java in the Filter package):
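Roughly, a function expression evaluates each of its argument expressions against the feature, then applies the function to the results. A sketch of Max along those lines, with the argument expressions modelled as Java functions over the feature (an illustration, not the actual MaxFunction.java):

```java
import java.util.Map;
import java.util.function.Function;

public class MaxFunction {
    private final Function<Map<String, Object>, Double> left;
    private final Function<Map<String, Object>, Double> right;

    public MaxFunction(Function<Map<String, Object>, Double> left,
                       Function<Map<String, Object>, Double> right) {
        this.left = left;
        this.right = right;
    }

    // Evaluate both argument expressions against the feature,
    // then take the larger result.
    public double evaluate(Map<String, Object> feature) {
        return Math.max(left.apply(feature), right.apply(feature));
    }
}
```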
Looking at the code, it appears that Geotools currently supports expressions of numeric types, Geometry, and String. This is probably enough for most users, but they may want to do things like call functions on Date objects.
As I mentioned above, the return types of these functions should be explicit (and determined with reflection) - this will allow for automatic generation of AttributeTypes for the 'calculated' columns.
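Deriving the AttributeType from a function's declared return type is a small reflection lookup. A sketch (unique function names assumed, per the note above):

```java
import java.lang.reflect.Method;

public class ReturnTypes {
    // Determine a derived attribute's type from its defining
    // function's declared return type, via reflection. With unique
    // function names, the first name match is the only match.
    public static Class<?> returnTypeOf(Class<?> holder, String functionName) {
        for (Method m : holder.getMethods()) {
            if (m.getName().equals(functionName)) {
                return m.getReturnType();
            }
        }
        throw new IllegalArgumentException("unknown function: " + functionName);
    }
}
```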
Other popular functions:
When I first used a WFS, I just assumed that you could perform functions on the returned columns - not just in the filters. This type of "virtual" datastore that makes derived features operates like a very simple SQL view and has many actual use-cases:
|Document generated by Confluence on May 14, 2014 23:00|