Open Source on IBM i

Processing XML Data

For some time now I had the idea to port a REST service of mine from Java to RPG. Mostly just out of curiosity because the REST service was running well enough. The REST service was about serving the geo coordinates of the boundaries of postalcode areas.

Import

The data source for these coordinates is the web site suche-postleitzahl.org. They generously supply us with the latest geographic data of Germany. Thanks guys. Keep it going =) .

The data is in the KML format which is an XML based format. So we can process the file with native RPG tools, in our case the opcode XML-SAX. Boundaries are made up of multiple coordinates. The coordinates for one boundary are located in the KML file as one long string of characters. The string can have a length of over 75000 characters. And lately (probably since IBM i 6.1) we can declare variables big enough to hold the content in RPG. But what about storing such a big character string in the database. The CHAR data type has a max. size of 32k characters (plus some more but that doesn't matter here). But like so many other databases the DB 2 for i also supports the data type CLOB (Character Large Object). So this should work out of the box with embedded SQL.

But like so often ... it won't work out of the box. CHAR and CLOB are two fundamentally different data types and are not compatible in embedded SQL. The following code looks nice and straight forward but doesn't work.

dcl-s coords char(100000);
EXEC SQL INSERT INTO postalcodes(country, postalcode, city, coords)
         VALUES('DE', :postalcode, :city, :coords);

Since V5R4 RPG supports working with LOBs with the RPG data type SQLTYPE. At first this didn't seem to work out in this case at all because most tutorials and information about RPG and CLOBs were written when it came out with V5R4. But at that time RPG still had the limit for CHAR fields of 65k and this was also pointed out in the articles. So this wouldn't help because the coordinate data is bigger than 65k. But that was at V5R4. Since then much has happened and now you can declare CHAR variables with up to 16m characters.

dcl-s coords sqltype(clob : 100000);
EXEC SQL INSERT INTO postalcodes(country, postalcode, city, coords)
         VALUES('DE', :postalcode, :city, :coords);

REST Service

The REST service part of it would be implemented by using the ILE REST framework BlueDroplet. But some features are not implemented yet in the framework and so this part has to wait a little bit longer.

Source Code

You can take a look at the code here.

Notes

Saving the coordinates in the format from the KML file is not ideal in any way because now every time a boundary is requested the string of coordinates need to be converted into the desired format. It would be much better to convert them directly on the import so we would have to convert them once instead of x times. But that has to wait till the BlueDroplet project has proceeded further and has implemented the feature to parse the query string of an HTTP request. So that we can really use this data and return it not as JSON but as JSONP which makes it much easier to access and process it in the fronted (JavaScript).

Mihael

Tags : XMLSQLCLOB