Archive for May, 2011

Release 0.8

It’s been a while since there was a major bcaching release. There are two new features:

  • Add support for geocaching Cache Attributes. Note that you must set your GPX version to 1.0.1 or later for cache attributes to be included. You can set the preference on your geocaching.com account details page. Cache attributes are displayed on the mobile cache details page.
  • Add support for Metric distance units. You can set your preference on your bcaching profile page or in the mobile options page.

The most significant change is a migration of the database from MySQL to MongoDB. It may not sound like much, but it required a major rewrite of a lot of behind-the-scenes logic and some reorganization of the data model.

Due to limited server resources (disk space), it was not possible to migrate all the finder logs at once, so logs are still being retrieved from MySQL for now and will be migrated gradually over the next week or two. During that period, synchronization with Geobeagle/Geohunter/OpenGPX may be a little slower because logs are being retrieved from both databases, but performance will be better overall once the migration is complete.

There were a lot of changes, and the risk of bugs and issues is high, so please be on the lookout for any problems and report them on the forums.

What’s new database?

Every month bcaching grows a little bigger with more users and more data. The database has grown to over 12 GB in order to accommodate 1.5 million geocache records and over 32 million finder logs. One of the persistent problems has been how to keep that geocache data up to date as efficiently as possible while still running fast enough on reasonably priced hosting services (currently a single medium-sized Windows-based VPS).

Over 700 GPX files are processed every day. Over the past 24 hours, 890 GPX files were processed containing 388,891 geocaches, 1,902,996 finder logs, 106,832 waypoints, and 85,612 travel bugs. It took 17 hours and 41 minutes of processing time to read and load those files. That’s 2,341 data objects per minute, and it’s too slow even after several performance improvements over the years. It also takes resources away from serving web and API requests, even though GPX processing runs at a lower priority than other work.
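For anyone who wants to check the arithmetic, here is a small sketch that recomputes the rate from the counts quoted above. The variable names are only for illustration; the numbers are the ones in this post.

```python
# Object counts from the past 24 hours of GPX processing, as quoted in the post.
counts = {
    "geocaches": 388_891,
    "finder_logs": 1_902_996,
    "waypoints": 106_832,
    "travel_bugs": 85_612,
}

# 17 hours and 41 minutes of processing time, expressed in minutes.
processing_minutes = 17 * 60 + 41

total_objects = sum(counts.values())
# Whole objects per minute (rounded down).
objects_per_minute = total_objects // processing_minutes

print(total_objects, processing_minutes, objects_per_minute)
# prints: 2484331 1061 2341
```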

For the past couple of months I have spent some time investigating, testing, and implementing a completely new back-end database using MongoDB to replace MySQL. Mongo is fast. It sacrifices features that could slow it down and requires the application to take on responsibility for more functions, but it provides excellent performance in return.

The jump from a traditional SQL database to a schema-less database required a complete rewrite of all the data access logic, but it simplified some of the logic as well (especially the GPX file processing). Some of the data model also had to be reorganized to take full advantage of certain MongoDB features.

One of the remaining problems is how to migrate from the old database to the new one. Normally a release includes only minor database changes and can be completed in a couple of hours or less, but a full database migration would take the better part of a day, and I’m not willing to shut down the site for that long. Another approach would be to synchronize the two databases while the site is live, then shut down the site only long enough for a final synchronization before switching over. That is not an option either, because there is not enough disk space to support two full databases at the same time.

The remaining option (and current plan) is to synchronize some of the data, then switch to the new database but continue to use the old database in a temporary hybrid mode until the remaining data can be moved. The new database is now being synchronized with everything but finder logs (logs take up the largest percentage of space), and the new application can load logs from both databases and merge the results. There is a definite performance hit to loading the logs this way, but it’s not much worse than before and it will get better after the migration is complete.
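To give a rough idea of what loading logs from both databases and merging the results could look like, here is a minimal sketch. It is not the actual bcaching code: the function name, the `log_id` and `found_on` fields, the cache ID, and the stand-in loaders are all assumptions made up for illustration, and it uses plain Python lists instead of real database queries.

```python
from typing import Callable, Dict, Iterable, List

Log = Dict[str, str]
Loader = Callable[[str], Iterable[Log]]


def load_merged_logs(cache_id: str, load_new: Loader, load_old: Loader) -> List[Log]:
    """Return logs for one cache from both stores, newest first, without duplicates."""
    merged: Dict[str, Log] = {}
    # Read the old store first so that a copy in the new store replaces it.
    for log in load_old(cache_id):
        merged[log["log_id"]] = log
    for log in load_new(cache_id):
        merged[log["log_id"]] = log
    # ISO 8601 date strings sort chronologically as plain text.
    return sorted(merged.values(), key=lambda log: log["found_on"], reverse=True)


# Stand-in loaders with made-up data; real ones would query each database.
def old_store(cache_id: str) -> List[Log]:
    return [
        {"log_id": "a1", "found_on": "2011-04-02", "source": "old"},
        {"log_id": "a2", "found_on": "2011-04-20", "source": "old"},
    ]


def new_store(cache_id: str) -> List[Log]:
    return [
        {"log_id": "a2", "found_on": "2011-04-20", "source": "new"},
        {"log_id": "a3", "found_on": "2011-05-01", "source": "new"},
    ]


logs = load_merged_logs("GC0000", new_store, old_store)
print([(log["log_id"], log["source"]) for log in logs])
```

In this sketch a log that has already been migrated (here `a2`) is taken from the new store, so each log shows up only once no matter how far along the migration is.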

There is still more testing to do, so I haven’t scheduled a release date, but I wanted to give everyone a heads up. There will also be at least one new feature with this release: support for Cache Attributes! If you’re interested in testing the new site, you can use your existing credentials at http://test.bcaching.com, but any uploads or logs may be overwritten by the nightly sync from the main database.

Questions or comments are welcome here or in the forums.