Site maintenance


Some of you may have noticed that bcaching was out of commission for a lengthy period, from Thursday night until Friday morning. The downtime was needed to complete as much of the database reorganization (a.k.a. phase III) as possible.

When cache logs were migrated to the new database (mongodb) using the same layout as before (a single list of logs keyed by cache log id, with a secondary index on cache id), it caused slowdowns similar to the ones we had on mysql. There is simply too much data, and the indexes are too large for the limited resources we have available.

The reorg, to a list of cache documents each containing all of its related logs, had been going painfully slowly for about a week and a half when I finally decided to shut the site down for an extended period and use all the server resources to just get it done. It was going reasonably well (maybe 85% complete) until around 5:30 in the morning, when mongo decided it needed to allocate more space in the filesystem, even though more than enough space had already been freed up by dropping the old cache logs table. The problem was likely due to fragmentation within the database files.

A “repairdatabase” would have solved it by defragmenting and really freeing up the unused space, but like mysql, mongodb (1.8) requires free disk space to create a new clean copy of the database before it deletes the old one. I didn’t have the space. Luckily mongodb has excellent backup and restore functions that allow backups to be done to and from a separate server, so that’s what I did. Incidentally, the next release (2.0) will support an in-place repairdatabase function.
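
For anyone curious what the reorg amounts to, here is a minimal sketch in plain Python of turning a flat list of logs into one document per cache with its logs embedded. The field names (`log_id`, `cache_id`, `text`) and the `embed_logs` helper are made up for illustration; this is not the actual bcaching schema or migration code, and it ignores the batching a real migration of this size would need.

```python
from collections import defaultdict


def embed_logs(flat_logs):
    """Group flat log records into one document per cache.

    Hypothetical helper: the field names are assumptions, not the real schema.
    """
    logs_by_cache = defaultdict(list)
    for log in flat_logs:
        # Keep every field except the cache id, which becomes the parent key.
        entry = {key: value for key, value in log.items() if key != "cache_id"}
        logs_by_cache[log["cache_id"]].append(entry)
    # One document per cache, each carrying all of its related logs.
    return [
        {"_id": cache_id, "logs": logs}
        for cache_id, logs in sorted(logs_by_cache.items())
    ]


# Old layout: a single list of logs, one record per log.
flat_logs = [
    {"log_id": 1, "cache_id": "GC1", "text": "Found it"},
    {"log_id": 2, "cache_id": "GC2", "text": "Did not find"},
    {"log_id": 3, "cache_id": "GC1", "text": "Nice hide"},
]

# New layout: a list of cache documents with embedded logs.
cache_docs = embed_logs(flat_logs)
```

With this shape, fetching all logs for a cache is a single lookup by cache id, so the large secondary index on cache id is no longer needed.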

So after the database was restored, I brought the system back up with partially migrated cache logs. You may encounter a few caches that have no logs, but don’t be alarmed. They will reappear over the next few days.

Update 6/19/2011: The cache logs migration is complete. Finally!


3 comments so far

  1. Tom on

    I just signed up for BCaching and uploaded my “My finds” PQ. Well, sorta, it is in a queue behind over 600 other files. My rough estimation is that this will take about 4.5 hours to complete.

    Is this delay the result of Phase III?

    BTW, in the first link in your “External Links” the correct spelling is Boulter. I’m sure Jeff would appreciate it.

  2. mark on

    Yes, the queue is backed up since the site was down for an extended period. Most of those PQs arrived between 6 and 9 am (EDT) but bcaching wasn’t brought back online until 10. In those 90 minutes it’s processed 314 PQs and has 480 to go (I’m not sure where yours is in the queue, but probably toward the end if you uploaded it after the site came back up) so it will likely be a couple hours.

    The good news is that with the data reorganization, the site is running as speedily as I’d hoped. PQs are being processed 4 – 5 times faster than before and API requests are no longer timing out.

    Thanks for the spelling correction. I’ve updated the link.

  3. Sam (insanimal) on

    Mark, just wanted to say thank you, excellent job! I just started Geohunter syncing with a completely fresh db and it got all 5000-odd caches in one go, with no request failures at all! I am one happy cacher 🙂 All your hard work is very much appreciated!

