Monday
May 14, 2012

Peer to peer synching with TouchDB

Updated 2012-06-05: Incorporated Jens’s suggestions and corrections.

TouchDB is a lean CouchDB-compatible database framework that can be embedded in iOS applications (or, more generally, in mobile or desktop applications, but this post is about iOS). Jens Alfke, its author, describes it this way: “If CouchDB is MySQL, then TouchDB is SQLite.” The project is available on GitHub.

TouchDB is CouchDB-compatible with respect to its replication API when initiated on the device against another ‘regular’ CouchDB. You can create push and pull replication tasks on TouchDB. However, out of the box, TouchDB does not offer an HTTP interface for other TouchDB (or CouchDB) instances to connect to. This means that initially, you are limited to a “star” topology with a regular CouchDB at its center and iOS devices with TouchDB connecting to it as a synchronization hub.

However, with a little extra work, it is quite easy to turn this into a peer to peer setup, thanks to the Listener framework Jens has included in TouchDB.

In order to get this to work, you first need to build the listener framework. To do so, clone the git repository, pull the submodules and build the “Listener iOS Framework” target as follows:

git clone https://github.com/couchbaselabs/TouchDB-iOS
cd TouchDB-iOS
git submodule init
git submodule update
xcodebuild -target "Listener iOS Framework"
open build/Release-ios-universal

The open command will open a Finder window with the framework, which you need to add to your existing project.

After you have done that, you need to start the listener. One place where you might want to do that could be application:didFinishLaunchingWithOptions:. Add the following code to start the listener:

CouchTouchDBServer *server = [CouchTouchDBServer sharedInstance];
[server tellTDServer:^(TDServer *tdServer) {
  NSLog(@"Starting listener");
  _listener = [[TDListener alloc] initWithTDServer:tdServer port:59840];
  [_listener start];
}];

NB: Make sure _listener is an instance variable (or strong property) that outlives the block; otherwise ARC releases the listener as soon as the block finishes and it stops listening immediately. As you can tell from the unbalanced alloc message, these code snippets assume ARC.

This is basically all you need to do to connect to your TouchDB instance via HTTP. For example, you could use curl to query it for documents. However, peer to peer benefits from advertising and discovering your service via Bonjour and the rest of this article briefly describes how to achieve this.
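As a rough illustration of what talking to the listener over HTTP looks like from another machine on the same network, here is a minimal Python sketch. It assumes the listener answers the standard CouchDB `_all_docs` endpoint; the address `192.168.1.23` and the database name `mydb` are placeholders, and both helper names are made up for this example.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen


def all_docs_url(host, db_name, port=59840):
    """Build the URL of the CouchDB-style _all_docs endpoint for a database."""
    return "http://%s:%d/%s/_all_docs" % (host, port, quote(db_name, safe=""))


def fetch_all_docs(host, db_name, port=59840, timeout=5):
    """GET the document listing and decode the JSON response."""
    with urlopen(all_docs_url(host, db_name, port), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage (placeholder address, needs a running listener on the network):
#   listing = fetch_all_docs("192.168.1.23", "mydb")
```

Only `all_docs_url` can be tried without a device at hand; `fetch_all_docs` needs a listener that is actually reachable.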

First off the advertising part. Add the following to a startup section of your application, for example right after creating the listener:

UIDevice *device = [UIDevice currentDevice];
self.netService = [[NSNetService alloc] initWithDomain:@"local" type:@"_myapp._tcp" name:device.name port:59840];
NSData *data = [NSNetService dataFromTXTRecordDictionary:[NSDictionary dictionaryWithObject:conf.localDbname forKey:@"path"]];
[self.netService setTXTRecordData:data];
[self.netService publish];

Replace “myapp” and 59840 with values of your choosing and note that it is advisable to choose a better service name than simply the device name as I have done in this example.

For discovery, you create an NSNetServiceBrowser and search for hosts of your service type:

self.browser = [[NSNetServiceBrowser alloc] init];
self.browser.delegate = self;
[self.browser searchForServicesOfType:@"_myapp._tcp" inDomain:@"local"];

You will be notified of any matches by implementing the following NSNetServiceBrowserDelegate protocol callback:

- (void)netServiceBrowser:(NSNetServiceBrowser *)netServiceBrowser didFindService:(NSNetService *)netService moreComing:(BOOL)moreServicesComing
{
  [self.services addObject:netService];
  if (! moreServicesComing) {
    [self.tableView reloadData];
  }
}

In this example, I’ve added the service to an array. This could be an array that is driving a UITableView, for example. (The code examples here are based on a complete Bonjour browser example available on the iOS Dev Center, which includes a browsing UI as well as discovery and resolution of Bonjour services.)

As Jens Alfke correctly points out in the comments, it is important to implement the companion method netServiceBrowser:didRemoveService:moreComing: as well in order to remove a service from the list when it disappears:

- (void)netServiceBrowser:(NSNetServiceBrowser *)netServiceBrowser didRemoveService:(NSNetService *)netService moreComing:(BOOL)moreServicesComing
{
  [self.services removeObject:netService];
  if (! moreServicesComing) {
    [self.tableView reloadData];
  }
}

Once a service is selected in this table view, we try to resolve it:

- (void)tableView:(UITableView *)tableView didSelectRowAtIndexPath:(NSIndexPath *)indexPath
{
  NSNetService *service = [self.services objectAtIndex:indexPath.row];
  [service setDelegate:self];
  [service resolveWithTimeout:0.0];
}

Finally, we implement the relevant part of the NSNetServiceDelegate protocol to handle the resolved address. This is where we update the sync settings for our app, which is encapsulated in the call to [self updateSyncURL:url] in this example. This would be the same updateSyncURL: present in the TouchDB example apps.

- (void)netServiceDidResolveAddress:(NSNetService *)service {
  // Construct the URL including the port number
  // Also use the path, username and password fields that can be in the TXT record
  // (inet_ntop below requires #import <arpa/inet.h>)
  NSDictionary* dict = [NSNetService dictionaryFromTXTRecordData:[service TXTRecordData]];
  NSString *host = [service hostName];
  NSString* user = [self copyStringFromTXTDict:dict which:@"u"];
  NSString* pass = [self copyStringFromTXTDict:dict which:@"p"];
  NSString* portStr = @"";
	
  // Note that [NSNetService port:] returns an NSInteger in host byte order
  NSInteger port = [service port];
  if (port != 0 && port != 80) {
    portStr = [[NSString alloc] initWithFormat:@":%ld", (long)port];
  }

  NSString* path = [self copyStringFromTXTDict:dict which:@"path"];
  if (!path || [path length]==0) {
    path = [[NSString alloc] initWithString:@"/"];
  } else if (![[path substringToIndex:1] isEqual:@"/"]) {
    NSString *tempPath = [[NSString alloc] initWithFormat:@"/%@",path];
    path = tempPath;
  }
	
  NSString *ipAddress = nil;
  for (NSData* data in [service addresses]) {
    char addressBuffer[100];
    struct sockaddr_in* socketAddress = (struct sockaddr_in*) [data bytes];
    int sockFamily = socketAddress->sin_family;
    if (sockFamily == AF_INET /* || sockFamily == AF_INET6 */) {
      const char* addressStr = inet_ntop(sockFamily,
                                         &(socketAddress->sin_addr), addressBuffer,
                                         sizeof(addressBuffer));
      int port = ntohs(socketAddress->sin_port);
      if (addressStr && port) {
        NSLog(@"Found service at %s:%d", addressStr, port);
        ipAddress = [NSString stringWithCString:addressStr encoding:NSASCIIStringEncoding];
      }
    }
  }

  NSString* url = [[NSString alloc] initWithFormat:@"http://%@%@%@%@%@%@%@",
                   user?user:@"",
                   pass?@":":@"",
                   pass?pass:@"",
                   (user||pass)?@"@":@"",
                   ipAddress?ipAddress:host,
                   portStr,
                   path];
	
  NSLog(@"service: %@", service);
  NSLog(@"url: %@", url);
  [self updateSyncURL:url];
}

The method above relies on one simple helper method to read values from the service’s Bonjour TXT record:

- (NSString *)copyStringFromTXTDict:(NSDictionary *)dict which:(NSString*)which {
  // Helper for getting information from the TXT data
  NSData* data = [dict objectForKey:which];
  NSString *resultString = nil;
  if (data) {
    resultString = [[NSString alloc] initWithData:data encoding:NSUTF8StringEncoding];
  }
  return resultString;
}

As mentioned above, this Bonjour code is mostly taken from Apple’s BonjourWeb example code, with some minor changes. I’ve added the path component to the TXT record to broadcast which database to replicate with. I’ve also commented out the AF_INET6 socket family part because it did not work with the replication. For the same reason I’m using the IP address in the URL rather than the host name, which also did not yield a working connection.
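The URL assembly in netServiceDidResolveAddress: is easier to follow in isolation. Here is a minimal sketch of the same rules in Python, not the code from the app: the function name `build_sync_url` is made up, and percent-encoding the credentials is my own addition that the Objective-C version does not do.

```python
from urllib.parse import quote


def build_sync_url(host, port, path=None, user=None, password=None):
    """Assemble an http:// URL from the pieces of a resolved Bonjour service."""
    # Credentials are optional; percent-encode them so ':' or '@' cannot
    # break the URL (an addition that is not in the original method).
    credentials = ""
    if user or password:
        credentials = quote(user or "", safe="")
        if password:
            credentials += ":" + quote(password, safe="")
        credentials += "@"

    # Omit the port when it is unknown (0) or the HTTP default (80).
    port_part = "" if port in (0, 80) else ":%d" % port

    # An empty path becomes "/", and a relative path gets a leading slash.
    if not path:
        path = "/"
    elif not path.startswith("/"):
        path = "/" + path

    return "http://%s%s%s%s" % (credentials, host, port_part, path)
```

For example, `build_sync_url("192.168.1.23", 59840, "mydb")` yields `http://192.168.1.23:59840/mydb`, which is the kind of URL handed to updateSyncURL: above.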

Hopefully this post will help people get started with TouchDB peer-to-peer replication!

Monday
April 2, 2012

Asynchronous, lazy initialization with synchronous accessor

I’ve come to love Grand Central Dispatch and blocks for making it so easy to add asynchronous tasks to your application. Without the overhead of instantiating thread classes or defining callback methods, you can send a task to the background and keep your main thread unblocked.

However, sometimes you need a mix of synchronous and asynchronous tasks or more specifically you want to start something asynchronously initially and to block and wait for its completion elsewhere in your code. One example of this could be a unit test of an asynchronous algorithm where you need synchronous access to the results for validation.

Another example is a current project of mine which involves plotting of medical data that is parsed from CSV files. There are four CSV files and each takes about a second to parse. It’s not long but when you try and do it on demand when a plot is about to be displayed on screen, you find that blocking your main thread for a second can be very annoying. The obvious solution is to do the parsing on a background queue but that immediately raises the question: How do you then handle the plotting? Do you show an empty plot which populates later, when the data is available? That doesn’t look good. Another alternative would be to make the whole display plot action asynchronous. But then you’ve decoupled user interaction (user taps a button to bring up a plot) and GUI action (plot actually displays) and will probably find that users tap multiple times until the plot shows up.

Ideally then, the data would be loaded early on and the actual plotting would be synchronous. In my application, the data is loaded and parsed asynchronously in the initializer of a singleton which is used throughout the application for global data. Therefore, as soon as my global is being accessed for the first time the data gets loaded in the background. I can then afford to use blocking access to the data, because there is no (or very little) chance for the user to activate the GUI to display the plot before the data has been parsed. And even if they do, the processing is far along and the delay minimal.

So in summary, the requirements for my use case are:

  • Several initialization tasks need processing
  • Processing can happen in parallel
  • Processing must not block the main thread
  • Access to the results should block while processing is in progress

Here is how it’s implemented:

First off, we have an initializer that does our parsing, slowInitForKey in the example code. The idea here is that initialization work is based on a key (e.g. a filename) and returns a single result object that can be stored in a results dictionary.

Next we define a singleton Globals which is instantiated early on in our code, for example in viewDidLoad and has the following init method:

- (id)init {
  self = [super init];
  if (self) {
    valuesSerialQueue = dispatch_queue_create("valuesSerialQueue", NULL);
    self.values = [NSMutableDictionary dictionary];

    [[NSArray arrayWithObjects:@"A", @"B", @"C", @"D", @"E", @"F", nil]
     enumerateObjectsUsingBlock:^(id obj, NSUInteger idx, BOOL *stop) {
      dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^{
        NSString *value = [self slowInitForKey:obj];
        dispatch_async(valuesSerialQueue, ^{
          [self.values setObject:value forKey:obj];
        });
      });
    }];
  }
  return self;
}

What does this do?

  • First, we set up a serial queue and a dictionary for the results. We use a serial queue to make sure that only one thread will access the values dictionary at a time. Think of it as a locking mechanism in GCD terms.
  • Next, we iterate over the initialization keys (A-F in this example – these would be filenames in the CSV parsing example). Each key we send to a concurrent dispatch queue for parallel processing of slowInitForKey:. After processing is finished, the result is written to the dictionary via an async dispatch to our serial queue valuesSerialQueue. Again, this ensures that no two threads access the values dictionary at the same time.
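For readers who do not use GCD, the same initialization pattern can be sketched in Python. This is an analogy under stated assumptions, not the code from the example project: a thread pool stands in for the concurrent global queue, a lock stands in for the serial queue, and `slow_init_for_key`, `loaded_keys` and `wait_until_loaded` are made-up names.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def slow_init_for_key(key):
    """Stand-in for an expensive parse; returns one result per key."""
    time.sleep(0.05)
    return "value for " + key


class Globals:
    def __init__(self, keys):
        self._lock = threading.Lock()   # plays the role of the serial queue
        self._values = {}
        # Worker threads play the role of the concurrent global queue.
        self._pool = ThreadPoolExecutor(max_workers=4)
        for key in keys:
            self._pool.submit(self._load, key)  # returns immediately

    def _load(self, key):
        value = slow_init_for_key(key)  # runs in parallel, off the calling thread
        with self._lock:                # only one thread touches the dict at a time
            self._values[key] = value

    def loaded_keys(self):
        with self._lock:
            return sorted(self._values)

    def wait_until_loaded(self):
        """Block until every submitted task has finished (handy for tests)."""
        self._pool.shutdown(wait=True)
```

Constructing `Globals(["A", "B", "C"])` returns right away while the values fill in on the worker threads, which is the behaviour the init method above is after.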

Now that initialization is on its way, all that’s left is the synchronous access to the results. This is pretty simple:

- (NSString *)valueForKey:(NSString *)key {
  __block NSString *result = nil;
  do {
    // keep polling until there’s a value
    dispatch_sync(valuesSerialQueue, ^{
      result = [self.values objectForKey:key];
    });
  } while (result == nil);
  return result;
}

All we do is simply poll the results dictionary via the serial queue until there is a value. Of course you need to make sure that your initializer will always set a result – otherwise you would block forever. A safer way would be to set a time limit on how long you block before you eventually break from this method and return nil.

If you are worried that there may be a lot of polling going on until there is a result you could add a little delay after each unsuccessful poll. It’s probably irrelevant though, because polling only happens until initialization is finished.
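Both suggestions, the time limit and the small delay between polls, fit in a few lines. Here is a sketch in Python rather than Objective-C, with a lock-guarded dictionary standing in for the values dictionary behind the serial queue; `wait_for_value` and its parameters are made up for this illustration.

```python
import threading
import time


def wait_for_value(values, lock, key, timeout=2.0, poll_interval=0.01):
    """Poll a lock-guarded dict until key appears; return None after timeout seconds."""
    deadline = time.monotonic() + timeout
    while True:
        with lock:                      # same guard the writers use
            result = values.get(key)
        if result is not None:
            return result
        if time.monotonic() >= deadline:
            return None                 # give up instead of blocking forever
        time.sleep(poll_interval)       # small delay between unsuccessful polls
```

The caller then has to handle a nil/None result, which is the price for never blocking indefinitely.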

An example project is available on GitHub.

 

Tuesday
February 7, 2012

iOS User Accounts

Wouldn’t it be convenient if you could pick up any iPhone or iPad and have it personalized with your settings quickly? This is something that occurred to me last week when my girlfriend had left her iPhone at home and wanted to continue reading her book in iBooks. I had my iPad with me but of course it is tied to my iTunes account, not hers, and it’s way too much hassle to reconfigure it just for a brief reading session.

But it made me wonder what that feature could look like on iOS and what it would take to make it happen. Basically, you’d want an extension of something that’s already possible on OS X: signing in with an Apple ID. Once you’re authenticated with your Apple ID, your content and settings are only a few steps away: iCloud, if you’re using it, has got it and in theory, that’s all you need to restore your device.

I’ve upgraded quite a few devices in the past and so far backup and restore has worked really well. Now imagine there were an (optional) login screen on iOS devices where you could log in to your iCloud account and immediately you’d get your home screen, with your content and settings trickling in in the background - just like it’s happening now when you restore through iTunes or from iCloud. With future devices having more storage space, the OS could cache multiple user accounts so that on subsequent logins your data would only need an update rather than a completely fresh pull. Also, you can imagine some things like big apps being referenced from multiple accounts and therefore needing to be stored only once on a device and not per account.

If that use-case still sounds esoteric to you, because your iPad is yours alone, think about places where iPads could be shared by larger audiences: Schools, universities, sales people, etc. For example, if a school wanted to start using iPads in one course only, say their biology class, they’d only need to get enough iPads for their maximum class size, not for the total number of students attending that class. (Caveat: no iPad based learning at home unless students log in using their private iPad.) Or there could be iPads per course that wouldn’t need to be moved: Your course material appears at your desk wherever you are - you don’t actually carry it there anymore. It would certainly help reduce the risk of iPads being dropped between classes or on the bus.

Technically, I would assume something like that is being investigated or is even in place already at Apple. It’s probably just a matter of broadband connections catching up to make this a smooth experience, one that Apple would be willing to ship and tout as a new feature.

Monday
January 2, 2012

autotm 0.94 supports local backups

As introduced in a previous blog post, autotm is an OS X system daemon that automatically switches Time Machine targets depending on their availability. The initial version of autotm only supported network-based targets but I’ve recently updated the script to also allow locally connected disks (e.g. USB). This update requires some minor changes to your autotm.conf file: the server section is now called “destinations” and each destination has a “type”, which can be remote or local. For example:

destinations:
 - type: remote
   hostname: myhomeserver.local
   username: jdoe
   password: s3cr3t
 - type: remote
   hostname: myofficeserver.local
   username: john_doe
   password: pa55
 - type: local
   volume: /Volumes/Time Machine

To learn more about autotm, have a look at the Readme on GitHub. Please file any problems you encounter on the issue tracker at GitHub.

Thanks to Andy and Daniel for their help in testing this release!

Monday
December 19, 2011

CouchDB Migrations

A few weeks ago I attended CouchConf in Berlin and during the sessions (and in between) one topic was raised several times: How to migrate data between “schemas” or document versions. I described how we are migrating documents and I want to take a moment to explain the process in more detail. It might sound trivial but there was interest in the description during the conference, so I’m hoping it may prove helpful for others nonetheless.

Since CouchDB is inherently schema-less, there’s no global schema with which to control your data’s structure. That’s often a good thing, because it gives you flexibility, but it can also cause problems, for example when you want to access documents without guarding against all the different “versions” of a document you might have.

For example, say you have started out with an initial player document (we’re sticking with the RPG theme set in the Couchbase examples ;)):

{
  "version": 1,
  "name": "Player A",
  "xp": 1234
}

but you find during testing that you need to know a player’s level. You’ve decided that the level should always be xp/100 + 1 (using integer division, so 1234 xp gives level 13) but you don’t want to recompute this all the time in code but rather store it in the document directly. For various other reasons you’ve also decided against creating a view and therefore you want to migrate all your documents to this format:

{
  "version": 2,
  "name": "Player A",
  "xp": 1234,
  "level": 13
}

Note that the initial document already included a version attribute that we’re using to keep track of our migrations but even if this weren’t the case from the start, it’s easy to simply treat documents without a version attribute as “version 0” so to speak and handle them similarly to the rest of this example.

So how do we migrate from version 1 to version 2 then?

The idea is to create a view that shows all documents still at the old version and process them until the view has no more items. The view would be defined with the following (trivial) map function:

function(doc) {
  if (doc.version && doc.version == 1) {
    emit(doc._id, null);
  }
}

Now it’s simply a matter of processing all items in this view, for example with the following python-couchdb method that takes a database object as a parameter:

def migrate_v1_v2(db):
  v1 = db.view('_design/migration/_view/v1')
  for row in v1.rows:
    doc = db[row.key]
    if doc['version'] == 1:
      doc['version'] = 2
      # add the level stat: integer division of xp by 100, starting from level 1
      doc['level'] = doc['xp'] // 100 + 1
      db[doc.id] = doc

and where “v1” is the name of the view we defined above.

The complete example in the form of a unit test is available on GitHub. The only dependency is python-couchdb. It should be trivial to translate this pattern to other client libraries. It might also be useful to extend this concept to a migration framework à la Ruby on Rails.
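To give an idea of what such a framework might look like, here is a minimal sketch that chains migration steps by version number. It works on plain dictionaries so it runs without a database; the registry `MIGRATIONS`, the function names and the "version 0" step are made up for this example and are not part of the code on GitHub.

```python
def v0_to_v1(doc):
    # Hypothetical step: documents without a version are stamped as version 1.
    doc["version"] = 1


def v1_to_v2(doc):
    # The migration from this post: store the level derived from xp.
    doc["level"] = doc["xp"] // 100 + 1
    doc["version"] = 2


# Maps a document's current version to the step that upgrades it by one.
# Every step must bump the version, otherwise migrate() would never finish.
MIGRATIONS = {0: v0_to_v1, 1: v1_to_v2}


def migrate(doc):
    """Apply migration steps in order until no step is registered for the version."""
    migrated = dict(doc)  # work on a copy; the caller decides when to save
    while migrated.get("version", 0) in MIGRATIONS:
        MIGRATIONS[migrated.get("version", 0)](migrated)
    return migrated
```

With python-couchdb you would fetch each document from the migration view, pass it through `migrate` and write the result back, much like `migrate_v1_v2` does for a single step.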