The Guinea Pig in the Cocoa Mine

Underground Cocoa Experiments

Contentless SQLite FTS4 Tables for Large Immutable Documents


TL;DR? If you use SQLite FTS4 for indexing large immutable documents, you might want to consider contentless tables. With minor adjustments, the database could be 2 to 3 times smaller.

Gus Mueller recently detailed how Acorn 5’s Help menu implements live searching of all its online documentation. The actual help content is huge and hosted on Acorn’s website. To provide local search from within the app, a small SQLite database is embedded in Acorn. The database uses the brilliant FTS4 extension module, which lets you create virtual tables for super-fast full-text search. Combine this with NSUserInterfaceItemSearching, and you get a clever yet simple setup for a powerful Help menu that seamlessly connects Acorn and its extensive online documentation (did I mention Acorn 5 is awesome?).

Contentless FTS4 Tables

I just happen to have been knee-deep in FTS4 last month while working on full-text search for the Findings app (of course leveraging Gus’ own FMDB). One section of the SQLite documentation in particular caught my attention: contentless FTS4 tables. As the name implies, such a table does not store a copy of the content. Yet, just like contentful FTS4 tables, you can perform efficient full-text search on that content. For instance, type this:

SELECT docid FROM data WHERE body MATCH 'bar'

… and you get the list of docids for all the documents containing the word “bar” somewhere in the body column.

Behind the scenes, SQLite creates various data structures for the index, and they are the same for contentful and contentless tables. In contrast, the ‘content’ part of a contentless table is simply empty:

Contentful vs contentless.
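In SQL terms, the only difference at creation time is the content option. Here is a minimal sketch using FMDB (mentioned above); the single body column follows the earlier query, and the rest of the setup is my own assumption, not necessarily the schema Acorn uses:

#import "FMDB.h"

FMDatabase *db = [FMDatabase databaseWithPath:@"index.db"];
[db open];

// contentful: FTS4 stores the full-text index and a complete copy of the indexed text
[db executeUpdate:@"CREATE VIRTUAL TABLE data USING fts4(body)"];

// contentless: setting the content option to an empty string keeps only the index
// (in practice you would pick one or the other for a given database, of course)
[db executeUpdate:@"CREATE VIRTUAL TABLE data USING fts4(content='', body)"];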

This difference is particularly relevant when indexing large documents that exist outside a database, such as a collection of web pages or PDFs. Normal contentful FTS4 tables store the entire content of those documents in the database, which means this data is needlessly duplicated. This is not the case with contentless tables. They work more like the SearchKit framework on OS X, which focuses on building and querying an index and does not keep a copy of the documents (the client is responsible for that).

It should thus be obvious why contentless FTS4 tables could be more advantageous: they are potentially smaller than a contentful table. In theory. Is that really the case? And how much smaller, really? Let’s use the scientific method and run an experiment to answer those questions!

Contentless vs Contentful

As an example of content, I used the SQLite database that indexes the Acorn help content (it can be easily extracted from the Acorn app package). The database contains 187 entries, and occupies 1,028,096 bytes on disk (roughly 1 MB). That is pretty small if you consider that the actual Acorn help site is really quite extensive and weighs in at 524 MB! The index of course only deals with the text, but that is still roughly 400,000 characters (quick observation: that’s about half of the database size).

Using this dataset, I performed the following experiment:

  1. create a database using a “contentful” FTS4 table; I expect the database to be pretty much the same as the original; this is what scientists would call a positive control; it is here to make sure that whatever comparison I make with database (2) is valid; if Gus used some special pragma or a different SQLite build and I used his database for comparison, I would end up comparing apples to oranges;
  2. create a database using a contentless FTS4 table (note: the database also has an “info” non-FTS4 table, for reasons explained in the section “Querying Contentless Tables”);
  3. same as (1) but adding the data 10 times (1870 entries);
  4. same as (2) but adding the data 10 times (1870 entries).

Databases (3) and (4) are meant to extrapolate what happens with more entries. Since the exact same data is used 10 times, the ‘content’ will be 10 times bigger, while the ‘index’ part will probably not grow as much.¹

The exact script I have used is available for download. It is a Ruby script that expects the original SQLite database ‘HelpIndex.db’ in the same directory when you run it. It creates 4 databases (index1.db, index2.db, index3.db and index4.db). If you don’t want to run the script yourself, feel free to download index1.db and index2.db and compare their contents, e.g. using the great Base app.

Now, the results.

An image is worth five numbers.

Database       Size (bytes)   (% of contentful size)
--------       ------------   -----------------------
Original          1,028,096
1                 1,028,096
2                   442,368   (43%)
3                 9,089,024
4                 3,276,800   (36%)

First, database (1) is the exact same size as the original. Good job. Positive controls are very boring when all goes right. But then, of course, it only highlights the really interesting part: a very significant size reduction for contentless tables. Using a contentless table brings the database to less than half of its contentful size. For a larger database, the relative savings appear to be even larger (down to almost one third), in line with our expectations.

The conclusion? For indexing a collection of large documents, contentless FTS4 tables are worth considering. For indexing help pages, for instance, and depending on the number and size of the indexed documents, database size could be a significant fraction of an application package. In the case of Acorn, using a contentless table would lead to a 2% size reduction of the app. If Gus decided to switch to a downloadable index instead of packaging it with the app, it would save bandwidth and provide faster updates to the user. As the help site grows, the savings would potentially be even larger. Small is beautiful.

Now, to be fair, contentless tables are not all rosy. There are still a couple of drawbacks that I will cover in the next sections. Fortunately, in the use case we are considering here, only the first one is relevant and it is easy to work around.


Querying Contentless Tables

With contentless FTS4 tables, the only values you can get back when running a full-text query are the docids for the matches. It is an error to attempt to retrieve the values of any other column. Here is a comparison of what happens with contentful and contentless tables that have the same column setup:

When querying a contentless table, you can only get docids.
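Concretely, assuming the contentful table also declares title and url columns (my guess at the setup), this is the difference:

// contentful table: any column can be read back along with the match
FMResultSet *full = [db executeQuery:@"SELECT docid, title, url FROM data WHERE body MATCH 'bar'"];

// contentless table: only docid can be retrieved; asking for any other column is an error
FMResultSet *bare = [db executeQuery:@"SELECT docid FROM data WHERE body MATCH 'bar'"];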

The docids are not useful on their own. In the case of Acorn, we need to display the title of the entry in the Help menu and we need the url to open the page in a browser if the user chooses that entry. Since the original content is not stored in the contentless table, the information is simply not available.

There are different ways to work around this limitation. A simple solution, which I am describing here, is to store the url and title in a separate, normal, non-FTS4 table, populating it in parallel with the FTS4 table so that the rowids match. Here are the statements for creating the contentful and contentless tables, and then populating them (the latter statements are left as an exercise; this is the setup I used in the above experiment where I compared the database sizes, so you can also just check the script):

Let’s add an ‘info’ table to complement the contentless table.
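In code, the contentless side of that setup looks roughly like this (a sketch with FMDB; the exact statements are in the figure above and in the script):

// the contentless FTS4 table stores only the full-text index of the body
[db executeUpdate:@"CREATE VIRTUAL TABLE data USING fts4(content='', body)"];

// a plain, non-FTS4 table stores what we need to display, with rowids matching the docids
[db executeUpdate:@"CREATE TABLE info (title TEXT, url TEXT)"];

// populating the two tables in parallel, one document at a time
// (title, url and body are the NSString values extracted from one help page)
[db executeUpdate:@"INSERT INTO info (title, url) VALUES (?, ?)", title, url];
long long docid = [db lastInsertRowId];
[db executeUpdate:@"INSERT INTO data (docid, body) VALUES (?, ?)", @(docid), body];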

When querying the full-text index of the contentless table, we still only get docids, but we can now use those to get at the information we need: the title and the url. This adds a bit of complexity to the query, and makes it slower. I settled on using two consecutive queries, but I am no SQL ninja, and a faster approach probably exists (please let me know!). Given the type of data we consider here, the query will still be very fast (we’re not indexing the HHGTTG). Here is how it goes, again comparing contentful and contentless tables:

A simple 2-step query for a contentless table.
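With FMDB, the 2-step version looks roughly like this (my reconstruction of what the figure shows, using hypothetical variable names):

// step 1: full-text search on the contentless table, which can only return docids
NSString *searchString = @"bar";
NSMutableArray *docids = [NSMutableArray array];
FMResultSet *matches = [db executeQuery:@"SELECT docid FROM data WHERE body MATCH ?", searchString];
while ([matches next]) {
    [docids addObject:@([matches longLongIntForColumnIndex:0])];
}

// step 2: look up the title and url in the plain 'info' table for each matching docid
for (NSNumber *docid in docids) {
    FMResultSet *entry = [db executeQuery:@"SELECT title, url FROM info WHERE rowid = ?", docid];
    if ([entry next]) {
        NSString *title = [entry stringForColumn:@"title"];
        NSString *url   = [entry stringForColumn:@"url"];
        // hand title and url over to the Help menu / NSUserInterfaceItemSearching here
    }
}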

Update Sept 14, 2015: Evan Schoenberg (from the Disqus comment thread at the end of this post) suggests the following single SELECT statement for the query:

SELECT info.url, info.title FROM info, data WHERE (data.body MATCH 'bar') AND (info.rowID = data.docid)


Deleting Content in Contentless Tables

Unlike in contentful tables, it is not possible to delete or update entries in a contentless table. Once you index data, it cannot be altered, and will remain part of any query you do now and in the future. The reason for this limitation is the way an FTS4 index is stored internally. It is optimized to get the rows from a list of tokens (searched words), but not the other way around. To prune a row without knowing which tokens were indexed, SQLite would have to pretty much scan the entire index searching for the rowid. With a contentful table, SQLite can check the content for that row, tokenize it again, quickly get to the relevant tokens in the index and alter things as needed.

It is possible to work around this by keeping track of obsolete docids and adding a new entry for each modification. Then, when you get results from a full-text search, you ignore the entries corresponding to obsolete data. I experimented for a while with such a setup, but Findings documents are heavily edited as the user works on them, and the search index needs to be updated on a regular basis. The table then keeps growing, which defeats the initial purpose of having a smaller database. In the end, a contentful table is not that big anyway, even with hundreds of documents in the Findings library. For better or for worse, Findings documents are mutable, so contentless tables are not a good fit. This is very different from what I covered here…
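For the record, here is a sketch of that obsolete-docids idea (again with FMDB and placeholder variables; this is not what Findings ended up shipping):

// a plain table keeps track of FTS4 entries that have been superseded
[db executeUpdate:@"CREATE TABLE obsolete (docid INTEGER PRIMARY KEY)"];

// when a document is edited: mark the old entry as obsolete, then index the new text
[db executeUpdate:@"INSERT INTO obsolete (docid) VALUES (?)", @(oldDocid)];
[db executeUpdate:@"INSERT INTO data (docid, body) VALUES (?, ?)", @(newDocid), newText];

// when searching: ignore the entries that correspond to obsolete data
FMResultSet *results = [db executeQuery:
    @"SELECT docid FROM data WHERE body MATCH ? "
    @"AND docid NOT IN (SELECT docid FROM obsolete)", searchString];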

Going back to our use case: we don’t care about deletion. For immutable documents, this limitation of contentless FTS4 tables is completely irrelevant.

Where does that leave us, then? My earlier experiment showed that the relative space savings for large documents can be significant. When indexing help pages for instance, it is thus an alternative really worth considering. Contentless SQLite FTS4 tables + large immutable documents = Yay!

[Update Sept 8: Comment thread on HN]



1. Using the exact same data 10 times seemed like the best I could do for the scope of this post. Of course, it is a very artificial dataset: in the real world, nobody will ever index the same documents 10 times (right?). Using real data, though, would have made it harder to compare (3)/(4) with (1)/(2), since we would have to use a different corpus of data, and that could introduce artifactual differences (one set of data could be inherently easier to index, independent of the size). Still, the problem with my approach here is that the index will be smaller than it would be with real data, because we are not introducing new tokens with the additional documents. But hey, counter-argument: with real data, new tokens become increasingly less frequent as you index more documents, because most words will have already appeared in previous documents. OK, ideally, I’d scour the web for all kinds of representative datasets, and index all of these. Alas, I only had time to write this footnote as an excuse, so let’s just move on.

ASCIImage: NSConf Slides, Editor and More

My previous post presenting ASCIImage received a lot more attention than it deserved (#1 on Hacker News and reddit/programming, holy cow!), and I am truly humbled. The response has been overwhelmingly positive, with lots of excitement, oohs, wows and aaahs. That’s an incredibly fun experience for me.

Here are a few related items to follow-up on all this:

  • My slides from NSConference last week, the best conference I have ever attended: slides in Keynote format (also in pdf). Check out the picture taken by @danielpunkass below: isn’t that the best stage ever?
  • I put together with my friend @mz2 a landing page at asciimage.org
  • On this page, you’ll find an editor to play with ASCIImage (OS X only!)
  • If you are excited by ASCIImage, you’ll be blown away by MonoDraw, an ASCII art editor. I am not affiliated in any way with the app, I just love the concept and the execution is incredible. You can try it out for free for the duration of the beta (I am not even sure why it’s still only in beta, given the high level of polish and stability already!).

Last shameless plug: I make my living from the sales of Findings, a lab notebook app for scientists and researchers. The best way to support me is to check it out and pass the information to your friends working in science or in research. Thanks!


Credits: @danielpunkass

Replacing Photoshop With NSString


Hello! This post somehow got a lot of attention. Thanks for visiting! If you like it, it would be awesome if you’d check my app Findings, a lab notebook app for scientists and researchers, and let others know about it.

An app is not just made of code. It also contains static assets like images and sounds. Images are typically created and edited with dedicated tools like Acorn (my favorite), Pixelmator, or the 800-pound gorilla, Photoshop. Ideally, the graphics are handled by an actual designer, which really is one of the best things we did for our app Findings. But as a developer, it can be tedious to have to use a separate tool or involve another person, when all you need is a simple little icon with just a few straight lines, a square or a circle. Because of “retina”, you also have to create separate files for 1x, 2x, and now 3x-scale versions of the same drawing. Any small change or the addition of small variants can quickly become a cumbersome and error-prone endeavour.

I am a programmer, I can surely draw those in code!

What’s a developer to do? Write code! I don’t remember the first time I decided to draw an image directly in code, but it seemed like a good idea at the time. From a developer’s perspective, it is very tempting. Why use Photoshop when you have the most flexible tool ever: code? Photoshop was written in code, so whatever Photoshop is doing, code can do! Alas, in practice, this is only a reasonable approach for very simple graphics. And even then, it is not a straightforward task, and it is not quite the amount of fun I had naively hoped for. I will first show you an example of what it entails, but fear not, I also have an alternative fun solution right after that.

Way too much code

As promised, here is an example of one of those first times I actually drew an image using Objective-C. Brace yourself:

// chevron is defined by 3 points, the angle is always 90 degrees
// 
// A 
//   # 
//     # 
//       B 
//     # 
//   # 
// C 

CGFloat rightMargin = 12.0;
CGFloat chevronHeight = 9.0; // then chevronWidth = chevronHeight/2
CGFloat lineWidth = 2.0;
NSRect bounds = [self bounds];
NSPoint middle = NSMakePoint(NSMaxX(bounds)-rightMargin-lineWidth/2.0,
                                  (NSMinY(bounds)+NSMaxY(bounds))/2.0);
NSPoint top = middle;
top.x -= chevronHeight/2.0;
top.y += chevronHeight/2.0;
NSPoint bottom = top;
bottom.y -= chevronHeight;

// draw the chevron in grey
NSBezierPath *chevronPath = [NSBezierPath bezierPath];
[chevronPath setLineWidth:lineWidth];
[chevronPath setLineJoinStyle:NSMiterLineJoinStyle];
[chevronPath setLineCapStyle:NSButtLineCapStyle];
[chevronPath moveToPoint:top];
[chevronPath lineToPoint:middle];
[chevronPath lineToPoint:bottom];
NSColor *chevronColor = [NSColor colorWithCalibratedWhite:0.4 alpha:1.0];
[chevronColor set];
[chevronPath stroke];

Wow, that is a lot of code for just drawing two lines at a 90-degree angle! And that is not even including the actual NSImage code. It is nice that I can easily change the color and the size, and that I get 1x, 2x and 3x in one go. But was all this code really worth the trouble? After this first experience, I was not sold, but still used that approach on a few more occasions where very simple graphics were needed. It got a little easier as I gained experience, and the invested time paid off, but I remained frustrated by the situation. After a while, though, I realized that the most interesting part of the code was actually the ASCII art I was using as a guide to my drawing code:

//    A 
//      # 
//        # 
//          B      <-- I WANT TO WRITE JUST THAT,
//        #            NOT THE REST OF THE CODE!
//      # 
//    C 

This “drawing” described very nicely what I wanted to do, better than any comment I could ever write for any kind of code, in fact. That ASCII art was a great way to show directly in my code what image would be used in that part of the UI, without having to dig into the resources folder. The actual drawing code suddenly seemed superfluous. What if I could just pass the ASCII art into NSImage directly?

ASCIImage: combining ASCII art and Kindergarten skills

Xcode does not compile ASCII art, so I decided I would write the necessary ‘ASCII art compiler’ myself. OK, I did not write a compiler, but a small fun project called ‘ASCIImage’! It works on iOS and Mac as a simple UIImage / NSImage category with a couple of factory methods. It is open-source and released under the MIT license on GitHub. I also set up a landing page with a link to an editor hacked together by @mz2 in just a few hours during NSConference: asciimage.org.

It is very easy to use and has limited capabilities. It is not just a toy project, though. I have been using it in a real app for the past year: Findings. But whatever you do, here is a good rule of thumb: as soon as you feel limited by it, you should fire up Acorn instead, or better yet, contact a designer.

Here is how you would use ASCIImage to draw a 2-point-thick chevron:

+ (UIImage *)chevronImageWithColor:(UIColor *)color
{
    NSArray *asciiRep =
    @[
      @"· · · · · · · · · · · ·",
      @"· · · 1 2 · · · · · · ·",
      @"· · · A # # · · · · · ·",
      @"· · · · # # # · · · · ·",
      @"· · · · · # # # · · · ·",
      @"· · · · · · 9 # 3 · · ·",
      @"· · · · · · 8 # 4 · · ·",
      @"· · · · · # # # · · · ·",
      @"· · · · # # # · · · · ·",
      @"· · · 7 # # · · · · · ·",
      @"· · · 6 5 · · · · · · ·",
      @"· · · · · · · · · · · ·",
      ];
    return [self imageWithASCIIRepresentation:asciiRep
                                        color:color
                              shouldAntialias:NO];
}

And below are the images that will be generated depending on the drawing environment:

ASCIImage results from chevron ASCII art

On iOS, the 1x/2x/3x versions will be generated based on the screen resolution of the device on which the app is running. On the Mac, the ASCIImage implementation uses the NSImage block API, which means the drawing will happen at the right resolution the moment the image is rendered on screen. Note that I disabled anti-aliasing in the example code (so only the images on the top row will be generated as needed). For this kind of shape, the rendering is actually sharper and looks better without anti-aliasing.

Behind the scenes, ASCIImage is doing simple, boring stuff. There are probably ways to make the parsing smarter and more user-friendly, but I just wanted things to work quickly without too much fuss and too much coding and debugging:

  • it strips all whitespace; this is why all pixels need to be marked somehow (I chose the character ‘·’ as the background in the example above);
  • it checks consistency: all rows should have the same length;
  • it parses the string to find digits and letters; everything else is ignored, namely the ‘·’ and ‘#’ characters in the example;
  • each digit/letter is assigned a corresponding NSPoint;
  • it creates shapes based on the good old “Connect the Dots” technique you learnt in Kindergarten;
  • each shape is turned into an NSBezierPath;
  • each Bezier path is rendered with the correct color and anti-aliasing flag.
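To make the parsing a bit more concrete, here is a rough sketch of the first steps, collecting the grid points occupied by each marker character (an illustration of the idea only, not the actual ASCIImage source; asciiRep is an array of strings like in the chevron example):

NSMutableDictionary *pointsByMarker = [NSMutableDictionary dictionary];
NSCharacterSet *markers = [NSCharacterSet alphanumericCharacterSet];
[asciiRep enumerateObjectsUsingBlock:^(NSString *row, NSUInteger y, BOOL *stop) {
    // strip all whitespace, so that each remaining character is one "pixel"
    NSString *pixels = [[row componentsSeparatedByCharactersInSet:
        [NSCharacterSet whitespaceCharacterSet]] componentsJoinedByString:@""];
    for (NSUInteger x = 0; x < pixels.length; x++) {
        unichar c = [pixels characterAtIndex:x];
        if (![markers characterIsMember:c]) {
            continue; // '·', '#' and anything else is ignored
        }
        NSString *marker = [NSString stringWithCharacters:&c length:1];
        NSMutableArray *points = pointsByMarker[marker] ?: [NSMutableArray array];
        // each vertex sits at the center of its 1x1-pt "pixel" (more on that in "Tricky Bits" below)
        [points addObject:[NSValue valueWithPoint:NSMakePoint(x + 0.5, y + 0.5)]];
        pointsByMarker[marker] = points;
    }
}];
// the collected points are then connected into shapes ("Connect the Dots")
// and each shape is turned into an NSBezierPath for rendering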

In the chevron example, there is just one shape, which is created and rendered as follows:

ASCIImage rendering steps

Basics

Here is a quick overview of ASCIImage usage. The valid characters for connecting the dots are, in this order:

1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P
Q R S T U V W X Y Z a b c d e f g h i j k l m n p
q r s t u v w x y z

Each shape is defined by a series of sequential characters, and a new shape is started as soon as you skip a character in the above list. So the first shape could be defined by the series ‘123456’, then the next shape with ‘89ABCDEF’, the next with ‘HIJKLMNOP’, etc. The simplest method +imageWithASCIIRepresentation:color:shouldAntialias: will draw and fill each shape with the passed color (there is also a block-based method for more options). Here is an example with 3 shapes:

ASCIImage example with 3 shapes

You can also draw straight lines by using the same character twice. In this case, you don’t need to skip a character before the next shape or line. Here is an example with a bunch of lines (remember, the ‘#’ are only here as a visual guide for when you look at your code, but are ignored by ASCIImage’s parser):

ASCIImage example with a bunch of lines

And you can combine shapes and lines, of course:

ASCIImage example combining shapes and lines

There are just 2 more special cases. You can create a single (square) pixel if you use an isolated character. And you can draw an ellipse by using the same character 3 or more times. The ellipse will be defined by the largest enclosing rectangle for the points. If the rectangle is a square, the ellipse is a circle:

ASCIImage example with ellipse
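For instance, this small representation of mine (not from the original figure) would draw a filled circle, because the four ‘1’ markers define a square enclosing rectangle, plus a single pixel for the isolated ‘5’:

NSArray *asciiRep =
@[
  @"· 1 · · · 1 · · ·",
  @"· · · · · · · · ·",
  @"· · · · · · · 5 ·",
  @"· · · · · · · · ·",
  @"· 1 · · · 1 · · ·",
];
UIImage *circleAndPixel = [UIImage imageWithASCIIRepresentation:asciiRep
                                                           color:[UIColor blackColor]
                                                 shouldAntialias:YES];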

And finally, a more elaborate composition showing how far you can get with it. This particular ASCII art is entering obfuscation territory, which clearly defeats the purpose. The fun is still there, though!

ASCIImage complicated example that draws a bug

That’s it for the basics!

Bells and whistles

There is a second factory method defined in ASCIImage:

/// This method offers more advanced options that can
// be set on each "shape", using the `contextHandler` block.
// The mutable dictionary passed by the block can be modified
// using the keys listed in the constants below. The dictionary
// initially contains the `ASCIIContextShapeIndex` key to
// signal which shape the context will be applied to.
+ (PARImage *)imageWithASCIIRepresentation:(NSArray *)rep
              contextHandler:(void(^)(NSMutableDictionary *ctx))handler;

/// keys for the dictionary context
extern NSString * const ASCIIContextShapeIndex;
extern NSString * const ASCIIContextFillColor;
extern NSString * const ASCIIContextStrokeColor;
extern NSString * const ASCIIContextLineWidth;
extern NSString * const ASCIIContextShouldClose;
extern NSString * const ASCIIContextShouldAntialias;

This method allows you to apply different settings to the drawing of each element of the graphic. This is done via a mutable dictionary passed as an argument to a block. Information goes both ways: from ASCIImage to you, and then from you to ASCIImage. You get the shape index (ordered based on the characters used in the ASCII art), and you set a stroke color, fill color, antialias flag, etc. Note that this context does not have much in common with an actual NSGraphicsContext. It is very limited, and unfortunately, it is not possible to directly manipulate NSGraphicsContext for the kind of drawing ASCIImage needs to do (or at least, there were enough gotchas that I decided against it).

Here is an example of how you could use the block-based method to layer multiple shapes on top of each other:

- (NSImage *)deletionImage
{
    NSArray *asciiRep =
    @[
      @"· · · · 1 1 1 · · · ·",
      @"· · 1 · · · · · 1 · ·",
      @"· 1 · · · · · · · 1 ·",
      @"1 · · 2 · · · 3 · · 1",
      @"1 · · · # · # · · · 1",
      @"1 · · · · # · · · · 1",
      @"1 · · · # · # · · · 1",
      @"1 · · 3 · · · 2 · · 1",
      @"· 1 · · · · · · · 1 ·",
      @"· · 1 · · · · · 1 · ·",
      @"· · · 1 1 1 1 1 · · ·",
      ];
    return [NSImage imageWithASCIIRepresentation:asciiRep
           contextHandler:^(NSMutableDictionary *context)
    {
        NSInteger index = [context[ASCIIContextShapeIndex] integerValue];
        if (index == 0)
        {
            context[ASCIIContextFillColor]   = [NSColor grayColor];
        }
        else
        {
            context[ASCIIContextLineWidth]   = @(1.0);
            context[ASCIIContextStrokeColor] = [NSColor whiteColor];
        }
        context[ASCIIContextShouldAntialias] = @(YES);
    }];
}

And here is the result:

ASCIImage drawing a white cross in a gray circle, using layered shapes of different colors

Here is now one that is pushing ASCIImage to its limits, but further shows how you can take advantage of layering basic shapes to create a more complex icon:

- (PARImage *)lockImage
{
    NSArray *asciiRep =
    @[
       @" · · · · · · · · · · · · · · · ",
       @" · · · · 1 · · · · · · 1 · · · ",
       @" · · · · · · · · · · · · · · · ",
       @" · · · · · · · · · · · · · · · ",
       @" · · · · · · · · · · · · · · · ",
       @" · · 3 · 1 · · · · · · 1 · 4 · ",
       @" · · · · · · · · · · · · · · · ",
       @" · · · · · · A · · A · · · · · ",
       @" · · · · 1 · · · · · · 1 · · · ",
       @" · · · · · · · C D · · · · · · ",
       @" · · · · · · A · · A · · · · · ",
       @" · · · · · · · · · · · · · · · ",
       @" · · · · · · · B E · · · · · · ",
       @" · · · · · · · · · · · · · · · ",
       @" · · 6 · · · · · · · · · · 5 · ",
    ];
    return [PARImage imageWithASCIIRepresentation:asciiRep
           contextHandler:^(NSMutableDictionary *context)
      {
          NSInteger index = [context[ASCIIContextShapeIndex] integerValue];
          if (index == 0)
          {
              context[ASCIIContextFillColor]   = [PARColor blackColor];
          }
          else
          {
              context[ASCIIContextFillColor]   = [PARColor whiteColor];
          }
          context[ASCIIContextShouldClose]     = @(YES);
          context[ASCIIContextShouldAntialias] = @(YES);
      }];
}

ASCII art obfuscation! The method name gives it away. Sort of. Here is how the string is parsed, shape after shape, layer after layer:

ASCIImage drawing a lock using multiple layered shapes of different colors

Again, not sure you’d want to go that far, but now you know you can!

Tricky Bits

Implementing ASCIImage was very straightforward, but there were still a few tricky bits:

  • “Filling” out a shape actually involves both a fill and a stroke on NSBezierPath (see the sketch after this list). To get proper pixel alignment, the vertices defining each bezier path are in fact set to the middle of the 1x1-pt “pixel” represented in the ASCII art (1 x 1 pt ends up being 3 x 3 pixels at 3x scale, for instance). When filling the path, the edges of the bezier path are thus drawn half a point away from the actual border. We then need to also apply a 1-point-wide stroke, with the same color, to fill the full intended shape.

To really fill, you need to fill… and stroke.

  • Without anti-aliasing, it is tricky to get the correct pixels to turn black. For this, I found that one should use a thicker line width for 45-degree lines, equal to the diagonal of a 1-pt square: the square root of 2. This width works fine for other angles, including horizontal and vertical lines, so the lines are drawn using this width for aliased rendering, instead of the 1-pt width used for anti-aliased rendering.
  • For tests, one needs to trick the system into believing that the scale is 1x, 2x or 3x. On iOS, ASCIImage has a special method with a scale argument, which is also used by the actual implementation (which simply passes the current device scale), ensuring that the same code path is in fact used. On OS X, it is trickier, in that the NSImage has to be rendered in a context where we control the “scale”. For this, the test actually renders the image returned by ASCIImage into… another NSImage, with the correctly-scaled dimensions, so we get an artificial 1x context at a scaled-up size.
  • The scaling on iOS and OS X is handled differently. On iOS, the bezier paths need to be drawn directly at the right pixel size, and the Y axis is upside down. On OS X, scaling is implicit, and drawing is done using points, not pixels.
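Going back to the first bullet, the fill-and-stroke trick boils down to something like this (a sketch; the helper name is hypothetical):

- (void)fillShapePath:(NSBezierPath *)path withColor:(NSColor *)color
{
    // the path vertices sit at the pixel centers, half a point inside the intended shape,
    // so a plain fill stops half a point short of the edge...
    [color setFill];
    [path fill];

    // ...and a 1-pt stroke in the same color covers the missing half point on each side
    [path setLineWidth:1.0];
    [color setStroke];
    [path stroke];
}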

If you are curious, you can check it on GitHub and see for yourself!

Comment thread on Hacker News

Comment thread on reddit

Replacing Photoshop With NSString – Charles Parnot from NSConference on Vimeo

NSConf Slides: The Kitchen Sink Database

I gave a blitz talk at NSConf 6 earlier this year. The video has just been posted on Vimeo, where anybody can see it (and my blitz talk from last year is there as well).

The video is of excellent quality. It is not too painful to watch myself, so I would say I am happy with the result. The only problem is that I wish more time had been spent showing the slides. To help you follow along with my ramblings, here is the Keynote file, which also has the presenter notes if you don’t even want to watch the video: Keynote slides for “The Kitchen Sink Database”.

There were many other great talks this year, all available on the NSConf 6 collection (and of course, all NSConf 5 videos are also available). For instance, you could watch the blitz talks “MPGestures: supervised multistroke gesture recognition”, “A Quick Intro To OSXFUSE” or “Core Data Sync with Ensembles”, where Drew McCormack officially launched Ensembles 1.0. There are still more talks to be added: keep an eye on macdevnet on Twitter.

Making of Findings

Findings has only been out for 8 days, and I am really proud of the launch, impressed by the response and excited about all the work that’s ahead. But before marching into the future, I thought I should look back into the past. While the core functionality of the app has remained the same, it is quite amazing to see how much of the look and the design of the app has changed over the years… I am a big fan of ‘making of’ posts on apps. I wish there were more of these, so here is one for Findings!

Today

Findings is an application for scientists that helps them keep track of their experiments and protocols. You can think of a protocol as a recipe, and an experiment as the actual instantiation of a series of recipes… or what normal people would call “dinner”. But scientists need to remember everything they did in detail, like timing, temperatures, volumes, etc… so they can write publications about what they did, and so they and others can reproduce it. If they do it right, knowledge progresses. If they don’t keep track of what they do, their work is for nothing. They need a lab notebook. It can be a paper notebook, or it can be Findings.

Now, let’s first have a look at Findings version 1.0 released last week (all the screenshots or mockups below can be zoomed in and out with one click). First, the experiment dashboard, where each experiment is represented by a card. Then, the content of an experiment you would see after double-clicking (“opening”) an experiment.

The title bar has all the navigation: on the left, the flask icon is used to get back to the dashboard. As you open experiments, individual tabs open for each (which you can reorder and close). There is also a special place for protocols, with a list / detail view, shown in the first screenshot. These protocols can be added to a calendar view to create an experiment, shown in the second screenshot (like you would drag an appetizer, entrée and dessert to create a meal):

I am very pleased with the look of Findings 1.0, which has been crafted by the wonderful designer wrinklypea (Marcello Luppi, also an ex-scientist, as you may be able to tell from his handle). The result has been recognized as a visually pleasing “iOS 7-like OS X app”.

Just as important as the look, the user experience has been refined and improved a lot since we started (more about that below), and has just the elements needed.

5 Years Ago

Now it is time to poke your eyes with all kinds of ugly sticks. That is also the fun part, and a great way to see all the progress made. Here are some very early designs. On the left, a mockup done in HTML for the main experiment library view. On the right, a mockup of the experiment content, done in Omnigraffle:

Yikes, that looks really old (and it is). The noticeable thing is that “Experiments” and “Protocols” are already there, as is the main logic of the app: you manage experiments, run them, organize them in “Notebooks” (called “Projects” in version 1). Experiments can be built from protocols. Experiments can be edited in a text editor. Some things are gone: collections, the inspector view, timers (these will come back!), locks in the text editor,…

The look was the classic source list from Mac OS X.

4 Years Ago

Coding started based on the above mockups, and Findings was becoming a real thing. This was interrupted by a switch to another project, Papers, the brainchild of Alexander Griekspoor.

But not before Alex and I, tired of the painful-to-see look of the early Findings builds, tried to make it look more like Papers. After all, Papers was very successful, had won an Apple Design Award, and had a tried-and-true user interface and user experience.

Here are a few attempts at what Findings would look like with a list view or a collection view (mockups done in Photoshop):

The source list was still there, although looking better than a year before. The collection view looked quite nice, actually. The text view of an experiment looked much better… but was totally ripped off the Ruby on Rails guides of the time (not sure what they look like nowadays). The mockups are Frankenstein-like, but that “redesign” helped make Findings look more attractive.

Around the same time, I also produced some mockups with some interesting ideas, and a more concrete calendar view (mockups done in Omnigraffle):

1 Year Ago

After a 3-year hiatus to work on Papers, Findings development was resumed. The workflow of the app was to remain the same as had been imagined 4 years before, but there was a clear need for a fresh start on the code base, and on the design.

One of the first things we did was to introduce the concept of experiment “cards”, displayed in a “dashboard”. It looks like a collection view, but the content of each card is not just a preview, but rather a summary of what is going on with this experiment. The overall dashboard gives you an overview of your week or of past weeks. We tried white and black backgrounds (mockups done in Omnigraffle + Acorn):

The next step was to finally get rid of the source list. For fun, here are a few of the initial mockups, generated over the course of a day, searching for inspiration from Coda (mockups done in Omnigraffle):

The design quickly settled on something more restrained, but still based on the “classic” Mac OS X look. The navigation was now in the toolbar (again, greatly inspired by Coda). Note how the ‘Ruby on Rails’ styling is still here in the experiment editor. All the mockups were done in Omnigraffle again (with some help from Acorn for mixing elements from screenshots of the app of the time):

Next, the experiment cards were made more useful by adding more details and thinking through the user experience. The text editor was also given a fresh coat of paint. There are a number of elements in those mockups that have not made it to the app, and that will hopefully make it one day, in one form or another:

6 Months Ago

The above started to look much better once integrated into the app. But as we were working on the iOS app in parallel, I started to grow tired of the “old” Mac OS X look. Inspired by the Cactus app, I started to think about a white-ish, flat-ish interface that would look more modern, and that would in fact match the “pro” aspect of the app. Here was my first attempt, again mocked up in Omnigraffle. At the time, we still had mixed tabs for experiments and protocols (in version 1, only experiments can live in a tab), so I figured we could have a colored dot to distinguish them:

I thought it was promising, but it was time for a real designer to step in! Marcello very quickly came up with fresh ideas that made the “white” interface something I finally felt comfortable adding to the app. A distinct interface was also created for the protocols.

2 Weeks Ago

The initial design still had a large “forehead” that raised some eyebrows among our early adopters, though. One week before the release, using a design tweaked by Marcello, I moved the separator above the segmented button, and that made things even better. Here we are, circling back to today and the future:

Goodies

Along the way, Marcello also created great designs for the first-run workflow (a.k.a. the ‘Welcome Guide’) and a teaser. I can’t think of a better way to end this making of. Enjoy!