Friday 29 March 2013

See you at ACSNoelA?

I'll be presenting at the Spring ACS National Meeting in New Orleans in just over a week. The last ACS I was at was three years ago so I'm looking forward to catching up with what's been going on, and meeting up with some familiar faces.

I've got three talks lined up, the slides for which I'll post after the event:

1. I'll be talking about "What's new and cooking in Open Babel?", as part of Rajarshi's CINF Flash talks on April 7 (Sunday) sometime between 12.30 and 2.00 in Room 350 (Morial Convention Center). I think Rajarshi is still accepting CINF Flash talks so get them in.

2. "Universal SMILES: Finally, a canonical SMILES string?"
This presents the work described in a recent paper.
DIVISION: CINF: Division of Chemical Information
SESSION: Public Databases Serving the Chemistry Community
DAY & TIME OF PRESENTATION: April 10 (Wed) from 8:35 am to 9:05 am
LOCATION: Morial Convention Center, Room: 350

3. "Roundtripping between small-molecule and biopolymer representations".
This describes some of the issues and challenges that have arisen in the development of the Sugar & Splice software, a toolkit for perceiving biopolymer structures, depicting them and converting them between formats.
DIVISION: CINF: Division of Chemical Information
SESSION: Linking Bioinformatic Data and Cheminformatic Data
DAY & TIME OF PRESENTATION: April 09 (Tues) from 3:10 pm to 3:35 pm
LOCATION: Morial Convention Center, Room: 349

To any first-time ACSers reading this, don't miss the various CINF functions. (I seem to recall that they are not exactly easy to find in the printed programme but details as ever are on the webs.)

Monday 25 March 2013

Time and the InChI


Notes:
1. Created using a Google spreadsheet and Timeline JS.
2. If anyone wants to send me updates (e.g. for commercial software like ChemDraw I found it hard to find dates and versions), feel free.

Monday 18 March 2013

Police your code with fuzz testing

I've just become a convert to fuzz testing. This is a rather simple idea for testing software that processes some user input: just send in some random junk and see what happens (and keep repeating until something does).
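
A minimal sketch of that loop, assuming the software under test can be called as a Python function and raises ValueError for input it rejects cleanly. The names `random_fuzz` and `toy_parser` are hypothetical and exist only for illustration; the toy parser has a deliberate bug so that there is something to find.

```python
import random
import string

def random_fuzz(process, alphabet=string.ascii_letters, length=5, tries=1000, seed=0):
    """Feed random strings to `process`; return the first (input, error) that crashes it."""
    rng = random.Random(seed)  # fixed seed so that a failing run can be repeated
    for _ in range(tries):
        text = "".join(rng.choice(alphabet) for _ in range(length))
        try:
            process(text)
        except ValueError:
            # Assumption: ValueError means the input was rejected cleanly
            continue
        except Exception as err:
            # Anything else is an unexpected crash worth investigating
            return text, err
    return None

def toy_parser(text):
    # Hypothetical stand-in for the software under test
    if text[0].isupper():
        raise ValueError("must start with a lower-case letter")
    return {"a": 1, "b": 2}[text[0]]  # deliberate bug: KeyError for other letters

if __name__ == "__main__":
    print(random_fuzz(toy_parser))
```

Which exceptions count as a clean rejection depends on the software being tested, so the `except ValueError` line is the part most likely to need changing.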

I tried this for some software I've been working on, thinking it was so dumb it couldn't possibly flush out anything useful, but quickly changed my mind. Even more interesting, where the space of all possible user input can be stepped through systematically, fuzz testing can be used to map out the space of allowed input (which may or may not be what you were expecting). For example, you could find all 4-letter words acceptable to OPSIN by using the code below (with the length set to 4) to generate fuzz, and then using OPSIN through Cinfony (for example) to find out whether an error is raised.

import random

class Fuzzer:
    """Generate strings of a fixed length from a set of allowed characters."""

    def __init__(self, allowed, length):
        self.allowed = allowed
        self.length = length

    def systematic(self, text=""):
        # Recursively yield every possible string of the required length
        if len(text) == self.length:
            yield text
        else:
            for x in self.allowed:
                for y in self.systematic(x + text):
                    yield y

    def random(self):
        # Return a single random string of the required length
        return "".join([random.choice(self.allowed) for x in range(self.length)])

if __name__ == "__main__":
    allowed = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
    # Step through all 3-letter strings
    fuzzer = Fuzzer(allowed, 3)
    for fuzz in fuzzer.systematic():
        print(fuzz)
    # Generate ten random 5-letter strings
    fuzzer = Fuzzer(allowed, 5)
    for i in range(10):
        print(fuzzer.random())
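
The idea of mapping out the allowed input can be sketched as follows. This is only an illustration under stated assumptions: `accepted_inputs` and `toy_parse` are hypothetical names, and `toy_parse` is a stand-in for a real parser such as OPSIN, which would be called in its place. The sketch assumes the parser signals rejection by raising an exception.

```python
import itertools

def accepted_inputs(parse, alphabet, length):
    """Yield every string of the given length that `parse` accepts without raising."""
    for letters in itertools.product(alphabet, repeat=length):
        text = "".join(letters)
        try:
            parse(text)
        except Exception:
            # Assumption: any exception means the input was not accepted
            continue
        yield text

def toy_parse(name):
    # Hypothetical stand-in for a real parser; it accepts just two words
    if name not in ("oxo", "azo"):
        raise ValueError("unparseable: " + name)
    return name

if __name__ == "__main__":
    for word in accepted_inputs(toy_parse, "aoxz", 3):
        print(word)
```

Bear in mind that the number of candidates grows as the alphabet size to the power of the length, so this is only practical for short strings.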

Sunday 3 March 2013

Avogadro hydrogen-clicking craziness with Sikuli

There's something that's always annoyed me when using Avogadro. On Windows, it often takes several clicks on a hydrogen before I hit the sweet spot and it sprouts a methyl group. This is only mildly annoying when building a structure, but very frustrating for operations where misclicking causes it to lose the selection (e.g. rotating around a dihedral). It's awkward to report a bug about this though because it's hard to prove.

Enter Sikuli. It's a Java application running Jython that's used for testing GUIs (among other things). It uses a sort of visual programming style to match regions of the screen based on screenshots and carry out operations based on whether or not a match exists. (At this point you may want to check out the videos on the Sikuli website to figure out what I'm talking about.)

My original idea was to get Sikuli to click all over the offending hydrogen and see which points worked and which didn't. In the interests of time, I reduced this down to all points on a horizontal line through the centre of the hydrogen. First of all, here is the result, with the 'allowed area' indicated by the red line:

I would argue that that red line is neither long enough nor in the expected location. In any case, check out the sort of code needed to run this test:

You mightn't understand everything here, but you should get the gist. On every iteration of the loop it clears the Avogadro window, clicks to create a methane, and then clicks at point x, y on a particular hydrogen (where x, y are relative to the centre of the image of the matched hydrogen) trying to sprout a methyl. The "success test" checks that a methyl sprouted. The "fail test" checks whether a methane was added (i.e. the region to the right of the red bar in the diagram of the atom above). Note that fuzzy matching is used to match screenshots against the actual contents of the screen or a region.
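
The shape of that loop can be sketched in plain Python. This is not the Sikuli script itself: all of the GUI steps (clearing the window, creating the methane, clicking at the offset and matching screenshots) are replaced by a hypothetical `fake_click` function whose outcomes are invented purely so that the example runs. The names `scan_line` and `allowed_span` are likewise assumptions.

```python
def scan_line(click_at, half_width, y=0):
    """Click at every x offset on a horizontal line and record each outcome."""
    results = {}
    for x in range(-half_width, half_width + 1):
        results[x] = click_at(x, y)
    return results

def allowed_span(results):
    """Return the (min, max) x offsets at which a methyl sprouted, or None."""
    hits = sorted(x for x, outcome in results.items() if outcome == "methyl")
    return (hits[0], hits[-1]) if hits else None

def fake_click(x, y):
    # Hypothetical stand-in for the GUI steps; these outcomes are made up
    if -2 <= x <= 3:
        return "methyl"   # the "success test" matched
    if x > 3:
        return "methane"  # the "fail test" matched
    return "nothing"      # neither screenshot matched

if __name__ == "__main__":
    print(allowed_span(scan_line(fake_click, 10)))
```

Keeping the GUI-specific part behind a single function means the same scan could be pointed at a real click-and-match routine without changing the loop.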