My wife recently explained how difficult it is to keep track of orchestra auditions. With few central lists or registries, musicians must rely on word of mouth and manual website browsing to discover auditions. It sounded like something that computer science could help with, so I wrote a tool to do it automatically and have posted the results on my website. 🙂
Ultimately, the ideal solution would be some sort of data mining algorithm with a spider to crawl the web for orchestra websites. The algorithm would need to be intelligent enough to find audition postings and to extract the pertinent data (position name, number of rounds, dates, repertoire, etc.). That would likely be a fairly complicated bit of software, so I figured I would try something simpler.
I realized that as long as I was given the URL of an orchestra’s “auditions” page, I could extract the basic information from the raw HTML using regular expressions. Most websites use formatting to highlight the various parts of an audition post, and that formatting gives a pattern something fairly easy to match.
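To give a rough idea of what that looks like, here is a minimal sketch, not the tool's actual code. The HTML snippet, the pattern, and the names SAMPLE_HTML, AUDITION_PATTERN, and extract_auditions are all made up for illustration; real orchestra pages are formatted differently.

```ruby
# Hypothetical HTML standing in for an orchestra's auditions page.
SAMPLE_HTML = <<~HTML
  <div class="audition">
    <h3>Principal Oboe</h3>
    <p>Audition date: <strong>March 14, 2011</strong></p>
  </div>
  <div class="audition">
    <h3>Section Violin</h3>
    <p>Audition date: <strong>April 2, 2011</strong></p>
  </div>
HTML

# Assumed formatting: position name in an <h3>, date in the next <strong>.
# The /m flag lets "." match newlines, since a posting spans several lines.
AUDITION_PATTERN = %r{<h3>(.*?)</h3>.*?<strong>(.*?)</strong>}m

# Returns one hash per posting found in the HTML.
def extract_auditions(html, pattern)
  html.scan(pattern).map do |position, date|
    { position: position.strip, date: date.strip }
  end
end

extract_auditions(SAMPLE_HTML, AUDITION_PATTERN).each do |audition|
  puts "#{audition[:position]}: #{audition[:date]}"
end
```

On this sample it prints one line per posting, with the position followed by the date.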
So I coded it up in Ruby, and it works fairly well. Since the regular expressions are hardcoded for each website, it’s rather brittle: if an orchestra changes its website format even a little, the tool will stop working for that site. Also, it generally takes about 15 minutes to custom-craft an appropriate regex for each website. Some websites even require two levels of examination, since they post the audition details on a separate page for each position.
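The per-site setup and the two-level case can be sketched roughly like this. Again, this is an illustration under assumptions rather than the real code: the site names, URLs, patterns, and the scrape_site helper are invented, and the page fetcher is passed in as a lambda so the sketch can be tried without touching the network.

```ruby
require "uri"

# Hypothetical per-site table: each orchestra gets its own hand-written patterns.
SITES = {
  "Example Symphony" => {
    url: "http://example.org/auditions",
    # One level: position and date both appear on the listing page.
    listing: %r{<h3>(.*?)</h3>.*?<strong>(.*?)</strong>}m
  },
  "Sample Philharmonic" => {
    url: "http://example.com/auditions",
    # Two levels: the listing only links to one detail page per position.
    detail_link: %r{<a href="(/auditions/[^"]+)">},
    detail: %r{<h1>(.*?)</h1>.*?Date:\s*([^<]+)}m
  }
}

# fetch is any callable that takes a URL string and returns the page's HTML.
def scrape_site(config, fetch)
  html = fetch.call(config[:url])
  pages =
    if config[:detail_link]
      # Follow each link found on the listing page (second level).
      html.scan(config[:detail_link]).flatten.map do |path|
        fetch.call(URI.join(config[:url], path).to_s)
      end
    else
      [html]
    end
  pattern = config[:detail] || config[:listing]
  pages.flat_map do |page|
    page.scan(pattern).map { |position, date| { position: position.strip, date: date.strip } }
  end
end
```

For real pages, the fetcher could be something like `->(url) { Net::HTTP.get(URI(url)) }` after `require "net/http"`. The brittleness is easy to see here: if a site switches its `<h3>` to an `<h2>`, its pattern silently matches nothing.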
Anyway, I’m posting it here to see if anyone else thinks it would be a useful tool. Currently, I’ve added support for 25 U.S. orchestras (mostly the major ones), but I imagine I’d need to add more for it to be truly helpful. Unfortunately, maintaining the regular expressions long-term is probably not something that I can realistically devote enough time to. If anyone is familiar with regular expressions and is interested in learning how to maintain the tool, let me know.
Here’s the link: http://freearrow.com/auditions/