I read this thread with great interest as I am working on a similar project myself. I am interested in recording video directly from the Game Boy, rather than streaming it to a PC. I hope to eventually build a portable solution that isn't tethered to a computer. Nonetheless, my first step will likely be to read the data with a Teensy 3.1 and send it to a computer over USB so that I can capture and analyse some frames (this saves me buying a logic analyser).
I think the Teensy is a good choice as it is cost-effective (especially compared to Arduinos) and reasonably fast. I'd like to avoid FPGAs if possible, simply because I've never worked with them before and the learning curve looks quite steep. That said, if we use the same chip our projects could share some code, which I would be happy to do, although progress may be slow because of my work commitments. What follows are a few thoughts I've had so far which might help you out.
[Edit, 2014-04-26: To anyone reading this thread from the start: The Teensy 3.1 is proving to be a less suitable board than I had anticipated, and so I would no longer recommend it for this application without reservations. See pp. 5-6.]
The basic approach with a Teensy would be to attach the clock signal to an interrupt and use port manipulation inside the handler to read the two data lines (as Jazzmarazz suggests). It may be desirable to also use interrupts for VSync and HSync.
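To make the sampling idea concrete, here is a rough host-side model in Python. It is not Teensy firmware. It only mimics what the clock interrupt would do: read both data lines on each active clock edge and pack four 2-bit pixels per byte. The trace format, the function names, the choice of the falling edge and the bit order within a byte are all assumptions made for illustration.

```python
def sample_on_clock(trace, edge="falling"):
    """Return one 2-bit pixel per active clock edge.

    trace is an iterable of (clk, d1, d0) tuples, each value 0 or 1.
    This capture format is hypothetical.
    """
    pixels = []
    prev_clk = None
    for clk, d1, d0 in trace:
        if prev_clk is not None:
            falling = prev_clk == 1 and clk == 0
            rising = prev_clk == 0 and clk == 1
            if (edge == "falling" and falling) or (edge == "rising" and rising):
                # This is the work the clock interrupt would do:
                # read both data lines and combine them into one pixel.
                pixels.append((d1 << 1) | d0)
        prev_clk = clk
    return pixels


def pack_pixels(pixels):
    """Pack 2-bit pixels four to a byte, first pixel in the top bits."""
    out = bytearray()
    for i in range(0, len(pixels), 4):
        group = pixels[i:i + 4]
        byte = 0
        for p in group:
            byte = (byte << 2) | (p & 0b11)
        # Left-align a short final group so pixel order is preserved.
        byte <<= 2 * (4 - len(group))
        out.append(byte)
    return bytes(out)


# Tiny synthetic trace: four pixels with values 0, 1, 2, 3.
trace = []
for value in (0, 1, 2, 3):
    d1, d0 = value >> 1, value & 1
    trace.append((1, d1, d0))  # clock high, data set up
    trace.append((0, d1, d0))  # clock falls, data sampled here

pixels = sample_on_clock(trace)
packed = pack_pixels(pixels)
```

Packing four pixels per byte keeps the transmitted stream at 2 bits per pixel. Which edge the LCD actually latches on would need to be checked against real captures.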
I initially suspected that the effective framerate of the Game Boy would be lower than the refresh rate, but having done some investigation it seems that this is not the case. This means that we do need to be able to handle around 60 fps if we don't want to drop frames.
The Game Boy's LCD receives data at 4.19 MHz (the Game Boy's CPU clock). The screen resolution is 160x144 px with a bit depth of 2 bpp (as you point out), and the VSync is 59.73 Hz. This means that while we need to capture the data at 4.19 MHz, we only need to transmit it at around 2.75 Mbps if we send the raw data (160 x 144 px x 2 bits x 59.73 Hz). The Teensy apparently transmits serial data at USB Full Speed (12 Mbps), although I have not tested whether it can sustain that rate, so this should be well within spec. Since the clock is fast relative to the resolution and refresh rate, there must be some dead time in which no pixels are sent (only about a third of the time is spent actually transmitting pixels). This dead time seems to occur at the end of each line and lasts about 78 µs, giving the microcontroller time to do other processing.
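For anyone who wants to check these figures, the arithmetic can be reproduced with a few lines of Python. This is only a back-of-the-envelope sketch. The variable names are mine, and the per-line idle time assumes that all of the dead time is shared evenly among the 144 visible lines (any vertical blanking interval would make the real per-line figure shorter).

```python
WIDTH, HEIGHT = 160, 144      # pixels
BITS_PER_PIXEL = 2
FRAME_RATE_HZ = 59.73         # VSync
PIXEL_CLOCK_HZ = 4.19e6       # LCD data clock
USB_FULL_SPEED_BPS = 12e6

# Raw payload that has to leave the microcontroller each second.
bits_per_frame = WIDTH * HEIGHT * BITS_PER_PIXEL
raw_bps = bits_per_frame * FRAME_RATE_HZ

# Fraction of clock cycles that actually carry a pixel.
duty = (WIDTH * HEIGHT * FRAME_RATE_HZ) / PIXEL_CLOCK_HZ

# Assumption: dead time split evenly over the visible lines only.
line_period_s = 1.0 / (FRAME_RATE_HZ * HEIGHT)
active_s = WIDTH / PIXEL_CLOCK_HZ
idle_s = line_period_s - active_s

print(f"raw data rate : {raw_bps / 1e6:.2f} Mbps")
print(f"active share  : {duty:.1%}")
print(f"idle per line : {idle_s * 1e6:.1f} us")
print(f"fits in USB FS: {raw_bps < USB_FULL_SPEED_BPS}")
```

The 12 Mbps figure is the USB Full Speed signalling rate, so the usable throughput will be lower, but the raw stream still leaves a comfortable margin.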
The Game Boy LCD is driven with 5V logic. It is worth noting that only the Teensy 3.1 has 5V tolerant inputs; the Teensy 3.0 does not (I don't know about the older ones), meaning that you would need to add a voltage divider (although this is trivial). Also, on the subject of voltage, as Craig of Flashing LEDs points out, "be careful of the -30V DC contrast signal," which can be found on the LCD connector.
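As a quick sanity check on the divider for boards without 5V tolerant inputs, the output voltage follows the usual formula Vout = Vin * R2 / (R1 + R2). The resistor values below are only illustrative; I have not checked how they behave with a 4.19 MHz signal.

```python
def divider_output(v_in, r_top, r_bottom):
    """Output voltage of a two-resistor divider, measured across r_bottom."""
    return v_in * r_bottom / (r_top + r_bottom)


# Hypothetical values: 1 kOhm on top, 2 kOhm on the bottom.
v_out = divider_output(5.0, 1000.0, 2000.0)
print(f"{v_out:.2f} V")  # about 3.33 V
```

Any pair of resistors with a 1:2 ratio gives the same output voltage. One divider is needed for each signal that is read.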
You mentioned making a device which uses the same protocol as webcams. USB webcams are USB Video Class devices; the specification is here: http://www.usb.org/developers/devclass_
ss_1_5.zip, but it would be a lot of work to implement this on a microcontroller. I think that writing a (userspace) driver would be a much better option, but if you really want a device that doesn't need custom drivers, you could output a video signal and then feed it into a USB video capture card (although there would be some loss in quality with this approach).
Anyhow, for reference, here are some links to the relevant research (which you have no doubt seen, but which will be of interest to others reading this thread). Possibly more to come.
Gameboy VGA Adapter (Rival-Corp) [archive]
http://web.archive.org/web/201206110330
adapter-2/
Intercepting the Gameboy LCD (Flashing LEDs)
http://flashingleds.wordpress.com/2010/
meboy-lcd/
NintendOscope (Flashing LEDs)
http://flashingleds.wordpress.com/2011/
endoscope/
Gameboy Classic VGA-Adapter (snesy)
http://circuit-board.de/forum/index.php
A-Adapter/ [German]