Powerful uCs are kind of pricey, pretty much on par with an FPGA, so I'd rather buy the latter, to be honest...

I've been thinking for some time about how we could simply get the five signals directly to the PC, as rvan suggested. What we're trying to do now is use a uC as middleware, and that turns it into a bottleneck.
I'm thinking of some sort of shift-register-based circuit that would connect to USB or something like that... Basically we need a solution that is clocked by the LCD CLOCK signal... An FPGA can be used for that, but I wonder whether there is an easier way and whether an FPGA is overkill...

Let's also get back to the Flashing LEDs solution: there he used a uC with two SPI buses. We could try that, although I'm not sure we'll be able to send data to the PC at the necessary speed...

rvan, you mentioned that putting a byte into the array takes a lot of time. Have you tried sending the data directly to the PC? This might work, but it might also introduce some delays (e.g. USB data being buffered and sent later).

You can call me crazy, but uXe's old idea with the PlayStation Eye doesn't seem that bad now. This thing (well, the PS3 version for sure) can record at 60 fps. And they are ridiculously cheap these days: I've just ordered 2 for $20. They are heavily used by computer vision guys, which is exactly what we need. There are drivers for multiple platforms. So it's an interesting thing to mess with, but I'd call it a spin-off of this project.

Some thoughts on further steps:

1. Remember my suggestion about a webcam-based approach? Well, it looks like all the webcams are 30 fps and the GB is 60 fps, so it's not an option. There are probably some faster webcams, but they're expensive.

2. I PM'ed Nitro about clock master and here's how he'd go about it:

Nitro2k01 wrote:

You need a loop, where you do something like the following. This assumes digital pin 0 is connected to the clock input of the Gameboy.

PORTD |= 1; // Turn on clock signal
(Wait a little and/or do something.)
PORTD &= ~1; // Turn off clock signal (AND with the inverted mask clears bit 0)
(Wait a little and/or do something.)

PORTD ^= 1; // Flip clock signal (xor)
(Wait a little and/or do something.)
PORTD ^= 1; // Flip clock signal (xor)
(Wait a little and/or do something.)

If done perfectly, you now get a perfect square wave output. It does not have to be perfect, but...
1) It can't be too fast, or the CPU or cartridge maybe will not be able to do everything it needs before the next clock pulse arrives.
2) A little deviation from the correct clock frequency is tolerable, as long as the long term average is correct. For example, you might send a slightly too slow clock during the period when the data is being sent and a slightly too fast clock during the blanking periods when no data is transmitted. Even though the CPU itself is fine with a slightly varying clock, the audio output may get artifacts from this, like maybe a buzzing or FM type effect.
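
A rough host-side sketch of the bit operations in the snippet above, using an ordinary variable in place of the real PORTD register. The `port` variable, the mask constant and the helper names are made up for illustration; on the actual chip you would write to PORTD directly. It only shows that OR sets bit 0, AND with the inverted mask clears it, and XOR flips it, in each case leaving the other pins alone.

```cpp
#include <stdint.h>

// Stand-in for the hardware PORTD register (assumption: clock is on bit 0).
static uint8_t port = 0;
static const uint8_t CLK_MASK = 1;

// Drive the clock pin high without touching the other bits.
void clock_high() { port |= CLK_MASK; }

// Drive the clock pin low: AND with the inverted mask clears only bit 0.
void clock_low() { port &= static_cast<uint8_t>(~CLK_MASK); }

// Flip the clock pin, whatever its current state is.
void clock_toggle() { port ^= CLK_MASK; }
```

Calling `clock_toggle()` twice per period gives the same square wave as a `clock_high()` / `clock_low()` pair, with one fewer thing to get wrong.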

However, he also mentions that he's not sure whether this can help us get the video after all, and that we're going to need assembly. That's a bit discouraging, but OK, let's keep this in mind.

3. FPGA. I personally like this option because at least I'm sure it CAN be done with an FPGA. However, I have no idea how big a board we're going to need for that (how big = how expensive). A good, powerful board (one that will most certainly do the job) costs around $150. I'd buy one if I had no other choice, but at the moment it's cheaper to buy an SNES or a clone, tear it apart and use the SGB.

5. VRAM - uXe posted that it is hardly doable (although a number of other things can be done with this approach, so the idea is still brilliant IMO).

Other thoughts.

Now let's look one more time at what we're facing.

- It's a 4 MHz (rough estimate, I'm not counting dead time) 5-channel data flow.
- Even with a 2 GHz uC (let's assume one exists) we'd have 500 cycles per sample to monitor 5 channels and send the data to the PC as fast as 4 Mbit/s (or, if we assemble the frame in the uC, 2.7 Mbit/s, but then we need cycles to assemble the frame).
- Even if we make the uC the clock master, we get predictable timing BUT it's still 4 MHz, just more precisely measured.
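
The numbers in the list above can be double-checked with a few lines of arithmetic. This is only a back-of-the-envelope sketch: the function names are made up, the 4 MHz sample rate is the rough estimate from this post, and the frame figure assumes a 160x144 screen with 2 bits per pixel (Data0/Data1) at 60 fps.

```cpp
#include <stdint.h>

// Rough LCD data rate from the post above, dead time ignored.
static const uint32_t SAMPLE_RATE_HZ = 4000000;

// CPU cycles available between two consecutive samples.
uint32_t cycles_per_sample(uint64_t cpu_hz) {
    return static_cast<uint32_t>(cpu_hz / SAMPLE_RATE_HZ);
}

// Bits per second needed to ship fully assembled frames to the PC.
uint32_t frame_bitrate(uint32_t width, uint32_t height,
                       uint32_t bits_per_pixel, uint32_t fps) {
    return width * height * bits_per_pixel * fps;
}
```

`cycles_per_sample(2000000000)` gives the 500 cycles mentioned above, while a 96 MHz part gets only 24. `frame_bitrate(160, 144, 2, 60)` comes out at 2,764,800 bit/s, which is where the roughly 2.7 Mbit/s figure comes from.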

Looks pretty bad now. The only viable option that's cheap enough is using the SGB. What I don't like about this approach (and rvan has already mentioned this) is that we end up with a de-digitized signal that we'll have to re-digitize. It will work, but it's like turning on your night lamp via the Internet through a proxy in China (unless you're in China, but then replace China with USA).

Apart from that it's either an FPGA or some camcorder-type solution. Thoughts?

rvan wrote:
friendofmegaman wrote:

Doesn't work for me...

Post your code!

Sure, sorry guys, I'm just loaded with work lately, so things keep slipping my mind. Here's the code (I know Jazz won't approve):

(Teensy only, won't work on Arduino)
Timer based

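// Pin that drives the Game Boy clock input
const int pClk = 10;
// Current output level; volatile because it changes inside the timer interrupt
volatile int clk_state = LOW;
IntervalTimer myTimer;

// Timer callback: flip the clock pin on every tick
void tick() {
  clk_state = (clk_state == LOW) ? HIGH : LOW;
  digitalWrite(pClk, clk_state);
}

void setup() {
  pinMode(pClk, OUTPUT);
  // The interval is in microseconds: one toggle every 0.125 us would be a 4 MHz square wave.
  // I played with the interval - no luck
  myTimer.begin(tick, 0.125);
}

void loop() {
  // Nothing to do here, the timer interrupt does all the work
}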
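
A quick sanity check on the timer interval used above. This is just arithmetic, not Teensy code, and the function names are made up. It assumes the pin is toggled once per timer tick (so two ticks per clock period) and a 96 MHz CPU clock.

```cpp
// Microseconds between two pin toggles for a square wave of freq_hz.
double toggle_interval_us(double freq_hz) {
    return 1e6 / (2.0 * freq_hz);
}

// CPU cycles available for each toggle, interrupt overhead included.
double cycles_per_toggle(double cpu_hz, double freq_hz) {
    return cpu_hz / (2.0 * freq_hz);
}
```

For a 4 MHz clock, `toggle_interval_us(4e6)` is 0.125 and `cycles_per_toggle(96e6, 4e6)` is 12. Twelve cycles per interrupt is very little room for entering the handler, calling `digitalWrite` and returning, which may well be why the timer approach doesn't produce a usable clock. For the DMG's nominal 4.194304 MHz the interval would be about 0.119 us.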

Thanks for sharing, rvan!

rvan wrote:

Although we saw that the DMG's clock crystal produces a sine wave, I am not convinced that this waveform is necessary for the CPU's clock, given that neither the LTC6930 (Kitsch-Bent's easy_CLK) nor the LTC1799 produces a sine wave.  Synthesizing a square wave (which should in theory work fine) is as simple as toggling a digital output pin (other waveforms are more CPU intensive)

Doesn't work for me...

rvan, could you share a bit more about your shift register approach?
Concretely, I wonder how you wired the pins:

Q0 - Q7 -> Teensy
GND, Vcc -> GND, +5V
Q7S - I guess not connected or grounded?
MR - ?
SHCP -> LCD CLOCK?
STCP - what's the role of the storage clock?
OE  - ?
DS -> Data0 / Data1

How do you know when it's time to read data from the register?

Also, I was theorizing about how this would work out. Since we can read a byte in one go, we'd be reading at roughly 500 kHz (4 MHz / 8 bits), which leaves 96M/500K = 192 cycles per byte to shoot the data to the PC. So in theory it looks legit. Especially if combined with Nitro's approach, it might be super neat.
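
To make the questions above a bit more concrete, here is a small software model of a serial-in, parallel-out shift register with a separate storage register. The pin names in the list (SHCP, STCP, DS, Q0-Q7) look like a 74HC595-style part, but that is my assumption; the thread doesn't name the chip. The `count` field is not part of any chip: it stands for whatever counts the clock edges (the uC, or an external divide-by-8) so we know when a full byte is in.

```cpp
#include <stdint.h>

struct ShiftRegister {
    uint8_t shift = 0;  // internal shift stages
    uint8_t latch = 0;  // storage register that drives Q0..Q7
    uint8_t count = 0;  // shift clocks since the last latch (hypothetical helper)
};

// Rising edge on SHCP: every stage moves up by one, DS enters stage 0.
void shift_clock(ShiftRegister &r, bool ds) {
    r.shift = static_cast<uint8_t>((r.shift << 1) | (ds ? 1 : 0));
    r.count++;
}

// Rising edge on STCP: copy the shift stages to the outputs in one go.
void storage_clock(ShiftRegister &r) {
    r.latch = r.shift;
    r.count = 0;
}

// True once eight bits have been shifted in since the last latch.
bool byte_ready(const ShiftRegister &r) {
    return r.count >= 8;
}
```

In this model the outputs only change on the storage clock, so the byte the uC reads stays stable while the next eight bits are being shifted in. That would be the role of STCP, if the part really behaves like this.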

Another question I have: how do we produce a clock signal with an Arduino / Teensy?

I see, so the RAM space is composed of VRAM and working RAM.
So it appears that the CPU "sees" those two as a single chunk of data, is that correct?
If so, then when the Arduino reads from the RAM, do we need to fix up the addresses?
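
A sketch of what "fixing the addresses" could look like, assuming the usual DMG memory map (video RAM at $8000-$9FFF, working RAM at $C000-$DFFF) and two separate 8 KB chips, each with 13 address lines (A0-A12). The type and function names are made up for illustration.

```cpp
#include <stdint.h>

enum Chip { CHIP_NONE, CHIP_VRAM, CHIP_WRAM };

struct ChipAddress {
    Chip chip;        // which RAM chip holds the byte
    uint16_t offset;  // address on that chip's A0..A12 lines
};

// Translate an address as the Game Boy CPU sees it into chip + offset.
ChipAddress to_chip_address(uint16_t cpu_addr) {
    if (cpu_addr >= 0x8000 && cpu_addr <= 0x9FFF) {
        return {CHIP_VRAM, static_cast<uint16_t>(cpu_addr & 0x1FFF)};
    }
    if (cpu_addr >= 0xC000 && cpu_addr <= 0xDFFF) {
        return {CHIP_WRAM, static_cast<uint16_t>(cpu_addr & 0x1FFF)};
    }
    return {CHIP_NONE, 0};  // not backed by either RAM chip
}
```

So the CPU-side address $8000 would be offset 0 on the video RAM chip, and $C123 would be offset $0123 on the working RAM chip.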

So from the implementation POV we're going to need several shift registers (e.g. 8x8) and two dual-port RAM chips.

Now I'm beginning to doubt whether this idea is really worth it. It's too much work for just video out... On the other hand, I can at least comprehend it, unlike an FPGA, which I have absolutely no idea about.

Anyway, I'd like to see how rvan's idea with shift registers works out before doing something that hardcore.
Also, Nitro's suggestion seems easier to implement, so we should not be hasty here.

uXe wrote:

Yeah, like I said above, you would probably want TWO dual-port chips to be able to access both the Video RAM and the Working RAM - you are going to need access to the Object Attribute Memory to be able to draw the sprites...

Now I'm a bit confused: are there two RAMs?

It's true that we're going to need some emulator code to build the frame again.

Let's think about it for a second, because I see a dramatic change in direction. Instead of dealing with clocked frame data, we'd be dealing with the GB's internal structures. Let's weigh the pros and cons.

**** Frame data approach ****
Pros:
- Just 5 bits required
- It's easy to reconstruct the frame
- Easy soldering job

Cons:
- Clocking issues

**** RAM approach ****
Pros:
- Solves clocking issues (allegedly)
- Can display the whole 256x256 background (although I'm not sure that overlaying sprites window is also 256x256)
- We can also write memory
- We have control over building blocks of the frame and hence have more options to mess with them on the PC side (it's especially interesting from the coloring point of view). It would be interesting to reverse engineer pieces of Super Game Boy to add the colors and extra content.

Cons:
- Trickier to solder (actually de-soldering the existing RAM is the hardest part)
- More complex frame reconstruction algo
- We're moving away from the idea of video capture. E.g. we could just read the cartridge ROM and feed it to an emulator: pretty much the same effect, but less work.
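
To give a feel for the "more complex frame reconstruction" item above, here is a sketch of decoding one row of a tile from video RAM. It assumes the commonly documented Game Boy tile format (not something stated in this thread): each 8-pixel row takes two bytes, the first holding the low bit of every pixel and the second the high bit, with bit 7 being the leftmost pixel. The function name is made up.

```cpp
#include <stdint.h>

// Decode one 8-pixel tile row into color indices 0..3.
// low  = first byte of the row (low bit of each pixel)
// high = second byte of the row (high bit of each pixel)
void decode_tile_row(uint8_t low, uint8_t high, uint8_t out[8]) {
    for (int x = 0; x < 8; ++x) {
        int bit = 7 - x;  // bit 7 is the leftmost pixel
        int lo = (low >> bit) & 1;
        int hi = (high >> bit) & 1;
        out[x] = static_cast<uint8_t>((hi << 1) | lo);
    }
}
```

A full frame would still need the tile map, scroll registers, sprites and palettes on top of this, so this is only the innermost step.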

ultramega wrote:

I was kind of wrong, but you gotta consider sprites, screen on/off, color palette... there are some things that affect the screen that are outside of vram, and if you want 1:1 video capture they need to be considered.

I'm not sure actually... this fragment (talking about GB RAM):

The lower 32K ($0000-$7FFF) is used for accessing cartridge rom. A gameboy cartridge is basically carved up into 16K pieces, called BANKS, with the first 16K bank of cartridge space being addressable at the first 16K of the address space ($0000-$3FFF). This first 16K bank is called the HOME BANK. This piece of the cartridge is ALWAYS accessed from this area. This is where most of your game's code will be placed, since it's always available. This first 16K also holds the HEADER of a GameBoy game, located at $0100. The header is basically an ID tag for a game. The rest of the cartridge banks can be "paged" into the address space from area $4000-$7FFF. This area is called paged because you can change. NOTE, only ONE bank of the cartridge is visible in the paged space at any given time. To change which bank is visible, you need to tell the hardware what bank to change it to. This is pretty easy and will be detailed later.
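
The banking scheme described in the quoted text boils down to simple address arithmetic. A minimal sketch, assuming 16K banks laid out back to back in the ROM image; the function name is made up, and how the current bank gets selected is left out, since the quote only says it will be "detailed later".

```cpp
#include <stdint.h>

// Offset into the cartridge ROM image for a CPU address in $0000-$7FFF.
// current_bank is the bank currently visible in the paged area.
uint32_t rom_offset(uint16_t cpu_addr, uint32_t current_bank) {
    if (cpu_addr < 0x4000) {
        return cpu_addr;  // home bank, always visible at $0000-$3FFF
    }
    // Paged area $4000-$7FFF shows one 16K bank at a time.
    return current_bank * 0x4000u + (cpu_addr - 0x4000u);
}
```

The same CPU address in the paged area therefore points at different ROM bytes depending on which bank is selected at that moment.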


So the data from the cartridge ROM is *loaded* into GB RAM. So even if we need something from the ROM, we can grab it from here. The only thing that bothers me is that we may run into timing issues again. Let's say I need a particular piece of data from the cartridge region; by the time I access it, it may already be gone (replaced by another bank). This of course only matters if we really need the data from there.

ultramega wrote:

i could be wrong but...

having access to the GB's ram will only give you the tiledata, you will need access to the rom to assemble the tiledata into screens / images. This includes sprite data and locations, priorities etc. And you would need to know what code is executing in the rom to figure out what needs to be where on the screen. i dont think its as simple as using image generation code from an emulator.

Good point, but this document (http://gameboy.mongenel.com/dmg/lesson5.html) says that the data in the video RAM (which is a dedicated region of the 64Kb SRAM) suffices to build the frame. But then again, it's worth double-checking.

Any ideas on how we map the pins?

Pins marked A0-A12 on the Game Boy SRAM map to either A0r-A12r or A0l-A12l

It seems like D0-D7 map to IO0-IO7 (either r or l)

NOT OE is also present on both chips

I will try this direction then. It's an interesting approach that would allow us (in the future) to do something more than mere video capture, or even to extend this idea to the GBC. But I'll be asking for some help here; I hope uXe will help me out with the theory :)

Does anybody want to try Nitro's idea of making the uC the clock master?

uXe wrote:

A good explanation from http://gameboy.mongenel.com/dmg/lesson5.html

It becomes more and more interesting... basically this way we could access all the data including variables used in the game... neat!

uXe wrote:
uXe wrote:

Actually, scratch that - just use a dual-port SRAM like this:

http://www.idt.com/products/memory-logi … l-port-ram

and there won't be any conflicts for the GameBoy and the Arduino to share the same memory!

...and then Batsly announces this:

http://chipmusic.org/forums/topic/14236 … interface/

Ha! Beautiful! :D

I'll be damned! It's bloody brilliant.

So, uXe, you seem like an expert in these things. Any chance you're willing to try this direction? If you don't want to bother, could you at least provide clear instructions on how to do it? Concretely:
1. Are these the same types of RAM? Do we need any adapters to plug it into the GB?
2. In my understanding, the dual-port RAM replaces the RAM in the GB and also connects to the Arduino (without replacing anything there), since the latter can work with external RAM. Is that correct?
3. Let's assume we have steps 1 and 2 done. Then, to get the frame data, we just need to read the corresponding region of RAM using the Arduino's RAM API at the rate we want (maybe 60 Hz like the screen, maybe slower if we're saving bandwidth). Did I get it right?
4. Why do we need a piece of emulator code? Is it to get the background and tile data and build a 160x144 screen (or, as you suggested, 256x256, which is even better)?

This in fact could be even used in GBC...

It depends on what you mean by "exactly the same way". For the most part it's a yes. But you won't be able to fit a 6.5 audio jack or 2 MIDI jacks. However, with some PCB trimming and careful measuring a lot can be done. As for simple mod sets like line out + pitch + backlight, there's no problem at all.