Posts: 2
Joined: Thu Jan 11, 2018 2:14 am

Off Topic - Ubicom ip3k - cool chip

Postby drudru » Thu Jan 11, 2018 2:37 am


I think there used to be a forum here for the Ubicom, but I cannot find it.

I was re-reading "Computer Architecture" (Dave Patterson and John Hennessy) last night. I was thinking about how to avoid branch mispredicts while still keeping the execution units busy, and I thought about multiplexing threads in time. Of course, after some searching, I learned that this is not a new idea. In fact, I am sure I must have heard Alan Kay mention that the Alto CPU had something similar.

When I found the page on 'barrel processors' at Wikipedia, it led me to the Ubicom ip3k chips. Wow - what a slick design!
I am embarrassed that I missed knowing about this chip, but I am not in the embedded field.

I spent the rest of the evening reading up on it. I learned that there used to be a forum here and that the administrator was one of the designers.

I had a question that Nick may be able to answer:

The chip had a 10-stage pipeline. Why did each of the memory functions (fetch, read, write) take 2 stages (or 2 clocks)?

Best regards,

Redwood City

Posts: 14168
Joined: Tue Jan 13, 2004 9:39 am

Re: Off Topic - Ubicom ip3k - cool chip

Postby nickk » Thu Jan 11, 2018 10:44 pm


Haven't thought about that in a while - it was my University research back in 98/99 before joining Ubicom. I wrote a Verilog simulation of a PIC with a 4-stage pipeline (single cycle execution) and 4 hardware threads.

The ip3k pipeline wasn't that long... pretty much one stage per action.

The ip3k architecture had two key things that made it great: cycle-level multithreading, and the fact that every ALU instruction could be memory to memory (single cycle when using internal memory or if the data was in cache).
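
For anyone reading along who hasn't met cycle-level multithreading before, here is a toy sketch of the idea in Python. It is not the ip3k's actual scheduler: the strict round-robin rotation, the function name `simulate`, and its parameters are all assumptions made for illustration. It only shows why interleaving threads helps: when there are at least as many threads as pipeline stages, no thread ever has two instructions in flight at once, so a branch is resolved before that thread issues again.

```python
from collections import deque


def simulate(num_threads, pipeline_depth, cycles):
    """Issue one instruction per cycle, rotating across hardware threads.

    Returns the largest number of instructions from any single thread
    that were in the pipeline at the same time.
    """
    # Each entry is the thread id of the instruction occupying a stage.
    # maxlen makes the oldest instruction retire when a new one is issued.
    pipeline = deque(maxlen=pipeline_depth)
    worst = 0
    for cycle in range(cycles):
        thread = cycle % num_threads  # strict rotation (illustrative assumption)
        pipeline.append(thread)
        in_flight = max(pipeline.count(t) for t in range(num_threads))
        worst = max(worst, in_flight)
    return worst


if __name__ == "__main__":
    # 4 threads on a 4-stage pipeline: each thread has at most 1 instruction in flight.
    print(simulate(num_threads=4, pipeline_depth=4, cycles=100))
    # 1 thread on the same pipeline: up to 4 of its instructions overlap.
    print(simulate(num_threads=1, pipeline_depth=4, cycles=100))
```

With fewer threads than stages, the same thread shows up in the pipeline more than once, and that is where branch and data hazards come back.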

Great chip to write assembly code for.


Posts: 2
Joined: Thu Jan 11, 2018 2:14 am

Re: Off Topic - Ubicom ip3k - cool chip

Postby drudru » Sat Jan 13, 2018 3:19 am

Thanks Nick!

Yeah, memory-to-memory opcodes made assembly much simpler and higher level (aka fun).

BTW, I found a Hot Chips presentation online that discussed the ip3k. It says the pipeline had 10 stages.
The opcode fetch was 2, the memory read was 2 and the final memory write was 2.
The other stages were 1.
I'm guessing that each memory access still needed two CPU clocks, but I've not seen that before.
If you add in the thread scheduler, it is pretty reasonable, as you say. Nothing like the old Pentium 4.
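
The stage count above can be checked with a quick tally. This is only arithmetic on the numbers quoted from the presentation; the function name `remaining_single_stages` is made up for the example, and the names of the single-clock stages are not given here, so they are left out.

```python
def remaining_single_stages(total_stages, double_stage_functions):
    """Return how many 1-clock stages are left after the 2-clock functions."""
    used = 2 * len(double_stage_functions)
    return total_stages - used


if __name__ == "__main__":
    two_clock = ["opcode fetch", "memory read", "memory write"]
    # 10 total stages minus 3 functions at 2 stages each.
    print(remaining_single_stages(10, two_clock))
```

So six of the ten stages go to the three memory functions, which leaves four single-clock stages for everything else.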

Thanks again :) ,


