Messages from 102000

Article: 102000
Subject: Re: Can an FPGA be operated reliably in a car wheel?
From: "Symon" <symon_brewer@hotmail.com>
Date: Tue, 9 May 2006 16:17:08 +0100

"Johan Bernspång" <xjohbex@xfoix.se> wrote in message 
news:e3q9ej$j5i$1@mercur.foi.se...
> Is it the tire pressure you are going to measure? In that case there are 
> indirect ways of doing that: http://www.niradynamics.se/tpi.htm You don't 
> need any sensors in the wheel, and thus no FPGA in, or communication with, 
> the wheel either...
>
Of course, if you've got those run-on-flat tyres maybe you don't care about
the pressures?

http://www.funny-videos.co.uk/downloads/runonflat.wmv

:-)
Cheers, Syms.

Article: 102001
Subject: Re: PCI Express and DMA
From: "SongDragon" <songdrgn@g.m.a.i.l.n0.spam.com>
Date: Tue, 9 May 2006 11:23:45 -0400

Thanks for the helpful responses from everyone.

The basic idea seems to be as follows:

1) device driver (let's say for linux 2.6.x) requests some kernel-level 
physical memory
2) device driver performs MEMWRITE32 (length = 1) to a register 
("destination descriptor") on the PCIe device, setting destination address 
in the memory
3) device driver performs MEMWRITE32 (length = 1) to a register ("length 
descriptor") on the PCIe device, setting length "N" (We'll say this also 
signals "GO")
4) PCIe device sends MEMWRITE32s (each length = up to 128 bytes at a time) 
to _______ (what is the destination?) until length N is reached
5) PCIe device sends interrupt (for now, let's say INTA ... it could be MSI, 
though)
6) device driver services interrupt and writes a zero to a register 
("serviced descriptor"), telling the PCIe device the interrupt has been 
fielded.
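
A minimal sketch of step 4, in Python for illustration only. It assumes the 
device writes to the address the driver loaded into the "destination 
descriptor" in step 2, and that no single write may cross a 4 KB address 
boundary (a PCIe rule for memory requests). The 128-byte limit is the figure 
from step 4; the name split_writes and the example address are made up.

```python
MAX_PAYLOAD = 128  # bytes per write, the figure used in step 4
PAGE = 4096        # memory requests must not cross a 4 KB boundary


def split_writes(dest_addr, length, max_payload=MAX_PAYLOAD):
    """Return (address, byte_count) pairs covering the whole transfer."""
    chunks = []
    addr = dest_addr
    remaining = length
    while remaining > 0:
        # Bytes left before the next 4 KB boundary.
        to_boundary = PAGE - (addr % PAGE)
        size = min(max_payload, remaining, to_boundary)
        chunks.append((addr, size))
        addr += size
        remaining -= size
    return chunks


if __name__ == "__main__":
    # Hypothetical buffer address and length N = 300 bytes.
    for addr, size in split_writes(0x10000FC0, 300):
        print(f"MEMWRITE32 addr=0x{addr:08X} bytes={size}")
```

The first chunk is shortened so that the next one starts on the 4 KB 
boundary; after that the device sends full 128-byte writes until fewer than 
128 bytes remain.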

I have a number of questions regarding this. First and foremost, is this 
view of the transaction correct? Is this actually "bus mastering"? It seems 
like for PCIe, since there is no "bus", there are no additional requirements 
to handle other devices "requesting" the bus. So I shouldn't have to perform 
any bus arbitration (listen in to see if any of the other INT pins are being 
triggered, etc). Is this assumption correct?

In PCI Express, you have to specify a bunch of things in the TLP header, 
including bus #, device #, function #, and tag. I'm not sure what these 
values should be. If the CPU were requesting a MEMREAD32, the values for 
these fields in the MEMREAD32_COMPLETION response would be set to the 
same values as were included in the MEMREAD32. However, since the PCIe 
device is actually sending out a MEMWRITE32 command, the values for these 
fields are not clear to me.
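
A hedged sketch of how the ID fields fit together, in Python for 
illustration. As far as I know, a device that originates a request puts its 
own bus/device/function in the Requester ID, using the bus and device numbers 
it captured from configuration writes addressed to it. A memory write is 
posted (no completion comes back), so the tag is not matched against 
anything. The function names here are made up.

```python
def requester_id(bus, device, function):
    """Pack bus (8 bits), device (5 bits), function (3 bits) into 16 bits."""
    if not (0 <= bus < 256 and 0 <= device < 32 and 0 <= function < 8):
        raise ValueError("bus/device/function out of range")
    return (bus << 8) | (device << 3) | function


def unpack_requester_id(rid):
    """Split a 16-bit Requester ID back into (bus, device, function)."""
    return (rid >> 8) & 0xFF, (rid >> 3) & 0x1F, rid & 0x7


if __name__ == "__main__":
    # Hypothetical endpoint that was enumerated as bus 3, device 0, function 0.
    rid = requester_id(3, 0, 0)
    print(f"Requester ID = 0x{rid:04X}")
```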


Thanks,

--Alex

Article: 102002
Subject: Re: FPGA-based hardware accelerator for PC
From: "JJ" <johnjakson@gmail.com>
Date: 9 May 2006 08:31:13 -0700

pbdelete@spamnuke.ludd.luthdelete.se.invalid wrote:
> >> A while back, Toms Hardware did a comparison of 3GHz P4s v the P100 1st
> >> pentium and all the in betweens and the plot was basically linear
>
> >Interesting. In fact I don't care about P4, as its architecture is one
> >big mistake, but linear speedup would be a shame for a Pentium 3...
>
> What in particular do you think is wrong with the P4 ..?

Well how about power/heat or even cost v AMD, a constant issue on Intel
for the last few years.

It was the return to the P3 that allowed them to move forward again with
the Centrino, then the DualCore, not sure what this new direction is
really all about though. But Netburst and maximum clock freq at any
cost for marketing's sake is dead.

The only thing good about P4 I ever heard was in memory throughput
benchmarks and maybe the media codecs which makes sense for the deeper
pipelines that were used.

John Jakson
transputer guy

Article: 102003
Subject: TME Free Verilog/VHDL framework generation tool
From: topweaver@hotmail.com
Date: 9 May 2006 08:34:34 -0700

Topweaver Module Editor (TME) is a table-based editing tool for HDL
module interface definitions. It unifies the process of HDL coding and
document writing.

In a chip design process, for example ASIC or FPGA, people usually
first plan the project top-down, then write the HDL code to
implement the idea. However, due to the restrictions of traditional
development tools, people have to code from the very bottom,
which means a bottom-up coding style. This conflict between the plan
and the coding often leads to a great deal of manual labor and
unreliable product quality. When the coding is finished, engineers often
find the original project plan out of date and in need of rewriting.
Topweaver Module Editor aims to change this situation.

With the tools from the Topweaver family, people can quickly build the
HDL code's framework together with the document file. TME will be
integrated into the next major version of Topweaver.

Features
Unified document and HDL code in a powerful table-based editing
environment
Full-function table editor to reduce manual labor
Generates HDL port definitions in Verilog95, Verilog2001 and VHDL
Visual adjustment of HDL code format
Supports complex HDL code templates
HDL code synchronization enables safe modifications to existing designs
Extensible interface library to manage commonly used signals

Screenshot: http://www.topweaver.com/doc/tme/images/overview.jpg
Demo Movie: http://www.topweaver.com/demo/TME_1.htm
Download: http://www.topweaver.com/download.htm

Article: 102004
Subject: Re: Xilinx 3s8000?
From: "Jeff Brower" <jbrower@signalogic.com>
Date: 9 May 2006 08:44:36 -0700

Tom-

> I really think you haven't looked enough into what can be done in
> software before jumping to hardware; gmp-ecm is in the public domain,
> and Bruce Dodson runs it full-time on a 120-node Opteron cluster at
> Lehigh University.

Haha, be careful of what you suggest.  Now Ron will need a
complimentary Opteron cluster from AMD :-)

-Jeff

Article: 102005
Subject: Re: Xilinx ISE 8.1 Makefile
From: Sean Durkin <smd@despammed.com>
Date: Tue, 09 May 2006 18:09:28 +0200

Sanka Piyaratna wrote:
> Hi,
> 
> I am wondering if there is anyone who has worked out a way to use ISE
> 8.1 projects with Makefiles to compile FPGA images.
> I am actually
> wondering if it would be possible to automatically generate a Makefile
> from the project file.
I guess in previous versions that wouldn't have been too hard with a
little Perl-script. But in ISE8.1 they changed to a proprietary binary
format for the project-file, so it's kind of hard to make sense out of
it. Unless someone has by now found a way to decode it properly, that is.

cu,
Sean

Article: 102006
Subject: Re: Xilinx 3s8000?
From: "radarman" <jshamlet@gmail.com>
Date: 9 May 2006 10:02:48 -0700

Agreed - my primary gripe is that the older versions don't play well
with a newer version installed and vice-versa.

Right now, my copy of 7.1 is unusable - and I'll have to get an admin
to clean it up to the point where it is usable again. I can manually
make it mostly usable by altering the system environment variables -
but that's a bit of a pain. To top it off, the 4.2 install doesn't
quite work properly either - probably for the same reason. It seems to
go through mapping and PAR OK, but I have to create the downloadables
on another system. We fought for a week with every part of the
toolchain until we switched to another workstation to create the
bitstream files - no errors, just corrupt binaries.

When I'm more or less done maintaining the old design, I'm probably
going to just wipe both off, and install the latest version again.

Article: 102007
Subject: Re: help me to about clock in fpga
From: "Slurp" <slip@slop.slap>
Date: Tue, 9 May 2006 18:09:15 +0100


"kaps" <kapilaryan2003@gmail.com> wrote in message 
news:1147170734.951729.198600@i40g2000cwc.googlegroups.com...
> respected knowledgious persons,
> i want to know about how to give clock to fpga.as i know do we give it
> by software or hard ware.also i wants to know that if we have 12 inputs
> each ofmore than 8 bits and and one input is of  upto 16 bit,but our
> kit has only 4 to 8 switches so how we can check our design on board.
> thanks in advance.
>

oh great one,
i thinks to and it of multiplex muchness and compile over switch load 8 bit.
when 8 bit circumspect on 12 input (when over ride n * x bit bus) then bit 
blit over see 16 bit.
only where if and but causal case but multiplex clock input pin when bus 
equivalent.

hope it helps

Article: 102008
Subject: Max operating freq in a breadboard
From: George Orwell <nobody@mixmaster.it>
Date: Tue, 9 May 2006 19:48:25 +0200 (CEST)

Hi,

I am using a Xilinx Spartan-3 Starter board and an accessory breadboard (DBB1) from Digilent for prototyping. 
(http://www.digilentinc.com/Products/Catalog.cfm?Nav1=Products&Nav2=Accessory&Cat=Accessory)

I have two questions:

- What is the maximum operating frequency for 
  a circuit on the breadboard? (around 1 MHz?  10 MHz?) 

- What will be the best I/O pin configuration for 
  external LS-TTL-compatible logic?  
  I am currently using the following in .ucf file  
  NET "ttl_in" LOC = "XX" | IOSTANDARD = LVTTL | SLEW = SLOW | DRIVE = 4;

Thank you in advance.

Steve

Article: 102009
Subject: Re: FPGA-based hardware accelerator for PC
From: "bart" <bart.borosky@latticesemi.com>
Date: 9 May 2006 10:57:06 -0700

> Yeh, I have been following Lattice more closely recently, will take me
> some time to evaluate their specs more fully, may get more interested
> if they have a free use tool chain I can redo my work with.

> Does anyone have PCIe on chip though?

> John Jakson
>transputer guy

For PCIe x4, I think LatticeSC has the PHY and data link layers in the
structured ASIC (MACO) portion of the chip. For the transaction layer,
you'd need IP. See:
http://www.latticesemi.com/corporate/webcasts/preengineeredpciexpressso/index.cfm

Hope this helps.
Bart Borosky, Lattice

Article: 102010
Subject: ml-403 and USB
From: "Anonymous" <someone@microsoft.com>
Date: Tue, 09 May 2006 18:11:59 GMT

Has anyone used USB from linux on the ml-403 board? I'd like to get some
peripherals like usb memory or bluetooth adapter to work on it but usb is
not in the kernel they provide. The hardware does appear to be on the board
though.

Thanks,
Clark

Article: 102011
Subject: Re: Can an FPGA be operated reliably in a car wheel?
From: fpga_toys@yahoo.com
Date: 9 May 2006 11:51:15 -0700


cs_posting@hotmail.com wrote:
> Well someone please go turn one upside down, measure it with a
> micrometer, and make a note of it, so at least our grandkids will be
> able to settle this.
>
> (make sure it's not near where said grandkids are likely to be playing
> ball in the years before they gain an appreciation for experimental
> science)

Sounds more like time for a good FPGA accelerated computer model :)

Article: 102012
Subject: Re: help me to about clock in fpga
From: MikeShepherd564@btinternet.com
Date: Tue, 09 May 2006 19:55:36 +0100

>
>oh great one,
>i thinks to and it of multiplex muchness and compile over switch load 8 bit.
>when 8 bit circumspect on 12 input (when over ride n * x bit bus) then bit 
>blit over see 16 bit.
>only where if and but causal case but multiplex clock input pin when bus 
>equivalent.
>
>hope it helps 
>
Hold on...let me guess...it was you documented Altera's LVDS
megafunction?

Article: 102013
Subject: Re: Anyone use Xilinx ppc405 profiling tools?
From: "Joel" <jceven@gmail.com>
Date: 9 May 2006 12:01:53 -0700

Alan,

I recently finished my Master's thesis on Algorithm Acceleration in
FPGA.  Part of my research and experimentation was running some
algorithms on the PPC405 core in V2PRO.  I used both ISE/EDK 7.1 and
EDK 8.1 (and then developing IP in the FPGA and attaching to PLB).  I
got software profiling to work using PIT and I used PLM BRAM memory for
storing the profiling information.  Initially I ran into a lot of
problems, but eventually got it to work on 7.1.  I was using the latest
ISE SP and EDK SP for 7.1.  Most of my research was in fact done on
ISE/EDK 7.1, and towards the end I repeated the same experiments using
EDK 8.1.  Profiling also worked on EDK 8.1 for the PPC405 using the gcc
-pg option and setting PIT for software profiling in the software
platform settings.

Another benchmark I used was an OPB timer that was reset before the
algorithm started and read after it stopped.
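
A small sketch of turning two timer readings into elapsed time, in Python
for illustration.  It assumes a free-running 32-bit up-counter that wraps
at most once between the two reads; the counter width, the clock
frequency, and the function names are assumptions, not details of the
actual setup.

```python
COUNTER_BITS = 32  # assumed timer width


def elapsed_cycles(start, stop, bits=COUNTER_BITS):
    """Cycles between two reads of an up-counter, tolerating one wrap."""
    return (stop - start) % (1 << bits)


def elapsed_seconds(start, stop, clock_hz, bits=COUNTER_BITS):
    """Convert the cycle count to seconds for a given timer clock."""
    return elapsed_cycles(start, stop, bits) / clock_hz


if __name__ == "__main__":
    # Hypothetical readings around the algorithm, with a 100 MHz bus clock.
    before, after = 0xFFFFFF00, 0x00000100
    print(elapsed_cycles(before, after), "cycles")
    print(elapsed_seconds(before, after, 100e6), "seconds")
```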

Something noteworthy (and it probably makes a lot of sense) is that EDK
8.1 generated a lot faster software (EDK 7.1 and 8.1 were compared for
-O0, -O1, -O2, -O3) on the -O2 and -O3 gcc options, but this is most
likely due to the fact that they are using a newer version of GCC in EDK
8.1.  Alan, if you have any specific questions you can email me, and I
can tell you my experiences using sw profiling in 7.1.  It was quite a
hassle but eventually worked.

-Joel

Article: 102014
Subject: Re: Xilinx ISE 8.1 Makefile
From: "John Retta" <jretta@rtc-inc.com>
Date: Tue, 09 May 2006 19:39:22 GMT

Here is a sample .bat file that I typically use for command
line synthesis ... that bypasses the GUI entirely.  There are lots
of advantages to command line flow ... Entire synthesis
flow is documented ( and can be archived) in text document.
Portability of synthesis flow among workstations/PCs is
ensured - not determined by check box in a nested dialog
box.

Take a look at the developers reference guide in the Xilinx
install path /Xilinx/doc/usenglish/books/docs/dev/dev.pdf.
This document describes all command line options for
various utilities.

---------------------------------------------------------

cd ..
rmdir /s /q synth
mkdir synth
cd synth

xst -ifn ..\scripts\xst.txt -intstyle xflow

ngdbuild %1.edf -p xc3s400-4-pq208 -uc ..\files\%1.ucf

map -k 6 -detail -pr b %1
rem pause

par -ol med -w %1.ncd %1_r%2

copy %1.pcf %1_r%2.pcf

trce -e -o %1_err.twr %1_r%2
trce -v -o %1_ver.twr %1_r%2


rem **************************************************
rem * first make bitgen for rom, then remake for JTAG
rem **************************************************
rem bitgen  -w -g UserID:55550%2 %1_r%2 %1_r%2
bitgen  -w -g UserID:55550%2 -g DonePipe:yes -g UnusedPin:Pullup  %1_r%2 %1_r%2
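
For anyone who prefers a script to a .bat file or Makefile, here is a
rough Python equivalent of the flow above.  It is only a sketch: it
reuses the same tools, options, and part number as the batch file, leaves
out the "synth" directory cleanup, and the names build_commands and
run_flow are made up for the example.

```python
import os
import shutil
import subprocess


def build_commands(design, rev, part="xc3s400-4-pq208"):
    """Return the tool invocations of the batch file as argument lists."""
    routed = f"{design}_r{rev}"
    return [
        ["xst", "-ifn", os.path.join("..", "scripts", "xst.txt"),
         "-intstyle", "xflow"],
        ["ngdbuild", f"{design}.edf", "-p", part,
         "-uc", os.path.join("..", "files", f"{design}.ucf")],
        ["map", "-k", "6", "-detail", "-pr", "b", design],
        ["par", "-ol", "med", "-w", f"{design}.ncd", routed],
        ["trce", "-e", "-o", f"{design}_err.twr", routed],
        ["trce", "-v", "-o", f"{design}_ver.twr", routed],
        ["bitgen", "-w", "-g", f"UserID:55550{rev}", "-g", "DonePipe:yes",
         "-g", "UnusedPin:Pullup", routed, routed],
    ]


def run_flow(design, rev):
    """Run every step in order and stop at the first failing tool."""
    routed = f"{design}_r{rev}"
    for cmd in build_commands(design, rev):
        subprocess.run(cmd, check=True)
        if cmd[0] == "par":
            # Mirrors "copy %1.pcf %1_r%2.pcf" in the batch file.
            shutil.copy(f"{design}.pcf", f"{routed}.pcf")


if __name__ == "__main__":
    # Print the commands only; call run_flow() on a machine with ISE installed.
    for cmd in build_commands("top", 1):
        print(" ".join(cmd))
```

Unlike the batch file, check=True makes the script stop as soon as one
tool returns an error, which is the main thing a Makefile would add.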


-- 
Regards,
John Retta
Owner and Designer
Retta Technical Consulting Inc.
Colorado Based Xilinx Consultant

email : jretta@rtc-inc.com
web :  www.rtc-inc.com


"Sanka Piyaratna" <jayasanka.piyaratna@gmail.com> wrote in message 
news:12618sbec3j0d08@corp.supernews.com...
> Hi,
>
> I am wondering if there is anyone who has worked out a way to use ISE 8.1 
> projects with Makefiles to compile FPGA images. I am actually wondering if 
> it would be possible to automatically generate a Makefile from the project 
> file.
>
> Thank You,
>
> Sanka.

Article: 102015
Subject: Re: Xilinx 3s8000?
From: "Jeff Brower" <jbrower@signalogic.com>
Date: 9 May 2006 12:46:21 -0700

Radarman-

> We fought for a week with every part of the
> toolchain until we switched to another workstation to create the
> bitstream files - no errors, just corrupt binaries.

I don't understand the struggle.  You are violating basic usage
principles of ISE.  Never a) uninstall and re-install, or b) install a
new version on the same machine as an earlier working version.  The
only thing permitted on any given machine is service pack upgrades
within a version.  Yes that leaves you with the "5.1 machine", the "6.1
machine", etc. but that's the rule.  Your FAE should have told you
that; it's all we've ever heard since 1999 and several different FAEs.

-Jeff

Article: 102016
Subject: Superscalar Out-of-Order Processor on an FPGA
From: "Luke" <lvalenty@gmail.com>
Date: 9 May 2006 12:50:14 -0700

I've got a little hobby of mine in developing processors for FPGAs.
I've designed several pipelined processors and a multicycle processor.
Now I want to design a more modern processor.  To me, this means
superscalar, out of order execution, branch prediction, and data and
instruction cache.

I know that this type of processor isn't the best fit for the
technology.  Has anyone come across such a design for an FPGA?  I've
perused the net looking for them and have come up short.

I think a 4-Issue processor should be feasible on a Spartan 3 1000.
I've done an analysis on the major bits of hardware, and it seems quite
possible.

I've posted some links to papers I found on OOO scheduling algorithms
on my blog: http://bitstuff.blogspot.com.  The most space efficient
algorithm I found uses dependency chains and schedules instructions
into multiple FIFOs.

Any ideas or thoughts?

Article: 102017
Subject: Re: Putting the Ring into Ring oscillators
From: Jim Granville <no.spam@designtools.co.nz>
Date: Wed, 10 May 2006 07:52:56 +1200

Kolja Sulimma wrote:
> Jim Granville schrieb:
> 
> 
>>>No. The article is not talking about chained buffers for high timing
>>>resolution. Such a setup would still charge the clock lines from VDD and
>>>discharge to GND for each clock cycle.
>>>
>>>They are really talking about sending a wave around a transmission line.
>>
>>
>>Correct - see my comment on using the routes inside a FPGA for this.
>>
>>
>>>Standing wave clocking is an exotic but established technique in PCB
>>>design. At high frequencies you can use it inside ICs.
>>>A physical wave uses the same charge again and again, only resistive and
>>>EMI losses need to be refreshed by buffers PARALLEL to the transmission
>>>line.
>>
>>
>>yes, Parallel drive might test the FPGA tools some more :)
>>
>>Series drive would be a compromise, where the physical delay dominates
>>the gate delay. With each generation, the gate delays shrink faster than
>>the physical delays.
> 
> 
> I believe you are still missing the point.
> If you are chaining buffers you are still fully charging and discharging
> each node in each clock cycle using the power supply. That is not a wave.

Correct.

> The same energy is traveling back and forth (for standing waves) or in a
> circle (for circular waves). You only need to replace some damping
> effects (Resonance). 

Correct

> You definitely did not do that in a CPLD.

No, if you thought that was what I was claiming, then sorry.

> 
> You cannot have the energy cross FET-gates. At least not at frequencies
> that low so series will achieve nothing.
> 
> http://www.sigda.org/Archives/ProceedingArchives/Dac/Dac2003/papers/2003/dac03/pdffiles/40_1.pdf

Interesting - shame there is no Freq/Temperature or Freq/Vcc info.
Have you ever seen that for these circuits ?

The Freq of this wave osc is dominated by (but not wholly determined by) 
the length-transit times. It will have some dependence on Vcc, as the 
drivers (even parallel) will load and thus shift the frequency.

Bringing this back into the FPGA domain:

  The idea is to build the closest thing an FPGA fabric allows. Use the 
routing path-lengths to dominate the delays, and place the (series) 
buffers only sparingly.
  The result should be a Physical Ring Osc, where the Physical
ring dominates, and thus gives better precision.
  With each FPGA generation, the buffer effects will decrease.
65nm FPGAs are in the labs now ?
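
A back-of-envelope sketch of the idea, in Python for illustration. It uses 
the usual ring oscillator relation (one period is two trips around an 
inverting loop) and splits the loop delay into buffer delay and routing 
delay. All delay numbers are hypothetical, not measured on any device.

```python
def ring_frequency_hz(n_buffers, t_buffer_s, t_route_s):
    """f = 1 / (2 * loop delay) for a ring with one net inversion."""
    loop_delay = n_buffers * t_buffer_s + t_route_s
    return 1.0 / (2.0 * loop_delay)


def route_fraction(n_buffers, t_buffer_s, t_route_s):
    """Share of the loop delay that comes from the routing itself."""
    loop_delay = n_buffers * t_buffer_s + t_route_s
    return t_route_s / loop_delay


if __name__ == "__main__":
    # Hypothetical: one 1 ns buffer and 9 ns of routing in the loop.
    f = ring_frequency_hz(1, 1e-9, 9e-9)
    share = route_fraction(1, 1e-9, 9e-9)
    print(f"{f / 1e6:.1f} MHz, routing is {share:.0%} of the loop delay")
```

The larger the routing share, the less a Vcc or temperature shift in the 
buffer delay moves the frequency.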

-jg

Article: 102018
Subject: Re: Xilinx 3s8000?
From: "Isaac Bosompem" <x86asm@gmail.com>
Date: 9 May 2006 13:02:51 -0700

Jeff Brower wrote:
> Tom-
>
> > I really think you haven't looked enough into what can be done in
> > software before jumping to hardware; gmp-ecm is in the public domain,
> > and Bruce Dodson runs it full-time on a 120-node Opteron cluster at
> > Lehigh University.
>
> Haha, be careful of what you suggest.  Now Ron will need a
> complimentary Opteron cluster from AMD :-)
>
> -Jeff

haha, I wouldn't mind one as well. Perhaps it will make my games run
smoother :).

Hi Ron,

Are you attempting a 100% hardware solution or are you doing a mix of
both hardware and PC software?
(Forgive me if this question has already been answered).

-Isaac

Article: 102019
Subject: Re: help me to about clock in fpga
From: "Slurp" <slip@slop.slap>
Date: Tue, 9 May 2006 21:07:39 +0100


<MikeShepherd564@btinternet.com> wrote in message 
news:p9p16251v2kbvms3pbg2p10qi1c76isqjr@4ax.com...
> >
>>oh great one,
>>i thinks to and it of multiplex muchness and compile over switch load 8 
>>bit.
>>when 8 bit circumspect on 12 input (when over ride n * x bit bus) then bit
>>blit over see 16 bit.
>>only where if and but causal case but multiplex clock input pin when bus
>>equivalent.
>>
>>hope it helps
>>
> Hold on...let me guess...it was you documented Altera's LVDS
> megafunction?

LOL - much unnecessary spraying of alcoholic beverage over calculating 
machine monitor (32 bit) chortle chortle

Article: 102020
Subject: Re: Superscalar Out-of-Order Processor on an FPGA
From: "Stephen Craven" <scraven@vt.edu>
Date: 9 May 2006 13:21:20 -0700

The register file will likely be a bottleneck.  You will likely have to
time-multiplex the ports to get the required 8-read ports and 4-write
ports to support 4-issues / cycle.  Or I suppose you could make a RF
out of flops with as many ports as you need, but this would be huge.
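
A rough sizing sketch for the time-multiplexed option, in Python for
illustration.  It assumes each copy of the register file is one block RAM
used as one read port plus one write port, clocked some multiple of the
CPU clock, and that every copy must absorb all writes so the copies stay
identical.  The function name and the model itself are assumptions.

```python
import math


def replicas_needed(read_ports, write_ports, mux_factor):
    """Copies of the register file needed for the given port counts.

    mux_factor is how many RAM accesses fit into one CPU cycle.
    """
    if write_ports > mux_factor:
        # One multiplexed write port cannot take all writes in a cycle.
        raise ValueError("not enough write slots per CPU cycle")
    return math.ceil(read_ports / mux_factor)


if __name__ == "__main__":
    # 4-issue: 8 reads and 4 writes per cycle, RAM at 4x the CPU clock.
    print(replicas_needed(8, 4, 4), "copies")
```

Under this model a RAM clock of 4x the CPU clock needs two copies; a
lower ratio cannot cover the four writes without banking or some other
scheme.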

Did you have anything specific in mind for the register file?

Stephen

Article: 102021
Subject: Re: Funky experiment on a Spartan II FPGA
From: lenz19@gmx.de
Date: 9 May 2006 13:35:48 -0700

Peter Alfke schrieb:

>The long checkerboard shift register is a popular way to measure power
>consumption, but it is actually very benign with respect to Vcc spikes,
>since half the loads go Low and half go High, nicely compensating each
>other.
>The tougher test is to switch every bit in synchronism, from Low to
>High on one clock, and from High to Low on the next, etc.

Hi Peter,

could you please further elaborate this ?

How is the checkerboard shift register different than the switching
of every bit from high to low in terms of Icc and Vcc ?

I would think that Icc behaviour is the same in both cases but what
about Vcc ?

If Vcc behaves differently in the two cases, what is the reason ?

Thank you.

Article: 102022
Subject: Re: Funky experiment on a Spartan II FPGA
From: Austin Lesea <austin@xilinx.com>
Date: Tue, 09 May 2006 13:59:55 -0700

lenz,

You really have to draw yourself a picture.

I don't think anyone has really thought this through, unless they are 
doing it in reality.

For example, if I place a 1,1,1,1,... in a shift register, and clock it, 
I get a transition from a 1 to a 1, and no charging or discharging, so 
no current!

If I place 1,0,1,0,1,0 ... in the shift register, then I maximize my 
average dynamic current, as on every clock, I make a node change from 0 
to 1, or 1 to 0.

If I place an isolated 0 to 1 transition I can see the effective impulse 
response for a single transition of 0 to 1.  This would have to be done 
with all the DFF tied to the same D input, and not a giant shift 
register, however.

One has to examine how the skew across the global clock will affect the 
outcome (nothing is really synchronous in reality - never all the exact 
same phase).

So, there are many experiments one can perform, and as Peter points out, 
many of them are degenerate cases (unlikely to exist in reality).

These are exactly the kinds of patterns we use in verification and 
characterization.  And, we have been doing this for many years now.
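
To make the picture concrete, here is a small Python sketch that counts 
rising and falling nodes per clock for the patterns discussed: all ones, the 
checkerboard, and every flip-flop toggling together. It is only a counting 
model with a made-up register length of 8; it says nothing about actual 
currents or clock skew.

```python
def shift(state):
    """One clock of a circular shift register."""
    return state[-1:] + state[:-1]


def toggle_all(state):
    """One clock where every flip-flop inverts its value."""
    return [1 - bit for bit in state]


def edges_per_clock(before, after):
    """Return (rising, falling) node counts for one clock edge."""
    rising = sum(1 for a, b in zip(before, after) if (a, b) == (0, 1))
    falling = sum(1 for a, b in zip(before, after) if (a, b) == (1, 0))
    return rising, falling


if __name__ == "__main__":
    n = 8  # hypothetical register length
    ones = [1] * n
    checker = [1, 0] * (n // 2)
    zeros = [0] * n
    print("1,1,1,1...   ", edges_per_clock(ones, shift(ones)))
    print("1,0,1,0...   ", edges_per_clock(checker, shift(checker)))
    print("all toggling ", edges_per_clock(zeros, toggle_all(zeros)))
```

The checkerboard switches every node but splits them evenly between rising 
and falling, while the all-toggling case moves every node in the same 
direction on a given clock.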

Austin

Article: 102023
Subject: Re: Superscalar Out-of-Order Processor on an FPGA
From: "JJ" <johnjakson@gmail.com>
Date: 9 May 2006 14:07:32 -0700

Luke wrote:
> I've got a little hobby of mine in developing processors for FPGAs.
> I've designed several pipelined processors and a multicycle processor.
> Now I want to design a more modern processor.  To me, this means
> superscalar, out of order execution, branch prediction, and data and
> instruction cache.
>
> I know that this type of processor isn't the best fit for the
> technology.  Has anyone come across such a design for an FPGA?  I've
> perused the net looking for them and have come up short.
>
> I think a 4-Issue processor should be feasible on a Spartan 3 1000.
> I've done an analysis on the major bits of hardware, and it seems quite
> possible.
>
> I've posted some links to papers I found on OOO scheduling algorithms
> on my blog: http://bitstuff.blogspot.com.  The most space efficient
> algorithm I found uses dependency chains and schedules instructions
> into multiple FIFOs.
>

I don't consider OoO and the rest of it as real progress anymore, at
least when you get to multiple issue that isn't sustained, and esp not
in an FPGA. They only really help us get lots of theoretical
performance, mostly hitting the Memory Wall when you get to GHz of
clock. Since you won't get that clock rate, all the assumptions are
different too. On the other hand if it's a prototype for an ASIC or full
custom you never intend or can afford to fab, then performance doesn't
really matter so much.

If you really want some performance out of an FPGA I'd stick to
multithreading and latency hiding like say the Niagara and you can make
such cores quite a bit smaller and put a few in too. Then what to do
with all those threads?

I suspect if you could fully implement all the usual stuff that goes
into a x86 OoO, Branch Prediction, Register Renaming, multilevel caches
but with a much simpler instruction set say Mips or DLX, I bet the
clock rate would be no better than 25MHz. By going the opposite route
and keeping the architecture as simple as possible, one can hit the max
freq of BlockRam cycles in multi cycle design and probably get the same
performance with almost no hardware.

Take a look at the Leon, it seems to be pretty well regarded. Also Sun
did release the gate level Verilog for their Niagara, you might be able
to make some sense of it but they aren't OoO.

The more interesting part is in the MMU: do you go the same route and
implement the usual caches with regular DRAM, or look at a more
interesting DRAM like RLDRAM that has about 10-20x more real random
throughput, which changes the whole memory cache scene altogether?

I assume you have your copy of Computer Architecture: A Quantitative
Approach by H & P around to guide you,  pretty good in most respects.

> Any ideas or thoughts?

I am actually kind of surprised that no one has taken up the MMIX
instruction set by D Knuth, it is well documented, should be completely
free of legal issues, the design took in advice from many noted
architects and isn't too far from the Mips machine. It would probably
bring some notoriety to any implementors, esp if MMIX is actually
taught in schools.

John Jakson
transputer guy

Article: 102024
Subject: Re: Xilinx 3s8000?
From: "JJ" <johnjakson@gmail.com>
Date: 9 May 2006 14:19:57 -0700


radarman wrote:
> Agreed - my primary gripe is that the older versions don't play well
> with a newer version installed and vice-versa.
>
> Right now, my copy of 7.1 is unusable - and I'll have to get an admin
> to clean it up to the point where it is usable again. I can manually
> make it mostly usable by altering the system environment variables -
> but that's a bit of a pain. To top it off, the 4.2 install doesn't
> quite work properly either - probably for the same reason. It seems to
> go through mapping and PAR OK, but I have to create the downloadables
> on another system. We fought for a week with every part of the
> toolchain until we switched to another workstation to create the
> bitstream files - no errors, just corrupt binaries.
>
> When I'm more or less done maintaining the old design, I'm probably
> going to just wipe both off, and install the latest version again.

Would a VM like VMWare or VPC do the trick? Keep each OS with its ISE
version installed in a different virtual box. As long as only one runs,
all the machine resources are available to that VM guest, whether it's
Linux or Windows.

John Jakson
