Site Home Archive Home FAQ Home How to search the Archive How to Navigate the Archive
Compare FPGA features and resources
Threads starting:
Authors:A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
Hello all ... I know this is an odd request, but I'm looking for some old Xilinx software to program some Xilinx XC3000 parts. If anybody has such software they are looking to give away/sell (or knows where to find it) ... let me know. Thanks and Regards -Chris
Article: 88476
To simplify, use the attributes SYN_ENUM_ENCODING and SYN_ENCODING = "original", and don't forget to disable the FSM Compiler.
Article: 88477
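For anyone unfamiliar with those attributes, here is a minimal VHDL sketch of how they are typically written (Synplify-style syntax; the type and signal names are made up for illustration, and the exact set of legal attribute values should be checked against your synthesis tool's manual):

```vhdl
-- Sketch only: declarations as they would appear in the
-- architecture declarative region. Names here are hypothetical.
type state_t is (IDLE, LOAD, RUN);

attribute syn_enum_encoding : string;
-- Pin the enumeration to an explicit encoding of your choosing:
attribute syn_enum_encoding of state_t : type is "sequential";

signal state : state_t;
attribute syn_encoding : string;
-- "original" asks the tool to keep the source encoding rather than
-- letting the FSM extractor re-encode the state machine:
attribute syn_encoding of state : signal is "original";
```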
Hi Guys, I am working on the image processing side and I wanted to implement a function on the APEX20KE device. The clock that I want to use is the internal PLL, so kindly let me know how I can do that.
Article: 88478
Hi
According to the synthesis/implementation report, there is no indication that the clock is gated. This example is just an exercise. My intention is just to make the value of an internal signal 'counter' change "0->1->2->3->4" and see that in ChipScope Pro. The problem is that the ILA core captures only the static value '4', even though the 'RST' and 'EN' signals are driven by VIO "on the fly". If it is possible to see the change of the value, instead of a static value, I am doing something wrong. What am I missing? Thank you.
Article: 88479
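For reference, a minimal sketch of the kind of counter described above (the entity and port names are guesses, not the poster's actual code):

```vhdl
-- Hedged sketch: a counter that advances 0->1->2->3->4 and then
-- holds, with the synchronous RST and EN described in the post.
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity count_to_four is
  port (
    clk, rst, en : in  std_logic;
    counter      : out unsigned(2 downto 0)
  );
end entity;

architecture rtl of count_to_four is
  signal cnt : unsigned(2 downto 0) := (others => '0');
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if rst = '1' then
        cnt <= (others => '0');
      elsif en = '1' and cnt /= 4 then
        cnt <= cnt + 1;
      end if;
    end if;
  end process;
  counter <= cnt;
end architecture;
```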
Paul Hartke <phartke@Stanford.EDU> wrote:
> Since you didn't list the errors, it is hard to tell.

I refrain from posting so many lines, but I put it on the net: http://www.minet.uni-jena.de/~adi/xflow.log

> My sense is two Microblazes in the S3 Starter Kit will be a real
> tight squeeze.

Oh, that's good to hear. They were guessing that even five Microblazes could fit into the S3, but I never agreed with this optimistic expectation. ;)

--
mail: adi@thur.de http://adi.thur.de PGP: v2-key via keyserver
Jeder is sein Glückes Schmied, doch nich jeder hat ein schmuckes Glied
Article: 88480
Replication is usually from two sources... 1/ the fanout, as you suggest; 2/ speed improvement; 3/ borg. I would suggest there is little hope for your design as you have too much logic. The correct answer is: get a bigger device :-) However... take a look at any memories: if they are distributed and not block RAM, they will eat LUTs. Then look at shift registers... SRL16? Think about what you are trying to achieve and see if there's a simpler solution.
Simon

"Brandon" <killerhertz@gmail.com> wrote in message news:1124375215.587610.307000@g43g2000cwa.googlegroups.com...
> Hello,
>
> I'm synthesizing a design in XST and I'm having a hard time figuring
> out what's consuming all of the device's resources.
>
> I wrote mostly structural VHDL, so I decided to synthesize each
> component separately to get a better idea of the low-level utilization.
> I haven't seen any option in XST to see a hierarchical analysis of
> area... Anyway, I estimated the resource consumption of my design,
> excluding routing, the FSM, and some other small amounts of logic and
> multiplexing:
>
>         Slice Count   Slice FFs   4-input LUTs
>         -----------   ---------   ------------
>  used:        10936       29048          12406
>  total:       23616       47232          47232
>         -----------   ---------   ------------
>              46.31%      61.50%         26.27%
>
> Here is the actual:
> Number of Slices:           45523 out of 23616  192% (*)
> Number of Slice Flip Flops: 22611 out of 47232   47%
> Number of 4 input LUTs:     78378 out of 47232  165% (*)
>
> When looking in the synthesis report, I noticed some warnings
> indicating that duplicate FFs were removed, so that explains the
> reduction in FF count. However, I cannot explain the HUGE increase in
> LUT and Slice usage. What can I infer from this?
>
> The report also tells me that some of my 6-bit counter signals are
> being replicated (once or twice). What is the cause of this? High
> fan-out?
> <SNIP>
> FlipFlop cnt_dout_ins_cnt_v_0 has been replicated 2 time(s)
> FlipFlop cnt_dout_ins_cnt_v_1 has been replicated 1 time(s)
> FlipFlop cnt_hreg_ins0_cnt_v_0 has been replicated 2 time(s)
> FlipFlop cnt_hreg_ins0_cnt_v_1 has been replicated 1 time(s)
> FlipFlop cnt_hreg_ins10_cnt_v_0 has been replicated 2 time(s)
> FlipFlop cnt_hreg_ins10_cnt_v_1 has been replicated 1 time(s)
> FlipFlop cnt_hreg_ins11_cnt_v_0 has been replicated 2 time(s)
> </SNIP>
>
> Is there any way to decipher the cell usage count, perhaps? Does anyone
> have a URL that includes an explanation of all the cell names? I also
> checked the macro statistics and everything is accounted for in that
> table.
>
> Thanks.
> -Brandon
Article: 88481
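On Simon's SRL16 hint: a shift register coded without a reset (and without reading intermediate taps) can usually be inferred into a single LUT-based SRL16 instead of sixteen discrete flip-flops, which can save a lot of slices. A hedged sketch of the coding style XST recognizes:

```vhdl
-- Sketch only: a 16-deep shift register written so the tool can
-- map it to one SRL16 LUT primitive rather than 16 flip-flops.
library ieee;
use ieee.std_logic_1164.all;

entity srl_shift is
  port (
    clk  : in  std_logic;
    din  : in  std_logic;
    dout : out std_logic
  );
end entity;

architecture rtl of srl_shift is
  signal sr : std_logic_vector(15 downto 0) := (others => '0');
begin
  process (clk)
  begin
    if rising_edge(clk) then
      -- No reset on the shift chain: this is what permits SRL16 inference
      sr <= sr(14 downto 0) & din;
    end if;
  end process;
  dout <= sr(15);
end architecture;
```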
Click on
http://answers.altera.com/altera/resultDisplay.do?page=http%3A%2F%2Fwww.altera.com%2Fliterature%2Fan%2Fan115.pdf&result=12&responseid=b0b5a30041c7c737%3A1381e7%3A105cd8c9f6e%3A34&groupid=1&contextid=195%3A4512.4608%2C36605.36685%2C43030.43213&clusterName=DefaultCluster&doctype=1002&excerpt=APEX+20K+Devices+APEX+20K+devices+have+one+PLL+that+features+ClockLock+and+ClockBoost+circuitry.#Goto4512
If that does not work, go to www.altera.com. Click on "Find Answers" under Support on the altera.com home page. Enter "APEX20K and PLL" in the Ask a Question box. It is the 5th answer from the top, labeled "AN115: Using the ClockLock and ClockBoost features". Hope this helps.

Subroto Datta
Altera Corp.

"Designfreek" <vryaag@gmail.com> wrote in message news:1124429585.211289.227250@z14g2000cwz.googlegroups.com...
> Hi Guys
> i am working on the image processing side and i wanted to impliment a
> function on the APEX20KE device and the clock that i want to use is the
> internal PLL so Kindly let me know how can i do that.
Article: 88482
You are not using any Service Packs. I'd upgrade to both the latest EDK and ISE service packs as a first step. Don't forget that no matter what you are doing, it's very likely that more logic than just the Microblaze core(s) itself will be required to make a useful system.

Paul

Adrian Knoth wrote:
> Paul Hartke <phartke@Stanford.EDU> wrote:
>
> > Since you didn't list the errors, it is hard to tell.
>
> I refrain from posting so many lines, but I put it on the net:
>
> http://www.minet.uni-jena.de/~adi/xflow.log
>
> > My sense is two Microblazes in the S3 Starter Kit will be a real
> > tight squeeze.
>
> Oh, that's good to hear. There were guessing that even five
> Microblazes could fit into the S3, but I never agreed with
> this optimistic expectation. ;)
>
> --
> mail: adi@thur.de http://adi.thur.de PGP: v2-key via keyserver
>
> Jeder is sein Glückes Schmied, doch nich jeder hat ein schmuckes Glied
Article: 88483
Hi Designfreek,

> Hi Guys
> i am working on the image processing side and i wanted to impliment a
> function on the APEX20KE device and the clock that i want to use is the
> internal PLL so Kindly let me know how can i do that.

I guess Subroto's answer is as complete as it can get. Just a few comments: be sure that you're targeting an APEX20KEblablabla-X device - the non-X devices' PLLs are untested. I'd personally pick a newer Altera family member, such as the Cyclone or Stratix devices - much more versatile PLLs and much higher performance - but I guess you've got to work using an existing board.

Best regards,
Ben
Article: 88484
Pasacco wrote:
> Hi
>
> According to synthesis/implementation report, there is no indication
> that the clock is gated.
>
> This example is just an exercise.
> My intention is just to make the value of an internal signal 'counter'
> change "0->1->2->3->4" and see that in ChipScope Pro.
>
> Problem is that the ILA core captures only static value '4', even
> though 'RST' and 'EN' signal are driven by VIO "on the fly".
>
> If it is possible to see the change of the value, instead of a static
> value, I am doing something wrong. What am I missing?
>
> Thankyou.

You sent me more information on this in email and added another note here. Try doing this:
1) In the ILA window, change the trigger condition to a falling-edge transition of the RESET line instead of the static low that you have now.
2) In the VIO window, pulse the RESET button.
3) Go back to the ILA window and you should see 0, 1, 2, 3, 4, 4, 4, 4...

Ed
Article: 88485
Open a webcase. We will support old software if you tell us what you need.

Austin

Chris Beg wrote:
> Hello all ...
>
> I know this is an odd request, but i'm looking for some old xilinx software
> to program some xilinx xc3000 parts. If anybody has such software they are
> looking to give away/sell (or know where to find) ... let me know.
>
> Thanks and Regards
> -Chris
Article: 88486
Hello,
Has anybody already made a comparison of the high-performance FPGAs (Stratix II, V4, ...) with respect to double-precision floating-point performance (add, mult, div, etc.)? It's for an HPC application.
Thanks
Marc
Article: 88487
Hi
Falling-edge trigger, and now it works. Thank you for the nice comment and correction.
With gratitude
Article: 88488
Does anybody know how to program access to the Altera USB Blaster? I am trying to port Altera's SRunner software, which currently only supports a ByteBlaster II on a Windows environment, to support the USB Blaster (since this is the only download cable I have) on either Windows or Linux. Any help would be appreciated; Altera does not seem to release the API.

--
Dr Joachim Schambach
The University of Texas at Austin
Department of Physics
1 University Station C1600
Austin, Texas 78712-0264, USA
Phone: (512) 471-1303; FAX: (814) 295-5111
e-mail: jschamba@physics.utexas.edu
Article: 88489
I have inherited a nearly-working FPGA SDRAM controller, but my testing shows I have got the structure wrong, partly due to lack of data on Kingston's site. The module in question is the Kingston KVR133X64C3/1G. The Verilog I have inherited caters for 11 column bits, 13 row bits, 4 banks and two select lines. The module has sixteen chips on it, which I thought might be eight bits wide each, so there would have to be two chip select lines. But my testing shows something wrong with the way I assign row/column/bank/CS. Maybe it is in fact sixteen 4-bit chips and just the one chip select, but a test assuming that shows I'm still losing a bit somewhere. So what is the structure of this module, and does the column go out on A0-A9(,A11,A12)? Googling throws up surprisingly little data, given that I'm not out to buy them.

Jon
Article: 88490
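For what it's worth, the organization can be estimated from the capacity alone (a hedged back-of-envelope, since Kingston's datasheet isn't to hand): a 1 GB module built from sixteen chips implies 512 Mbit per chip, and a 64-bit DIMM bus built from x8 chips gives two ranks of eight, hence two chip selects. The column width then follows from the 4 banks and 13 row bits:

```latex
% 1 GB module, 16 chips, 64-bit data bus:
\frac{8192\ \text{Mbit}}{16\ \text{chips}} = 512\ \text{Mbit/chip}
\qquad
\frac{64\ \text{bit bus}}{8\ \text{bit/chip}} = 8\ \text{chips/rank} \Rightarrow 2\ \text{ranks}
% Column bits for a 512 Mbit x8 part, 4 banks, 13 row bits:
\qquad
\frac{2^{29}\ \text{bits}}{2^{2}\ \text{banks} \cdot 2^{13}\ \text{rows} \cdot 2^{3}\ \text{bits}} = 2^{11}\ \text{columns}
```

Eleven column bits on an x8 SDR SDRAM part are conventionally driven on A0-A9 plus A11, with A10 reserved for auto-precharge - but this should be confirmed against the actual chip datasheet.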
Marc,

IEEE floating point standard? You need to be more specific.

Does it need to integrate with a processor? I believe the Xilinx IBM 405 PowerPC using the APU interface in Virtex 4 with the floating point IP core provides the best and fastest performance. Especially since no other FPGA vendor has a hardened processor to compete with us.

If all you want is the floating point processing, without a microprocessor, then I think you will find similar performance between Xilinx and our competition, with us (of course) claiming the superior performance edge. It would not surprise me at all to see them also post claiming they are superior.

For a specific floating point core, with a given precision, for given features, it would be pretty easy to benchmark, so there is very little wiggle room here for marketing nonsense.

I would be interested to hear from others (not competitors) about what floating point cores they use and how well they perform (as you obviously are interested).

Austin

Marc Battyani wrote:
> Hello,
>
> Does anybody already made a comparison of the high performance FPGA (Stratix
> II, V4, ?) relative to double precision floating point performance (add,
> mult, div, etc.) ?
>
> It's for an HPC aplication.
>
> Thanks
>
> Marc
Article: 88491
Marc Battyani (Marc.Battyani@fractalconcept.com) wrote:
: Hello,

: Does anybody already made a comparison of the high performance FPGA (Stratix
: II, V4, ?) relative to double precision floating point performance (add,
: mult, div, etc.) ?

: It's for an HPC aplication.

Hi Marc,
I don't have a comparison of various cores, but a lot of info is out there in datasheets.

However, in an HPC application the performance of your maths cores may not be the bottleneck; rather, it is likely to be a question of how fast you can interface the host system to the FPGA, how fast you can shunt data around between CPU, CPU RAM, FPGA and FPGA RAM, etc.

The heavyweight HPC/FPGA hybrid systems I have seen, such as the Cray XD1 and SGI NUMAflex/Altix stuff, use Xilinx FPGAs. Although I wouldn't want to generalise for the whole field, other interested parties such as Nallatech and Starbridge Systems tend to go for Xilinx. Certainly Xilinx seem to have a head start in the field (not thanks to their tools, from the word on the street :-) - possibly this has more to do with interfacing than FP core performance.

Not answering the original question, but there you go :-)

Cheers,
Chris

(A strong believer in FPGA-type stuff for HPC, although perhaps the granularity is less than optimal and the tools not very well suited, but hey, it's early days.)

: Thanks
: Marc
Article: 88492
While an x86 or Cell cluster could whip an FPGA at IEEE FPU in raw clock speed (I am not sure about cost, though), you can flip the odds some by defining your own numerics with a direct mapping to the plentiful 18-bit muls.

If I am not mistaken, IEEE is not the be-all and end-all of FPU, and it has a certain number of detractors, especially in some fields, regarding rounding, exceptions, etc. If you do define your own FP set, you can simulate it fairly easily right in your HPC app and see if it gives comparable results. For instance, 1, 2 or 4 multipliers running a 37-bit mantissa might be enough to not use double IEEE; only you can figure that out.

I think I'd even go for a custom CPU design with a highly serial 18x18 datapath and try to pump it as fast as the fabric will allow. I notice that the soft-core FPUs out there don't run anywhere near the 300 MHz speeds being quoted for mul units. Perhaps the V4 500 MHz DSP block can be microcoded into a decent FPU unit, but as soon as you need the odd features...

Anyway, I think that's what I would do. If that doesn't work too well, then I'd look at QinetiQ and other vendors; these links can be found on the X, A sites.

So what is your app and what hardware are you running on?
Article: 88493
"Austin Lesea" <austin@xilinx.com> wrote
> Marc,
>
> IEEE floating point standard? You need to be more specific.

IEEE 754. It's for a computational accelerator. It will get values from a general purpose processor (Xeon, Itanium, etc.) and send the results back in the same format, though the internal computations could be done in another format. The other stuff needed is pretty standard (PCI(-X or Express), DDR2, etc.).

> Does it need to integrate with a processor?

No.

> I believe the Xilinx IBM 405 Power PC using the APU interface in Virtex
> 4 with the floating point IP core provides the best and fastest performance.
>
> Especially since no other FPGA vendor has a hardened processor to
> compete with us.

OK, that one is easy. ;-)

> If all you want is the floating point processing, without a
> microprocessor, then I think you will find similar performance between
> Xilinx and our competition, with us (of course) claiming the superior
> performance edge.

The idea is to hardwire some formula by doing the maximum number of concurrent FLOPs. This is the only way to go faster than a very fast processor like an Itanium II or even a simple Xeon.

> It would not surprise me at all to see them also post claiming they are
> superior.
>
> For a specific floating point core, with a given precision, for given
> features, it would be pretty easy to bench mark, so there is very little
> wiggle room here for marketing nonsense.
>
> I would be interested to hear from others (not competitors) about what
> floating point cores they use, and how well they perform (as you
> obviously are interested).

Sure! And this time it should be easy to get useful technical numbers.

Marc
Article: 88494
JJ,

Perhaps you should read http://www.xilinx.com/bvdocs/ipcenter/data_sheet/floating_point.pdf first?

At 429 MHz for a Virtex 4, a square root takes 56 clocks, or 130.5 ns for the answer: 7.663 million floating point square roots per second. And if you need more, you can implement more than one core and get more than one answer per 56 clocks... I am not aware of any x86 that can run quite that fast (even for one core). Their claims are that the floating point hardware unit speeds up the software execution by at least a factor of 5. We are talking here about a speedup of 80 to 100 times over using fixed point integer software to emulate a floating point square root... not a factor of 5!

Austin

JJ wrote:
> While an x86, or cell cluster could whip FPGA at IEEE FPU in raw clock
> speed ( I am not sure about cost though), you can flip the odds some by
> defining your own numerics with a direct mapping to the plentifull
> 18bit muls.
>
> If I am not mistaken IEEE is not the be all and end all of FPU and has
> a certain no of detractors esp in some fields regarding rounding,
> exceptions etc. If you do define your own FP set you can simulate it
> farely easly right on your HPC app and see if it gives comparable
> results. For instance 1,2 or4 multipliers running a 37b mantissa might
> be enough to not use double IEEE, only you can figure that out.
>
> I think I even go for a custom cpu design with a highly serial by 18.18
> datapath and try to pump it as fast as the fabric will allow. I notice
> that the soft core FPUs out there don't run anywhere near the 300MHz
> speeds being quoted for mul units. Perhaps the V4 500MHz DSP block can
> be microcoded into a decent FPU unit but as soon as you need the odd
> features,
>
> Anyway I think thats what I would do, if that doesn't work too well
> then I look at qinetix and other vendors, these links can be found on
> the X,A sites.
>
> So what is your app and what hardware are you running on?
Article: 88495
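Austin's figures are internally consistent; as a quick check of the arithmetic:

```latex
t_{\mathrm{sqrt}} = \frac{56\ \text{cycles}}{429\times 10^{6}\ \text{cycles/s}} \approx 130.5\ \text{ns}
\qquad
\frac{429\times 10^{6}\ \text{cycles/s}}{56\ \text{cycles/op}} \approx 7.66\times 10^{6}\ \text{square roots/s}
```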
"c d saunter" <christopher.saunter@durham.ac.uk> wrote:
> Marc Battyani (Marc.Battyani@fractalconcept.com) wrote:
> : Hello,
>
> : Does anybody already made a comparison of the high performance FPGA (Stratix
> : II, V4, ?) relative to double precision floating point performance (add,
> : mult, div, etc.) ?
>
> : It's for an HPC aplication.
>
> Hi Marc,
> I don't have a comparisom of various cores but a lot of info is out
> there in datasheets.
>
> However, in an HPC application the performance of your maths cores may not
> be the bottleneck, rather it is likely to be a question of how fast can
> you interface the host system to the FPGA, how fast can you shunt data
> around between CPU, CPU RAM, FPGA and FPGA RAM etc.

Yes, memory bandwidth is one of the bottlenecks, especially for the general purpose processors.

> The heavyweight HPC/FPGA hybrid systems I have seen, such as the Cray-XD1
> and SGI NUMAflex/Altix stuff use Xilinx FPGAs.

Very interesting. In fact this is what we want to do (on a smaller scale, probably ;-). I find it somewhat depressing to see that Cray can't come up with something much better than a bunch of FPGAs, but at the same time it's very cool to have access to the same technology as Cray. Or even better, as they seem to use Virtex II :)

> Although I wouldn't want to generalise for the whole field, other
> interested parties such as Nallatech and Starbridge Systems tend to go
> for Xilinx.

OK.

> Certianly Xilinx seem to have a head start in the field (not thanks to
> their tools from the word on the street :-) - possibly this has more to do
> with interfacing than FP core performance.
>
> Not answering the origional question, but there you go :-)

Well, in fact I'm also interested in the whole HPC/FPGA question anyway.

> Cheers,
> Chris
>
> (A strong believer in FPGA type stuff for HPC, although perhaps the
> granularity is less than optional and the tools not very well suited, but
> hey it's early days.)

Sure, much fun anyway.

Marc
Article: 88496
"JJ" <johnjakson@yahoo.com> wrote in message news:1124484934.397020.194050@g43g2000cwa.googlegroups.com...
> While an x86, or cell cluster could whip FPGA at IEEE FPU in raw clock
> speed ( I am not sure about cost though), you can flip the odds some by
> defining your own numerics with a direct mapping to the plentifull
> 18bit muls.

Using a grid is fine when the problem can be parallelized with rather coarse granularity, but that's not always the case.

> If I am not mistaken IEEE is not the be all and end all of FPU and has
> a certain no of detractors esp in some fields regarding rounding,
> exceptions etc. If you do define your own FP set you can simulate it
> farely easly right on your HPC app and see if it gives comparable
> results. For instance 1,2 or4 multipliers running a 37b mantissa might
> be enough to not use double IEEE, only you can figure that out.

Yes, I thought about using a 36-bit mantissa to reduce the number of hard multipliers needed and the latency. The inputs/outputs need to be in IEEE 754, though.

> I think I even go for a custom cpu design with a highly serial by 18.18
> datapath and try to pump it as fast as the fabric will allow. I notice
> that the soft core FPUs out there don't run anywhere near the 300MHz
> speeds being quoted for mul units. Perhaps the V4 500MHz DSP block can
> be microcoded into a decent FPU unit but as soon as you need the odd
> features,
>
> Anyway I think thats what I would do, if that doesn't work too well
> then I look at qinetix and other vendors, these links can be found on
> the X,A sites.
>
> So what is your app and what hardware are you running on?

The apps can be rather diverse. In fact, as Christopher pointed out, it looks like we are doing some kind of small Cray XD1 ;-) As for the hardware, we are designing it.

Marc
Article: 88497
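The appeal of trimming the mantissa to a multiple of 18 bits can be seen from a partial-product count (hedged arithmetic: an IEEE 754 double carries a 53-bit significand including the hidden bit, and this ignores the bookkeeping for the hard multipliers' signed inputs):

```latex
\lceil 53/18 \rceil = 3\ \text{limbs} \Rightarrow 3 \times 3 = 9\ \text{partial products per multiply}
\qquad
\lceil 36/18 \rceil = 2\ \text{limbs} \Rightarrow 2 \times 2 = 4\ \text{partial products}
```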
Hi Austin,

Very interesting, but V4/S3E is still pretty darn new; I don't check on it every 5 mins. QinetiQ is definitely hot in this area (not surprising given their (sq) roots at RSRE).

At some point I will do a detailed study of FPGA FPU design vs. x86 FPU numbers for my transputer project.

From the OP's website I can't guess what iron he'd use, but the application seems a bit clearer now. Usually when I see HPC-FPGA, I might infer somebody working with Opteron + Virtex-II Pro systems like the Cray or SGI kits, but it doesn't look like it here.

Regards
JJ
Article: 88498
JJ,

Something I just couldn't find anywhere was the actual performance of the x86 co-processor for something like a floating point square root. We have clock cycles for each IEEE floating point operator, and the speed of the synthesized, placed and routed core for various families, from Spartan 3 to Virtex 4, in that pdf file.

I suppose uP software people don't really care about performance in terms of cycles or ns or mops... it's all about what game screen graphics are displayed in the coolest fashion...

Does anyone have a link to a site that has 'real' data on floating point op performance?

Austin

JJ wrote:
> Hi Austin
>
> Very interesting, but V4,S3E is still pretty darn new, I don't check on
> it every 5mins but QinetiQ is definitely hot in this area (not
> surprising given their (sq) roots at RSRE).
>
> At some point I will do a detailed study of FPGA FPU design v x86 FPU
> numbers for my transputer project.
>
> From the OPs website I can't guess what iron he'd use but the
> application seems a bit clearer now.
>
> Usually when I see HPC-FPGA, I might infer somebody working with
> Opteron+VirtexII Pro sytems like Cray, SGI kits but doesn't look like
> it here.
>
> Regards
>
> JJ
Article: 88499
Austin,

Well, I can't help with any FPU performance links just yet; I would have thought that QinetiQ would have a lot of that. I always thought square root was in the same ballpark as division for cycles. The comp.arch NG often has sessions on FP math; some of the regulars are fairly clued up on it, especially Nick M. As they say, the only good benchmark is your own application, and usually there are many other factors involved than raw SpecFP numbers. At least FPGAs can be fairly "transparent" (if you can put a prototype synthesis together), whereas figuring timing for OoO code can be tricky.

JJ