Messages from 45425

Article: 45425
(removed)

Article: 45426
Subject: Re: xilinx v ti
From: Uwe Bonnes <bon@elektron.ikp.physik.tu-darmstadt.de>
Date: Tue, 23 Jul 2002 10:01:25 +0000 (UTC)

In comp.arch.fpga Kevin Neilson <kevin-neilson@removethistextattbi.com> wrote:
: Word.  C->gates is still a pipe dream.  Maybe some day...

: "Ray Andraka" <ray@andraka.com> wrote in message
<4.5 kBytes of quote deleted>

Kevin,

please cut down what you quote to avoid spoiling the archives.

Also, faked return addresses are deprecated.

Bye

-- 
Uwe Bonnes                bon@elektron.ikp.physik.tu-darmstadt.de

Institut fuer Kernphysik  Schlossgartenstrasse 9  64289 Darmstadt
--------- Tel. 06151 162516 -------- Fax. 06151 164321 ----------

Article: 45427
Subject: Re: xilinx v ti
From: Ray Andraka <ray@andraka.com>
Date: Tue, 23 Jul 2002 12:16:57 GMT

I suspect it always will be.  Software and hardware have totally different
design constraints.  C code is inherently sequential, and it takes great pains
in coding to make it map to parallel hardware.   The processors for C can afford
to have very elaborate instruction units because there is usually only one and
it is used for all instructions.  Hardware solutions, on the other hand, should
strive to minimize the complexity of the data path, because each part is
typically only used by a small part of the algorithm.

Kevin Neilson wrote:

> Word.  C->gates is still a pipe dream.  Maybe some day...
>

Article: 45428
Subject: Re: spiral / waterfall /watersluice : Which are your methods?
From: Abernathey Family <family2@aracnet.com>
Date: Tue, 23 Jul 2002 07:26:48 -0500

Domagoj wrote:
> 
> Verification patterns (like in E) and formal verification patterns seem
> like an obvious copying of well-established and studied programming
> patterns. Project management and versioning is pretty much the same, too.
--snip--

I think you're right. But we will see a pick & choose approach to
software methods. Why? Moore's Law drives processor and, to a lesser
extent, ASIC design. What drives software design? Gates' Law? My point is that
chip design is already more complex than software design with no
slowdown in sight. Chip design methodology must advance faster than
software methodology.

-Don

Article: 45429
Subject: Re: How could I generated an efficient 16*16 multiplier in Vertex-II?
From: Ray Andraka <ray@andraka.com>
Date: Tue, 23 Jul 2002 12:27:22 GMT

Re: the carry chains.  The Tciny and the time to get off the carry chain into
the flip-flop are both much longer than you would expect considering the die
shrink and everything.  Unlike virtex and virtexE (which often got limited by
SRL16 minimum pulse width, routing distances and fanout), it looks like the
carry chain is going to set the upper limit on DSP performance in V2.

The improved multipliers are around 250 MHz, depending on which set of numbers you
believe and how carefully you add pipeline registers immediately before and
after the multipliers.  You can get to the timing numbers by putting the
CONFIG_STEPPING property on the multipliers.

Kevin Neilson wrote:

> Ray,
> Thanks for that observation about the carry chains.  I thought I might be
> crazy because I was getting worse response from the adders on the V2 than I
> was on the Virtex Es.  I posted a question about this but never got a
> response.  I think there is a value called Tciny(?) that deals with getting
> data on the carry chain that's a lot bigger for V2 than it was for VE.  It's
> depressing looking at the paths, because the carry chains are so fast, but
> getting data to the chain and on it is so slow.
>
> How much faster are the enhanced multipliers going to be?  I've been
> disappointed with the fact that with each service pack the multipliers get
> slower and that the pipelined multiplier doesn't yield very much benefit,
> but I have to say that I'm still very glad Xilinx put these in.  In the
> design I'm doing right now I'm using about 25 for halfband interpolators,
> mixers, linear interpolators, gain stages, etc.  I would never have the
> gates to do this with fabric-based multipliers.  In many cases the
> multipliers weren't fast enough for my needs and I had to double up and
> operate in parallel, but since I have 40 multipliers in the part this isn't
> a problem.
>
> Hua:
> I'd recommend this technique over using pipelined fabric multipliers.  A
> fabric multiplier can get you over 200MHz if you pipeline every stage, but
> this will eat up a lot of gates.  Using two embedded multipliers will also
> get you over 200MHz in fewer cycles if you demux the data stream into two
> multicycle paths and use two multipliers and then remux.  (You have to set
> the constraints and clock enables properly.)  Then you haven't burned up
> nearly as many of the valuable fabric gates.  This might not work in every
> application, but works in most DSP applications.
> -Kevin
>
> "Ray Andraka" <ray@andraka.com> wrote in message
> news:3D3CAE7C.9325F62F@andraka.com...
> > This depends on which silicon.  Silicon produced before this spring has
> > slow multipliers that are pretty easy to beat with a pipelined multiplier
> > in the fabric.  With the fixed silicon, it is difficult to match the
> > speed of the multipliers with a pipelined multiplier in the fabric,
> > because it takes a pretty long time getting on and off the carry chains.
> > In fact, from what I've seen so far, the carry chains are no faster than,
> > and perhaps a little slower than, the virtexE carry chains :-(.
> >
> > Jay wrote:
> >
> > > Those hardwired multipliers are about as fast as you're going to get
> > > for a single cycle multiply that wide, in that process technology.  If
> > > you can stand the latency, you could probably get a faster pipelined
> > > multiplier using the logic and hand placement.  What speed do you
> > > need?  Are both factors really 16 bits wide and every bit can vary
> > > every clock?
> > >
> > > Regards
> > >
> > > HUA QIAN <qianhua@ece.gatech.edu> wrote in message
> > > news:<3D39828F.189F00AB@ece.gatech.edu>...
> > > > Hello, all,
> > > >
> > > > I noticed that Xilinx Vertex-II provide 18*18 multipliers, which
> > > > introduce a lot of delays.  Can I generate a more efficient 16*16
> > > > multiplier, which is my target, and give me a shorter delay?
> > > >
> > > > Another question is how to determine the clock speed for the Vertex-II
> > > > embedded multiplier?
> > > >
> > > > Any advice or help is greatly appreciated!
> > > >
> > > > Hua
> >
> > --
> > --Ray Andraka, P.E.
> > President, the Andraka Consulting Group, Inc.
> > 401/884-7930     Fax 401/884-7950
> > email ray@andraka.com
> > http://www.andraka.com
> >
> >  "They that give up essential liberty to obtain a little
> >   temporary safety deserve neither liberty nor safety."
> >                                           -Benjamin Franklin, 1759
> >
> >

--
--Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930     Fax 401/884-7950
email ray@andraka.com
http://www.andraka.com

 "They that give up essential liberty to obtain a little
  temporary safety deserve neither liberty nor safety."
                                          -Benjamin Franklin, 1759

Article: 45430
Subject: Re: How could I generated an efficient 16*16 multiplier in Vertex-II?
From: Ray Andraka <ray@andraka.com>
Date: Tue, 23 Jul 2002 12:35:05 GMT

Kevin, you probably could have used distributed arithmetic to get the logic to
fit within the fabric if you did not have the multipliers.  After all you have a
total of 40 multiplies which are presumably used in a sum of products
architecture and that have a top clock rate in the 130-150 MHz range if you use
pipeline registers before and after them.

Kevin Neilson wrote:

> How much faster are the enhanced multipliers going to be?  I've been
> disappointed with the fact that with each service pack the multipliers get
> slower and that the pipelined multiplier doesn't yield very much benefit,
> but I have to say that I'm still very glad Xilinx put these in.  In the
> design I'm doing right now I'm using about 25 for halfband interpolators,
> mixers, linear interpolators, gain stages, etc.  I would never have the
> gates to do this with fabric-based multipliers.  In many cases the
> multipliers weren't fast enough for my needs and I had to double up and
> operate in parallel, but since I have 40 multipliers in the part this isn't
> a problem.
>

--
--Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930     Fax 401/884-7950
email ray@andraka.com
http://www.andraka.com

 "They that give up essential liberty to obtain a little
  temporary safety deserve neither liberty nor safety."
                                          -Benjamin Franklin, 1759

Article: 45431
Subject: Re: 16 X 16 multplier
From: Ray Andraka <ray@andraka.com>
Date: Tue, 23 Jul 2002 12:36:39 GMT

The construction is the same as pipelined multipliers but without the
registers.  You might look at the multipliers page on my website as a starting
point.

Reala wrote:

> Dear all,
>
> I would like to design of 16 X 16 multiplier with single clock cycle.
> I try to search internet but fail. Any free design and hint to design this
> kind of multiplier? Size reduction is need for the design.
>
> Thank a lot.
> Reala

--
--Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930     Fax 401/884-7950
email ray@andraka.com
http://www.andraka.com

 "They that give up essential liberty to obtain a little
  temporary safety deserve neither liberty nor safety."
                                          -Benjamin Franklin, 1759

Article: 45432
Subject: Re: Translate the design from FPGA to Custom IC
From: Ray Andraka <ray@andraka.com>
Date: Tue, 23 Jul 2002 12:43:00 GMT

Back in the 60's RTL  meant resistor-transistor logic, which was a forerunner to
TTL logic.  Today it means register transfer level.  Register transfer level
design basically means that you specify the registers explicitly and the logic
between the registers implicitly (behaviorally).  Most HDL synthesizers do RTL
synthesis.

Reala wrote:

>
> What is RTL (Register Tran...Logic) I know the name but not really know the
> meaning? What tools for RTL synthesis?
>

--
--Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930     Fax 401/884-7950
email ray@andraka.com
http://www.andraka.com

 "They that give up essential liberty to obtain a little
  temporary safety deserve neither liberty nor safety."
                                          -Benjamin Franklin, 1759

Article: 45433
Subject: Editing constraints in WebPack
From: "Børge Strand" <borge.strand.remove.if.not.spamming@sintef.no>
Date: Tue, 23 Jul 2002 15:21:15 +0200

The pride of getting hello world (dip switches to seven-segment displays)
working in Verilog on my SpartanIIe board has been replaced by a bunch of
new questions. I hope you can help me out a bit on these ones;

I'm a bit at a loss when it comes to handling the different constraints
files and constraints databases in WebPack.

What I do is change the name of an input or output in my .v file, hit "Edit
Implementation Constraints (Constraints Editor)", and have it reporting
errors when parsing the .ucf-file. What is the recommended way to maintain
constraints?

Also, how should I enter timing and compactness constraints? And equally
important, which report do I read to find out what timing can be expected
from the actual implementation?


Thanks,

Børge

Article: 45434
Subject: Re: Translate the design from FPGA to Custom IC
From: "Børge Strand" <borge.strand.remove.if.not.spamming@sintef.no>
Date: Tue, 23 Jul 2002 15:59:08 +0200

This is interesting. I too work on something that is tested in Verilog and
later to be implemented in an ASIC. When the Verilog code works on the FPGA
board, I'm going to be travelling the same route as you are.

What my boss said was that we could work in parallel on the code and on
implementing and laying out some dedicated building blocks. Like, it is
quite probable that we will need a 4-bit adder and friends. Our ASIC process
is proprietary and has to work with some touchy analog stuff. So what we do
is make our own digital cells.

I guess, though I am in no way experienced in this, there are programs which
can be told what your cells look like (fanout, fanin, digital function, etc)
and produce a netlist for that particular "library". Even greater would be
if it could also be told about the pin locations of the cells and then
autoroute the whole thing.

Any information about typical FPGA and digital ASIC flows will be
appreciated! I have worked mostly with analog board layout, where the flow
is draw schematic, simulate, layout. The digital flow seems a bit more
complicated to me.

Regards,

Børge

"Reala" <manfield.chow@scoreconcept.com> wrote in message
news:ahitls$qkr10@imsp212.netvigator.com...
> Dear Kevin,
>
> Thank you for your detailed reply.
> Actually, I work in a IC design company. My boss want to develop a low-end
> DSP chip. However, we are less experience in this.
> We think that one of the important building block is 16X16 small size,
> single cycle multiplier.
> I write simple verilog and synthesis by Xilinx Web pack tools. It seems
> that work.
> Assuming it is work, I want to open some output files to see what
> "circuit" is synthesised, because I will design a DSP chip. But i do not
> know which output files mention the netlist of the "synthesised design"
> in gate level.
>
> I guess that the verilog code will be synthesised by synthesis according
> to synthesis tool's library. Am I correct? Can i force the synthesis tool
> to synthesis the verilog code without using library? (I means the design
> is synthesised in gate level ...AND OR XOR.....) Then, can i see the
> netlist in gate level such that I can study the design synthesised by the
> synthesis tool?
>
> You say that:
> >To make sure the synthesized design was synthesized correctly,
> >do a gate-level simulation of the synthesized design.
> >You should be able to run the same testbench code you used for an RTL
> >simulation.
>
> I am not really understand because I am a beginner of IC design.
> what is the meaning of gate-level simulation? by what kind of tools?
> Modelsim? Xilinx? or other?
>
> What is RTL (Register Tran...Logic) I know the name but not really know
> the meaning? What tools for RTL synthesis?
>
> Thank again ^_^
> Reala
>
>
>
>
> "Kevin Brace" <killspam4kevinbraceusenet@killspam4hotmail.com> wrote in
> message news:ahip0o$rea$1@newsreader.mailgate.org...
> >         You will want to avoid using vendor specific features (vendor
> > specific primitives) as much as possible if your goal is to do an FPGA
> > to ASIC conversion.
> > Also, you will need to have sufficient volume to justify the NRE (Non
> > Recurring Engineering) fee you will have to pay upfront.
> > There are firms like AMI Semiconductor, Chip Express, Lightspeed
> > Semiconductor, and NEC (And a few more I cannot think of right now.)
> > that do an FPGA to ASIC conversion if you submit them the EDIF netlists
> > generated from your FPGA synthesis tool.
> >         To make sure the synthesized design was synthesized correctly,
> > do a gate-level simulation of the synthesized design.
> > You should be able to run the same testbench code you used for an RTL
> > simulation.
> > Also, before firing up your FPGA, it is probably a good idea to do a
> > post P&R simulation of your P&Red design.
> > I always do a post P&R simulation before firing up an FPGA board I got.
> > When I made sure the design worked fine in a post P&R simulation, my
> > design always worked fine in a real system.
> >
> >
> >
> > Kevin Brace (In general, don't respond to me directly, and respond
> > within the newsgroup.)
> >
> >
> >
> >
> > Reala wrote:
> > >
> > > Hi,
> > >
> > > I have a question is that : if I design a circuit by verilog. Then, I
> > > synthesis this and implement by FPGA.
> > > Assuming the design is work in FPGA, then, I want to make it a custom
> > > IC.
> > > Can I know the netlist in gate level of the design after synthesis?
> > > Otherwise, how can i translate the design from FPGA to IC?
> > >
> > > Thanks a lot. ^_^
> > > Reala
>
>

Article: 45435
Subject: Re: Xilinx ISE 4.2i Is A Step Backwards! Beware!!!
From: hamish@cloud.net.au
Date: 23 Jul 2002 14:32:29 GMT

Ken Schmidt <kschmidt@peerless.com> wrote:
> ISE 4.2 seems like a big step backwards. What happened? Are others
> having troubles with 4.2?

My own experience is that it's a fairly painless upgrade from 4.1, for
Virtex-II designs. I don't remember much real change in the tools; I 
just track them for the latest bug fixes, plus access to the latest 
speed files.

For Virtex-E it seems to be a significant upgrade from 4.1.

It sounds like most of your problems are with XST - I don't use it, so
my experience is limited to the routing tools only. I don't use Verilog
either. I suppose XST behaves pretty similarly for Virtex-E and Virtex-II.

> I am very disappointed in Xilinx and in ISE 4.2. Are we doing
> something wrong, or is a 4.2 a step backwards???

Use Synplify?

Hamish
-- 
Hamish Moffatt VK3SB <hamish@debian.org> <hamish@cloud.net.au>

Article: 45436
Subject: Re: Clock-gating in Virtex-E parts
From: "John Adair" <newsanswer@removethisenterpoint.co.uk>
Date: Tue, 23 Jul 2002 16:30:09 +0100

You can do a gated clock, but only with extreme care. You will need to
instantiate the global clock buffer and feed it with the logic-gated clock
signal. Depending on your application, care will also be needed when you
change over the clock. You should also add constraints on the path through
the gating logic, or your timing could be anywhere. Check also that clock
period constraints are properly applied and not being ignored; you may have
to add period constraints to the net after the clock buffer. And of course
consider the gated clock as a different clock to the source, and apply
clock boundary crossing techniques or ensure that you meet setup and hold
where you cross.

John Adair
Enterpoint Ltd.

--
The views expressed in this message are those of the writer and not
necessarily those of Enterpoint Ltd. The use of information in this message
is without warranty and persons using the information are advised to make
their own checks as to its validity. No responsibility will be accepted for
any incorrect, inaccurate or misleading information supplied.


"Jason Crawford" <jace@cisco.com> wrote in message
news:3D3BAC69.A03017A@cisco.com...
> Hi,
>
> Apart from using clock-enables, does anyone know of any
> way to use clock-gating in Virtex-E parts?
>
> We have a design that is partially written for an ASIC
> target and expects to see a gated clock. Rather than have
> to get the designers to pore through the code and add
> clock enables to all flip flops (I can hear teeth gnashing
> already) I am hoping against hope that someone has an
> alternate answer to this rather difficult problem.
>
> yours in hope,
> Jason.

Article: 45437
Subject: Re: xilinx v ti
From: John_H <johnhandwork@mail.com>
Date: Tue, 23 Jul 2002 15:45:08 GMT

In order that I, personally, don't "spoil the archives," what is it about the post
that was bad?

The original posting had expired from my newsgroup so to see the original text
would take a journey to groups.google.com.  Are a few kbytes really that
important?

Also, it's pretty standard in the newsgroups to include text that obviously marks
the email with bogus text that can be removed by anyone reading the address.  This
prevents the massive amounts of spam to include porn, viagra, and get rich quick
schemes.  I wish now that I'd done the same thing when I first started posting;
I'm thinking it's useless to change now.

I appreciate the forum and the people who participate.  I like to see everyone get
along reasonably.  The professional exchange benefits many people to include
myself.

- John_H

Uwe Bonnes wrote:

> Kevin,
>
> please cut down what you quote to avoid spoiling the archives.
>
> Also, faked return addresses are deprecated.
>
> Bye

Article: 45438
Subject: Re: xilinx v ti
From: Jerry Avins <jya@ieee.org>
Date: Tue, 23 Jul 2002 12:35:06 -0400

Uwe Bonnes wrote:
> 
>   ... faked return addresses are deprecated.
> 
I would describe kevin-neilson@removethistextattbi.com as munged, rather
than faked. It's easy to see what the real address is.

Jerry
-- 
      "The rights of the best of men are secured only as the
       rights of the vilest and most abhorrent are protected."
    -Chief Justice Charles Evans Hughes, around when I was born.
 
"They that give up essential liberty to obtain a little temporary
safety deserve neither liberty nor safety."  -Benjamin Franklin, 1759
¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯

Article: 45439
Subject: Re: spiral / waterfall /watersluice : Which are your methods?
From: Ray Andraka <ray@andraka.com>
Date: Tue, 23 Jul 2002 16:53:11 GMT

Hmm, I think a translation is in order:

spiral = circling the drain
waterfall = falling off the cliff
watersluice = down the tubes

Pick your poison   :-)



--
--Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930     Fax 401/884-7950
email ray@andraka.com
http://www.andraka.com

 "They that give up essential liberty to obtain a little
  temporary safety deserve neither liberty nor safety."
                                          -Benjamin Franklin, 1759

Article: 45440
Subject: RLOC Origin problems in ISE4.2sp3?
From: Ray Andraka <ray@andraka.com>
Date: Tue, 23 Jul 2002 17:03:42 GMT

Fellow experts:

I'm hoping someone has fallen on this one and found a workable solution.
So far the Xilinx hotline has been unusually unresponsive in handling this
case (they've had it since 7/12 and haven't even acknowledged whether or
not they see the problem with the test case I sent).  Seems the hotline is
not as responsive and helpful as it once was, which is a shame.  Anyway,
this is the case:

RLOC_ORIGIN is being ignored by the placer.  The macro shows up in the
correct position in the floorplanner but with the G and Y elements missing
(that is a separate issue which is still an open case).  The RLOC origin is
being ignored by the place and route, so the macro is not landing in the
specified position.  It is critical I be able to specify the RLOC origins
in this design (2V6000, 200 MHz, high utilization).  Is this a known
problem?  (I don't see it in the answers database.)  I tried adjusting the
RLOC origin by single slice steps as suggested in answer record 12192, to
no avail.

The RPMs are not created under FPGA editor; rather, they are RLOC'd in the
VHDL source.  That gets me relative locations for each BEL in the design,
which in turn give me a macro that I should be able to place or let the
tools place.  I am trying to put an RLOC_ORIGIN on it by adding an
RLOC_ORIGIN attribute to the UCF file.  The syntax is correct, and indeed
the part of the RPM that is not decimated by the floorplanner shows up in
the floorplanner editable window in the correct position (there is a
previously reported bug in the floorplanner that prevents the RPM from
showing correctly unless you first go through auto PAR, then constrain from
placement, then unbind and bind the RPM).  However, when the design is run
through the PAR, the macro is placed at some location other than that
indicated by the RLOC_ORIGIN.

Thanks in advance for any info you may have seen on this problem.



--
--Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930     Fax 401/884-7950
email ray@andraka.com
http://www.andraka.com

 "They that give up essential liberty to obtain a little
  temporary safety deserve neither liberty nor safety."
                                          -Benjamin Franklin, 1759

Article: 45441
Subject: Re: spiral / waterfall /watersluice : Which are your methods?
From: "Domagoj" <domagoj@engineer.com>
Date: Tue, 23 Jul 2002 21:18:12 +0200

Hi Don,

"Abernathey Family" <family2@aracnet.com> wrote in message
news:3D3D4B88.E7621A86@aracnet.com...
> I think you're right. But we will see a pick & choose approach to
> software methods. Why? Moore's Law drives processor and, to a lesser
> extent, ASIC design. What drives software design? Gates' Law? My point is that
> chip design is already more complex than software design with no
> slowdown in sight. Chip design methodology must advance faster than
> software methodology.
> -Don

Perhaps we are flattering ourselves :) . I agree that the whole ASIC design
process is composed of many complex different activities (simulation,
synthesis, floorplanning...). Software design is composed of activities that
are actually very similar to each other and that's why it's simpler to
design software.

But complexity of the software itself is enormous today. I believe that
trends seen in the software arena, like object cooperation, AOP and
dependability are far ahead of any programming related to digital design.

Also, formal verification of hardware is much easier than of software
because hardware is simpler in its nature.

--
        Domagoj Babic
domagoj (et) engineer.com

Article: 45442
Subject: Re: spiral / waterfall /watersluice : Which are your methods?
From: vhdlcohen@aol.com (ben cohen)
Date: 23 Jul 2002 15:18:34 -0700

I've seen a lot of replies on software use of these methods, and how sw is
different from hw.  But these are not the real issues.  In hardware, as in
software, you still have requirements, architectural plans, design,
verification plans, and validation.  What we are talking about is
METHODOLOGY!

In spiral, you don't have a hard spec, but start to build right away, and
then keep on correcting until you get something that works.  In waterfall,
you MUST define all your requirements/architectural plans BEFORE you build.
It's a bit like the process involved in getting a house built.  First you
define plans/architecture (on paper), then you have them approved (by the
city in this case), then the builder builds the house according to plans.
Inspectors verify that the house conforms to design and standards.  If you
were to build a house using the spiral method, you have an idea, and start
pouring the concrete.  If you don't like what you see, you just tear out
everything, or some of the cement, and make changes.  You keep up this
process until the house is built.  Then you call the inspectors (maybe).

Bottom line, spiral/waterfall deals with PROCESSES, but not with sw/hw
differences or adaptation to a methodology defined for sw.  When you think
about it, the methodology applies to many other disciplines (i.e.,
construction, relationships, traveling, jobs, professions, etc.)

For the record, on hw designs I had more experience using the
waterfall approach, and that process is documented in the book
"Component Design by Example"
Ben 
----------------------------------------------------------------------------
Ben Cohen     Publisher, Trainer, Consultant    (310) 721-4830  
http://www.vhdlcohen.com/                 vhdlcohen@aol.com  
Author of following textbooks: 
* Real Chip Design and Verification Using Verilog and VHDL, 2002 isbn
0-9705394-2-8
* Component Design by Example,  2001 isbn  0-9705394-0-1
* VHDL Coding Styles and Methodologies, 2nd Edition, 1999 isbn
0-7923-8474-1
* VHDL Answers to Frequently Asked Questions, 2nd Edition, isbn
0-7923-8115
------------------------------------------------------------------------------

Article: 45443
Subject: Re: Do you know a parallel algorithym for 2D convolution
From: John Williams <j2.williams@qut.edu.au>
Date: Wed, 24 Jul 2002 08:26:51 +1000

Hi Antonio,

Antonio Martínez Álvarez wrote:

> I'm using Handel-C and VHDL to make a 2D filter. (DOG (Difference of
> Gaussians) filter indeed).
> I'm more interested in Handel-C. (DK-1.1)

I haven't used Handel-C, but I don't think it will ever match the speed
from well-designed hand-written VHDL, particularly for DSP applications.

> I'm doing it sequentially. For every pixel I read the pixels which are
> below the mask and multiply...
> 
> Well. I'm using a RAM that I've defined for a Virtex-E. The filter
> works... but it's very slow. (12 frames per second).

The bottleneck in the design you describe is probably the RAM access to
get the pixel values.   It sounds like you are fetching NxN pixel values
for each output pixel, and this is slowing you down.

Depending on your available on-chip RAM/LUT resources, a faster way to
do it is shift the image pixels out of RAM into a local line buffer. 
The SRL16 in the Virtex architecture is great for that, but if you've
got a very wide image it will eat a lot of floor space.

For a 3x3 mask and a 256x256 image, you need a shift register that is
(256*2+3) = 515 pixels long, tapped at positions
0,1,2,256,257,258,512,513,514.  Each new pixel is shifted in at position
0, and the rest of the pixels shifted along.  
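
As a rough software model of the line buffer just described (a sketch only, in Python rather than VHDL or Handel-C; the names tap_positions and convolve3x3_stream and the kernel layout are assumptions made for illustration), each new pixel enters at position 0 of a (2*width + 3)-long shift register, and the nine taps are read once the window is fully inside the image:

```python
from collections import deque


def tap_positions(width):
    """Shift-register positions that form a 3x3 window for a given line width."""
    return [r * width + c for r in range(3) for c in range(3)]


def convolve3x3_stream(pixels, width, kernel):
    """Stream pixels in raster order through a (2*width + 3)-long shift
    register and return one sum of products per fully valid 3x3 window.

    kernel[r][c] weights window row r, column c (row 0 is the oldest line).
    """
    assert width >= 3
    length = 2 * width + 3
    taps = tap_positions(width)
    sr = deque([0] * length, maxlen=length)
    out = []
    for n, p in enumerate(pixels):
        sr.appendleft(p)  # new pixel enters at position 0, the rest shift along
        row, col = divmod(n, width)
        if row >= 2 and col >= 2:  # window no longer straddles an image edge
            acc = 0
            for t in taps:
                r_back, c_back = divmod(t, width)  # rows/columns behind the newest pixel
                acc += kernel[2 - r_back][2 - c_back] * sr[t]
            out.append(acc)
    return out
```

For width = 256, tap_positions returns 0, 1, 2, 256, 257, 258, 512, 513, 514, matching the positions above. In hardware the nine taps would be read in the same clock; the inner loop here stands in for the sequential multiplier case.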

If your line buffer can support it, you may wish to take the taps in
parallel, to a set of parallel multipliers.  Doing it this way you could
also pipeline each multiplier, increasing the delay slightly but giving
you blistering throughput.  

Using the on-chip line buffer concept, but with a sequential
non-pipelined multiplier, I achieved about 45 frames/sec in a Virtex
(speed grade 4) for a 256x256 8-bit image, 3x3 convolution, without any
real optimisation or fine-tuning.  By doing the multiplies in parallel
and pipelining them, I think the frame rate could ultimately be limited
by the RAM access time for a single pixel.

Hope this helps,

Regards,

John
-- 
Dr John Williams,    Postdoctoral Research Fellow
High Performance Computing Group, CRC for Satellite Systems
Queensland University of Technology,   Brisbane,  Australia
Phone : (+61 7) 3864 2427           Fax : (+61 7) 3864 1517
Web   : http://www.crcss.bee.qut.edu.au/comp.html

Article: 45444
Subject: Xilinx DCMs, RST, and phase coherence
From: "Doug Wilson" <doug_wilson@3mtsNOSPAM.com>
Date: Tue, 23 Jul 2002 17:02:21 -0700

If I have two separate boards in a system, each with a VirtexII 
DCM used in frequency synthesis mode, both generating the same
frequency, is there anything I can do with the DCM rst line 
(or anything else) to ensure that they come up in phase or is 
it impossible to guarantee the phase?

Thanks,
Doug

Article: 45445
Subject: Re: delay pipes in verilog for spartan IIe?
From: "Kevin Neilson" <kevin-neilson@removethistextattbi.com>
Date: Wed, 24 Jul 2002 00:48:24 GMT
Links: << >> << T >> << A >>

John,
You definitely don't want to use flops for this.  (I don't think there are
even enough; you need 24500 flops and that part only has about 6000 I
think.)  You could use the SRLs, which are 16-bit shift registers.  You
would need about 1530 of these, and the part has about 6000 of these.
Better yet would be to use the dual-port blockRAMs.  You could use 24 of
these in a 16wide x 256deep configuration.  Then you just set the write and
read addresses 16 apart to get a 16-deep pipe.  If your clock is slow
enough, you split up the RAM and use only six RAMs.  This part has about 32
blockRAMs I think.
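
The resource counts above can be checked with a little arithmetic, and the address-offset idea can be modelled in a few lines. This is only a behavioural sketch in Python, not synthesizable code. It assumes the 384-wide, 64-stage pipe from the original question and a RAM that is 256 words deep, and the class name RamDelayLine is made up for illustration:

```python
import math

WIDTH_BITS = 384  # pipe width from the original question
DEPTH = 64        # pipe length in stages from the original question

flops_needed = WIDTH_BITS * DEPTH                   # one flip-flop per bit per stage
srl16s_needed = WIDTH_BITS * math.ceil(DEPTH / 16)  # each SRL16 delays one bit by up to 16
rams_16_wide = math.ceil(WIDTH_BITS / 16)           # each RAM used as 16 wide x 256 deep


class RamDelayLine:
    """Dual-port RAM model: the write address leads the read address by `delay`."""

    def __init__(self, delay, ram_words=256):
        assert 1 <= delay <= ram_words
        self.mem = [0] * ram_words
        self.rd = 0
        self.wr = delay % ram_words

    def clock(self, din):
        dout = self.mem[self.rd]  # read port: word written `delay` clocks ago
        self.mem[self.wr] = din   # write port: store the new word
        self.rd = (self.rd + 1) % len(self.mem)
        self.wr = (self.wr + 1) % len(self.mem)
        return dout
```

The counts come out to 24576 flops, 1536 SRL16s and 24 RAMs, in line with the rounded figures above. Under these assumptions, a 64-stage pipe corresponds to an address offset of 64.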
-Kevin

"John Hovell" <jhovell@yahoo.com> wrote in message
news:9402973.0207231559.2030b94a@posting.google.com...
> Hello all --
>
> I am trying to implement a delay pipe that is 384 bits wide and 64
> bits long in Verilog.
>
> I was trying to build one out of fairly simple D-flops, but my design
> has been "synthesizing" in Xilinx Web Pack for nearly 2 hours now, so
> I think I must have done something wrong.
>
> Is there an efficient or correct way to implement such a pipe on a
> 300K gate Spartan IIe?  I think the size should be OK since a Spartan
> IIe 300K gate could theoretically make a 98kbit distributed memory...
> but maybe I am missing something here.
>
> TiA for any help, pointers, etc.
>
> Cheers,
> John

Article: 45446
Subject: Re: delay pipes in verilog for spartan IIe?
From: John_H <johnhandwork@mail.com>
Date: Wed, 24 Jul 2002 01:01:49 GMT

I like the BlockRAM idea.  But - to get 32 bit mode - use the same read and
write address (cycle through n addresses) and use port A as the upper 16 bits
and port B as the lower 16 bits.  The output data is the registered version of
the memory at that address so there aren't any timing problems.  The memory can
look like a 32 wide by 128 deep RAM in this mode.  Only 12 memories are needed,
which would fit in a Spartan-IIE 150!  Lots of extra logic left to develop a
"pong" game as well.

I tried compiling an array of 384x64 regs in Synplify and I quit after 8
minutes of compiling.  Synplify usually goes much faster!  The SRL16s with
arrays of instantiations would probably compile much faster in Synplify, but I
don't know if Webpack supports "arrays of instances", which - I believe - were
already in Verilog 1995, just not supported by many tools.

The BlockRAM approach would work out very nicely.
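
To make the addressing idea concrete, here is a minimal software model of a
RAM-based delay pipe in Python.  It is only a behavioural sketch, not HDL:
the class name and the step() method are invented for illustration, and it
assumes that a read at a given address returns the old contents before the
new word is written there.  Whether a particular blockRAM port behaves that
way should be checked against the device data sheet.

```python
class RamDelayLine:
    """Software model of a delay pipe that cycles through n RAM addresses.

    Assumption: each call to step() is one clock cycle, and the read at the
    shared address sees the old word before the new one overwrites it.
    """

    def __init__(self, delay, fill=0):
        if delay < 1:
            raise ValueError("delay must be at least 1")
        self.mem = [fill] * delay   # only `delay` locations are ever used
        self.addr = 0               # shared read/write address counter

    def step(self, word):
        out = self.mem[self.addr]          # read the word stored `delay` steps ago
        self.mem[self.addr] = word         # overwrite it with the new word
        self.addr = (self.addr + 1) % len(self.mem)
        return out


pipe = RamDelayLine(delay=4)
outs = [pipe.step(x) for x in range(1, 9)]
print(outs)  # [0, 0, 0, 0, 1, 2, 3, 4]
```

For the pipe in the original question, delay would be 64 and each word 384
bits, split across the RAMs in 16-bit or 32-bit slices.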

Kevin Neilson wrote:

> John,
> You definitely don't want to use flops for this.  (I don't think there are
> even enough; you need 24500 flops and that part only has about 6000 I
> think.)  You could use the SRLs, which are 16-bit shift registers.  You
> would need about 1530 of these, and the part has about 6000 of these.
> Better yet would be to use the dual-port blockRAMs.  You could use 24 of
> these in a 16wide x 256deep configuration.  Then you just set the write and
> read addresses 16 apart to get a 16-deep pipe.  If your clock is slow
> enough, you split up the RAM and use only six RAMs.  This part has about 32
> blockRAMs I think.
> -Kevin
>
> "John Hovell" <jhovell@yahoo.com> wrote in message
> news:9402973.0207231559.2030b94a@posting.google.com...
> > Hello all --
> >
> > I am trying to implement a delay pipe that is 384 bits wide and 64
> > bits long in Verilog.
> >
> > I was trying to build one out of fairly simple D-flops, but my design
> > has been "synthesizing" in Xilinx Web Pack for nearly 2 hours now, so
> > I think I must have done something wrong.
> >
> > Is there an efficient or correct way to implement such a pipe on a
> > 300K gate Spartan IIe?  I think the size should be OK since a Spartan
> > IIe 300K gate could theoretically make a 98kbit distributed memory...
> > but maybe I am missing something here.
> >
> > TiA for any help, pointers, etc.
> >
> > Cheers,
> > John

Article: 45447
Subject: How to implement efficient wide word comparator?
From: e-engineer@eastday.com (Sniper Daryl)
Date: 23 Jul 2002 21:14:27 -0700

Hi all,

   I am Daryl, and I have a question to trouble you with. :-)

   I am designing a chip for optical networking, and a lot of effort has
to go into increasing the clock speed and reducing the resource cost.
In a timing interface module there is a 14-bit counter that provides
timing for the outgoing frame, and a comparator that compares the
counter value with a series of registers set by the controller.
I've noticed that the slice count increases sharply and the maximum
clock speed drops a lot as the counter and the comparator get
wider.

   To track this down, I first tried a wider counter (14-bit) with a
narrower comparator (4-bit) and got a 20MHz speed improvement and more
than 20 slices saved. Then I tried a 4-bit counter with a 14-bit
comparator, which gave a 10MHz improvement and about 10 slices saved.
So I think the critical factor is the wide comparator. This is supported
by the reports and schematics from the synthesis tools (FCII 3.6.1 and
Synplify Pro with Amplify).

   To improve the performance, I tried using the CoreGen tool to
generate a comparator core. But after implementation, the result is no
better than with my own code.

   The synthesis tool I used is FCII 3.6.1, the device is a
Virtex-II 1000, and implementation was done with ISE 4.2 SP3. Here are
the results of my trials:

   Configuration (plus other logic)      Slices   FFs   LUTs   Max clock
   14-bit counter, 14-bit comparator       63      36    105     95MHz
   4-bit counter,  14-bit comparator       50      26     85    115MHz
   14-bit counter,  4-bit comparator       41      26     62    127MHz
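
   Taking the three reported results at face value, the savings relative to
the 14-bit/14-bit case can be worked out with a short Python sketch.  The
dictionary keys and the deltas() helper are just illustrative names; the
numbers are copied from the trials listed above.

```python
# (slices, max clock in MHz) for each trial, as reported above
results = {
    "14-bit counter, 14-bit comparator": (63, 95),
    "4-bit counter, 14-bit comparator": (50, 115),
    "14-bit counter, 4-bit comparator": (41, 127),
}


def deltas(results, baseline):
    """Return (slices saved, MHz gained) for each trial versus the baseline."""
    base_slices, base_fmax = results[baseline]
    return {
        name: (base_slices - slices, fmax - base_fmax)
        for name, (slices, fmax) in results.items()
        if name != baseline
    }


baseline = "14-bit counter, 14-bit comparator"
for name, (saved, gained) in deltas(results, baseline).items():
    print(f"{name}: {saved} slices saved, {gained} MHz faster")
# 4-bit counter, 14-bit comparator: 13 slices saved, 20 MHz faster
# 14-bit counter, 4-bit comparator: 22 slices saved, 32 MHz faster
```

   On these figures, narrowing the comparator saves more area and gains
more speed than narrowing the counter.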

   Could you give me some advice from your experience, or point me to
some resources to study?

Thanks in advance for your time!

Daryl

Article: 45448
Subject: Re: How's the FPGA design job market near you??
From: B__ S_______ <B___S_____@nonoonnoo.com>
Date: Wed, 24 Jul 2002 04:33:29 GMT

> Although I have a fair amount of experience with FPGA, there is little
> chance for us to start up on our own. Unlike the US, companies here are
> not out-sourcing. They would rather buy tools and hire people to do
> in-house development.
My impression working in Southern California (so I have a small view of
the EE world!) is that US companies prefer to keep project development
in-house.  If an outside vendor already has a finished, working IP block,
the company will consider buying it, in essence trading money for time
(the idea being they can 'buy' the IP block and just drop it into their
current project).

Otherwise, the company will have to invest money AND time (since the
contractor's development cycle would be non 0-day).  In that case,
for a little additional expenditure, the company can keep the
development expertise in-house, so why pay someone else to learn
your project?

I know there are some success stories when it comes to design
services companies.  But given the size of the fabless semiconductor
industry, wouldn't you expect there to be *more* design services
consulting?

Article: 45449
Subject: 32-bit PCI Target core
From: "Jeff Reeve" <teamreeve@attbi.com>
Date: Wed, 24 Jul 2002 04:47:54 GMT

I'm looking for a synthesizable 32-bit 33MHz PCI target-only design to be
placed into an FPGA or large CPLD. A minimal implementation is fine. Does
anybody know if such a thing is available in VHDL or Verilog and is open
source? I seem to recall Xilinx publishing a target-only design quite some
time ago, but I can no longer find it on their web site.

Any help is much appreciated!
Jeff
