That is part of the 'careful design' I was referring to. There is also an issue of route congestion, which the current autorouters do not handle well in high-performance designs. Hopefully, the selects can be registered to cut down on the timing problems there.

At that high data rate, one normally looks for ways to increase the permissible pipelining in an FPGA. There really are not all that many scenarios where pipelining can't be applied. Most of the time, these occur when there is a tight feedback loop, which in the case of a multiplexer there probably should not be. Also, it occurs to me that he may be looking for a combinatorial multiplexer with the inputs and outputs going off-chip, in which case the FPGA is not anywhere close to fast enough.

Uwe Bonnes wrote:
> : In comp.lang.vhdl Ray Andraka <ray@andraka.com> wrote:
> : With careful design and floorplanning, in a -6 part it should be, but
> : you also need to consider the timing in and out of it. Don't count on
> : the automatic place and route to get a minimal delay solution: you may
> : find you need to do some hand routing on this. I suggest you try it
> : out with a one or two bit version first and see if you can make your
> : requirements.
>
> What about the select signals? With a large multiplexer for a wide bus,
> the load for these signals becomes large. Buffering these signals in a
> sensible way is probably needed. Does the user have to take care of this
> in his code, or does place and route?
>
> Bye
>
> --
> Uwe Bonnes bon@elektron.ikp.physik.tu-darmstadt.de
> Institut fuer Kernphysik Schlossgartenstrasse 9 64289 Darmstadt
> --------- Tel. 06151 162516 -------- Fax. 06151 164321 ----------

--
--Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930 Fax 401/884-7950
email ray@andraka.com
http://www.andraka.com

"They that give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -Benjamin Franklin, 1759

Article: 47751
Hi!

I wrote to Xilinx Support, and they gave me the following example; possibly it might help in the future:

##start
Below you will find the format of the input and output and how this relates to the square root. The inputs and outputs to the CORDIC core are Q1 format binary numbers. Some examples of the Q1 number format are given below.

For an 8 bit input the format of the input is
0.0001000 = (SQRT 1/16)
and the format of the output is
0.0100000 = (1/4)

For a 20 bit input the format of the input is
0.0001000000000000000 = (SQRT 1/16)
and the format of the output is
0.0100000000000000000 = (1/4)

The input/output of the SQRT can be interpreted differently to "change" the input range. If the input data is left shifted by 2*N bits, the output data is left shifted by N bits.

An 8 bit input to the Square Root:
0000100.0 = (SQRT 4)
0010.0000 = (2)

A 20 bit input to the Square Root:
0000000000000000010.0 = SQRT(2)
0000000001.0110101000 = 1.4140625

A 20 bit input to the Square Root:
0000000000000000100.0 = SQRT(4)
0000000010.0000000000 = 2

Here is a quick summary of how to instantiate an integer-in, integer-out CORDIC Square Root:

X_IN  : STD_LOGIC_VECTOR(20-1 DOWNTO 0);
X_OUT : STD_LOGIC_VECTOR(20-1 DOWNTO 0);

Instantiate a 21 bit square root:

X_IN_SQRT  : STD_LOGIC_VECTOR(21-1 DOWNTO 0);
X_OUT_SQRT : STD_LOGIC_VECTOR(21-1 DOWNTO 0);

Make the following assignments:

Inputs:
X_IN_SQRT <= X_IN & '0';

Outputs:
X_OUT(20-1 DOWNTO 11) <= (OTHERS=>'0');
X_OUT(10 DOWNTO 0) <= X_OUT_SQRT(21-1 DOWNTO 10);
##end

Thomas Wambera

Article: 47752
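To make the Q1 number-format examples above concrete, here is a small Python behavioral model. This is an illustrative sketch (the function name and structure are mine, not Xilinx's), and it models only the fixed-point interpretation, not the CORDIC iteration itself:

```python
# Behavioral model of the Q1 square-root interpretation described above.
# A Q1 number of a given width has 1 integer/sign bit and (width-1)
# fraction bits; the core maps a Q1 input to a Q1 output through sqrt().
import math

def q1_sqrt(x_in: int, width: int) -> int:
    """Treat x_in as a Q1 fraction and return its square root in Q1."""
    scale = 1 << (width - 1)          # weight of the LSB is 1/scale
    return round(math.sqrt(x_in / scale) * scale)
```

Because shifting the input left by 2*N bits shifts the output left by N bits, the same raw bit patterns also cover the integer examples: the 8-bit pattern 0b00001000 (SQRT 1/16) yields 0b00100000 (1/4), and reinterpreting the binary points gives the SQRT(4) = 2 case.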
The Tcl interpreter uses a bytecode compiler internally to improve performance. That bytecode is not suitable for hardware implementation, however. We designed a special bytecode for this purpose, and yes, the FPGA is like a virtual machine for this bytecode.

--Scott

"Phil Tomson" <ptkwt@shell1.aracnet.com> wrote in message news:anf7en0ct1@enews3.newsguy.com...
> In article <uplqnffa9droca@corp.supernews.com>,
> Scott Thibault <thibault@gmvhdl.com> wrote:
> >AcroDesign Technologies has announced results from its work on an
> >embedded processor for the Tcl language. More information, and a recent
> >presentation, is available at:
> >http://www.gmvhdl.com/acrodesign/research.html#tob
> >
> >--Scott Thibault
> >AcroDesign Technologies
>
> Hmmmm.... so this is basically an FPGA implementation of the Tcl virtual
> machine? (Actually, I wasn't aware that Tcl had a bytecode interpreter,
> but I guess it isn't surprising.)
>
> Phil

Article: 47753
"Nicholas C. Weaver" <nweaver@ribbit.CS.Berkeley.EDU> wrote in message news:anf99u$2h35$1@agate.berkeley.edu...
> In article <anf7en0ct1@enews3.newsguy.com>,
> Phil Tomson <ptkwt@shell1.aracnet.com> wrote:
> >Hmmmm.... so this is basically an FPGA implementation of the Tcl virtual
> >machine? (Actually, I wasn't aware that Tcl had a bytecode interpreter,
> >but I guess it isn't surprising.)
>
> Well, TCL is more an ASCII string munging hack, based around textual
> replacement, so you could treat the program as a (really ugly) bytecode.

Tcl was originally implemented as a string processing engine, but it can be described in more traditional terms just like other languages (i.e., a BNF grammar, compilers, etc.).

> Why you would WANT to, however, is beyond me. Compile Scheme or
> something to a nice vanilla uP core and have hardware support for
> garbage collection.

Tcl is not so different from Scheme if you replace []'s with ()'s, but there are a couple of advantages to using Tcl. First, memory is managed with reference counting, which is simple and predictable. Second, Tcl has extensive built-in string processing abilities that are very useful for embedded devices that communicate using command-based protocols over TCP. There are other advantages, such as being easy to learn, pointer safe, etc.

We compiled to a custom processor because compiling very high-level languages to a vanilla processor can generate large executables, and our target was a memory-constrained device.

--Scott

Article: 47754
Hal Murray wrote:
> [snip description of magic-box gizmo]
>
> Sounds like a fun project.
>
> I'd suggest that you look at your current collection of circuits your
> users are using and try implementing them with various proposed IO
> connections/mappings.

I have to get the thing installed ASAP, because a few people are clamoring to fix crosstalk and to get more n-input gates. So I want to avoid trying to route all the things I might want to try, especially considering that some of the other functions involve one-shots. Some of those might work out OK in the CPLD with digital delays, but they will require some thought to convert from the mixed-signal to the strictly digital domain.

Thus, my plan has been to use a CPLD that is far in excess of the typical logical complexity that is needed, and as long as the thing lets me route in a mostly unconstrained manner, I anticipate it being very rare that anything would be needed that exceeds the capability of the CPLD. Note the context here: most of the circuits currently implemented are just n-input AND and OR gates, with a few cases of combined boolean product and sum functions. Thus, this thing is highly overkill.

But the point is that it is 2002, so what should I put in there? To put in a CPLD vs. discrete logic packages doesn't change the cost equation at all, since for something like this the labor far exceeds the hardware cost. Thus, the best thing to do is to implement the most flexible thing that is within reason. An FPGA is overkill, but a mid-range CPLD seems just right. The CPLD seems better than an FPGA anyway, because this is a combinatorial-heavy, as opposed to register-heavy, application. If you can call a handful of ANDs and ORs heavy at all!

> How are you going to program the CPLD? Perhaps another small
> connector on your daughter card so you can plug in a PC/laptop.

Indeed, I will put a female DB9 on the front panel, to make it easy to plug in a programmer at any time and change the config.
> Are you expecting to do a lot of programming on the fly, or mostly
> use a handful of normal/popular boxes? How are you going to test
> things?

Infrequent reprogramming. The boxes will be delivered with a "default" configuration of a basic assortment of AND, OR, NAND, NOR, looking similar to the existing magic boxen. Later I can modify the programming to suit the specializations that are in place in the various labs, as well as start implementing some of the "more complicated" functions that I would otherwise have implemented in a custom piece of hard-wired hardware.

Some of those special functions will ultimately become dedicated custom hardware pieces anyway in the future. But in the short term, if I can do it on the nice-looking CPLD panel, it will be much better than what I do now, which is to install a breadboard with a debugged prototype circuit, sitting on a table where people can bump into it and accidentally yank the wires out. It will typically sit there for about a year before I have a chance to come back and convert it into a PCB in a permanent chassis. Thus, I hope to avoid the crudeness factor of the temporary prototype hardware that I have sitting around in various labs waiting to be made permanent.

Of course, the CPLD doesn't help me much for analog problems, but usually I have very similar overall circuits: a few analog inputs get conditioned, fed to comparators, and logic-ed with some other digital signals, then spit out some digital results. So perhaps the next step will be to build some general purpose analog building blocks.

> Are you going to program them all or will you set up a system so your
> users can program their own boxes?

Both. They will have a front panel JTAG port as I mentioned. I may construct a little computer cart to wheel around and program the things when needed. Or I may just install the Xilinx software on one of the small army of PCs that typically populate the labs, and do the programming from there.
Whether the users will learn to program them, I am not sure. Unlikely, I suspect. That's why they hired me, and they are all mechanical engineers anyway, though exceptionally bright ones.

> Are you going to leave enough room on the front to attach a drawing
> of the circuit diagram? Maybe the drawing should slip over the BNC
> connectors so it's really obvious which gate is connected to which
> connector.

Ah, this is a good question. I am leaning toward a "Battleship" appearance right now, with silk-screened row and column labels on the panel. When the programming is done, the schematic can be printed, and that will serve as a map. Though it would be nice to have something more specific on the panel. I had envisioned slipping a chart over the BNCs as you mentioned. Not sure about this.

> Are the in/out LEDs really necessary if you have a good drawing?

The LEDs aren't absolutely necessary, but they are not that much trouble to include. They will benefit those situations where one might wonder if a signal is present or not, without having to connect a scope. Most of our signals are slow. We are controlling big diesel engines. But there are some time-critical laser and camera sync pulses going around. Not high frequency, just relative timing.

> How many are you going to make? Would it be simpler/cheaper to dump
> the LEDs, switches, and unused IO gear and always put the output
> connectors on the top (or someplace distinctive)?

I had first considered making the allocation of inputs and outputs fixed. But my consideration of the cost of putting in the added flexibility led to the conclusion that it was worth it to have the selectable IOs, and the LEDs. I will make between 3 and perhaps 10 of these things. The added hardware might increase the total assembly time of each one by 10-20%. This is reasonable.

> If you have more IO pins than connectors, can you parallel several of
> them and get rid of your 74ACTQ14 output buffer chips? (Might be
> ugly to program.)
I like having the buffer chips in between the CPLD and the user. That way it is likely that if they break something, it will only be an individual channel. Also, I want to be sure that if they connect, say, 16 of the outputs to actual 50 ohm loads, which my design is capable of handling, it won't break. Thus, I paid careful attention to things like the maximum allowable DC current per IC package, etc.

If the CPLD drives the outputs directly, I suspect it won't be happy with having many 100 ohm loads attached (the 50 ohm cable terminations, in series with the 50 ohm back terminations). As it stands, I can drive unterminated as well as terminated 50 ohm cables, as many as 32 terminated cables if I really want to, for a total output of 1.6 A without any risk to the CPLD. And there will be little ringing of edges with or without terminations, due to the back terminations.

The nice thing about ACTQ or similar AC drivers is that by paralleling several of them, you lower the non-linearity of the output impedance, so that you can concentrate most of the output impedance in the purely resistive series resistor. And the ACTQs can handle a direct short circuit on the output, with the series resistor, indefinitely. Plus there is enough current capacity left over to drive the LED associated with each BNC.

Oh, one other important thing! The CPLD runs at 3.3V, so I need level shifting anyway.

Thanks for your interest.

Good day!

--
____________________________________
Christopher R. Carlen
Principal Laser/Optical Technologist
Sandia National Laboratories CA USA
crcarle@sandia.gov

Article: 47755
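The drive-current figures in the post above can be checked with simple Ohm's-law arithmetic (component values taken from the post; this is a back-of-envelope sketch, assuming a full rail-to-rail 5 V swing, not a circuit simulation):

```python
# Sanity check of the output drive numbers: a 5 V driver through a 50 ohm
# back-termination resistor into a 50 ohm terminated cable sources 50 mA,
# so 32 fully terminated outputs total 1.6 A, matching the post.
V_DRIVE = 5.0    # assumed driver swing, volts
R_BACK = 50.0    # series back-termination, ohms
R_LOAD = 50.0    # cable termination, ohms

i_per_output = V_DRIVE / (R_BACK + R_LOAD)          # amps per terminated output
i_total = 32 * i_per_output                         # all 32 outputs terminated
v_at_load = V_DRIVE * R_LOAD / (R_BACK + R_LOAD)    # divider gives 2.5 V at a matched load
```

Note the matched load sees half the swing; at an unterminated far end the reflection doubles the step back up to the full 5 V, which is the usual back-termination trade-off.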
In article <3D9C4B49.9832D9B2@wambera.de>, Thomas Wambera <thomas@wambera.de> writes:
>I wrote to the Xilinx Support, they gave me the following example,
>possibly it might help in the future:
<soln snipped>

Thanks for the feedback. I looked at your problem but could not see a clear solution; your answer is now bookmarked :)

--
fred

Article: 47756
check the xilinx CPLD app note xapp 346 for some starters. http://www.xilinx.com/apps/epld.htm Steve skillwood wrote: > Hi all, > Can some one give me an introduction to low power SoC design . What is > difference from an ordinary design and low power design in the design stage > . Suppose I am designing a fsm based sequential logic , at which stage the > "LOW POWER " Comes in . > > thanks > skillieArticle: 47757
Hi,

I think this is really a case of where to draw the line between SW and HW implementations. There are certain very mundane things that C code running on a general purpose processor is much better at... sure, you can do it in hardware, but you're burning gates very quickly. OTOH, if a SW implementation isn't cutting it in terms of performance, you can convert specific areas of a TCP/IP stack to hardware. Two such examples are any data shoveling present in your stack that copies data as the packet is being assembled - use DMA instead. Another is the TCP checksum, something that will take a few thousand cycles on a normal RISC CPU (for a large IP packet)... this can be taken down to *tens* of cycles by converting a few lines of C code to a few lines of HDL.

I guess what I'm getting at is that C -> all gates in the system doesn't make much sense. Select areas of C that are slow -> gates does make sense.

This is straying from topic, but looking "down" a couple layers, there are also Ethernet MAC cores (such as the opencores ethmac project) which include their own DMA engine that masters memory. The host CPU, or logic, or whatever you're interfacing to assembles a packet and passes its location and a 'go' bit to the MAC core... it does the rest. It takes a bit of effort, but it's not very difficult to wire up such a MAC to a processor core such as Nios (myself and a few other people who have posted to the opencores site have done so).

- Jesse

> FPGA TCP
> On the fpga-cpu list, Anand Gopal Shirahatti asks:
> "... What I was wondering is, are there implementations of the TCP/IP
> stack over a single FPGA, for multiple connections? ..."
> The simplest thing to do is run a software TCP/IP stack on a soft CPU
> core. For example, at ESC I saw TCP/IP running on uCLinux on Altera Nios
> with a CS8900A ethernet MAC.
>
> Note that a compact FPGA CPU core with integral DMA (e.g. xr16) may be
> hybridized into the data shovel aspect of an ethernet MAC.
> (Flexibly shovel the incoming bits to/from buffers, etc.) Indeed, one
> enhanced FPGA CPU might (time multiplexed or otherwise) manage several
> physical links.
>
> You can also build hardware implementations of the TCP/IP protocol
> itself. There are several such implementations in custom VLSI. For FPGA
> approaches, see:
>
> * Smith et al's XCoNet.
>
> * BlueArc SiliconServer white paper.
> "The SiliconServer runs all normal TCP/IP functionality in state
> machine logic with a few exceptions that are currently dealt with by
> software running on the system's attached processor (e.g. ICMP traffic,
> fragmented traffic reassembly)."
>
> And related things: FPX, KSM.

Article: 47758
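The "few lines of C" checksum mentioned in this thread is the Internet checksum of RFC 1071: a one's-complement sum of 16-bit words. Here is a Python sketch of the software loop that a dedicated datapath would replace (the function name is mine):

```python
# Internet checksum (RFC 1071), the per-packet computation discussed above.
# In software this loop touches every byte; in an FPGA the same fold can
# run as wide parallel adds alongside the DMA data movement.
def internet_checksum(data: bytes) -> int:
    if len(data) % 2:
        data += b"\x00"                           # pad odd-length data
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]     # big-endian 16-bit word
        total = (total & 0xFFFF) + (total >> 16)  # end-around carry fold
    return ~total & 0xFFFF
```

A receiver verifies a packet by checksumming the data with the transmitted checksum included; the result is zero when the packet is intact.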
Chris,

If you had asked this question about a year or two ago I would say you should seriously consider using a Spartan (5 volt compatibility). Given the low cost of decent FPGAs, I personally would still use an FPGA. Use a reprogrammable serial PROM to hold the code. Given the capacity of MEs to short or otherwise mis-connect outputs and inputs, I would consider something in a socketable package for the FPGA (or, if you decide differently, the CPLD). The reason I suggest the FPGA is that, from personal experience, the project can quickly grow all out of proportion.

Here is an idea that would perhaps really simplify the job for you... What about using one of the demo boards from one of the distributors to do the job for you? Typically they have at least one clock input with an on-board oscillator (in case you need a digital one-shot or two); total cost would be about $200 at most. One source that has impressed me is Insight Electronics. I have seen their boards and they are pretty good quality. http://www.insight-electronics.com - and click on the Xilinx development kit window. I see they have a kit for the CoolRunner XPLA3 for $125. I don't work for them or anything; it just seems that it is a waste of time to re-invent the wheel.

Good luck,
Theron

"Christopher R. Carlen" <crcarle@sandia.gov> wrote in message news:3D9B8FDE.8000405@sandia.gov...
> Hi folks:
>
> (Please skim down to "Question:" if you don't want to read the details...)
>
> In our engine labs we have a "magic box" which is a chassis with a panel
> covered with BNCs, connected to the inputs and outputs of a variety of
> basic logic gates (AND, OR, etc). This magic box is used to implement
> various glue logic functions for our research engine control,
> experimental apparatus, and data acquisition control schemes.
>
> There are two problems with the existing magic boxes: 1.
> They have terrible crosstalk problems, since they were done with wire
> wrapping and not much thought to the existence of such things as mutual
> inductance and capacitance. 2. The scientists tend to use up a lot of
> one type of gate, leaving the others unused. Then they come to me and
> say "I ran out of AND gates" or "I need a 6-input OR gate, can you
> modify the magic box?" They also come to me periodically asking for me
> to implement various logical gizmos of somewhat greater complexity than
> the magic box can handle, requiring the design of custom hardware.
>
> Rather than get my name associated with the poorly functioning device
> after performing mods, and rather than waste my time futzing with the
> wiring of the thing to add more gates only to have to make another
> hardware change in a few more months, I decided to get modern:
>
> I plan to build a universal magic box with a Xilinx CoolRunner XPLA3
> XCR3128XL-VQ100 CPLD device. This seemed like a good way to start
> learning the ropes with PLDs, which I've been eager to do for some time.
> My box will have nice ESD protection networks on each of an array of
> 32 BNC connectors. Each connector can be changed from a Schmitt trigger
> input buffer to a 50R back-terminated 3x-paralleled 74ACTQ14 output
> buffer, by switching a little DPDT switch (on the inside of the
> chassis). A bi-color LED will go with each connector, and will glow red
> for outputs active, and green for inputs active.
>
> Most of the space on the new magic box PCB, which will fit directly into
> the panel so I don't have to run any wires from the connectors to
> anywhere, is consumed by the IO buffering, switches, and LEDs. So I
> plan to fit the CPLD on a little daughterboard that will plug into the
> main board, kind of like a giant DIP package. The CPLD daughterboard
> will have available 40 IOs, the 4 global clock inputs, the JTAG signals,
> and will have an on-board 3.3V regulator.
>
> Everything is pretty well thought out so far, I think. The only problem
> is, the CPLD has 84 IOs available, of which I plan to use up to 40. 32
> IOs will be connected to the user BNCs for certain, and I will have a
> little header on the main PCB for expansion to another 8 user IOs if
> needed in the future, as well as a header for access to the global
> clocks, which will also be jumper-selectable to connect to four of the
> BNCs, if desired.
>
> Enough bells and whistles? ;-) I hope the CPLD will allow me to
> reconfigure the logic available to the user on the fly, and even to
> implement those "more complicated than just a few gates" functions that
> get asked for now and then, without having to build a new physical
> circuit prototype breadboard and PCB.
>
> Question:
>
> How should I map the user IOs to the CPLD IOs, i.e. function blocks and
> macrocells, so as to result in the greatest likelihood of always being
> able to route whatever functions I want to the pins I choose?
>
> There seem to be two possible approaches: 1. Take a few IOs from each
> function block, so that all function blocks are ultimately represented
> to the outside (example: take 5 random macrocells from each of the 8
> function blocks, for 40 IOs). Or 2. Use all the macrocells of the
> first few function blocks until my 40 IOs are mapped out, then leave the
> rest of the function blocks available for internal-only routing
> (example: take all 16 macrocells from the first 2 function blocks, plus
> 8 macrocells from the 3rd function block, leaving 5.5 function blocks
> not connected to the outside).
>
> Any suggestions?
>
> Is this a weird problem?
>
> Thanks for comments!
>
> Good day.
>
> --
> ____________________________________
> Christopher R. Carlen
> Principal Laser/Optical Technologist
> Sandia National Laboratories CA USA
> crcarle@sandia.gov

Article: 47759
Hi,

I am going to implement a small MAC (multiply-accumulate) unit on an FPGA, but I can't find any detailed information on its architecture. All the architectures in the papers are very complicated. I need an easy, small implementation. Does anyone know any materials describing it?

Thank you very much!

sincerely
-------------
Kuan Zhou
ECSE department

Article: 47760
In article <Pine.SOL.3.96.1021003130537.25536A-100000@vcmr-86.server.rpi.edu>, Kuan Zhou <zhouk@rpi.edu> wrote: >Hi, > I am going to implement a small MAC (multiply-accumulation) unit on >FPGA.But I can't find any detailed information on its architecture.All the >architechtures on the papers are very complicated.I need an easy,small >implementation.Does anyone know any materials describing it? Be more specific: What FPGA? What design criteria? Functionality? Low latency? High performance with pipelining? What class is this homework for? -- Nicholas C. Weaver nweaver@cs.berkeley.eduArticle: 47761
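Since the question asks for the simplest possible architecture, a minimal behavioral model may help: each clock the unit computes acc <= acc + a*b, with guard bits on the accumulator so repeated sums don't overflow the product width. This Python sketch is illustrative only (the class name and default widths are my own, not from any particular paper or FPGA family):

```python
# Minimal multiply-accumulate (MAC) behavioral model: one multiplier
# feeding an accumulator wider than the product, so many products can
# be summed before overflow. Widths here are example values.
class Mac:
    def __init__(self, in_width: int = 8, guard_bits: int = 8):
        self.acc_width = 2 * in_width + guard_bits  # product width + headroom
        self.acc = 0

    def clock(self, a: int, b: int) -> int:
        """One cycle: accumulate a*b, wrapping at the accumulator width."""
        self.acc = (self.acc + a * b) & ((1 << self.acc_width) - 1)
        return self.acc

    def clear(self) -> None:
        self.acc = 0
```

In hardware the multiply stage is the expensive part (a partial-product array or a dedicated block multiplier), while the accumulate stage is just an adder with a registered feedback path; pipelining that multiplier is what most of the complicated papers are really about.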
This is a bit far-fetched, but it might work very nicely for your application. Have you thought of using DVI I/O chips?

DVI is a relatively recent connectivity methodology for computer displays. It is, in essence, serialized 8 bit RGB. A single link can deliver on the order of 5 or 6 gigabits per second, if I recall. The chips (both TX and RX) are less than ten bucks a piece. You can certainly clock DVI at less than the max single-link 165 MHz rate and transport your data via a serial link. I think the chips will go down to 25 MHz clocking. At low data rates you can probably go many more feet than the standard provides for. Heck, you could have three redundant links delivered over a commodity cable.

Anyhow, just a thought. Check out the Silicon Image site for more details: http://www.siimage.com/home.asp

HTH,

--
Martin Euredjian

To send private email:
0_0_0_0_@pacbell.net
where "0_0_0_0_" = "martineu"

"Theron Hicks" <hicksthe@egr.msu.edu> wrote in message news:anctr2$2brl$1@msunews.cl.msu.edu...
> Hello,
> I am developing an instrument that is currently communicating over a
> special high speed parallel board. The data rate is 6.4 million 8 bit
> words per second. The board works great, but it costs in excess of
> $1600 US per copy. It also occupies a full-sized PCI slot. We are
> considering implementing an alternative I/O arrangement such as USB2 or
> ethernet (TCP/IP). Is anyone aware of free-ware USB2 implemented in
> VHDL or some other FPGA-friendly technology? Note: target FPGA is a
> Spartan2E (or if absolutely necessary, Virtex2).
>
> Thanks,
> Theron

Article: 47762
"Christopher R. Carlen" <crcarle@sandia.gov> wrote in message news:3D9C5E6E.5060500@sandia.gov... <snip> > Oh, one other important thing! The CPLD runs at 3.3V, so I need level > shifting anyway. > > Thanks for your interest. > > Good day! > If you're using buffers you might want to consider using 74LVC4245A level shifting buffers from Philips, that way you can drive from the PLD 3.3V logic and be able to select 3.3V or 5V CMOS levels on your outputs. http://www.philipslogic.com/products/lvc/pdf/74lvc4245a.pdf Mark.Article: 47763
Another point on the curve is to add hardware functional units to your soft processor core to do the expensive inner-loop computations (e.g. the packet checksum example) in a dedicated datapath, either explicitly, or as a side effect of loading/storing the data.

See also a recent article: Jesse Kempa, Altera, at ChipCenter: Maximizing Embedded System Performance in the Era of Programmable Logic (http://www.chipcenter.com/pld/images/pldf097.pdf). A very nice article, based upon the task of speeding up a Nios SoC-based HTTP server, illustrating that creative application of programmable logic can deliver big speedups over a pure software approach.

Also: "Jesse Kempa" <kempaj@yahoo.com> wrote
> ... This is straying from topic, but looking "down" a couple layers,
> there are also Ethernet MAC cores (such as the opencores ethmac project)
> which include their own DMA engine that masters memory. The host CPU,
> or logic, or whatever you're interfacing to assembles a packet and
> passes its location and a 'go' bit to the MAC core...

As I originally wrote (http://www.fpgacpu.org/log/apr02.html#020405):
> > Note that a compact FPGA CPU core with integral DMA (e.g. xr16) may be
> > hybridized into the data shovel aspect of an ethernet MAC. (Flexibly
> > shovel the incoming bits to/from buffers, etc.) Indeed, one enhanced
> > FPGA CPU might (time multiplexed or otherwise) manage several physical
> > links.

Elaboration: If you think about it, a processor datapath is already superbly outfitted to implement DMA. Fetching sequential instructions is equivalent to DMA, and branches/jumps are equivalent to loading the next DMA transfer address. In the xr16 design, I replaced a single PC register with a 16-entry "PC register file" that makes it easy to run multiple threads or do multiple channels of DMA. Datapath cost (n-bit wide datapath): +n LUTs, -n FFs.
This idea can save you any number of DMA address counters, address muxes, and the address mux delays, elsewhere in the design. In the XSOC/xr16 Kit, I used one hardwired channel of DMA to stream in video data from external RAM. (Necessity (shoehorning the processor, video controller, and rest of the SoC into a '4005) being the mother of invention.)

The next step (not taken in xr16) is to add instructions to programmatically schedule these DMA operations -- addresses, counts, arbitration.

Thus it seems to me that an enhanced { 200 LUT, 1 BRAM } 16-bit RISC soft core could make a pretty capable MAC, and as a bonus you can build a (software) TCP/IP engine for zero additional LUTs.

Jan Gray, Gray Research LLC

Article: 47764
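The PC-register-file trick described above can be sketched in a few lines. This is a toy Python model under my own naming, not the actual xr16 netlist: one shared incrementer serves every channel because each channel's "current address" lives in a small register file.

```python
# Toy model of a "PC register file": one incrementer time-multiplexed over
# N channels, replacing N dedicated DMA address counters and their muxes.
# The 16-bit wrap matches a 16-bit datapath like xr16; the channel count
# of 16 mirrors the 16-entry register file mentioned in the post.
class PcRegFile:
    def __init__(self, n_channels: int = 16):
        self.pc = [0] * n_channels      # one resume address per channel/thread

    def step(self, ch: int, stride: int = 1) -> int:
        """Issue the current address for channel ch, then advance it."""
        addr = self.pc[ch]
        self.pc[ch] = (addr + stride) & 0xFFFF
        return addr
```

Interleaving calls to step() on different channels is the software analogue of the time-multiplexed datapath: instruction fetch, video DMA, and packet DMA each just pick a different entry on their turn.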
In response to a posting by "Jesse Kempa" <kempaj@yahoo.com>, I wrote > See also a recent article: > Jesse Kempa, Altera, at ChipCenter: Maximizing Embedded System Performance > in the Era of Programmable Logic, > (http://www.chipcenter.com/pld/images/pldf097.pdf). ... Oops, sorry, I knew that name seemed familiar. Nice article though. Jan Gray, Gray Research LLCArticle: 47765
I am now emulating an ISA bus model and my core design on a FLEX 10K, so the PC can communicate with my core design for a large testbench. The FLEX 10K acts as an I/O card device. It is assigned an IRQ and a segment of I/O port addresses. An ISA I/O device must be initialized at power-on of the motherboard, and the OS will load my device driver when booting.

The problem is that I use a ByteBlaster (LPT) to download the FPGA programming data. So I must boot twice: once to program the FPGA as an ISA I/O device (containing the ISA bus model and my core design), and again to initialize the ISA I/O device and load my device driver. Can I program the FPGA without the PC, so that the ISA I/O device is ready first, and then power on the PC? Which programming method should I select?

Article: 47766
Ken Smith wrote:
> In article <3D9B8FDE.8000405@sandia.gov>,
> Christopher R. Carlen <crcarle@sandia.gov> wrote:
> [....]
> >Everything is pretty well thought out so far, I think. The only problem
> >is, the CPLD has 84 IOs available, of which I plan to use up to 40. 32
>
> I suggest you spread the I/O connections among the logic blocks in
> logical groups.
>
> If you are fairly certain the other lines will not be needed, you can
> hook pairs of them together so that you have another way to get signals
> between logic blocks, or you can wire up a socket for a 22V10 or spare
> I/O. This will help keep your options open.

Perhaps it will suffice to bring the unused IOs out to vias. Then I can tie them together later if need be. It may even be less trouble to build a new daughterboard for the XCR3256 chip later on. But the XCR3128 seemed like it would give a lot of room for growth over the XCR3064 which I originally considered. The main problem there was having to share the JTAG pins, which I wanted to avoid for simplicity.

--
____________________________________
Christopher R. Carlen
Principal Laser/Optical Technologist
Sandia National Laboratories CA USA
crcarle@sandia.gov

Article: 47767
Theron Hicks wrote: > Chris, > > If you had asked this question about a year or two ago I would say you > should seriously consider using a Spartan (5 volt compatibility) Given the > low cost of decent FPGAs I personally would still use an FPGA. Use a > reprogrammable serial prom to hold the code. Given the capacity of MEs to > short or otherwise mis-connect outputs and inputs, I would condsider > something in a socketable package for the FPGA (or, if you decide > differently, the CPLD.) The reason I suggest the FPGA is that from personal > experience, the project can quickly grow all out of proportion. Thanks for your input. Egad, an FPGA just isn't necessary. Remember, the existing functionality utilizes about 5-10 2-input gates. And the Coolrunner XPLA3 has 5V tolerant IO. > > Here is an idea that would perhaps really simplify the job for you... What > about one of the demo boards from one of the distributors to do the job for > you. Typically they have at least one clock input with an on-board > oscillator(in case you need a digital one-shot or two) total cost would be > about $200 at most. One source that I have seen that has impressed me is > Insight Electronics. I have seen their boards and they are pretty good > quality. Yes I am aware of that. Unfortunately, I tend to like to make my own boards, because I am very fussy about having control over every little aspect of the circuitry. In fact I just completed a CPLD dev. board that took several months of after work hours at home to design. But it is *so* flexible and just the way I like it that I have no regrets about spending the time rather than buying one off the shelf. Good day! -- ____________________________________ Christopher R. Carlen Principal Laser/Optical Technologist Sandia National Laboratories CA USA crcarle@sandia.govArticle: 47768
markp wrote:
> "Christopher R. Carlen" <crcarle@sandia.gov> wrote in message
> news:3D9C5E6E.5060500@sandia.gov...
> <snip>
>
>>Oh, one other important thing! The CPLD runs at 3.3V, so I need level
>>shifting anyway.
>>
>>Thanks for your interest.
>>
>>Good day!
>
> If you're using buffers you might want to consider using 74LVC4245A level
> shifting buffers from Philips; that way you can drive from the PLD's 3.3V
> logic and be able to select 3.3V or 5V CMOS levels on your outputs.
>
> http://www.philipslogic.com/products/lvc/pdf/74lvc4245a.pdf
>
> Mark.

Hmm, in this case there is zero likelihood of having to output 3.3V levels.

What is the likelihood that commercial instruments in the next few years will have user inputs that are 3.3V level instead of 5V TTL compatible levels?

There are so many logic levels these days, it makes sense to keep the external world interface standardized on one thing, so all instruments can talk to each other. 5V works for me. I hope it stays that way.

Thanks for your input!

--
____________________________________
Christopher R. Carlen
Principal Laser/Optical Technologist
Sandia National Laboratories CA USA
crcarle@sandia.gov

Article: 47769
Ru-Chin Tsai wrote:
>
> I now emulate an ISA bus model and my core design on the FLEX 10K, so the PC
> can communicate with my core design for a large testbench. The FLEX 10K now
> acts as an I/O card device: it is assigned an IRQ and a segment of I/O port
> address space. An ISA I/O device must be initialized at power-on of the
> motherboard, and the OS will load my device driver when booting. The problem
> is that I use a ByteBlaster (LPT) to download the FPGA programming data, so
> I must boot twice: once to program the FPGA as an ISA I/O device (containing
> the ISA bus model and my core design), and again to initialize the ISA I/O
> device and load my device driver. Can I program the FPGA without the PC, so
> the ISA I/O device is ready first, and then power on the PC? Which
> programming method should I select?

Your request is not completely clear to me, but I think you are looking for a way to automatically load the FPGA on boot up. If your design is a little more stable you can wire a serial EEPROM onto the board and the FPGA will load directly from that. There are app notes on this at the Altera web site. Atmel makes some nice reprogrammable parts for this. I believe one or the other site even has plans on how to connect a serial memory along with a cable to allow you to reprogram the EEPROM or the FPGA, your choice, IIRC.

--
Rick "rickman" Collins
rick.collins@XYarius.com
Ignore the reply address. To email me use the above address with the XY removed.

Arius - A Signal Processing Solutions Company
Specializing in DSP and FPGA design
URL http://www.arius.com
4 King Ave
Frederick, MD 21701-3110
301-682-7772 Voice
301-682-7666 FAX

Article: 47770
Christopher R. Carlen wrote:
>
> Oh, one other important thing! The CPLD runs at 3.3V, so I need level
> shifting anyway.

Given the style of this project, you should perhaps look at a 5V PLCC device (e.g. Atmel ATF1508ASL). That way users can replace it if it gets damaged, or you could even use a ZIF socket so they can have their own chips. (Note: JTAG reprogramming has a finite cycle count, so many changes will 'wear out' the chip - and they might prefer to 'pick a chip' over 'find a file'...)

ATF1508ASL quotes > 10,000 (re)pgm cycles,
XC2 quotes 1,000 (re)pgm cycles,
MAX3000A quotes > 100 (re)pgm cycles.

If the functions are simply 'mostly AND, some OR', then almost any PLD will handle that - your only ceiling is fan-in, which is the limit on logic functions per block, so you are limited to an XX-input AND gate (XX varies with brand: 36/40/...). With spare I/O pins you can also merge logic, so the physical limit will be all inputs.

- jg

Article: 47771
Scott Thibault wrote:
>
> The Tcl interpreter uses a bytecode compiler internally to improve performance.
> That bytecode is not suitable for hardware implementation, however. We
> designed a special bytecode for this purpose, and yes, the FPGA is like a
> virtual machine for this bytecode.
>
> --Scott

Sounds interesting - can you post a small example of the flow? Something like tiny source code / intermediate sizes / final speed, and the size of the Tcl engine itself? Not everyone here will know Tcl in detail, but the general application of script handling within an FPGA is useful to get a handle on.

- jg

Article: 47772
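For readers unfamiliar with the approach: a bytecode virtual machine is essentially a fetch/decode/dispatch loop over a compact instruction encoding, which is why it maps naturally onto an FPGA state machine. The sketch below is a minimal software model of such a loop; the instruction set here is invented for illustration and has no relation to Thibault's actual hardware bytecode.

```python
# Toy stack-machine bytecode interpreter: each instruction is an
# (opcode, argument) pair. The opcodes are hypothetical.
def run(bytecode, stack=None):
    """Execute the bytecode and return the final operand stack."""
    stack = stack if stack is not None else []
    pc = 0
    while pc < len(bytecode):
        op, arg = bytecode[pc]
        if op == "PUSH":            # push an immediate value
            stack.append(arg)
        elif op == "ADD":           # pop two operands, push their sum
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "JZ":            # branch to arg if top of stack is zero
            if stack.pop() == 0:
                pc = arg
                continue
        elif op == "HALT":
            break
        pc += 1
    return stack

# The expression 2 + 3 compiled to the toy bytecode:
program = [("PUSH", 2), ("PUSH", 3), ("ADD", None), ("HALT", None)]
```

In hardware, each iteration of this `while` loop would correspond to one or more clock cycles of a fetch/decode/execute state machine reading the bytecode out of block RAM.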
In article <upolclncc0qr16@corp.supernews.com>, Scott Thibault <thibault@gmvhdl.com> wrote:
>Tcl is not so different from Scheme if you replace []'s with ()'s, but ...

Actually, it's a big difference due to scoping rules. Tcl's scope semantics (dynamic scope) is a BUG, but a bug which arises from its original incarnation as string munging, which meant that it couldn't have proper closures. This is a big issue, as the Tcl/Tk windowing model is rightly patterned around the notion of binding closures to events. Yet binding closures to events is much more predictable and useful with lexical scope, a total nightmare with dynamic scope.

>there are a couple of advantages to using Tcl. First, memory is managed with
>reference counting, which is simple and predictable.

Only because Tcl doesn't allow real structures & references. Reference counting can't collect cyclic structures.
--
Nicholas C. Weaver nweaver@cs.berkeley.edu

Article: 47773
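Weaver's point about cycles is easy to demonstrate: two objects that reference each other keep each other's reference count above zero forever, so plain reference counting never frees them. A minimal Python illustration (Python happens to layer a separate cycle detector on top of reference counting, which is what rescues this case):

```python
import gc

class Node:
    def __init__(self):
        self.ref = None

a, b = Node(), Node()
a.ref, b.ref = b, a    # a and b now reference each other

# Each object is kept alive by the other's reference, so a pure
# reference-counting collector never sees either count reach zero,
# even after the program drops its own handles:
del a, b

# CPython reclaims the pair only because its gc module walks the
# object graph looking for unreachable cycles:
collected = gc.collect()
```

Tcl's refcounted values avoid this problem only because, as Weaver notes, its values cannot form such reference cycles in the first place.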
"Christopher R. Carlen" <crcarle@sandia.gov> wrote in message news:3D9C9447.70908@sandia.gov...
> markp wrote:
> > "Christopher R. Carlen" <crcarle@sandia.gov> wrote in message
> > news:3D9C5E6E.5060500@sandia.gov...
> > <snip>
> >
> >>Oh, one other important thing! The CPLD runs at 3.3V, so I need level
> >>shifting anyway.
> >>
> >>Thanks for your interest.
> >>
> >>Good day!
> >
> > If you're using buffers you might want to consider using 74LVC4245A level
> > shifting buffers from Philips, that way you can drive from the PLD 3.3V
> > logic and be able to select 3.3V or 5V CMOS levels on your outputs.
> >
> > http://www.philipslogic.com/products/lvc/pdf/74lvc4245a.pdf
> >
> > Mark.
>
> Hmm, in this case there is zero likelyhood of having to output 3.3V
> levels.
>
> What is the likelyhood that commercial instruments in the next few years
> will have user inputs that are 3.3V level instead of 5V TTL compatible
> levels?

Pretty low I guess, they'd probably make them backwards compatible with bomb proof inputs anyway.

> There are so many logic levels these days, it makes sense to keep the
> external world interface standardized on one thing, so all instruments
> can talk to each other. 5V works for me. I hope it stays that way.

OK, the reason I mentioned it is because it is easy to go from LVTTL to 5V CMOS, and easy to go from LVTTL to 3.3V CMOS, but it's quite difficult to have LVTTL inputs and to be able to select 3.3V or 5V CMOS outputs without fitting different chips. Interest value only really.

Mark.

Article: 47774
[Disclaimer: I'm employed by ARC, but I'm an engineer, not in marketing...]

In article <ani10j$9pb$1@slb7.atl.mindspring.net>, Jan Gray wrote:
> Another point on the curve is to add hardware functional units to
> your soft processor core to do the expensive inner loop computations
> (e.g. the packet checksum example) in a dedicated datapath, either
> explicitly, or as a side effect of loading/storing the data.

Certainly this is the focus of companies like ARC and Tensilica. Make a baseline GPP or DSP and allow the end user to add new instructions via the control and datapaths of the CPU. The software tools benchmark and profile where you waste your time, and you can then determine what functions make sense to augment, either with custom opcodes, compute engines, etc. It's taking time, but more customers are getting used to the idea. It helps tremendously in area, performance, and low-power applications.

Jan takes it to an interesting tangent by focusing on cool FPGA optimization techniques to embed lots o' small processors on reprogrammable logic. In our ASIC domain a number of our customers embed multiple ARC processors on a single piece of ASIC silicon.

Regarding your comments about DMA engines, I find most of our customers, *especially* on the USB side, are very nervous at first about having the peripheral put the data right where you want it (or pull it right from memory) -- that's been our approach. There's something very simple about just reading and writing to FIFOs. Unfortunately, if an external (to the peripheral) DMA engine reads and writes to peripheral FIFOs, you're using twice the memory bandwidth (read from the FIFO, put it in memory). If the uProcessor does the movement to/from the FIFOs, you're wasting a ton of bus bandwidth between the opfetches and the actual data movement. Tell the peripheral where to put the payload data and let it read/write directly with a protocol-aware DMA engine. Doing that scares a bunch of customers in the embedded domain.
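The "packet checksum example" in Jan Gray's quote refers to the 16-bit ones'-complement checksum used by IP and TCP (RFC 1071). As a software sketch of the inner loop being discussed for hardware offload (the function name is mine):

```python
# RFC 1071 Internet checksum: sum the data as 16-bit big-endian words
# in ones'-complement arithmetic, then return the complement of the sum.
# In software this loop touches every byte of the packet; a dedicated
# datapath can compute it as a side effect of moving the data.
def inet_checksum(data: bytes) -> int:
    if len(data) % 2:                              # pad odd length with a zero byte
        data += b"\x00"
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]      # next 16-bit word
        total = (total & 0xFFFF) + (total >> 16)   # fold the carry back in
    return ~total & 0xFFFF
```

A receiver can verify a packet by checksumming the data *including* the transmitted checksum field; a correct packet yields zero.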
> In the xr16 design, I replaced a single PC register with a 16-entry
> "PC register file" that makes it easy to run multiple threads or do
> multiple channels of DMA. Datapath cost (n-bit wide datapath): +n
> LUTs, -n FFs. This idea can save you any number of DMA address
> counter(s), address mux(es), and the address mux delay(s), elsewhere
> in the design.

Interesting about the multiple threads: do you rotate the general-purpose register file of the xr16 (between multiple register files) in sync with the PC thread rotation? How do you keep context between the threads?

We have two sets of x86 HDL cores -- Classic and Turbo. In our Turbo186 we threaded the standard programmable DMA engine tightly into the control of the general uProcessor execution engine to eliminate arbitration. If there are any DMA channels pending operation, the bus cycles are rotated automatically by the execution state machine: DMA read (1 cycle), opcode execution (1 cycle of it), DMA write (1 cycle), DMA read (1 cycle), opcode execution (1 cycle of it), DMA write (1 cycle), [...] This way there is no arbitration and the processor can't be starved by big DMA movement. The DMA engine needed the dead cycle anyway, so at least the processor got to do some useful work in between.

> Thus it seems to me that an enhanced { 200 LUT, 1 BRAM } 16-bit RISC
> soft core could make a pretty capable MAC, and as a bonus you can
> build a (software) TCP/IP engine for zero additional LUTs.
> Jan Gray, Gray Research LLC

We had at least two customers take our small V8 RISC soft core and bolt it to our MAC cores to build things like filters, switches, bridges, and network-aware devices. Yes, they could prototype them in cheap Spartans. It all depends on how advanced you need your TCP/IP processing to be. Our V8 may not be quite as small as the xr16, but it's in HDL and was targeted at ASICs, not the neat tricks you can play in FPGAs at the low level (esp. the embedded RAMs). I've read your papers on the xr16, though.
(A couple of years ago.) Very cool stuff. Have you made progress on the compiler/software side in the past couple of years?

--
Scott Bilik
http://BilikFamily.com/
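The Turbo186-style fixed bus rotation Bilik describes can be modeled in a few lines (all names here are invented for illustration): when any DMA channel is pending, bus ownership cycles DMA-read / execute / DMA-write, giving DMA guaranteed bandwidth without any arbitration logic and without ever starving the processor.

```python
# Toy model of a fixed-rotation bus scheduler: yields the owner of
# each successive bus cycle. With DMA pending, the pattern is the
# hard-wired 3-cycle rotation; otherwise the CPU owns every cycle.
def bus_schedule(dma_pending: bool, n_cycles: int):
    pattern = (["dma_read", "execute", "dma_write"]
               if dma_pending else ["execute"])
    for i in range(n_cycles):
        yield pattern[i % len(pattern)]
```

Because the rotation is fixed, the worst-case latency of both the DMA engine and the processor is known at design time, which is exactly why no arbiter is needed.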